xinlai/Math-Step-DPO-10K
General NLP | EN
xinlai/Math-Step-DPO-10K is a general NLP dataset in English (EN), published by xinlai in Parquet format.
About xinlai/Math-Step-DPO-10K
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
This repo contains the Math-Step-DPO-10K dataset for our paper Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs, St...
Details
- Task: General NLP
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: xinlai
- Year: 2024
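
Because the card lists only the dataset ID and the Parquet format, a short loading sketch may help. It assumes the dataset is hosted on the Hugging Face Hub under the ID shown in the title, that the `datasets` library is installed, and that a `train` split exists. None of these are confirmed by this page. The helper name `load_math_step_dpo` is hypothetical.

```python
# Assumption: the ID from the card title is also the Hugging Face Hub repo ID.
DATASET_ID = "xinlai/Math-Step-DPO-10K"

# Split the ID into the creator and dataset name listed under "Details".
CREATOR, DATASET_NAME = DATASET_ID.split("/", 1)


def load_math_step_dpo(split: str = "train"):
    """Hypothetical helper: download one split of the dataset from the Hub.

    The "train" default is an assumption; check the dataset page for the
    splits that are actually published.
    """
    # Imported lazily so this module can be imported without `datasets`
    # installed; calling the function requires `pip install datasets`.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)


# Example use (requires network access, so it is left commented out):
# ds = load_math_step_dpo()
# print(ds.column_names)  # inspect the fields before relying on any of them
```

Inspecting `column_names` first is deliberate: this card does not list the dataset's fields, so any downstream code should read them from the loaded data instead of assuming them.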