qihoo360/Light-R1-SFTData
Text Generation, English
qihoo360/Light-R1-SFTData is an English text generation dataset from qihoo360, distributed in Parquet format.
About qihoo360/Light-R1-SFTData
Light-R1: Surpassing R1-Distill from Scratch* with $1000 through Curriculum SFT & DPO
*from models without long CoT
technical report
GitHub page
This is the two-stage SFT data we used to train Light-R1-32B.
Simply refer to stage1-76k.json and ...
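As a minimal sketch of how a locally downloaded copy of a file such as stage1-76k.json might be inspected, the helper below reads it with Python's standard library only. The function name `load_sft_records` is hypothetical, and the file layout is an assumption: this card does not describe the schema, so the helper accepts either a single JSON array or JSON Lines and makes no claims about field names.

```python
import json
from pathlib import Path


def load_sft_records(path):
    """Load records from a JSON array file or a JSON Lines file.

    Hypothetical helper: the on-disk layout of the SFT files is assumed,
    not confirmed by the dataset card.
    """
    text = Path(path).read_text(encoding="utf-8").strip()
    if not text:
        return []
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Fall back to JSON Lines: one JSON value per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    # Wrap a single top-level object so callers always receive a list.
    return data if isinstance(data, list) else [data]
```

Calling `load_sft_records("stage1-76k.json")` on a local copy returns a list whose length and first element can be checked before committing to a training pipeline. Nothing about the record fields is assumed here, so inspect one record to learn the actual structure.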
Details
- Task: Text Generation
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: qihoo360
- Year: 2025