Congliu/Chinese-DeepSeek-R1-Distill-data-110k

Text Generation | Question Answering | ZH | apache-2.0

Congliu/Chinese-DeepSeek-R1-Distill-data-110k is a Chinese (ZH) text generation and question answering dataset in Parquet format, created by Congliu in 2025. It has 800 downloads and 764 likes. It is released under the apache-2.0 license and falls in the 100K<n<1M size category.

About Congliu/Chinese-DeepSeek-R1-Distill-data-110k

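Chinese dataset distilled from the full-size DeepSeek-R1 (Chinese-Data-Distill-From-R1).

Note: a version ready for direct SFT use is also provided for download. It merges the reasoning and the answer of each sample into a single output field, so most SFT code frameworks can load it and train on it directly.

This is an open-source Chinese dataset distilled from the full-size R1. It contains not only math data but also a large amount of general-purpose data, 110K samples in total. ...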
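The note about merging the reasoning and the answer into one output field can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not something this page confirms: the field names `input`, `reasoning_content` and `content`, the `instruction`/`output` target layout, and the choice to wrap the reasoning in `<think>` tags. The SFT version published by the author may use a different layout.

```python
from typing import Dict

# Hypothetical field names, chosen for illustration only.
PROMPT_KEY = "input"                 # assumed: the user prompt
REASONING_KEY = "reasoning_content"  # assumed: the model's reasoning trace
ANSWER_KEY = "content"               # assumed: the final answer


def to_sft_record(record: Dict[str, str]) -> Dict[str, str]:
    """Merge the reasoning trace and the answer into a single `output` string."""
    reasoning = record.get(REASONING_KEY, "").strip()
    answer = record.get(ANSWER_KEY, "").strip()
    if reasoning:
        # Wrapping the reasoning in <think> tags is an assumed convention.
        output = f"<think>\n{reasoning}\n</think>\n\n{answer}"
    else:
        # Without a reasoning trace, the output is just the answer.
        output = answer
    return {"instruction": record.get(PROMPT_KEY, ""), "output": output}


# A made-up sample, not taken from the dataset.
example = {
    "input": "1+1等于几?",
    "reasoning_content": "1 加 1 等于 2。",
    "content": "2",
}
sft = to_sft_record(example)
print(sft["output"])
```

Keeping the merge in a small pure function makes it easy to apply per record, for example through the `map` method of a Hugging Face `datasets.Dataset`, once the actual column names have been checked against the downloaded files.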

Details

Task: Text Generation, Question Answering
Language: ZH
Format: Parquet
Rows / instances: N/A
Size: 100K<n<1M
Creator: Congliu
Year: 2025
License: apache-2.0
Downloads: 800
Likes: 764
