opencsg/chinese-cosmopedia
Text Generation · ZH
opencsg/chinese-cosmopedia is a Chinese (ZH) text generation dataset from opencsg, distributed in Parquet format.
About opencsg/chinese-cosmopedia
Chinese Cosmopedia Dataset
The Chinese Cosmopedia dataset contains 15 million entries, totaling approximately 60B tokens. Two key ele...
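As a rough sanity check on the two figures above, they imply an average entry length of about 4,000 tokens. The sketch below is only that arithmetic: the constants are the rounded values quoted in the description, so the result is an estimate rather than a measured statistic, and the function name is chosen here purely for illustration.

```python
# Rounded totals quoted in the dataset description (approximate values).
TOTAL_ENTRIES = 15_000_000       # "15 million entries"
TOTAL_TOKENS = 60_000_000_000    # "approximately 60B tokens"


def average_tokens_per_entry(total_tokens: int, total_entries: int) -> float:
    """Return the mean number of tokens per entry."""
    if total_entries <= 0:
        raise ValueError("total_entries must be positive")
    return total_tokens / total_entries


avg = average_tokens_per_entry(TOTAL_TOKENS, TOTAL_ENTRIES)
print(f"~{avg:,.0f} tokens per entry on average")
```

Individual entries may be much shorter or longer than this mean; the description gives no distribution.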
Details
- Task: Text Generation
- Language: ZH
- Format: Parquet
- Rows / instances: N/A
- Creator: opencsg
- Year: 2024
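
Because the dataset is distributed as Parquet, one way to inspect a few rows without downloading every file is to stream it. The sketch below makes several assumptions that this listing does not confirm: that the dataset is hosted on the Hugging Face Hub under the ID `opencsg/chinese-cosmopedia`, that the third-party `datasets` library is installed, and that a `train` split exists. `take` and `stream_rows` are hypothetical helper names.

```python
from itertools import islice
from typing import Any, Iterable, Iterator


def take(rows: Iterable[Any], n: int) -> list:
    """Return the first n items of any iterable as a list."""
    return list(islice(rows, n))


def stream_rows(repo_id: str = "opencsg/chinese-cosmopedia",
                split: str = "train") -> Iterator[dict]:
    """Lazily iterate over the rows of a Hub-hosted dataset.

    Assumptions: the `datasets` library is installed, the repository ID
    resolves on the Hugging Face Hub, and the split name is valid.
    """
    # Imported lazily so that `take` works without `datasets` installed.
    from datasets import load_dataset

    # streaming=True reads records on demand instead of downloading all files.
    return iter(load_dataset(repo_id, split=split, streaming=True))
```

Under those assumptions, `take(stream_rows(), 3)` would return the first three records as dictionaries. The column names are not listed here, so printing `row.keys()` on one record is a reasonable first step.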