BAAI/CCI3-Data
General NLP | ZH | apache-2.0
Created by BAAI in 2024, BAAI/CCI3-Data is a general NLP dataset in Chinese (ZH), distributed in Parquet format. It has 818 downloads and 38 likes, and is released under the apache-2.0 license.
About BAAI/CCI3-Data
Data Description
To address the scarcity of high-quality safety datasets in Chinese, we open-sourced the CCI (Chinese Corpora Internet) dataset on November 29, 2023. Building on this foundation, we continue to expand the data source, adopt ...
Details
- Task: General NLP
- Language: ZH
- Format: Parquet
- Rows / instances: N/A
- Creator: BAAI
- Year: 2024
- License: apache-2.0
- Downloads: 818
- Likes: 38
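
Because the data is distributed as Parquet, a few records can be inspected without downloading the whole corpus. The following is a minimal sketch, not an official loading recipe. It assumes that the identifier BAAI/CCI3-Data refers to a Hugging Face Hub repository, that the `datasets` library is installed, and that a split named `train` exists; none of these is stated on this page. The helper name `stream_samples` is hypothetical, and access may additionally require authentication or accepting the dataset's terms.

```python
REPO_ID = "BAAI/CCI3-Data"  # identifier as given on this page


def stream_samples(n=3, split="train"):
    """Return the first `n` records as a list of dicts.

    Assumptions (not confirmed by this page): the repository is hosted on
    the Hugging Face Hub and exposes a split called "train".
    """
    # Imported lazily so this module can be loaded even when the
    # `datasets` package is not installed.
    from datasets import load_dataset

    # streaming=True reads the Parquet shards on demand instead of
    # downloading the full dataset first.
    ds = load_dataset(REPO_ID, split=split, streaming=True)

    # take(n) limits the stream to the first n records.
    return list(ds.take(n))


# Example call (requires network access):
#   records = stream_samples(2)
#   print(sorted(records[0].keys()))
```

Streaming is used here because the row count is listed as N/A, so the total size is unknown from this page; inspecting the keys of one record is a cheap way to discover the actual column names before committing to a full download.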