
BAAI/CCI3-Data

General NLP | ZH | apache-2.0

Created by BAAI in 2024, BAAI/CCI3-Data is a General NLP dataset in Chinese (ZH), distributed in Parquet format. It has 818 downloads and 38 likes and is released under the apache-2.0 license.

About BAAI/CCI3-Data

Data Description: To address the scarcity of high-quality safety datasets in Chinese, we open-sourced the CCI (Chinese Corpora Internet) dataset on November 29, 2023. Building on this foundation, we continue to expand the data source, adopt ...

Details

Task: General NLP
Language: ZH
Format: Parquet
Rows / instances: N/A
Creator: BAAI
Year: 2024
License: apache-2.0
Downloads: 818
Likes: 38
