ai4bharat/BPCC
General NLP, English
The ai4bharat/BPCC dataset is an English General NLP resource published by ai4bharat in 2025. It has 511 downloads and 32 likes, and falls in the 100M<n<1B size category.
About ai4bharat/BPCC
Bharat Parallel Corpus Collection (BPCC) is a comprehensive, publicly available parallel corpus that includes both existing and new data for all 22 scheduled Indic languages. It comprises two parts: BPC...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Size: 100M<n<1B
- Creator: ai4bharat
- Year: 2025
- Downloads: 511
- Likes: 32