natgillin/translations-raw
General NLP, MULTILINGUAL
natgillin/translations-raw is a multilingual General NLP dataset distributed in Parquet format.
About natgillin/translations-raw
Frozen, canonical raw bitext consolidated from upstream alvations/mtdata-raw* snapshots (since deleted). This is the read-only source-of-truth for downstream quality-filtering pipelines.
31,663 parquet files (1566.8 ...
Details
- Task: General NLP
- Language: MULTILINGUAL
- Format: Parquet
- Rows / instances: N/A
- Creator: natgillin
- Year: 2026