natgillin/translations-raw

General NLP | MULTILINGUAL

natgillin/translations-raw is a multilingual General NLP dataset distributed in Parquet format.

About natgillin/translations-raw

Frozen, canonical raw bitext consolidated from upstream alvations/mtdata-raw* snapshots (since deleted). This is the read-only source of truth for downstream quality-filtering pipelines. 31,663 Parquet files (1566.8 ...
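
As an illustration of how a downstream quality-filtering step might consume the raw bitext without modifying it, here is a minimal sketch. The column names `source` and `target`, the function `filter_bitext`, and both thresholds are hypothetical; the listing above does not describe the dataset's schema or any particular filtering rules.

```python
from typing import Dict, Iterable, Iterator, Mapping


def filter_bitext(
    rows: Iterable[Mapping[str, str]],
    min_chars: int = 1,
    max_length_ratio: float = 3.0,
) -> Iterator[Dict[str, str]]:
    """Yield cleaned copies of sentence pairs that pass simple heuristics.

    Assumes each row has hypothetical "source" and "target" string fields.
    Input rows are never mutated, so the raw data stays read-only.
    """
    for row in rows:
        src = row["source"].strip()
        tgt = row["target"].strip()

        # Drop pairs where either side is empty or too short.
        if len(src) < min_chars or len(tgt) < min_chars:
            continue

        # Drop untranslated pairs (source copied into target).
        if src == tgt:
            continue

        # Drop pairs whose character lengths differ too much.
        shorter = max(1, min(len(src), len(tgt)))
        if max(len(src), len(tgt)) / shorter > max_length_ratio:
            continue

        yield {"source": src, "target": tgt}


if __name__ == "__main__":
    # Tiny in-memory stand-in for rows read from one Parquet file.
    raw = (
        {"source": "Hello world", "target": "Hallo Welt"},
        {"source": "", "target": "leer"},
        {"source": "Same", "target": "Same"},
    )
    for pair in filter_bitext(raw):
        print(pair)
```

Writing the filtered pairs to a separate location, rather than back over the raw files, keeps the frozen snapshot usable as a reference for later filtering runs.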

Details

Task: General NLP
Language: MULTILINGUAL
Format: Parquet
Rows / instances: N/A
Creator: natgillin
Year: 2026