wmt/wmt14
Translation | CS, DE, EN
wmt/wmt14 is a machine translation dataset covering Czech (CS), German (DE), and English (EN), published by wmt and distributed in Parquet format.
About wmt/wmt14
Dataset Card for "wmt14"
Dataset Summary
Warning: There are issues with the Common Crawl corpus data (training-parallel-commoncrawl.tgz):
- Non-English files contain many English sentences.
- Their "parallel" sentences in E...
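
One way to guard against the first issue is to drop pairs whose non-English side is just a copy of the English side. The sketch below is a minimal heuristic, not part of the dataset's own tooling. It assumes each row has the shape `{"translation": {"de": ..., "en": ...}}`, and the helper name `drop_untranslated` is hypothetical. It only catches exact copies after whitespace and case normalisation, so it will miss English sentences that have no identical counterpart.

```python
def drop_untranslated(rows, src="de", tgt="en"):
    """Keep rows whose source text differs from the target text.

    Assumes each row looks like {"translation": {src: str, tgt: str}};
    that layout is an assumption, not something the card specifies.
    """
    kept = []
    for row in rows:
        pair = row["translation"]
        # Normalise whitespace and case before comparing the two sides.
        source = " ".join(pair[src].split()).casefold()
        target = " ".join(pair[tgt].split()).casefold()
        # Discard empty sides and pairs where the "translation" is a copy.
        if source and target and source != target:
            kept.append(row)
    return kept


# Made-up sample rows for illustration only.
sample = [
    {"translation": {"de": "Guten Morgen.", "en": "Good morning."}},
    {"translation": {"de": "Click here to subscribe.", "en": "Click here to subscribe."}},
]
filtered = drop_untranslated(sample)
```

A stricter filter would run a language identifier on the non-English side. The copy check is used here only because it needs no extra dependencies.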
Details
- Task: Translation
- Language: CS, DE, EN
- Format: Parquet
- Rows / instances: N/A
- Creator: wmt
- Year: 2022