wmt/wmt14

Translation | CS, DE, EN

wmt/wmt14 is a translation dataset from wmt covering Czech (CS), German (DE), and English (EN), distributed in Parquet format.

About wmt/wmt14

Dataset Card for "wmt14"

Dataset Summary

Warning: There are issues with the Common Crawl corpus data (training-parallel-commoncrawl.tgz): non-English files contain many English sentences. Their "parallel" sentences in E...
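The warning above says that non-English Common Crawl files contain many English sentences. One rough way to screen for such pairs is sketched below. This is only an illustrative heuristic, not something the dataset card prescribes: the stopword list, the 0.25 threshold, the `(source, english)` tuple layout, and the function names are all assumptions made for this example, and a real pipeline would more likely use a dedicated language-identification model.

```python
import re

# Hypothetical, deliberately tiny English stopword list (an assumption for this
# sketch). Words that are also common in German or Czech, such as "in", "was",
# "to" and "on", are left out to reduce false positives.
ENGLISH_STOPWORDS = frozenset({
    "the", "and", "of", "is", "that", "for", "with", "this", "are", "it",
})

# Matches runs of Unicode letters (no digits, no underscores).
TOKEN_RE = re.compile(r"[^\W\d_]+")


def english_stopword_ratio(text):
    """Return the fraction of word tokens that are English stopwords."""
    tokens = [t.lower() for t in TOKEN_RE.findall(text)]
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in ENGLISH_STOPWORDS)
    return hits / len(tokens)


def looks_english(text, threshold=0.25):
    """Heuristic: treat text as English if enough tokens are stopwords."""
    return english_stopword_ratio(text) >= threshold


def filter_pairs(pairs, threshold=0.25):
    """Keep (source, english) pairs whose source side does not look English."""
    return [(src, en) for src, en in pairs if not looks_english(src, threshold)]


if __name__ == "__main__":
    sample = [
        ("Das ist ein kleines Haus.", "This is a small house."),
        ("The weather is nice and this is the proof of it.", "Unrelated text."),
    ]
    for src, en in filter_pairs(sample):
        print(src, "|", en)
```

A stopword ratio is cheap and needs no extra dependencies, but it is easy to fool with short sentences or with text that mixes languages, so the threshold should be checked against a hand-inspected sample before being trusted.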

Details

Task: Translation
Language: CS, DE, EN
Format: Parquet
Rows / instances: N/A
Creator: wmt
Year: 2022