
Helsinki-NLP/fineweb-edu-translated

Translation, Text Generation | BOS, BUL, CAT | odc-by

The Helsinki-NLP/fineweb-edu-translated dataset is a translation resource covering BOS, BUL, and CAT, published by Helsinki-NLP in 2025. It has 98.1K downloads and 16 likes, is released under the odc-by license, and falls in the 1B<n<10B size category.

About Helsinki-NLP/fineweb-edu-translated

fineweb-edu-translated is a collection of automatically translated documents from fineweb-edu. Translations are based on OPUS-MT and HPLT-MT models. The data in v1.0 covers 36,704,000 documents with over 28 bi...

Details

Task: Translation, Text Generation
Language: BOS, BUL, CAT
Format: Parquet
Rows / instances: N/A
Size: 1B<n<10B
Creator: Helsinki-NLP
Year: 2025
License: odc-by
Downloads: 98,101
Likes: 16
