
Helsinki-NLP/nemotron-cc-translated

Translation | Text Generation | BOS, BUL, CAT | cc0-1.0

Helsinki-NLP/nemotron-cc-translated is a translation dataset created by Helsinki-NLP in 2025, covering BOS, BUL, and CAT and distributed in Parquet format. It has 43.5K downloads and 5 likes, is released under the cc0-1.0 license, and falls in the 1B<n<10B size category.

About Helsinki-NLP/nemotron-cc-translated

nemotron-cc-translated is a collection of automatically translated documents taken from the high-quality subset of nemotron-cc. Translations are based on OPUS-MT and HPLT-MT models. The data in v1.0 covers...

Details

Task: Translation, Text Generation
Language: BOS, BUL, CAT
Format: Parquet
Rows / instances: N/A
Size: 1B<n<10B
Creator: Helsinki-NLP
Year: 2025
License: cc0-1.0
Downloads: 43513
Likes: 5
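
The Size value 1B<n<10B is a bucket label rather than an exact count; n is presumably the number of instances, which the listing otherwise reports as N/A. As a minimal sketch of how such a label can be turned into numeric bounds, the hypothetical helper below (parse_size_category is not part of any library) assumes the suffixes K, M, B, and T stand for thousand, million, billion, and trillion:

```python
import re

# Assumed multipliers for the suffixes that appear in size-category labels.
_SUFFIX = {"": 1, "K": 10**3, "M": 10**6, "B": 10**9, "T": 10**12}


def parse_size_category(label: str) -> tuple[int, int]:
    """Return the (lower, upper) exclusive bounds of a label like '1B<n<10B'."""
    match = re.fullmatch(r"(\d+)([KMBT]?)<n<(\d+)([KMBT]?)", label.strip())
    if match is None:
        raise ValueError(f"unrecognised size category: {label!r}")
    lo_num, lo_suf, hi_num, hi_suf = match.groups()
    return int(lo_num) * _SUFFIX[lo_suf], int(hi_num) * _SUFFIX[hi_suf]


lower, upper = parse_size_category("1B<n<10B")
print(f"{lower:,} < n < {upper:,}")  # prints "1,000,000,000 < n < 10,000,000,000"
```

Read this way, the dataset holds between one and ten billion instances, which is worth knowing before attempting a full download rather than streaming.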

