Helsinki-NLP/tatoeba
Translation | AB, ACM, ADY | cc-by-2.0
Created by Helsinki-NLP in 2022, Helsinki-NLP/tatoeba is a translation dataset in AB, ACM, ADY containing 413,190 records in Parquet format. It has 2,165 downloads and 56 likes. It is released under the cc-by-2.0 license, and its listed size category is 10K<n<100K.
About Helsinki-NLP/tatoeba
This is a collection of translated sentences from Tatoeba.
359 languages, 3,403 bitexts
total number of files: 750
total number of tokens: 65.54M
total number of sentence fragments: 8.96M
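As a rough sanity check on the figures above, the following Python sketch derives a few averages from them. It assumes that "M" means 10^6, that the rounded values are close enough for a back-of-the-envelope estimate, and that each bitext corresponds to one unordered language pair; none of these assumptions is confirmed by the source. The function name `summarize` and the constant names are hypothetical, chosen only for illustration.

```python
# Headline figures copied from the summary above.
# The token and fragment counts are rounded in the source (assumed M = 10**6).
NUM_LANGUAGES = 359
NUM_BITEXTS = 3_403
NUM_TOKENS = 65.54e6
NUM_FRAGMENTS = 8.96e6


def summarize(languages, bitexts, tokens, fragments):
    """Return rough derived statistics from the headline counts."""
    # Assumption: a bitext covers one unordered pair of distinct languages.
    possible_pairs = languages * (languages - 1) // 2
    return {
        "tokens_per_fragment": tokens / fragments,
        "fragments_per_bitext": fragments / bitexts,
        "possible_pairs": possible_pairs,
        "pair_coverage": bitexts / possible_pairs,
    }


stats = summarize(NUM_LANGUAGES, NUM_BITEXTS, NUM_TOKENS, NUM_FRAGMENTS)
for name, value in stats.items():
    print(f"{name}: {value:,.4f}")
```

Under these assumptions, the sentence fragments average a little over seven tokens each, and the available bitexts cover only a small fraction of all possible language pairs.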
Details
- Task: Translation
- Language: AB, ACM, ADY
- Format: Parquet
- Rows / instances: 413,190
- Size: 10K<n<100K
- Creator: Helsinki-NLP
- Year: 2022
- License: cc-by-2.0
- Downloads: 2,165
- Likes: 56