CohereLabs/aya_collection

Text Classification | Summarization | Translation | ACE, AFR, AMH | apache-2.0

The CohereLabs/aya_collection dataset is a resource for text classification, summarization, and translation covering the languages ACE, AFR, and AMH. It was released by CohereLabs in 2024 and comprises 513,758,189 examples, which places it in the 100M<n<1B size category. It has 37.8K downloads and 239 likes, and it is released under the apache-2.0 license.

About CohereLabs/aya_collection

This dataset is uploaded in two places: here, and additionally here as 'Aya Collection Language Split.' The two uploads are identical in content but differ in how they are structured. This upload is organized into folders, split according to dataset name...
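
Because the collection holds several hundred million rows organized by dataset name, it may be more practical to stream a single subset than to download everything. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed, that each dataset-name folder is exposed as a loadable configuration, and that a `train` split exists. The helper name `stream_examples` and its parameters are hypothetical, introduced only for illustration; check the dataset page for the actual subset and split names.

```python
from typing import Any, Dict, Iterator, Optional

# Repository identifier as given on this page.
REPO_ID = "CohereLabs/aya_collection"


def stream_examples(
    subset: str,
    split: str = "train",        # assumption: the subset has a "train" split
    limit: Optional[int] = 5,    # stop early; None means no limit
) -> Iterator[Dict[str, Any]]:
    """Lazily yield up to `limit` rows from one subset of the collection.

    `subset` is assumed to match one of the dataset-name folders in the
    repository. Nothing is downloaded until the generator is iterated.
    """
    # Imported lazily so that defining the helper does not require the library.
    from datasets import load_dataset

    # streaming=True returns an iterable dataset that reads rows on demand
    # instead of materializing the whole subset on disk first.
    rows = load_dataset(REPO_ID, subset, split=split, streaming=True)
    for index, row in enumerate(rows):
        if limit is not None and index >= limit:
            break
        yield row


# Example usage (requires network access and a real subset name):
# for row in stream_examples("<subset-name>", limit=3):
#     print(row)
```

Streaming keeps memory and disk use small while exploring; for repeated full passes over a subset, a regular (non-streaming) load may be the better choice.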

Details

Task: Text Classification, Summarization, Translation
Language: ACE, AFR, AMH
Format: Parquet
Rows / instances: 513,758,189
Size: 100M<n<1B
Creator: CohereLabs
Year: 2024
License: apache-2.0
Downloads: 37,801
Likes: 239
FAQ