OpenDataArena/ODA-Mixture-100k
General NLP, English
OpenDataArena/ODA-Mixture-100k is an English-language dataset for general NLP tasks, distributed in Parquet format.
About OpenDataArena/ODA-Mixture-100k
ODA-Mixture-100k
ODA-Mixture-100k is a compact, general-purpose post-training dataset curated from top-performing open corpora (selected via the *OpenDataArena* leaderboard) and refined through deduplication and benchmark decontamination.
...
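The description names two refinement steps, deduplication and benchmark decontamination, without saying how they are carried out. The sketch below shows one common way such steps are done, only to make the terms concrete; it is not the OpenDataArena pipeline. The function names (`normalize`, `deduplicate`, `decontaminate`), the exact-match hashing, the 8-gram overlap rule, and the toy samples are all assumptions made for illustration.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse every run of non-alphanumerics to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def ngrams(text: str, n: int) -> set:
    """Return the set of word n-grams of the normalized text."""
    tokens = normalize(text).split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def deduplicate(samples: list) -> list:
    """Keep the first occurrence of each sample, comparing normalized text."""
    seen = set()
    kept = []
    for sample in samples:
        key = hashlib.sha256(normalize(sample).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            kept.append(sample)
    return kept


def decontaminate(samples: list, benchmark_items: list, n: int = 8) -> list:
    """Drop samples that share any word n-gram with a benchmark item.

    Samples shorter than n words have no n-grams and are never dropped.
    """
    benchmark_ngrams = set()
    for item in benchmark_items:
        benchmark_ngrams |= ngrams(item, n)
    return [s for s in samples if ngrams(s, n).isdisjoint(benchmark_ngrams)]


# Hypothetical toy data, not taken from the dataset.
samples = [
    "What is the capital of France?",
    "what is the capital of  France",
    "Compute the sum of the first ten positive integers and explain each step.",
    "Write a haiku about autumn rain.",
]
benchmark = ["Compute the sum of the first ten positive integers."]

cleaned = decontaminate(deduplicate(samples), benchmark)
print(cleaned)
```

On the toy data, the second sample is dropped as a near-verbatim duplicate of the first, and the third is dropped because it shares an 8-word sequence with the benchmark item. Real pipelines often use fuzzy matching (for example MinHash) in place of exact hashing; that choice is outside what the description states.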
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: OpenDataArena
- Year: 2025