
OpenDataArena/ODA-Mixture-100k

General NLP · English

OpenDataArena/ODA-Mixture-100k is an English-language dataset for general NLP tasks, distributed in Parquet format.

About OpenDataArena/ODA-Mixture-100k

ODA-Mixture-100k is a compact general-purpose post-training dataset curated from top-performing open corpora (selected via the *OpenDataArena* leaderboard) and refined through deduplication and benchmark decontamination. ...
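
To make the two refinement steps named above concrete, here is a minimal Python sketch of exact-match deduplication and n-gram benchmark decontamination. It is an illustration only, not the pipeline OpenDataArena actually used: the function names, the normalization rule, and the n-gram size are all assumptions made for the example.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text.strip().lower())


def deduplicate(samples: list[str]) -> list[str]:
    """Keep the first occurrence of each sample, comparing normalized text."""
    seen: set[str] = set()
    kept: list[str] = []
    for sample in samples:
        digest = hashlib.sha256(normalize(sample).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(sample)
    return kept


def ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Return the set of word n-grams in the normalized text."""
    words = normalize(text).split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def decontaminate(samples: list[str], benchmark: list[str], n: int = 8) -> list[str]:
    """Drop samples sharing any word n-gram with a benchmark item.

    The default n=8 is an arbitrary choice for this sketch, not a documented setting.
    """
    banned: set[tuple[str, ...]] = set()
    for item in benchmark:
        banned |= ngrams(item, n)
    return [s for s in samples if not (ngrams(s, n) & banned)]
```

A typical use would run `deduplicate` first and then pass the result to `decontaminate` together with the benchmark prompts to exclude. Real pipelines often add fuzzy or embedding-based matching on top of exact hashing; that is outside the scope of this sketch.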

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Creator: OpenDataArena
Year: 2025
