lucius1022/DeMix_Corpora

General NLP, English

Created by lucius1022 in 2026, lucius1022/DeMix_Corpora is an English-language General NLP dataset distributed in Parquet format.

About lucius1022/DeMix_Corpora

Dataset Card for DeMix Corpora DeMix 📄 Paper: Decouple Searching from Training: Scaling Data Mixing via Model Merging for Large Language Model Pre-training 🤗 Dataset: DeMix Corpora 🐱 Github: Demix Dataset Details ...

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Creator: lucius1022
Year: 2026