NovaSky-AI/Sky-T1_data_17k

General NLP, English

The NovaSky-AI/Sky-T1_data_17k dataset is an English General NLP resource released by NovaSky-AI in 2025.

About NovaSky-AI/Sky-T1_data_17k

Sky-T1_data_17k.json: The 17k training data used to train Sky-T1-32B-Preview. The final data contains 5k coding data from APPs and TACO, and 10k math data from AIME, MATH, and Olympiads subsets of the NuminaMATH dataset. In addition, we maintain 1...

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Creator: NovaSky-AI
Year: 2025