nvidia/ChatQA2-Long-SFT-data

General NLP | EN

nvidia/ChatQA2-Long-SFT-data is an English-language general NLP dataset distributed in Parquet format.

About nvidia/ChatQA2-Long-SFT-data

Data Description: Here, we release the full long SFT training dataset of ChatQA2. It consists of two parts: long_sft and NarrativeQA_131072. The long_sft dataset is built and derived from existing datasets: LongAlpaca12k, GPT-4 samples from Open...

Details

Task: General NLP
Language: EN
Format: Parquet
Rows / instances: N/A
Creator: nvidia
Year: 2024
