togethercomputer/RedPajama-Data-Instruct
General NLP, English
Created by togethercomputer in 2023, togethercomputer/RedPajama-Data-Instruct is an English-language General NLP dataset distributed in Parquet format.
About togethercomputer/RedPajama-Data-Instruct
Dataset Summary
RedPajama-Instruct-Data is curated from a diverse collection of NLP tasks from both P3 (BigScience) and Natural Instruction (AI2),
with aggressive decontamination against HELM
conducted in two steps: (1) We first conduct semanti...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: togethercomputer
- Year: 2023