princeton-nlp/prolong-data-64K
General NLP, EN
princeton-nlp/prolong-data-64K is an English-language general NLP dataset distributed in Parquet format.
About princeton-nlp/prolong-data-64K
ProLong (Princeton long-context language models) is a family of long-context models obtained from Llama-3-8B through continued training and supervised fine-tuning, with a maximum context windo...
Details
- Task: General NLP
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: princeton-nlp
- Year: 2024