openbmb/InfLLM-V2-data-5B
General NLP · EN, ZH
openbmb/InfLLM-V2-data-5B is a general NLP dataset in English and Chinese (EN, ZH), distributed in Parquet format.
About openbmb/InfLLM-V2-data-5B
InfLLM-V2 Long-Context Training Dataset with 5B Tokens
Project Links: [Paper] [InfLLM-V2 Models] [CUDA Kernel Code]
🚀 About InfLLM-V2
InfLLM-V2 is a native sparse attention framework designed for the efficient processing of long-seq...
Details
- Task: General NLP
- Language: EN, ZH
- Format: Parquet
- Rows / instances: N/A
- Creator: openbmb
- Year: 2025
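
Since the details above list Parquet as the format but give no row count or schema, a quick way to inspect the data is to stream a few records rather than download everything. The following is a minimal sketch, not an official loading recipe. It assumes the identifier `openbmb/InfLLM-V2-data-5B` resolves as a repository on the Hugging Face Hub and that the third-party `datasets` package is installed. The helper names `peek_records` and `describe` are hypothetical, and no column or split names are assumed because this page does not list them.

```python
from itertools import islice

REPO_ID = "openbmb/InfLLM-V2-data-5B"  # identifier taken from the page title


def peek_records(repo_id=REPO_ID, n=2):
    """Stream the first n records of the first available split.

    Assumption: repo_id resolves on the Hugging Face Hub and the
    third-party `datasets` package is installed.
    """
    from datasets import load_dataset  # lazy import of a third-party dependency

    # Without a `split` argument, streaming mode returns a mapping of
    # split name to iterable dataset, so nothing is fully downloaded.
    splits = load_dataset(repo_id, streaming=True)
    first_split = next(iter(splits))  # split names are not listed on this page
    return list(islice(splits[first_split], n))


def describe(record):
    """Return the sorted field names of one record, assuming no fixed schema."""
    return sorted(record.keys())


if __name__ == "__main__":
    for rec in peek_records():
        print(describe(rec))
```

If the repository defines several configurations, `load_dataset` would also need a configuration name, which this page does not provide.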