LLM360/TxT360
Text Generation, EN
LLM360/TxT360 is an English-language (EN) text generation dataset from LLM360, distributed in Parquet format.
About LLM360/TxT360
TxT360: A Top-Quality LLM Pre-training Dataset Requires the Perfect Blend
Changelog
Version | Details
v1.1 | Added new data sources: TxT360_BestOfWeb, TxT360_QA, europarl-aligned, and wikipedia_extended.
...
Details
- Task: Text Generation
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: LLM360
- Year: 2026