
sapientinc/HRM-Text-data-io-cleaned-20260515

Text Generation | EN

sapientinc/HRM-Text-data-io-cleaned-20260515 is an English (EN) text generation dataset published by sapientinc in 2026.

About sapientinc/HRM-Text-data-io-cleaned-20260515

Pre-built HRM-Text pretraining dataset, produced from raw data using the data_io cleaning scripts.

Citation

If you find this project or our paper useful, please consider citing our paper: @misc{wang2026hrmtextefficientpretrainingscaling, ti...

Details

Task: Text Generation
Language: EN
Format: Parquet
Rows / instances: N/A
Creator: sapientinc
Year: 2026
