Salesforce/wikitext
Text Generation, Fill Mask, EN
Created by Salesforce in 2026, Salesforce/wikitext is an English text generation dataset containing 3,708,608 records in Parquet format.
About Salesforce/wikitext
Dataset Card for "wikitext"
Dataset Summary
The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified
Good and Featured articles on Wikipedia. The dataset is availa...
Details
- Task: Text Generation, Fill Mask
- Language: EN
- Format: Parquet
- Rows / instances: 3,708,608
- Creator: Salesforce
- Year: 2026
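
As a rough illustration of how a dataset like this might be loaded and measured, the sketch below uses the Hugging Face `datasets` library. The config name `wikitext-2-raw-v1`, the `train` split, and the `text` column are assumptions that do not appear in the listing above, and the helper names `count_tokens` and `load_wikitext_train` are hypothetical. Splitting on whitespace is only an approximation of a token count.

```python
from typing import Iterable


def count_tokens(lines: Iterable[str]) -> int:
    """Count whitespace-separated tokens across an iterable of text lines."""
    return sum(len(line.split()) for line in lines)


def load_wikitext_train(config: str = "wikitext-2-raw-v1"):
    """Load the train split from the Hugging Face Hub (needs network access).

    The config name and split are assumptions, not taken from the listing.
    """
    # Imported lazily so count_tokens works without the optional dependency.
    from datasets import load_dataset

    return load_dataset("Salesforce/wikitext", config, split="train")
```

With network access and `datasets` installed, `count_tokens(load_wikitext_train()["text"])` would give an approximate token count for that split, assuming the text lives in a column named `text`.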