
Skylion007/openwebtext

Text Generation, Fill Mask, EN

Skylion007/openwebtext is an English-language (EN) dataset for text generation and fill-mask tasks, distributed in Parquet format.

About Skylion007/openwebtext

Dataset Card for "openwebtext"

Dataset Summary

An open-source replication of the WebText dataset from OpenAI, which was used to train GPT-2. This distribution was created by Aaron Gokaslan and Vanya Cohen of Brown University. ...

Details

Task: Text Generation, Fill Mask
Language: EN
Format: Parquet
Rows / instances: N/A
Creator: Skylion007
Year: 2022
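
As a rough illustration of how a dataset with this identifier might be read, the sketch below streams the train split with the Hugging Face `datasets` library instead of downloading every Parquet file first. It rests on several assumptions not stated on this page: that the `datasets` package is installed, that network access is available, that a `train` split exists, and that each row carries a `text` field. The helper names `preview_texts` and `stream_openwebtext` are hypothetical and exist only for this example.

```python
from itertools import islice


def preview_texts(rows, n=3, width=80):
    """Return the 'text' field of the first n rows, each cut to width characters.

    `rows` is any iterable of dicts; the 'text' key is an assumption about
    the dataset's schema, not something this page confirms.
    """
    return [row["text"][:width] for row in islice(rows, n)]


def stream_openwebtext():
    """Open the train split in streaming mode (needs `datasets` and network access)."""
    # Imported lazily so preview_texts can be used without the library installed.
    from datasets import load_dataset

    return load_dataset("Skylion007/openwebtext", split="train", streaming=True)


# Example usage (not run here because it contacts the network):
#     for snippet in preview_texts(stream_openwebtext()):
#         print(snippet)
```

Streaming keeps the preview cheap because only the first few rows are fetched; dropping `streaming=True` would instead download and cache the full split.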