htriedman/grokipedia-v0.1-dump

General NLP, EN

Created by htriedman in 2025, htriedman/grokipedia-v0.1-dump is a General NLP dataset in English, distributed in Parquet format. It has about 10.5K downloads and 14 likes. Its license is listed as "other", and it falls in the 10M<n<100M size category.

About htriedman/grokipedia-v0.1-dump

Grokipedia v0.1 Scrape

This dataset represents a structured, nearly full point-in-time scrape of Grokipedia v0.1 as of late October / early November 2025. It also includes embeddings of 250-token semi-overlapping chunks of the Groki...
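
To illustrate what "250-token semi-overlapping chunks" can look like in practice, here is a minimal sketch. It is not the dataset author's code: the 125-token stride (50% overlap), the whitespace tokenizer, and the function name `chunk_tokens` are all assumptions made for illustration, since the description does not state the tokenizer or the exact overlap.

```python
from typing import List


def chunk_tokens(tokens: List[str], size: int = 250, stride: int = 125) -> List[List[str]]:
    """Split a token list into windows of `size` tokens that start every `stride` tokens.

    Hypothetical helper: size=250 comes from the dataset description;
    stride=125 (half-window overlap) is an assumed value.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    chunks: List[List[str]] = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + size])
        # Stop once the current window already reaches the end of the text.
        if start + size >= len(tokens):
            break
        start += stride
    return chunks


if __name__ == "__main__":
    # Whitespace splitting is a stand-in for a real tokenizer (an assumption).
    text = " ".join(f"tok{i}" for i in range(600))
    for i, chunk in enumerate(chunk_tokens(text.split())):
        print(i, chunk[0], chunk[-1], len(chunk))
```

With a stride smaller than the window size, each chunk shares its tail with the head of the next one, so a sentence that straddles a boundary still appears whole in at least one chunk before embedding.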

Details

Task: General NLP
Language: EN
Format: Parquet
Rows / instances: N/A
Size: 10M<n<100M
Creator: htriedman
Year: 2025
License: other
Downloads: 10537
Likes: 14