htriedman/grokipedia-v0.1-dump
General NLP, EN
Created by htriedman in 2025, htriedman/grokipedia-v0.1-dump is a General NLP dataset in English, distributed in Parquet format. It has 10.5K downloads and 14 likes. Its license is listed as "other", and it falls in the 10M<n<100M size category.
About htriedman/grokipedia-v0.1-dump
Grokipedia v0.1 Scrape
This dataset is a structured, nearly complete point-in-time scrape of Grokipedia v0.1, taken at the end of October / beginning of November 2025.
It also includes embeddings of 250-token semi-overlapping chunks of the Groki...
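As an illustration of what "250-token semi-overlapping chunks" could look like, here is a minimal sketch. The dataset description does not state the tokenizer or the amount of overlap, so the whitespace tokenization, the `stride` of 125 tokens, and the function name `chunk_tokens` are all assumptions made for this example, not the method used to build the dataset.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int = 250, stride: int = 125) -> List[List[str]]:
    """Split tokens into windows of up to chunk_size tokens.

    Consecutive windows start `stride` tokens apart, so with
    stride < chunk_size neighbouring chunks overlap.
    chunk_size=250 comes from the dataset description; stride=125
    is an assumed value, since the real overlap is not documented.
    """
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")

    chunks: List[List[str]] = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(tokens):
            break
        start += stride
    return chunks


# Usage: a 600-token document, tokenized by whitespace (an assumption).
text = " ".join(str(i) for i in range(600))
chunks = chunk_tokens(text.split())
print([len(c) for c in chunks])  # [250, 250, 250, 225]
```

Each chunk would then be passed to an embedding model; which model produced the embeddings in this dataset is not stated in the excerpt above.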
Details
- Task: General NLP
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Size: 10M<n<100M
- Creator: htriedman
- Year: 2025
- License: other
- Downloads: 10537
- Likes: 14