HuggingFaceFW/finewiki
Text Generation, English, cc-by-sa-4.0
HuggingFaceFW/finewiki is an English text generation dataset published by HuggingFaceFW in Parquet format. It is distributed under the cc-by-sa-4.0 license, falls in the 10M<n<100M size category, and has been downloaded 11.9K times.
About HuggingFaceFW/finewiki
This is an updated and better-extracted version of the wikimedia/Wikipedia dataset originally released in 2023. We carefully parsed Wikipedia HTML dumps from August 2025 covering 325 languages.
This dataset:
fully renders templates as it was ...
Details
- Task: Text Generation
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Size: 10M<n<100M
- Creator: HuggingFaceFW
- Year: 2025
- License: cc-by-sa-4.0
- Downloads: 11870
- Likes: 305
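
Since the dataset is published as Parquet on the Hugging Face Hub, one plausible way to inspect a few records is to stream it with the `datasets` library. The following is a minimal sketch, not an official recipe: the configuration name `"en"` and the split name `"train"` are assumptions that this listing does not confirm, and `stream_finewiki` is a hypothetical helper name. Check the dataset page for the actual configuration and column names before relying on it.

```python
from itertools import islice

DATASET_ID = "HuggingFaceFW/finewiki"  # repository name taken from the listing


def stream_finewiki(config_name="en", split="train", limit=3):
    """Yield up to `limit` rows without downloading the full dataset.

    `config_name` and `split` are assumed values, not confirmed by the listing.
    """
    # Imported lazily so the module loads even if `datasets` is not installed.
    from datasets import load_dataset

    # streaming=True returns an iterable dataset that fetches rows on demand.
    rows = load_dataset(DATASET_ID, name=config_name, split=split, streaming=True)
    yield from islice(rows, limit)
```

Calling `for row in stream_finewiki(limit=2): print(sorted(row))` would list the column names of the first two rows, assuming network access and that the assumed configuration exists. Streaming is chosen here because the size category (10M<n<100M) makes a full download expensive for a quick look.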