
HuggingFaceFW/finewiki

Text Generation | English | cc-by-sa-4.0

HuggingFaceFW/finewiki is an English text generation dataset published by HuggingFaceFW in Parquet format. It is distributed under the cc-by-sa-4.0 license, falls in the 10M<n<100M size category, and has been downloaded 11.9K times.

About HuggingFaceFW/finewiki

This is an updated and better-extracted version of the wikimedia/Wikipedia dataset originally released in 2023. We carefully parsed Wikipedia HTML dumps from August 2025 covering 325 languages. This dataset: fully renders templates as it was ...

Details

Task: Text Generation
Language: English
Format: Parquet
Rows / instances: N/A
Size: 10M<n<100M
Creator: HuggingFaceFW
Year: 2025
License: cc-by-sa-4.0
Downloads: 11870
Likes: 305
