HuggingFaceFW/finepdfs-edu

Text Generation | EN, DE, JA | odc-by

HuggingFaceFW/finepdfs-edu is a text generation dataset from HuggingFaceFW, published in Parquet format and tagged with the languages EN, DE, and JA. It is distributed under the odc-by license, falls in the 10M<n<100M size category, and has been downloaded about 10.3K times.
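Because the dataset is published as Parquet on the Hugging Face Hub, one plausible way to inspect it without downloading everything is to stream it with the `datasets` library. The following is a minimal sketch, not an official recipe: the helper names `preview_rows` and `stream_subset` are hypothetical, and the subset (config) name passed to `stream_subset` is an assumption that should be checked against the dataset card.

```python
from itertools import islice

# Dataset id as it appears on this page.
REPO_ID = "HuggingFaceFW/finepdfs-edu"


def preview_rows(rows, n=3):
    """Return the first n items of any iterable of rows as a list."""
    return list(islice(rows, n))


def stream_subset(config_name, split="train"):
    """Open one subset in streaming mode.

    Requires `pip install datasets` and network access. `config_name`
    is whatever subset name the dataset card lists; this page does not
    state the names, so the caller has to supply one.
    """
    # Imported lazily so preview_rows stays usable without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, name=config_name, split=split, streaming=True)
```

With the library installed, a call such as `preview_rows(stream_subset("eng_Latn"))` would fetch a few rows lazily; `"eng_Latn"` is an assumed subset name used only for illustration. Streaming avoids pulling the full Parquet files to disk, which matters for a dataset in this size category.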

About HuggingFaceFW/finepdfs-edu

📚 FinePDFs-Edu
350B+ of highly educational tokens from PDFs 📄

What is it?

📚 FinePDFs-Edu dataset consists of 350B+ tokens of educational PDFs filtered from 📄 FinePDFs dataset covering 69 languages. FinePDFs was created using the f...

Details

Task: Text Generation
Language: EN, DE, JA
Format: Parquet
Rows / instances: N/A
Size: 10M<n<100M
Creator: HuggingFaceFW
Year: 2025
License: odc-by
Downloads: 10307
Likes: 91