HuggingFaceFW/finepdfs-edu
Text Generation | EN, DE, JA | odc-by
HuggingFaceFW/finepdfs-edu is a text generation dataset from HuggingFaceFW, available in EN, DE, and JA and stored in Parquet format. It is distributed under the odc-by license, falls in the 10M<n<100M size category, and has been downloaded 10.3K times.
About HuggingFaceFW/finepdfs-edu
FinePDFs-Edu
350B+ of highly educational tokens from PDFs
What is it?
The FinePDFs-Edu dataset consists of 350B+ tokens of educational PDFs filtered from the FinePDFs dataset covering 69 languages.
FinePDFs was created using the f...
Details
- Task: Text Generation
- Language: EN, DE, JA
- Format: Parquet
- Rows / instances: N/A
- Size: 10M<n<100M
- Creator: HuggingFaceFW
- Year: 2025
- License: odc-by
- Downloads: 10307
- Likes: 91
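
Because the details above list Parquet as the format and the dataset is large, streaming a few rows is probably the easiest way to inspect it. The following is a minimal sketch, not an official recipe: it assumes the Hugging Face `datasets` library is installed, and the subset name `eng_Latn` and the split name `train` are assumptions that do not appear on this page and should be checked against the dataset repository. The helper name `peek` is hypothetical.

```python
from itertools import islice

REPO_ID = "HuggingFaceFW/finepdfs-edu"  # repository name taken from this page
SUBSET = "eng_Latn"  # ASSUMPTION: subset/config name, not stated on this page
SPLIT = "train"      # ASSUMPTION: split name, not stated on this page


def peek(repo_id=REPO_ID, subset=SUBSET, split=SPLIT, n=3):
    """Return the first n rows by streaming, without a full download."""
    # Imported lazily so this module loads even if `datasets` is not installed.
    from datasets import load_dataset

    # streaming=True yields rows on demand instead of fetching every Parquet file.
    stream = load_dataset(repo_id, name=subset, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `peek()` requires network access. Column names are not listed on this page, so it is safer to inspect `row.keys()` on the returned rows than to assume a particular field.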