HuggingFaceTB/finemath
General NLP, English, odc-by
The HuggingFaceTB/finemath dataset is an English General NLP resource published by HuggingFaceTB in 2024, comprising 48,283,984 examples. It has 22.8K downloads and 368 likes. It is released under the odc-by license and falls in the 10M<n<100M size category.
About HuggingFaceTB/finemath
FineMath
What is it?
FineMath consists of 34B tokens (FineMath-3+) and 54B tokens (FineMath-3+ with InfiMM-WebMath-3+) of mathematical educational content filtered from CommonCrawl. To curate this dataset, we trained a mathemati...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: 48,283,984
- Size: 10M<n<100M
- Creator: HuggingFaceTB
- Year: 2024
- License: odc-by
- Downloads: 22,764
- Likes: 368
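
With roughly 48 million Parquet rows, streaming a few records is usually more practical than downloading everything first. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the subset is named `finemath-3plus` (an assumption not stated in this listing; check the dataset card). The `size_category` helper and its bucket labels are illustrative, written only to show how the listed row count maps to the listed size category.

```python
from itertools import islice
from typing import Any, Dict, Iterator

DATASET_ID = "HuggingFaceTB/finemath"  # taken from the listing
CONFIG_NAME = "finemath-3plus"  # assumption: verify on the dataset card

# Illustrative size buckets (upper bound is exclusive), modeled on the
# "10M<n<100M" label shown in the listing.
_BUCKETS = [
    (1_000, "n<1K"),
    (10_000, "1K<n<10K"),
    (100_000, "10K<n<100K"),
    (1_000_000, "100K<n<1M"),
    (10_000_000, "1M<n<10M"),
    (100_000_000, "10M<n<100M"),
    (1_000_000_000, "100M<n<1B"),
]


def size_category(n_rows: int) -> str:
    """Return the size bucket label for a given row count."""
    for upper_bound, label in _BUCKETS:
        if n_rows < upper_bound:
            return label
    return "n>1B"


def stream_examples(limit: int = 3) -> Iterator[Dict[str, Any]]:
    """Yield the first `limit` records without downloading the full dataset."""
    # Imported lazily so the rest of this sketch runs without `datasets`.
    from datasets import load_dataset

    stream = load_dataset(DATASET_ID, CONFIG_NAME, split="train", streaming=True)
    yield from islice(stream, limit)


if __name__ == "__main__":
    # Pure computation: no network access needed.
    print(size_category(48_283_984))
```

Calling `stream_examples()` needs network access and the `datasets` package; inspect the keys of the first record before relying on any particular column name, since this listing does not describe the schema.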