
HuggingFaceTB/finemath

General NLP · English · odc-by

The HuggingFaceTB/finemath dataset is an English General NLP resource published by HuggingFaceTB in 2024, comprising 48,283,984 examples. It has 22.8K downloads and 368 likes. It is released under the odc-by license and falls in the 10M<n<100M size category.

About HuggingFaceTB/finemath

šŸ“ FineMath What is it? šŸ“ FineMath consists of 34B tokens (FineMath-3+) and 54B tokens (FineMath-3+ with InfiMM-WebMath-3+) of mathematical educational content filtered from CommonCrawl. To curate this dataset, we trained a mathemati...

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: 48,283,984
Size: 10M<n<100M
Creator: HuggingFaceTB
Year: 2024
License: odc-by
Downloads: 22,764
Likes: 368
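
The Size field is a bucketed form of the row count. As a minimal sketch of how the two relate, the helper below maps a row count to a size label. The function name `size_category` and the bucket boundaries are assumptions chosen to match the label format shown above; they are not taken from this page or from any library, and everything above one billion rows is collapsed into a single label for brevity.

```python
def size_category(n_rows: int) -> str:
    """Map a row count to a size-category label such as '10M<n<100M'.

    Hypothetical helper: the boundaries below are assumed, not sourced.
    """
    # Each entry is (exclusive upper bound, label for counts below it).
    bounds = [
        (1_000, "n<1K"),
        (10_000, "1K<n<10K"),
        (100_000, "10K<n<100K"),
        (1_000_000, "100K<n<1M"),
        (10_000_000, "1M<n<10M"),
        (100_000_000, "10M<n<100M"),
        (1_000_000_000, "100M<n<1B"),
    ]
    for upper, label in bounds:
        if n_rows < upper:
            return label
    # Simplification: larger buckets are not distinguished here.
    return "n>1B"


# Row count listed for HuggingFaceTB/finemath in the details above.
print(size_category(48_283_984))  # prints "10M<n<100M"
```

With 48,283,984 rows, the dataset sits between ten million and one hundred million instances, which agrees with the Size value listed.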
