armanc/scientific_papers
Summarization, EN
armanc/scientific_papers is an English-language (EN) summarization dataset that provides 349,128 labeled examples in Parquet format. Its license is listed as unknown, it falls in the 100K<n<1M size category, and it has been downloaded about 4.4K times.
About armanc/scientific_papers
The scientific papers dataset contains two sets of long, structured documents.
Both sets are obtained from the arXiv and PubMed OpenAccess repositories.
Both "arxiv" and "pubmed" have two features:
- article: the body of the document, paragraph...
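As a rough illustration of how the two sets might be accessed, here is a minimal sketch using the Hugging Face `datasets` library. The helper names `load_config` and `article_of`, the `"train"` split, and the use of streaming are assumptions made for this example and are not stated on this page; only the dataset ID, the "arxiv" and "pubmed" names, and the `article` feature come from the description above.

```python
from typing import Any, Dict

DATASET_ID = "armanc/scientific_papers"
CONFIGS = ("arxiv", "pubmed")  # the two sets named in the description


def load_config(name: str, split: str = "train", streaming: bool = True) -> Any:
    """Load one of the two sets (hypothetical helper).

    The "train" split name is an assumption; check the dataset page
    for the splits that are actually available.
    """
    if name not in CONFIGS:
        raise ValueError(f"unknown set {name!r}; expected one of {CONFIGS}")
    # Imported lazily so the module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, name, split=split, streaming=streaming)


def article_of(example: Dict[str, Any]) -> str:
    """Return the document body stored under the `article` feature."""
    return example["article"]
```

With network access and the `datasets` package installed, something like `article_of(next(iter(load_config("arxiv"))))` should return the body of the first document, assuming the split name above is correct.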
Details
- Task: Summarization
- Language: EN
- Format: Parquet
- Rows / instances: 349,128
- Size: 100K<n<1M
- Creator: armanc
- Year: 2022
- License: unknown
- Downloads: 4,360
- Likes: 175