armanc/scientific_papers

Summarization, EN

armanc/scientific_papers is an English-language summarization dataset with 349,128 labeled examples, distributed in Parquet format. Its license is listed as unknown, it falls in the 100K<n<1M size category, and it has been downloaded 4,360 times.

About armanc/scientific_papers

The scientific papers dataset contains two sets of long, structured documents, obtained from the ArXiv and PubMed OpenAccess repositories. Both "arxiv" and "pubmed" have two features: - article: the body of the document, paragraph...
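
As a rough illustration of working with the `article` feature, the sketch below splits an article body into paragraphs. The description above is truncated, so the paragraph separator is an assumption (a newline is used here), and both the `split_paragraphs` helper and the sample record are hypothetical; check the actual data before relying on this.

```python
from typing import Dict, List


def split_paragraphs(article: str, sep: str = "\n") -> List[str]:
    """Split an article body into non-empty, stripped paragraphs.

    ASSUMPTION: paragraphs are separated by `sep` (newline by default).
    The dataset description is truncated, so verify this on real rows.
    """
    return [p.strip() for p in article.split(sep) if p.strip()]


# Hypothetical record shaped like one row of the "arxiv" or "pubmed" subset.
# Only the `article` field is named in the description; the text is made up.
record: Dict[str, str] = {
    "article": "First paragraph of the paper.\nSecond paragraph.\n\nThird paragraph.",
}

paragraphs = split_paragraphs(record["article"])
print(len(paragraphs))  # prints 3
```

If the real separator turns out to differ, pass it through the `sep` argument rather than changing the helper.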

Details

Task: Summarization
Language: EN
Format: Parquet
Rows / instances: 349128
Size: 100K<n<1M
Creator: armanc
Year: 2022
License: unknown
Downloads: 4360
Likes: 175