
ccdv/cnn_dailymail

Summarization, Text Generation, EN, apache-2.0

The ccdv/cnn_dailymail dataset is an English (EN) summarization resource published by ccdv in 2022. It has 39.4K downloads and 33 likes. It is released under the apache-2.0 license and falls in the 100K<n<1M size category.

About ccdv/cnn_dailymail

CNN/DailyMail non-anonymized summarization dataset. There are two features:
- article: text of news article, used as the document to be summarized
- highlights: joined text of highlights with <s> and </s> around each highlight, which is t...
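The highlights format described above can be unpacked with a few lines of Python. The following is a minimal sketch, assuming each highlight is wrapped exactly as `<s> ... </s>` within one string. The helper name `split_highlights` and the sample record are hypothetical and are not part of the dataset or any library.

```python
import re
from typing import List

# Assumption: every highlight is delimited by a literal "<s>" and "</s>" pair,
# as stated in the dataset description.
_HIGHLIGHT_RE = re.compile(r"<s>(.*?)</s>", re.DOTALL)


def split_highlights(highlights: str) -> List[str]:
    """Split a joined highlights string into individual highlight sentences."""
    parts = [m.strip() for m in _HIGHLIGHT_RE.findall(highlights)]
    if not parts:
        # Fallback (assumption): if no markers are present, treat each
        # non-empty line as one highlight.
        parts = [line.strip() for line in highlights.splitlines()]
    return [p for p in parts if p]


# Hypothetical record with the two documented features.
record = {
    "article": "A made-up news article body used only for illustration.",
    "highlights": "<s> First key point . </s> <s> Second key point . </s>",
}

for sentence in split_highlights(record["highlights"]):
    print(sentence)
```

Real records would typically come from the Parquet files listed under Details; the parsing step stays the same as long as the `<s>` and `</s>` markers are present.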

Details

Task: Summarization, Text Generation
Language: EN
Format: Parquet
Rows / instances: N/A
Size: 100K<n<1M
Creator: ccdv
Year: 2022
License: apache-2.0
Downloads: 39429
Likes: 33
