Kaichengalex/YFCC15M

General NLP | English

The Kaichengalex/YFCC15M dataset is an English General NLP resource published by Kaichengalex in 2024. The listing reports 10,000 rows and a size category of 10M<n<100M, with 13.8K downloads and 8 likes.

About Kaichengalex/YFCC15M

YFCC15M Recaption Dataset. This YFCC15M dataset is filtered by DeCLIP and recaptioned using the diverse description generation framework proposed in RWKV-CLIP. The text is a list of text tokens with a length of 77, encoded using the CLIP token...
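
To make the fixed-length text field concrete, here is a minimal sketch of how a caption's token ids could be wrapped and padded to 77 entries. It is an illustration under stated assumptions, not the dataset author's preprocessing: the start-of-text id 49406, end-of-text id 49407, and zero padding follow the convention of OpenAI's CLIP tokenizer, and the helper names `to_fixed_length` and `is_valid_text_field` as well as the sample ids are hypothetical.

```python
CONTEXT_LENGTH = 77  # sequence length stated in the dataset description
SOT_ID = 49406       # assumption: start-of-text id in OpenAI's CLIP vocabulary
EOT_ID = 49407       # assumption: end-of-text id in OpenAI's CLIP vocabulary
PAD_ID = 0           # assumption: positions after end-of-text are zero-padded


def to_fixed_length(token_ids, context_length=CONTEXT_LENGTH):
    """Wrap token ids with start/end markers, then truncate or pad to a fixed length."""
    # Reserve two slots for the start and end markers.
    body = list(token_ids)[: context_length - 2]
    seq = [SOT_ID] + body + [EOT_ID]
    return seq + [PAD_ID] * (context_length - len(seq))


def is_valid_text_field(tokens, context_length=CONTEXT_LENGTH):
    """Check that a record's text field is a list of exactly `context_length` integers."""
    return len(tokens) == context_length and all(isinstance(t, int) for t in tokens)


# Hypothetical token ids standing in for a short caption.
example = to_fixed_length([320, 1125, 539])
print(len(example), is_valid_text_field(example))
```

A check like `is_valid_text_field` can be run over a few records after loading the Parquet files to confirm that the text column matches the described shape before feeding it to a model.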

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: 10000
Size: 10M<n<100M
Creator: Kaichengalex
Year: 2024
Downloads: 13769
Likes: 8
