Kaichengalex/YFCC15M
General NLP | English
The Kaichengalex/YFCC15M dataset is an English General NLP resource published by Kaichengalex in 2024. The listing reports 10,000 rows and places the dataset in the 10M<n<100M size category. It has 13.8K downloads and 8 likes.
About Kaichengalex/YFCC15M
YFCC15M Recaption Dataset
This YFCC15M dataset is filtered by DeCLIP and recaptioned using the diverse description generation framework proposed in RWKV-CLIP.
The text is a list of text tokens with a length of 77, encoded using the CLIP token...
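To make the fixed length of 77 concrete, here is a minimal sketch of how a variable-length list of token ids can be brought to exactly 77 entries. It is not the dataset author's code. The helper name `to_fixed_length`, the start and end token ids (49406 and 49407, the values used in OpenAI CLIP's BPE vocabulary), and zero padding are all assumptions made for illustration; the dataset's actual encoding may differ.

```python
from typing import List

CONTEXT_LENGTH = 77  # token list length stated in the dataset description
SOT_ID = 49406       # assumption: start-of-text id in OpenAI CLIP's BPE vocabulary
EOT_ID = 49407       # assumption: end-of-text id in the same vocabulary
PAD_ID = 0           # assumption: zero padding; the dataset may pad differently


def to_fixed_length(token_ids: List[int], context_length: int = CONTEXT_LENGTH) -> List[int]:
    """Wrap ids in start/end markers, then truncate or pad to context_length."""
    # Reserve two slots for the start and end markers.
    body = token_ids[: context_length - 2]
    sequence = [SOT_ID] + body + [EOT_ID]
    # Right-pad with PAD_ID until the list has exactly context_length entries.
    return sequence + [PAD_ID] * (context_length - len(sequence))


# Hypothetical ids standing in for an already-tokenized caption.
example = to_fixed_length([320, 1125])
```

Short captions are right-padded and long captions are truncated, so every row has the same shape and can be stacked into a batch without further processing.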
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: 10,000
- Size: 10M<n<100M
- Creator: Kaichengalex
- Year: 2024
- Downloads: 13,769
- Likes: 8