Salmonnn/cc12
Image To Text, English
Salmonnn/cc12 is an image-to-text dataset in English from Salmonnn, in Parquet format.
About Salmonnn/cc12
Dataset Card for Conceptual Captions 12M (CC12M)
Dataset Summary
Conceptual 12M (CC12M) is a dataset of 12 million image-text pairs intended for vision-and-language pre-training.
Its data collection pipeline...
Details
- Task: Image To Text
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: Salmonnn
- Year: 2025