Salmonnn/cc12

Image To Text | English

Salmonnn/cc12 is an English image-to-text dataset from Salmonnn, distributed in Parquet format.

About Salmonnn/cc12

Dataset Card for Conceptual Captions 12M (CC12M)

Dataset Summary

Conceptual 12M (CC12M) is a dataset with 12 million image-text pairs specifically meant to be used for vision-and-language pre-training. Its data collection pipeline...

Details

Task: Image To Text
Language: English
Format: Parquet
Rows / instances: N/A
Creator: Salmonnn
Year: 2025
