tomg-group-umd/pixelprose

Image To Text, Text To Image, Visual Question Answering, EN

tomg-group-umd/pixelprose is an image-to-text dataset in English (EN) from tomg-group-umd, distributed in Parquet format.

About tomg-group-umd/pixelprose

From Pixels to Prose: A Large Dataset of Dense Image Captions

[ arXiv paper ] | [ 🌮 image tars ]

PixelProse is a comprehensive dataset of over 16 million synthetically generated captions, leveraging cutting-edge vision-language models (Gemi...

Details

Task
Image To Text, Text To Image, Visual Question Answering
Language
EN
Format
Parquet
Rows / instances
N/A
Creator
tomg-group-umd
Year
2024