Skip to content

philschmid/amazon-product-descriptions-vlm

Image To TextEN

Philschmid/amazon-product-descriptions-vlm is a image to text-focused dataset in EN distributed in Parquet format.

About philschmid/amazon-product-descriptions-vlm

Amazon Multimodal Product dataset This is a modfied and slim verison of bprateek/amazon_product_description helpful to get started training multimodal LLMs. The description field was generated used Gemini Flash.

Details

Task
Image To Text
Language
EN
Format
Parquet
Rows / instances
N/A
Creator
philschmid
Year
2024
Download

Related Image To Text datasets

FAQ