philschmid/amazon-product-descriptions-vlm
Image To TextEN
Philschmid/amazon-product-descriptions-vlm is a image to text-focused dataset in EN distributed in Parquet format.
About philschmid/amazon-product-descriptions-vlm
Amazon Multimodal Product dataset
This is a modfied and slim verison of bprateek/amazon_product_description helpful to get started training multimodal LLMs.
The description field was generated used Gemini Flash.
Details
- Task
- Image To Text
- Language
- EN
- Format
- Parquet
- Rows / instances
- N/A
- Creator
- philschmid
- Year
- 2024