
jingyaogong/minimind-v_dataset

Visual Question Answering · ZH, EN · apache-2.0

jingyaogong/minimind-v_dataset is a visual question answering dataset in Chinese (ZH) and English (EN), distributed in Parquet format under the apache-2.0 license. It is listed in the n<1K size category and has been downloaded about 1.4K times.

About jingyaogong/minimind-v_dataset

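Ⅰ Dataset. All image-text data used in this round of training comes from the ALLaVA-4V series. Compared with earlier data stitched together from several LLaVA-derived sets, ALLaVA-4V is more uniform in quality, provides native Chinese-English parallel text, and offers richer fine-grained descriptions. It consists of two sub-sources: high-quality images selected from LAION (mostly natural images), and images selected from the VFLAN instruction stream (mostly documents, charts, and synthetic scenes). Pretrain (pretrain_i2t.parquet, about 1.27 million records / ~640K unique images) ALLaVA-Caption-LA...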
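The pretrain figures above (about 1.27 million records over ~640K unique images) work out to roughly two records per image. A minimal sketch of how such a summary could be computed is shown below. It assumes each record is a mapping with an image identifier field; the name `image_id`, the helper `summarize_pairs`, and the toy rows are hypothetical and are not taken from the dataset's actual schema.

```python
from typing import Hashable, Iterable, Mapping, Set, Tuple


def summarize_pairs(
    rows: Iterable[Mapping[str, Hashable]],
    image_key: str = "image_id",  # hypothetical field name, not the real schema
) -> Tuple[int, int, float]:
    """Return (record count, unique image count, records per image)."""
    total = 0
    unique: Set[Hashable] = set()
    for row in rows:
        total += 1
        unique.add(row[image_key])
    # Guard against an empty input so the ratio is always defined.
    ratio = total / len(unique) if unique else 0.0
    return total, len(unique), ratio


# Toy rows standing in for records loaded from a Parquet file.
toy_rows = [
    {"image_id": "a", "text": "caption 1"},
    {"image_id": "a", "text": "caption 2"},
    {"image_id": "b", "text": "caption 3"},
    {"image_id": "b", "text": "caption 4"},
]

print(summarize_pairs(toy_rows))  # (4, 2, 2.0)
```

With real data, the rows could come from any Parquet reader (for example, records produced from a `pandas.read_parquet` result), with `image_key` set to whichever column actually identifies an image.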

Details

Task: Visual Question Answering
Language: ZH, EN
Format: Parquet
Rows / instances: N/A
Size: n<1K
Creator: jingyaogong
Year: 2024
License: apache-2.0
Downloads: 1431
Likes: 36
