nvidia/describe-anything-dataset

Image To Text, Video Text To Text, EN

Created by nvidia in 2025, nvidia/describe-anything-dataset is an English-language dataset for image-to-text and video-text-to-text tasks, distributed in Parquet format.

About nvidia/describe-anything-dataset

Describe Anything: Detailed Localized Image and Video Captioning
NVIDIA, UC Berkeley, UCSF
Long Lian, Yifan Ding, Yunhao Ge, Sifei Liu, Hanzi Mao, Boyi Li, Marco Pavone, Ming-Yu Liu, Trevor Darrell, Adam Yala, Yin Cui
[Paper] | [Code] | [Projec...

Details

Task: Image To Text, Video Text To Text
Language: EN
Format: Parquet
Rows / instances: N/A
Creator: nvidia
Year: 2025