nvidia/describe-anything-dataset
Image To Text | Video Text To Text | EN
Created by nvidia in 2025, nvidia/describe-anything-dataset is an image-to-text and video-text-to-text dataset in English (EN), distributed in Parquet format.
About nvidia/describe-anything-dataset
Describe Anything: Detailed Localized Image and Video Captioning
NVIDIA, UC Berkeley, UCSF
Long Lian, Yifan Ding, Yunhao Ge, Sifei Liu, Hanzi Mao, Boyi Li, Marco Pavone, Ming-Yu Liu, Trevor Darrell, Adam Yala, Yin Cui
[Paper] | [Code] | [Projec...
Details
- Task: Image To Text, Video Text To Text
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: nvidia
- Year: 2025
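Since the details above list the format as Parquet but give no row count, a reader may want to inspect the files directly. The following is a minimal sketch, assuming the dataset is hosted on the Hugging Face Hub under the repository id `nvidia/describe-anything-dataset` and that the `huggingface_hub` package is installed. The helper names `parquet_paths` and `list_dataset_parquet_files` are hypothetical, introduced here for illustration only, and the repository layout is not described in this page.

```python
from typing import Iterable, List


def parquet_paths(filenames: Iterable[str]) -> List[str]:
    """Keep only paths ending in .parquet (case-insensitive), sorted for stable output."""
    return sorted(name for name in filenames if name.lower().endswith(".parquet"))


def list_dataset_parquet_files(
    repo_id: str = "nvidia/describe-anything-dataset",  # assumed Hub repository id
) -> List[str]:
    """List the Parquet files in a Hub dataset repository (requires network access)."""
    # Imported lazily so parquet_paths() stays usable without the dependency.
    from huggingface_hub import list_repo_files

    # repo_type="dataset" tells the Hub client to look in the datasets namespace.
    return parquet_paths(list_repo_files(repo_id, repo_type="dataset"))
```

Calling `list_dataset_parquet_files()` would return the Parquet paths found in the repository, if any. Access may require accepting the dataset's terms or supplying a token, which this sketch does not handle.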