BLIP-2 (Q-Former)

Salesforce Research · Visual question answering · Image captioning · Open weights

BLIP-2 (Q-Former) is a visual question answering and image captioning model published by Salesforce Research in 2023, with about 1.48 billion parameters.

About BLIP-2 (Q-Former)

The cost of vision-and-language pre-training has become increasingly prohibitive due to end-to-end training of large-scale models. This paper proposes BLIP-2, a generic and efficient pre-training strategy that bootstraps vision-language pre-training from off-the-shelf frozen pre-trained image encoders and frozen large language models.

Details

Provider
Salesforce Research
Task
Visual question answering, Image captioning
Parameters
1,480,000,000 (about 1.48 billion)
Released
2023-01-30
Open weights
Yes