BIOMEDICA/biomedica_webdataset_24M

General NLP | English

BIOMEDICA/biomedica_webdataset_24M is a General NLP dataset in English from BIOMEDICA in Parquet format. It falls in the n>1T size category and has been downloaded about 2.5K times.

About BIOMEDICA/biomedica_webdataset_24M

Arxiv: Arxiv | Website: Biomedica | Training instructions: OpenCLIP | Tutorial: Google Colab

BIOMEDICA Dataset is a large-scale, deep-learning-ready biomedi...

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Size: n>1T
Creator: BIOMEDICA
Year: 2025
Downloads: 2538
Likes: 35