BIOMEDICA/biomedica_webdataset_24M
General NLP | English
BIOMEDICA/biomedica_webdataset_24M is an English-language General NLP dataset from BIOMEDICA, distributed in Parquet format. It falls in the n>1T size category and has been downloaded about 2.5K times.
About BIOMEDICA/biomedica_webdataset_24M
Arxiv: Arxiv | Website: Biomedica | Training instructions: OpenCLIP | Tutorial: Google Colab
BIOMEDICA Dataset is a large-scale, deep-learning-ready biomedi...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Size: n>1T
- Creator: BIOMEDICA
- Year: 2025
- Downloads: 2538
- Likes: 35