miriad/miriad-5.8M
General NLP, English
miriad/miriad-5.8M is an English-language dataset categorized as General NLP and distributed in Parquet format.
About miriad/miriad-5.8M
Dataset Summary
MIRIAD is a curated, million-scale Medical Instruction and RetrIeval Dataset. It contains 5.8 million medical question-answer pairs distilled from peer-reviewed biomedical literature using LLMs. MIRIAD provides structured, high-...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: miriad
- Year: 2025
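
Because the data is distributed as Parquet, a few rows can be inspected without downloading all 5.8 million pairs. The following is a minimal sketch, not an official loading recipe. It assumes the dataset is hosted on the Hugging Face Hub under the identifier shown on this page, that the `datasets` package is installed, and that a split named `train` exists. The helper name `peek_rows` is hypothetical.

```python
from itertools import islice

# Dataset identifier as given on this page; Hub hosting is an assumption.
REPO_ID = "miriad/miriad-5.8M"


def peek_rows(n=3, split="train"):
    """Stream the first n rows instead of downloading the full dataset.

    Assumptions: the Hugging Face `datasets` package is installed, the
    dataset is reachable under REPO_ID, and the named split exists.
    """
    # Imported lazily so this module still loads where `datasets` is absent.
    from datasets import load_dataset

    # streaming=True yields rows one at a time over the network.
    stream = load_dataset(REPO_ID, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `peek_rows(3)` should return a list of up to three dictionaries, one per row. The column names are not listed on this page, so inspect `row.keys()` rather than relying on specific field names.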