Henrychur/MMedC

General NLP | EN, ZH, JA | cc-by-nc-sa-4.0

Henrychur/MMedC is a General NLP dataset covering English, Chinese, and Japanese (EN, ZH, JA), published by Henrychur in 2024. It has 158 downloads and 37 likes. It is released under the cc-by-nc-sa-4.0 license and falls in the 10B<n<100B size category.

About Henrychur/MMedC

MMedC
GitHub Repo | arXiv Paper

The official pre-training dataset for "Towards Building Multilingual Language Model for Medicine".

News: Arabic and German corpora have been added to MMedC.

Introduction: This repo contains MMedC, ...

Details

Task: General NLP
Language: EN, ZH, JA
Format: Parquet
Rows / instances: N/A
Size: 10B<n<100B
Creator: Henrychur
Year: 2024
License: cc-by-nc-sa-4.0
Downloads: 158
Likes: 37