osunlp/SMolInstruct

General NLP, EN

The osunlp/SMolInstruct dataset is an English-language General NLP resource published by osunlp in 2024.

About osunlp/SMolInstruct

SMolInstruct is a large-scale instruction tuning dataset for chemistry that centers on small molecules. It contains 14 chemistry tasks and over 3 million samples, and is designed to be large-scale, comprehensive, and high-quality.

Details

Task: General NLP
Language: EN
Format: Parquet
Rows / instances: N/A
Creator: osunlp
Year: 2024