huggan/smithsonian_butterflies_subset

General NLP, English

The huggan/smithsonian_butterflies_subset dataset is an English General NLP resource published by huggan in 2022. It has about 2K downloads and 56 likes, and falls in the 1K<n<10K size category.

About huggan/smithsonian_butterflies_subset

This is a subset of the "ceyda/smithsonian_butterflies" dataset, with additional processing done to train the "ceyda/butterfly_gan" model. The preprocessing includes: adding a "sim_score" to images with a CLIP model using the prompts "pretty butterfly","one butterfly","...
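
The description above does not say how "sim_score" is computed, so the following is only a minimal sketch of one plausible approach: averaging the cosine similarity between an image embedding and the embedding of each text prompt. The function names, the averaging step, and the toy 3-dimensional vectors are assumptions made for illustration; real CLIP embeddings would come from a CLIP model and are much larger.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sim_score(image_embedding, prompt_embeddings):
    """Hypothetical score: mean cosine similarity of one image
    embedding to every prompt embedding (an assumption, not the
    dataset author's confirmed method)."""
    sims = [cosine_similarity(image_embedding, p) for p in prompt_embeddings]
    return sum(sims) / len(sims)


# Toy vectors standing in for CLIP embeddings (made up for illustration).
prompts = {
    "pretty butterfly": [1.0, 0.0, 0.0],
    "one butterfly": [0.0, 1.0, 0.0],
}
image = [1.0, 1.0, 0.0]

score = sim_score(image, list(prompts.values()))
print(f"sim_score = {score:.4f}")
```

A score like this could then be used to filter or rank images before training, although the listing does not state how the dataset author applied it.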

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Size: 1K<n<10K
Creator: huggan
Year: 2022
Downloads: 2018
Likes: 56