arcee-ai/The-Tome

General NLP | English | Benchmark | mit

The arcee-ai/The-Tome dataset is an English General NLP resource released by arcee-ai in 2024. It has 176 downloads and 107 likes. It is released under the MIT license and falls in the 1M<n<10M size category.

📊 This dataset is used as an LLM benchmark.

About arcee-ai/The-Tome

The Tome is a curated dataset designed for training large language models, with a focus on instruction following. It was used in the training of our Arcee-Nova/Spark models, which were later merged with Qwen2-72B-Instruct (or 7B with the Spark model...

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Size: 1M<n<10M
Creator: arcee-ai
Year: 2024
License: mit
Downloads: 176
Likes: 107