Upstash/wikipedia-2024-06-bge-m3

General NLP | EN, DE, ES

Upstash/wikipedia-2024-06-bge-m3 is a General NLP dataset from Upstash, listed for English, German, and Spanish (EN, DE, ES) and distributed in Parquet format.

About Upstash/wikipedia-2024-06-bge-m3

Wikipedia Embeddings with BGE-M3

This dataset contains embeddings from the June 2024 Wikipedia dump for the 11 most popular languages. The embeddings are generated with the multilingual BGE-M3 model. The dataset consists of Wikipedia articles s...
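
As a rough illustration of how precomputed embeddings like these are typically used, the sketch below ranks a few rows by cosine similarity to a query vector. It is a minimal sketch under stated assumptions, not the dataset's documented interface: the field names `title` and `embedding`, the helper `top_k`, and the three-dimensional toy vectors are all hypothetical stand-ins, and real BGE-M3 vectors are much longer.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as not similar.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Return the k (score, title) pairs most similar to the query."""
    scored = [
        (cosine_similarity(query, row["embedding"]), row["title"])
        for row in rows
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy rows standing in for records read from the Parquet files.
# The "title" and "embedding" field names are assumptions.
rows = [
    {"title": "Berlin", "embedding": [0.9, 0.1, 0.0]},
    {"title": "Madrid", "embedding": [0.1, 0.9, 0.0]},
    {"title": "Photosynthesis", "embedding": [0.0, 0.1, 0.9]},
]

# In practice the query vector would come from the same embedding model.
query = [0.8, 0.2, 0.0]

results = top_k(query, rows, k=2)
for score, title in results:
    print(f"{title}: {score:.3f}")
```

A brute-force scan like this is only practical for small samples; at the scale of a full Wikipedia dump, an approximate nearest-neighbor index or a vector database would be the usual choice.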

Details

Task: General NLP
Language: EN, DE, ES
Format: Parquet
Rows / instances: N/A
Creator: Upstash
Year: 2024