bsmock/pubtables-1m

General NLP, English

Created by bsmock in 2022, bsmock/pubtables-1m is an English-language General NLP dataset in Parquet format. It has 1.2K downloads and 64 likes, and is released under the cdla-permissive-2.0 license.

About bsmock/pubtables-1m

PubTables-1M
GitHub: https://github.com/microsoft/table-transformer
Paper: "PubTables-1M: Towards comprehensive table extraction from unstructured documents"
Hugging Face: Detection model, Structure recognition model
Currently we only suppo...

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Creator: bsmock
Year: 2022
License: cdla-permissive-2.0
Downloads: 1237
Likes: 64
