3 months ago

TorchicTab: Semantic Table Annotation with Wikidata and Language Models

Anastasia Dimou, Xuemin Duan, Duo Yang, Ioannis Dasoulas

Abstract

An abundance of tabular data exists and is used by a wide range of applications. However, a large portion of these data lacks the semantic information necessary for users and machines to properly understand them. This lack of semantic understanding impedes the use of tables in data analytics pipelines. Solutions to semantically interpret tables exist, but they focus on specific annotation tasks and table types and rely on large knowledge bases, making them difficult to reuse in real-world settings. Thus, more robust systems that produce more precise annotations and adapt to different table types are needed. The Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab) was introduced to benchmark semantic table interpretation systems by evaluating them over diverse datasets and tasks. In this paper, we introduce TorchicTab, a versatile semantic table interpretation system able to annotate tables with varied structures by using either an external knowledge graph, such as Wikidata, or annotated tables with pre-defined terms for training. We evaluate our proposed system on the different annotation tasks of the SemTab challenge. The results show that our system can produce accurate annotations for different tasks across varied datasets.

Benchmarks

Benchmark | Methodology | Metrics
column-type-annotation-on-wdc-sotab-v2 | TorchicTab | Micro F1: 89.66
columns-property-annotation-on-wdc-sotab-v2 | TorchicTab | Micro F1: 87.11
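
The benchmark scores are reported as Micro F1. As a rough illustration of how such a score can be computed for column annotations, here is a minimal sketch. It assumes each column carries one gold label and at most one predicted label; the function name `micro_f1`, the column keys, and the labels are all hypothetical, and the official benchmark evaluator may count errors differently.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over column annotations.

    gold: dict mapping a column key to its gold label.
    pred: dict mapping a column key to a predicted label (or None / missing).
    Assumption: one label per column; this is not the official evaluator.
    """
    tp = fp = fn = 0
    for key, gold_label in gold.items():
        pred_label = pred.get(key)
        if pred_label is None:
            fn += 1            # column left unannotated
        elif pred_label == gold_label:
            tp += 1            # correct annotation
        else:
            fp += 1            # wrong label was emitted
            fn += 1            # and the gold label was missed
    # Predictions for columns that have no gold label count as false positives.
    fp += sum(1 for k, v in pred.items() if k not in gold and v is not None)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: four columns, two correct, one wrong, one missing.
gold = {("t1", 0): "name", ("t1", 1): "price", ("t1", 2): "date", ("t2", 0): "name"}
pred = {("t1", 0): "name", ("t1", 1): "weight", ("t2", 0): "name"}
print(round(100 * micro_f1(gold, pred), 2))
```

In this toy case precision is 2/3 and recall is 2/4, so the score is 4/7. Micro averaging pools the counts across all labels before computing precision and recall, so frequent labels weigh more than rare ones.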
