3 months ago

Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models

Daniel Bermuth, Alexander Poeppel, Wolfgang Reif

Abstract

Spoken Language Understanding (SLU) is the task of extracting important information from audio commands, such as the intent of what a user wants the system to do and special entities like locations or numbers. This paper presents a simple method for embedding intents and entities into Finite State Transducers. In combination with a pretrained general-purpose Speech-to-Text model, the method allows building SLU models without any additional training. Building these models is very fast, taking only a few seconds, and the approach is completely language independent. A comparison on different benchmarks shows that this method can outperform multiple other, more resource-demanding SLU approaches.
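To make the idea of embedding intents and entities into a transducer more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation, and the class `Transducer`, its methods, and the example intents `light_on` and `light_off` are hypothetical names chosen for illustration. It assumes the Speech-to-Text model has already produced a clean text transcript and only accepts exact word matches, so it shows the general pattern (input words on arcs, slot labels as outputs, intents on final states) rather than the paper's actual construction.

```python
from collections import defaultdict


class Transducer:
    """Tiny deterministic word-level transducer (illustrative assumption only)."""

    def __init__(self):
        # state -> {input word: (output slot label, next state)}
        self.arcs = defaultdict(dict)
        # final state -> intent name
        self.finals = {}
        # state 0 is the start state
        self.n_states = 1

    def add_path(self, intent, tokens):
        """Add one example sentence as (word, slot_label) pairs; '' means no slot."""
        state = 0
        for word, out in tokens:
            # Reuse an existing arc for shared prefixes, otherwise create one.
            if word not in self.arcs[state]:
                self.arcs[state][word] = (out, self.n_states)
                self.n_states += 1
            state = self.arcs[state][word][1]
        self.finals[state] = intent

    def parse(self, transcript):
        """Walk the transcript through the transducer; None if it is not accepted."""
        state, slots = 0, {}
        for word in transcript.lower().split():
            arc = self.arcs[state].get(word)
            if arc is None:
                return None
            out, state = arc
            if out:
                slots[out] = word
        intent = self.finals.get(state)
        if intent is None:
            return None
        return {"intent": intent, "slots": slots}


# Building the model is just adding paths; no training step is involved.
fst = Transducer()
for room in ("kitchen", "bedroom"):
    fst.add_path(
        "light_on",
        [("turn", ""), ("on", ""), ("the", ""), ("light", ""),
         ("in", ""), ("the", ""), (room, "room")],
    )
fst.add_path("light_off", [("turn", ""), ("off", ""), ("the", ""), ("light", "")])

result = fst.parse("turn on the light in the kitchen")
```

In this sketch `result` holds the intent `light_on` with the slot `room` set to `kitchen`, while a transcript that matches no path yields `None`. A practical system would also need to cope with transcription errors, which this exact-match toy does not attempt.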

Benchmarks

| Benchmark | Methodology | Metric | Value |
|---|---|---|---|
| intent-classification-on-slurp | Finstreder (Quartznet) | Accuracy (%) | 43.15 |
| intent-classification-on-slurp | Finstreder (Conformer) | Accuracy (%) | 53.11 |
| slot-filling-on-slurp | Finstreder (Conformer) | F1 | 0.395 |
| slot-filling-on-slurp | Finstreder (Quartznet) | F1 | 0.313 |
| spoken-language-understanding-on-fluent | Finstreder (Quartznet + AMT) | Accuracy (%) | 99.7 |
| spoken-language-understanding-on-fluent | Finstreder (Conformer + AMT, character-based) | Accuracy (%) | 99.8 |
| spoken-language-understanding-on-fluent | Finstreder (Conformer) | Accuracy (%) | 99.5 |
| spoken-language-understanding-on-fluent | Amazon Alexa | Accuracy (%) | 98.7 |
| spoken-language-understanding-on-fluent | Finstreder (Quartznet) | Accuracy (%) | 99.2 |
| spoken-language-understanding-on-snips | Finstreder (Conformer, character-based) | Accuracy (%) | 89.0 |
| spoken-language-understanding-on-snips | Finstreder (Conformer) | Accuracy (%) | 88.0 |
| spoken-language-understanding-on-snips | Finstreder (Quartznet) | Accuracy (%) | 84.8 |
| spoken-language-understanding-on-snips-1 | Finstreder (Quartznet) | Accuracy-EN (%) | 77.6 |
| spoken-language-understanding-on-snips-1 | Finstreder (Quartznet) | Accuracy-FR (%) | 77.8 |
| spoken-language-understanding-on-snips-1 | Finstreder (Conformer, character-based) | Accuracy-EN (%) | 87.9 |
| spoken-language-understanding-on-snips-1 | Finstreder (Conformer, character-based) | Accuracy-FR (%) | 86.5 |
| spoken-language-understanding-on-snips-1 | Finstreder (Conformer) | Accuracy-EN (%) | 80.4 |
| spoken-language-understanding-on-snips-1 | Finstreder (Conformer) | Accuracy-FR (%) | 78.3 |
| spoken-language-understanding-on-timers-and | Finstreder (Quartznet) | Accuracy (%) | 90.0 |
| spoken-language-understanding-on-timers-and | Finstreder (Conformer) | Accuracy (%) | 95.4 |
