Long Short-Term Memory for Japanese Word Segmentation

Yoshiaki Kitagawa; Mamoru Komachi

Abstract

This study presents a Long Short-Term Memory (LSTM) neural network approach to Japanese word segmentation (JWS). Previous studies on Chinese word segmentation (CWS) succeeded in using recurrent neural networks such as LSTM and gated recurrent units (GRU). However, in contrast to Chinese, Japanese includes several character types, such as hiragana, katakana, and kanji, that produce orthographic variations and increase the difficulty of word segmentation. Additionally, it is important for JWS tasks to consider a global context, and yet traditional JWS approaches rely on local features. In order to address this problem, this study proposes employing an LSTM-based approach to JWS. The experimental results indicate that the proposed model achieves state-of-the-art accuracy with respect to various Japanese corpora.
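The abstract notes that Japanese mixes several scripts (hiragana, katakana, kanji), which is one reason JWS is harder than CWS. A minimal sketch of how character-type information can be derived, using standard Unicode block ranges (this is illustrative only and not code from the paper; the ranges shown cover the common blocks, not every edge case):

```python
def char_type(ch: str) -> str:
    """Classify a character by script using common Unicode block ranges."""
    code = ord(ch)
    if 0x3041 <= code <= 0x3096:
        return "hiragana"
    if 0x30A1 <= code <= 0x30FA:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:   # CJK Unified Ideographs
        return "kanji"
    if ch.isascii():
        return "latin"
    return "other"

# A mixed-script string: kanji, hiragana, then katakana
print([char_type(c) for c in "猫がネコ"])
```

Features like these are commonly concatenated with character embeddings as input to the segmentation model; the exact feature set used in the paper may differ.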

Benchmarks

Benchmark: japanese-word-segmentation-on-bccwj
Methodology: LSTM
Metrics: F1-score (Word): 0.9842
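The benchmark reports word-level F1. A sketch of the standard way this metric is computed for segmentation output (the conventional definition, not code from the paper): convert each segmentation to character-offset spans, count exact span matches as true positives, then combine precision and recall.

```python
def spans(words):
    """Convert a word list to a set of (start, end) character spans."""
    out, pos = set(), 0
    for w in words:
        out.add((pos, pos + len(w)))
        pos += len(w)
    return out

def word_f1(gold, pred):
    """Word-level F1: a predicted word counts only if its span exactly matches gold."""
    g, p = spans(gold), spans(pred)
    tp = len(g & p)
    prec = tp / len(p) if p else 0.0
    rec = tp / len(g) if g else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0
```

For example, against gold ["猫", "が", "好き"], the prediction ["猫", "が好", "き"] matches only one of three spans, giving precision = recall = 1/3 and F1 = 1/3.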

