7 months ago

Natural Language Processing

Document Understanding

Method/Architecture

Seba Susan, Akanksha Karotia

Abstract

In an era when a large volume of information floods the Internet, manually extracting and consuming relevant information is difficult and time-consuming. An automated document summarization tool is therefore necessary to extract important information from a set of documents on similar or related subjects. Multi-document summarization retrieves important and relevant content from multiple documents while minimizing redundancy. This study develops a multi-document text summarization system using an unsupervised extractive approach. The proposed model fuses two learning paradigms: the T5 pre-trained transformer model and the K-Means clustering algorithm. We run experiments on the benchmark news article corpus of the Document Understanding Conference (DUC2004) and use the ROUGE evaluation metrics to estimate the performance of the proposed approach. The results show that the proposed model performs markedly better than existing unsupervised state-of-the-art approaches.
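The abstract names the two components (T5 and K-Means) but does not spell out how they are combined. One common way to build an extractive summarizer from embeddings and clustering is to cluster sentence vectors and keep the sentence nearest each centroid. The sketch below illustrates only that idea and is not the authors' implementation. It assumes the sentence embeddings are already computed (in the paper's setting they would come from T5; here they are plain lists of floats), and the names `kmeans` and `select_representatives` are hypothetical. The K-Means is a deliberately naive pure-Python version with a deterministic start.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iters=20):
    """Naive K-Means. Assumption: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign


def select_representatives(sentences, embeddings, k):
    """Pick, per cluster, the sentence closest to the centroid.

    Assumption: one embedding per sentence, in the same order. The chosen
    sentences are returned in their original document order.
    """
    centroids, assign = kmeans(embeddings, k)
    chosen = set()
    for c in range(k):
        idxs = [i for i, a in enumerate(assign) if a == c]
        if idxs:
            chosen.add(min(idxs, key=lambda i: dist(embeddings[i], centroids[c])))
    return [sentences[i] for i in sorted(chosen)]
```

Calling `select_representatives(sentences, embeddings, k)` with `k` set to the desired number of summary sentences returns one sentence per non-empty cluster. Choosing one sentence per cluster is what limits redundancy in this sketch, since near-duplicate sentences tend to fall into the same cluster.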
