4 months ago

Revisiting Distributional Correspondence Indexing: A Python Reimplementation and New Experiments

Alejandro Moreo; Andrea Esuli; Fabrizio Sebastiani

Abstract

This paper introduces PyDCI, a new implementation of Distributional Correspondence Indexing (DCI) written in Python. DCI is a transfer learning method for cross-domain and cross-lingual text classification, for which we had previously provided an implementation (here called JaDCI) built on top of JaTeCS, a Java framework for text classification. PyDCI is a stand-alone version of DCI that exploits scikit-learn and the SciPy stack. We report here on new experiments that we have carried out to test PyDCI, in which we use as baselines new high-performing methods that have appeared since DCI was originally proposed. These experiments show that, thanks to a few subtle improvements to DCI, PyDCI outperforms both JaDCI and the above-mentioned high-performing methods, and delivers the best known results on the two popular benchmarks on which we had tested DCI, i.e., MultiDomainSentiment (a.k.a. MDS, for cross-domain adaptation) and Webis-CLS-10 (for cross-lingual adaptation). PyDCI, together with the code needed to replicate our experiments, is available at https://github.com/AlexMoreo/pydci.

Code Repositories

AlexMoreo/pydci
Official
Mentioned in GitHub

Benchmarks

Benchmark: sentiment-analysis-on-multi-domain-sentiment
Methodology: Distributional Correspondence Indexing
Metrics:
Average: 83.30
Books: 81.4
DVD: 81.00
Electronics: 85.06
Kitchen: 85.9
