A Repository of Conversational Datasets
Matthew Henderson; Paweł Budzianowski; Iñigo Casanueva; Sam Coope; Daniela Gerz; Girish Kumar; Nikola Mrkšić; Georgios Spithourakis; Pei-Hao Su; Ivan Vulić; Tsung-Hsien Wen

Abstract
Progress in Machine Learning is often driven by the availability of large datasets, and consistent evaluation metrics for comparing modeling approaches. To this end, we present a repository of conversational datasets consisting of hundreds of millions of examples, and a standardised evaluation procedure for conversational response selection models using '1-of-100 accuracy'. The repository contains scripts that allow researchers to reproduce the standard datasets, or to adapt the pre-processing and data filtering steps to their needs. We introduce and evaluate several competitive baselines for conversational response selection, whose implementations are shared in the repository, as well as a neural encoder model that is trained on the entire training set.
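The 1-of-100 accuracy metric can be stated compactly: for each test example, the true response is scored against the 99 responses belonging to the other examples in the same batch of 100, and the metric is the fraction of examples for which the true response receives the highest score. The sketch below illustrates this procedure in Python; the `score(context, response)` function is a hypothetical stand-in for whichever response selection model is being evaluated, not an API from the repository itself.

```python
import numpy as np

def one_of_100_accuracy(contexts, responses, score):
    """Fraction of examples whose true response ranks first among 100 candidates.

    `contexts` and `responses` are parallel lists. Within each batch of
    100 consecutive examples, the response paired with context i is the
    correct one; the other 99 responses serve as distractors.
    """
    assert len(contexts) == len(responses)
    correct = 0
    total = 0
    for start in range(0, len(contexts) - 99, 100):
        batch_contexts = contexts[start:start + 100]
        batch_responses = responses[start:start + 100]
        for i, context in enumerate(batch_contexts):
            scores = np.array([score(context, r) for r in batch_responses])
            # The example counts as correct only if the true response
            # (index i) gets the highest score of all 100 candidates.
            if scores.argmax() == i:
                correct += 1
            total += 1
    return correct / total
```

Because the distractors are sampled from other examples rather than curated, the metric is cheap to compute at scale and comparable across the repository's datasets.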
Code Repositories

PolyAI-LDN/conversational-datasets (official): https://github.com/PolyAI-LDN/conversational-datasets
Benchmarks
| Benchmark | Model | 1-of-100 Accuracy |
|---|---|---|
| conversational-response-selection-on-polyai | PolyAI Encoder | 61.3% |
| conversational-response-selection-on-polyai-1 | PolyAI Encoder | 30.6% |
| conversational-response-selection-on-polyai-2 | PolyAI Encoder | 71.3% |