Quora Duplicate Questions Text Classification Research Dataset
The Quora Duplicate Questions Dataset is a collection of question pairs labeled by whether the two questions are duplicates, that is, whether they ask the same thing. It is used in text classification research and was released to give anyone the opportunity to train and test models of semantic equivalence.
The dataset consists of over 400,000 rows of potential duplicate question pairs. Each row contains the IDs of the two questions in the pair, the full text of each question, and a binary value indicating whether the pair is a duplicate.
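The row layout described above can be read with a few lines of Python. This is a minimal sketch, not the official loading code: it assumes a tab-separated file and the column names `id`, `qid1`, `qid2`, `question1`, `question2`, and `is_duplicate`, none of which are stated on this page, and it uses a made-up two-row sample in place of the real file.

```python
import csv
import io

# Hypothetical two-row sample in the layout described above. The column
# names and the tab delimiter are assumptions, not taken from this page.
SAMPLE_TSV = (
    "id\tqid1\tqid2\tquestion1\tquestion2\tis_duplicate\n"
    "0\t1\t2\tHow do I learn Python?\tWhat is the best way to learn Python?\t1\n"
    "1\t3\t4\tWhat is machine learning?\tHow do I bake bread?\t0\n"
)


def load_pairs(handle):
    """Return a list of (question1, question2, is_duplicate) tuples."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        (row["question1"], row["question2"], int(row["is_duplicate"]))
        for row in reader
    ]


pairs = load_pairs(io.StringIO(SAMPLE_TSV))

# Share of pairs labeled as duplicates in the sample.
duplicate_rate = sum(label for _, _, label in pairs) / len(pairs)
print(len(pairs), duplicate_rate)
```

To read a downloaded copy instead, pass an open file handle (for example `open(path, newline="", encoding="utf-8")`) to `load_pairs`, adjusting the column names if the file differs from the assumed layout.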
The dataset was released by the Quora team in 2017, primarily by Shankar Iyer, Nikhil Dandekar, and Kornél Csernai.