3 months ago

ManyTypes4TypeScript: A Comprehensive TypeScript Dataset for Sequence-Based Type Inference

Premkumar T. Devanbu, Kevin Jesse

Abstract

In this paper, we present ManyTypes4TypeScript, a very large corpus for training and evaluating machine-learning models for sequence-based type inference in TypeScript. The dataset includes over 9 million type annotations, across 13,953 projects and 539,571 files. The dataset is approximately 10x larger than analogous type inference datasets for Python, and is the largest available for TypeScript. We also provide API access to the dataset, which can be integrated into any tokenizer and used with any state-of-the-art sequence-based model. Finally, we provide analysis and performance results for state-of-the-art code-specific models, for baselining. ManyTypes4TypeScript is available on Huggingface, Zenodo, and CodeXGLUE.
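To make "sequence-based type inference" concrete, the following is a minimal sketch of how the task can be framed as token tagging: each token carries either a type label or no label, the labels are aligned to subword pieces, and accuracy is computed only over labeled positions. The token layout, the `align_labels` and `accuracy` helpers, and the `toy_split` stand-in tokenizer are all assumptions made for illustration; they are not the dataset's actual schema or API, and the benchmark's exact metric definition may differ.

```python
from typing import Callable, List, Optional, Tuple

Label = Optional[str]  # None marks a token with no type to predict


def align_labels(
    words: List[str],
    labels: List[Label],
    split: Callable[[str], List[str]],
) -> Tuple[List[str], List[Label]]:
    """Spread word-level labels over subword pieces.

    The first piece of each word keeps the label; later pieces get None,
    so every annotated identifier is scored exactly once.
    """
    pieces: List[str] = []
    piece_labels: List[Label] = []
    for word, label in zip(words, labels):
        parts = split(word) or [word]
        pieces.extend(parts)
        piece_labels.extend([label] + [None] * (len(parts) - 1))
    return pieces, piece_labels


def accuracy(gold: List[Label], pred: List[Label]) -> float:
    """Fraction of labeled positions where the prediction matches."""
    scored = [(g, p) for g, p in zip(gold, pred) if g is not None]
    if not scored:
        return 0.0
    return sum(g == p for g, p in scored) / len(scored)


def toy_split(word: str) -> List[str]:
    """Hypothetical stand-in for a real subword tokenizer: 4-char chunks."""
    return [word[i:i + 4] for i in range(0, len(word), 4)]


# Illustrative input: a TypeScript function with its annotations stripped.
# Assumed convention: the function name carries the return type.
words = ["function", "add", "(", "a", ",", "b", ")",
         "{", "return", "a", "+", "b", "}"]
labels = [None, "number", None, "number", None, "number", None,
          None, None, None, None, None, None]

pieces, piece_labels = align_labels(words, labels, toy_split)

# Pretend a model mislabels the parameter `b` as `string`.
predicted = list(piece_labels)
predicted[pieces.index("b")] = "string"
score = accuracy(piece_labels, predicted)
```

In practice `toy_split` would be replaced by the tokenizer of whichever sequence model is being trained, and the predictions would come from that model's per-token classification head.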

Benchmarks

Benchmark: type-prediction-on-manytypes4typescript
Methodology: GraphCodeBERT-MT4TS
Metrics: Average Accuracy: 63.42
