ManyTypes4TypeScript: A Comprehensive TypeScript Dataset for Sequence-Based Type Inference
Premkumar T. Devanbu, Kevin Jesse
Abstract
In this paper, we present ManyTypes4TypeScript, a very large corpus for training and evaluating machine-learning models for sequence-based type inference in TypeScript. The dataset includes over 9 million type annotations, across 13,953 projects and 539,571 files. The dataset is approximately 10x larger than analogous type inference datasets for Python, and is the largest available for TypeScript. We also provide API access to the dataset, which can be integrated into any tokenizer and used with any state-of-the-art sequence-based model. Finally, we provide analysis and performance results for state-of-the-art code-specific models, for baselining. ManyTypes4TypeScript is available on Huggingface, Zenodo, and CodeXGLUE.
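To make the idea of sequence-based type inference concrete, the following is a minimal sketch of how word-level type labels might be aligned with subword pieces before training a token classifier. It is an illustration under stated assumptions, not the dataset's actual API: the `toy_subword_split` tokenizer, the `align_labels` helper, the tiny `type_vocab`, and the example tokens are all hypothetical. The only borrowed convention is the use of `-100` as a label for positions the loss should skip.

```python
from typing import Dict, List, Optional, Tuple

# Conventional "skip this position" label id used by common token-classification losses.
IGNORE = -100


def toy_subword_split(token: str, max_len: int = 4) -> List[str]:
    """Hypothetical stand-in for a real subword tokenizer: fixed-width chunks."""
    return [token[i:i + max_len] for i in range(0, len(token), max_len)]


def align_labels(
    tokens: List[str],
    labels: List[Optional[str]],
    type_vocab: Dict[str, int],
) -> Tuple[List[str], List[int]]:
    """Give the type id to the first subword of an annotated token; ignore the rest."""
    pieces: List[str] = []
    piece_labels: List[int] = []
    for token, label in zip(tokens, labels):
        for j, subword in enumerate(toy_subword_split(token)):
            pieces.append(subword)
            if j == 0 and label is not None:
                piece_labels.append(type_vocab[label])
            else:
                piece_labels.append(IGNORE)
    return pieces, piece_labels


# Hypothetical example: `let count = 0`, where `count` is annotated as `number`.
type_vocab = {"number": 0, "string": 1}
tokens = ["let", "count", "=", "0"]
labels = [None, "number", None, None]

pieces, piece_labels = align_labels(tokens, labels, type_vocab)
print(pieces)
print(piece_labels)
```

Labeling only the first subword keeps one prediction per annotated identifier, so accuracy can be counted per type annotation rather than per subword piece.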
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| type-prediction-on-manytypes4typescript | GraphCodeBERT-MT4TS | Average Accuracy: 63.42 |