5 months ago

Learning Type Inference for Enhanced Dataflow Analysis

Lukas Seidel; Sedick David Baker Effendi; Xavier Pinho; Konrad Rieck; Brink van der Merwe; Fabian Yamaguchi

Abstract

Statically analyzing dynamically-typed code is a challenging endeavor, as even seemingly trivial tasks such as determining the targets of procedure calls are non-trivial without knowing the types of objects at compile time. Addressing this challenge, gradual typing is increasingly added to dynamically-typed languages, a prominent example being TypeScript that introduces static typing to JavaScript. Gradual typing improves the developer's ability to verify program behavior, contributing to robust, secure and debuggable programs. In practice, however, users only sparsely annotate types directly. At the same time, conventional type inference faces performance-related challenges as program size grows. Statistical techniques based on machine learning offer faster inference, but although recent approaches demonstrate overall improved accuracy, they still perform significantly worse on user-defined types than on the most common built-in types. Limiting their real-world usefulness even more, they rarely integrate with user-facing applications. We propose CodeTIDAL5, a Transformer-based model trained to reliably predict type annotations. For effective result retrieval and re-integration, we extract usage slices from a program's code property graph. Comparing our approach against recent neural type inference systems, our model outperforms the current state-of-the-art by 7.85% on the ManyTypes4TypeScript benchmark, achieving 71.27% accuracy overall. Furthermore, we present JoernTI, an integration of our approach into Joern, an open source static analysis tool, and demonstrate that the analysis benefits from the additional type information. As our model allows for fast inference times even on commodity CPUs, making our system available through Joern leads to high accessibility and facilitates security research.
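
To make the idea of inferring a type from how a variable is used more concrete, here is a minimal TypeScript sketch. It is not the CodeTIDAL5 model and does not use Joern's slice format: the `UsageSlice` shape, the `knownTypes` table, and the `candidateTypes` helper are all hypothetical names introduced for illustration, and a naive "every observed call must exist on the type" check stands in for the learned Transformer prediction described in the abstract.

```typescript
// Hypothetical illustration only; not the paper's model or Joern's data format.

// A simplified "usage slice": what a variable is observed doing in the code.
interface UsageSlice {
  variable: string;
  calledMethods: string[]; // methods invoked on the variable
}

// Assumed table of known types (user-defined and built-in) and their methods.
const knownTypes: Record<string, string[]> = {
  HttpClient: ["get", "post", "close"],
  FileHandle: ["read", "write", "close"],
  Map: ["get", "set", "has"],
};

// Return every known type whose methods cover all calls seen in the slice.
// A learned model would rank candidates; this heuristic only filters them.
function candidateTypes(slice: UsageSlice): string[] {
  return Object.keys(knownTypes).filter((typeName) => {
    const methods = knownTypes[typeName] ?? [];
    return slice.calledMethods.every((m) => methods.indexOf(m) !== -1);
  });
}

// With a single call, the target of conn.close() is ambiguous.
const ambiguous: UsageSlice = { variable: "conn", calledMethods: ["close"] };

// One more observed call is enough to narrow the candidates to one type.
const narrowed: UsageSlice = {
  variable: "conn",
  calledMethods: ["post", "close"],
};

console.log(candidateTypes(ambiguous)); // two candidates remain
console.log(candidateTypes(narrowed)); // one candidate remains
```

The sketch shows why call-target resolution depends on type information: until the type of `conn` is pinned down, a static analyzer cannot tell which `close` implementation is invoked, and more usage context shrinks the candidate set.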

Benchmarks

Benchmark: type-prediction-on-manytypes4typescript
Methodology: CodeTIDAL5
Metrics: Average Accuracy: 71.27%
