Incorporating External Knowledge through Pre-training for Natural Language to Code Generation

Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig

Abstract

Open-domain code generation aims to generate code in a general-purpose programming language (such as Python) from natural language (NL) intents. Motivated by the intuition that developers usually retrieve resources on the web when writing code, we explore the effectiveness of incorporating two varieties of external knowledge into NL-to-code generation: automatically mined NL-code pairs from the online programming QA forum StackOverflow and programming language API documentation. Our evaluations show that combining the two sources with data augmentation and retrieval-based data re-sampling improves the current state-of-the-art by up to 2.2% absolute BLEU score on the code generation testbed CoNaLa. The code and resources are available at https://github.com/neulab/external-knowledge-codegen.

Code Repositories

zorazrw/multilingual-conala (PyTorch)
neulab/external-knowledge-codegen (official, PyTorch)

Benchmarks

Benchmark                      Methodology                              Metrics
code-generation-on-conala      External Knowledge With API + Reranking  BLEU: 32.26
code-generation-on-conala      External Knowledge With API              BLEU: 30.69
code-generation-on-conala-ext  External Knowledge With API              BLEU: 20.37
code-generation-on-conala-ext  External Knowledge With API + Reranking  BLEU: 20.54
