Retrieval Augmented Code Generation and Summarization

Md Rizwan Parvez, Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

Abstract

Software developers write a large amount of source code and documentation during software development. While implementing or documenting software, they often recall parts of source code or code summaries that they wrote in the past. To mimic this code and summary generation behavior, we propose a retrieval augmented framework, REDCODER, that retrieves relevant code or summaries from a retrieval database and provides them as a supplement to code generation or summarization models. REDCODER has two distinguishing features. First, it extends the state-of-the-art dense retrieval technique to search for relevant code or summaries. Second, it can work with retrieval databases that include unimodal instances (only code or only a natural language description) or bimodal instances (code-description pairs). We conduct experiments and extensive analysis on two benchmark datasets of code generation and summarization in Java and Python, and the results support the effectiveness of the proposed retrieval augmented framework.
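The retrieve-then-augment flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the REDCODER implementation: the encoder here is a normalized bag-of-tokens stand-in for a learned dense encoder, and the names Entry, toy_encode, retrieve, augment, and the " <sep> " separator are all assumptions introduced for this example. It only shows inner-product scoring against a database that mixes unimodal and bimodal instances, followed by appending the retrieved items to the generator input.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional

SEP = " <sep> "  # hypothetical separator between the input and retrieved items


@dataclass
class Entry:
    """One database instance: unimodal (code only) or bimodal (code plus description)."""
    code: str
    description: Optional[str] = None


def toy_encode(text: str) -> Dict[str, float]:
    """Stand-in for a learned dense encoder: L2-normalized token counts."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def inner_product(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Similarity score between two encoded texts."""
    return sum(w * v.get(tok, 0.0) for tok, w in u.items())


def retrieve(query: str, database: List[Entry], k: int = 2) -> List[Entry]:
    """Return the k entries whose code scores highest against the query."""
    q = toy_encode(query)
    ranked = sorted(
        database,
        key=lambda e: inner_product(q, toy_encode(e.code)),
        reverse=True,
    )
    return ranked[:k]


def augment(query: str, retrieved: List[Entry]) -> str:
    """Append retrieved code (and its description, if present) to the input."""
    parts = [query]
    for entry in retrieved:
        parts.append(entry.code)
        if entry.description is not None:  # bimodal instance
            parts.append(entry.description)
    return SEP.join(parts)


database = [
    Entry("def read_file(path): return open(path).read()", "Read a file from a path."),
    Entry("def add(a, b): return a + b"),  # unimodal: no description
    Entry("def sort_list(items): return sorted(items)", "Sort a list of items."),
]
query = "read the file at the given path"
top = retrieve(query, database, k=1)
generator_input = augment(query, top)  # would be passed to a generation model
```

In a real system the toy encoder would be replaced by trained query and candidate encoders, and the augmented string would be fed to a sequence-to-sequence generator; both of those components are outside the scope of this sketch.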

Code Repositories

kagnlp/CodeGenerator (mentioned in GitHub)
rizwan09/redcoder (official, pytorch, mentioned in GitHub)

Benchmarks

Benchmark: code-generation-on-codexglue-codesearchnet
Methodology: Redcoder-ext
Metrics:
Java/BLEU: 28.98
Java/CodeBLEU: 33.18
Java/EM: 10.21
Python/BLEU: 24.43
Python/CodeBLEU: 30.21
Python/EM: 9.61

Benchmark: code-generation-on-concode
Methodology: Redcoder-ext
Metrics:
BLEU: 42.5
CodeBLEU: 43.4
Exact Match: 23.4
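
For readers unfamiliar with the Exact Match (EM) metric listed above, a minimal sketch of how such a score is commonly computed is given below. The function name exact_match and the whitespace normalization are assumptions made for illustration; the official evaluation scripts for these benchmarks may normalize or tokenize differently.

```python
from typing import List


def exact_match(predictions: List[str], references: List[str]) -> float:
    """Percentage of predictions identical to their reference.

    Assumption: runs of whitespace are collapsed before comparison.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        " ".join(pred.split()) == " ".join(ref.split())
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * hits / len(references)


# One of the two predictions matches its reference after normalization.
score = exact_match(["a = 1", "b=2"], ["a  =  1", "b = 2"])
```

Because EM requires the whole output to match, it is typically much lower than token-overlap metrics such as BLEU on the same predictions.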
