Leveraging pre-trained language models for code generation

Mayada Hadhoud, Samir Shaheen, Ahmed Soliman

Abstract

Code assistance refers to the use of tools, techniques, and models that help developers throughout the software development process. As coding tasks become increasingly complex, code assistants play a pivotal role in enhancing developer productivity, reducing errors, and enabling a more efficient coding workflow. This assistance can take various forms, including code autocompletion, error detection and correction, code generation, documentation support, and context-aware suggestions. Language models have emerged as integral components of code assistance, offering developers intelligent suggestions, generated code snippets, and improved overall coding proficiency. In this paper, we propose new hybrid models for code generation that combine the pre-trained language models BERT, RoBERTa, ELECTRA, and LUKE with the Marian causal language model as a decoder. These models were selected for their strong performance across a variety of natural language processing tasks. We evaluate their performance on two datasets, CoNaLa and DJANGO, and compare them to existing state-of-the-art models. Our aim is to investigate the potential of pre-trained transformer language models to revolutionize code generation, offering improved precision and efficiency in navigating complex coding scenarios; we additionally conduct error analysis and refine the generated code. Our results show that these models, when combined with the Marian decoder, significantly improve code generation accuracy and efficiency. Notably, RoBERTa-Marian achieved a maximum BLEU score of 35.74 and an exact match accuracy of 13.8% on CoNaLa, while LUKE-Marian attained a BLEU score of 89.34 and an exact match accuracy of 78.50% on DJANGO. The implementation of this work is available at https://github.com/AhmedSSoliman/Leveraging-Pretrained-Language-Models-for-Code-Generation.
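As a rough illustration of the hybrid architecture described in the abstract, the sketch below wires a pre-trained RoBERTa encoder to a Marian decoder using Hugging Face's `EncoderDecoderModel`. The checkpoint names and generation settings here are assumptions for illustration only; the authors' exact configuration is in the linked repository.

```python
# Minimal sketch of the hybrid encoder-decoder idea, assuming the Hugging
# Face transformers library. Checkpoint names ("roberta-base",
# "Helsinki-NLP/opus-mt-en-fr") and settings are illustrative, not the
# paper's exact configuration.
from transformers import AutoTokenizer, EncoderDecoderModel

# Pair a pre-trained RoBERTa encoder with a Marian decoder loaded as a
# causal LM; the cross-attention weights connecting the two stacks start
# untrained, so the combined model must be fine-tuned on intent -> code
# pairs (e.g., CoNaLa or DJANGO) before it generates useful code.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "roberta-base",                # encodes the natural-language intent
    "Helsinki-NLP/opus-mt-en-fr",  # Marian decoder reused as a causal LM
)

# Generation needs the decoder's special tokens on the top-level config.
model.config.decoder_start_token_id = model.config.decoder.decoder_start_token_id
model.config.pad_token_id = model.config.decoder.pad_token_id

encoder_tokenizer = AutoTokenizer.from_pretrained("roberta-base")

intent = "convert a list of integers into a space-separated string"
inputs = encoder_tokenizer(intent, return_tensors="pt")
generated_ids = model.generate(**inputs, max_length=64, num_beams=4)
# In practice the Marian (decoder-side) tokenizer decodes generated_ids
# back into a Python snippet once the model has been fine-tuned.
```

Swapping `"roberta-base"` for a BERT, ELECTRA, or LUKE checkpoint yields the other encoder variants, since only the encoder half of the pairing changes.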

Benchmarks

| Benchmark | Method | BLEU | Exact Match Accuracy (%) |
|---|---|---|---|
| Code generation on CoNaLa | ELECTRA-Marian | 30.18 | 10.0 |
| Code generation on CoNaLa | RoBERTa-Marian | 35.74 | 13.8 |
| Code generation on CoNaLa | BERT-Marian | 32.46 | 12.40 |
| Code generation on CoNaLa | LUKE-Marian | 29.83 | 7.6 |
| Code generation on DJANGO | LUKE-Marian | 89.34 | 78.50 |
| Code generation on DJANGO | RoBERTa-Marian | 88.91 | 77.95 |
| Code generation on DJANGO | BERT-Marian | 56.55 | 76.68 |
| Code generation on DJANGO | ELECTRA-Marian | 53.02 | 65.32 |
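For concreteness, the two metrics reported in the table can be computed along the following lines. This is a minimal sketch using NLTK's BLEU implementation; the paper's exact tokenization and smoothing scheme may differ, and the `evaluate` helper is a hypothetical name introduced here for illustration.

```python
# Sketch of the two reported metrics: corpus-level BLEU and exact match
# accuracy over (reference, hypothesis) code pairs. Uses NLTK; the paper
# may use a different BLEU tokenization or smoothing scheme.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def evaluate(references, hypotheses):
    # Exact match: fraction of predictions identical to the reference,
    # ignoring leading/trailing whitespace.
    exact = sum(r.strip() == h.strip() for r, h in zip(references, hypotheses))
    exact_match = 100.0 * exact / len(references)

    # Corpus BLEU over whitespace-tokenized code, smoothed so that short
    # snippets with missing higher-order n-grams do not score zero.
    smooth = SmoothingFunction().method4
    bleu = 100.0 * corpus_bleu(
        [[r.split()] for r in references],  # one reference per example
        [h.split() for h in hypotheses],
        smoothing_function=smooth,
    )
    return bleu, exact_match

refs = ["x = ' '.join(map(str, xs))"]
hyps = ["x = ' '.join(map(str, xs))"]
print(evaluate(refs, hyps))  # -> (100.0, 100.0) for a perfect prediction
```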
