5 months ago

CodeTrans: Towards Cracking the Language of Silicon's Code Through Self-Supervised Deep Learning and High Performance Computing

Elnaggar Ahmed ; Ding Wei ; Jones Llion ; Gibbs Tom ; Feher Tamas ; Angerer Christoph ; Severini Silvia ; Matthes Florian ; Rost Burkhard

Abstract

Currently, a growing number of mature natural language processing applications make people's lives more convenient. Such applications are built with source code - the language of software engineering. However, applications that understand source code in order to ease the software engineering process are under-researched. Simultaneously, the transformer model, especially in combination with transfer learning, has proven to be a powerful technique for natural language processing tasks. These breakthroughs point to a promising direction for processing source code and cracking software engineering tasks. This paper describes CodeTrans - an encoder-decoder transformer model for tasks in the software engineering domain - and explores the effectiveness of encoder-decoder transformer models on six software engineering tasks, comprising thirteen sub-tasks. Moreover, we have investigated the effect of different training strategies, including single-task learning, transfer learning, multi-task learning, and multi-task learning with fine-tuning. CodeTrans outperforms the state-of-the-art models on all the tasks. To expedite future work in the software engineering domain, we have published our pre-trained CodeTrans models: https://github.com/agemagician/CodeTrans

Code Repositories

agemagician/CodeTrans
Official
tf
Mentioned in GitHub

Benchmarks

Benchmark                                     Methodology             Metrics
api-sequence-recommendation-on-deepapi        CodeTrans-MT-TF-Large   BLEU-4: 73.39
code-comment-generation-on-deepcom            CodeTrans-TF-Large      Smoothed BLEU-4: 39.50
code-documentation-generation-on              CodeTrans-MT-Base       Smoothed BLEU-4: 20.39
code-documentation-generation-on-1            CodeTrans-MT-Large      Smoothed BLEU-4: 21.87
code-documentation-generation-on-2            CodeTrans-TF-Large      Smoothed BLEU-4: 19.54
code-documentation-generation-on-3            CodeTrans-MT-Base       Smoothed BLEU-4: 26.23
code-documentation-generation-on-4            CodeTrans-MT-Base       Smoothed BLEU-4: 15.26
code-documentation-generation-on-5            CodeTrans-TF-Large      Smoothed BLEU-4: 18.98
git-commit-message-generation-on-commitgen    CodeTrans-TF-Large      BLEU-4: 44.41
program-synthesis-on-algolisp                 CodeTrans-MT-TF-Small   Accuracy: 90.31
source-code-summarization-on-summarizing-1    CodeTrans-MT-Large      Smoothed BLEU-4: 23.57
source-code-summarization-on-summarizing-2    CodeTrans-MT-Base       Smoothed BLEU-4: 13.37
source-code-summarization-on-summarizing-3    CodeTrans-MT-TF-Large   Smoothed BLEU-4: 19.98
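
Most of the benchmarks above report a smoothed BLEU-4 score. This page does not say which smoothing variant or tokenization was used, so the following is only a minimal sketch of one common sentence-level formulation, assuming whitespace tokenization and add-one smoothing on the 2- to 4-gram precisions. The function names `ngrams` and `smoothed_bleu4` are illustrative and are not taken from the CodeTrans repository; scores produced this way should not be expected to reproduce the numbers in the table.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def smoothed_bleu4(reference, candidate):
    """Sentence-level BLEU-4 on a 0-100 scale.

    Assumptions (not confirmed by the source): whitespace tokenization,
    a single reference, and add-one smoothing for n = 2, 3, 4.
    """
    ref, cand = reference.split(), candidate.split()
    if not cand:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, 5):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped matches: a candidate n-gram counts at most as often
        # as it occurs in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if n > 1:
            # Add-one smoothing keeps higher-order precisions non-zero.
            matches += 1
            total += 1
        if matches == 0:
            # No unigram overlap at all: the score is zero.
            return 0.0
        log_precision_sum += math.log(matches / total)

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the four precisions, scaled to 0-100.
    return 100.0 * brevity_penalty * math.exp(log_precision_sum / 4)


if __name__ == "__main__":
    reference = "returns the sum of two numbers"
    candidate = "returns the sum of numbers"
    print(f"{smoothed_bleu4(reference, candidate):.2f}")
```

Without smoothing, a short generated comment with no matching 4-gram would score zero even when most of its words are correct, which is why sentence-level evaluations of documentation and summarization output typically use a smoothed variant.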
