CodeTrans: Towards Cracking the Language of Silicon's Code Through Self-Supervised Deep Learning and High Performance Computing
Elnaggar Ahmed ; Ding Wei ; Jones Llion ; Gibbs Tom ; Feher Tamas ; Angerer Christoph ; Severini Silvia ; Matthes Florian ; Rost Burkhard

Abstract
A growing number of mature natural language processing applications make people's lives more convenient. Such applications are built from source code, the language of software engineering. However, applications that understand source code in order to ease the software engineering process remain under-researched. At the same time, the transformer model, especially in combination with transfer learning, has proven to be a powerful technique for natural language processing tasks. These breakthroughs point to a promising direction for processing source code and tackling software engineering tasks. This paper describes CodeTrans, an encoder-decoder transformer model for the software engineering domain, and explores the effectiveness of encoder-decoder transformer models on six software engineering tasks comprising thirteen sub-tasks. Moreover, we investigate the effect of different training strategies, including single-task learning, transfer learning, multi-task learning, and multi-task learning with fine-tuning. CodeTrans outperforms the state-of-the-art models on all the tasks. To expedite future work in the software engineering domain, we have published our pre-trained CodeTrans models: https://github.com/agemagician/CodeTrans
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| api-sequence-recommendation-on-deepapi | CodeTrans-MT-TF-Large | BLEU-4: 73.39 |
| code-comment-generation-on-deepcom | CodeTrans-TF-Large | Smoothed BLEU-4: 39.50 |
| code-documentation-generation-on | CodeTrans-MT-Base | Smoothed BLEU-4: 20.39 |
| code-documentation-generation-on-1 | CodeTrans-MT-Large | Smoothed BLEU-4: 21.87 |
| code-documentation-generation-on-2 | CodeTrans-TF-Large | Smoothed BLEU-4: 19.54 |
| code-documentation-generation-on-3 | CodeTrans-MT-Base | Smoothed BLEU-4: 26.23 |
| code-documentation-generation-on-4 | CodeTrans-MT-Base | Smoothed BLEU-4: 15.26 |
| code-documentation-generation-on-5 | CodeTrans-TF-Large | Smoothed BLEU-4: 18.98 |
| git-commit-message-generation-on-commitgen | CodeTrans-TF-Large | BLEU-4: 44.41 |
| program-synthesis-on-algolisp | CodeTrans-MT-TF-Small | Accuracy: 90.31 |
| source-code-summarization-on-summarizing-1 | CodeTrans-MT-Large | Smoothed BLEU-4: 23.57 |
| source-code-summarization-on-summarizing-2 | CodeTrans-MT-Base | Smoothed BLEU-4: 13.37 |
| source-code-summarization-on-summarizing-3 | CodeTrans-MT-TF-Large | Smoothed BLEU-4: 19.98 |
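
Most rows above report Smoothed BLEU-4. As a rough illustration of what such a score measures, the following is a minimal sketch of sentence-level BLEU-4 with add-one smoothing on the higher-order n-gram precisions. The smoothing variant, the whitespace tokenization, and the function names `ngrams` and `smoothed_bleu4` are assumptions made for illustration; this page does not state which exact scoring script produced the reported numbers.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def smoothed_bleu4(reference, hypothesis):
    """Sentence-level BLEU-4 on a 0-100 scale (illustrative sketch).

    Assumption: add-one smoothing is applied to the 2-, 3- and 4-gram
    precisions; the unigram precision is left unsmoothed.
    """
    if not hypothesis:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, 5):
        hyp_counts = ngrams(hypothesis, n)
        ref_counts = ngrams(reference, n)
        # Clipped matches: an n-gram is credited at most as many times
        # as it occurs in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if n > 1:
            matches += 1
            total += 1
        if matches == 0:
            # Only reachable for unigrams: no word overlap at all.
            return 0.0
        log_precision_sum += math.log(matches / total)
    # Brevity penalty for hypotheses shorter than the reference.
    if len(hypothesis) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(hypothesis))
    # Geometric mean of the four precisions, scaled to 0-100.
    return 100.0 * brevity_penalty * math.exp(log_precision_sum / 4)


if __name__ == "__main__":
    ref = "returns the sum of two numbers".split()
    hyp = "returns sum of two numbers".split()
    print(round(smoothed_bleu4(ref, hyp), 2))
```

A corpus-level figure like those in the table would typically average such scores over all test examples; whether the reported results were aggregated that way is likewise an assumption here.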