The impact of lexical and grammatical processing on generating code from natural language
Nathanaël Beau, Benoît Crabbé

Abstract
Considering the seq2seq architecture of TranX for natural language to code translation, we identify four key components of importance: grammatical constraints, lexical preprocessing, input representations, and copy mechanisms. To study the impact of these components, we use a state-of-the-art architecture that relies on a BERT encoder and a grammar-based decoder, for which a formalization is provided. The paper highlights the importance of the lexical substitution component in current natural language to code systems.
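To make the lexical substitution component concrete, here is a minimal sketch of the general idea, not the authors' implementation. It assumes that literals in the natural-language intent are marked with quotes or backticks, and it uses hypothetical placeholder names (`var_0`, `var_1`, ...) and hypothetical helper names (`substitute`, `restore`). Quoted spans are replaced by placeholders before the intent reaches the model, and the placeholders in the generated code are mapped back afterwards.

```python
import re

# Assumption: literals in the intent are wrapped in ', " or ` characters.
QUOTED = re.compile(r"""(['"`])(.+?)\1""")
PLACEHOLDER = re.compile(r"\bvar_\d+\b")


def substitute(intent):
    """Replace quoted spans with placeholders; return the new intent and the mapping."""
    mapping = {}

    def repl(match):
        quote, value = match.group(1), match.group(2)
        # Backticks are treated as identifiers, other quotes as string literals.
        code_text = value if quote == "`" else repr(value)
        # Reuse the placeholder when the same value appears again.
        for name, existing in mapping.items():
            if existing == code_text:
                return name
        name = f"var_{len(mapping)}"
        mapping[name] = code_text
        return name

    return QUOTED.sub(repl, intent), mapping


def restore(code, mapping):
    """Put the original values back into generated code; unknown placeholders are kept."""
    return PLACEHOLDER.sub(lambda m: mapping.get(m.group(0), m.group(0)), code)


intent, mapping = substitute("sort list `xs` by key 'name'")
print(intent)
print(restore("sorted(var_0, key=lambda r: r[var_1])", mapping))
```

The design choice illustrated here is that the model only ever sees a small, closed set of placeholder tokens instead of rare, open-vocabulary literals, which is one plausible reason such preprocessing can matter for these systems.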
Code Repositories
https://gitlab.com/codegenfact/BertranX
Official
pytorch
Mentioned in GitHub
https://gitlab.com/codegenfactors/BertranX
Official
pytorch
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| code-generation-on-conala | TranX + BERT w/mined | BLEU: 34.2; Exact Match Accuracy: 5.8 |
| code-generation-on-django | TranX + BERT w/mined | Accuracy: 81.03; BLEU Score: 79.86 |