
Abstract
Code Large Language Models (Code LLMs), such as StarCoder, have demonstrated exceptional performance in code-related tasks. However, most existing models are solely pre-trained on extensive raw code data without instruction fine-tuning. In this paper, we introduce WizardCoder, which empowers Code LLMs with complex instruction fine-tuning by adapting the Evol-Instruct method to the domain of code. Through comprehensive experiments on four prominent code generation benchmarks, namely HumanEval, HumanEval+, MBPP, and DS-1000, we unveil the exceptional capabilities of our model. It surpasses all other open-source Code LLMs by a substantial margin. Moreover, our model even outperforms the largest closed LLMs, Anthropic's Claude and Google's Bard, on HumanEval and HumanEval+. Our code, model weights, and data are public at https://github.com/nlpxucan/WizardLM
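The Evol-Instruct adaptation described above iteratively rewrites seed coding instructions into harder variants via evolving prompts. A minimal sketch of that loop, assuming a hypothetical `llm` callable and illustrative template wording (the paper's actual evolving prompts differ):

```python
import random

# Illustrative evolving directions; the paper's prompts are more elaborate
# (e.g. adding constraints, deepening, increasing reasoning steps).
EVOLVE_TEMPLATES = [
    "Add one more constraint or requirement to this programming task:\n{instr}",
    "Rewrite this programming task so it requires more reasoning steps:\n{instr}",
    "Replace a common requirement in this task with a less common one:\n{instr}",
]

def evolve_instruction(seed, llm, rounds=3):
    """Return the seed plus `rounds` successively evolved instructions.

    `llm` is any callable mapping a prompt string to a response string.
    """
    pool = [seed]
    for _ in range(rounds):
        template = random.choice(EVOLVE_TEMPLATES)
        pool.append(llm(template.format(instr=pool[-1])))
    return pool

# Usage with a stub in place of a real model:
stub_llm = lambda prompt: prompt.splitlines()[-1] + " (with an extra edge case)"
evolved = evolve_instruction("Write a function that reverses a string.", stub_llm)
```

The evolved instructions (paired with model responses) then form the fine-tuning set.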
Benchmarks
| Benchmark | Model | Metrics |
|---|---|---|
| code-generation-on-codecontests | WizardCoder-15B | Test Set pass@1: 1.11, Test Set pass@5: 3.18, Val Set pass@1: 1.98, Val Set pass@5: 3.27 |
| code-generation-on-mbpp | WizardCoder-15B | Accuracy: 51.8 |
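The pass@k figures above are conventionally computed with the unbiased estimator introduced alongside HumanEval: given n sampled solutions of which c pass the tests, pass@k = 1 − C(n−c, k)/C(n, k). A minimal sketch:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k draws
    (without replacement) from n samples, c of them correct, passes.

    Computed as 1 - C(n-c, k)/C(n, k) via a numerically stable product.
    """
    if n - c < k:
        return 1.0  # too few failures to fill k draws: success guaranteed
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))

# Usage: 3 correct out of 10 samples, k=1 reduces to c/n = 0.3
score = pass_at_k(n=10, c=3, k=1)
```

Note that pass@1 with a single greedy sample per problem (as often reported) is simply the fraction of problems solved.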