DoRA: Weight-Decomposed Low-Rank Adaptation

Shih-Yang Liu, Chien-Yi Wang, Hongxu Yin, Pavlo Molchanov, Yu-Chiang Frank Wang, Kwang-Ting Cheng, Min-Hung Chen

Abstract

Among the widely used parameter-efficient fine-tuning (PEFT) methods, LoRA and its variants have gained considerable popularity because they avoid additional inference costs. However, an accuracy gap often remains between these methods and full fine-tuning (FT). In this work, we first introduce a novel weight decomposition analysis to investigate the inherent differences between FT and LoRA. Aiming to resemble the learning capacity of FT from our findings, we propose Weight-Decomposed Low-Rank Adaptation (DoRA). DoRA decomposes the pre-trained weight into two components, magnitude and direction, for fine-tuning, specifically employing LoRA for directional updates to efficiently minimize the number of trainable parameters. By employing DoRA, we enhance both the learning capacity and training stability of LoRA while avoiding any additional inference overhead. DoRA consistently outperforms LoRA on fine-tuning LLaMA, LLaVA, and VL-BART on various downstream tasks, such as commonsense reasoning, visual instruction tuning, and image/video-text understanding. Code is available at https://github.com/NVlabs/DoRA.
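The decomposition the abstract describes writes a pretrained weight W0 as a learned magnitude vector m times a unit-norm direction, fine-tunes only the direction with LoRA (ΔV = BA), and renormalizes: W' = m · (W0 + BA) / ||W0 + BA||. Below is a minimal PyTorch sketch of that idea; the DoRALinear class name, the rank/alpha defaults, and the choice of norm axis are illustrative assumptions, not the official NVlabs implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoRALinear(nn.Module):
    """Minimal sketch of a DoRA-adapted linear layer (illustrative, not the
    official NVlabs code). The pretrained weight W0 is split into a learned
    magnitude vector m and a direction; LoRA (B @ A) updates only the
    direction, which is renormalized before m rescales it."""

    def __init__(self, pretrained: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        out_f, in_f = pretrained.weight.shape
        # Frozen pretrained weight W0 serves as the initial direction.
        self.weight = nn.Parameter(pretrained.weight.detach().clone(),
                                   requires_grad=False)
        self.bias = pretrained.bias
        # Trainable magnitude m, initialized to the norm of W0 (taken here
        # per output row of nn.Linear's (out, in) layout; the paper writes
        # this as a column-wise norm in its own notation).
        self.magnitude = nn.Parameter(self.weight.norm(p=2, dim=1, keepdim=True))
        # LoRA factors for the directional update; B starts at zero, so the
        # layer reproduces the pretrained one exactly before training.
        self.lora_A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_f, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Directional component W0 + ΔV, with ΔV = scaling * B @ A.
        direction = self.weight + self.scaling * (self.lora_B @ self.lora_A)
        # Renormalize, then rescale by the learned magnitude:
        # W' = m * (W0 + BA) / ||W0 + BA||
        w = self.magnitude * direction / direction.norm(p=2, dim=1, keepdim=True)
        return F.linear(x, w, self.bias)
```

After training, the adapted weight W' can be folded back into a plain nn.Linear, which is why DoRA, like LoRA, adds no inference overhead.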

Code Repositories

NVlabs/DoRA (official, PyTorch)
catid/dora (PyTorch)
nbasyl/DoRA (official)
seanzhang-zhichen/llama3-chinese (PyTorch)

Benchmarks

Benchmark | Methodology | Metric
parameter-efficient-fine-tuning-on-boolq | LLaMA2-7b | Accuracy (%): 81.93
parameter-efficient-fine-tuning-on-hellaswag | LLaMA2-7b | Accuracy (%): 76.27
parameter-efficient-fine-tuning-on-winogrande | LLaMA2-7b | Accuracy (%): 70.09
