HyperAIHyperAI

Command Palette

Search for a command to run...

KV Cache

Date

2 years ago

KV Cache stands for Key-value Cache. It is a commonly used technology for optimizing the reasoning performance of large models. This technology can improve the reasoning performance by exchanging space for time without affecting any calculation accuracy. KV Cache is an important engineering technology for optimizing Transformer reasoning performance.All major inference frameworks have implemented and encapsulated it (for example, the generate function of the transformers library has encapsulated it, and users do not need to manually pass in past_key_values) and it is enabled by default (use_cache=True in the config.json file).

References

【1】https://zhuanlan.zhihu.com/p/630832593

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing
Get Started

Hyper Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp
KV Cache | Wiki | HyperAI