Date

2 years ago

KV Cache stands for key-value cache. It is a widely used engineering technique for speeding up inference in large Transformer models. It trades memory for compute: the keys and values computed for earlier tokens are stored and reused at each decoding step instead of being recomputed, so inference gets faster without any change to the computed results. All major inference frameworks implement and encapsulate it. For example, the generate function of the transformers library handles it internally, so users do not need to pass past_key_values manually, and it is enabled by default (use_cache=True in the config.json file).
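The space-for-time trade can be illustrated with a small sketch. The code below is a toy single-head attention loop in plain Python, not the implementation used by transformers or any other framework; all names in it (matvec, attend, decode_without_cache, decode_with_cache, the weight matrices, the sizes d and n) are hypothetical and chosen only for illustration. It assumes a fixed sequence of input vectors and compares recomputing every key and value at each step against appending one new key and value to a cache.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def attend(q, keys, values):
    """Scaled dot-product attention of one query over the given keys/values."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]


def decode_without_cache(xs, Wq, Wk, Wv):
    """Recompute the keys and values of the whole prefix at every step."""
    outputs, projections = [], 0
    for t in range(len(xs)):
        prefix = xs[: t + 1]
        keys = [matvec(Wk, x) for x in prefix]
        values = [matvec(Wv, x) for x in prefix]
        projections += 2 * len(prefix)  # one K and one V projection per token
        outputs.append(attend(matvec(Wq, xs[t]), keys, values))
    return outputs, projections


def decode_with_cache(xs, Wq, Wk, Wv):
    """Project only the newest token and append it to the cache."""
    keys, values = [], []  # the KV cache: grows by one entry per step
    outputs, projections = [], 0
    for x in xs:
        keys.append(matvec(Wk, x))
        values.append(matvec(Wv, x))
        projections += 2
        outputs.append(attend(matvec(Wq, x), keys, values))
    return outputs, projections


random.seed(0)
d, n = 4, 6  # hypothetical toy sizes: vector width and sequence length
Wq, Wk, Wv = ([[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
              for _ in range(3))
xs = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]

out_plain, cost_plain = decode_without_cache(xs, Wq, Wk, Wv)
out_cached, cost_cached = decode_with_cache(xs, Wq, Wk, Wv)

max_diff = max(abs(a - b)
               for row_a, row_b in zip(out_plain, out_cached)
               for a, b in zip(row_a, row_b))
print("K/V projections without cache:", cost_plain)
print("K/V projections with cache:", cost_cached)
print("largest output difference:", max_diff)
```

In this sketch the number of key and value projections grows quadratically with sequence length without the cache and linearly with it, while the cache itself holds one key and one value per processed token. That is the memory cost paid for the saved compute.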

References

【1】https://zhuanlan.zhihu.com/p/630832593



