Multimodal Learning

Modality refers to the specific way in which information is received. Because multimedia data typically carries several types of information at once (for example, a video often conveys textual, visual, and auditory information simultaneously), multimodal learning has gradually become a primary approach to analyzing and understanding multimedia content.

Multimodal learning mainly includes the following research directions:

  1. Multimodal representation learning: studies how to encode the semantic information contained in data from multiple modalities into real-valued vectors.
  2. Inter-modal mapping: studies how to map information from one modality to another.
  3. Alignment: studies how to identify correspondences between components or elements across different modalities.
  4. Fusion: studies how to integrate models and features from different modalities.
  5. Collaborative learning: studies how to transfer knowledge learned from information-rich modalities to information-poor ones, so that learning in each modality can assist the others. Typical methods include multimodal zero-shot learning and domain adaptation.
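Three of the directions above (representation, alignment, and fusion) can be illustrated with a minimal sketch. The linear encoders, dimensions, and random inputs below are illustrative assumptions, not a specific published method: each modality is projected into a shared real-valued embedding space (representation), the embeddings are compared with cosine similarity (alignment), and then concatenated (a simple feature-level fusion).

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w):
    """Hypothetical linear encoder into a shared space, L2-normalized."""
    z = x @ w
    return z / np.linalg.norm(z)

# Toy inputs standing in for visual and textual features of one video.
visual = rng.normal(size=10)
text = rng.normal(size=5)

# Projection matrices into a shared 4-dimensional embedding space
# (multimodal representation learning).
w_visual = rng.normal(size=(10, 4))
w_text = rng.normal(size=(5, 4))

z_v = encode(visual, w_visual)
z_t = encode(text, w_text)

# Alignment: cosine similarity between the normalized embeddings
# measures how well the two modalities correspond in the shared space.
alignment = float(z_v @ z_t)

# Fusion: the simplest feature-level strategy is concatenation.
fused = np.concatenate([z_v, z_t])

print(fused.shape)   # (8,)
```

In practice the linear maps would be learned (for example, trained so that matching visual/text pairs score higher than mismatched ones), but the data flow between the three directions stays the same.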

