6 months ago

Chunyu Xie Heng Cai Jincheng Li Fanjing Kong Xiaoyu Wu Jianfei Song Henrique Morimitsu Lin Yao Dexin Wang Xiangzheng Zhang

Abstract

Vision-language pre-training (VLP) on large-scale datasets has shown strong performance on various downstream tasks. In contrast to the many benchmarks available for English corpora, large-scale pre-training datasets and downstream datasets for Chinese corpora remain largely unexplored. In this work, we build a large-scale, high-quality Chinese Cross-Modal Benchmark named CCMB for the research community. It contains Zero, currently the largest public pre-training dataset, and five human-annotated fine-tuning datasets for downstream tasks. Zero contains 250 million images paired with 750 million text descriptions, and two of the five fine-tuning datasets are also currently the largest for Chinese cross-modal downstream tasks. Along with CCMB, we also develop a VLP framework named R2D2, which applies a pre-Ranking + Ranking strategy to learn powerful vision-language representations and a two-way distillation method (i.e., target-guided Distillation and feature-guided Distillation) to further enhance the learning capability. With Zero and the R2D2 VLP framework, we achieve state-of-the-art performance on twelve downstream datasets from five broad categories of tasks: image-text retrieval, image-text matching, image captioning, text-to-image generation, and zero-shot image classification. The datasets, models, and code are available at https://github.com/yuxie11/R2D2
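
The abstract names a pre-Ranking + Ranking strategy but does not describe it in detail. As a rough illustration of the general two-stage idea only (not the R2D2 implementation), the sketch below first shortlists candidates with a cheap embedding similarity and then reorders the shortlist with a costlier scorer. Every name here (`pre_rank`, `rank`, `toy_scorer`), the toy vectors, and the choice of cosine similarity are assumptions made for this example.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pre_rank(query: Vector, candidates: List[Vector], k: int) -> List[int]:
    """Stage 1 (assumed): score every candidate cheaply, keep the top-k indices."""
    scores = [(cosine(query, c), i) for i, c in enumerate(candidates)]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:k]]


def rank(
    query: Vector,
    candidates: List[Vector],
    shortlist: List[int],
    scorer: Callable[[Vector, Vector], float],
) -> List[Tuple[int, float]]:
    """Stage 2 (assumed): rescore only the shortlist with a costlier scorer."""
    rescored = [(i, scorer(query, candidates[i])) for i in shortlist]
    rescored.sort(key=lambda s: (-s[1], s[0]))
    return rescored


def toy_scorer(q: Vector, c: Vector) -> float:
    """Hypothetical stand-in for a joint image-text matching model."""
    return -sum((x - y) ** 2 for x, y in zip(q, c))


# Toy data: one text embedding and four image embeddings (made-up values).
text_query = [1.0, 0.0]
image_embeddings = [[3.0, 0.0], [0.0, 1.0], [1.0, 0.2], [-1.0, 0.0]]

shortlist = pre_rank(text_query, image_embeddings, k=2)
results = rank(text_query, image_embeddings, shortlist, toy_scorer)
print(shortlist)
print([i for i, _ in results])
```

In this toy run the second stage reverses the order of the two shortlisted candidates, which is the point of a two-stage design: the expensive scorer only sees the few candidates that survive the cheap pass.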

Multimodal Representation

CCMB: A Large-scale Chinese Cross-modal Benchmark
