Composed Image Retrieval for Training-Free Domain Conversion

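Abstract

This work addresses composed image retrieval in the context of domain conversion, where the content of a query image is retrieved in the domain specified by the query text. We show that a strong vision-language model provides sufficient descriptive power without additional training. The query image is mapped to the text input space using textual inversion. Unlike the common practice of inverting in the continuous space of text tokens, we use the discrete word space, via a nearest-neighbor search in a text vocabulary. With this inversion, the image is softly mapped across the vocabulary and is made more robust using retrieval-based augmentation. Database images are retrieved by a weighted ensemble of text queries that combine the mapped words with the domain text. Our method outperforms prior art by a large margin on standard and newly introduced benchmarks. Code: https://github.com/NikosEfth/freedom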
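
The pipeline described in the abstract (nearest-neighbor inversion into a discrete vocabulary, followed by a weighted ensemble of text queries) can be illustrated with a small sketch. This is not the authors' implementation: the 4-dimensional toy embeddings, the `toy_text_encoder` stand-in for a real text encoder, the softmax weighting, and the choice of `k` are all assumptions made for illustration, and the retrieval-based augmentation step is omitted. In practice the embeddings would come from a vision-language model such as CLIP.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def invert_to_words(image_emb, vocab_embs, k=2):
    """Map an image embedding to its k nearest vocabulary words with soft weights.

    The softmax weighting is an assumption, not taken from the paper.
    """
    sims = l2_normalize(vocab_embs) @ l2_normalize(image_emb)
    top = np.argsort(-sims)[:k]                # indices of the k most similar words
    w = np.exp(sims[top] - sims[top].max())    # numerically stable softmax
    return top, w / w.sum()

def toy_text_encoder(word_emb, domain_emb):
    """Hypothetical stand-in for encoding a prompt like '<domain> of a <word>'."""
    return l2_normalize(word_emb + domain_emb)

def rank_database(db_embs, query_embs, weights):
    """Rank database images by the weighted sum of their similarity to each text query."""
    scores = l2_normalize(db_embs) @ l2_normalize(query_embs).T   # (n_db, k)
    return np.argsort(-(scores @ weights))

# Toy data: axes 0-2 stand for object content, axis 3 for the 'sketch' domain.
vocab = ["dog", "cat", "car"]
vocab_embs = np.array([[1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0]])
sketch_domain = np.array([0, 0, 0, 1.0])
query_image = np.array([0.9, 0.3, 0.0, 0.0])   # a photo that is mostly 'dog'

db_names = ["dog photo", "dog sketch", "cat sketch", "car sketch"]
db_embs = np.array([[1.0, 0, 0, 0], [1.0, 0, 0, 1.0], [0, 1.0, 0, 1.0], [0, 0, 1.0, 1.0]])

word_idx, word_weights = invert_to_words(query_image, vocab_embs, k=2)
queries = np.stack([toy_text_encoder(vocab_embs[i], sketch_domain) for i in word_idx])
ranking = rank_database(db_embs, queries, word_weights)

print([vocab[i] for i in word_idx])   # words the image was mapped to
print(db_names[ranking[0]])           # top-ranked database image
```

With these toy vectors the query image maps to "dog" and "cat", and the top-ranked database item is the dog sketch, that is, the query content in the requested domain. Using several weighted words rather than a single nearest word is what makes the mapping soft.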

Code repository

nikosefth/freedom
Official
PyTorch
Mentioned in GitHub

Benchmarks

Benchmark                                         Method                   mAP
zero-shot-composed-image-retrieval-zs-cir-on-6    FreeDom (CLIP-L/14)      29.91
zero-shot-composed-image-retrieval-zs-cir-on-6    CompoDiff (CLIP-L/14)    12.88
zero-shot-composed-image-retrieval-zs-cir-on-6    SEARLE (CLIP-L/14)       14.04
zero-shot-composed-image-retrieval-zs-cir-on-6    Pic2Word (CLIP-L/14)     7.88
zero-shot-composed-image-retrieval-zs-cir-on-6    WeiCom (CLIP-L/14)       10.47
zero-shot-composed-image-retrieval-zs-cir-on-6    MagicLens (CLIP-L/14)    9.13
zero-shot-composed-image-retrieval-zs-cir-on-7    CompoDiff (CLIP-L/14)    22.95
zero-shot-composed-image-retrieval-zs-cir-on-7    Pic2Word (CLIP-L/14)     12
zero-shot-composed-image-retrieval-zs-cir-on-7    MagicLens (CLIP-L/14)    20.06
zero-shot-composed-image-retrieval-zs-cir-on-7    SEARLE (CLIP-L/14)       21.78
zero-shot-composed-image-retrieval-zs-cir-on-7    WeiCom (CLIP-L/14)       8.52
zero-shot-composed-image-retrieval-zs-cir-on-7    FreeDom (CLIP-L/14)      37.27
zero-shot-composed-image-retrieval-zs-cir-on-8    FreeDom (CLIP-L/14)      26.1
zero-shot-composed-image-retrieval-zs-cir-on-8    WeiCom (CLIP-L/14)       10.54
zero-shot-composed-image-retrieval-zs-cir-on-8    MagicLens (CLIP-L/14)    19.66
zero-shot-composed-image-retrieval-zs-cir-on-8    SEARLE (CLIP-L/14)       15.13
zero-shot-composed-image-retrieval-zs-cir-on-8    CompoDiff (CLIP-L/14)    10.32
zero-shot-composed-image-retrieval-zs-cir-on-8    Pic2Word (CLIP-L/14)     9.76
zero-shot-composed-image-retrieval-zs-cir-on-9    FreeDom (CLIP-L/14)      33.24
zero-shot-composed-image-retrieval-zs-cir-on-9    SEARLE (CLIP-L/14)       25.46
zero-shot-composed-image-retrieval-zs-cir-on-9    Pic2Word (CLIP-L/14)     21.27
zero-shot-composed-image-retrieval-zs-cir-on-9    CompoDiff (CLIP-L/14)    21.61
zero-shot-composed-image-retrieval-zs-cir-on-9    WeiCom (CLIP-L/14)       26.6
zero-shot-composed-image-retrieval-zs-cir-on-9    MagicLens (CLIP-L/14)    24.21
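
For readers unfamiliar with the metric reported above, the following is a minimal sketch of the standard definition of mean average precision (mAP) for retrieval. It assumes binary relevance labels and that every relevant item appears somewhere in the ranked list; the exact evaluation protocol of these benchmarks is not stated on this page, and the function names are illustrative. The sketch returns values in [0, 1], whereas the figures above appear to be percentages.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: sequence of 0/1 flags, ordered by the retrieval ranking.
    Precision is measured at the rank of each relevant item, then averaged.
    """
    hits = 0
    precisions = []
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(per_query_relevance):
    """Mean of the per-query average precisions."""
    aps = [average_precision(r) for r in per_query_relevance]
    return sum(aps) / len(aps)

# Two toy queries: relevant items at ranks 1 and 3, and at rank 2.
ap_first = average_precision([1, 0, 1, 0])    # (1/1 + 2/3) / 2
ap_second = average_precision([0, 1])         # (1/2) / 1
map_score = mean_average_precision([[1, 0, 1, 0], [0, 1]])
print(ap_first, ap_second, map_score)
```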
