Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

Liyiming Ke; Xiujun Li; Yonatan Bisk; Ari Holtzman; Zhe Gan; Jingjing Liu; Jianfeng Gao; Yejin Choi; Siddhartha Srinivasa

Abstract

We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language Navigation challenge of Anderson et al. (2018). Given a natural language instruction and photo-realistic image views of a previously unseen environment, the agent is tasked with navigating from a source to a target location as quickly as possible. While all current approaches either make local action decisions or score entire trajectories using beam search, ours balances local and global signals when exploring an unobserved environment. Importantly, this lets us act greedily but use global signals to backtrack when necessary. Applying the FAST framework to existing state-of-the-art models achieves a 17% relative gain, an absolute 6% gain, on Success rate weighted by Path Length (SPL).
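
The idea of acting greedily while keeping a frontier to backtrack to can be illustrated with a small best-first search. This is a minimal sketch under stated assumptions, not the authors' implementation: the graph, the `local_score` callback, and the use of a summed local score as the "global signal" are all hypothetical stand-ins chosen for illustration.

```python
import heapq
from typing import Callable, Dict, List, Tuple

Graph = Dict[str, List[str]]


def frontier_search(
    graph: Graph,
    start: str,
    is_goal: Callable[[str], bool],
    local_score: Callable[[str, str], float],
    max_expansions: int = 100,
) -> Tuple[List[str], int]:
    """Best-first search over partial paths.

    Returns (path, backtracks). The score of a partial path is the sum of
    its local action scores (an assumed stand-in for a global signal).
    """
    tie = 0  # tie-breaker so the heap never compares paths directly
    # heapq is a min-heap, so scores are stored negated.
    frontier = [(0.0, tie, [start])]
    visited = set()
    backtracks = 0
    current = start  # node where the agent currently stands

    for _step in range(max_expansions):
        if not frontier:
            break
        neg_score, _tie, path = heapq.heappop(frontier)
        node = path[-1]
        if node in visited:
            continue
        # The best frontier node is not reachable in one step from the
        # current position: the agent has to rewind to an earlier branch.
        if node != current and path[-2] != current:
            backtracks += 1
        visited.add(node)
        current = node
        if is_goal(node):
            return path, backtracks
        for nxt in graph.get(node, []):
            if nxt not in visited:
                tie += 1
                score = -neg_score + local_score(node, nxt)
                heapq.heappush(frontier, (-score, tie, path + [nxt]))
    return [], backtracks


# Toy example: the locally best first move (A -> B) leads to a poor branch.
toy_graph = {"A": ["B", "C"], "B": ["D"], "C": ["G"], "D": [], "G": []}
toy_scores = {("A", "B"): -0.1, ("A", "C"): -1.0, ("B", "D"): -2.0, ("C", "G"): -0.1}
path, backtracks = frontier_search(
    toy_graph, "A", lambda n: n == "G", lambda u, v: toy_scores[(u, v)]
)
print(path, backtracks)
```

In the toy graph the search first follows the locally attractive move to `B`, finds that continuing to `D` scores worse than the unexplored alternative `C`, and rewinds once before reaching `G`. A pure greedy decoder would have committed to the `B` branch, while beam search would have had to carry both branches forward from the start.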

Code Repositories

Kelym/FAST (official, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
vision-and-language-navigation-on-vln | Tactical Rewind - short | error: 5.14, length: 22.08, oracle success: 0.64, spl: 0.41, success: 0.54
vision-and-language-navigation-on-vln | Tactical Rewind - long | error: 4.29, length: 196.53, oracle success: 0.9, spl: 0.03, success: 0.61
vision-language-navigation-on-room2room | Tactical Rewind - short | spl: 0.41
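
The gap between success and SPL in the "long" entry (success 0.61, SPL 0.03, length 196.53) follows from how SPL discounts each success by the ratio of the shortest path to the path actually taken. A minimal sketch of the metric, assuming the standard definition from the embodied-navigation evaluation literature; the function name and the example numbers are illustrative and are not taken from the benchmark:

```python
from typing import Sequence


def spl(
    successes: Sequence[int],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, averaged over episodes.

    Each episode contributes S * l / max(p, l), where S is 1 on success
    and 0 otherwise, l is the shortest-path length, and p is the length
    of the path the agent actually took.
    """
    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        total += s * l / max(p, l)
    return total / len(successes)


# Three illustrative episodes: an optimal success, a success that took
# twice the shortest path, and a failure.
print(spl([1, 1, 0], [10.0, 10.0, 10.0], [10.0, 20.0, 10.0]))
```

Because the path length sits in the denominator, an agent that reaches the goal only after very long exploration keeps a high success rate but earns an SPL close to zero.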
