5 months ago

Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence

Yining Hong, Rui Sun, Bingxuan Li, Xingcheng Yao, Maxine Wu, Alexander Chien, Da Yin, Ying Nian Wu, Zhecan James Wang, Kai-Wei Chang

Abstract

AI agents today are mostly siloed - they either retrieve and reason over vast amounts of digital information and knowledge obtained online, or interact with the physical world through embodied perception, planning and action - but rarely both. This separation limits their ability to solve tasks that require integrated physical and digital intelligence, such as cooking from online recipes, navigating with dynamic map data, or interpreting real-world landmarks using web knowledge. We introduce Embodied Web Agents, a novel paradigm for AI agents that fluidly bridge embodiment and web-scale reasoning. To operationalize this concept, we first develop the Embodied Web Agents task environments, a unified simulation platform that tightly integrates realistic 3D indoor and outdoor environments with functional web interfaces. Building upon this platform, we construct and release the Embodied Web Agents Benchmark, which encompasses a diverse suite of tasks including cooking, navigation, shopping, tourism, and geolocation - all requiring coordinated reasoning across physical and digital realms for systematic assessment of cross-domain intelligence. Experimental results reveal significant performance gaps between state-of-the-art AI systems and human capabilities, establishing both challenges and opportunities at the intersection of embodied cognition and web-scale knowledge access. All datasets, code and websites are publicly available at our project page https://embodied-web-agent.github.io/.
