Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion
Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing
CARE-Edit: Condition-Aware Routing of Experts for Contextual Image Editing
Believe Your Model: Distribution-Guided Confidence Calibration
LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory
How Far Can Unsupervised RLVR Scale LLM Training?
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence
Lost in Stories: Consistency Bugs in Long Story Generation by LLMs
DreamCAD: Scaling Multi-modal CAD Generation using Differentiable Parametric Surfaces
Real-Time AI Service Economy: A Framework for Agentic Computing Across the Continuum
NOTAI.AI: Explainable Detection of Machine-Generated Text via Curvature and Feature Attribution
Safer Reasoning Traces: Measuring and Mitigating Chain-of-Thought Leakage in LLMs
RACAS: Controlling Diverse Robots With a Single Agentic System
Bias In, Bias Out? Finding Unbiased Subnetworks in Vanilla Models
ArtLLM: Generating Articulated Assets via 3D LLM
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images
RoboPocket: Improve Robot Policies Instantly with Your Phone
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios
DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval
SkillNet: Create, Evaluate, and Connect AI Skills
MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier
SURvHTE-Bench: A Benchmark for Heterogeneous Treatment Effect Estimation in Survival Analysis
PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms
ArtHOI: Articulated Human-Object Interaction Synthesis by 4D Reconstruction from Video Priors
Proact-VL: A Proactive VideoLLM for Real-Time AI Companions
T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning
Heterogeneous Agent Collaborative Reinforcement Learning
Helios: Real Real-Time Long Video Generation Model