Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

NitroGen: An Open Foundation Model for Generalist Gaming Agents

H-Neurons: On the Existence, Impact, and Origin of Hallucination-Associated Neurons in LLMs

The World is Your Canvas: Painting Promptable Events with Reference Images, Trajectories, and Text

Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection

Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation

Generative Refocusing: Flexible Defocus Control from a Single Image

StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors

Next-Embedding Prediction Makes Strong Vision Learners

Agent AI: Surveying the Horizons of Multimodal Interaction

AI Mathematician as a Partner in Advancing Mathematical Discovery -- A Case Study in Homogenization Theory

GenEval 2: Addressing Benchmark Drift in Text-to-Image Evaluation

PrivateXR: Defending Privacy Attacks in Extended Reality Through Explainable AI-Guided Differential Privacy

Temporal Frictions and Judicial Outcomes: Analyzing the Impact of Time Delays on Criminal Sentencing in Cook County (2020-2024)

Meta-RL Induces Exploration in Language Agents

LLMCache: Layer-Wise Caching Strategies for Accelerated Reuse in Transformer Inference

OPENTOUCH: Bringing Full-Hand Touch to Real-World Interaction

VideoRewardBench: Comprehensive Evaluation of Multimodal Reward Models for Video Understanding

Soul: Breathe Life into Digital Human for High-fidelity Long-term Multimodal Animation

IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images with Generative Visual

RecGPT-V2 Technical Report

Vector Prism: Animating Vector Graphics by Stratifying Semantic Structure

OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value

Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans?

WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling

MMGR: Multi-Modal Generative Reasoning

FrontierScience: Evaluating AI’s Ability To Perform Expert-Level Scientific Tasks

The FACTS Leaderboard: A Comprehensive Benchmark for Large Language Model Factuality

Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models

KlingAvatar 2.0 Technical Report

QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management

ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding

Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics
