Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis
Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards
CaricatureGS: Exaggerating 3D Gaussian Splatting Faces With Gaussian Curvature
The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning
MMFormalizer: Multimodal Autoformalization in the Wild
Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization
Breaking the Sorting Barrier for Directed Single-Source Shortest Paths
GR-Dexter Technical Report
VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
RelayLLM: Efficient Reasoning via Collaborative Decoding
Token-Level LLM Collaboration via FusionRoute
RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes
Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
MemRL: Self-Evolving Agents via Runtime Reinforcement Learning on Episodic Memory
From Failure to Mastery: Generating Hard Samples for Tool-use Agents
Choreographing a World of Dynamic Objects
Klear: Unified Multi-Task Audio-Video Joint Generation
Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning
Benchmark^2: Systematic Evaluation of LLM Benchmarks
MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting
Diversity or Precision? A Deep Dive into Next Token Prediction
Confucius Code Agent: Scalable Agent Scaffolding for Real-World Codebases
DreamStyle: A Unified Framework for Video Stylization
UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision
LTX-2: Efficient Joint Audio-Visual Foundation Model
SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization
InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields