Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

GraphLocator: Graph-guided Causal Reasoning for Issue Localization

Evaluating Parameter Efficient Methods for RLVR
End-to-End Test-Time Training for Long Context
DreamOmni3: Scribble-based Editing and Generation
UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement
mimic-video: Video-Action Models for Generalizable Robot Control Beyond VLAs
HY-Motion 1.0: Scaling Flow Matching Models for Text-To-Motion Generation
SurgWorld: Learning Surgical Robot Policies from Videos via World Modeling
SpotEdit: Selective Region Editing in Diffusion Transformers
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation
SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents
Yume-1.5: A Text-Controlled Interactive World Generation Model
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
LongFly: Long-Horizon UAV Vision-and-Language Navigation with Spatiotemporal Context Integration
Attention Is Not What You Need
SlideTailor: Personalized Presentation Slide Generation for Scientific Papers
InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion
Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding
Measuring short-form factuality in large language models
DeepSearchQA: Bridging the Comprehensiveness Gap for Deep Research Agents
MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents
AI-Trader: Benchmarking Autonomous Agents in Real-Time Financial Markets
Latent Implicit Visual Reasoning
LLM Personas as a Substitute for Field Experiments in Method Benchmarking
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming
TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models
DreaMontage: Arbitrary Frame-Guided One-Shot Video Generation