Papers
Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation

Stateful Conformer with Cache-Based Inference for Streaming Automatic Speech Recognition
Native and Compact Structured Latents for 3D Generation
Continuous Audio Language Models
Evolving Interactive Diagnostic Agents in a Virtual Clinical Environment
WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times
HunyuanVideo-Foley: Multimodal Diffusion with Representation Alignment for High-Fidelity Foley Audio Generation
Fara-7B: An Efficient Agentic Model for Computer Use
Fun-ASR Technical Report
Accelerating Scientific Research with Gemini: Case Studies and Common Techniques
Scaling Small Agents Through Strategy Auctions
Vibe AIGC: A New Paradigm for Content Generation via Agentic Orchestration
PaperSearchQA: Learning to Search and Reason over Scientific Papers with RLVR
EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models
A-RAG: Scaling Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces
Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization
SoMA: A Real-to-Sim Neural Simulator for Robotic Soft-body Manipulation
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation
daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently
Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks
AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration
No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding
DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints
CL-bench: A Benchmark for Context Learning
Reinforcement Learning via Self-Distillation
Chatbots as social companions: How people perceive consciousness, human likeness, and social health benefits in machines
POPE: Learning to Reason on Hard Problems via Privileged On-Policy Exploration
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing
Closing the Loop: Universal Repository Representation with RPG-Encoder
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models