Papers
Daily updated cutting-edge AI research papers to help you keep up with the latest AI trends

Measuring Agents in Production

PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts

ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models
SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning
OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory
Preserving Source Video Realism: High-Fidelity Face Swapping for Cinematic Quality
Visionary: The World Model Carrier Built on WebGPU-Powered Gaussian Splatting Platform
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Soft Adaptive Policy Optimization
Scaling Zero-Shot Reference-to-Video Generation
Voxify3D: Pixel Art Meets Volumetric Rendering
DoVer: Intervention-Driven Auto Debugging for LLM Multi-Agent Systems
Unified Video Editing with Temporal Reasoner
Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning
iSeal: Encrypted Fingerprinting for Reliable LLM Ownership Verification
DAVSP: Safety Alignment for Large Vision-Language Models via Deep Aligned Visual Safety Prompt
WorldGen: From Text to Traversable and Interactive 3D Worlds
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance
DTS: Enhancing Large Reasoning Models via Decoding Tree Sketching
Adaptive Kernel Design for Bayesian Optimization Is a Piece of CAKE with LLMs
DePass: Unified Feature Attributing by Simple Decomposed Forward Pass
COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence
From Imitation to Discrimination: Toward A Generalized Curriculum Advantage Mechanism Enhancing Cross-Domain Reasoning Tasks
PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling
EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture
EditThinker: Unlocking Iterative Reasoning for Any Image Editor
TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows
CARE-PD: A Multi-Site Anonymized Clinical Dataset for Parkinson's Disease Gait Assessment
WenetSpeech-Chuan: A Large-Scale Sichuanese Corpus with Rich Annotation for Dialectal Speech Processing
PolypSense3D: A Multi-Source Benchmark Dataset for Depth-Aware Polyp Size Measurement in Endoscopy
PhysDrive: A Multimodal Remote Physiological Measurement Dataset for In-vehicle Driver Monitoring