Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

Group Sequence Policy Optimization

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45 Law

Decoupling Knowledge and Reasoning in LLMs: An Exploration Using Cognitive Dual-System Theory
Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny
RAVine: Reality-Aligned Evaluation for Agentic Search
Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning
DesignLab: Designing Slides Through Iterative Detection and Correction
Yume: An Interactive World Generation Model
Pixels, Patterns, but No Poetry: To See The World like Humans
MedChatZH: a Better Medical Adviser Learns from Better Instructions
Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning
HySafe-AI: Hybrid Safety Architectural Analysis Framework for AI Systems: A Case Study
Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning
Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning
Upsample What Matters: Region-Adaptive Latent Sampling for Accelerated Diffusion Transformers
MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning
Step-Audio 2 Technical Report
Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning
Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report
Uncertainty-Aware Knowledge Transformers for Peer-to-Peer Energy Trading with Multi-Agent Reinforcement Learning
NoHumansRequired: Autonomous High-Quality Image Editing Triplet Mining
Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampling
WebShaper: Agentically Data Synthesizing via Information-Seeking Formalization
The Invisible Leash: Why RLVR May Not Escape Its Origin
GUI-G^2: Gaussian Reward Modeling for GUI Grounding
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization
Design of intrinsically disordered region binding proteins
An All-Atom Generative Model for Designing Protein Complexes
RedOne: Revealing Domain-specific LLM Post-Training in Social Networking Services
CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models
Mono-InternVL-1.5: Towards Cheaper and Faster Monolithic Multimodal Large Language Models
Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning