a month ago

Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation

Haoran Zhang Yafu Li Xuyang Hu Dongrui Liu Zhilin Wang Bo Li Yu Cheng

Abstract

Large language models (LLMs) are increasingly applied in diverse real-world scenarios, each governed by bespoke behavioral and safety specifications (spec) custom-tailored by users or organizations. These spec, categorized into safety-spec and behavioral-spec, vary across scenarios and evolve with changing preferences and requirements. We formalize this challenge as specification alignment, focusing on LLMs' ability to follow dynamic, scenario-specific spec from both behavioral and safety perspectives. To address this challenge, we propose Align3, a lightweight method that employs Test-Time Deliberation (TTD) with hierarchical reflection and revision to reason over the specification boundaries. We further present SpecBench, a unified benchmark for measuring specification alignment, covering 5 scenarios, 103 spec, and 1,500 prompts. Experiments on 15 reasoning and 18 instruct models with several TTD methods, including Self-Refine, TPO, and MoreThink, yield three key findings: (i) test-time deliberation enhances specification alignment; (ii) Align3 advances the safety-helpfulness trade-off frontier with minimal overhead; (iii) SpecBench effectively reveals alignment gaps. These results highlight the potential of test-time deliberation as an effective strategy for reasoning over the real-world specification boundaries.

