5 months ago

CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation

Han He; Qianchu Liu; Lei Xu; Chaitanya Shivade; Yi Zhang; Sundararajan Srinivasan; Katrin Kirchhoff

Abstract

Existing automatic prompt engineering methods are typically designed for discriminative tasks, where new task prompts are iteratively refined with limited feedback from a single metric reflecting a single aspect. However, these approaches are suboptimal for generative tasks, which require more nuanced guidance beyond a single numeric metric to improve the prompt and optimize multiple aspects of the generated text. To address these challenges, we propose a novel multi-aspect Critique-Suggestion-guided automatic Prompt Optimization (CriSPO) approach. CriSPO introduces a critique-suggestion module as its core component. This module spontaneously discovers aspects, and compares generated and reference texts across these aspects, providing specific suggestions for prompt modification. These clear critiques and actionable suggestions guide a receptive optimizer module to make more substantial changes, exploring a broader and more effective search space. To further improve CriSPO with multi-metric optimization, we introduce an Automatic Suffix Tuning (AST) extension to enhance the performance of task prompts across multiple metrics. We evaluate CriSPO on 4 state-of-the-art LLMs across 4 summarization and 5 QA datasets. Extensive experiments show 3-4% ROUGE score improvement on summarization and substantial improvement of various metrics on QA. Code available at https://github.com/amazon-science/crispo
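The critique-then-rewrite loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`optimize_prompt`, `generate`, `critique`, `rewrite`, `score`), the greedy accept-if-better search, and the toy stand-ins for the LLM calls are all assumptions made for illustration. In practice each callable would wrap an LLM call; see the official repository for the real method.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (generated text, reference text)


def optimize_prompt(
    seed_prompt: str,
    examples: List[Tuple[str, str]],            # (input text, reference output)
    generate: Callable[[str, str], str],        # (prompt, input) -> generated text
    critique: Callable[[str, List[Pair]], str], # (prompt, pairs) -> critique with suggestions
    rewrite: Callable[[str, str], str],         # (prompt, critique) -> revised prompt
    score: Callable[[str, str], float],         # (generated, reference) -> metric value
    steps: int = 5,
) -> Tuple[str, float]:
    """Greedy loop (an assumption): keep a revised prompt only if it scores higher."""

    def evaluate(prompt: str) -> Tuple[List[Pair], float]:
        pairs = [(generate(prompt, x), ref) for x, ref in examples]
        mean = sum(score(g, r) for g, r in pairs) / len(pairs)
        return pairs, mean

    best_prompt = seed_prompt
    pairs, best_score = evaluate(best_prompt)
    for _ in range(steps):
        feedback = critique(best_prompt, pairs)      # compare outputs with references
        candidate = rewrite(best_prompt, feedback)   # apply the suggestions
        cand_pairs, cand_score = evaluate(candidate)
        if cand_score > best_score:
            best_prompt, best_score, pairs = candidate, cand_score, cand_pairs
    return best_prompt, best_score


# Toy stand-ins for the LLM calls (hypothetical, deterministic, no network access).
def toy_generate(prompt: str, text: str) -> str:
    words = text.split()
    return " ".join(words[:3]) if "brief" in prompt else text


def toy_critique(prompt: str, pairs: List[Pair]) -> str:
    too_long = any(len(g.split()) > len(r.split()) for g, r in pairs)
    if too_long:
        return "Length: outputs exceed the references. Suggestion: ask for brief output."
    return "No issues found."


def toy_rewrite(prompt: str, feedback: str) -> str:
    if "brief" in feedback and "brief" not in prompt:
        return prompt + " Be brief."
    return prompt


def toy_score(generated: str, reference: str) -> float:
    return 1.0 if generated == reference else 0.0


result_prompt, result_score = optimize_prompt(
    "Summarize the text.",
    [("the cat sat on the mat today", "the cat sat")],
    toy_generate, toy_critique, toy_rewrite, toy_score,
)
print(result_prompt, result_score)
```

Passing the model calls in as plain callables keeps the loop independent of any particular LLM client, so the same skeleton runs with the toy functions above or with real API wrappers.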

Code Repositories

amazon-science/crispo (official; mentioned in GitHub)

Benchmarks

Benchmark                                      Methodology    Metrics
abstractive-text-summarization-on-cnn-daily    CriSPO 3-shot  ROUGE-L: 27.4
abstractive-text-summarization-on-cnn-daily-2  CriSPO 3-shot  ROUGE-1: 42.1, ROUGE-2: 17
text-summarization-on-aci-bench                CriSPO 3-shot  ROUGE-1: 63.1, ROUGE-2: 32.5, ROUGE-L: 41
text-summarization-on-meetingbank              CriSPO 3-shot  ROUGE-1: 58.5, ROUGE-2: 46.5, ROUGE-L: 54.1
text-summarization-on-samsum-corpus            CriSPO 3-shot  ROUGE-1: 47.2, ROUGE-2: 20.8, ROUGE-L: 38.2
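For readers unfamiliar with the metric, the n-gram overlap behind ROUGE-1 and ROUGE-2 can be sketched as follows. This is a simplified illustration, not the scorer used for the numbers above: it assumes lowercased whitespace tokenization with no stemming, returns an F1 value in the range 0 to 1 rather than 0 to 100, and the function names are hypothetical. Whether the listed scores are F-measures is not stated on this page.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate: str, reference: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter intersection clips repeated n-grams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


r1 = rouge_n_f1("the cat sat on the mat", "the cat lay on the mat", n=1)
r2 = rouge_n_f1("the cat sat on the mat", "the cat lay on the mat", n=2)
print(round(r1, 3), round(r2, 3))
```

ROUGE-L is based on the longest common subsequence rather than fixed-length n-grams and is not covered by this sketch.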
