6 months ago

Yifei Xu, Yuning Chen, Xumiao Zhang, Xianshang Lin, Pan Hu, Yunfei Ma, Songwu Lu, Wan Du, Zhuoqing Mao, Ennan Zhai

Abstract

Amid the thriving ecosystem of cloud computing and the proliferation of Large Language Model (LLM)-based code generation tools, benchmarks for code generation in cloud-native applications are lacking. To address this need, we present CloudEval-YAML, a practical benchmark for cloud configuration generation. CloudEval-YAML tackles the diversity challenge by focusing on YAML, the de facto standard of numerous cloud-native tools. We develop the benchmark with practicality in mind: the dataset consists of hand-written problems with unit tests targeting practical scenarios. We further enhance the dataset to meet practical needs by rephrasing questions in a concise, abbreviated, and bilingual manner. The dataset consists of 1011 problems that took more than 1200 human hours to complete. To improve practicality during evaluation, we build a scalable evaluation platform for CloudEval-YAML that achieves a 20-fold speedup over a single machine. To the best of our knowledge, CloudEval-YAML is the first hand-written dataset targeting cloud-native applications. We present an in-depth evaluation of 12 LLMs, leading to a deeper understanding of the problems and the LLMs, as well as effective methods to improve task performance and reduce cost.

CloudEval-YAML: A Practical Benchmark for Cloud Configuration Generation
