UI-Level Evaluation of ALLaM 34B: Measuring an Arabic-Centric LLM via HUMAIN Chat
Omer Nacar

Abstract
Large language models (LLMs) trained primarily on English corpora often struggle to capture the linguistic and cultural nuances of Arabic. To address this gap, the Saudi Data and AI Authority (SDAIA) introduced the ALLaM family of Arabic-focused models. The most capable of these available to the public, ALLaM-34B, was subsequently adopted by HUMAIN, which developed and deployed HUMAIN Chat, a closed conversational web service built on this model. This paper presents an expanded and refined UI-level evaluation of ALLaM-34B. Using a prompt pack spanning Modern Standard Arabic, five regional dialects, code-switching, factual knowledge, arithmetic and temporal reasoning, creative generation, and adversarial safety, we collected 115 outputs (23 prompts × 5 runs) and scored each with three frontier LLM judges (GPT-5, Gemini 2.5 Pro, Claude Sonnet-4). We compute category-level means with 95% confidence intervals, analyze score distributions, and visualize dialect-wise metric heat maps. The updated analysis reveals consistently high performance on generation and code-switching tasks (both averaging 4.92/5), alongside strong results in MSA handling (4.74/5), solid reasoning ability (4.64/5), and improved dialect fidelity (4.21/5). Safety-related prompts show stable, reliable performance (4.54/5). Taken together, these results position ALLaM-34B as a robust and culturally grounded Arabic LLM, demonstrating both technical strength and practical readiness for real-world deployment.
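As a minimal sketch of the aggregation described above, the snippet below shows one way per-category means and 95% confidence intervals could be computed from judge scores. The category names and score values are placeholders, not data from the paper, and the normal-approximation interval is an assumption about the statistical method.

```python
# Hypothetical sketch (not the paper's code): per-category means with
# 95% confidence intervals over judge ratings on a 1-5 scale.
import math
import statistics

# scores[category] -> list of judge ratings; values below are illustrative only
scores = {
    "generation": [5, 5, 5, 4, 5],
    "code-switching": [5, 5, 4, 5, 5],
    "msa": [5, 4, 5, 5, 4],
}

def mean_with_ci(values, z=1.96):
    """Return (mean, half-width of a normal-approximation 95% CI)."""
    m = statistics.mean(values)
    if len(values) < 2:
        return m, 0.0
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return m, z * sem

for category, vals in scores.items():
    m, half = mean_with_ci(vals)
    print(f"{category}: {m:.2f} ± {half:.2f} (95% CI)")
```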