An open-weight model from LG AI Research, offering top-tier reasoning capabilities and impressive speed, albeit with a high cost and notable verbosity.
EXAONE 4.0 32B (Reasoning) is a formidable entry into the open-weight model space from LG AI Research. As its name suggests, this 32-billion-parameter model is specifically tuned for complex reasoning, logical deduction, and multi-step problem-solving. It distinguishes itself with a potent combination of high intelligence, ranking in the top tier of our benchmarks, and impressive generation speed, making it a compelling option for demanding, real-time applications.
Scoring an impressive 43 on the Artificial Analysis Intelligence Index, EXAONE 4.0 32B places 10th out of 84 models, significantly outperforming the class average of 26. This intellectual prowess is complemented by a generation speed of over 109 tokens per second, which is faster than the average model in its class. This pairing of smarts and speed is rare and positions the model as a premium tool for developers who need both high-quality output and a responsive user experience.
However, this premium performance comes with a premium price tag. At $0.60 per million input tokens and $1.00 per million output tokens, it is substantially more expensive than its open-weight peers. The model also exhibits a strong tendency towards verbosity, generating over four times the average number of tokens during our intelligence evaluation. This combination of high per-token cost and high token output means that operational expenses can accumulate quickly. Developers must weigh the model's exceptional capabilities—including its massive 131k context window—against a total cost of ownership that rivals some proprietary, closed-source models.
Ultimately, EXAONE 4.0 32B is a specialist model. It's not a cost-effective choice for simple, high-volume tasks. Instead, it excels in scenarios where its deep reasoning, large context handling, and rapid response times are critical requirements that justify the higher operational cost. Use cases like in-depth legal document analysis, complex scientific research, or sophisticated multi-turn conversational AI are where this model is designed to shine.
- **Intelligence Index:** 43 (10 / 84)
- **Output Speed:** 109.3 tokens/s
- **Input Price:** $0.60 / 1M tokens
- **Output Price:** $1.00 / 1M tokens
- 100M tokens
- **Latency (TTFT):** 0.33 seconds
| Spec | Details |
|---|---|
| Owner | LG AI Research |
| License | Open |
| Model Size | 32 Billion Parameters |
| Context Window | 131,072 tokens |
| Input Modality | Text |
| Output Modality | Text |
| Variant | Reasoning (Fine-tuned) |
| Benchmarked Provider | FriendliAI |
| Latency (TTFT) | 0.33 seconds |
| Output Speed | 109.3 tokens/second |
| Input Token Price | $0.60 / 1M tokens |
| Output Token Price | $1.00 / 1M tokens |
| Blended Price (3:1) | $0.70 / 1M tokens |
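The blended figure in the table follows directly from the 3:1 input-to-output weighting. A quick sketch of the arithmetic:

```python
def blended_price(input_price: float, output_price: float, ratio: float = 3.0) -> float:
    """Blend per-million-token prices at a given input:output ratio (3:1 here)."""
    return (ratio * input_price + output_price) / (ratio + 1)

# EXAONE 4.0 32B on FriendliAI: $0.60 input, $1.00 output
print(blended_price(0.60, 1.00))  # 0.7
```

The same helper lets you recompute the blend for workloads with a different input/output mix than the standard 3:1 assumption.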
Currently, EXAONE 4.0 32B is available through a limited number of API providers. Our benchmarks focus on FriendliAI, which offers a performant and reliable endpoint for accessing the model's capabilities. As the ecosystem matures, we expect to see more providers offering this model.
| Priority | Pick | Why | Tradeoff to accept |
|---|---|---|---|
| Best Overall | FriendliAI | As the sole benchmarked provider, FriendliAI is the default choice. It delivers the model's full potential with excellent speed (109.3 tokens/s) and low latency (0.33s TTFT). | The primary tradeoff is the model's inherent high cost, which is a function of the model itself, not the provider. |
| Fastest | FriendliAI | With an output speed well above the class average, FriendliAI's serving infrastructure proves highly effective for this model, making it the fastest option available. | Speed comes at the model's set price; there is no slower, cheaper alternative for this specific model. |
| Cheapest | FriendliAI | By default, FriendliAI is also the most cost-effective option. The pricing of $0.60 (input) and $1.00 (output) is the current market rate for this model. | 'Cheapest' is relative; the model remains one of the most expensive open-weight options on the market. |
Provider analysis is based on public pricing and performance benchmarks conducted by Artificial Analysis. Performance can vary based on workload, concurrency, and region. Prices are subject to change. This is not a sponsored placement.
The true cost of an AI model emerges in real-world application. The following scenarios illustrate the estimated cost of using EXAONE 4.0 32B for various tasks, based on its pricing on FriendliAI. Note how the interplay between input size, output verbosity, and per-token price affects the final cost.
| Scenario | Input | Output | What it represents | Estimated cost |
|---|---|---|---|---|
| Legal Contract Review | 50,000 tokens | 2,000 tokens | Summarizing a long document, leveraging the large context window. | ~$0.032 |
| Complex Code Scaffolding | 1,000 tokens | 4,000 tokens | Generating a functional application skeleton from a detailed prompt. | ~$0.0046 |
| Multi-Turn RAG Session | 12,500 tokens (total) | 7,500 tokens (total) | A 5-turn chat using retrieved documents for context in each turn. | ~$0.015 |
| Creative Story Generation | 200 tokens | 5,000 tokens | A simple prompt yielding a long, verbose creative output. | ~$0.0051 |
| Email Classification (Batch) | 250,000 tokens (1000 emails) | 5,000 tokens (1000 labels) | A simple task where the model's high cost and power are overkill. | ~$0.155 |
These examples highlight that EXAONE 4.0 32B is most cost-effective when its reasoning power is essential and the output is controlled. For simple, high-volume tasks like classification, its cost is prohibitive compared to smaller, cheaper models. The key is to reserve its use for high-value problems that justify the expense.
Given its high price and verbosity, managing the cost of EXAONE 4.0 32B is crucial for building a sustainable application. Proactive strategies can help you leverage its power without incurring runaway expenses. Below are several tactics to consider.
This model's default behavior is to be verbose, which directly increases costs due to the high output token price. You can mitigate this through careful prompt engineering.
- **`max_tokens` parameter:** Set a hard limit on the length of the generated output to prevent unexpectedly long and expensive responses.

The 131k context window is a powerful but expensive feature. Filling it unnecessarily will lead to high costs on every call, so trim retrieved documents and conversation history to only what the task requires.
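As a sketch of the `max_tokens` tactic (assuming an OpenAI-compatible chat endpoint; the model identifier and system prompt below are illustrative, not verified values), the request parameters might look like:

```python
# Hypothetical request parameters for an OpenAI-compatible chat completion call.
# The model id is an assumption; only the max_tokens capping pattern is the point.
request = {
    "model": "exaone-4.0-32b-reasoning",  # illustrative model identifier
    "messages": [
        {"role": "system", "content": "Answer concisely. No preamble."},
        {"role": "user", "content": "Summarize the attached contract in 5 bullets."},
    ],
    "max_tokens": 1024,  # hard cap on output length
    "temperature": 0.2,
}

# At $1.00 / 1M output tokens, the cap bounds the worst-case output cost per call:
worst_case_output_cost = request["max_tokens"] * 1.00 / 1_000_000
print(worst_case_output_cost)  # 0.001024
```

A terse system prompt plus a hard cap attacks the verbosity problem from both ends: the model is instructed to be brief, and the cap enforces a ceiling if it ignores the instruction.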
Many applications receive repetitive user queries. Re-calculating the same answer is a waste of money and compute.
EXAONE 4.0 32B is a specialist tool. Using it for simple tasks is like using a sledgehammer to crack a nut—inefficient and expensive.
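A common mitigation is a router that sends simple, high-volume tasks to a cheaper model and reserves EXAONE for genuine reasoning work. A toy sketch follows; the model names and the keyword heuristic are placeholders (production routers usually use a small classifier instead):

```python
# Illustrative task router. The keyword list and model names are assumptions,
# shown only to demonstrate the routing pattern.
REASONING_HINTS = ("prove", "derive", "multi-step", "analyze", "plan")

def pick_model(task: str) -> str:
    """Route cheap high-volume tasks away from the expensive reasoning model."""
    if any(hint in task.lower() for hint in REASONING_HINTS):
        return "exaone-4.0-32b-reasoning"  # expensive specialist
    return "small-cheap-model"             # e.g. a small open-weight model

print(pick_model("Classify this email as spam or not"))   # small-cheap-model
print(pick_model("Analyze this contract for liability"))  # exaone-4.0-32b-reasoning
```

Even a crude router like this prevents the email-classification scenario from the table above from ever reaching the premium-priced model.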
EXAONE 4.0 32B is a 32-billion-parameter large language model developed by LG AI Research. This specific version, designated "Reasoning," has been fine-tuned to excel at tasks requiring logical deduction, complex instruction following, and multi-step problem-solving. It is part of LG's broader EXAONE family of multimodal foundation models.
LG AI Research is the central artificial intelligence research hub for the South Korean conglomerate LG Group. Their mission is to advance AI technology and apply it across LG's various industries, from electronics to chemicals. The development of the EXAONE model series is one of their flagship initiatives.
The "(Reasoning)" tag indicates that this is a specialized variant of the base EXAONE 4.0 model. It has undergone additional training (fine-tuning) on datasets specifically designed to enhance its abilities in logic, mathematics, code generation, and following complex, multi-part instructions. This makes it more powerful for analytical tasks than a general-purpose base model.
EXAONE 4.0 32B (Reasoning) is a top performer. Its intelligence score of 43 places it well above the average and in the same league as many leading proprietary models. It is faster than the average model in its class. However, it is also significantly more expensive and more verbose than most other open-weight models of a similar size.
This model is best suited for high-value tasks where its specific strengths can justify its cost, such as in-depth legal document analysis, complex scientific research, large-document summarization that exploits the 131k context window, and sophisticated multi-turn conversational AI.
An "Open" license generally means the model weights are publicly available, allowing for greater flexibility than closed, API-only models. Developers can potentially self-host the model for privacy and control, or fine-tune it on their own proprietary data to create a more specialized version. However, developers should always consult the specific license agreement for EXAONE 4.0 to understand the precise terms, conditions, and any restrictions on commercial use.