Sakana Fugu Ultra just beat the other models on visual polish in a live trading-desk coding test, got close to GLM 5.2, …

Rohan Paul Twitter · Rohan Paul (@rohanpaul_ai) · 2026-06-22

Sakana AI's Fugu Ultra model outperforms rivals on visual polish in a live trading-desk coding test run on the atomic.chat desktop app, but costs 17x as much as competing models and trails GLM 5.2 on overall performance.

Appears in

Sakana AI Fugu Ultra: Multi-Model Orchestration Layer Launch and Early Benchmarks

Extraction

Topics: llm-benchmarks, coding-models, model-comparison, sakana-ai

Claims

Sakana Fugu Ultra achieved the best visual polish score among tested models in a live trading-desk UI coding task.
Fugu Ultra costs approximately 17x as much as comparable models in the test.
Fugu Ultra came close to GLM 5.2 on overall performance but did not surpass it.
The test was conducted on atomic.chat, a desktop application for running LLMs locally.

Key quotes

Sakana Fugu Ultra just beat the other models on visual polish in a live trading-desk coding test, got close to GLM 5.2, but at 17x the cost.

Fugu produced the richest interface, with multiple panels.