today · 2026-06-21

2 links from the engineering internet.

llama.cpp adds real-time model load progress over /models/sse

the inference engine's server now streams model load progress in real time through a /models/sse server-sent events endpoint, so clients can show load status instead of waiting blind on a slow model spin-up.

github.com
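
a minimal sketch of how a client might consume a server-sent events stream like the one described above. only the /models/sse path comes from the item; the host and port in the usage note, the helper names `parse_sse` and `stream_events`, and the shape of each event's payload are assumptions, so the parser hands back the raw `data` string rather than guessing at fields.

```python
import urllib.request
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends one event; dispatch it only if it carried data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            # Lines starting with a colon are comments (often keep-alives).
            continue
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


def stream_events(url: str) -> Iterator[Tuple[str, str]]:
    """Open an SSE endpoint and yield parsed events as they arrive.

    The URL is supplied by the caller; nothing here is specific to llama.cpp.
    """
    request = urllib.request.Request(url, headers={"Accept": "text/event-stream"})
    with urllib.request.urlopen(request) as response:
        yield from parse_sse(chunk.decode("utf-8") for chunk in response)
```

against a locally running server, usage could look like `for name, data in stream_events("http://localhost:8080/models/sse"): print(name, data)`, where the address is a placeholder and the printed payload is whatever the server actually sends.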

opencode v1.17.9 fixes agent step limits and adds glm-5.2 thinking modes

the terminal coding agent's patch forces a final text response when a run hits its configured agent step limit instead of failing mid-run, fixes devstral model detection across provider id casing, and adds high and max thinking variants for glm-5.2.
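
the step-limit behavior can be pictured with a small loop. this is an illustrative sketch only, not opencode's implementation: `run_agent`, the `model` callable, and its `(kind, payload)` return shape are all hypothetical, and whether the final text request counts toward the limit is an assumption made here.

```python
from typing import Callable, List, Tuple

# Hypothetical model interface: given the history and whether tools are
# allowed, return ("tool", call_description) or ("text", final_answer).
Model = Callable[[List[str], bool], Tuple[str, str]]


def run_agent(model: Model, max_steps: int) -> str:
    """Run tool-using steps up to max_steps, then force a text-only reply."""
    history: List[str] = []
    for _ in range(max_steps):
        kind, payload = model(history, True)
        if kind == "text":
            # The model finished on its own before reaching the limit.
            return payload
        # Record the tool call so later steps can see it.
        history.append(payload)

    # Limit reached: ask once more with tools disabled so the run ends
    # with a text response instead of failing mid-run.
    kind, payload = model(history, False)
    if kind != "text":
        return "step limit reached without a final answer"
    return payload
```

the design choice worth noting is that the limit changes what the model is allowed to do on the last call rather than aborting the run, so the user still gets a summary of the work done so far.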
