jun 21, 2026
2 links from the engineering internet.
github.com
ai
llama.cpp adds real-time model load progress over /models/sse
the inference engine's server now streams model load progress in real time through a /models/sse server-sent events endpoint, so clients can show load status instead of waiting blind on a slow model spin-up.
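for a rough idea of what consuming such a stream involves, here is a minimal sketch of a generic server-sent events reader using only the python standard library. it follows the standard text/event-stream framing (events separated by blank lines, `event:` and `data:` fields, `:` comment lines). the event names and payload shape that llama.cpp actually sends are not stated in the item above, so the code treats the payload as opaque text, and the `watch` helper and any url passed to it are illustrative assumptions.

```python
import urllib.request
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per server-sent event from an iterable of text lines."""
    event, data = "message", []  # "message" is the default SSE event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; events without data are dropped.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is stripped
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


def watch(url: str) -> None:
    """Print events from an SSE endpoint. The url is supplied by the caller;
    the payload is printed as-is because its format is not assumed here."""
    with urllib.request.urlopen(url) as resp:
        for ev in parse_sse(chunk.decode("utf-8") for chunk in resp):
            print(ev["event"], ev["data"])
```

calling `watch` with the server's base address plus `/models/sse` would print each event as it arrives. a real client would likely decode the data field once the payload format is confirmed against the project's documentation.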
github.com
ai
opencode v1.17.9 fixes agent step limits and adds glm-5.2 thinking modes
the terminal coding agent's patch forces a final text response when a run hits its configured agent step limit instead of failing mid-run, fixes devstral model detection across provider id casing, and adds high and max thinking variants for glm-5.2.