AI coding agents are now writing a big slice of real-world code, but they're also tripling debugging time, causing expensive incidents, and shipping some nasty security bugs when teams vibe-code straight into production. The infra stack around them is hardening into Docker/Proxmox plus Postgres/SQLite/Redis, while LLM performance is increasingly about KV-cache engineering and renting GPUs or cheap models in the cloud rather than buying more hardware.
The weakest points in the stack this week are AI-connected tools, agent frameworks, and API keys, not your core language or web framework.
Key Events
/OpenClaw became GitHub's most-starred project with 246k stars, overtaking React.
/Security reviews found over 2,000 vulnerabilities in OpenClaw and documented a new 'ClawJacked' attack path.
/Claude Code and other AI agents now author about 4% of public GitHub commits, with projections above 20% by 2026.
/Supabase was blocked by multiple ISPs in India following a government order, breaking access for hosted apps.
/Vercel suffered regional downtime impacting users in Dubai and the EU.
Report
AI agents are now in the critical path for real codebases, but the numbers show debugging overhead and security incidents climbing alongside usage. At the same time, the LLM infra and hosting stacks are consolidating into a few de facto patterns, and most of the new failure modes live in the AI layers, not your core language or database.
ai agents in your deploy pipeline
Anthropic reports that over 80% of its deployed code is written by Claude, and individual engineers describe 2026 workflows where they don't write any code manually, leaning entirely on tools like Cursor, Claude, and Codex.
Claude Code already accounts for about 4% of public GitHub commits, with surveys projecting this could exceed 20% by the end of 2026. Coding agents crossed a reliability threshold in December and are now running long, multi-step tasks, with some developers wiring up 13 Claude agents to ship software every day.
The downside is quantified: debugging AI-generated code takes roughly 3x longer than human-written code, and production incidents caused by AI-introduced bugs average about $40,000 each.
A vibe-coded app shipped with 16 vulnerabilities that exposed data from 18,000 users, and a scan of agent repos found 80% had at least one vulnerability, 38% of them critical.
ai tooling as an attack surface
GitHub's Copilot CLI has been observed downloading and executing malware, turning a convenience tool into a direct code-execution risk on developer machines.
OpenClaw, now the most-starred project on GitHub, comes with reports of over 2,000 known vulnerabilities and a new 'ClawJacked' web attack that lets sites hijack the agent.
A broader review of AI agent repositories found that 80% contain at least one vulnerability and highlighted missing human oversight as the most common design flaw. 41% of official MCP servers are running without authentication, even as France deploys a national MCP server hosting all government data and new CLIs let agents act over SSH on remote machines.
On the credential side, 2,863 Google API keys sitting in public webpages now silently authenticate to Gemini and expose previously safe APIs via the assistant, and separate work estimates that 86% of production LLM apps are currently exposed to prompt injection.
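One practical defense against the leaked-key problem is scanning pages and repos for key-shaped strings before they go public. A minimal sketch in Python, using the widely used secret-scanning heuristic that Google API keys start with "AIza" followed by 35 URL-safe characters (a community convention, not an official format guarantee):

```python
import re

# Heuristic: "AIza" + 35 URL-safe characters (39 chars total).
# This is a common secret-scanning rule, not an official Google spec.
GOOGLE_API_KEY_RE = re.compile(r"AIza[0-9A-Za-z_\-]{35}")

def find_google_api_keys(text: str) -> list[str]:
    """Return candidate Google API keys found in a blob of text."""
    return GOOGLE_API_KEY_RE.findall(text)

# A fabricated key embedded in page markup, for illustration only.
page = '<script>const cfg = {key: "AIza' + "A" * 35 + '"};</script>'
print(find_google_api_keys(page))
```

Wiring a check like this into CI on templated HTML and config files catches most accidental embeds before they hit a CDN.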
llm infra: kv cache, local vs cloud economics
Qwen3.5-35B-A3B hits about 74.7 tokens/s with a q8_0 KV cache on an RTX 5080, making it one of the faster large models for local inference. The same A3B variant can stretch context windows past 1M tokens on 32GB consumer GPUs, but people are running into slowdowns tied to frequent KV-cache clearing rather than raw compute limits.
Precision choices on the KV cache are directly changing correctness: fp8 KV produces corrupt outputs for Qwen3.5, while bf16 fixes the problem.
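The memory stakes behind that precision choice are easy to quantify: K and V each store one vector per layer per token, so halving bytes-per-element halves the cache. A back-of-envelope calculator, with illustrative layer/head numbers that are not the real Qwen3.5 config:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int) -> int:
    # K and V each hold n_layers * n_kv_heads * head_dim values per token,
    # hence the factor of 2.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Illustrative numbers only (not the actual Qwen3.5 architecture):
# 48 layers, 8 KV heads, head_dim 128, 128k-token context.
for name, nbytes in [("fp8", 1), ("bf16", 2)]:
    gib = kv_cache_bytes(48, 8, 128, 128_000, nbytes) / 2**30
    print(f"{name}: {gib:.1f} GiB")
```

With these assumed dimensions, moving from fp8 to bf16 costs roughly an extra 12 GiB at 128k context, which is exactly the trade people are weighing on 32GB consumer cards.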
Tooling like ContextCache shows 29x speedups by caching schema tokens for tool-calling LLMs, and KV-based communication between agents is saving roughly 73–78% of tokens in multi-agent setups.
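ContextCache's internals aren't detailed in the reports, but the core idea it names, avoiding re-tokenization of a large, static tool-schema prefix on every request, can be sketched. Everything below (the class name and the toy tokenizer) is hypothetical:

```python
import hashlib

class SchemaTokenCache:
    """Cache tokenized tool schemas so repeated tool-calling requests
    skip re-tokenizing the large, static schema prefix."""

    def __init__(self, tokenize):
        self._tokenize = tokenize   # any str -> list[int] tokenizer
        self._cache: dict[str, list[int]] = {}

    def tokens_for(self, schema: str) -> list[int]:
        key = hashlib.sha256(schema.encode()).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._tokenize(schema)
        return self._cache[key]

# Toy tokenizer (one "token" per word) that counts its invocations.
calls = []
def toy_tokenize(s: str) -> list[int]:
    calls.append(s)
    return [hash(w) % 1000 for w in s.split()]

cache = SchemaTokenCache(toy_tokenize)
schema = '{"name": "get_weather", "args": {"city": "string"}}'
cache.tokens_for(schema)
cache.tokens_for(schema)   # second lookup is a cache hit
print(len(calls))          # tokenizer ran only once
```

Real systems cache at the KV level rather than the token level, but the hashing-keyed reuse pattern is the same.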
At the hardware level, the GPU market is described as highly inflated with capable gen-AI PCs often costing over $2,000, while cloud offerings like Colab's RTX 6000 Pro at $0.87/hr and cheap models like Gemini 3.1 Flash-Lite at $0.25 per million tokens are pulling a lot of heavy experimentation back to the cloud.
infra stack: docker, k8s, and managed platforms
For small teams and homelabs, Docker and Docker Compose on bare metal or Proxmox remain the default: there’s a public catalog of 450+ self-hostable apps with Compose files, and people routinely run Nextcloud, media servers, and AI services as containers on Proxmox clusters.
Users keep emphasizing that one-process-per-container with Compose makes upgrades and rollbacks trivial compared to traditional installs, while also repeatedly calling out security worries around privilege escalation and secrets inside containers.
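The one-process-per-container pattern usually looks like a Compose file along these lines (service names, images, and volumes are illustrative, and secrets live in an env file rather than the Compose file itself):

```yaml
services:
  app:
    image: nextcloud:stable        # one process: the app server
    depends_on: [db, cache]
    env_file: .env                 # keep secrets out of version control
  db:
    image: postgres:17             # one process: the database
    volumes:
      - dbdata:/var/lib/postgresql/data
  cache:
    image: redis:7                 # one process: the cache
volumes:
  dbdata:
```

Because each service is its own image, upgrading or rolling back one component is a tag change plus `docker compose up -d`, which is the simplicity people keep citing.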
Full Kubernetes is mostly showing up where there’s real multi-node scale or AI workloads, and even there folks are fighting etcd pain at scale, CRD lifecycle breakage, and FluxCD taking too long to notice new images.
On the hosted side, Supabase now powers 55% of startups in a recent YC batch but has been blocked by ISPs in India under a government order, and Vercel had regional downtime in Dubai and the EU even as it launches an AI Agent marketplace and tight Claude Code integrations.
Underneath, the data layer is coalescing into PostgreSQL for SaaS backends, SQLite for local and agent memory, and Redis for ephemeral context and JWT blacklists, with people explicitly calling out SQLite’s multi-user limits and Redis-based blacklists turning into hot-path bottlenecks.
languages, runtimes, and wasm
Teams documenting migrations from PHP or Python/React to Elixir and Phoenix report roughly 35% reductions in operational costs, plus simpler backends built on a surprisingly broad ecosystem of Elixir tooling and tutorials.
Ruby on Rails is still being chosen over React-only stacks for some web apps, with developers pointing at better performance for their workloads and a preference for straightforward CRUD over JS-heavy frontends.
In the systems layer, Rust and Go are increasingly used for infra and agent services, visible in Rust-based Frankensqlite with concurrent writers, a 1.4 GB/s Rust FITS image processor, and a Go community explicitly positioning the language as a top choice for AI agents.
JavaScript and TypeScript remain unavoidable for frontends and much AI tooling, but people are openly venting about design flaws, complex async semantics, and type-system overkill even as multi-year migrations from JS to TS grind on.
WebAssembly is being treated as a precision tool rather than a full runtime, with a WASM vector DB running 5x faster than JS on one side and a JVM-in-QEMU-in-WASM that takes 55 seconds just to print 'Hello World' on the other.
What This Means
AI and agents are now deeply intertwined with both application code and infra, and most of the interesting wins and failures this period come from that layer rather than from traditional language or database choices. The practical stack is narrowing around Docker/Proxmox, Postgres/SQLite/Redis, and specialized LLM infra, while the most brittle links are increasingly AI-powered tools, keys, and orchestration frameworks instead of the core app.
On Watch
/Ghostty is gaining traction as a fast, agent-friendly terminal (backing tools like cmux) but still ships with SSH glitches, missing scrollback search, and reports of slowness and bugs, so its stability as a primary dev terminal is still in flux.
/Zed is emerging as a low-RAM, high-speed editor competing with VS Code, but age-gating of AI features, licensing concerns, and forks like Gram that strip AI hint at potential ecosystem fragmentation.
/Client-isolation tooling is getting more serious—SIMPLE-ICS can emulate multi-stage APT campaigns and new local-first Linux microVMs provide disposable sandboxes—while CTF-style competitions are being used to benchmark AI systems on security tasks.
Interesting
/LightMem, accepted to ICLR 2026, offers over 10× gains in long-context reasoning for LLM agents at significantly lower costs.
/AgentChatBus enables multiple AI agents to communicate persistently, allowing them to collaboratively discover bugs that humans might miss.
/Cloudflare rewrote Next.js in just one week with a single developer, using $1,100 in tokens.
/The CLI tool preflyt-check (run via npx) flags security mistakes in deployments, including open Redis ports, before they ship.
/Invisible characters in text can manipulate AI agents into following hidden instructions, as demonstrated in tests across multiple models.
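A basic defense against the invisible-character trick is scanning agent inputs for format-control code points before they reach the model. A minimal sketch (the helper name is hypothetical; it looks for Unicode category "Cf" characters, which covers zero-width and tag characters):

```python
import unicodedata

def find_invisible(text: str) -> list[tuple[int, str]]:
    """Return (index, code point) pairs for format-control characters
    commonly abused for hidden prompt-injection payloads: Unicode
    category "Cf", including zero-width spaces and tag characters."""
    hits = []
    for i, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf":
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits

clean = "Summarize this document."
poisoned = "Summarize this\u200b document.\U000E0041"  # ZWSP + tag 'A'
print(find_invisible(clean))     # []
print(find_invisible(poisoned))
```

Rejecting or stripping these characters at the input boundary is cheap insurance for any agent that ingests untrusted web or document text.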
We processed 10,000+ comments and posts to generate this report.
AI-generated content. Verify critical information independently.