Edge | AI & Engineering
LIVE · 22:07 · MONDAY · JUNE 29, 2026 · VOL. I

Today in AI engineering.

A reading room of curated AI summaries. The signal, distilled. One short brief when something good lands; the rest waits here for you.

Today: 0 summaries
This week: 67 summaries
Sources: 120 curated
Archive: 2,166 since launch
№ 01 / 03

Today's reading — editor's picks

№ 01 / 03 · MLOPS & INFRASTRUCTURE
arXiv cs.AI

Scaling Item Knowledge with JD's Oxygen AIIC Platform

JD.com's Oxygen AIIC uses a hybrid LLM/VLM architecture to automate item-knowledge production at scale, achieving 94.2% precision and 82.8% recall across tens of billions of SKUs.

№ 02 / 03 · AGENTS & ORCHESTRATION
arXiv cs.AI

Agent-Native Immune System (ANIS): Architecture for Runtime Defense

The Agent-Native Immune System (ANIS) shifts AI security from static training-time alignment to dynamic, runtime defense, using a six-layer 'Immune Tower' to protect autonomous agents against memory poisoning and tool-chain manipulation.

№ 03 / 03 · AGENTS & ORCHESTRATION
arXiv cs.AI

ATOD: Hybrid Distillation for Autonomous Agent Training

ATOD combines on-policy distillation with reinforcement learning using an annealed schedule and turn-level reweighting to train small agent models that outperform their larger teacher models.

№ 02 / 03

The stream — chronological

0 today · 67 this week
DAY 01 · Today · JUN 29 · 2026 · 41 SUMMARIES
arXiv cs.AI · MLOps & Infrastructure

Scaling Item Knowledge with JD's Oxygen AIIC Platform

JD.com's Oxygen AIIC uses a hybrid LLM/VLM architecture to automate item-knowledge production at scale, achieving 94.2% precision and 82.8% recall across tens of billions of SKUs.

arXiv cs.AI · Agents & Orchestration

Agent-Native Immune System (ANIS): Architecture for Runtime Defense

The Agent-Native Immune System (ANIS) shifts AI security from static training-time alignment to dynamic, runtime defense, using a six-layer 'Immune Tower' to protect autonomous agents against memory poisoning and tool-chain manipulation.

arXiv cs.AI · Agents & Orchestration

ATOD: Hybrid Distillation for Autonomous Agent Training

ATOD combines on-policy distillation with reinforcement learning using an annealed schedule and turn-level reweighting to train small agent models that outperform their larger teacher models.

arXiv cs.AI · Agents & Orchestration

Reducing LLM Agent Hallucinations with Grounded Iterative Planning

Grounded Iterative Language Planning (GILP) combines LLM-based reasoning with a small, trained transition-predictor backbone to catch and correct hallucinated state changes, significantly improving planning reliability.

arXiv cs.AI · Agents & Orchestration

Odyssey: A Categorical Framework for Verifiable Foundation Models

Odyssey uses categorical sheaf theory to compose modular 'foundries'—verifiable, truth-preserving architectural components—that allow for structured, queryable, and auditable LLM-based systems.

arXiv cs.AI · RAG & Retrieval

DysLexLens: Analyzing Dyslexic AI User Experiences via LLMs

DysLexLens is an end-to-end framework that extracts, structures, and validates insights from noisy online forum data to understand how dyslexic learners interact with AI tools.

arXiv cs.AI · Agents & Orchestration

ToE: Hierarchical Claim Verification Against Adversarial Misinformation

Tree of Evidence (ToE) is a fact-checking framework that uses a reinforcement learning-driven agent to decompose claims into hierarchical argument trees, significantly improving verification accuracy against adversarially poisoned inputs.

arXiv cs.AI · Agents & Orchestration

Improving Long-Horizon LLM Planning via Symbolic Feedback

This framework enhances LLM planning reliability by using a symbolic verifier to identify errors and provide corrective, interpretable instructions for iterative self-refinement.
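
The verify-then-refine loop described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the toy door-world rule, the `verify` and `refine` functions, and the `stub_planner` standing in for an LLM call are all assumptions made for the example.

```python
from typing import Callable, List, Optional

# Hypothetical toy rule: "open_door" must come before "walk_through".

def verify(plan: List[str]) -> Optional[str]:
    """Symbolic check: return None if the plan is valid, else a corrective instruction."""
    door_open = False
    for step, action in enumerate(plan):
        if action == "open_door":
            door_open = True
        elif action == "walk_through" and not door_open:
            return f"step {step}: insert 'open_door' before 'walk_through'"
    return None


def refine(propose: Callable[[List[str], Optional[str]], List[str]],
           max_rounds: int = 3) -> List[str]:
    """Ask the planner for a plan, then feed verifier messages back until it passes."""
    plan = propose([], None)
    for _ in range(max_rounds):
        feedback = verify(plan)
        if feedback is None:
            return plan
        plan = propose(plan, feedback)
    if verify(plan) is None:
        return plan
    raise RuntimeError("no valid plan within the round budget")


def stub_planner(previous: List[str], feedback: Optional[str]) -> List[str]:
    """Stand-in for an LLM call (assumption): it patches the plan when given feedback."""
    if feedback is None:
        return ["walk_through"]
    return ["open_door"] + previous


final_plan = refine(stub_planner)
print(final_plan)
```

The point of the design is that the verifier's message is interpretable text, so the same string can be shown to a human or appended to the next prompt.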

arXiv cs.AI · Agents & Orchestration

AI-ModelNet: A Networked Architecture for Collaborative AI

AI-ModelNet proposes a hierarchical, Internet-inspired architecture to enable interconnection and collaborative reasoning among heterogeneous, domain-specific models, addressing the fragmentation of the current AI landscape.

arXiv cs.AI · Agents & Orchestration

Personality Prompting in Multi-Agent Teams: Task-Dependent Impact

Personality manipulation in LLM agents significantly alters communication style but degrades task performance only in open-ended or collaborative domains; it remains largely neutral in structured coding tasks.

arXiv cs.AI · Agents & Orchestration

Internalizing Future-Aware Planning in LLM Agents

To move LLM agents beyond reactive behavior, this paper introduces a three-stage training paradigm that enables agents to perform grounded 'what-if' simulations and success estimation.

The Pragmatic Engineer (Gergely Orosz) · Coding Agents & Dev Productivity

The Shift in Software Engineering: AI Agents and Production Risk

AI agents have fundamentally transformed software development in six months, enabling massive increases in code output. However, this shift risks quality and security when organizations prioritize AI adoption over core engineering rigor, as evidenced by recent high-profile outages.

Ahead of AI (Sebastian Raschka) · Agents & Orchestration

Building and Auditing Local Coding Agents

A practical guide to setting up a local coding agent stack using Ollama and open-weight models, emphasizing performance benchmarking, secure auditing of agent harnesses, and the trade-offs of running local vs. proprietary infrastructure.

Interconnects (Nathan Lambert) · Models & Frontier Labs

The Diversification of the Open Model Ecosystem

The open model landscape is shifting from a few dominant players to a diverse ecosystem of niche, product-focused, and sovereign AI developers, signaling a move toward a long-tail of specialized models.

Interconnects (Nathan Lambert) · Models & Frontier Labs

GLM-5.2: A New Benchmark for Open-Weight Agentic Coding

GLM-5.2 marks a pivotal shift in the open-weight landscape, offering the first credible, high-performance alternative to frontier closed models like Claude Opus for complex agentic coding tasks.

Latent Space (Newsletter) · Agents & Orchestration

Claude Tag: Moving AI from Chat to Team-Based Delegation

Claude Tag shifts LLM interaction from synchronous chat to asynchronous, team-wide delegation within Slack, positioning Claude as a persistent, proactive coworker rather than a standalone tool.

Latent Space (Newsletter) · Inference & Serving

SpaceX's Neocloud and the Rise of Owned Intelligence

SpaceX is emerging as a massive compute provider with $28B/year in annualized GPU rental deals, while developers increasingly prioritize 'owned intelligence' via open-weight models like GLM-5.2 to gain control over their AI stacks.

Latent Space (Newsletter) · Models & Frontier Labs

OpenAI's GPT-5.6 Launch: Frontier Models as Managed Assets

OpenAI released the GPT-5.6 family (Sol, Terra, Luna) as a restricted, government-mediated preview, signaling a shift where release governance is now a core component of the model specification.

Latent Space (Newsletter) · Agents & Orchestration

The Rise of Meta-Harnesses and Vertical AI Integration

The AI industry is shifting toward 'meta-harnesses'—standardized agent orchestration layers—while frontier labs move toward vertical integration of custom silicon and agent-native UX.

Latent Space (Newsletter) · Agents & Orchestration

Internal AI Adoption & The Rise of Agentic Workflows

OpenAI reports massive internal token growth across all departments, signaling that agentic workflows—supported by review loops and persistent infrastructure—are moving from experimental to core production patterns.

Simon Willison's Weblog · Inference & Serving

Porting PyTorch Models to the Browser with Claude Code

By leveraging Claude Code to convert PyTorch models to ONNX, developers can run sophisticated AI features like image inpainting directly in the browser using WebGPU and the CacheStorage API.

Claude Code Changelog · Frameworks & Tooling

Claude Code Changelog: Production Reliability & Agentic Control

Recent updates to Claude Code focus on hardening production workflows, improving agentic reliability through stricter permissioning and background task management, and enhancing the developer experience in terminal-based environments.

Claude Code Changelog · Frameworks & Tooling

Claude Code Changelog: Production Reliability and Agentic Control

Recent updates to Claude Code focus on hardening agentic workflows through improved background task management, granular permission controls, enhanced MCP reliability, and significant performance optimizations for terminal-based AI development.

Claude Code Changelog · Frameworks & Tooling

Claude Code Changelog: Production Reliability & Agentic Control

Recent updates to Claude Code focus on hardening agentic workflows, improving background task management, and refining safety controls for autonomous shell and MCP operations.

Claude Code Changelog · Frameworks & Tooling

Claude Code Changelog: System Reliability and Agentic UX

Recent updates to Claude Code focus on hardening background agent reliability, improving TUI responsiveness, and refining safety controls for autonomous operations.

Claude Code Changelog · Frameworks & Tooling

Claude Code Changelog: Production Reliability and Agentic Control

Recent updates to Claude Code focus on hardening background agent reliability, refining safety controls for auto-mode, and optimizing terminal performance for professional engineering workflows.

Together AI Blog · Inference & Serving

ParallelKernelBench: Frontier LLMs Struggle with Multi-GPU Kernels

While LLMs excel at single-GPU kernel generation, they currently struggle with multi-GPU tasks where communication bottlenecks and complex rank coordination dominate performance.

Hugging Face Blog · Inference & Serving

Deploying vLLM Endpoints on Hugging Face Jobs

Hugging Face Jobs allows engineers to spin up private, OpenAI-compatible vLLM endpoints on demand using a single command, providing a pay-per-second alternative for testing and experimentation.

Anthropic News · Agents & Orchestration

Claude Tag: Collaborative Agentic Workflows in Slack

Claude Tag integrates Claude into Slack as a persistent, multiplayer agent capable of autonomous task execution, cross-channel context awareness, and proactive collaboration.

Import AI (Jack Clark) · Agents & Orchestration

Agentic Robotics, Large-Scale Infra, and Future Uncertainty

Recent developments in agentic robot self-improvement, large-scale GPU cluster telemetry, and legal data infrastructure highlight the rapid maturation of AI systems, even as experts debate the long-term implications for human autonomy.

TechCrunch — AI · MLOps & Infrastructure

Real-Time Fluid Monitoring for Data Center Cooling Efficiency

Omen AI is deploying miniaturized spectrometers to monitor coolant chemistry in real-time, preventing bacterial outbreaks and hardware wear that cause costly data center downtime.

IBM Technology · Coding Agents & Dev Productivity

Optimizing Software Workflows with AI Code Review

AI code review accelerates development by automating static and dynamic analysis, but it requires human oversight to manage context, mitigate false positives, and ensure architectural alignment.

OpenAI News · Evals & Reliability

Building Interoperable Standards for Advanced AI Systems

OpenAI is co-founding the Appia Foundation to translate high-level AI safety frameworks into modular, open technical specifications that enable consistent, third-party evaluation across the global AI supply chain.

AI Engineer · Inference & Serving

Prototype Big, Deploy Small: A Framework for Local LLM Adoption

Stop overpaying for frontier models. By using a 'prototype big, deploy small' framework and rigorous capability evals, you can identify 'Sage' (Small and Good Enough) models that provide production-grade performance on-device, saving costs and improving latency.
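
One way to read the selection step is: evaluate every candidate on the same capability evals, then pick the smallest model whose score stays within a tolerance of the large prototype. The sketch below assumes that reading; the model labels, parameter counts, scores, and the `tolerance` knob are illustrative placeholders, not figures from the talk.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str          # hypothetical model label
    params_b: float    # parameter count, in billions
    eval_score: float  # fraction of capability evals passed, 0..1


def smallest_good_enough(candidates: List[Candidate],
                         baseline_score: float,
                         tolerance: float = 0.05) -> Optional[Candidate]:
    """Return the smallest candidate scoring within `tolerance` of the baseline."""
    passing = [c for c in candidates if c.eval_score >= baseline_score - tolerance]
    return min(passing, key=lambda c: c.params_b) if passing else None


# Illustrative numbers only.
candidates = [
    Candidate("large-prototype", 400.0, 0.93),
    Candidate("medium", 30.0, 0.90),
    Candidate("small", 4.0, 0.89),
    Candidate("tiny", 1.0, 0.71),
]

choice = smallest_good_enough(candidates, baseline_score=0.93)
print(choice.name if choice else "no candidate is good enough")
```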

AI Engineer · Agents & Orchestration

The Future of AI: Shifting from Monolithic Agents to Composition

Justin Schroeder argues that the future of AI lies in 'domain-specific agents'—small, specialized, composable units—rather than monolithic agents, to solve the reliability, cost, and complexity issues inherent in current agentic architectures.

AI Engineer · AI News & Trends

Moving Upstream: Why Product Strategy Beats Prompting

As AI makes coding cheap, the bottleneck has shifted to product discovery. Success now depends on human-centric techniques like story mapping and value-based requirements to ensure you build what is actually worth building.

AI Engineer · MLOps & Infrastructure

Building Deterministic Infrastructure for Autonomous AI Agents

Reliability in agentic systems is an infrastructure challenge, not a model one. To scale agents, you must build a 'control plane' that separates model reasoning from production execution via validation, policy enforcement, and circuit breakers.
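
A control plane of this kind might look like the sketch below: the model only proposes an action, and a deterministic layer validates the proposal, checks it against policy, and trips a circuit breaker after repeated failures. The `ControlPlane` class, its field names, and the proposal shape (`action`, `target`) are assumptions for illustration, not an API from the talk.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class ControlPlane:
    allowed_actions: Set[str]   # policy: actions the agent may run
    max_failures: int = 3       # consecutive failures before the breaker opens
    failures: int = 0

    def execute(self, proposal: Dict[str, object],
                handler: Callable[[Dict[str, object]], str]) -> str:
        # Circuit breaker: stop executing once too many runs have failed in a row.
        if self.failures >= self.max_failures:
            return "rejected: circuit open"
        # Validation: the proposal must have the expected shape.
        if not isinstance(proposal.get("action"), str) or "target" not in proposal:
            return "rejected: malformed proposal"
        # Policy enforcement: only allow-listed actions reach production.
        if proposal["action"] not in self.allowed_actions:
            return "rejected: policy"
        try:
            result = handler(proposal)
        except Exception:
            self.failures += 1
            return "failed"
        self.failures = 0
        return result


def flaky_handler(proposal: Dict[str, object]) -> str:
    """Stand-in for a production call that is currently failing."""
    raise RuntimeError("downstream unavailable")


plane = ControlPlane(allowed_actions={"restart_service"}, max_failures=2)
request = {"action": "restart_service", "target": "api"}
outcomes = [plane.execute(request, flaky_handler) for _ in range(3)]
print(outcomes)
```

Rejections for shape or policy deliberately do not count toward the breaker here; whether they should is a design choice this sketch leaves open.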

AI Engineer · Agents & Orchestration

The Agentic AI Engineer: Scaling Agent Development via Loops

To scale agent development, teams must move from manual iteration to an 'Agentic AI Engineer' model: a multi-agent system that automates the entire lifecycle of spec, build, eval, diagnose, and optimize.

AI Engineer · Agents & Orchestration

The Prompt as a Platform: Agentic Engineering for Distributed Systems

Dominik Tornow argues that software engineering is shifting from general-purpose implementations to bespoke systems synthesized by agents from abstract specifications, using deterministic simulation as the critical feedback loop for design.

AI Engineer · Agents & Orchestration

RL-Guided ETL Pipeline Remediation: Architecture and Evals

Automate ETL failure recovery using a deterministic anomaly detection layer, a Q-learning policy for action selection, and a hard-coded safety guardrail to ensure operational reliability.
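
The three layers can be sketched with tabular Q-learning in a few dozen lines. Everything concrete here is an assumption for illustration: the action names, the row-count rule in `detect_anomaly`, the toy reward, and the hyperparameters. The only structural claim is that the guardrail sits outside the learned policy, so no Q-value can ever select a forbidden action.

```python
import random
from collections import defaultdict

ACTIONS = ["retry", "skip_batch", "drop_table"]  # hypothetical remediation actions
FORBIDDEN = {"drop_table"}                       # hard-coded guardrail, never learned


def detect_anomaly(row_count: int, expected: int) -> str:
    """Deterministic rule that maps row counts to a failure state."""
    if row_count == 0:
        return "empty_load"
    if row_count < 0.5 * expected:
        return "partial_load"
    return "healthy"


class Remediator:
    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # Q-table keyed by (state, action)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def safe_actions(self):
        return [a for a in ACTIONS if a not in FORBIDDEN]

    def choose(self, state: str) -> str:
        """Epsilon-greedy selection over guardrail-approved actions only."""
        actions = self.safe_actions()
        if self.rng.random() < self.epsilon:
            return self.rng.choice(actions)
        return max(actions, key=lambda a: self.q[(state, a)])

    def update(self, state: str, action: str, reward: float, next_state: str) -> None:
        """Standard Q-learning update toward reward plus discounted best next value."""
        best_next = max(self.q[(next_state, a)] for a in self.safe_actions())
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Toy environment (assumption): retrying a partial load fixes it, skipping does not.
agent = Remediator()
for _ in range(200):
    state = detect_anomaly(row_count=40, expected=100)
    action = agent.choose(state)
    reward, next_state = (1.0, "healthy") if action == "retry" else (-1.0, "partial_load")
    agent.update(state, action, reward, next_state)
```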

AI Engineer · Evals & Reliability

Debugging Production AI Agents via Record and Replay

Stop chasing bitwise determinism in LLMs. Instead, implement a record-and-replay architecture to capture agent state transitions, enabling deterministic debugging and regression testing of non-deterministic production failures.
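
A minimal sketch of the record-and-replay idea, assuming the only source of non-determinism is the model call: wrap the model in a recorder in production, persist the log, and substitute a replayer when debugging. The `Recorder`, `Replayer`, `run_agent`, and `flaky_model` names are hypothetical, and the two-step agent is a stand-in for a real one.

```python
import json
import random
from typing import Callable, Dict, List


class Recorder:
    """Wraps a model callable and logs every prompt/output pair."""

    def __init__(self, model: Callable[[str], str]):
        self.model = model
        self.log: List[Dict[str, str]] = []

    def __call__(self, prompt: str) -> str:
        output = self.model(prompt)
        self.log.append({"prompt": prompt, "output": output})
        return output


class Replayer:
    """Serves recorded outputs in order and flags any divergence in prompts."""

    def __init__(self, log: List[Dict[str, str]]):
        self.log = list(log)
        self.cursor = 0

    def __call__(self, prompt: str) -> str:
        entry = self.log[self.cursor]
        if entry["prompt"] != prompt:
            raise AssertionError(f"divergence at call {self.cursor}")
        self.cursor += 1
        return entry["output"]


def run_agent(model: Callable[[str], str], task: str) -> List[str]:
    """Toy two-step agent: plan, then execute the plan."""
    plan = model(f"plan: {task}")
    result = model(f"execute: {plan}")
    return [plan, result]


def flaky_model(prompt: str) -> str:
    """Stand-in for a non-deterministic LLM call."""
    return f"{prompt} -> option {random.randint(0, 9999)}"


recorder = Recorder(flaky_model)
live = run_agent(recorder, "fix failing test")
saved = json.dumps(recorder.log)  # persist the trace alongside the incident

replayed = run_agent(Replayer(json.loads(saved)), "fix failing test")
print(replayed == live)
```

The divergence check is what makes the log usable as a regression test: if a code change alters the prompts the agent sends, replay fails at the first differing call.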

DAY 02 · Yesterday · JUN 28 · 2026 · 10 SUMMARIES
Machine Learning Street Talk · Inference & Serving

Thermodynamic Computing and the Future of AI-Driven Chip Design

Thomas Ahle of Normal Computing discusses using AI agents to automate chip design, the risks of 'understanding debt' in agentic code, and the development of thermodynamic chips that use physical noise to perform stochastic computations.

AI Engineer · Agents & Orchestration

Building Low-Latency Voice-In, Visuals-Out AI Agents

To achieve a seamless AI UX, shift from voice-in/voice-out to voice-in/visuals-out. This leverages the human brain's visual processing capacity and allows a more forgiving 1-second latency budget, compared with the strict 200 ms required for fluid speech.

AI Engineer · RAG & Retrieval

Cross-Document AI for Predictive Financial Compliance

Moving from document-level validation to cross-document graph correlation and probabilistic risk modeling reduces false positives by 76% and enables proactive fraud detection.

TechCrunch — AI · MLOps & Infrastructure

Why Ford Reintegrated Human Expertise After AI Quality Failures

Ford rehired 350 veteran engineers to address quality issues caused by over-reliance on automated AI systems, resulting in significant cost savings and improved quality rankings.

Google Cloud Tech · Agents & Orchestration

Building Full-Stack Apps with AI Sub-Agents

Google Antigravity uses voice-prompted sub-agents to orchestrate complex full-stack development, leveraging specialized guidance and MCP tools to build, test, and deploy multilingual applications.

OpenAI News · Agents & Orchestration

The Shift from Chatbots to Agentic Workflows

OpenAI's internal data shows a transition from short-horizon chatbot interactions to long-horizon agentic tasks, with non-technical departments adopting agents faster than engineers to perform cross-functional work.

OpenAI News · AI News & Trends

Omio's Shift to AI-Native Travel and Operations

Omio transformed its travel booking platform and internal development workflows by integrating OpenAI models, resulting in a 5x increase in development speed and a shift toward conversational commerce.

OpenAI News · Inference & Serving

OpenAI and Broadcom Unveil Jalapeño Inference Chip

OpenAI and Broadcom have developed 'Jalapeño,' a custom ASIC designed specifically for LLM inference, aiming to improve performance-per-watt and reduce latency through hardware-software co-design.

IBM Technology · Evals & Reliability

The Promptware Kill Chain: Understanding AI Malware

Promptware exploits the lack of separation between instructions and data in LLMs to execute a multi-stage attack, requiring a zero-trust approach where AI agents are treated as hostile runtimes.

Jason Liu (jxnl.co) · Agents & Orchestration

Scheduled Work: Task vs. Message Architectures

Distinguish between scheduled tasks (fresh threads) and scheduled messages (persistent threads) by asking if the job requires the context of previous runs.
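
The distinction can be made concrete with a small sketch. The `Scheduler` class, the `needs_history` flag, and the job names are assumptions for illustration; the only idea taken from the summary is that a scheduled message appends to a persistent thread while a scheduled task starts from a fresh one.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Scheduler:
    # Persistent threads, keyed by job id, used only by scheduled messages.
    threads: Dict[str, List[str]] = field(default_factory=dict)

    def run(self, job_id: str, prompt: str, needs_history: bool) -> List[str]:
        """Return the context the agent would see for this run."""
        if needs_history:
            # Scheduled message: append to the job's existing thread.
            thread = self.threads.setdefault(job_id, [])
            thread.append(prompt)
            return list(thread)
        # Scheduled task: fresh thread, no memory of previous runs.
        return [prompt]


scheduler = Scheduler()

# A weekly report compares against last week, so it needs earlier runs.
scheduler.run("weekly-report", "Summarize week 1", needs_history=True)
report_context = scheduler.run("weekly-report", "Summarize week 2", needs_history=True)

# A nightly lint run is independent each time.
scheduler.run("nightly-lint", "Lint the repo", needs_history=False)
lint_context = scheduler.run("nightly-lint", "Lint the repo", needs_history=False)

print(len(report_context), len(lint_context))
```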