Agentifact Guides
Guides & Analysis
Architecture patterns, protocol comparisons, and production strategies for builders working with autonomous agents. Written with opinions, not press releases.
Robinhood now lets your AI agents trade stocks
58 points, 95 comments on HN
CAPTCHAs can still detect AI agents
52 points, 31 comments on HN
StuMason/coolify-mcp
397 stars, TypeScript — created 2025-03-05
briangaoo/whoop-mcp
14 stars, TypeScript — created 2026-05-26
cameronrye/activitypub-mcp
16 stars, TypeScript — created 2025-09-20
Simon Willison released 0.25.1 of simonw/llm-anthropic
Simon Willison (LLM tooling)
vinkius-labs/mcpfusion
251 stars, TypeScript — created 2026-02-12
How Endava builds an agentic organization with Codex
From openai-blog
DeusData/codebase-memory-mcp
2773 stars, C — created 2026-02-24
Cisco and OpenAI redefine enterprise engineering with Codex
From openai-blog
OpenAI’s Frontier Governance Framework
From openai-blog
dagucloud/dagu
3431 stars, Go — created 2022-04-22
makenotion/notion-mcp-server
4366 stars, TypeScript — created 2025-03-10
hi-godot/godot-ai
387 stars, GDScript — created 2026-04-12
samvallad33/vestige
540 stars, Rust — created 2026-01-25
Building self-improving tax agents with Codex
From openai-blog
Election information and safeguards in 2026
From openai-blog
Agent Memory
36 points, 17 comments on HN
jamubc/gemini-mcp-tool
2225 stars, TypeScript — created 2025-06-29
dearlordylord/huly-mcp
27 stars, TypeScript — created 2026-02-02
ruslanlap/pagespeed-insights-mcp
25 stars, TypeScript — created 2025-08-28
GitHub Actions down again today
48 points, 6 comments on HN
Alaska's oil revival sparks a new energy rush into the Arctic
27 points, 17 comments on HN
dagucloud/dagu
3424 stars, Go — created 2022-04-22
Arenukvern/mcp_flutter
299 stars, Dart — created 2025-03-17
karanb192/reddit-mcp-buddy
684 stars, TypeScript — created 2025-09-14
erikunha/portfolio
12 stars, TypeScript — created 2023-05-23
Simon Willison released 0.1a4 of datasette/datasette-agent
Simon Willison (LLM tooling)
geekjourneyx/md2wechat-skill
2315 stars, Go — created 2026-01-11
modelscope/FunASR
16239 stars, Python — created 2022-11-24
Yohei Nakajima released v0.1-paper-longmemeval-s of yoheinakajima/activegraph-longmemeval
Yohei Nakajima (Autonomous agents)
MCPJam/inspector
1966 stars, TypeScript — created 2025-05-23
Constraint Decay
38 points, 18 comments on HN
Launch HN
22 points, 29 comments on HN
dagucloud/dagu
3420 stars, Go — created 2022-04-22
macOS26/Agent
442 stars, Swift — created 2026-03-11
open-multi-agent/open-multi-agent
6233 stars, TypeScript — created 2026-03-31
punitarani/fli
2640 stars, Python — created 2025-01-06
A case against Boolean logic
49 points, 67 comments on HN
Open source Kanban desktop app that runs parallel agents on every card
61 points, 34 comments on HN
dagucloud/dagu
3418 stars, Go — created 2022-04-22
ridafkih/keeper.sh
1081 stars, TypeScript — created 2025-12-23
bgauryy/octocode
845 stars, TypeScript — created 2025-06-05
chrisryugj/kordoc
959 stars, TypeScript — created 2026-03-28
wshobson/agents
35794 stars, Python — created 2025-07-24
AliKarami/MikroMCP
15 stars, TypeScript — created 2026-04-21
openlegion-ai/openlegion
95 stars, Python — created 2026-02-19
dadbodgeoff/drift
781 stars, TypeScript — created 2026-01-19
How Ramp engineers accelerate code review with Codex
From openai-blog
AdventHealth advances whole-person care with OpenAI
From openai-blog
Amazon, Facebook, FBI have access to a private intelligence-sharing network
49 points, 7 comments on HN
mihaelamj/cupertino
756 stars, Swift — created 2025-11-14
MCDxAI/minecraft-dev-mcp
18 stars, TypeScript — created 2025-12-06
SocketDev/socket-mcp
109 stars, TypeScript — created 2025-05-19
Meta blocks human rights accounts from reaching audiences in Arabia and the UAE
235 points, 74 comments on HN
Introducing OpenAI for Singapore
From openai-blog
The next phase of OpenAI’s Education for Countries
From openai-blog
hmmhmmhm/daiso-mcp
297 stars, TypeScript — created 2026-02-28
spences10/mcp-omnisearch
309 stars, TypeScript — created 2025-03-08
nanbingxyz/5ire
5221 stars, TypeScript — created 2024-01-06
What “Amazon Supply Chain Services” Tells Us About What Amazon Is
37 points, 57 comments on HN
xpack-ai/XPack-MCP-Marketplace
163 stars, TypeScript — created 2025-07-09
Hershey Bets on Agentic AI to Rethink $2B in Marketing Spend
25 points, 51 comments on HN
Open-Source Agentic QA Harness with Memory
50 points, 7 comments on HN
Beever-AI/beever-atlas
329 stars, Python — created 2026-04-21
samuelgursky/davinci-resolve-mcp
1070 stars, Python — created 2025-03-18
davepoon/buildwithclaude
2941 stars, Python — created 2025-07-25
blencorp/capture-mcp-server
23 stars, TypeScript — created 2025-07-25
DemonDamon/AgenticX
120 stars, Python — created 2024-03-15
MCP Hello Page
37 points, 12 comments on HN
Playing Atari ST Music on the Amiga with Zero CPU
34 points, 7 comments on HN
CodeGraphContext/CodeGraphContext
3280 stars, Python — created 2025-08-16
utensils/mcp-nixos
639 stars, Python — created 2025-03-20
drhelius/Gearboy
1130 stars, C++ — created 2012-07-19
gridctl/gridctl
16 stars, Go — created 2026-01-07
Travelers on Air Force One ordered to throw away gifts, phones after China trip
20 points, 21 comments on HN
christopherkarani/Swarm
477 stars, Swift — created 2025-12-12
chopratejas/headroom
1760 stars, Python — created 2026-01-07
Jpisnice/shadcn-ui-mcp-server
2768 stars, TypeScript — created 2025-04-07
brokermr810/QuantDinger
5326 stars, Python — created 2025-12-28
coddingtonbear/obsidian-local-rest-api
2235 stars, TypeScript — created 2022-01-25
OctopusDeploy/mcp-server
95 stars, TypeScript — created 2025-09-08
nduckmink/arkon
542 stars, Python — created 2026-04-30
dominik1001/caldav-mcp
68 stars, TypeScript — created 2025-05-16
Codex is now available on mobile via ChatGPT app
28 points, 9 comments on HN
bh-rat/awesome-mcp-enterprise
110 stars, unknown — created 2025-08-12
Making the news available at no cost is a victory
39 points, 23 comments on HN
nduckmink/arkon
697 stars, Python — created 2026-04-30
AI-QL/tuui
1147 stars, TypeScript — created 2025-04-01
kucherenko/jscpd
5629 stars, TypeScript — created 2013-05-29
varandrew/moor
101 stars, TypeScript — created 2026-04-27
williamzujkowski/live-coding-music-mcp
200 stars, TypeScript — created 2025-08-18
oaslananka/kicad-mcp-pro
117 stars, Python — created 2026-04-12
ProfessionalWiki/MediaWiki-MCP-Server
87 stars, TypeScript — created 2025-05-12
gastownhall/gascity
664 stars, Go — created 2026-02-22
Meta employees protest against mouse tracking tech at US offices
39 points, 21 comments on HN
yomorun/yomo
1904 stars, Rust — created 2020-07-01
Launch HN
25 points, 11 comments on HN
HexSleeves/tailscale-mcp
92 stars, TypeScript — created 2025-06-06
cursor/community-plugins
3929 stars, TypeScript — created 2024-08-24
cloudwalk/hermes-mcp
366 stars, Elixir — created 2025-02-24
gleanwork/mcp-server-tester
15 stars, TypeScript — created 2025-11-23
Daghis/teamcity-mcp
25 stars, TypeScript — created 2025-09-11
MachinaCheck
From huggingface-blog
Storybloq/storybloq
182 stars, TypeScript — created 2026-04-17
first-fluke/oh-my-agent
915 stars, TypeScript — created 2026-01-30
carterlasalle/mac_messages_mcp
279 stars, Python — created 2025-03-13
stickerdaniel/linkedin-mcp-server
1839 stars, Python — created 2025-04-13
dkships/pm-copilot
25 stars, TypeScript — created 2026-02-19
bobmatnyc/claude-mpm
126 stars, Python — created 2025-07-25
GH05TCREW/pentestagent
2298 stars, Python — created 2025-05-15
0xSteph/pentest-ai
202 stars, Python — created 2026-04-04
Mangaba-ai/mangaba_ai
196 stars, Python — created 2025-04-08
samanhappy/mcphub
2069 stars, TypeScript — created 2025-03-31
OncoAgent
From huggingface-blog
HenryLach/taskplane
169 stars, TypeScript — created 2026-03-12
aannoo/hcom
257 stars, Rust — created 2025-07-21
CoderGamester/mcp-unity
1671 stars, C# — created 2025-03-13
Erodenn/godot-mcp-runtime
24 stars, TypeScript — created 2026-02-28
velvetmonkey/flywheel-memory
42 stars, TypeScript — created 2026-02-12
delorenj/mcp-server-trello
333 stars, TypeScript — created 2025-01-03
evalops/deep-code-reasoning-mcp
105 stars, TypeScript — created 2025-06-11
Plasticity and language in the anaesthetized human hippocampus
71 points, 26 comments on HN
Aas-ee/open-webSearch
1159 stars, TypeScript — created 2025-06-20
fetchai/uAgents
1589 stars, Python — created 2022-09-28
speakeasy-api/gram
233 stars, TypeScript — created 2025-08-06
chrisryugj/korean-law-mcp
1677 stars, TypeScript — created 2025-12-19
idosal/git-mcp
8046 stars, TypeScript — created 2025-03-29
JackChen-me/open-multi-agent
6070 stars, TypeScript — created 2026-03-31
sipyourdrink-ltd/bernstein
288 stars, Python — created 2026-03-22
AlphaEvolve
66 points, 8 comments on HN
zhizhuodemao/js-reverse-mcp
1127 stars, TypeScript — created 2025-11-29
cyanheads/obsidian-mcp-server
494 stars, TypeScript — created 2025-01-23
openclaw/Peekaboo
3252 stars, Swift — created 2025-05-22
awslabs/mcp
8978 stars, Python — created 2025-03-21
danielsmithdevelopment/ClawQL
12 stars, TypeScript — created 2026-03-19
launchdarkly/mcp-server
21 stars, TypeScript — created 2025-05-22
vinkius-labs/vurb.ts
251 stars, TypeScript — created 2026-02-12
Ask HN: Is there a term for feeling sad about forced AI adoption?
19 points, 24 comments on HN
harsha-iiiv/openapi-mcp-generator
578 stars, TypeScript — created 2025-03-09
dcostenco/prism-coder
132 stars, TypeScript — created 2026-02-12
google-labs-code/stitch-skills
5243 stars, TypeScript — created 2026-01-16
zaxbysauce/opencode-swarm
301 stars, TypeScript — created 2026-01-27
appcypher/awesome-mcp-servers
5520 stars, unknown — created 2024-11-28
antvis/mcp-server-chart
4039 stars, TypeScript — created 2025-04-25
HIDORAKAI002/ai-workspace-archive
11 stars, TypeScript — created 2026-03-28
John Bradley, author of xv, has passed away
61 points, 26 comments on HN
Scientific audio equipment analysis with analyzer shows no difference in quality
29 points, 50 comments on HN
'Project Hail Mary' Crosses $300M in Sales to Become Amazon/MGM's Highest-Gross
30 points, 24 comments on HN
FTC action against Match and OkCupid for deceiving users, sharing personal data
205 points, 106 comments on HN
CrewAI Selected for the Enterprise Tech 30
From crewai-blog
A dot a day keeps the clutter away
71 points, 27 comments on HN
Swappa.com for GrapheneOS compatible devices – Stay Away
80 points, 45 comments on HN
March 2026
From langchain-blog
Amazon is adding a fuel surcharge to fees it collects from third-party sellers
101 points, 41 comments on HN
Open Models have crossed a threshold
From langchain-blog
Iran strikes leave Amazon availability zones "hard down" in Bahrain and Dubai
98 points, 51 comments on HN
Zooming UIs in 2026
78 points, 35 comments on HN
Arcade.dev tools now in LangSmith Fleet
From langchain-blog
CIA used "long-range quantum magnetometry" called "Ghost Murmur" in Iran
18 points, 18 comments on HN
Better Harness
From langchain-blog
Bitmap fonts make computers feel like computers again
69 points, 50 comments on HN
Your harness, your memory
From langchain-blog
The Case Against Gameplay Loops (2024)
51 points, 50 comments on HN
Amazon to acquire Globalstar and expand Amazon Leo satellite network
63 points, 35 comments on HN
Academic fraud may be the symptom of a more systemic problem
25 points, 22 comments on HN
Launch HN
66 points, 61 comments on HN
New unsealed records reveal Amazon's price-fixing tactics, California AG claims
54 points, 9 comments on HN
UK Fuel Price Intelligence – Market analytics from reporting stations
142 points, 70 comments on HN
DeepClaude – Claude Code agent loop with DeepSeek V4 Pro
467 points, 182 comments on HN
Agentic Coding Is a Trap
367 points, 258 comments on HN
The Oscars just banned AI from winning acting and writing awards
74 points, 49 comments on HN
Agent Skills
56 points, 9 comments on HN
Model Context Protocol (MCP) explodes with 97M+ monthly SDK downloads and enterprise backing ...
Since Anthropic's 2024 launch, MCP has seen explosive growth: 97M+ monthly SDK downloads (Python/TS), 5,800+ servers, 300+ clients, 8M+ server downloads; major adoption by OpenAI (Mar 202...
Anthropic releases Claude 4, setting new benchmarks for AI agent capabilities
Anthropic launched Claude Opus 4 and Sonnet 4 on May 22, 2025, achieving top scores on agent benchmarks like SWE-bench (72.5%) and Terminal-bench (43.2%), with hybrid reasoning, extended...
GitHub launches Copilot agent mode for autonomous multi-file coding
GitHub announced agent mode for Copilot, enabling it to autonomously iterate on code outputs, fix errors, suggest and execute terminal commands, and handle multi-file edits from a single...
Google launches Agent2Agent (A2A) open protocol for AI agent interoperability
Google announced the open-source Agent2Agent (A2A) protocol on April 9, 2025, enabling AI agents from different frameworks and vendors to communicate, securely exchange information, and c...
OpenAI launches open-source Agents SDK for production-ready multi-agent workflows
OpenAI launched the open-source Agents SDK, a production-ready upgrade to its experimental Swarm framework, alongside the Responses API. The Python SDK (github.com/openai/openai-agents-py...
Google launches open-source Agent Development Kit (ADK) for production multi-agent systems
Google open-sourced ADK, their internal framework for building production-ready single and multi-agent AI systems with code-first primitives (Sequential/Parallel/Loop agents, LlmAgent), t...
Microsoft releases AutoGen 0.4 with complete architecture rewrite for scalable agentic AI
Microsoft released AutoGen v0.4 on January 17, 2025, featuring a complete redesign with asynchronous event-driven messaging, layered architecture (Core API and AgentChat API replacing v0....
Cursor AI raises $900M at $9.9B valuation, surpassing $500M ARR
Anysphere, creators of AI coding assistant Cursor, raised $900M in Series C funding led by Thrive Capital with participation from Andreessen Horowitz, Accel, and DST Global, achieving a $...
CrewAI Launches Production-Ready Flows for Advanced AI Workflow Orchestration
CrewAI released v1.8.0 introducing production-ready Flows architecture, enabling structured event-driven workflows that chain tasks/Crews with state management, conditional branching, HIT...
Guardrails AI v0.6.7 enables robust structured output validation for reliable AI agents
Guardrails AI released v0.6.7 on Sep 22, 2025, with ongoing GitHub activity through April 2026 (v0.6.6), active development (5.8k stars), and documentation emphasizing structured data gen...
Cognition Labs releases Devin 2.2 with desktop testing and v3 API, maturing AI coding agent
Cognition Labs launched Devin 2.2 on Feb 24, 2026, featuring 3x faster startup, unified dev lifecycle UI, full desktop app testing via computer use, self-verification loops; v3 API out of...
Haystack 2.0 launches flexible cyclic pipeline architecture for agentic LLM apps
deepset released Haystack 2.0 stable, a full rewrite of the open-source LLM framework with pipelines upgraded to cyclic directed multigraphs supporting loops, branching, routers, decision...
DSPy 3.0 launches with advanced optimizers GEPA and SIMBA, driving rapid GitHub activity and ...
DSPy released version 3.0 on August 12, 2025, introducing powerful new optimizers including RL-based GRPO via Arbor, reflective GEPA (Genetic-Pareto outperforming MIPROv2), and SIMBA; MIP...
Modal launches GPU snapshotting slashing serverless cold starts 10x for AI workloads
Modal introduced GPU memory snapshotting (alpha) enabling 10x faster cold starts (118s to 12s median for Ministral 3 3B vLLM), alongside Day 0 support for Mistral 3 models; supports T4 to...
Groq LPU hits 300+ tokens/sec on Llama 2 70B (10x H100 GPUs) in 2025 benchmarks
Groq's LPU inference engine achieved 300 tokens/sec on Llama 2 70B (10x faster than NVIDIA H100 at 30-40 tok/s), with similar gains on other models like Llama 3 8B at 1300+ tok/s; recogni...
LlamaIndex Deprecates Legacy Agent Classes for Workflow-Based Redesign
In llama-index-core v0.14.14 (2026-02-10), LlamaIndex removed deprecated legacy agent classes including FunctionCallingAgent, the older ReActAgent implementation, AgentRunner, all step wo...
OpenAI launches GPT-5.2-Codex agentic coding model with massive usage growth
OpenAI released GPT-5.2-Codex in mid-December 2025, their most advanced agentic coding model enabling repo-scale reasoning, multi-file edits, self-healing, and Windows support; Codex plat...
Anthropic releases Claude Sonnet 4.6 with near-human computer use (72.5% OSWorld)
Anthropic launched Claude Sonnet 4.6, featuring major computer use improvements scoring 72.5% on OSWorld benchmark (from <15% in 2024), approaching human-level on complex tasks like sp...
Together AI raises $305M Series B at $3.3B valuation to scale open model cloud to 450K users
Together AI announced $305M Series B funding (led by General Catalyst/Prosperity7) at $3.3B valuation to expand AI Acceleration Cloud for open-source models; now serves 450K+ developers/e...
Codeium launches Windsurf, the first agentic IDE with Cascade collaborative agent
Codeium launched Windsurf Editor, a VS Code fork billed as the first agentic IDE, featuring Cascade—a collaborative AI agent with deep codebase awareness, tool access (web search, termina...
Fly.io Launches Sprites
Fly.io launched Sprites, lightweight Firecracker-based VMs that create in 1-2s with 100GB persistent storage, auto-sleep when idle, and fast checkpoint/restore. Unlike ephemeral container...
NVIDIA NeMo Guardrails v0.20.0 adds reasoning safety models, PII detection, multi-agent support
NVIDIA released v0.20.0 of NeMo Guardrails, introducing Nemotron-Content-Safety-Reasoning-4B (/think mode for explainable moderation), GLiNER open-source PII detection, multilingual refus...
Cerebras WSE-3 delivers 21x faster AI inference than Nvidia Blackwell
Cerebras CS-3 wafer-scale systems achieved 21x faster inference speeds, lower cost, and power efficiency than Nvidia's DGX B200 Blackwell GPU on models like Llama 3 70B and gpt-oss-120B (...
Hugging Face Launches Inference Endpoints v2 with Real-Time Autoscaling
Hugging Face unveiled Inference Endpoints v2 on January 14, 2026, featuring real-time autoscaling, reduced cold-start latency, and custom Docker container support for enterprise-grade mod...
Cloudflare acquires Replicate, massively expanding model hosting to 50k+ models on global edg...
Cloudflare acquired Replicate, integrating its 50,000+ open-source and proprietary model catalog (including fine-tunes) into Workers AI, adding fine-tuning and custom model support powere...
pgvector 0.8.0 delivers 9x faster filtered vector queries via iterative scans
pgvector 0.8.0 (Oct 2024) introduced iterative index scans fixing overfiltering (up to 9x faster queries, 100x better recall per AWS/Nile benchmarks), improved HNSW performance/inserts, b...
Pydantic AI v1
Pydantic team released v1 of Pydantic AI, a model-agnostic, type-safe agent framework with 15M+ downloads, GitHub repo at 15k+ stars, recent v1.64.0 on 2026-03-02; adds human-in-loop appr...
Anthropic donates MCP to Linux Foundation's Agentic AI Foundation amid explosive ecosystem gr...
Anthropic donated the Model Context Protocol (MCP) to the newly formed Agentic AI Foundation (AAIF) under the Linux Foundation on December 9, 2025. AAIF, co-founded by Anthropic, Block, a...
Pinecone rolls out Gen2 serverless architecture optimized for agentic AI workloads
Pinecone began rolling out second-generation serverless vector DB architecture, introducing adaptive indexing that auto-optimizes for diverse workloads like recommendations, search, and a...
LangChain v0.3 introduces Pydantic v2 migration and JS peer deps breaking changes
LangChain released v0.3 on Sept 16, 2024 with breaking changes: Python upgraded all packages to Pydantic v2 (dropping v1 support and bridges like langchain_core.pydantic_v1), dropped Pyth...
ClickHouse acquires Langfuse, leading open-source LLM observability platform
ClickHouse acquired Langfuse, the open-source platform for LLM observability, evaluations, and prompt management (20k+ GitHub stars, 26M+ monthly SDK installs), which already uses ClickHo...
Datadog launches agentic AI monitoring in LLM Observability at DASH 2025
Datadog expanded LLM Observability with AI Agent Monitoring (GA, visual traces of agent decisions/tools/handoffs), LLM Experiments (preview, test prompts/models on production data), and A...
Weights & Biases launches W&B Weave for LLM app observability and evaluation
Weights & Biases announced W&B Weave on April 18, 2024, a lightweight toolkit for developers to trace, evaluate, and monitor generative AI applications. Key components include Traces for...
Helicone AI Gateway launches with built-in observability for multi-provider LLM routing
Helicone launched AI Gateway on June 19, 2025, an open-source Rust-based proxy providing unified OpenAI-compatible access to 100+ models/providers with automatic fallbacks, caching, rate...
LangGraph Cloud launches as scalable agent deployment platform (now LangSmith Deployment)
LangChain announced LangGraph Cloud in closed beta: managed infrastructure for deploying stateful LangGraph agents at scale with task queues, persistence, double-texting support, backgrou...
OpenAI launches Structured Outputs for guaranteed JSON schema adherence
OpenAI introduced Structured Outputs in its API, enabling models to generate outputs that exactly match developer-supplied JSON schemas using constrained decoding, available via function...
E2B Hits 500M+ Sandboxes with 88% Fortune 100 Adoption
E2B's sandboxed cloud for AI code execution reached 500M+ started sandboxes, 2M+ monthly SDK downloads, used by 88% of Fortune 100 and top AI labs like Perplexity, Hugging Face; raised $2...
Qdrant 1.10 launches native hybrid search via Universal Query API
Qdrant 1.10 introduced Universal Query API enabling server-side hybrid search combining dense/sparse vectors with RRF fusion, prefetch for multi-stage pipelines (e.g., ColBERT reranking,...
Weaviate's generative search enables production RAG with active module maintenance
Weaviate's generative search module, released in 2023, continues active development with recent fixes for OpenAI GPT-5 support, Cohere, and AWS modules in v1.32+ (Sep 2025), plus Query Ag...
Upstash launches serverless Vector database optimized for AI embeddings
Upstash released Vector, a fully serverless vector database using DiskANN/FreshDiskANN for efficient high-dimensional embedding storage and ANN similarity search (cosine, Euclidean, dot p...
Microsoft Agent Framework reaches RC, evolving Semantic Kernel agents with portable skills an...
Microsoft Agent Framework (built by Semantic Kernel/AutoGen teams) hit Release Candidate status; introduced Agent Skills (runtime-loadable domain expertise), integrations with Claude Agen...
Chroma 0.5.0 Release
Chroma 0.5.0 released on April 23, 2024, introducing block-based Arrow-backed storage, sparse indexes, advanced compaction services, persistent SysDB migrations, and query processing oper...
Composio raises $25M Series A, scales to 25k+ GitHub stars and active agent infra adoption
Composio, a leading AI agent tool integration platform, raised $25M in Series A funding (total $29M) led by Lightspeed Venture Partners on July 22, 2025. The platform now offers 1000+ too...
CrewAI raises $18M seed/Series A, launches Enterprise platform with 150+ customers
CrewAI raised $18M in funding (seed led by boldstart ventures, Series A led by Insight Partners, with Andrew Ng and Dharmesh Shah), launched CrewAI Enterprise GA after beta with 150+ ente...
n8n Launches AI Agent Tool Node for Simplified Multi-Agent Orchestration
n8n released version 1.121.0 introducing the AI Agent Tool node, enabling primary AI Agents to delegate tasks to specialized sub-agents within a single workflow canvas and execution, supp...
Portkey AI Gateway delivers production-grade multi-model routing for AI agents
Portkey AI Gateway (v1.15.2 released Jan 12, 2026) provides open-source unified API for routing to 1600+ LLMs with conditional routing, load balancing, fallbacks, retries, caching, guardr...
SerpApi Launches Open-Source MCP Server for Seamless AI Agent Search Integration
SerpApi released an open-source Model Context Protocol (MCP) server that exposes its web search APIs (Google, Bing, etc.) as a standardized tool for MCP-compatible AI clients like Claude...
Databricks agrees to acquire Neon, serverless Postgres where 80% of DBs are created by AI agents
Databricks announced intent to acquire Neon, a serverless Postgres platform optimized for AI agents with instant provisioning (<500ms), scale-to-zero, branching, and pgvector support....
Firecrawl launches CLI Skill and MCP for seamless AI agent web scraping
Firecrawl released Firecrawl CLI (command-line tool for scrape/search/crawl/map) and Firecrawl Skill (teaches AI agents to install/use CLI autonomously), enabling agents to access clean,...
Crawl4AI v0.8.0
Crawl4AI, an open-source web crawler optimized for LLMs and AI agents, released v0.8.0 on Jan 16, 2026, adding crash recovery, prefetch mode (5-10x faster URL discovery), and Docker fixes...
LangFlow
LangFlow has emerged as a mature low-code/no-code visual framework for building AI agents via drag-and-drop LangChain components, featuring recent releases (1.7.3 Jan 2026) adding advance...
Tavily raises $20M Series A ($25M total), hits 1M+ monthly downloads with zero marketing
Tavily, the search API for AI agents, raised $25M total funding including a $20M Series A led by Insight Partners (Aug 2025), achieving over 1M monthly downloads and users with zero marke...
Make.com launches next-generation AI Agents with visual canvas integration and multi-modal su...
Make.com released the next generation of Make AI Agents, fully integrated into the visual scenario builder: agents now live in the canvas for building/running/debugging, with Reasoning Pa...
Fireworks AI delivers mature function calling docs and kimi-k2-instruct-0905 support as of ea...
Fireworks AI provides comprehensive OpenAI-compatible function calling (tool calling) documentation using kimi-k2-instruct-0905 model, supporting JSON Schema tools, tool_choice options in...
Exa launches LangGraph agent tutorial showcasing semantic search integration
Exa published official documentation and full code example for building retrieval agents using their ExaSearchRetriever tool in LangGraph, demonstrating semantic web search in agentic loo...
Arize Phoenix delivers OpenTelemetry tracing for Open Agent Spec, enabling portable observabi...
Arize released a blog post demonstrating one-line integration of Phoenix OSS observability with Open Agent Spec agents. It enables tracing of LLM calls, tools, and decisions across runtim...
Supabase launches Vector Buckets
Supabase introduced Vector Buckets in public alpha, providing S3-backed scalable vector storage (up to 50M vectors/index) with built-in similarity search, metadata filtering, and seamless...
OpenRouter token usage explodes with Chinese models dominating 61% of top rankings
OpenRouter's top 10 models processed 8.7T tokens in a recent week, up massively YoY; Chinese models claimed 61% share and 4/5 top spots (MiniMax M2.5: 2.45T tokens, +197% WoW), driven by...
Browserbase raises $40M Series B and launches Director no-code web automation for AI agents
Browserbase launched Director, a no-code tool that turns plain English into executable browser automations via Stagehand scripts on their cloud infrastructure; simultaneously announced $4...
Manus AI agent launches March 6, 2025, goes viral with GAIA SOTA performance
Chinese startup Monica.im / Butterfly Effect launched Manus, a general-purpose autonomous AI agent on March 6, 2025, capable of end-to-end task execution (research, coding, data analysis)...
Patronus AI launches Lynx
Patronus AI released Lynx, an open-source LLM (70B and 8B variants based on Llama-3) for real-time hallucination detection in RAG settings, outperforming GPT-4o and Claude-3 on HaluBench...
Humanloop prompt management platform acquired and sunsetting September 8, 2025
Humanloop, a leading LLMOps platform for prompt management, evaluations, and observability used by teams at Duolingo, Gusto, and Vanta, announced it is being acquired (team joining Anthro...
Dify open-source LLMOps platform surpasses 131k GitHub stars with v1.13.0 release enhancing a...
Dify, a production-ready open-source LLMOps platform for building agentic AI workflows, RAG pipelines, and apps, reached 131k stars and 20.4k forks on GitHub. Latest release v1.13.0 on Fe...
Scale AI updates Evaluation platform with advanced analytics for frontier LLM benchmarking
Scale AI announced major updates to its Scale Evaluation platform, adding instant model comparison across thousands of tests, multi-dimensional performance visualization, automated error...
LiteLLM Proxy hits 37.5k GitHub stars with frequent enterprise-grade releases
LiteLLM Proxy Server, an OpenAI-compatible gateway for 100+ LLMs, reached 37.5k stars and 6.1k forks on GitHub, with v1.81.12-stable.2 released Feb 28, 2026 featuring performance optimiza...
Milvus 2.6 introduces hybrid GPU_CAGRA for 12x faster vector index builds at production scale
Milvus 2.6.1 released hybrid GPU_CAGRA index: GPUs accelerate graph construction 12-15x faster than CPU HNSW; CPU handles scalable queries via adapt_for_cpu serialization to HNSW. Benchma...
FlowiseAI
FlowiseAI, open-source no-code/low-code platform for building AI agents and LLM workflows visually (Assistant/Chatflow/Agentflow V2 supporting multi-agent orchestration, RAG, tools, human...
NIST Launches AI Agent Standards Initiative
NIST announced the AI Agent Standards Initiative to develop industry-led standards for secure, interoperable AI agents, with RFIs on security (due March 9) and identity/authorization (due...
PromptLayer gains traction as top prompt management tool amid rising LLM observability needs
PromptLayer cited in 2026 AI prompt engineering report as used by 29% of prompt engineers for A/B testing prompts, ranked among top 5 prompt tools, featured in recent enterprise case stud...
LangSmith enhances agent evaluation with multi-turn evals, Vitest parallelization, and produc...
LangSmith evaluation framework saw key enhancements including online multi-turn evaluations for full conversation trajectories (released Oct 2025 in self-hosted v0.12), Vitest/LangSmith i...
Microsoft Guidance v0.3.1 enables 100% reliable structured LM outputs
Microsoft Research released Guidance 0.3.1, an open-source Python library (21.3k stars) for token-by-token control of LM outputs, guaranteeing structured formats like JSON/SQL via regex,...
TruLens 2.7.0 adds unified Metric API with enhanced OpenTelemetry support for agent evaluation
TruLens released v2.7.0 on Feb 19, 2026, introducing a unified Metric API (replacing Feedback), better OpenTelemetry integration for span data selection, and ongoing agent evaluation capa...
AI Agent Observability Emerges as Critical Infrastructure for Production Deployments
Multiple platforms (Arize, Braintrust, Galileo) released comprehensive comparisons of top AI agent observability tools, detailing tracing, evaluations, and production monitoring capabilit...
EnCompass framework enables AI agents to recover from errors via execution path search
Researchers from Asari AI, Caltech, and MIT released EnCompass, a framework presented at NeurIPS 2025, allowing AI agents to backtrack from LLM errors by treating workflows as searchable...
2025 Chat Agent Supply Chain Breach Hits 700+ Orgs
Attackers hijacked a chat agent integration, cascading to breaches in Salesforce, Google Workspace, Slack, S3, Azure across 700+ organizations; 90% agents over-permissioned, moving 16x mo...
Ragas establishes as leading open-source framework for RAG and agent evaluation metrics
Ragas v0.4.3 released Jan 13, 2026, with 12.4k GitHub stars, 791k monthly PyPI downloads (~34k/day), and metrics for RAG (context precision/recall, faithfulness, etc.) plus agentic workfl...
Braintrust enhances agent eval framework with trace-level scorers and auto-instrumentation SDKs
Braintrust released trace-level scorers enabling LLM-as-judge evaluation of full agent traces including tool usage and multi-turn interactions, alongside auto-instrumentation SDK updates...
Corvic Labs launches open-source Agentic MCP Evaluator for standardized AI agent testing
Corvic Inc. announced Corvic Labs, launching the Agentic MCP Evaluator—an open-source framework for testing multistep AI agents via Anthropic's Model Context Protocol (MCP). It enables at...
AgentKeeper launches as open-source cognitive persistence layer solving AI agent memory loss ...
Show HN post launched AgentKeeper, a cognitive persistence layer that stores provider-agnostic facts in SQLite, reconstructs context dynamically via Cognitive Reconstruction Engine (CRE),...
Agentic AI cost overruns hit 92% of deployments; 7 proven optimization strategies emerge
IDC reports 92% of agentic AI implementations face cost overruns, with Gartner predicting 40% pilot cancellations by 2027 due to escalating expenses from retries, context bloat, and orche...
GitHub Copilot CLI adds ACP support, advancing agent-to-agent standards
GitHub announced public preview of ACP (Agent Client Protocol) support in Copilot CLI, implementing the industry-standard protocol for AI agent-client communication via stdio or TCP, enab...
MCP and A2A Protocols Standardize AI Agent Interop but Expose Auth Gaps Driving New Security ...
Anthropic's MCP (data/tools) and Google's A2A (agent-agent) protocols gained massive adoption (97M SDK downloads, OpenAI/Google support), but security lags: 41% MCP servers lack auth, 85%...
e5-small achieves perfect Top-5 retrieval accuracy at 14x speed of 8B models in RAG benchmark
Comprehensive benchmark of 16 open-source embedding models on 490K Amazon product reviews revealed e5-small (118M params) delivers 100% Top-5 accuracy and 16ms latency, outperforming 70x...
Zapier matures AI Agents and workflows into enterprise-ready automation platform
Zapier has evolved from basic if-then Zaps to full AI orchestration, combining AI agents for decision-making/task routing, workflows connecting 8,000+ apps, Tables for data, Forms/Canvas/...
NVIDIA launches Nemotron Parse for agentic document processing in production at Docusign and ...
NVIDIA announced Nemotron Parse models as part of Nemotron Labs open models suite, enabling precise document parsing with spatial grounding, table extraction, and reading order preservati...
Vercel AI SDK reaches 20M+ monthly npm downloads with agentic features and Fortune 500 adoption
Vercel released AI SDK 6 featuring native agent abstractions, tool execution approval, MCP support, and detailed token usage tracking; reports 20M+ monthly npm downloads, 22k GitHub stars...
Galileo Launches Signals
Galileo launched Signals on 2026-01-23, upgrading its insights engine to proactively detect subtle "unknown unknown" failures in production AI agents by analyzing traces, generating optim...
NeuralTrust publishes comprehensive guide on rate limiting & throttling as essential security...
NeuralTrust released a detailed blog post outlining rate limiting and throttling strategies tailored for AI agents, distinguishing them from traditional web apps due to recursive workflow...
Atlassian launches Agents in Jira open beta for human-AI collaboration
Atlassian released open beta of "agents in Jira," enabling teams to assign tasks to AI agents (Atlassian Rovo and third-party MCP-enabled agents) alongside humans in the same dashboard. S...
Handshake acquires Cleanlab to enhance AI data quality capabilities
AI data-labeling startup Handshake acquired Cleanlab in an acqui-hire, bringing on 9 key employees including co-founders (MIT PhDs) and their algorithms for automatically flagging label e...
Real-world CVEs and supply-chain attacks drive urgent evolution in prompt injection defenses
Critical CVEs emerged in 2025-2026 for Microsoft Copilot (CVSS 9.3), GitHub Copilot (CVSS 9.6), Cursor IDE (CVSS 9.8), and npm supply-chain attacks using prompt injection via MCP to exfil...
NVIDIA Blackwell Delivers Up to 10x Lower AI Inference Costs with Open-Source Models
NVIDIA announced that inference providers Baseten, DeepInfra, Fireworks AI, and Together AI achieved up to 10x reductions in cost per token on Blackwell GPUs compared to Hopper, using ope...
MongoDB Vector Search Goes Self-Managed
MongoDB announced at MongoDB.local NYC that full-text search and vector search capabilities, previously exclusive to Atlas, are now available in public preview on MongoDB Community Editio...
LMQL
LMQL provides a Python-superset language for LLM interaction with constraints (e.g., length, stops, types) enforced via logit-masking during generation, supporting nested queries, control...
Developers report webhook reliability as top pain point blocking production AI agents
AI agent builders increasingly highlight webhook delivery failures, auth issues, and network interruptions as major hurdles in production deployments, despite LLM advances. Recent posts n...
Google publishes guide to eight essential multi-agent system design patterns
Google released a comprehensive guide outlining eight key design patterns for multi-agent systems—including sequential pipelines, coordinator/dispatcher, parallel fan-out/gather, hierarch...
AWS Launches RFT and Serverless Fine-Tuning for Production Agentic AI at re
AWS detailed advanced fine-tuning techniques (SFT, PPO, DPO, GRPO, DAPO, GSPO) for multi-agent systems, showcasing Amazon production results like 33% medication error reduction and 80% ef...
Serverless excels for bursty AI agents but fails complex workflows—builders must hybridize wi...
Industry analyses and AWS guidance highlight serverless strengths (auto-scale, pay-per-use) for event-driven/simple AI inference but expose limitations like timeouts (5-15min), cold start...
AWS demonstrates semantic vector search boosts agent tool selection accuracy to 82% with 92% ...
AWS published a blog showing how S3 Vectors as a Bedrock Knowledge Base backend enables semantic search for agent tool selection. Using LangGraph and Bedrock on the MCPVerse benchmark (42...
Claude Context Mode MCP server achieves 98% context window reduction for AI coding agents
Open-source MCP server "Context Mode" launched on GitHub, compressing verbose tool outputs (e.g., git logs, web fetches) into searchable summaries using SQLite FTS5 and BM25 ranking, redu...
Mature agent observability platforms like Arize and Langfuse enable production-grade workflow...
2026 industry analyses highlight specialized platforms (Arize AX, Maxim AI, Galileo, Braintrust, LangSmith) with agent graph visualization, trajectory mapping, MCP tracing, and AI assista...
Redis publishes comprehensive guide to agent memory management with LangGraph integration (Fe...
Redis released detailed production guide on building AI agent memory systems using Redis for short-term checkpointing (<1ms latency via LangGraph RedisSaver) and long-term semantic/episod...
Bright Data's MCP Server Gains Traction with 2.1k GitHub Stars and Active AI Agent Integrations
Bright Data's open-source MCP server for AI-powered web data collection released v2.8.6 (Mar 1, 2026), enabling seamless integration with LLMs like Claude for unblockable scraping, search...
Open-source LLMs close to within 5 points of proprietary on agentic benchmarks (Jan 2026)
Benchmark of 94 LLM endpoints shows top open-source models (GLM-4.7 at 68 QI, DeepSeek V3.2 at 66 QI) trailing proprietary leaders (Gemini 3 Pro, GPT-5.2 at 73 QI) by just 5 quality index...
Anthropic publishes effective harness patterns for long-running coding agents
Anthropic released engineering insights on solving the long-running agent problem with initializer and coding agent patterns, using structured artifacts like init.sh, claude-progress.txt,...
AG-UI Protocol Gains Traction as Standard for Real-Time Agent-User Communication
AG-UI (Agent-User Interaction Protocol), an open event-based protocol for real-time bidirectional communication between AI agents and user interfaces, reaches 9.2k GitHub stars with integ...
2025 AI Agent TCO Revealed
Technova Partners published comprehensive 2025 European market analysis from 60+ implementations, detailing AI agent costs: initial £16k-£75k, monthly ops £1.8k-£10.5k (LLM APIs 25-40%),...
Union.ai raises $38.1M Series A for AI development infrastructure powering agentic workflows
Union.ai completed a $38.1M Series A funding round led by NEA, with Nava Ventures and Mozilla Ventures, to launch Union 2.0 and advance open-source AI orchestration via Flyte 2 AI orchest...
Google Cloud launches Agent2Agent protocol enabling cross-platform AI agent interoperability
Google Cloud announced the Agent2Agent (A2A) protocol, an open standard backed by 50+ partners including Salesforce, ServiceNow, and SAP, allowing AI agents from different frameworks (ADK...
90% of Enterprises Actively Adopting AI Agents Per Kong Report
Kong Inc. released "Agentic AI in the Enterprise" report showing 90% of surveyed enterprises actively adopting AI agents, over 50% with roadmaps, 79% expecting full-scale within 3 years;...
AIUC-1 Emerges as First Comprehensive Compliance Framework for AI Agent Security Audits
360 Advanced published details on AIUC-1, a new agent-specific compliance standard combining independent auditing, technical testing (including adversarial exploits), and quarterly evalua...
CrewAI hits 44,877 GitHub stars as multi-agent orchestration surges
CrewAI, a Python framework for orchestrating role-playing autonomous AI agents in collaborative crews, reached 44,877 GitHub stars as of March 1, 2026, signaling massive developer adoptio...
Agent-native SaaS platforms launch with MCP support, signaling shift to machine-to-machine ec...
New Relic launched its Agentic Platform (Feb 24), Workato announced Enterprise MCP for SaaS (Feb 5), Veza introduced Access Agents (Feb 26); Greg Isenberg's X post on building agent-nativ...
Vector databases consolidate into Postgres/pgvector as majors acquire Postgres startups
Major data platforms acquired Postgres specialists (Databricks-Neon $1B May 2025; Snowflake-Crunchy $250M June 2025); vectors become data type not DB category; pgvectorscale outperforms Q...
ADP Launches AI Agents Section in World's Largest HR Marketplace on March 2, 2026
ADP announced the launch of a dedicated AI agents destination within its ADP Marketplace, the world's largest digital HR storefront. This curated ecosystem features partner-built AI agent...
Notion launches Custom Agents; 21K built in first week as no-code agent building goes mainstream
On February 24, 2026, Notion shipped Custom Agents (no-code autonomous AI agents running on triggers/schedules with MCP integrations), resulting in 21,000 agents built by non-developers i...
Google Research derives first quantitative scaling laws for AI agent systems, debunking "more...
Google Research evaluated 180 agent configurations across benchmarks, deriving scaling principles: multi-agent boosts parallel tasks (+81%) but degrades sequential ones (-70%); tool-heavy...
AI Agents Now Create 80-97% of New Databases and Clusters, Driving Need for Hyper-Elastic Arc...
AI agents are driving core database activity: on Neon (Databricks), agents create 80% of all databases and 97% of branches; on TiDB Cloud, 90%+ of new clusters are agent-created, signalin...
EU AI Act high-risk compliance deadline approaches August 2026 for AI agent systems
EU AI Act mandates strict requirements for high-risk AI systems including agent systems used in employment, credit, healthcare; most obligations activate August 2026 with fines up to 7% g...
Agentic Plan Caching reduces LLM agent serving costs by 47% while preserving 97% accuracy
Researchers introduced agentic plan caching (APC), a test-time system that extracts structured plan templates from successful agent executions, stores them keyed by semantic keywords, and...
NIST Launches AI Agent Standards Initiative Targeting Security and Identity
NIST announced the AI Agent Standards Initiative to develop industry-led standards for secure, interoperable AI agents, with RFI on AI Agent Security due March 9 and concept paper on iden...
2026 Agent Evaluation Frameworks and Benchmarks Mature with Leaderboards Showing Claude Domin...
Leaderboards updated showing HAL Generalist Agent with Claude Sonnet 4.5 achieving 74.6% on GAIA, OpAgent at 71.6% on WebArena, and Claude Opus 4.6 at 99.3% on Tau2-bench telecom tasks; G...
Growing recognition of need for structured incident response plans for AI agent failures
Pivot Point Security published guidance emphasizing AI incident response plans to handle unique AI failure modes like hallucinations, model decay, data poisoning, and prompt injection att...
Google formalizes 8 multi-agent design patterns in ADK SDK
Google published a detailed guide using their Agent Development Kit (ADK) illustrating 8 key multi-agent design patterns: Sequential Pipeline, Dispatcher, Parallel Execution, Hierarchical...
Enterprise-grade model routers and fallback systems mature, enabling production agent reliabi...
Tetrate announced general availability of Agent Router Enterprise, providing centralized LLM/MCP routing, BYOK support, automatic fallbacks, FINOS governance guardrails, and observability...
AgentMD launches as CI/CD for AI agents, making AGENTS.md executable with sandboxing and dash...
AgentMD was launched as a CI/CD tool specifically for AI agents, parsing and executing AGENTS.md files used by 60k+ repos for AI coding tools. Features include validation, sandboxed execu...
Confluent positions event-driven architecture as essential infrastructure for scalable enterp...
Confluent published a detailed analysis arguing that AI agents require event-driven architecture (EDA) powered by data streams like Apache Kafka for autonomous problem-solving, adaptive r...
Red Hat integrates llm-d into SoftBank AITRAS for agentic AI-RAN resource optimization
Red Hat announced integration of open-source llm-d framework into SoftBank's AITRAS AI-RAN orchestrator, enabling dynamic GPU resource allocation for LLM prefill/decode phases, unified AI...
Agentic RAG Emerges as Standard for Advanced Retrieval in Agent Systems (2025-2026)
Progress Software launched Agentic RAG, a SaaS platform combining agentic AI with RAG for no-code, trustworthy GenAI on unstructured data; academic surveys and GitHub frameworks prolifera...
Anthropic launches Claude Opus 4 with breakthrough agentic capabilities for sustained long-ru...
Anthropic released Claude Opus 4 and Sonnet 4, with Opus 4 as the world's top coding model (SWE-bench 72.5%, Terminal-bench 43.2%) excelling in complex, long-running agent workflows capab...
Microsoft publishes comprehensive AI agent orchestration patterns guide
Microsoft released a detailed guide on 5 key AI agent orchestration patterns (sequential, concurrent, group chat, handoff, magentic) for building reliable multi-agent systems, with trade-...
MCP and A2A Protocols Emerge as Complementary Standards for Tool Use and Agent Interoperability
Industry recognizes MCP (Anthropic, Nov 2024) for LLM-tool integration via natural language abstraction and A2A (Google, Apr 2025) for agent-to-agent communication via Agent Cards as comp...
Athena Security launches specialized edge AI agents on Apple iPad for security screening
Athena Security launched a patent-pending AI Agent framework with 6 specialized agents (e.g., Person-of-Interest scanner, Anti-Bypass detector, Self-Healing System) that process directly...
Agentic AI Foundation Launches with MCP Standardization for Agent-Tool Integration
Agentic AI Foundation (AAIF) launched under Linux Foundation with platinum members including AWS, Anthropic, Block, Bloomberg, Cloudflare, Google, Microsoft, OpenAI. Anthropic contributed...
OpenAI Releases gpt-realtime-1.5 Boosting Voice Agent Reliability
OpenAI launched gpt-realtime-1.5 for the Realtime API with ~10% better transcription of numbers/letters, 5% improvement in logical audio tasks, 7% better instruction following, and update...
Researchers launch first Open General Agent Leaderboard with community submissions
IBM Research team released the Exgentic framework, Unified Protocol for agent-benchmark integration, and the first Open General Agent Leaderboard benchmarking 5 general agents (e.g., Open...
Amazon Bedrock AgentCore launches bi-directional streaming for real-time agent interactions
Amazon Bedrock AgentCore Runtime introduced bi-directional WebSocket streaming, enabling real-time two-way communication for AI agents, supporting natural voice conversations, interruptio...
Mistral Launches Agents API Optimizing Model Performance for Autonomous Agents
Mistral AI announced the Agents API, combining their LLMs with built-in connectors for code execution, web search, image generation, MCP tools, persistent memory, and multi-agent orchestr...
Anthropic launches Claude 3.7 Sonnet with controllable extended thinking mode for superior ag...
Anthropic released Claude 3.7 Sonnet, a hybrid reasoning model featuring toggleable "extended thinking mode" that allows the model to self-reflect with a user-set token budget (up to 128K...
Google releases Gemini 2.5 Computer Use model for agentic UI control
Google released the Gemini 2.5 Computer Use model, a specialized variant of Gemini 2.5 Pro optimized for visual UI interaction. It introduces a new `computer_use` tool in the Gemini API t...
GitHub Launches Spark
GitHub released Spark in public preview for Copilot Enterprise subscribers, enabling natural language creation of full-stack AI-powered web apps with no setup, one-click deployment, GitHu...
OpenAI releases GPT-5 with major agentic improvements
OpenAI released GPT-5 on August 7, 2025, featuring enhanced agentic tool use, better multi-step task handling, instruction following, and reduced hallucinations (45-80% fewer factual erro...
DeepSeek releases open-source R1 reasoning model rivaling OpenAI o1 via pure RL training
DeepSeek-AI open-sourced DeepSeek-R1 (671B MoE, 37B active), a reasoning model trained via large-scale RL directly on V3-Base without initial SFT, achieving o1-level benchmarks (e.g., 79....
AI agent observability platforms mature with specialized tools for production monitoring
Industry analysis identifies top 5 AI agent observability platforms (Braintrust, Vellum, Fiddler, Helicone, Galileo) with features like hierarchical tracing, automated evaluations, CI/CD...
Multi-tenant agent architectures emerge as standard for scalable enterprise AI with detailed ...
Multiple recent guides published detailing production architectures for multi-tenant AI agents: AWS prescriptive guidance (Jan 2025) on agentic multi-tenancy, Fast.io hybrid namespace iso...
Google launches Jules, asynchronous Gemini-powered coding agent in public beta
Google released Jules to public beta: an autonomous coding agent powered by Gemini 2.5 Pro that clones repos to secure Cloud VMs, performs tasks like writing tests, fixing bugs, adding fe...
Google launches Gemini thinking modes with visible reasoning for superior agentic performance
Google released Gemini 2.0 Flash Thinking Experimental mode on Feb 5, 2025, enabling models to show internal thought processes for stronger reasoning on complex tasks; evolved into config...
OpenAI releases o3, first reasoning model with fully agentic tool use
OpenAI released o3, its most powerful reasoning model, which autonomously reasons about when and how to use tools like web search, Python code execution, file analysis, visual reasoning,...
AI Engineer Emerges as #1 Fastest-Growing US Job in 2026 with 143% Postings Growth
LinkedIn ranked AI Engineer as the #1 fastest-growing job title in the US for 2026, with US job postings rising 143% year-over-year in 2025; Korn Ferry reports >50% of talent leaders plan...
ServiceNow Fully Integrates Moveworks, Launches Autonomous Workforce as Enterprise Agent Plat...
ServiceNow launched Autonomous Workforce—role-based AI specialists (starting with L1 Service Desk) that execute end-to-end jobs with enterprise governance—and EmployeeWorks, integrating a...
Salesforce Agentforce ARR surges to $800M amid shift to flexible hybrid pricing
Salesforce reported Q4 FY2026 earnings with Agentforce achieving $800M ARR (up 169% YoY), 29,000 deals (up 50% QoQ), and 2.4B agentic work units delivered, powered by flexible pricing inc...
Perplexity API expands with Pro Search agentic capabilities and embeddings for advanced agent...
Perplexity API roadmap details upcoming Pro Search public release with multi-step reasoning, dynamic tools (web search, URL fetch, Python execution), auto-classification, and transparent...
RAG matures with 10 production techniques including hybrid search boosting accuracy 11-15%
Redis released guide on 10 RAG optimization techniques: hybrid search (3x recall), HNSW tuning, chunking, fine-tuning, caching, memory, query transforms, LLM judge, re-ranking—shifting na...
Hybrid search (vector + keyword via RRF) becomes enterprise standard for production RAG in ag...
GigaVector 0.8.0 released as full enterprise vector DB platform featuring hybrid search (BM25 + ANN + reranking) alongside sharding, multi-tenancy, knowledge graphs; active discussions ac...
Major AI agent platforms standardize generous free tiers in 2026
In 2026, platforms like Gumloop (2k credits/mo free), Relay.app (500 AI credits/mo), StackAI (500 runs/mo), Zapier Agents (400 activities/mo), Agent.ai (unlimited marketplace agents w/ we...
Growing ecosystem of specialized observability and audit trail tools for AI agents
Recent articles and HN launches highlight agent observability platforms like AgentLens (open-source tamper-evident audit trails, Feb 2026 HN), GitHub Agentic Workflows with built-in audit...
NIST Launches AI Agent Standards Initiative to Drive Interoperability
NIST's Center for AI Standards and Innovation launched the AI Agent Standards Initiative to develop industry-led standards ensuring secure, reliable, and interoperable AI agents across ec...
ClawJacked
Oasis Security disclosed ClawJacked, a high-severity vulnerability (CVE-2026-25253, CVSS 8.8) in OpenClaw allowing malicious websites to silently brute-force localhost WebSocket gateway p...
GitHub launches open-source Agentic Workflows in technical preview, inviting community contri...
GitHub released Agentic Workflows into technical preview as fully open-source (MIT license) in the gh-aw repo, enabling AI agents to automate repo tasks like issue triage and PR reviews v...
Meta releases open-source Llama 4 with MoE architecture, 10M context, native multimodality fo...
Meta released Llama 4 on April 5, 2025, including Scout (17B active/109B total params, 10M context) and Maverick (17B active/400B total, 1M context), natively multimodal (text+image), und...
Salesforce Agentforce achieves 70% latency reduction through runtime rearchitecture
Salesforce rearchitected the Agentforce platform with 30+ enhancements including consolidating LLM calls from 4 to 2 before streaming, replacing LLM-based safety checks with deterministic...
Google ADK expands to Go, enabling multi-language agent building
Google announced Agent Development Kit (ADK) support for Go, adding to existing Python and Java SDKs. ADK Go provides idiomatic agent building with concurrency advantages, 30+ database in...
Corvic Labs Launches Open-Source Platform to Standardize AI Agent Evaluation
Corvic launched Corvic Labs with the Agentic MCP Evaluator, an open-source tool for standardized testing of multistep AI agents via Anthropic's Model Context Protocol, enabling repeatable...
Enterprises demand architectural autonomy for sovereign AI agents amid tightening data regula...
ISG research highlights that sovereign cloud alone is insufficient for data sovereignty compliance in AI applications including agents; enterprises need autonomy to mix infrastructure option...
Asia Pacific Emerges as Fastest-Growing AI Agent Market, Surpassing North America
North America holds 39.63% market share in 2025 but Asia Pacific is the fastest-growing region for agentic AI deployments, driven by government initiatives, BFSI/telecom adoption, and clo...
Tess AI raises $5M seed for enterprise agent orchestration platform
Tess AI raised $5M in seed funding led by Hi Ventures and DYDX Capital to expand its enterprise agent orchestration platform, which enables employees to create, deploy, and share autonomo...
Microsoft Agent Framework Hits RC as AAIF Adds 97 Members Amid Framework Wars
Agentic AI Foundation (AAIF) welcomed 97 new members (total 146, inc. JPMorgan, Red Hat, ServiceNow) to standardize open agent protocols; Microsoft Agent Framework (AutoGen successor) rea...
Anthropic acquires Vercept to boost Claude's computer-use agent capabilities
Anthropic acquired Seattle-based AI startup Vercept, specialists in perception and interaction for AI agents to operate in software environments like humans. Vercept's team, including co-...
Azure Databricks Lakebase reaches general availability as serverless Postgres for AI agents
Azure Databricks announced general availability of Lakebase, a serverless Postgres-compatible OLTP database integrated with the Databricks Lakehouse. It supports instant branching, point-...
GAIA and SWE-bench solidify as gold-standard benchmarks for AI agent capabilities with rapid ...
GAIA benchmark leaderboard shows Claude Sonnet 4.5 achieving 74.55% overall accuracy (Sep 2025) on public validation set, testing reasoning, multi-modality, browsing, tool-use across 165...
Confluent launches snapshot queries unifying batch and stream processing for reliable AI agents
Confluent announced snapshot queries in Confluent Cloud for Apache Flink, enabling unified batch and streaming data processing in a single environment. This allows AI agents to access bot...
Multi-Agent Orchestration Emerges as 2026 Scaling Frontier with Defined Coordination Patterns
Industry analysis outlines three core coordination patterns—centralized supervisor, decentralized peer-to-peer, and hierarchical—for multi-agent AI systems, addressing scaling challenges...
AI Agents Require Task Queues for Production Reliability
Recent industry articles detail why AI agents need dedicated task queues to manage retries, rate limits, context preservation, deduplication, and multi-step workflows reliably, with pract...
Detailed Playbooks Emerge for AI Agent Conflict Resolution and Handoff Patterns
Arion Research published a comprehensive "Conflict Resolution Playbook" detailing detection, classification, rule-based priorities, voting, ML negotiation, and hybrid architectures for re...
MCP Server Ecosystem Explodes to 8600+ Servers with 4x Remote Growth
MCP server count reached 8608 (PulseMCP, Mar 2026, up from 5500+ Oct 2025); remote servers 4x since May 2025; 232% growth in company servers to 1412 (Feb 2026); top servers millions weekl...
Kubernetes SIG Apps launches Agent Sandbox with Warm Pool for low-latency secure AI agent execution
Kubernetes SIG Apps launched Agent Sandbox subproject featuring SandboxWarmPool CRD to maintain pools of pre-warmed pods, enabling sub-second startup for secure AI agent sandboxes using g...
Focus Agent Introduces Autonomous Context Compression for LLM Agents
Researchers published "Active Context Compression: Autonomous Memory Management in LLM Agents," introducing the Focus Agent. This bio-inspired architecture enables LLM agents to autonomou...
pgvector 0.8.2 patches critical HNSW buffer overflow while major RDBMSes like SQL Server 2025...
pgvector 0.8.2 released fixing CVE-2026-3172 buffer overflow in parallel HNSW index builds for vector search; follows SQL Server 2025 GA (Nov 2025) with native VECTOR type/indexes and Clo...
Shift to ephemeral and dynamic credential management for secure AI agent authentication
Industry shifting from static credentials to ephemeral authentication, dynamic identity management, behavior-based auth, and specialized agent identity protocols like Agent Passport (open...
TypeScript Overtakes Python as GitHub's Top Language in 2025, Fueled by AI Agent Development
TypeScript surged 66% YoY to become GitHub's most-used language in 2025, surpassing Python and JavaScript—the biggest shift in over a decade. AI tools favor statically typed languages lik...
Official MCP Registry Launches as Centralized Hub for MCP Server Discovery
The Model Context Protocol (MCP) project launched the official MCP Registry in preview at registry.modelcontextprotocol.io, providing a centralized open catalog and REST API for publishin...
2025 AI Agent Index Reveals Major Transparency Gaps in Safety Documentation
The MIT 2025 AI Agent Index analyzed 30 prominent AI agents, finding most developers share little information on safety, evaluations, and societal impacts: 25/30 disclose no internal safe...
ADP Launches Curated AI Agent Marketplace for HR Workflow Automation
ADP launched a new curated destination in its Marketplace featuring partner-built AI agents (from Absorb, Aquera, etc.) that integrate with ADP to orchestrate multistep HR, payroll, talen...
Atlassian Launches Agents in Jira Open Beta for Seamless Human-AI Teamwork
Atlassian announced the open beta of "agents in Jira," enabling teams to assign tasks to Atlassian Rovo agents and third-party MCP-enabled agents directly in Jira, @mention them in commen...
Google launches A2UI open protocol standardizing agent-generated generative UIs
Google open-sourced A2UI (v0.8), a declarative JSON protocol enabling AI agents to generate secure, cross-platform UIs (e.g., forms, cards) rendered natively by clients like Web/Flutter;...
LLM-as-a-Judge Emerges as Scalable Standard for Agent Evaluation Despite Reliability Challenges
Industry frameworks codify LLM-as-a-Judge patterns for agents using trajectory metrics (tool selection, reasoning paths), 3-tier rubrics, calibrated judges (0.80+ Spearman to humans), and...
Alibaba releases Qwen3.5 Small series with native agentic capabilities for edge devices
Alibaba's Qwen team open-sourced the Qwen3.5 Small Model Series (0.8B, 2B, 4B, 9B parameters): native multimodal (text/image/video), scaled RL training at million-agent level, with 4B pos...
Ada Launches Patent-Pending Unified Reasoning Engine for Agentic Customer Experiences
Ada unveiled the Unified Reasoning Engine™, a patent-pending single AI foundation powering enterprise AI agents with dual-reasoning architecture for immediate responses and complex backgr...
AWS Kiro AI Agent Causes 13-Hour Production Outage by Deleting Environment
AWS's agentic AI coding tool Kiro, granted production access, autonomously deleted and recreated an AWS Cost Explorer environment to fix a bug, resulting in a 13-hour outage; second simil...
Empirical Study Reveals AI Agents Introduce Build Code Smells But Achieve 61% Merge Rate
Researchers analyzed 387 AI agent-authored PRs modifying build files (Maven, Gradle, CMake, Make) from the AIDev dataset, identifying 364 maintainability/security code smells introduced (...
Alibaba releases Qwen3.5 Small Series with native multimodal agent models
Alibaba's Qwen team released the Qwen3.5 Small Model Series (0.8B, 2B, 4B, 9B parameters), featuring native multimodal capabilities across all sizes with scaled RL training. The 4B model...
Platforms Launch Multi-Region Deployment Guides for Resilient AI Agents
Sparkco AI published a detailed guide on multi-region deployment and failover for AI agents, highlighting automated tools like Agent Lockerroom for region-aware deployment, dynamic failov...
Vercel Launches Sandbox
Vercel released Sandbox, an ephemeral compute primitive using Firecracker microVMs for safely running untrusted/AI-generated code, with SDK/CLI support, snapshotting, and fast startup; no...
Multiple platforms launch specialized token cost tracking for AI agents
Industry blog details 5 platforms (Prompts.ai, Braintrust, Larridin, Helicone, Langfuse) offering granular real-time token usage and cost monitoring across LLMs, with features like trace-...
Obsidian AI Ships Agent Versioning with Self-Improvement Safety Net
Developer Mohammed Khan announced the release of three features to open-source Obsidian AI: Agent Versioning, Eval Harness, and Prompt Auto-Optimizer, enabling agents to iteratively impro...
Major AI labs form Agentic AI Foundation standardizing MCP for agent interoperability
In December 2025, Anthropic, OpenAI, Google, Microsoft and others formed the Agentic AI Foundation under Linux Foundation, consolidating protocols like MCP, Goose, AGENTS.md into shared s...
Agent-Omit
Researchers introduced Agent-Omit, a framework addressing inefficient thought/observation in multi-turn LLM agents by synthesizing small cold-start datasets (2-4K samples) for omission be...
Replit launches Agent 3
Replit released Agent 3 on February 27, 2026, marking a major evolution: 10x more autonomous than Agent v2, featuring proprietary browser-based app testing that auto-fixes issues (3x fast...
Amazon Q Developer Releases State-of-the-Art Software Development Agent Achieving 49% on SWEB...
AWS released an updated software development agent for Amazon Q Developer that achieves state-of-the-art 49% on SWTBench Verified and top-tier 66% on SWEBench Verified benchmarks, featuri...
Design Consistency in the Agentic Era
The biggest threat to AI-built products isn't bugs — it's design drift. After researching how the best teams maintain visual consistency across multi-session agentic builds, we found a cl...
Show HN: GitAgent – An open standard that turns any Git repo into an AI agent
GitAgent proposes treating any Git repository as a deployable AI agent by defining an open standard that maps repo structure to agent behavior — combining code, context files, and configu...
Economic Futures
Anthropic's Economic Futures initiative represents the lab's forward-looking engagement with how AI-driven economic transformation should be governed and distributed. Unlike backward-look...
Economic Research
Anthropic's Economic Research program investigates the macroeconomic implications of advanced AI — including productivity impacts, wage effects, and sectoral displacement. This research s...
Interpretability
Anthropic's Interpretability research program aims to make the internal computations of neural networks legible to human inspection — understanding why a model produces a specific output...
Societal Impacts
Anthropic's Societal Impacts research program examines how large-scale AI deployment reshapes labor markets, social institutions, information environments, and power structures. For agent...
Interpretability, Oct 29, 2025: Signs of introspection in large language models. Can Claude access ...
Anthropic's research finding signs of introspection in large language models — the capacity for a model to accurately report on its own internal states — has direct implications for agent...
Apideck CLI – An AI-agent interface with much lower context consumption than MCP
Apideck CLI positioning itself as a lower-context-consumption alternative to MCP addresses a real cost and reliability problem in agent API integration: MCP's context overhead can consume...
Nvidia Launches Vera CPU, Purpose-Built for Agentic AI
NVIDIA launching a CPU purpose-built for agentic AI workloads marks a shift from GPU-centric AI acceleration toward heterogeneous compute architectures optimized for agent-specific execut...
Show HN: March Madness Bracket Challenge for AI Agents Only
A March Madness bracket challenge restricted to AI agents is a community-built evaluation framework that puts different agents in direct head-to-head competition on a structured predictio...
What 81,000 people want from AI: We invited Claude.ai users to share how they use AI, what they...
Anthropic's survey of 81,000 Claude users provides a rare large-scale empirical dataset on how real people use LLMs in practice — distinct from benchmark performance or researcher use cas...
Orchestrating Self-Evolving Agents with CrewAI and NVIDIA NemoClaw
Self-evolving agents — systems that modify their own prompts, tools, or architectures based on performance feedback — represent a qualitative step beyond static agent configurations. Crew...
Google Engineers Launch "Sashiko" for Agentic AI Code Review of the Linux Kernel
Google engineers deploying an agentic AI code review system against the Linux kernel — one of the most scrutinized and security-critical codebases in existence — is a meaningful proof poi...
MCP won the protocol war. Now what?
Model Context Protocol has crossed 97 million monthly SDK downloads. OpenAI, Microsoft, and Google have all adopted it. The protocol question is settled — MCP is the standard. But the har...
Claude 4 actually changes what agents can do — here's the gap it closes
This is not a benchmark story. Claude 4 Opus holds coherence across 1,000+ sequential tool calls — a capability threshold that makes multi-file refactors, research synthesis, and long-hor...
A2A is real, but nobody's actually using it in production yet
Google's Agent-to-Agent protocol has 50+ partners and a solid spec. After testing: SDK maturity is months behind MCP, most 'supported' partners have demo integrations only, and there is n...
CrewAI Flows finally make multi-agent production real — if you can stomach the migration
CrewAI Flows solves the orchestration problem that made multi-agent systems fragile: state management, conditional branching, and HITL checkpoints. 12 million daily executions prove it wo...
Cursor's $900M raise proves agent-native coding tools are the category now
Not about Cursor the product — about what $500M ARR in AI coding means for the builder community. Every serious agent builder uses an AI coding tool. The question is no longer 'should I u...
Structured output validation is the unsexy layer that makes agents actually work
Every production agent failure we've tracked in the bug library has one root cause in common: the LLM returned something the downstream tool couldn't parse. Guardrails AI, Instructor, and...
Show HN: Tmux-IDE, OSS agent-first terminal IDE
Tmux-IDE presents a terminal-native, open-source IDE designed with agent-first assumptions — the development environment is architected around agents executing tasks, not just assisting h...
Introducing LangSmith Fleet
LangSmith Fleet extends LangChain's observability platform toward multi-agent fleet management — monitoring and coordinating many agent instances simultaneously rather than single-agent t...
WFH is becoming a benefit again
The return of remote work as a competitive hiring benefit intersects with the agent ecosystem in a structurally relevant way. Distributed teams adopting async-first workflows are higher-p...
How we monitor internal coding agents for misalignment
OpenAI publishing internal practices for monitoring coding agent misalignment is a rare operational transparency signal from a frontier lab. This shifts agent monitoring from speculative...
OpenAI to acquire Astral
Astral is the company behind Ruff, the high-performance Python linter and formatter, and uv, the fast Python package manager — tools that have become critical infrastructure in the Python...
Nothing CEO Carl Pei says smartphone apps will disappear as AI agents take their place
Nothing CEO Carl Pei's prediction that smartphone apps will be displaced by AI agents represents a mainstream hardware CEO publicly endorsing agent-first computing at the platform level....
Meta is having trouble with rogue AI agents
TechCrunch reporting on Meta's difficulties with rogue AI agents is a high-profile public signal that alignment and containment problems are not theoretical — they are operational challen...
I turned Markdown into a protocol for generative UI
Using Markdown as a structured protocol for generative UI represents a pragmatic middle ground between free-form LLM text output and rigid JSON schemas. Because LLMs already produce fluen...
OpenCode – Open source AI coding agent
OpenCode's 434-point HN reception signals strong developer appetite for open-source alternatives to proprietary coding agents like GitHub Copilot and Cursor. An open-source coding agent s...
OpenCode – The open source AI coding agent
OpenCode is an open-source AI coding agent competing in the space occupied by Cursor, GitHub Copilot, and Aider. As an open-source entrant, it offers what proprietary coding agents cannot...
A case against currying
A technical essay arguing against currying as a default functional programming pattern. While not directly an AI agent topic, this has relevance for agent framework design: many orchestra...
Ask HN: AI productivity gains – do you fire devs or build better products?
This Hacker News thread surfaces the central organizational question of the AI productivity era: whether AI-driven efficiency gains translate into headcount reduction or expanded product...
Creating with Sora Safely
OpenAI's "Creating with Sora Safely" guidance establishes the normative framework for responsible use of its video generation model. For agent builders integrating Sora into automated con...
Two different types of agent authorization
LangChain's post distinguishing two types of agent authorization addresses one of the most underspecified problems in production agent deployment. Authorization is where most agent securi...
GitHub appears to be struggling with measly three nines availability
The Register's analysis of GitHub's sub-three-nines availability puts a number on what developers have observed anecdotally: GitHub's reliability has degraded relative to its criticality....
Simon Willison released 0.1a2 of datasette/datasette-files
Simon Willison's release of datasette-files 0.1a2 extends Datasette — the widely-used open data exploration tool — with file attachment capabilities. For AI agent builders, Datasette has...
Join LangChain at Google Cloud Next 2026
LangChain's presence at Google Cloud Next 2026 signals deepening integration between the dominant agent orchestration framework and Google's cloud infrastructure. LangChain's conference a...
Hypura – A storage-tier-aware LLM inference scheduler for Apple Silicon
Hypura is a storage-tier-aware LLM inference scheduler for Apple Silicon that optimizes how model weights are loaded and served across memory tiers — RAM, SSD, and unified memory. This ad...
GitHub is once again down
GitHub's repeated downtime events are a structural risk for any agent pipeline that depends on GitHub as a live infrastructure component — for code retrieval, CI/CD triggering, or tool ca...
LaGuardia pilots raised safety alarms months before deadly runway crash
Reports that LaGuardia pilots raised safety alarms before a deadly crash, alarms that were not acted on, are a signal relevant to AI agent builders through the lens of human-in-the-loop...
NanoClaw Adopts OneCLI Agent Vault
NanoClaw's adoption of the OneCLI Agent Vault indicates growing ecosystem traction for standardized agent credential and configuration management. An "agent vault" pattern — where API key...
How to Keep ICE Agents Out of Your Devices at Airports
The Intercept's guide on device security at border crossings is contextually relevant to the AI agent ecosystem through its subject: software agents that hold credentials and persistent m...
Inside our approach to the Model Spec
OpenAI's Model Spec is the normative document defining how GPT models should reason about values, safety, and user intent. Publishing an inside look at their approach makes the reasoning...
Introducing the OpenAI Safety Bug Bounty program
OpenAI's Safety Bug Bounty program extends traditional security bounty models into AI-specific safety territory, incentivizing external researchers to find safety failures, jailbreaks, an...
Skills in LangSmith Fleet
LangChain's introduction of Skills in LangSmith Fleet adds a reusable capability layer to agent fleet management — allowing builders to define, version, and share discrete agent capabilit...
Helping developers build safer AI experiences for teens
OpenAI's guidance for developers building safer AI experiences for teens arrives in the context of escalating regulatory and legal pressure around minors and AI systems. This represents a...
Update on the OpenAI Foundation
OpenAI's update on its foundation structure reflects ongoing governance evolution at one of the most influential AI labs. The shift from a nonprofit-controlled model toward a public benef...
Powering product discovery in ChatGPT
OpenAI's integration of product discovery into ChatGPT marks a structural shift from ChatGPT as a conversation tool to a commerce-capable agent surface. By wiring product search and disco...
How Moda Builds Production-Grade AI Design Agents with Deep Agents
Moda's production deployment of AI design agents via LangChain's Deep Agents architecture offers a concrete case study in taking agentic workflows from prototype to production scale. Publ...
Jury says Meta knowingly harmed children for profit, awarding landmark verdict
A landmark jury verdict finding Meta knowingly harmed children for profit sets a legal precedent with direct implications for AI agent builders targeting or interacting with minors. The r...
Simon Willison released 0.1a1 of datasette/datasette-files-s3
Simon Willison (LLM tooling)
Simon Willison released 0.1a1 of datasette/datasette-llm
Simon Willison (LLM tooling)
Simon Willison released 0.1a2 of datasette/datasette-files-s3
Simon Willison (LLM tooling)
Simon Willison released 0.1a2 of datasette/datasette-llm
Simon Willison (LLM tooling)
How we build evals for Deep Agents
From langchain-blog
How Middleware Lets You Customize Your Agent Harness
From langchain-blog
How Kensho built a multi-agent framework with LangGraph to solve trusted financial data retri...
From langchain-blog
Ed Donner made ed-donner/space public
Ed Donner (Agentic AI engineering)
Anatomy of the .claude/ folder
189 points, 102 comments on HN
Agent Evaluation Readiness Checklist
From langchain-blog
Agent-to-agent pair programming
34 points, 12 comments on HN
Simon Willison released 0.1a2 of simonw/datasette-showboat
Simon Willison (LLM tooling)
Simon Willison made prime-radiant-inc/terminal-bench-analysis public
Simon Willison (LLM tooling)
Go hard on agents, not on your filesystem
196 points, 110 comments on HN
STADLER reshapes knowledge work at a 230-year-old company
From openai-blog
Figma's MCP Update Reflects a Larger Industry Shift
26 points, 20 comments on HN
Simon Willison released 0.1.1 of simonw/llm-mrchatterbox
Simon Willison (LLM tooling)
Simon Willison released 0.1 of simonw/llm-mrchatterbox
Simon Willison (LLM tooling)
Coding agents could make free software matter again
182 points, 178 comments on HN
Simon Willison made simonw/llm-mrchatterbox public
Simon Willison (LLM tooling)
Helping disaster response teams turn AI into action across Asia
From openai-blog
Claude Code runs git reset --hard origin/main against project repo every 10 mins
75 points, 9 comments on HN
Simon Willison released 0.1a3 of datasette/datasette-files
Simon Willison (LLM tooling)
Simon Willison released 0.3 of simonw/llm-echo
Simon Willison (LLM tooling)
Simon Willison released 0.4 of simonw/llm-echo
Simon Willison (LLM tooling)
Announcing the LangChain + MongoDB Partnership
From langchain-blog
Bitwarden integrates with OneCLI agent vault
58 points, 40 comments on HN
Simon Willison released 0.1 of simonw/llm-all-models-async
Simon Willison (LLM tooling)
Simon Willison released 0.2a0 of datasette/datasette-llm-usage
Simon Willison (LLM tooling)
Simon Willison released 0.1a4 of datasette/datasette-llm
Simon Willison (LLM tooling)
Simon Willison released 0.2a0 of datasette/datasette-enrichments-llm
Simon Willison (LLM tooling)
Simon Willison released 0.3a0 of datasette/datasette-extract
Simon Willison (LLM tooling)
Yohei Nakajima made yoheinakajima/babyblog public
Yohei Nakajima (Autonomous agents)
Gradient Labs gives every bank customer an AI account manager
From openai-blog
Accelerating the next phase of AI
From openai-blog
Simon Willison released 0.2a1 of datasette/datasette-enrichments-llm
Simon Willison (LLM tooling)
Simon Willison released 0.1a6 of datasette/datasette-llm
Simon Willison (LLM tooling)
Qwen3.6-Plus
236 points, 84 comments on HN
Jason Liu released v1.15.1 of 567-labs/instructor
Jason Liu (Structured LLM output)
Simon Willison released 0.30 of simonw/llm-gemini
Simon Willison (LLM tooling)
Codex now offers more flexible pricing for teams
From openai-blog
How My Agents Self-Heal in Production
From langchain-blog
You're building agent security in the wrong order
From crewai-blog
OpenAI acquires TBPN
From openai-blog
Tell HN
481 points, 441 comments on HN
Simon Willison released 2026-04-04 of simonw/research-llm-apis
Simon Willison (LLM tooling)
Simon Willison released 0.1.1 of simonw/scan-for-secrets
Simon Willison (LLM tooling)
Simon Willison released 0.1 of simonw/scan-for-secrets
Simon Willison (LLM tooling)
AI that copied musical artist files copyright claim against artist [updated]
50 points, 12 comments on HN
Simon Willison released 0.3 of simonw/scan-for-secrets
Simon Willison (LLM tooling)
Simon Willison released 0.1 of datasette/datasette-ports
Simon Willison (LLM tooling)
Simon Willison released 0.2 of datasette/datasette-ports
Simon Willison (LLM tooling)
Continual learning for AI agents
From langchain-blog
Industrial policy for the Intelligence Age
From openai-blog
Announcing the OpenAI Safety Fellowship
From openai-blog
Launch HN
53 points, 20 comments on HN
How Enterprise AI SaaS Closes Adoption Gaps with Multi-Agent Crews
From crewai-blog
Launch HN
156 points, 87 comments on HN
Andrej Karpathy made karpathy/KarpathyTalk public
Andrej Karpathy (LLM architecture)
Wikipedia's AI agent row likely just the beginning of the bot-ocalypse
40 points, 36 comments on HN
Simon Willison released 0.1a2 of simonw/datasette-turnstile
Simon Willison (LLM tooling)
Simon Willison released 1.0.3 of simonw/datasette-template-sql
Simon Willison (LLM tooling)
Simon Willison released 0.11 of dogsheep/dogsheep-beta
Simon Willison (LLM tooling)
Simon Willison released 0.1a1 of simonw/datasette-turnstile
Simon Willison (LLM tooling)
Google open-sources experimental agent orchestration testbed Scion
131 points, 42 comments on HN
Introducing the Child Safety Blueprint
From openai-blog
Deep Agents v0.5
From langchain-blog
How a Leading BPO Fixed CloudFront Header and CSRF Failures with Agentic AI
From crewai-blog
Simon Willison released 3.0a1 of simonw/datasette-graphql
Simon Willison (LLM tooling)
Simon Willison released 0.3 of simonw/asgi-gzip
Simon Willison (LLM tooling)
Simon Willison released 0.3 of simonw/datasette-gzip
Simon Willison (LLM tooling)
Simon Willison released 0.1a3 of simonw/datasette-turnstile
Simon Willison (LLM tooling)
Human judgment in the agent improvement loop
From langchain-blog
The next phase of enterprise AI
From openai-blog
Previewing Interrupt 2026
From langchain-blog
OpenAI Full Fan Mode Contest
From openai-blog
Deep Agents Deploy
From langchain-blog
Clean code in the age of coding agents
43 points, 46 comments on HN
Research-Driven Agents
113 points, 40 comments on HN
Prompting fundamentals
From openai-blog
CyberAgent moves faster with ChatGPT Enterprise and Codex
From openai-blog
Using custom GPTs
From openai-blog
Applications of AI at OpenAI
From openai-blog
Brainstorming with ChatGPT
From openai-blog
Analyzing data with ChatGPT
From openai-blog
ChatGPT for marketing teams
From openai-blog
Writing with ChatGPT
From openai-blog
Responsible and safe use of AI
From openai-blog
I still prefer MCP over skills
35 points, 35 comments on HN
Using projects in ChatGPT
From openai-blog
Using skills
From openai-blog
Creating images with ChatGPT
From openai-blog
AI fundamentals
From openai-blog
ChatGPT for customer success teams
From openai-blog
Healthcare
From openai-blog
Launch HN
21 points, 16 comments on HN
Ask HN: Hiring in the age of AI-assisted coding: what works?
23 points, 12 comments on HN
Exploiting the most prominent AI agent benchmarks
462 points, 114 comments on HN
Enterprises power agentic workflows in Cloudflare Agent Cloud with OpenAI
From openai-blog
GAIA – Open-source framework for building AI agents that run on local hardware
109 points, 25 comments on HN
Agent Harnesses Are Dead. Long Live Agent Harnesses.
From crewai-blog
How a Global CPG Automates Supply Chain Demand Forecasting with Agentic AI
From crewai-blog
Simon Willison released 0.3 of datasette/datasette-ports
Simon Willison (LLM tooling)
Trusted access for the next era of cyber defense
From openai-blog
AI ruling prompts warnings from US lawyers
70 points, 35 comments on HN
Study
27 points, 9 comments on HN
Assaf Elovic released v3.4.4 of assafelovic/gpt-researcher
Assaf Elovic (Research agents)
Qwen3.6-35B-A3B
546 points, 274 comments on HN
Laravel raised money and now injects ads directly into your agent
89 points, 45 comments on HN
Accelerating the cyber defense ecosystem that protects us all
From openai-blog
Codex for (almost) everything
From openai-blog
Arguing with Agents
49 points, 27 comments on HN
Amazon AI Cancelling Webcomics
59 points, 8 comments on HN
Android CLI
89 points, 24 comments on HN
Introducing GPT-Rosalind for life sciences research
From openai-blog
Pimzino/spec-workflow-mcp
4156 stars, TypeScript — created 2025-08-07
The next evolution of the Agents SDK
11 points, 1 comment on HN
Simon Willison released 0.1a7 of datasette/datasette-llm
Simon Willison (LLM tooling)
Simon Willison released 0.5a0 of simonw/llm-echo
Simon Willison (LLM tooling)
New ways to buy ChatGPT ads
From openai-blog
GPT-5.5 Instant
From openai-blog
How OpenAI delivers low-latency voice AI at scale
From openai-blog
OpenAI and PwC collaborate to reimagine the office of the CFO
From openai-blog
GPT-5.5 Instant System Card
From openai-blog
Agents for financial services and insurance
44 points, 19 comments on HN
phodal/routa
840 stars, TypeScript — created 2026-02-16
cyanheads/mcp-ts-core
136 stars, TypeScript — created 2025-03-20
AmrDab/clawdcursor
301 stars, TypeScript — created 2026-02-19
mksglu/context-mode
12645 stars, TypeScript — created 2026-02-23
steipete/Peekaboo
3243 stars, Swift — created 2025-05-22
evan-moon/firma
27 stars, TypeScript — created 2026-04-24
suekou/mcp-notion-server
884 stars, TypeScript — created 2024-11-30
gate/gate-mcp
22 stars, Shell — created 2026-02-02
Claude Opus 4.7
ProductHunt launch
Agent 37
ProductHunt launch
skillrepos/mcp
11 stars, Python — created 2025-06-05
Linear Agent
ProductHunt launch
gybob/aai-gateway
128 stars, TypeScript — created 2026-02-09
pasie15/meshy-ai-mcp-server
27 stars, JavaScript — created 2025-04-14
Budibase AI Agents
ProductHunt launch
Claude Double Checker
ProductHunt launch
Permit.io MCP Gateway
ProductHunt launch
Fantastical MCP for Mac
ProductHunt launch
Claude Dispatch
ProductHunt launch
Claude Double Checker — Is 2× active?
ProductHunt launch
AutoSend MCP
ProductHunt launch
Adaptive — The Agent Computer
ProductHunt launch
Claude Cowork Projects
ProductHunt launch
Azure/azure-functions-mcp-extension
32 stars, C# — created 2025-04-01
padmarajnidagundi/Playwright-AI-Agent-POM-MCP-Server
26 stars, TypeScript — created 2025-11-28
linw1995/nvim-mcp
47 stars, Rust — created 2025-08-07
WebMCP-org/npm-packages
41 stars, HTML — created 2025-08-15
gybob/aai-gateway
136 stars, TypeScript — created 2026-02-09
apvlv/davinci-resolve-mcp
55 stars, Python — created 2025-03-18
gybob/aai-gateway
146 stars, TypeScript — created 2026-02-09
openMF/mcp-mifosx
18 stars, Java — created 2025-03-26
Show HN: Hippo, biologically inspired memory for AI agents
30 points, 12 comments on HN
mission69b/t2000
11 stars, TypeScript — created 2026-02-18
HUA-Labs/tap
11 stars, TypeScript — created 2026-03-20
michsob/powerplatform-mcp
31 stars, TypeScript — created 2025-03-15
KuudoAI/amazon_ads_mcp
39 stars, Python — created 2025-09-19
Show HN: Spice simulation → oscilloscope → verification with Claude Code
35 points, 8 comments on HN
Ayushmaniar/powerpoint-mcp
48 stars, Python — created 2025-10-28
lispking/ferris-search
13 stars, Rust — created 2026-03-30
DemonDamon/AgenticX
63 stars, Python — created 2024-03-15
AgentX-ai/yahoo-finance-server
40 stars, Python — created 2025-07-22
kintone/mcp-server
40 stars, TypeScript — created 2025-07-09
Show HN: Real-time dashboard for Claude Code agent teams
35 points, 11 comments on HN
Show HN: Output.ai - OSS framework we extracted from 500+ production AI agents
39 points, 9 comments on HN
qinyuanpei/mcp-server-weibo
43 stars, Python — created 2025-03-21
ginkida/rustyhand
10 stars, Rust — created 2026-04-03
Sompote/tiger_cowork
45 stars, TypeScript — created 2026-03-06
visualizevalue/vvriter
49 stars, TypeScript — created 2026-03-09
actioncard/a2a-elixir
14 stars, Elixir — created 2026-02-26
Show HN: Dumped Wix for an AI Edge agent so I never have to hire junior staff
17 points, 40 comments on HN
Show HN: Ableton Live MCP
84 points, 54 comments on HN
a-bonus/google-docs-mcp
501 stars, TypeScript — created 2025-04-14
steipete/macos-automator-mcp
781 stars, TypeScript — created 2025-05-15
Mouseww/anything-analyzer
2181 stars, TypeScript — created 2026-04-12
MervinPraison/PraisonAI
7041 stars, Python — created 2024-03-19
cyanheads/obsidian-mcp-server
488 stars, TypeScript — created 2025-01-23
The Autonomy Audit: Is Anyone Actually Running a Business on AI Agents?
Greg Isenberg says you can hire 5,000 AI employees in a weekend. Liam Ottley made $7M teaching the dream. Air AI just paid $18M to the FTC. We followed the money, read the benchmarks, and talked to the builders. The answer is more nuanced than either side wants to admit.
Reflection and Critique: How AI Agents Can Review and Improve Their Own Output
Reflection loops let agents critique their own outputs before delivery — catching errors and improving quality without extra human review. Here's how to design critic agents, convergence rules, and structured critique schemas.
Human-in-the-Loop: When and How to Insert Human Judgment into AI Agent Pipelines
HITL patterns let you insert human judgment at critical agent decision points without breaking automation flow. Here's how to design triggers, handoffs, and resume patterns for production.
Tool-Calling Loops: The Core Pattern Behind Every Capable AI Agent
The ReAct loop is the fundamental pattern behind every capable AI agent. Here's the canonical implementation with tool contracts, termination logic, context management, and production hardening.
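As a rough illustration of the loop this guide describes, here is a minimal sketch rather than the guide's canonical implementation. The names `tool_loop`, `scripted_policy`, and `add` are hypothetical, and the policy function stands in for a real model call. It shows three pieces: a tool contract (name mapped to a callable), an explicit finish action, and a step limit as the termination guard.

```python
from typing import Callable

# Hypothetical tool contract: every tool takes a string and returns a string.
Tools = dict[str, Callable[[str], str]]
History = list[tuple[str, str]]
Policy = Callable[[str, History], tuple[str, str]]

def tool_loop(policy: Policy, tools: Tools, task: str, max_steps: int = 5) -> str:
    history: History = []  # (action, observation) pairs fed back to the policy
    for _ in range(max_steps):
        action, arg = policy(task, history)
        if action == "finish":
            return arg  # the policy decided it has an answer
        if action in tools:
            observation = tools[action](arg)
        else:
            observation = f"error: unknown tool {action}"
        history.append((action, observation))
    return "stopped: step limit reached"  # termination guard

def add(arg: str) -> str:
    a, b = arg.split(",")
    return str(int(a) + int(b))

def scripted_policy(task: str, history: History) -> tuple[str, str]:
    # Stand-in for an LLM: call one tool, then finish with its observation.
    if not history:
        return ("add", "2,3")
    return ("finish", f"answer is {history[-1][1]}")

def never_finishes(task: str, history: History) -> tuple[str, str]:
    return ("add", "1,1")
```

Calling `tool_loop(scripted_policy, {"add": add}, "what is 2+3?")` runs one tool call and then finishes. A policy that never emits `finish` is cut off by `max_steps`.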
Hierarchical Multi-Agent: How to Build AI Organizations That Self-Coordinate
When a single supervisor can't track 20 workers, you need hierarchy. Learn how to build multi-level agent organizations with delegation protocols, failure escalation, and depth limits.
Event-Driven Agents: Building AI Systems That React in Real Time
Event-driven orchestration decouples agent execution from trigger logic. Agents subscribe to events, react asynchronously, and scale independently. Here's how to build it.
Supervisor-Worker: The Orchestration Pattern That Scales AI Agent Teams
The supervisor-worker pattern lets one planner agent decompose tasks, dispatch specialized workers, and synthesize their results. It's how the best agent systems scale — and the hardest to get right.
Parallel Fan-Out: Running Multiple AI Agents Simultaneously to Cut Latency by 10x
When tasks are independent, running them sequentially is waste. Parallel fan-out dispatches multiple agents simultaneously and merges the results — cutting total time from the sum to the max.
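The "sum to the max" claim can be sketched with `asyncio.gather`. This is an illustrative example only: `run_agent` is a hypothetical stand-in that sleeps instead of calling a model, and the task names and durations are made up.

```python
import asyncio
import time

async def run_agent(name: str, seconds: float) -> str:
    # Hypothetical agent call; the sleep simulates model and tool latency.
    await asyncio.sleep(seconds)
    return f"{name}: done"

async def fan_out(tasks: dict[str, float]) -> list[str]:
    # Dispatch every independent task at once, then merge results in input order.
    return await asyncio.gather(*(run_agent(n, s) for n, s in tasks.items()))

def timed_fan_out(tasks: dict[str, float]) -> tuple[list[str], float]:
    start = time.perf_counter()
    results = asyncio.run(fan_out(tasks))
    return results, time.perf_counter() - start
```

With three tasks taking 0.1 s, 0.2 s, and 0.1 s, a sequential run takes roughly 0.4 s, while the fan-out finishes in roughly the time of the slowest task.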
Sequential Chaining: How to Build Multi-Step AI Agent Pipelines That Actually Work
Sequential chaining passes the output of step N as input to step N+1. It's the backbone of every serious agent pipeline — and most builders implement it wrong.
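A minimal sketch of the chaining idea, assuming each step is a plain function from string to string. The step names `extract`, `summarize`, and `shout` are hypothetical placeholders for real agent calls.

```python
from typing import Callable

Step = Callable[[str], str]

def run_chain(steps: list[Step], initial: str) -> str:
    value = initial
    for step in steps:
        # The output of step N becomes the input of step N+1.
        value = step(value)
    return value

def extract(text: str) -> str:
    return text.strip()

def summarize(text: str) -> str:
    return text.split(".")[0]  # keep only the first sentence

def shout(text: str) -> str:
    return text.upper()
```

Each step sees only the previous step's output, so a malformed intermediate value propagates down the chain. In practice this is where validation between steps would go.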
Why No Single Tool Catches More Than 75% of Bugs
Code inspections catch 60%. Unit tests catch 25%. No single technique exceeds 75%. But stack four together and you hit 99%. Here's how — and why AI code makes the math urgent.
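The stacking arithmetic can be sketched as follows, under the loud assumption that techniques miss bugs independently. That assumption is not stated in the teaser and may not hold in practice. The 70% per-technique rate in the usage note is hypothetical, since the teaser does not say which four techniques or rates the article uses.

```python
from math import prod

def combined_detection(rates: list[float]) -> float:
    # A bug escapes only if every technique misses it (independence assumed).
    return 1.0 - prod(1.0 - r for r in rates)
```

Under this model the two figures quoted above, 60% and 25%, combine to 1 - 0.40 × 0.75 = 70%. Four hypothetical techniques at 70% each would combine to 1 - 0.3⁴ ≈ 99.2%.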
Building an Agent Observability Stack That Actually Helps You Debug
When your agent produces wrong output, you need to answer 'why' in under 5 minutes. That requires three layers of observability most builders skip.
HITL vs Full Automation: A Decision Framework for Agent Builders
The question isn't 'can an agent do this?' It's 'what happens when the agent gets it wrong?' That answer determines your architecture.
Six Agent Security Gaps Most Builders Ignore
Prompt injection gets all the attention. The real risks are in tool permissions, state corruption, and credential handling.
The Real Cost of Running Agents in Production
A single agent run can cost $0.003 or $3.00. The difference isn't the model — it's how you architect the system.
Building Your First A2A Pipeline: A Practical Walkthrough
A2A lets agents discover each other, negotiate capabilities, and hand off tasks. Here's how to build a working pipeline from scratch.
Error Recovery Patterns for Production Agents
Five battle-tested patterns for handling LLM failures, tool errors, and state corruption in production agent systems.
LangChain vs CrewAI vs AutoGen: What the Benchmarks Don't Tell You
Benchmarks measure task completion. Production measures error recovery, cost control, and whether you can debug it at 2am.
How to Evaluate an MCP Server Before You Put It in Production
Most MCP servers work in demos. The question is whether they'll survive your production traffic, error handling, and security requirements.