The Agent Lexicon


The definitive reference for autonomous agent terminology. Every term explained for builders — what it means, why it matters, and how it connects to the tools you use.


A

Agent Card · Protocol

A machine-readable JSON metadata file (typically hosted at /.well-known/agent.json) that describes an agent's capabilities, supported protocols, authentication requirements, and operational constraints. Agent Cards enable capability discovery — allowing other agents and systems to understand what an agent can do before attempting to interact with it. Defined as part of the A2A protocol specification.
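
A minimal sketch of what an Agent Card might contain, written as a Python dict for illustration. The /.well-known/agent.json path comes from the definition above; the individual field names are illustrative assumptions rather than quotations from the A2A schema.

```python
import json

# Hypothetical Agent Card payload served at /.well-known/agent.json.
# Field names are illustrative; consult the A2A specification for the exact schema.
agent_card = {
    "name": "invoice-processor",
    "description": "Extracts line items from PDF invoices and posts them to the ledger.",
    "capabilities": ["invoice.extract", "ledger.post"],
    "protocols": ["a2a"],
    "auth": {"type": "oauth2", "scopes": ["ledger:write"]},
    "constraints": {"max_file_size_mb": 25, "rate_limit_per_minute": 60},
}

print(json.dumps(agent_card, indent=2))
```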

Agent Communication Protocol (ACP) · Protocol

A protocol for structured communication between agents and external systems, focusing on standardized message formats, capability negotiation, and session management. ACP defines how agents declare what they can do, how they exchange requests and responses, and how they handle errors and timeouts in multi-turn interactions.

Agent Marketplace · Ecosystem

A platform where pre-built agents, skills, or agent components can be discovered, evaluated, and deployed — analogous to app stores for mobile or package registries for developers. Agent marketplaces may offer: complete agents (ready to run), agent templates (customizable starting points), skills/plugins (capabilities that can be added to existing agents), or MCP servers (tool integrations). The marketplace model assumes agents become composable — you build your system by assembling proven components rather than building everything from scratch.

Agent Memory · Concept

The mechanism by which an agent persists and retrieves information across interactions — enabling it to learn from past conversations, maintain context over long tasks, and build knowledge over time. Memory systems vary in scope: working memory (current conversation context), short-term memory (recent interactions within a session), long-term memory (persistent across sessions), and episodic memory (specific past events). Implementation approaches include: conversation history (simplest), vector stores (semantic retrieval), structured databases (relational queries), and knowledge graphs (entity relationships).

Agent Orchestration · Concept

The practice of coordinating multiple AI agents (or a single agent with multiple tools) to accomplish complex tasks that exceed what any single agent can do alone. Orchestration encompasses: task decomposition (breaking a goal into subtasks), agent selection (choosing which agent or tool handles each subtask), execution sequencing (parallel vs sequential, dependencies), state management (tracking progress across agents), error handling (retries, fallbacks, escalation), and result aggregation (combining outputs into a coherent response). Common orchestration patterns include supervisor-worker, fan-out/fan-in, pipeline, and event-driven handoff.
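
A minimal fan-out/fan-in sketch of the supervisor-worker pattern described above, using Python's standard library. The worker function and subtasks are placeholders standing in for real agents.

```python
from concurrent.futures import ThreadPoolExecutor

# Fan-out / fan-in orchestration sketch: a supervisor decomposes a goal into
# subtasks, dispatches them to worker "agents" in parallel, and aggregates the results.
def research_worker(topic: str) -> str:
    return f"findings on {topic}"                                 # stand-in for a real agent

def aggregate(results: list[str]) -> str:
    return "report:\n- " + "\n- ".join(results)

def supervisor(goal: str) -> str:
    subtasks = ["market size", "competitors", "pricing"]          # task decomposition
    with ThreadPoolExecutor(max_workers=3) as pool:               # fan-out
        results = list(pool.map(research_worker, subtasks))
    return aggregate(results)                                     # fan-in (result aggregation)

print(supervisor("research the European market for voice agents"))
```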

Agent Registry · Ecosystem

A directory or catalog that lists available agents with their capabilities, interfaces, trust properties, and operational metadata. Unlike a marketplace (which facilitates transactions), a registry facilitates discovery — helping developers and other agents find the right agent for a task. Registries typically include: capability descriptions, supported protocols (MCP, A2A), authentication requirements, SLA information, and trust/quality scores. Agentifact itself functions as an agent registry for the developer ecosystem.

Agent Runtime · Concept

The execution environment that hosts an autonomous agent — managing the agent's event loop, tool execution, memory access, state persistence, and lifecycle across sessions. The runtime is the infrastructure layer between the model and the tools: it receives the model's tool call specifications, executes them, returns results, and maintains the conversation context across turns. Runtimes vary significantly in what they provide: some are minimal wrappers (a Python script with a while loop), others are full-featured systems with built-in persistence, retry logic, observability, horizontal scaling, and multi-agent coordination. Runtime choice determines reliability, debuggability, and operational cost more than model choice.

Agent Security · Security

The discipline of securing autonomous AI agents against adversarial attacks, data leakage, unauthorized actions, and unintended behavior. Agent security encompasses: input security (prompt injection defense), output security (preventing information leakage), action security (constraining tool access and permissions), data security (protecting sensitive information in context), and operational security (securing the agent's runtime, credentials, and deployment). Agent security is harder than traditional application security because the agent's behavior is non-deterministic and its attack surface includes natural language.

Agent Tracing · Observability

The practice of capturing a complete, structured record of an agent's execution — every model call, tool invocation, decision point, and intermediate result — as a trace that can be inspected, debugged, and analyzed. Agent tracing extends traditional application tracing (spans, events, durations) with AI-specific metadata: prompts, completions, token usage, tool call parameters, and decision reasoning. A trace lets you answer: why did the agent do X? where did it go wrong? how much did this task cost?

Agent-to-Agent Protocol (A2A) · Protocol

A communication protocol, developed by Google, that enables autonomous agents to discover, negotiate with, and delegate tasks to other agents — without human mediation. Unlike MCP (which connects models to tools), A2A connects agents to agents. It supports capability discovery via Agent Cards (JSON metadata at /.well-known/agent.json), task lifecycle management (submitted → working → completed/failed), streaming updates via SSE, and multi-turn conversations between agents. A2A is designed for asynchronous, long-running tasks where agents collaborate as peers.

Agentic Loop · Concept

The core execution cycle of an autonomous agent: observe → decide → act → observe result → decide next step. Unlike a single model call (prompt in, response out), an agentic loop runs indefinitely until the task is complete, an error threshold is reached, or a human intervenes. The loop is what makes an agent autonomous — it can pursue multi-step goals, recover from errors, and adapt its strategy based on intermediate results. Different frameworks implement the loop differently (event-driven, graph-based, simple while-loop), but the fundamental cycle is the same.
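
A minimal sketch of the loop as a plain Python function. The model and tool calls are stubs so the example runs on its own; a real runtime would replace them with an LLM API and a tool dispatcher, and would add error handling and persistence.

```python
# Agentic loop sketch: observe -> decide -> act -> observe result -> repeat.
def call_model(history):
    # Stand-in for an LLM call that returns the agent's next decision.
    # Returning "finish" immediately keeps the example terminating.
    return {"action": "finish", "result": "done"}

def run_tool(name, args):
    return f"ran {name} with {args}"          # stand-in for real tool execution

def agentic_loop(task, max_steps=10):
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):                                  # step budget / error threshold
        decision = call_model(history)                          # decide
        if decision["action"] == "finish":
            return decision["result"]                           # task complete
        observation = run_tool(decision["action"], decision.get("args", {}))  # act
        history.append({"role": "tool", "content": observation})              # observe result
    return "step budget exhausted; escalate to a human"

print(agentic_loop("summarize yesterday's error logs"))
```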

AIX (AI Experience) · Concept

A framework defined by Diana Wolosin (Indeed) that extends UX design to treat AI agents as first-class users of design systems. Its core principle: just as UX shapes how humans behave in a system, the structure of a design system shapes how AI behaves when generating interfaces, so better structure yields more consistent AI behavior. AIX defines a three-layer metadata architecture: WHAT (raw assets turned into structured, machine-readable metadata), HOW (implementation rules, prop types, accessibility, interaction states), and WHY (strategic intent, usage guidelines, decision rationale). It pairs this with a three-layer MCP configuration: Visual (Figma MCP), Implementation (Design System MCP), and Bridge (Code Connect). At Indeed, the approach has powered thousands of AI-generated prototypes with high component selection accuracy.

Autonomous Agent · Concept

An AI system that independently pursues goals by perceiving its environment, making decisions, and taking actions — without requiring human instruction at every step. Autonomy exists on a spectrum: from fully supervised (human approves every action) to fully autonomous (agent operates independently for extended periods). True autonomous agents maintain their own task queue, recover from errors, adapt their strategy based on results, and know when to escalate to humans. The key differentiator from a chatbot: an agent takes actions in the world, not just generates text.

B

Browser Agent · Agent Type

An AI agent that controls a web browser programmatically — navigating pages, clicking elements, filling forms, extracting data, and completing web-based tasks. Browser agents use browser automation protocols (CDP, Playwright, Puppeteer) or computer use APIs to interact with websites as a human would. They bridge the gap between API-first automation (fast but requires integration work) and manual web tasks (flexible but slow). Browser agents are essential for interacting with services that don't provide APIs or for tasks that require visual understanding of web interfaces.

C

Capability Discovery · Ecosystem

The process by which one agent or system determines what another agent or tool can do at runtime. Capability discovery enables dynamic, composable agent systems where agents aren't hardcoded to specific tools — they discover available capabilities and select the appropriate ones for each task. Discovery mechanisms include: Agent Cards (A2A), tool listings (MCP), OpenAPI specs, and registry queries. Effective capability discovery requires: standardized capability descriptions, semantic matching (understanding what a tool does, not just its name), and trust metadata (is this tool reliable?).

Chain of Thought (CoT) · Pattern

A prompting technique where the model is encouraged to break down complex reasoning into intermediate steps before producing a final answer. In the agent context, CoT is the 'reason' phase of the ReAct loop — the model explicitly articulates its plan before taking action. Extended thinking or 'thinking tokens' are the model-native implementation of CoT, where the model uses dedicated tokens to reason internally before generating visible output.

Code Agent · Agent Type

An autonomous AI agent specialized in software development tasks — reading codebases, writing code, running tests, debugging failures, and managing development workflows. Code agents operate in a development environment (IDE, terminal, or CI/CD pipeline) and use tools like file read/write, shell execution, code search, and version control. They range from single-task assistants (code completion, bug fixing) to fully autonomous developers that can take a specification and produce working software with tests.

Cognitive Debt · Concept

The gap between what code does and what developers understand about the code. Coined by Margaret-Anne Storey (February 2026). Unlike technical debt, which lives in the codebase, cognitive debt lives in the developer's head. Even if AI agents produce code that could be easy to understand, the humans involved may have simply lost the plot — they may not understand what the program is supposed to do, how their intentions were implemented, or how to change it. Cognitive debt compounds silently: each AI-generated session adds working code that the developer didn't write and may not fully comprehend. Published as part of a convergent moment where five independent groups identified the same structural problem in one week (February 15-21, 2026).

Composability · Concept

The design principle that agent systems should be built from modular, interchangeable components that can be combined in different ways to solve different problems. A composable agent system separates: the model (reasoning engine), the runtime (execution environment), the tools (actions), the memory (state), and the orchestration (coordination). Each component can be swapped, upgraded, or scaled independently. Composability is what enables the agent ecosystem — standardized interfaces (MCP, A2A) allow components from different providers to work together.

Computer Use Agent · Agent Type

An AI agent that interacts with a computer's graphical user interface — taking screenshots, moving the mouse, clicking buttons, and typing text — to accomplish tasks that would normally require a human operating the desktop. Computer use agents work at the pixel level rather than the DOM level (unlike browser agents), making them capable of operating any application: desktop software, web apps, system settings, and custom interfaces. They represent the most general form of digital automation.

Context Window · Concept

The maximum number of tokens (input + output) that a language model can process in a single interaction. The context window determines how much information an agent can 'see' at once — including the system prompt, conversation history, retrieved documents, tool call results, and the response being generated. Context windows range from 4K tokens (early GPT-3.5) to 1M+ tokens (Gemini 1.5 Pro, Claude). Longer context windows enable agents to handle more complex tasks without information loss, but attention quality can degrade with very long contexts (the 'lost in the middle' problem).

D

Design Drift · Concept

The gradual accumulation of visual and structural inconsistencies across a product's UI when multiple agents, sessions, or developers make locally reasonable design decisions that collectively diverge from the intended design system. In the agentic coding era, design drift is the visual equivalent of technical debt — each AI coding session produces working code fast, but without shared contracts the result is 5 different filter patterns, 10+ heading sizes, 8 max-widths, 3 breadcrumb formats, inconsistent sidebars, and pages that feel like different sites. Design drift is the primary failure mode when building UIs across multiple agent sessions without a machine-readable design system.

Design Tokens · Concept

Named, platform-agnostic values that encode design decisions — colors, spacing, typography, shadows, motion — as structured data instead of hardcoded values. The W3C Design Tokens Community Group published the first stable specification (2025.10) on October 28, 2025, defining a vendor-neutral JSON format with $value, $type, $description properties. Design tokens follow a three-layer hierarchy: Option Tokens (WHAT — available palette), Decision Tokens (HOW — contextual application, e.g., 'grey-900 is default text color'), and Component Tokens (WHERE — specific UI mappings, e.g., button.primary.background). Tokens are the atomic unit of design system consistency.
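
A small sketch of the three-layer hierarchy in the W3C JSON format described above ($value, $type, $description), written as a Python dict for illustration. The token names, values, and alias references are assumed examples, not excerpts from any real design system.

```python
import json

# Option -> decision -> component token layering in W3C-style JSON.
tokens = {
    "grey-900": {                      # option token: WHAT is available
        "$type": "color",
        "$value": "#1a1a1a",
    },
    "color-text-default": {            # decision token: HOW it is applied
        "$type": "color",
        "$value": "{grey-900}",        # alias to the option token
        "$description": "Default text color across the product.",
    },
    "button.primary.background": {     # component token: WHERE it is used
        "$type": "color",
        "$value": "{color-brand-primary}",
    },
}

print(json.dumps(tokens, indent=2))
```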

E

Embedding · Infrastructure

A dense numerical vector representation of text (or other data) that captures semantic meaning in a high-dimensional space. Semantically similar texts produce vectors that are close together (measured by cosine similarity or dot product). Embeddings are generated by specialized models (OpenAI text-embedding-3, Cohere embed, open-source models like nomic-embed) and are the foundation of semantic search, RAG, clustering, and classification in agent systems. Embedding quality — how well the vectors capture the nuances of your specific domain — directly determines the quality of downstream tasks.
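
A minimal sketch of the core operation behind semantic search: ranking stored vectors by cosine similarity to a query vector. The toy three-dimensional vectors stand in for real embeddings, which have hundreds or thousands of dimensions and come from an embedding model.

```python
import numpy as np

def cosine_similarity(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query_vec = [0.1, 0.8, 0.3]                       # embedding of the user's query
doc_vecs = {                                      # embeddings of stored documents
    "refund policy": [0.1, 0.7, 0.4],
    "api changelog": [0.9, 0.1, 0.0],
}

# Rank documents by semantic similarity to the query.
for name, vec in sorted(doc_vecs.items(),
                        key=lambda kv: cosine_similarity(query_vec, kv[1]),
                        reverse=True):
    print(name, round(cosine_similarity(query_vec, vec), 3))
```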

Evaluation (Evals) · Observability

The practice of systematically measuring an AI agent's performance against defined criteria using test datasets, automated metrics, and/or human judgment. Evaluations cover: task completion (did the agent accomplish the goal?), quality (how good was the output?), efficiency (how many steps/tokens did it take?), safety (did it violate any policies?), and reliability (does it produce consistent results?). Evaluation is essential for: comparing model versions, measuring prompt changes, catching regressions, and building confidence before deployment.

F

Fine-Tuning · Infrastructure

The process of further training a pre-trained language model on a specific dataset to improve its performance on targeted tasks. Fine-tuning adjusts the model's weights to better handle domain-specific language, follow particular output formats, or exhibit desired behaviors. In the agent context, fine-tuning can improve: tool selection accuracy, output format compliance, domain knowledge, and response style. Methods range from full fine-tuning (expensive, updates all weights) to parameter-efficient methods like LoRA (cheap, updates a small adapter layer).

Function Calling · Concept

A model capability that allows an AI to generate structured requests to invoke predefined functions — outputting a JSON object specifying the function name and arguments rather than freeform text. Function calling is the implementation mechanism underlying tool use: the developer provides a schema of available functions (name, description, parameter types, required fields), and the model decides when to call a function and with what arguments based on the conversation context. The model does not execute the function — it generates the call specification, which the host application uses to dispatch the actual function and return the result. Modern implementations support parallel function calls (multiple functions in a single turn), forced function use (require the model to use a specific function), and streaming responses.
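
A sketch of the round trip: the developer supplies a schema, the model emits a call specification, and the host application dispatches it. The schema, stub function, and hard-coded model output are illustrative assumptions standing in for a real model API.

```python
import json

weather_schema = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city: str) -> str:
    return f"18°C and overcast in {city}"        # stub implementation

# What a model might emit after seeing the schema and the conversation:
# a structured call specification, not freeform text.
model_output = {"name": "get_weather", "arguments": json.dumps({"city": "Berlin"})}

# The host application, not the model, executes the call and returns the result.
registry = {"get_weather": get_weather}
args = json.loads(model_output["arguments"])
result = registry[model_output["name"]](**args)
print(result)   # fed back to the model as a tool result on the next turn
```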

G

Grounding · Concept

The practice of anchoring an agent's outputs to verifiable, authoritative sources of information — retrieved at inference time rather than encoded in model weights. Grounding reduces hallucination by giving the model specific, current, and contextually relevant evidence to reason from. The primary grounding mechanism is retrieval-augmented generation (RAG): before responding, the agent retrieves relevant documents from a knowledge base or the live web and injects them into its context. Grounding can also be achieved through tool use (querying a database for current values), structured data access (reading from a live API), or verification steps (cross-checking generated claims against retrieved sources).

Guardrails · Pattern

Safety mechanisms applied to agent systems to constrain behavior, validate outputs, detect policy violations, and prevent harmful actions. Guardrails operate at multiple layers: input guardrails screen what enters the agent (prompt injection detection, PII redaction, topic filtering); output guardrails validate what the agent produces (factual grounding checks, toxicity detection, format validation); action guardrails constrain what the agent can do (permission scoping, rate limiting, blast-radius caps, irreversibility checks). Guardrails can be implemented as blocking filters (halt if violated), soft warnings (log and continue), or human escalation triggers (pause and request approval).
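
A minimal sketch of the three guardrail layers as plain Python checks. The patterns, allow-list, and policies are deliberately simple placeholders; production guardrails use classifiers, policy engines, and richer context.

```python
import re

INJECTION_PATTERNS = [r"ignore (all )?previous instructions"]   # toy policy

def input_guardrail(text: str) -> None:
    # Blocking filter: halt if the input looks like a prompt injection.
    if any(re.search(p, text, re.IGNORECASE) for p in INJECTION_PATTERNS):
        raise ValueError("blocked: possible prompt injection")

def action_guardrail(tool: str) -> None:
    # Permission scoping: only allow-listed tools may be called.
    allowed = {"search_docs", "read_ticket"}
    if tool not in allowed:
        raise PermissionError(f"blocked: {tool} is not in the allow-list")

def output_guardrail(text: str) -> str:
    # Soft warning: log and continue rather than block.
    if "ssn" in text.lower():
        print("warning: output may contain sensitive data")
    return text

input_guardrail("Summarize this ticket for me")
action_guardrail("search_docs")
print(output_guardrail("The refund policy allows returns within 30 days."))
```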

H

Hallucination · Concept

When an AI model generates statements that are fluent and confident but factually incorrect, unsupported by the provided context, or entirely fabricated. In agent systems, hallucinations are especially dangerous because agents act on their outputs — a hallucinated API endpoint gets called, a hallucinated data point gets written to a database, a hallucinated recommendation gets sent to a user. Hallucination types: factual (wrong facts), citation (fake sources), logical (invalid reasoning), and confabulation (filling gaps with plausible fiction).

Human-in-the-Loop (HITL) · Pattern

A design pattern where a human operator reviews, approves, or corrects an autonomous agent's actions at critical decision points before they are executed. HITL is the safety valve between full automation and full manual control. It can be implemented as: approval gates (agent proposes, human approves), escalation triggers (agent handles routine cases, escalates edge cases), correction loops (agent acts, human reviews and corrects), or oversight dashboards (human monitors agent activity in real-time). The key design decision is where to place HITL checkpoints — too many kills throughput, too few risks catastrophic errors.
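
A minimal approval-gate sketch: the agent proposes an action and a human approves or rejects it before execution. input() stands in for a real review UI or approval queue, and the proposed action is a placeholder.

```python
def propose_action():
    # Stand-in for the agent's next proposed tool call.
    return {"tool": "send_refund", "args": {"order_id": "A-1042", "amount_eur": 49.0}}

def execute(action):
    print(f"executed {action['tool']} with {action['args']}")

action = propose_action()
print(f"Agent proposes: {action}")
if input("Approve? [y/N] ").strip().lower() == "y":
    execute(action)                                   # human approved
else:
    print("Action rejected; agent escalates or re-plans.")
```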

I

Inference · Infrastructure

The process of generating predictions or outputs from a trained model — in the agent context, the act of sending a prompt to a language model and receiving a response. Inference is the primary cost and latency driver in agent systems. Each agent loop iteration requires at least one inference call, and complex agents may make dozens per task. Inference can be served by: cloud APIs (OpenAI, Anthropic, Google), self-hosted models (vLLM, TGI, Ollama), or edge deployment (on-device models). The key metrics are: time-to-first-token (TTFT), tokens-per-second throughput, and cost-per-token.

K

Knowledge Graph · Infrastructure

A structured representation of entities and their relationships, stored as a graph of nodes (entities) and edges (relationships). In agent systems, knowledge graphs complement vector databases by providing structured, queryable relationships that pure semantic search cannot capture. While a vector DB answers 'what documents are similar to this query?', a knowledge graph answers 'what entities are connected to this entity, and how?' Knowledge graphs are especially valuable for agent systems that need to reason about complex domains with many interrelated entities.

L

LLM Observability · Observability

The broader practice of monitoring, measuring, and understanding the behavior of systems that use language models — encompassing agent tracing, performance monitoring (latency, throughput, error rates), cost tracking, quality measurement (accuracy, relevance, safety), and drift detection (performance degradation over time). LLM observability platforms provide dashboards, alerting, and analytics specifically designed for AI systems, complementing traditional APM tools.

M

Model Context Protocol (MCP) · Protocol

An open protocol, originally developed by Anthropic, that standardizes how AI models connect to external data sources and tools. MCP provides a universal interface — similar to USB-C for hardware — allowing any AI model to call any tool through a consistent request/response format. It replaces brittle, per-tool API integrations with a single protocol layer. MCP servers expose capabilities (tools, resources, prompts) that MCP clients (AI models, IDEs, agent frameworks) can discover and invoke at runtime.
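
A minimal server sketch using the FastMCP helper from the MCP Python SDK, exposing one tool that a client could discover and invoke. The import path and decorator reflect the SDK's documented interface but should be verified against its current release; the tool itself is a stub.

```python
# pip install mcp  (verify the package and API against the SDK's current docs)
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("notes")

@mcp.tool()
def search_notes(query: str) -> str:
    """Search the user's notes and return matching snippets."""
    return f"3 notes mention '{query}'"   # stub; a real server would query storage

if __name__ == "__main__":
    mcp.run()   # serves the tool (stdio by default) so an MCP client can discover and call it
```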

Model Serving · Infrastructure

The infrastructure layer responsible for hosting trained models and serving inference requests at production scale. Model serving systems handle: request routing, batching (combining multiple requests for GPU efficiency), auto-scaling (adjusting compute based on demand), model loading/unloading, version management, and health monitoring. For agent systems that self-host models, the serving layer determines throughput, latency, and cost efficiency. Key technologies: vLLM (optimized LLM serving), TGI (Hugging Face), Triton (NVIDIA), and managed platforms (Together, Replicate, Modal).

Multi-Agent System · Concept

An architecture in which multiple autonomous AI agents collaborate to accomplish tasks that exceed the capability, context window, or specialization of any single agent. Agents in a multi-agent system may run in parallel (fan-out), sequentially in a pipeline, hierarchically (supervisor dispatching to workers), or as peers negotiating via a protocol like A2A. Each agent typically has a specialized role, a bounded context, and a defined interface with other agents. Coordination mechanisms include: shared memory/state stores, message passing via queues, event-driven triggers, and explicit orchestration by a supervisor agent. Multi-agent systems are most valuable when tasks are naturally decomposable, require diverse specialized knowledge, or exceed a single context window.

O

OpenAPI Specification · Protocol

A standard for describing REST APIs in a machine-readable format (JSON or YAML). In the agent context, OpenAPI specs serve as the bridge between traditional APIs and agent tool use — agents can parse an OpenAPI spec to understand available endpoints, parameters, and response formats, then invoke them as tools. Many MCP servers are auto-generated from OpenAPI specs.

P

PII Detection & Redaction · Security

The process of identifying and masking personally identifiable information (names, emails, phone numbers, addresses, SSNs, financial data) in text before it is processed by an AI model or stored in logs. PII detection is a critical compliance requirement (GDPR, CCPA, HIPAA) and a trust requirement for agent systems that handle user data. Detection methods include: regex patterns, named entity recognition (NER), and purpose-built classification models.
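
A sketch of the simplest detection method listed above: regex-based redaction. The patterns are illustrative and intentionally narrow; real systems layer NER and classification models on top, since regexes miss names and context-dependent identifiers.

```python
import re

PATTERNS = {
    "EMAIL": r"[\w.+-]+@[\w-]+\.[\w.-]+",
    "SSN": r"\b\d{3}-\d{2}-\d{4}\b",
    "PHONE": r"\+?\d[\d\s().-]{7,}\d",
}

def redact(text: str) -> str:
    # Replace each detected span with a category placeholder before logging
    # or sending the text to a model.
    for label, pattern in PATTERNS.items():
        text = re.sub(pattern, f"[{label}]", text)
    return text

print(redact("Contact Jane at jane.doe@example.com or +1 (555) 010-2330, SSN 123-45-6789."))
```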

Prompt Engineering · Concept

The practice of designing, testing, and iterating on the instructions given to language models to achieve reliable, high-quality outputs. In the agent context, prompt engineering encompasses: system prompt design (defining the agent's role, capabilities, and constraints), tool descriptions (explaining when and how to use each tool), few-shot examples (demonstrating expected behavior), and dynamic prompt assembly (constructing prompts from templates, retrieved context, and conversation history). Effective prompt engineering is empirical — it requires systematic testing against representative inputs, not just intuition.

Prompt Injection · Security

An attack where malicious instructions are embedded in user input, retrieved documents, or tool results — attempting to override the agent's system prompt and redirect its behavior. Prompt injection is the most critical security vulnerability in agent systems because agents act on their instructions: a successful injection can cause the agent to exfiltrate data, call unauthorized tools, or produce harmful outputs. Direct injection embeds instructions in user messages; indirect injection hides instructions in data the agent retrieves (web pages, documents, API responses).

R

ReAct (Reason + Act) · Pattern

An agent architecture pattern where the model alternates between reasoning (thinking about what to do) and acting (calling tools or taking actions). In each iteration, the agent: (1) observes the current state, (2) reasons about what to do next, (3) selects and executes an action, (4) observes the result, and repeats. ReAct is the most common agent loop pattern and the foundation of most production agents. It outperforms pure chain-of-thought (reasoning only) and pure action (acting without reasoning) approaches.

Red Teaming · Security

A structured adversarial testing process where testers deliberately attempt to break, mislead, or exploit an AI agent — testing its robustness against prompt injection, jailbreaks, social engineering, tool misuse, and edge cases. Red teaming goes beyond automated evaluation by simulating realistic attack scenarios that exploit the interaction between the model, its tools, and the deployment context. It is a critical step before deploying any agent that handles sensitive data, makes consequential decisions, or interacts with external systems.

Retrieval-Augmented Generation (RAG) · Pattern

An architecture that combines information retrieval with language model generation. Before generating a response, the system retrieves relevant documents from a knowledge base (using semantic search, keyword search, or hybrid approaches) and includes them in the model's context. RAG addresses the fundamental limitation of parametric knowledge — models can only know what was in their training data — by providing real-time access to current, domain-specific, or private information. RAG pipelines typically involve: query formulation, document retrieval, re-ranking, context assembly, and grounded generation.
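
A compact end-to-end sketch of the pipeline: retrieve, assemble context, generate. Keyword-overlap scoring stands in for semantic retrieval, and generate() is a stub for the model call; both are assumptions made so the example runs on its own.

```python
DOCS = {
    "returns": "Items can be returned within 30 days with a receipt.",
    "shipping": "Standard shipping takes 3-5 business days.",
    "warranty": "Hardware carries a 12-month limited warranty.",
}

def retrieve(query: str, k: int = 2):
    # Toy keyword-overlap retrieval; real systems use embeddings or hybrid search.
    def score(text):
        return len(set(query.lower().split()) & set(text.lower().split()))
    return sorted(DOCS.values(), key=score, reverse=True)[:k]

def generate(prompt: str) -> str:
    return f"(model answer grounded in: {prompt[:80]}...)"   # stub for an LLM call

query = "How many days do I have to return an item?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(generate(prompt))
```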

S

Safety Evaluation · Security

Systematic testing and measurement of an AI agent's behavior against safety criteria — covering factual accuracy, harmful content avoidance, instruction following, bias detection, and robustness to adversarial inputs. Safety evaluation combines automated metrics (hallucination rate, toxicity scores, compliance with safety policies) with human evaluation (red teaming, user studies). It is distinct from capability evaluation (which measures how well the agent performs tasks) — an agent can be highly capable but unsafe.

Sandbox Environment · Infrastructure

An isolated execution environment where an agent's code and tool calls run with restricted permissions, limited network access, and no ability to affect the host system or production data. Sandboxes are critical for agent safety because agents execute arbitrary tool calls — a bug or prompt injection could otherwise lead to data loss, credential exposure, or system damage. Sandbox implementations range from lightweight containers (Docker) to specialized agent sandboxes (E2B, Modal) that provide on-demand compute with automatic cleanup.

Semantic Token Gap · Concept

The failure point in AI-generated UIs where primitive design tokens (blue-500, spacing-4, font-lg) undermine AI effectiveness because they carry no contextual meaning, while semantic tokens (color-button-background-brand, spacing-section-padding, font-heading-page) enable accurate generation by encoding intent. Identified as the largest design consistency failure point in Figma-to-code pipelines — not a limitation of tools but of upstream design system structure. The gap exists when a design system defines options (palette values) but not decisions (what those values mean in context). AI tools replicate whatever token structure exists; if the structure is primitive, the output is inconsistent.

Spec-Driven Development (SDD) · Pattern

A software engineering methodology where well-crafted specifications precede code generation, serving as structured prompts for AI coding agents. It is the 2026 industry consensus response to 'vibe coding' (generating code from informal natural language without specifications). SDD principles: collaborate with AI to create clear specifications before any code is written, break work into small testable increments, have the AI generate unit tests for its own code, and enforce design documentation even for AI-generated components. GitHub's Spec Kit is the primary open-source toolkit, and Thoughtworks and InfoQ promote SDD as the 2026 standard. It is distinct from but complementary to context engineering (SDD covers the WHAT, context engineering the HOW).

Structured Output · Concept

A model capability that constrains generation to match a specific schema — JSON, XML, or custom format — rather than freeform text. Structured output ensures that model responses are machine-parseable, type-safe, and schema-compliant. Implementations include: JSON mode (constrain to valid JSON), schema mode (constrain to a specific JSON schema), and grammar-based decoding (constrain to a formal grammar). Structured output is essential for agents because every tool call, database write, and API interaction requires predictable data formats.
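
A small sketch of schema mode from the consumer's side: validate the model's JSON response against a schema before acting on it. The schema, sample response, and use of the jsonschema library are illustrative choices, not a prescribed implementation.

```python
import json
import jsonschema   # pip install jsonschema

schema = {
    "type": "object",
    "properties": {
        "intent": {"type": "string", "enum": ["refund", "exchange", "question"]},
        "order_id": {"type": "string"},
        "priority": {"type": "integer", "minimum": 1, "maximum": 5},
    },
    "required": ["intent", "order_id"],
}

# Stand-in for a model response produced under JSON/schema mode.
model_response = '{"intent": "refund", "order_id": "A-1042", "priority": 2}'

data = json.loads(model_response)    # fails loudly on invalid JSON
jsonschema.validate(data, schema)    # fails loudly on schema violations
print("validated:", data)
```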

T

Token Economics · Concept

The cost structure of operating language model-based agents, measured in tokens consumed (input and output) per task. Token economics encompasses: prompt cost (system prompt + context per call), reasoning cost (chain-of-thought tokens), tool interaction cost (tool call + result tokens per loop iteration), and total task cost (all tokens across all iterations). Understanding token economics is essential for making agents commercially viable — the difference between a well-optimized and poorly-optimized agent can be 10-100x in per-task cost.
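
A back-of-envelope cost sketch for a single task. The per-token prices, token counts, and iteration count are placeholder assumptions; substitute your model's actual rates and measured usage.

```python
PRICE_PER_1K_INPUT = 0.003    # USD per 1,000 input tokens (assumed)
PRICE_PER_1K_OUTPUT = 0.015   # USD per 1,000 output tokens (assumed)

system_prompt_tokens = 1_200
context_tokens_per_call = 4_000
output_tokens_per_call = 600
loop_iterations = 8           # one model call per agent loop iteration

input_total = (system_prompt_tokens + context_tokens_per_call) * loop_iterations
output_total = output_tokens_per_call * loop_iterations
cost = (input_total / 1000) * PRICE_PER_1K_INPUT + (output_total / 1000) * PRICE_PER_1K_OUTPUT

print(f"input tokens: {input_total:,}, output tokens: {output_total:,}")
print(f"estimated cost per task: ${cost:.4f}")
```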

Tool Routing · Pattern

The mechanism by which an agent system selects which tool(s) to use for a given request — from a potentially large set of available tools. Tool routing addresses the practical limitation that models degrade in tool selection accuracy when given too many options (typically above 20 tools). Routing strategies include: semantic routing (match request to tool descriptions via embeddings), category-based routing (narrow to a tool category first, then select within it), hierarchical routing (a routing agent selects a specialist agent), and dynamic tool loading (load only relevant tools into context per request).
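
A minimal semantic-routing sketch: score each tool description against the request and load only the top matches into context. Bag-of-words cosine similarity is a deliberately simple stand-in for embedding-based matching, and the tool set is illustrative.

```python
from collections import Counter
import math

TOOLS = {
    "create_invoice": "Create and send an invoice to a customer",
    "search_tickets": "Search support tickets by keyword or customer",
    "run_sql": "Run a read-only SQL query against the analytics warehouse",
}

def bow_cosine(a: str, b: str) -> float:
    # Bag-of-words cosine similarity; swap in embedding vectors for real routing.
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def route(request: str, top_k: int = 1):
    ranked = sorted(TOOLS, key=lambda t: bow_cosine(request, TOOLS[t]), reverse=True)
    return ranked[:top_k]   # only these tools get loaded into the model's context

print(route("find the support ticket the customer opened about billing"))
```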

Tool Use · Concept

The ability of an AI model to invoke external functions, APIs, or services during generation — pausing its response to call a tool, receive the result, and continue reasoning with that data. Tool use is the mechanism that transforms a language model from a static text generator into an agent that can take actions in the world. It is implemented via structured function-calling schemas (JSON describing available tools, their parameters, and return types) that the model uses to decide when and how to invoke a tool. The model does not execute the tool itself — it generates a structured call specification that the host application executes on its behalf.

Trust Score · Ecosystem

A quantitative assessment of an agent tool's reliability, security, interoperability, and documentation quality — aggregated into a single composite score that helps developers evaluate and compare tools. Trust scores are typically computed from multiple dimensions: community adoption (stars, downloads, contributors), security posture (vulnerability history, auth practices), documentation quality (completeness, accuracy, examples), interoperability (protocol support, standard compliance), and operational track record (uptime, breaking changes). Agentifact computes trust scores across 5 dimensions weighted by importance to production agent builders.

V

Vector Database · Infrastructure

A database optimized for storing, indexing, and querying high-dimensional vector embeddings — numerical representations of text, images, or other data that capture semantic meaning. Vector databases enable similarity search: given a query vector, find the most similar stored vectors efficiently using algorithms like HNSW, IVF, or product quantization. In the agent context, vector databases power RAG systems (storing document embeddings for retrieval), memory systems (storing and recalling past interactions), and recommendation engines.

Verification Debt · Concept

The growing gap between how fast code can be generated and how fast it can be validated. Defined by Lars Janssen (March 2026). Unlike technical debt, which usually announces itself through mounting friction, verification debt breeds false confidence — the code works, the demos look good, but nobody has verified the implementation against requirements, edge cases, or design specifications. Survey finding: 96% of developers don't fully trust AI-generated code to be functionally correct, but only 48% say they always check it before committing. The remaining 52% are accumulating verification debt with every commit.

Visual Regression Testing (VRT) · Pattern

Automated testing that detects unintended visual changes by comparing screenshots of UI states across code changes. Captures a 'baseline' screenshot, then after code changes captures a 'current' screenshot and diffs the two — flagging pixel-level or semantic differences. Three tiers: (1) Playwright built-in (free, pixelmatch, toHaveScreenshot() API, baselines in Git — covers 80% of use cases); (2) Cloud services (Percy by BrowserStack, Chromatic — add cross-browser rendering, AI false-positive filtering, team collaboration); (3) AI-native (Applitools Eyes — semantic visual understanding, recognizes dynamic content, self-healing selectors). VRT is the verification layer that catches design drift that unit tests and integration tests miss.

Voice Agent · Agent Type

An AI agent that communicates via spoken language — processing speech input (STT), reasoning about the request, taking actions, and responding with synthesized speech (TTS). Voice agents operate in real-time conversational contexts where latency is critical (>500ms response time feels unnatural). They combine language understanding with audio processing, prosody interpretation (tone, emphasis, pauses), and turn-taking management. Voice agents are deployed in customer service, healthcare, education, and personal assistant applications.

W

Webhook · Infrastructure

An HTTP callback that delivers real-time event notifications from one system to another — when an event occurs, the source system sends an HTTP POST request to a pre-configured URL with event data. In agent systems, webhooks serve as triggers (starting an agent workflow when an event occurs), notification channels (alerting humans when an agent completes a task or needs approval), and integration points (connecting agents to external services that push updates). Webhooks complement MCP's pull-based model with push-based event delivery.
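
A minimal receiver sketch using only the Python standard library: an external system POSTs event JSON to this endpoint, which could then hand the event off to an agent runtime. The port, path handling, and payload shape are illustrative.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length) or b"{}")
        print("received event:", event.get("type"), event)   # e.g. enqueue an agent workflow here
        self.send_response(204)                               # acknowledge quickly; process async
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), WebhookHandler).serve_forever()
```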

Workflow Automation · Pattern

The use of AI agents to automate multi-step business processes that previously required human judgment, manual data entry, or coordination across multiple systems. Unlike traditional automation (if-then rules on structured data), agent-based workflow automation can handle unstructured inputs (emails, documents, conversations), make judgment calls (classify, prioritize, route), and adapt to exceptions (escalate unusual cases, retry with different strategies). Common patterns: document processing pipelines, customer support triage, data extraction and enrichment, and report generation.
