The conversation around AI shopping agents is loud about the frontend: the chat interface, the product recommendations, the "buy" button inside ChatGPT. The harder problem sits one layer down, at the point where an autonomous program actually moves money.
When software initiates a purchase, the assumptions baked into electronic commerce stop holding. Gateways, fraud engines, and checkout flows were built around a human clicking a button, typing a password, and reading a verification code off a phone. Strip out the human and the sequence has nothing to anchor to. That much the industry now agrees on. What's less discussed is that the foundational standards are already being built, and that the real engineering work is no longer "invent the protocol" but "operate it at production quality."
The Standards Landscape Is No Longer Empty
A year ago, you could credibly argue that agentic payments were greenfield. You can't anymore. As of mid-2026, there is a working stack, and it's worth being precise about what each layer does, because the interesting problems live in the gaps between them.
For authorization, Google's Agent Payments Protocol (AP2) has moved from a September 2025 announcement to a v0.2 specification and, as of May 2026, governance under the FIDO Alliance, with Mastercard- and Visa-chaired working groups building on it. It works through cryptographically signed mandates: an Intent Mandate capturing what the user authorized, a Cart Mandate locking the exact items and price, and a Payment Mandate that tells the network and issuer an agent was involved, all built on the W3C Verifiable Credentials standard. This is precisely the "delegated authority" problem the industry was hand-waving about eighteen months ago, now expressed as a concrete, portable, revocable credential.
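To make the mandate idea concrete, here is a minimal sketch of a cart record bound to an intent record, with tamper detection. It is not the AP2 wire format: the field names, the `USER_KEY` secret, and the use of HMAC are all assumptions chosen so the example runs on the standard library alone. A real deployment would use asymmetric signatures and Verifiable Credential proofs.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret, for illustration only. AP2 itself relies on
# asymmetric keys and Verifiable Credential proofs, not a shared HMAC key.
USER_KEY = b"demo-user-signing-key"


def canonical(payload: dict) -> bytes:
    """Serialize deterministically so the same payload always yields the same bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def sign(payload: dict, key: bytes) -> dict:
    """Wrap a payload with an HMAC-SHA256 tag computed over its canonical form."""
    tag = hmac.new(key, canonical(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": tag}


def verify(mandate: dict, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, canonical(mandate["payload"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, mandate["signature"])


# Intent: what the user authorized (illustrative fields).
intent = sign(
    {"type": "intent", "max_total_cents": 50_000, "category": "server hardware"},
    USER_KEY,
)

# Cart: the exact items and price, bound to the intent by carrying its signature.
cart = sign(
    {
        "type": "cart",
        "intent_signature": intent["signature"],
        "items": [{"sku": "SRV-001", "qty": 1, "price_cents": 42_000}],
        "total_cents": 42_000,
    },
    USER_KEY,
)
```

Calling `verify(cart, USER_KEY)` succeeds on the record as signed and fails if any field, including the price, is altered afterwards. That is the property that makes a mandate useful as evidence later.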
For discovery and checkout, OpenAI and Stripe's Agentic Commerce Protocol (ACP) and Google and Shopify's Universal Commerce Protocol (UCP) define how an agent reads a catalog and completes a session. Under ACP, the merchant stays the merchant of record: settlement, compliance, and disputes remain with the merchant and their PSP, and the agent is just the messenger. Beneath both sits MCP as the data-connectivity layer that lets agents query real-time inventory and pricing. And for machine-to-machine settlement, Coinbase's x402 has processed a reported 165 million transactions at an average ticket of about thirty cents. It is a different rail entirely, optimized for high-volume micro-payments rather than consumer card flows.
None of this means the problem is solved. It means the problem has moved. The questions that matter for an engineering partner now are about overhead, reliability, and the seams where these protocols meet legacy infrastructure.
Where The Real Work Is
The Authorization Layer Is Expensive
AP2's strength is also its cost: three signed mandates per transaction, each of which has to be generated, signed, and verified. That overhead is meaningfully higher than standard card-not-present authentication, and whether issuers will absorb it for low-value transactions, where the risk-mitigation benefit is small, remains an open question through 2026. The engineering challenge isn't proving that delegated authority is possible; it's making the proof cheap enough to use at scale.
Discovery Is Fragmented Across Competing Standards
ACP optimizes for the conversational ChatGPT surface; UCP spans the broader Google journey. Most enterprise merchants will implement both rather than choose, and OpenAI's mid-flight deprecation of Instant Checkout in March 2026, pivoting from "buy in the chat" to "discover in the chat, transact on the merchant site", is a reminder that these specifications are still moving under everyone's feet. Building against a target that reissues its spec every few weeks is its own discipline. The defensible architecture keeps protocol compliance in a layer separate from payment processing, so that a merchant implements the endpoints once and routes each agent-initiated transaction to whichever PSP is most likely to approve it.
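One way to picture that separation is a thin adapter per protocol feeding a protocol-neutral order into a router that knows nothing about ACP or UCP. The sketch below is illustrative only: the request shapes in `from_acp` and `from_ucp`, the PSP names, and the approval-rate table are all invented for the example and are not taken from either specification.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Order:
    """Protocol-neutral order used inside the merchant's own systems."""
    sku: str
    quantity: int
    total_cents: int
    currency: str


# Hypothetical payload shapes. The real ACP and UCP schemas differ from these
# and change between spec versions, which is why they are isolated here.
def from_acp(body: dict) -> Order:
    item = body["line_item"]
    return Order(item["sku"], item["quantity"], body["amount"], body["currency"])


def from_ucp(body: dict) -> Order:
    total = body["total"]
    return Order(body["product_id"], body["qty"], total["value"], total["currency"])


ADAPTERS: dict[str, Callable[[dict], Order]] = {"acp": from_acp, "ucp": from_ucp}

# Assumed per-PSP approval-rate estimates by currency (made-up numbers).
APPROVAL_RATES = {
    "psp_a": {"USD": 0.94, "EUR": 0.88},
    "psp_b": {"USD": 0.91, "EUR": 0.93},
}


def choose_psp(order: Order) -> str:
    """Pick the PSP with the highest estimated approval rate for this currency."""
    return max(APPROVAL_RATES, key=lambda psp: APPROVAL_RATES[psp].get(order.currency, 0.0))


def handle(protocol: str, body: dict) -> tuple[Order, str]:
    """Normalize the protocol payload, then route without reference to the protocol."""
    order = ADAPTERS[protocol](body)
    return order, choose_psp(order)
```

When a spec revision lands, only the matching adapter changes; `choose_psp` and everything downstream of it stay untouched.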
The Data Layer Has To Be Honest
Agents ignore the CSS and the persuasive copy that merchants spend millions on. What they need is structured feeds with accurate inventory, dimensions, and tax. When that data is wrong, the failure is no longer cosmetic: an agent acting on a bad feed buys the wrong thing and, unlike a human, won't notice the mismatch before paying. That leads directly to the messiest unsolved question.
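A feed check of this kind can be quite small. The following is a rough sketch that rejects obviously bad entries before they are published to agents. The field names in `REQUIRED` and the specific rules are assumptions made for the example, not part of any protocol's catalog schema.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical field set; real catalog schemas define their own names.
REQUIRED = ("sku", "price", "currency", "inventory", "dimensions_cm", "tax_rate")


def validate_item(item: dict) -> list[str]:
    """Return a list of problems with one feed entry (empty list means it passed)."""
    errors = [f"missing field: {name}" for name in REQUIRED if name not in item]
    if errors:
        return errors

    try:
        if Decimal(str(item["price"])) <= 0:
            errors.append("price must be positive")
    except InvalidOperation:
        errors.append("price is not a number")

    if not isinstance(item["inventory"], int) or item["inventory"] < 0:
        errors.append("inventory must be a non-negative integer")

    dims = item["dimensions_cm"]
    dims_ok = (
        isinstance(dims, (list, tuple))
        and len(dims) == 3
        and all(isinstance(d, (int, float)) and d > 0 for d in dims)
    )
    if not dims_ok:
        errors.append("dimensions_cm must be three positive numbers")

    rate = item["tax_rate"]
    if not (isinstance(rate, (int, float)) and 0 <= rate < 1):
        errors.append("tax_rate must be a fraction in [0, 1)")

    return errors
```

Running this at publish time, rather than leaving an agent to discover the problem at checkout, keeps a data error from turning into a disputed purchase.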
Liability Is Genuinely Unsettled
When an agent completes a transaction that goes wrong, fault is ambiguous in a way the existing chargeback system was never designed to handle. If an agent buys the wrong server hardware, the cause could sit with the foundation model, the agent developer, or the merchant, whose structured data may have been wrong in the first place (the part the early framing tends to skip). AP2's signed mandates create a tamper-proof, non-repudiable record of exactly what the user authorized and what the agent committed to, which is a real advance for reconstructing what happened. But a clean audit trail establishes what occurred; it does not by itself settle who pays. That allocation is still being worked out, and not only in code: regulators are weighing money-transmission rules, state and federal authorization frameworks, and whether a given agent is a "pure technology conduit" or something that has retained meaningful control over funds.
This is the layer where conditional, programmable authorization logic earns its place. That does not mean blockchain smart contracts, which the conventional card rails don't use, but the kind of scoped, rule-bound consent that AP2's Intent Mandate already encodes: spend up to this amount, from this merchant, under these conditions, and not otherwise.
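The "up to this amount, from this merchant, under these conditions" rule can be expressed as a plain function. This is a minimal sketch, assuming a simplified scope with just three constraints. `IntentScope`, `ProposedPurchase`, and `authorize` are hypothetical names, not AP2 types.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class IntentScope:
    """The bounds a user attached to their consent (illustrative subset)."""
    max_total_cents: int
    allowed_merchants: frozenset
    expires_at: datetime


@dataclass(frozen=True)
class ProposedPurchase:
    merchant_id: str
    total_cents: int


def authorize(scope: IntentScope, purchase: ProposedPurchase, now: datetime) -> tuple[bool, str]:
    """Allow the purchase only if every condition in the scope holds."""
    if now >= scope.expires_at:
        return False, "mandate expired"
    if purchase.merchant_id not in scope.allowed_merchants:
        return False, "merchant not in scope"
    if purchase.total_cents > scope.max_total_cents:
        return False, "amount exceeds limit"
    return True, "within scope"


scope = IntentScope(
    max_total_cents=50_000,
    allowed_merchants=frozenset({"merchant-123"}),
    expires_at=datetime(2026, 7, 1, tzinfo=timezone.utc),
)
```

Returning a reason alongside the decision matters here: a refusal that says which condition failed becomes part of the same record that later disputes will rely on.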
Compliance Is Per-Account, Enforced Per-Transaction
KYC and anti-money-laundering checks don't happen on every payment: they happen at onboarding, when an account is established. What an agent transaction needs to carry is not a fresh background check but a verifiable *reference* back to that established identity: a token proving the underlying human owner is in good standing. The engineering task is wiring that reference into the transaction flow so that thousands of agent-initiated, cross-border micro-transactions remain auditable without re-running compliance from scratch each time. Treasury teams running compliant stablecoin flows already use execution-time compliance patterns that look structurally similar, which is encouraging evidence that the approach works.
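As an illustration of carrying a reference rather than re-running the check, the sketch below issues a compliance reference once at onboarding and validates it on each transaction, writing every decision to an audit log. Everything named here (`COMPLIANCE_KEY`, `issue_reference`, `record_transaction`, the claim fields) is hypothetical, and HMAC stands in for whatever credential format a real compliance provider would issue.

```python
import hashlib
import hmac
import json

# Hypothetical key held by the compliance service that performed onboarding.
COMPLIANCE_KEY = b"demo-compliance-service-key"

AUDIT_LOG: list[dict] = []


def _tag(claims: dict) -> str:
    body = json.dumps(claims, sort_keys=True).encode()
    return hmac.new(COMPLIANCE_KEY, body, hashlib.sha256).hexdigest()


def issue_reference(account_id: str, valid_until: float) -> dict:
    """Issued once, after KYC/AML checks pass at onboarding."""
    claims = {"account_id": account_id, "kyc_status": "verified", "valid_until": valid_until}
    return {"claims": claims, "tag": _tag(claims)}


def check_reference(ref: dict, now: float) -> bool:
    """Cheap per-transaction check: authentic, still verified, not expired."""
    claims = ref["claims"]
    return (
        hmac.compare_digest(_tag(claims), ref["tag"])
        and claims["kyc_status"] == "verified"
        and now < claims["valid_until"]
    )


def record_transaction(tx_id: str, amount_cents: int, ref: dict, now: float) -> bool:
    """Validate the reference and log the outcome, whether accepted or not."""
    accepted = check_reference(ref, now)
    AUDIT_LOG.append({
        "tx_id": tx_id,
        "amount_cents": amount_cents,
        "account_id": ref["claims"]["account_id"],
        "reference_tag": ref["tag"],
        "accepted": accepted,
        "checked_at": now,
    })
    return accepted
```

Each log entry ties a micro-transaction back to the onboarded account through `reference_tag`, so an auditor can trace it without the payment path ever touching the underlying KYC data.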
Fraud Detection Has To Learn A New Normal
Traditional fraud engines look for human tells: typing cadence, mouse movement, browser fingerprints. An agent has none of those. It calls an API directly, and a naive system reads that as an attack. Two distinct mechanisms tend to get conflated here, and they fail for different reasons: rate-limiting and bot-detection (the WAF layer) block on volume and shape of traffic, while fraud scoring blocks on behavioral signals. Both need to learn that a legitimate agent is allowed to behave like a machine. The fix is cryptographic identity (each agent presents a verifiable credential proving its origin and the financial backing of the request), paired with models trained to tell an authorized shopping agent apart from a credential-stuffing run. A whole tier of vendors now sits in exactly this space.
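The two layers can be sketched as separate decisions that each take agent identity into account. This is a deliberately simplified model: the thresholds are invented, the fraud "model" is a pair of if-statements rather than anything trained, and `credential_verified` is assumed to be the result of a cryptographic check performed upstream.

```python
from dataclasses import dataclass
from typing import Optional

# Invented thresholds, for illustration only.
HUMAN_RATE_LIMIT = 60    # requests per minute tolerated from anonymous traffic
AGENT_RATE_LIMIT = 600   # higher ceiling for agents with a verified credential


@dataclass(frozen=True)
class Request:
    requests_per_minute: int
    has_browser_signals: bool          # mouse movement, fingerprint, and so on
    credential_verified: bool          # assumed output of an upstream signature check
    amount_cents: int
    mandate_limit_cents: Optional[int] = None


def waf_allows(req: Request) -> bool:
    """WAF layer: judges volume and shape of traffic, with an agent-aware limit."""
    limit = AGENT_RATE_LIMIT if req.credential_verified else HUMAN_RATE_LIMIT
    return req.requests_per_minute <= limit


def fraud_decision(req: Request) -> str:
    """Fraud layer: judges behavior, using different evidence for agents and humans."""
    if req.credential_verified:
        # A verified agent is judged on staying within its delegated authority,
        # not on the absence of human tells.
        within = req.mandate_limit_cents is not None and req.amount_cents <= req.mandate_limit_cents
        return "approve" if within else "review"
    return "approve" if req.has_browser_signals else "decline"


def evaluate(req: Request) -> str:
    if not waf_allows(req):
        return "blocked_by_waf"
    return fraud_decision(req)
```

Under these assumptions, the same headless, high-rate traffic that is blocked when anonymous is approved when it arrives with a verified credential and stays inside its mandate.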
A related point worth getting right: well-built agents do not hammer a price endpoint thousands of times a minute. That's the naive implementation, and designing for it as if it were inherent agent behaviour misreads the problem. The sustainable pattern is event-driven: servers push price and inventory changes to registered agents rather than agents polling for them. The engineering investment is in the messaging and streaming infrastructure that makes push viable at scale.
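The push pattern reduces to a subscription registry on the server side. Here is a minimal in-process sketch; `PriceFeed` and its methods are hypothetical names, and a production system would put a message broker or streaming service where the direct callback is.

```python
from collections import defaultdict
from typing import Callable

# A listener receives (sku, new_price_cents).
PriceListener = Callable[[str, int], None]


class PriceFeed:
    """Pushes price changes to registered agents so they never need to poll."""

    def __init__(self) -> None:
        self._prices = {}
        self._listeners = defaultdict(list)

    def subscribe(self, sku: str, listener: PriceListener) -> None:
        self._listeners[sku].append(listener)

    def publish(self, sku: str, price_cents: int) -> int:
        """Record a price and notify subscribers only if it actually changed.

        Returns the number of listeners notified.
        """
        if self._prices.get(sku) == price_cents:
            return 0
        self._prices[sku] = price_cents
        for listener in self._listeners[sku]:
            listener(sku, price_cents)
        return len(self._listeners[sku])


feed = PriceFeed()
received = []
feed.subscribe("SRV-001", lambda sku, price: received.append((sku, price)))
feed.publish("SRV-001", 42_000)
feed.publish("SRV-001", 42_000)  # unchanged price, so nothing is sent
feed.publish("SRV-001", 41_000)
```

Traffic then scales with the number of actual price changes rather than with the number of agents multiplied by their polling frequency.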
What This Means For Building
The transition to agentic commerce is a real shift in how digital systems transfer value, and the existing infrastructure does break under it. But the framing that matters in mid-2026 is not "someone needs to invent the standards." The standards exist, they're consolidating under bodies like the FIDO Alliance, and the card networks are live. The advantage now goes to whoever can operate this stack well: absorb the per-transaction overhead, build against specs that are still moving, keep protocol compliance cleanly separated from payment processing, and wire in identity and compliance without breaking the audit trail.
That is an engineering problem before it is anything else: data structure, cryptographic identity, scalable transaction models, and a great deal of unglamorous integration work across rails that don't yet agree on everything. The teams that get the backend right while the protocols settle will be the ones merchants and PSPs can actually depend on when the volume arrives.