Internet-Draft | Web bot auth Glossary | April 2025 |
Meunier | Expires 30 October 2025 | [Page] |
Automated traffic authentication presents unique security challenges, constraints, and opportunities that impact all Internet users. This document seeks to collect terminology and examples within the space, with a specific focus on AI-related technologies.¶
This note is to be removed before publishing as an RFC.¶
The latest revision of this draft can be found at https://thibmeu.github.io/draft-meunier-glossary/draft-meunier-web-bot-auth-glossary.html. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-meunier-web-bot-auth-glossary/.¶
Source for this draft and an issue tracker can be found at https://github.com/thibmeu/draft-meunier-glossary-somehow.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 30 October 2025.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Agents are increasingly used in business and user workflows, including AI assistants, search indexing, content aggregation, and automated testing. These agents need to reliably identify themselves to origins for several reasons:¶
Regulatory compliance requiring transparency of automated systems¶
Origin resource management and access control¶
Protection against impersonation and reputation management¶
Service level differentiation between human and automated traffic¶
Current identification methods such as IP allow-listing, User-Agent strings, or shared API keys have significant limitations in security, scalability, manageability, and fairness. This document presents these examples, as well as possible paths to address them.¶
There is an increase in agent traffic on the Internet. Many agents choose to identify their traffic today via lists of IP Addresses and/or unique User-Agents. This is often done to demonstrate trust and safety claims, support allow-listing/deny-listing the traffic in a granular manner, and enable sites to monitor and rate limit per agent operator. However, these mechanisms have drawbacks:¶
User-Agent, when used alone, can be spoofed, meaning anyone may attempt to act as that agent. It is also overloaded - an agent may be using Chromium and wish to present itself as such to ensure rendering works, yet it still wants to differentiate its traffic to the site.¶
IP blocks alone can present a confusing story. IPs on cloud platforms have layers of ownership - the platform owns the IP and registers it in its published IP blocks, only for the agent to re-publish it with little to bind the publication to the actual service provider that may be renting the infrastructure. Purchasing dedicated IP blocks is expensive, time consuming, and requires significant specialist knowledge to set up. These IP blocks may have prior reputation history that needs to be carefully inspected and managed before purchase and use.¶
An agent may go to every website on the Internet and share a secret with them, such as a Bearer token from [OAUTH-BEARER-RFC]. This is impractical to scale for any agent beyond select partnerships, and insecure, as key rotation is challenging and becomes harder as the number of consumers grows.¶
Using well-established cryptography, we can instead define a simple and secure mechanism that empowers small and large agents to share their identity.¶
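As an illustration of the kind of mechanism meant here, the sketch below builds a signature base over derived request components, in the style of HTTP Message Signatures (RFC 9421), and attaches authentication headers to a request. An HMAC with a shared key is used purely to keep the example dependency-free; a deployable design would use an asymmetric signature (e.g. Ed25519) so that the origin only needs the agent's published public key. The key identifier and component list are illustrative, not normative.¶

```python
import base64
import hashlib
import hmac

def signature_base(method, authority, path, params):
    """Build a signature base over derived components, in the style of
    HTTP Message Signatures (RFC 9421)."""
    return (
        f'"@method": {method}\n'
        f'"@authority": {authority}\n'
        f'"@path": {path}\n'
        f'"@signature-params": {params}'
    )

def sign_request(key, keyid, method, authority, path):
    """Return the headers an agent attaches so the origin can verify who
    sent the request. HMAC stands in for a real asymmetric signature."""
    params = f'("@method" "@authority" "@path");keyid="{keyid}"'
    base = signature_base(method, authority, path, params)
    tag = hmac.new(key, base.encode(), hashlib.sha256).digest()
    return {
        "Signature-Input": f"sig1={params}",
        "Signature": f"sig1=:{base64.b64encode(tag).decode()}:",
    }

headers = sign_request(b"demo-key", "agent-2025", "GET", "example.com", "/robots.txt")
```

Because the signature covers the method, authority, and path, a replayed or modified request to a different resource no longer verifies, which a spoofed User-Agent string cannot guarantee.¶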
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
An autonomous entity that perceives the environment and can take actions on behalf of users.¶
A type of agent that operates automatically, often performing repetitive tasks. Bots may identify themselves or attempt to mimic human behavior.¶
The primary server hosting the web content or service that an agent intends to access.¶
Controls incoming traffic to an origin based on a set of rules. This may include but is not limited to IP filtering, User-Agent matching, or cryptographic signature verification.¶
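A toy filter can make this definition concrete by combining an IP-prefix rule with a User-Agent rule; the prefix and agent name below are invented for the example. It also shows why the two signals are checked together rather than separately.¶

```python
import ipaddress

# Illustrative rule set: a crawler's published prefix (documentation range)
# and its advertised User-Agent string.
ALLOWED_PREFIXES = [ipaddress.ip_network("192.0.2.0/24")]
KNOWN_AGENT = "ExampleBot/1.0"

def admit(client_ip, user_agent):
    """Admit traffic claiming to be the known crawler only when it also
    originates from the crawler's published prefix."""
    ip = ipaddress.ip_address(client_ip)
    in_prefix = any(ip in net for net in ALLOWED_PREFIXES)
    if user_agent == KNOWN_AGENT:
        # A User-Agent match alone is not trusted: it must coincide with
        # the published prefix, since the string is trivially spoofable.
        return in_prefix
    return True  # other traffic falls through to further checks
```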
An intermediary server that forwards client requests to the origin server, often performing functions like load balancing, authentication, or caching.¶
A client application used to access web content. Browsers may also be orchestrated.¶
A physical person, like you and me.¶
A control mechanism that restricts the access of an Agent to a resource provided by an Origin Server. An Origin can decide to rate limit all connections from an individual Client, from a specific Provider, or to a specific resource. This may be a fixed number of requests, a budget, a time, a location, or legal requirements.¶
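A minimal model of the per-client, per-resource case is a fixed-window counter; real deployments would also account for budgets, time, location, or legal requirements, and would typically use sliding windows or token buckets. All names below are illustrative.¶

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Toy fixed-window limiter: at most `limit` requests per `window`
    seconds for each (client, resource) pair."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        # key -> [window_start, count]
        self.counters = defaultdict(lambda: [0.0, 0])

    def allow(self, client, resource, now=None):
        now = time.monotonic() if now is None else now
        slot = self.counters[(client, resource)]
        if now - slot[0] >= self.window:
            slot[0], slot[1] = now, 0  # start a new window
        if slot[1] < self.limit:
            slot[1] += 1
            return True
        return False
```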
A property ensuring that multiple interactions or credentials from the same agent cannot be correlated by the verifier.¶
Persistent identifier of an entity to an origin. This requires a registration.¶
The creation of an identity. It can involve a one-time payment, a subscription, an account with a username and password, a proof of age, a legal jurisdiction, or other attributes.¶
An entity that generates and provides credentials to agents after the Attester has verified certain attributes.¶
An entity that evaluates an agent's characteristics or behavior and provides evidence to an Issuer to support credential issuance.¶
An entity that validates the authenticity and integrity of a credential presented by an agent.¶
We divide web bot authentication into three categories.¶
Organizations operating bots may need to authenticate their agents to access certain web resources. Authentication mechanisms can help distinguish legitimate bots from malicious ones.¶
Examples:¶
Bots acting on behalf of registered users may require authentication to access user-specific data or services.¶
Examples:¶
In scenarios where full identification is unnecessary or undesirable, agents may present credentials that attest to specific attributes without revealing their identity.¶
Examples:¶
Add a signal to limit visual CAPTCHA challenges such as [PRIVATE-ACCESS-TOKEN],¶
Gating access to a resource for longstanding users such as [LOX],¶
Using a search engine with a fixed number of requests such as [PRIVACY-PASS-KAGI],¶
Selective disclosure of a credential attribute (location, age) such as [PRIVATE-PROOF-API].¶
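A minimal model of the fixed-request-count pattern above: an issuer mints N single-use tokens, and the verifier only checks membership and double-spending, learning nothing that links one redemption to another. Real schemes such as Privacy Pass use blind signatures so even the issuer cannot link issuance to redemption; plain random nonces here keep the sketch self-contained.¶

```python
import secrets

def issue(n):
    """Issuer mints n unlinkable single-use tokens for one client.
    The client keeps the list; the issuer/verifier keeps the set."""
    tokens = [secrets.token_hex(16) for _ in range(n)]
    return tokens, set(tokens)

def redeem(token, valid, spent):
    """Verifier accepts each token at most once: a membership check
    plus a double-spend set."""
    if token in valid and token not in spent:
        spent.add(token)
        return True
    return False
```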
The ecosystem involves multiple actors: a credential issuer that requires certain criteria to be met via an attester, the client, which can be a bot or human-mediated agent whose IP is unknown, and the web origin placed behind a reverse proxy that may be fronting its infrastructure. The issuer provides cryptographic credentials to the client, which are then linked to requests and optionally verified by proxies before reaching the origin. This chain allows for authentication without necessarily revealing identifying details to each intermediary.¶
Humans and bots often interact with origins indirectly via clients such as browsers, agents, or CLI tools. These clients handle requests, potentially traversing reverse proxies that manage TLS termination, DDoS protection, and caching.¶
The rise of advanced browser orchestration blurs the line between human-driven and automated requests, making it increasingly ambiguous whether a given request is automated.¶
The attester/issuer roles could be filled by the AI company, reverse proxy, origin, or a third party. Origins need mechanisms to identify organizations, rate-limit individuals, and authenticate users without relying solely on client IP or heuristics presented in Section 2.¶
The security model includes several actors: credential issuers, attesters, clients (bots or agents), reverse proxies, and origin servers. The primary goals are to prevent impersonation, allow for credential revocation, support delegation and rotation, and maintain trust boundaries.¶
If the Issuer is also the Origin or its reverse proxy, it is possible to use shared secrets for verification. In cases where the issuer and verifier are different entities, asymmetric cryptography becomes necessary, allowing the bot to prove its identity using a public key infrastructure.¶
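In the shared-secret case, verification reduces to an HMAC check that the origin or its reverse proxy performs locally; constant-time comparison matters to avoid timing side channels. The asymmetric case would replace the HMAC with a signature verification over the same bytes using the issuer's published public key. The secret below is an illustrative value.¶

```python
import hashlib
import hmac

SECRET = b"issuer-and-origin-shared-secret"  # illustrative value

def mac(message):
    """Tag computed by the issuer-side and recomputed by the verifier."""
    return hmac.new(SECRET, message, hashlib.sha256).digest()

def verify(message, tag):
    # compare_digest avoids leaking how many prefix bytes matched
    return hmac.compare_digest(mac(message), tag)
```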
Some credentials may be designed for one-time use only (for anti replay or privacy reasons), while others can support multiple presentations through the use of cryptographic derivation techniques. This distinction affects privacy, scalability, and implementation complexity.¶
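One way to support multiple presentations from a single issued credential is to derive an independent per-presentation value from a root secret and a counter: a verifier seeing only derived values cannot correlate them without the root. This HMAC-counter derivation is a stand-in for the algebraic techniques (e.g. blinded or randomized signatures) that real unlinkable schemes use.¶

```python
import hashlib
import hmac

def derive(root, index):
    """Derive the index-th presentation value from a root credential
    secret. Each derived value is independent without knowledge of root."""
    return hmac.new(root, index.to_bytes(8, "big"), hashlib.sha256).digest()
```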
Authentication tokens may be exchanged at different protocol layers and through different transports. Each may have different deployment, performance, and security guarantees.¶
For TLS, we have seen [REQ-MTLS] and [PRIVACYPASS-IN-TLS] respectively addressing Section 4.1 and Section 4.3.¶
For HTTP, we see [HTTP-MESSAGE-SIGNATURE-FOR-BOTS] or [DPOP-AUTH-RFC], and [PRIVACYPASS-HTTP-AUTH-RFC] respectively addressing Section 4.1 and Section 4.3. [OAUTH-BEARER-RFC] fits as well for Section 4.2.¶
Other methods have been seen such as leveraging a dedicated format on top of a JavaScript API. This is the case for W3C [PRIVATE-STATE-TOKEN] or the more recent [PRIVATE-PROOF-API].¶
Focusing on AI specifically, it's worth mentioning two prominent protocol definition efforts:¶
[A2A-AUTH] which follows [OPENAPI3-AUTH]. This means it allows for Basic, Bearer, API Keys, and [OAUTH2-RFC]. OpenAPI mentions using the [HTTP-AUTHSCHEME] registry, but there does not seem to be a definition for recent schemes such as [PRIVACYPASS-HTTP-AUTH-RFC], [CONCEALED-AUTH-RFC], or [DPOP-AUTH-RFC].¶
[MCP-AUTH] uses [OAUTH2-RFC] as a resource server.¶
Protocols should strive to minimise the number of round trips between a client and the issuer, and between clients and the origin.¶
Just as there are registries to resolve IP address metadata, there are going to be registries to identify the owner of public key material. These are mentioned by [A2A-DISCOVERY] and [MCP-DISCOVERY].¶
The primary goal of these catalogs is to associate metadata with a public key, and the discovery of the associated metadata. They SHOULD have some sort of tamper resistance, to prevent the provider of a catalog from serving incorrect information.¶
As an analogy, one can think of [CERTIFICATE-TRANSPARENCY-RFC], or the more recent effort in [KEY-TRANSPARENCY-ARCHITECTURE].¶
Submission is also going to happen out-of-band. This is both for practical reasons, as it is simpler than setting up a catalog, and for privacy reasons, as it avoids exposing information through a catalog.¶
Discovery may happen on-path, that is when a request arrives from a client to an origin. This could be considered a form of trust-on-first-use. While the level of trust is low, it could be viable for certain use cases.¶
Such discovery could be via an HTTP header containing a domain name with a well-known, a URL, a certificate, etc.¶
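For instance, a request might carry a header naming the agent's key directory, which the origin maps to a well-known URL and fetches out of band. The header semantics and well-known path below follow the web-bot-auth drafts but should be treated as illustrative assumptions, not a fixed interface.¶

```python
from urllib.parse import urlunparse

def directory_url(signature_agent):
    """Map an on-path discovery header value (a domain, possibly quoted)
    to the well-known key-directory URL the origin would fetch.
    The well-known path is an assumption for illustration."""
    host = signature_agent.strip().strip('"')
    return urlunparse(
        ("https", host, "/.well-known/http-message-signatures-directory", "", "", "")
    )
```

Because this is trust-on-first-use, the fetched keys would still need to be checked against a registry or pinned on subsequent requests.¶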
This glossary provides terminology for web bot authentication. While this document does not define or recommend specific protocols, terminology choices have direct security implications:¶
Clearly defined roles are essential for preventing entities from falsely claiming identities.¶
Definitions such as Section 6.2 help describe key mechanisms that mitigate the misuse of credentials if stolen.¶
Section 7 is key to protocol security and must be considered.¶
In addition, protocols should consider decentralization [RFC9518] and end-user impact [RFC8890].¶
Authentication mechanisms should minimize the collection and exposure of personal data. Techniques like selective disclosure and unlinkability help protect user privacy. Protocols should refer to [RFC6973].¶
Multiple protocols are also likely to be used in coordination: to identify an organization, then to identify the User-Agent, and possibly to rate limit. It is important to consider the privacy of these layers together as well.¶
This document has no IANA actions.¶
TODO acknowledge.¶