Independent Submission                                       K. Cardillo
Internet-Draft                                               Independent
Intended status: Informational                              12 June 2026
Expires: 14 December 2026


  AI.TXT: A Declaration File for AI Usage Preferences, Licensing, and
                                 Policy
                     draft-car-ai-txt-wellknown-00

Abstract

   This document requests registration of two Well-Known URIs under the
   "/.well-known/" path: "ai.txt" and "ai.json".  These URIs identify a
   structured, machine-readable file, in text and JSON forms, in which a
   site operator can declare AI usage preferences (training, scraping,
   indexing, caching), licensing terms, required attribution, and
   per-agent rules.

   "ai.txt" is positioned as a structured attachment surface for AI
   usage preferences, in addition to the robots.txt and HTTP-header
   carriage mechanisms proposed by the IETF AIPREF working group.  As
   the AIPREF vocabulary stabilizes, "ai.txt" can carry those
   preferences in a typed, single-file form alongside the broader
   licensing, attribution, and policy declarations defined in this
   document.

   This format is complementary to "robots.txt" [ROBOTS].  Where
   "robots.txt" can block crawling entirely, "ai.txt" expresses nuanced
   policies such as "you may crawl but not train on this content" -- a
   distinction that "robots.txt" alone cannot express.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 14 December 2026.


Cardillo                Expires 14 December 2026                [Page 1]

Internet-Draft                   ai-txt                        June 2026


Copyright Notice

   Copyright (c) 2026 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
     1.1.  Relationship to Existing Standards  . . . . . . . . . . .   3
     1.2.  Relationship to AIPREF  . . . . . . . . . . . . . . . . .   4
     1.3.  Related Work  . . . . . . . . . . . . . . . . . . . . . .   4
     1.4.  Requirements Language . . . . . . . . . . . . . . . . . .   5
   2.  The "ai.txt" Well-Known URI . . . . . . . . . . . . . . . . .   5
     2.1.  Location  . . . . . . . . . . . . . . . . . . . . . . . .   5
     2.2.  Format  . . . . . . . . . . . . . . . . . . . . . . . . .   6
     2.3.  Site Fields . . . . . . . . . . . . . . . . . . . . . . .   6
     2.4.  Content Policy Fields . . . . . . . . . . . . . . . . . .   6
     2.5.  Training Path Fields  . . . . . . . . . . . . . . . . . .   7
     2.6.  Licensing Fields  . . . . . . . . . . . . . . . . . . . .   7
     2.7.  Agent Blocks  . . . . . . . . . . . . . . . . . . . . . .   7
     2.8.  Content Requirement Fields  . . . . . . . . . . . . . . .   8
     2.9.  Compliance Fields . . . . . . . . . . . . . . . . . . . .   8
   3.  The "ai.json" Well-Known URI  . . . . . . . . . . . . . . . .   8
     3.1.  Location  . . . . . . . . . . . . . . . . . . . . . . . .   8
     3.2.  Format  . . . . . . . . . . . . . . . . . . . . . . . . .   8
   4.  Agent Behavior  . . . . . . . . . . . . . . . . . . . . . . .   9
     4.1.  Discovery . . . . . . . . . . . . . . . . . . . . . . . .   9
     4.2.  Compliance  . . . . . . . . . . . . . . . . . . . . . . .   9
   5.  Security Considerations . . . . . . . . . . . . . . . . . . .  10
   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  10
     6.1.  Well-Known URI Registration: "ai.txt" . . . . . . . . . .  10
     6.2.  Well-Known URI Registration: "ai.json"  . . . . . . . . .  10
   7.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  11
     7.1.  Normative References  . . . . . . . . . . . . . . . . . .  11
     7.2.  Informative References  . . . . . . . . . . . . . . . . .  11
   Appendix A.  Example: News Site . . . . . . . . . . . . . . . . .  12
   Appendix B.  Acknowledgments  . . . . . . . . . . . . . . . . . .  12
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  13


1.  Introduction

   AI systems increasingly interact with website content in ways that go
   beyond traditional crawling: training language models on web content,
   indexing content for retrieval-augmented generation, caching content
   for future reference, and scraping data for analysis.  Website
   operators currently have no standard, machine-readable mechanism to
   communicate their policies regarding these AI-specific uses.

   "robots.txt" [ROBOTS] can block crawling entirely, but it cannot
   express nuanced policies.  A newspaper may wish to allow crawling
   (for search indexing) while prohibiting training (for model
   development).  A blog may wish to allow training under a specific
   license.  A corporation may wish to allow some AI agents while
   blocking others.

   "ai.txt" addresses this gap.  It is a policy declaration file, served
   at a well-known location, that communicates to AI systems:

   *  Whether content may be used for AI model training

   *  Whether content may be scraped, indexed, or cached

   *  Under what license terms AI training is permitted

   *  Which AI agents are permitted and under what conditions

   *  What attribution and disclosure requirements apply

   *  What compliance and audit expectations exist

1.1.  Relationship to Existing Standards

   "ai.txt" is complementary to, and does not replace, existing
   standards:

   robots.txt [ROBOTS]:  Declares crawling restrictions.  "ai.txt" adds
      training, licensing, and per-agent policy declarations that
      "robots.txt" cannot express.  Both files may coexist.

   agents.txt:  Declares AI agent capabilities (endpoints, protocols,
      authentication).  "ai.txt" declares policy.  A site may use both:
      "agents.txt" to declare what agents can DO, and "ai.txt" to
      declare what is ALLOWED.

   security.txt [RFC9116]:  Declares security vulnerability disclosure
      contacts.  Similar well-known file pattern; different domain.


1.2.  Relationship to AIPREF

   The IETF AIPREF working group is developing a vocabulary
   [AIPREF-VOCAB] for expressing AI usage preferences and an attachment
   specification [AIPREF-ATTACH] for carrying those preferences via
   robots.txt directives and HTTP response headers.

   "ai.txt" complements that work; it does not replace it.  AIPREF
   defines the vocabulary (the set of preference terms and their
   semantics) and two carriage mechanisms (robots.txt and HTTP headers).
   "ai.txt" is a third carriage mechanism -- a single, structured, typed
   file -- that provides three properties not addressed by robots.txt
   attachment or per-response headers:

   *  Carriage of preferences for an entire site, independent of any
      individual response or robots.txt path block.

   *  A single audit surface -- one file at one URL -- that can be
      fetched once and cached for site-wide preference resolution.

   *  A place to declare preferences alongside related declarations
      (licensing, attribution, per-agent rate limits) that fall outside
      AIPREF's scope.

   When the AIPREF vocabulary stabilizes, "ai.txt" implementations
   SHOULD use AIPREF preference names where they apply.  Implementations
   SHOULD treat the preferences carried in "ai.txt" as equivalent in
   authority to the same preferences carried via the AIPREF robots.txt
   or HTTP-header mechanisms.  Where multiple carriers disagree for the
   same site and resource, conflict resolution is out of scope for this
   document and may be addressed by future AIPREF output.

1.3.  Related Work

   The following efforts overlap with or are adjacent to this document.

   Spawning ai.txt (2023) [SPAWNING-AITXT]:  An earlier file at
      "/ai.txt" published by Spawning Inc. for text-and-data-mining opt-
      out, scoped narrowly to TDM permission per file pattern.  The
      format defined in this document is a strict superset, covering
      training, scraping, indexing, caching, per-agent rules, licensing,
      and attribution.  The present document acknowledges Spawning's
      prior use of the name and positions itself as a successor
      declaration surface rather than a competing one.

   W3C TDM Reservation Protocol [TDMREP]:  Defines a
      "/.well-known/tdmrep.json" file for declaring text and data
      mining reservations under EU Directive 2019/790.  Adjacent in
      domain (machine-readable opt-outs) but narrower in scope (TDM
      reservation only).  "ai.txt" can reference or coexist with
      "tdmrep.json"; sites with TDM-only requirements MAY use
      "tdmrep.json" alone.

   Cloudflare Content Signals Policy [CF-CONTENT-SIGNALS]:  A robots.txt
      extension deployed at scale (millions of domains) that adds
      AI-specific signals (search, ai-input, ai-train) to robots.txt
      User-agent / Allow / Disallow records.  Like the AIPREF
      attachment mechanism, it carries preferences inside robots.txt.
      "ai.txt" carries the same class of preferences -- plus licensing,
      attribution, and per-agent metadata -- in a separate file.  Sites
      MAY publish both; their semantics SHOULD agree.

   agents.txt:  A companion well-known file that declares what AI agents
      CAN do on a site (sanctioned endpoints, protocols,
      authentication).  Where "ai.txt" expresses usage preferences and
      policy, "agents.txt" expresses positive capability.  They are
      designed to coexist.

1.4.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  The "ai.txt" Well-Known URI

2.1.  Location

   The "ai.txt" file MUST be served at:

   https://example.com/.well-known/ai.txt

   The file MUST be served over HTTPS in production deployments.  HTTP
   is permitted only in development or testing environments.

   The file MUST be served with Content-Type
   "text/plain; charset=utf-8".


2.2.  Format

   The "ai.txt" file uses a block-based key-value format inspired by
   "robots.txt".  Each line that is neither blank nor a comment contains
   a key, a colon, and a value.  Lines beginning with "#" are comments.
   Indented lines (two or more spaces, or one or more tabs) belong to
   the preceding block (see Section 2.7).

   A minimal "ai.txt" file:

   # ai.txt
   Spec-Version: 1.0
   Site-Name: My Blog
   Site-URL: https://myblog.com
   Training: deny
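
   The format rules above can be sketched as a small parser.  This is a
   minimal illustration under stated assumptions, not a reference
   implementation: the function name parse_ai_txt and the layout of the
   returned dictionary are hypothetical, and top-level values are
   collected into lists because some fields (for example Training-Allow)
   may appear more than once.

```python
def parse_ai_txt(text):
    """Parse ai.txt text into {"fields": {...}, "agents": {...}}.

    Hypothetical helper: the name and result layout are illustrative.
    """
    fields = {}   # top-level key -> list of values (keys may repeat)
    agents = {}   # agent identifier -> {key: value}
    current_agent = None

    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if ":" not in stripped:
            continue  # not a key-value line; ignored in this sketch
        key, value = (part.strip() for part in stripped.split(":", 1))

        # Two or more leading spaces, or a leading tab, mark a block line.
        indented = raw.startswith("  ") or raw.startswith("\t")
        if key == "Agent":
            current_agent = value
            agents.setdefault(current_agent, {})
        elif indented and current_agent is not None:
            agents[current_agent][key] = value
        else:
            current_agent = None  # an unindented field closes the block
            fields.setdefault(key, []).append(value)
    return {"fields": fields, "agents": agents}
```

   Splitting on the first colon only keeps URL values such as
   "https://myblog.com" intact.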

2.3.  Site Fields

   Site-Name (REQUIRED):  Human-readable name of the site or service.

   Site-URL (REQUIRED):  Canonical HTTPS URL of the site.

   Spec-Version (OPTIONAL):  Version of the "ai.txt" specification the
      file conforms to (e.g., "1.0").  This is a regular field, not a
      comment.

   Generated-At (OPTIONAL):  ISO 8601 timestamp of when the file was
      generated.  This is a regular field, not a comment.

   Description (OPTIONAL):  Brief description of the site.

   Contact (OPTIONAL):  Contact email for AI policy inquiries.

   Policy-URL (OPTIONAL):  URL to a human-readable AI policy page.

2.4.  Content Policy Fields

   These fields declare site-wide defaults.  Each accepts "allow" or
   "deny".  The value "conditional" is valid only for the Training
   field, where it activates the per-path rules defined in Section 2.5;
   implementations encountering "conditional" on any other field SHOULD
   treat it as "deny".

   Training (OPTIONAL, default "deny"):  Whether AI systems may use
      content for model training.

   Scraping (OPTIONAL, default "allow"):  Whether AI agents may scrape
      or read content.


   Indexing (OPTIONAL, default "allow"):  Whether AI systems may index
      content for retrieval.

   Caching (OPTIONAL, default "allow"):  Whether AI systems may cache
      content.
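
   The defaults and the handling of "conditional" described in this
   section might be applied as follows.  This is a sketch only: the
   helper name site_policy is hypothetical, and treating an
   unrecognized value as "deny" is an assumption of this example, since
   this document does not define behavior for unknown values.

```python
# Defaults stated for the four content policy fields.
DEFAULTS = {"Training": "deny", "Scraping": "allow",
            "Indexing": "allow", "Caching": "allow"}

def site_policy(declared):
    """Return the effective site-wide value for each policy field.

    `declared` maps field names to the raw values found in the file.
    Hypothetical helper written for this illustration.
    """
    effective = {}
    for field, default in DEFAULTS.items():
        value = declared.get(field, default).strip()
        if value == "conditional" and field != "Training":
            value = "deny"  # "conditional" is only valid for Training
        elif value not in ("allow", "deny", "conditional"):
            # Assumption: unrecognized values fail closed in this sketch.
            value = "deny"
        effective[field] = value
    return effective
```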

2.5.  Training Path Fields

   When Training is "conditional", these fields specify per-path rules:

   Training-Allow (OPTIONAL):  Glob pattern for paths where training is
      permitted.

   Training-Deny (OPTIONAL):  Glob pattern for paths where training is
      denied.

   Multiple Training-Allow and Training-Deny lines MAY appear.  When
   more than one pattern matches a path, the more specific pattern takes
   precedence.
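
   One possible reading of the precedence rule is sketched below.  This
   document does not define how specificity is measured, so the example
   makes three labeled assumptions: the longest matching pattern is the
   most specific, a tie between an allow and a deny pattern resolves to
   deny, and a path that matches no pattern is denied.  The helper name
   training_allowed is hypothetical.  Matching uses fnmatchcase from the
   Python standard library, whose "*" also matches "/".

```python
from fnmatch import fnmatchcase

def training_allowed(path, allow_patterns, deny_patterns):
    """Decide whether training is permitted for `path` when the
    site-wide Training value is "conditional".  Illustrative only.
    """
    best_len, best_verdict = -1, False  # assumption: no match means deny
    rules = [(p, True) for p in allow_patterns] + \
            [(p, False) for p in deny_patterns]
    for pattern, verdict in rules:
        if not fnmatchcase(path, pattern):
            continue
        # Assumption: pattern length approximates specificity.
        if len(pattern) > best_len:
            best_len, best_verdict = len(pattern), verdict
        elif len(pattern) == best_len and not verdict:
            best_verdict = False  # assumption: ties resolve to deny
    return best_verdict
```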

2.6.  Licensing Fields

   Training-License (OPTIONAL):  SPDX license identifier [SPDX] for AI
      training use (e.g., "CC-BY-4.0").

   Training-Fee (OPTIONAL):  URL to commercial licensing or pricing
      page.

2.7.  Agent Blocks

   Agent blocks declare per-agent policy overrides.  The wildcard "*"
   sets the default for all agents.

   Agent: *
     Rate-Limit: 60/minute

   Agent: ClaudeBot
     Training: allow
     Rate-Limit: 200/minute

   Agent: GPTBot
     Training: deny
     Scraping: deny

   Agent identifiers SHOULD match the first token of the agent's User-
   Agent header (case-insensitive).
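
   Selecting the block that applies to a request could look like the
   sketch below.  It assumes that "first token" means the first product
   name in the User-Agent value, that is, the text before any
   "/version" suffix, and it returns only the matched block without
   merging in fields from the wildcard block; neither choice is stated
   by this document.  The helper name select_agent_block is
   hypothetical.

```python
def select_agent_block(user_agent, agents):
    """Return the Agent block that applies to a User-Agent header value.

    `agents` maps agent identifiers to their field dictionaries.
    Hypothetical helper for illustration.
    """
    parts = user_agent.split()
    # Assumption: "first token" is the first product name, i.e. the
    # text before any "/version" suffix.
    name = parts[0].split("/", 1)[0].lower() if parts else ""
    for identifier, block in agents.items():
        if identifier != "*" and identifier.lower() == name:
            return block
    return agents.get("*", {})  # fall back to the wildcard default
```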

   Fields within an Agent block:


   *  Training, Scraping, Indexing, Caching: Override site-wide policy

   *  Rate-Limit: Advisory rate limit in "N/window" format (second,
      minute, hour, day)
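
   A Rate-Limit value in "N/window" form can be parsed as sketched
   below.  The helper name parse_rate_limit and the choice to return
   None for malformed input are assumptions of this example.

```python
# Window names listed for the Rate-Limit field, in seconds.
WINDOW_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}

def parse_rate_limit(value):
    """Parse "N/window" into (count, window_seconds).

    Returns None when the value is malformed.  Hypothetical helper.
    """
    count_text, sep, window = value.strip().partition("/")
    if not sep or window not in WINDOW_SECONDS:
        return None
    if not (count_text.isascii() and count_text.isdigit()):
        return None
    return int(count_text), WINDOW_SECONDS[window]
```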

2.8.  Content Requirement Fields

   Attribution (OPTIONAL):  Whether AI outputs must attribute the
      source.  One of: "required", "recommended", "none".

   AI-Disclosure (OPTIONAL):  Whether AI-generated content derived from
      this site must be disclosed as AI-generated.  One of: "required",
      "recommended", "none".

2.9.  Compliance Fields

   Audit (OPTIONAL):  Whether AI agents must provide audit receipts.
      One of: "required", "optional", "none".

   Audit-Format (OPTIONAL):  Expected audit format identifier (e.g.,
      "rer-artifact/0.1").

3.  The "ai.json" Well-Known URI

3.1.  Location

   The JSON companion file MUST be served at:

   https://example.com/.well-known/ai.json

   The file MUST be served with Content-Type "application/json;
   charset=utf-8".

3.2.  Format

   The JSON format carries the same information as "ai.txt" in a typed
   JSON structure suitable for direct consumption by programmatic
   clients.  The "ai.txt" file MAY reference the JSON file via:

   AI-JSON: https://example.com/.well-known/ai.json

   A minimal "ai.json" document:


   {
     "specVersion": "1.0",
     "site": {
       "name": "My Blog",
       "url": "https://myblog.com"
     },
     "policies": {
       "training": "deny",
       "scraping": "allow",
       "indexing": "allow",
       "caching": "allow"
     },
     "agents": {
       "*": {}
     }
   }

   Field semantics are identical to those defined in Section 2 for the
   text format, with one structural difference: in the JSON form, the
   "specVersion" member, the "policies" member (with all four of its
   "training", "scraping", "indexing", and "caching" members), and the
   "agents" member are REQUIRED.  Defaults that the text format applies
   implicitly MUST be stated explicitly in JSON documents.
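
   A consumer might check the REQUIRED members before using a document,
   as in the sketch below.  The helper name missing_required_members
   and the dotted naming of nested members in its result are
   assumptions of this example.

```python
import json

REQUIRED_POLICIES = ("training", "scraping", "indexing", "caching")

def missing_required_members(document_text):
    """List the REQUIRED members absent from an ai.json document.

    Hypothetical helper; returns an empty list when none are missing.
    """
    doc = json.loads(document_text)
    if not isinstance(doc, dict):
        return ["specVersion", "policies", "agents"]
    missing = [name for name in ("specVersion", "policies", "agents")
               if name not in doc]
    policies = doc.get("policies")
    if isinstance(policies, dict):
        missing += ["policies." + name for name in REQUIRED_POLICIES
                    if name not in policies]
    return missing
```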

4.  Agent Behavior

4.1.  Discovery

   AI agents and crawlers SHOULD fetch "/.well-known/ai.txt" and/or
   "/.well-known/ai.json" before interacting with an unfamiliar site.

   Agents SHOULD prefer the JSON format when both are available.

   Agents SHOULD cache the policy for the duration declared by the HTTP
   Cache-Control header, with a minimum TTL of 60 seconds.
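
   The caching rule can be sketched as a function of the Cache-Control
   header value.  The helper name policy_ttl is hypothetical.  The
   example reads only the max-age directive and, as an assumption,
   raises a missing or shorter value to the 60-second minimum; other
   directives such as no-store are ignored for brevity.

```python
import re

MIN_TTL_SECONDS = 60  # minimum TTL stated above

def policy_ttl(cache_control):
    """Return how long, in seconds, to cache a fetched policy file.

    `cache_control` is the Cache-Control header value, or None when
    the header is absent.  Hypothetical helper for illustration.
    """
    match = re.search(r"\bmax-age=(\d+)", cache_control or "",
                      re.IGNORECASE)
    declared = int(match.group(1)) if match else 0
    # Assumption: a missing or shorter max-age is raised to the minimum.
    return max(declared, MIN_TTL_SECONDS)
```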

4.2.  Compliance

   "ai.txt" is advisory.  It declares the site owner's policy.
   Compliance is expected in good faith but is not enforced by the file
   itself.

   Agents SHOULD respect Training declarations by not using content for
   model training when Training is "deny".

   Agents SHOULD respect rate limit declarations.


   Servers MUST enforce rate limits and access control independently of
   the declarations in "ai.txt".

5.  Security Considerations

   Policy declarations MUST NOT include actual credentials, tokens, or
   secrets of any kind.

   "ai.txt" is advisory; servers MUST enforce policies independently.

   Agents MUST validate that referenced URLs use HTTPS before following
   them.
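
   The HTTPS check might be implemented as in the sketch below, using
   urlsplit from the Python standard library.  The helper name
   is_https_url is hypothetical, and additionally requiring a host
   component is an assumption of this example.

```python
from urllib.parse import urlsplit

def is_https_url(url):
    """Return True only for absolute URLs with the https scheme
    and a host.  Hypothetical helper for illustration.
    """
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, for example a broken IPv6 host
    return parts.scheme == "https" and bool(parts.hostname)
```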

   Site owners SHOULD review their "ai.txt" periodically to ensure it
   accurately reflects current policy.

6.  IANA Considerations

6.1.  Well-Known URI Registration: "ai.txt"

   This document requests registration of the following Well-Known URI
   in the "Well-Known URIs" registry established by [RFC8615]:

   URI suffix:  ai.txt

   Change controller:  Kayla Cardillo

   Specification document(s):  This document.

   Related information:  Text-format AI policy declaration file.  Allows
      website operators to declare their AI content policy: training
      permissions, licensing terms, per-agent rules, and compliance
      requirements.

6.2.  Well-Known URI Registration: "ai.json"

   This document requests registration of the following Well-Known URI
   in the "Well-Known URIs" registry established by [RFC8615]:

   URI suffix:  ai.json

   Change controller:  Kayla Cardillo

   Specification document(s):  This document.

   Related information:  JSON-format AI policy declaration file.
      Companion format to ai.txt.


7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/rfc/rfc2119>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.

   [RFC8615]  Nottingham, M., "Well-Known Uniform Resource Identifiers
              (URIs)", RFC 8615, DOI 10.17487/RFC8615, May 2019,
              <https://www.rfc-editor.org/rfc/rfc8615>.

   [RFC9110]  Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
              Ed., "HTTP Semantics", STD 97, RFC 9110,
              DOI 10.17487/RFC9110, June 2022,
              <https://www.rfc-editor.org/rfc/rfc9110>.

7.2.  Informative References

   [AIPREF-ATTACH]
              "Attaching AI Usage Preferences to Content", Work in
              Progress, Internet-Draft, draft-ietf-aipref-attach, 2026,
              <https://datatracker.ietf.org/doc/draft-ietf-aipref-
              attach/>.

   [AIPREF-VOCAB]
              "A Vocabulary for Expressing AI Usage Preferences", Work
              in Progress, Internet-Draft, draft-ietf-aipref-vocab,
              2026, <https://datatracker.ietf.org/doc/draft-ietf-aipref-
              vocab/>.

   [CF-CONTENT-SIGNALS]
              "Cloudflare Content Signals Policy", 2025,
              <https://blog.cloudflare.com/content-signals-policy/>.

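   [RFC9116]  Foudil, E. and Y. Shafranovich, "A File Format to Aid in
              Security Vulnerability Disclosure", RFC 9116,
              DOI 10.17487/RFC9116, April 2022,
              <https://www.rfc-editor.org/rfc/rfc9116>.

   [ROBOTS]   Koster, M., Illyes, G., Zeller, H., and L. Sassman,
              "Robots Exclusion Protocol", RFC 9309,
              DOI 10.17487/RFC9309, September 2022,
              <https://www.rfc-editor.org/rfc/rfc9309>.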


   [SPAWNING-AITXT]
              "ai.txt -- Generate ai.txt files for your website", 2023,
              <https://site.spawning.ai/spawning-ai-txt>.

   [SPDX]     "SPDX License List", 2024, <https://spdx.org/licenses/>.

   [TDMREP]   "TDM Reservation Protocol", 2022,
              <https://www.w3.org/community/tdmrep/>.

Appendix A.  Example: News Site

   # ai.txt - AI Policy Declaration
   Spec-Version: 1.0

   Site-Name: News Daily
   Site-URL: https://newsdaily.com
   Contact: ai@newsdaily.com
   Policy-URL: https://newsdaily.com/ai-policy

   Training: conditional
   Scraping: allow
   Indexing: allow
   Caching: allow

   Training-Allow: /articles/free/*
   Training-Deny: /articles/premium/*
   Training-License: CC-BY-4.0
   Training-Fee: https://newsdaily.com/ai-licensing

   Agent: *
     Rate-Limit: 30/minute

   Agent: ClaudeBot
     Training: allow
     Rate-Limit: 120/minute

   Agent: GPTBot
     Training: deny

   Attribution: required
   AI-Disclosure: required

Appendix B.  Acknowledgments

   The "ai.txt" format draws on the design of "robots.txt" [ROBOTS] and
   "security.txt" [RFC9116] for structural inspiration.  The SPDX
   license identifiers referenced in Training-License are maintained by
   the Linux Foundation [SPDX].


Author's Address

   Kayla Cardillo
   Independent
   Email: contactkaylacard@gmail.com

