Questions that the ISE would like you to answer:
A. Is the subject of this document relevant to the RFC Series?

This document addresses a real operational need and is relevant to the RFC Series. It 
defines how URIs, email addresses, and IP addresses should be represented so that they 
are not inadvertently activated by general-purpose tools while remaining readable by 
humans. The motivation for a single canonical form would be further strengthened by 
explicitly acknowledging that human-readable documents are also routinely parsed by 
machines to extract IOCs, making deterministic de-obfuscation a practical requirement.


B. Is this document technically competent, as far as you can tell?

Does the work build upon industry state-of-the-art?  Are protocols and interfaces well specified?

Yes.

C. Is this document in reasonable (not necessarily final) editorial shape?

While the document is generally sound, I think some editorial revisions would 
strengthen it.

D. Are the Abstract and Introduction of this document reasonably clear?

Yes.

E. Does the document make clear upfront how the specification does/does not relate to past or current IETF activities?

Yes.

F. How else can the document be improved?

Here are some comments provided inline. These are only suggestions; feel free to ignore them.


Network Working Group                                  S. Grimminck, Ed.
Internet-Draft                                               26 May 2026
Intended status: Informational                                          
Expires: 27 November 2026


      Safe and Reversible Sharing of Malicious URLs and Indicators
                  draft-grimminck-safe-ioc-sharing-11

Abstract

   This document codifies a consistent and reversible convention used in
   the threat intelligence and security communities for sharing
   potentially malicious indicators of compromise (IOCs), such as URLs,
   IP addresses, email addresses, and domain names.  It describes a safe
   obfuscation format that reduces the risk of accidental execution or
   activation when IOCs are displayed or transmitted. 

<mglt> 

I recommend keeping the Abstract at a high level and deferring the mechanism details (bracket treatment of schemes, colons, nested indicators) to the body of the document, as they assume familiarity with conventions not yet introduced.

</mglt>

 The recommended
   form brackets the URI scheme name so that the string is not
   syntactically a valid URI per generic URI parsers, and extends the
   same bracket treatment to colons inside IPv6 literals; recognizable
   nested indicators within the Path, Query, or Fragment of a URI are
   obfuscated in place, and legacy scheme-substitution tokens are
   defined for de-obfuscation interoperability. 

 Safe-IOC strings are a
   textual rendering convention, not URIs, and are not intended to be
   processed by generic URI parsers.  These conventions aim to improve
   interoperability among tools and feeds that exchange threat
   intelligence data.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 27 November 2026.


Grimminck               Expires 27 November 2026                [Page 1]

Internet-Draft              Safe-IOC Sharing                    May 2026


Copyright Notice

   Copyright (c) 2026 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   4
   3.  Problem Statement . . . . . . . . . . . . . . . . . . . . . .   5
   4.  Canonical Transformation Rule . . . . . . . . . . . . . . . .   5
     4.1.  Step 1: Scheme  . . . . . . . . . . . . . . . . . . . . .   6
     4.2.  Step 2: Userinfo  . . . . . . . . . . . . . . . . . . . .   6
     4.3.  Step 3: Host  . . . . . . . . . . . . . . . . . . . . . .   6
     4.4.  Step 4: Nested Indicators . . . . . . . . . . . . . . . .   8
   5.  Formal ABNF Grammar . . . . . . . . . . . . . . . . . . . . .   8
   6.  De-obfuscation Techniques . . . . . . . . . . . . . . . . . .   9
     6.1.  Safety Check for Reversibility  . . . . . . . . . . . . .  10
   7.  Example Use Cases . . . . . . . . . . . . . . . . . . . . . .  10
   8.  Implementation Guidance . . . . . . . . . . . . . . . . . . .  11
   9.  Internationalized Domain Names  . . . . . . . . . . . . . . .  11
   10. Test Vectors  . . . . . . . . . . . . . . . . . . . . . . . .  11
   11. Security Considerations . . . . . . . . . . . . . . . . . . .  14
     11.1.  Relationship to the URI Scheme Registry  . . . . . . . .  14
     11.2.  Partial Obfuscation  . . . . . . . . . . . . . . . . . .  14
     11.3.  Parser Confusion . . . . . . . . . . . . . . . . . . . .  15
     11.4.  De-obfuscation in Non-Executable Contexts  . . . . . . .  15
     11.5.  Additional Considerations  . . . . . . . . . . . . . . .  15
   12. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  16
   13. References  . . . . . . . . . . . . . . . . . . . . . . . . .  16
     13.1.  Normative References . . . . . . . . . . . . . . . . . .  16
     13.2.  Informative References . . . . . . . . . . . . . . . . .  17
   Acknowledgements  . . . . . . . . . . . . . . . . . . . . . . . .  18
   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  18

1.  Introduction

   This document is an Informational specification published as an
   Independent Submission to the RFC series.  It is not an Internet
   Standard and does not represent the consensus of the Internet
   Engineering Task Force (IETF) or any of its working groups.


   The purpose of this document is to define a single transformation
   that converts a URL, IP address, domain name, or email address into
   an inert textual rendering, together with the inverse transformation
   that recovers the original value.  Specifying both directions enables
   interoperability among the systems that produce, transport, or
   consume these renderings while preventing accidental activation by
   general-purpose software that handles them in transit.

   The secure sharing of malicious artifacts is vital to threat
   intelligence, open-source intelligence (OSINT), and incident
   response.  However, sharing raw URLs, IP addresses, and email
   addresses associated with malware or threat actors poses a risk of
   accidental activation.

   Participants who routinely share indicators of compromise (IOCs)
   include security operations center (SOC) analysts, computer security
   incident response teams (CSIRTs), OSINT researchers, incident
   responders, and vendors of threat intelligence platforms and feeds.
   IOCs appear in email threads, instant-messaging channels, ticketing
   systems, PDF and HTML reports, blog posts, paste sites, machine-
   readable formats such as STIX [STIX21] / TAXII [TAXII21], and
   platforms such as MISP [MISP].  Both human readers and automated
   pipelines consume this material.

<mglt>

It would help me — and likely other readers — to better understand where in the data pipeline obfuscation is expected to occur and why a canonical form is needed.

My assumption is that structured formats such as STIX or OCSF carry raw (non-obfuscated) IOCs in their typed fields, and that obfuscation is applied only when generating human-readable outputs — PDF reports, HTML pages, emails, chat messages. If that assumption is correct, it might be worth stating it explicitly in the Introduction to frame the scope.

If the sole goal of obfuscation is to prevent accidental activation in human-readable contexts (ensuring a URL in a PDF is not treated as a clickable link), then any visually recognizable but syntactically invalid form would arguably suffice. The variety of legacy conventions (hxxp, [.], etc.) has served this purpose for years, which may explain why a non-deterministic approach was acceptable until now.

However, if these human-readable documents are also used as input to automated tools (and in practice, analysts routinely parse reports, emails, and paste-site content to extract IOCs programmatically), then a canonical format becomes valuable for enabling reliable, deterministic de-obfuscation. I think this is the key motivation, and making it more explicit early in the Introduction would strengthen the document.

It also seems valuable for general-purpose applications (PDF readers, browsers) to be able to recognize the obfuscated pattern so as to not make it active. This is a slightly different concern from de-obfuscation, but both benefit from a recognizable canonical form.

I note there are other situations where obfuscation is useful (e.g., hiding information from bots while keeping it readable for humans), but those seem to be different use cases not addressed here. A brief statement scoping the document away from such uses might be helpful.

A related question: when an obfuscated IOC appears in a structured format (say, a free-text field in STIX), how should a consumer distinguish an intentionally obfuscated indicator from a simply malformed URL? If structured formats provide a dedicated field type for obfuscated values, the ambiguity disappears. But if obfuscated IOCs may appear in fields typed as generic URLs, it would be useful for the document to discuss how to differentiate obfuscation from malformation.

Finally, it seems to me that standardizing a recognizable obfuscation pattern serves two complementary classes of consumers: (1) general-purpose applications (browsers, PDF viewers) that should recognize the pattern as not a valid URI and therefore not activate it, and (2) IOC-aware tools that should recognize the pattern as a recoverable indicator and de-obfuscate it with appropriate precautions. Ironically, having a well-known standard format makes the obfuscation "less hidden", but the point was never concealment from knowledgeable readers; it is about preventing unintentional activation by software. Making both roles explicit would help clarify the motivation.

</mglt>


   When a raw URI such as "https://malicious-host.example/path" is
   embedded in those channels, many systems automatically detect it and
   render it as a clickable or otherwise actionable link.  An analyst
   may then activate the resource unintentionally: navigating to an
   attacker-controlled URI can reveal the analyst's IP address and
   organizational affiliation, trigger delivery of malware, or alert the
   threat actor that a particular indicator is under active
   investigation.  Some mail and web infrastructure pre-fetches or
   resolves links for scanning or preview purposes, producing the same
   exposure without any deliberate user action.  PDF viewers and rich-
   text editors may turn strings that resemble URIs into hyperlinks even
   when the author intended plain text.


   A longstanding practice in the security community is to alter IOCs so
   that they remain human-readable but are not treated as live URIs by
   typical software: for example, replacing "." with "[.]" and wrapping
   the scheme name in brackets so that "http" becomes "[http]".

<mglt>

The Introduction mentions replacing "." with "[.]" as a longstanding practice, but it is not immediately clear whether this refers to dots in FQDNs, in IPv4 addresses, or both. It would be helpful to clarify this, and to briefly mention how IPv6 addresses have traditionally been handled (if at all) by existing conventions.

</mglt>

  Older
   conventions substituted characters within the scheme name (e.g.,
   "http" to "hxxp"), but many variant spellings emerged (e.g., "h**p",
   "hXXp"), hindering reliable parsing and automation.  This document
   defines a canonical form that uses uniform square-bracket wrapping
   for the scheme, dot, at-sign, and (inside IPv6 literals) colon
   delimiters, and a strict order of operations so that independent
   implementations can interoperate.  Legacy scheme substitutions
   ("hxxp", "hxxps") are documented for de-obfuscation interoperability
   but are NOT RECOMMENDED for new output.

   Safe-IOC strings produced by this specification are a textual
   rendering convention intended for human consumption and for lossy
   text channels.  They are deliberately not valid URIs per [RFC3986]
   and are not intended to be passed to generic URI parsers, resolvers,
   or dereferencing libraries.
<mglt>

If the obfuscated form is intended solely for human consumption, the need for a single canonical representation is harder to justify — any recognizably invalid form would serve the purpose. The standardization argument becomes compelling only when these strings are also expected to be machine-parsed, which might be worth reiterating here.

</mglt>

   The canonical "[scheme]" form does not occupy the URI scheme
   namespace (see Section 4).  The legacy tokens "hxxp" and "hxxps" do
   occupy that syntactic position.  The security implications are
   discussed in Section 11.  Implementations MUST NOT treat any
   obfuscated form as a resolvable URI scheme.

<mglt>

This paragraph discusses the relationship between "[scheme]" and the URI scheme namespace, which feels premature in the Introduction. I would suggest relocating it to the Security Considerations section (Section 11), where it fits more naturally alongside the existing discussion in Section 11.1.

</mglt>

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Terminology

   *Obfuscating:* The process of altering an indicator so that it cannot
   be accidentally activated or clicked.  The goal is to prevent
   automatic execution or resolution, not to conceal the content from
   human readers; the original indicator remains visually recognizable.

   *De-obfuscating:* The process of restoring an obfuscated indicator to
   its original, actionable form.

   *IOC:* Indicator of Compromise - data such as a URL, IP address,
   domain name, email address, or hash associated with malicious
   activity.


   *IP-literal:* The bracketed form of an IPv6 address as it appears in
   a URI authority component (e.g., "[2001:db8::1]"), per Section 3.2.2
   of [RFC3986].

   *Nested Indicator:* A URI, email address, or IP address literal that
   appears inside the Path, Query, or Fragment of another URI (for
   example, an open-redirect target carried as a query parameter, or an
   email address embedded in a redirect URL).

3.  Problem Statement

   Inconsistent obfuscation practices hinder the reliable and automated
   exchange of threat intelligence.  For example:

   *  A URL obfuscated as "h**p://example[.]example" cannot be reliably
      parsed by tools expecting "[http]://example[.]example".

<mglt>

The Introduction states that Safe-IOC strings are intended for human consumption, yet 
the Problem Statement motivates standardization by showing that inconsistent obfuscation
hinders automated parsing. These two claims are in tension. I think the core problem 
being solved is simply that no common convention exists — and this document provides 
one. It might help to frame the Problem Statement around the lack of a common convention
rather than around automated parsing failures, or alternatively, to adjust the 
Introduction to acknowledge both human and machine consumers explicitly.

</mglt>

   *  An IP address obfuscated with parentheses (e.g., "192.0.2(.)1")
      may fail to de-obfuscate in systems expecting "[.]".

   *  An IPv6 literal left in colon-hexadecimal form (e.g.,
      "[2001:db8::1]") remains parseable by URI libraries and may be
      auto-linked by document viewers, mail clients, and terminal
      emulators, producing the same activation risk that "[.]"
      substitution addresses for IPv4 and domain names.

   Such inconsistencies reduce the effectiveness of threat detection and
   response.

<mglt>

The claim that inconsistencies "reduce the effectiveness of threat detection and 
response" holds primarily when these human-readable documents are also parsed by 
machines. If they are only read by humans, the variety of obfuscation styles is not a 
significant obstacle — an analyst can recognize "hxxp", "h**p", or "[http]" equally 
well. It might be worth qualifying this statement accordingly.


</mglt>

4.  Canonical Transformation Rule

<mglt>

The double-bracketing prevention rule is introduced before the basic transformation 
steps have been defined, which makes it harder to follow. I would suggest presenting the
four steps first, then addressing idempotency and detection of already-obfuscated 
strings afterward — this also ties into the broader question of how implementations 
detect whether an input is an obfuscated IOC in the first place.

</mglt>

   To prevent double-bracketing (e.g.,
   "[[https]]://example[[.]]example") when a tool processes the same
   string twice or in the wrong order, implementations MUST apply
   transformations in the following strict order of operations.
   Implementations MUST treat already-obfuscated substrings (the tokens
   "[scheme]", "[.]", "[@]", and "[:]") as opaque and MUST NOT apply
   transformations to them again; thus, the transformation is
   idempotent.

<mglt>

Same comment as above — the percent-encoding handling details would be easier to 
follow if presented after the basic transformation steps are introduced, rather than 
before them.

</mglt>

   Percent-encoding is handled as follows.  Percent-encoded forms of the
   delimiters ".", "@", and ":" in the Host or Userinfo of an input URI
   (i.e., "%2e" or "%2E" for ".", "%40" for "@", and "%3a" or "%3A" for
   ":") MUST be decoded to their literal form before Steps 2 and 3 are
   applied; the resulting literal characters are then bracketed by those
   steps.  Other percent-encoded content in the Host or Userinfo (e.g.,
   "%65" for "e") does not affect the obfuscation outcome and MAY be
   decoded or preserved verbatim.  Percent-encoded content inside a
   Path, Query, or Fragment MUST NOT be decoded during obfuscation, so
   that independent implementations produce the same output for the same
   input.  The output of the transformation MUST NOT contain percent-
   encoded forms of the bracketed delimiters, nor token-like sequences
   such as "[%2e]", "[%40]", or "[%3a]".
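
The delimiter-decoding rule above can be illustrated with a minimal Python sketch. The function name decode_delimiters is hypothetical and does not appear in the draft. The sketch assumes the caller has already isolated the Host or Userinfo text, and it decodes only the three delimiter escapes, leaving every other escape verbatim (one of the two options the draft allows).

```python
import re

# Percent-encoded forms of ".", "@" and ":" (hex digits in either case).
_DELIM = re.compile(r"%(2e|40|3a)", re.IGNORECASE)


def decode_delimiters(host_or_userinfo: str) -> str:
    """Decode only %2e, %40 and %3a; leave every other escape untouched.

    Assumption: the argument is the Host or Userinfo text only, never a
    Path, Query, or Fragment (those are not decoded during obfuscation).
    """
    return _DELIM.sub(lambda m: chr(int(m.group(1), 16)), host_or_userinfo)


print(decode_delimiters("evil%2Eexample"))    # evil.example
print(decode_delimiters("%65vil%2eexample"))  # %65vil.example
```

Leaving "%65" untouched keeps the sketch small. An implementation that decodes it as well would produce the same bracket placement.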

4.1.  Step 1: Scheme

<mglt>

Step 1 (Scheme) applies only to URIs, not to bare IP addresses or email addresses. Yet
the algorithm is presented as a single linear sequence applied uniformly to all IOC 
types. Since the Introduction identifies URLs, IPs, domain names, and emails as distinct
indicator types — and since these types are not fully independent (a URI may contain an
FQDN or IP, an email contains an FQDN) — it might be clearer to first define the 
elementary transformations per IOC type (dot-bracketing for FQDNs and IPv4, colon-
bracketing for IPv6, at-bracketing for emails), and then describe how these compose when
obfuscating a full URI. The current single-function approach could then be presented as
the combined algorithm, with the understanding that implementations will typically know
whether the input is a URI, an IP, or an email address.

</mglt>


   Identify the URI scheme and wrap it in square brackets.  For example,
   "http" becomes "[http]", "https" becomes "[https]", "ftp" becomes
   "[ftp]".  The leading "[" character is not permitted in a URI scheme
   name (Section 3.1 of [RFC3986]), so the result cannot be mistaken for
   a valid URI by compliant parsers.  Because the transformation is
   purely syntactic wrapping, it extends to any current or future scheme
   without requiring a per-scheme mapping table; for example,
   "smb://fileserver.example/share" becomes
   "[smb]://fileserver[.]example/share".  Only the scheme name is
   wrapped; the scheme delimiter ("://" for hierarchical URIs, ":" for
   schemes such as mailto: or tel:) is preserved unchanged.  The case of
   the scheme name is preserved verbatim.  Applications that require
   case-folded schemes can normalize before or after applying this
   transformation.
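
As an illustration of Step 1, the following Python sketch wraps a leading scheme name in brackets. The name wrap_scheme is hypothetical. The sketch assumes the caller already knows the input is a URI: a bare "host:port" string would otherwise be misread as a scheme. It leaves the host untouched because that is the job of Step 3.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":".
_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def wrap_scheme(uri: str) -> str:
    """Wrap the scheme name in square brackets, preserving its case.

    Assumption: `uri` is known to be a URI. A string that already starts
    with "[" does not match the scheme pattern and is returned unchanged,
    so applying the function twice does not double-bracket.
    """
    return _SCHEME.sub(r"[\1]:", uri, count=1)


print(wrap_scheme("smb://fileserver.example/share"))
# [smb]://fileserver.example/share
print(wrap_scheme("[https]://example.example/"))
# [https]://example.example/
```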

<mglt>

In the preamble of this section, there was a warning against applying the rules twice and producing double bracketing. Since a bracket is not permitted in a scheme name, and since a URL always has a scheme, I wonder whether this could be implemented by specifying that a scheme containing a bracket is either invalid or has already been processed.

</mglt>

   The legacy tokens "hxxp" and "hxxps" are in widespread operational
   use as obfuscated forms of "http" and "https" respectively.
   Implementations encountering these tokens in existing data SHOULD
   recognize them as obfuscated indicators during de-obfuscation (see
   Section 6).  These legacy forms are NOT RECOMMENDED for new output.
   No other legacy substitutions (e.g., "h**p", "fxxps") are defined by
   this specification.

<mglt>

The legacy token handling (hxxp, hxxps) is interleaved with the core obfuscation 
algorithm, adding complexity to a section that should focus on the canonical 
transformation. Since these tokens are NOT RECOMMENDED for new output and exist solely 
for backward compatibility with existing data, I would suggest moving them to a 
dedicated section (e.g., "Legacy Compatibility"). This would keep the core specification
clean and make it clear that legacy recognition is only relevant for de-obfuscation of 
pre-existing content — not for producing new obfuscated output. Additionally, if the 
primary use case is generating obfuscated IOCs from structured formats (STIX, OCSF) into
human-readable reports, legacy tokens are irrelevant to that direction entirely; they 
only matter when those reports are subsequently parsed back as input.

</mglt> 

4.2.  Step 2: Userinfo

   Identify the "@" symbol in the userinfo subcomponent (per [RFC3986])
   and replace it with "[@]".  This applies to email addresses and URIs
   containing userinfo (e.g., "username:password@host").

4.3.  Step 3: Host

   Replace all "." (period) characters in the Host subcomponent with
   "[.]".  This applies to domain names and IPv4 addresses, including
   standalone values (e.g., "evil.example" or "198.51.100.1" without a
   scheme).  IPv4 addresses use dotted-decimal notation that overlaps
   with domain name syntax, so the "[.]" substitution applies to them
   naturally.  Inside an IPv6 literal, replace every ":" (colon) with
   "[:]"; if the literal contains an embedded IPv4 address (e.g.,
   "::ffff:192.0.2.1" per Section 2.5.5 of [RFC4291]), each "." in that
   embedded IPv4 MUST also be replaced with "[.]".  The outer square
   brackets that surround an IP-literal in URI syntax (Section 3.2.2 of
   [RFC3986]) are part of the URI grammar, not part of the obfuscation,
   and MUST be preserved verbatim.  Port numbers following the host
   (e.g., ":8080" or the port following an IP-literal) are not part of
   the Host subcomponent and MUST NOT be altered.
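
Steps 2 and 3 applied to a URI authority can be sketched as follows. The function name obfuscate_authority and the regular expression are illustrative assumptions, not part of the draft. The sketch expects a raw (not yet obfuscated) authority string whose delimiter escapes have already been decoded.

```python
import re

# authority = [ userinfo "@" ] host [ ":" port ]   (simplified)
_AUTHORITY = re.compile(
    r"(?:(?P<user>[^@]*)@)?"            # optional userinfo
    r"(?P<host>\[[^\]]*\]|[^:]*)"       # IP-literal in brackets, or name/IPv4
    r"(?P<port>:\d*)?"                  # optional port, never altered
)


def obfuscate_authority(authority: str) -> str:
    """Bracket "@" and host delimiters; keep port and IP-literal brackets."""
    m = _AUTHORITY.fullmatch(authority)
    if m is None:
        return authority  # not recognized; leave unchanged
    user, host, port = m.group("user"), m.group("host"), m.group("port") or ""
    if host.startswith("["):
        # IPv6 literal: bracket colons and any embedded IPv4 dots, but
        # preserve the outer URI brackets verbatim.
        inner = host[1:-1].replace(":", "[:]").replace(".", "[.]")
        host = "[" + inner + "]"
    else:
        host = host.replace(".", "[.]")
    prefix = "" if user is None else user + "[@]"
    return prefix + host + port


print(obfuscate_authority("user:pw@evil.example:8080"))
# user:pw[@]evil[.]example:8080
print(obfuscate_authority("[2001:db8::1]:8080"))
# [2001[:]db8[:][:]1]:8080
```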

   Zone identifiers (per [RFC4007]) may appear attached to an IPv6
   address as the "%<zone-id>" suffix in bare form, or as the "%25<zone-
   id>" suffix inside a URI IP-literal.  The URI-literal encoding was
   originally defined in [RFC6874], which has since been obsoleted by
   [RFC9844] without a replacement URI syntax; the "%25<zone-id>" form
   remains in operational use and is treated here as the de facto
   encoding.  Whichever form is present is preserved verbatim by this
   transformation.

   Bare IPv6 literals without surrounding URI brackets (e.g.,
   "2001:db8::1") MUST receive the same colon-bracketing treatment.
   Implementations SHOULD recognize bare IPv6 by the presence of either
   a "::" shorthand or the fully expanded eight-group colon-hexadecimal
   form so that strings such as "host:port" (a single colon) are not
   misclassified.  Intermediate forms (more than one colon but neither a
   "::" shorthand nor a full eight-group expansion, e.g., a malformed
   "2001:db8:1:2:3:4:5") are not recognized as IPv6 by this rule and are
   left unchanged.
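
The recognition rule for bare IPv6 can be sketched as a small predicate. The name looks_like_bare_ipv6 is hypothetical. Restricting the "::" case to hexadecimal digits, colons, and dots is an added assumption, made so that strings such as "std::string" are not misclassified. Zone identifiers are not handled here.

```python
import re

# Fully expanded form: eight groups of one to four hex digits.
_FULL = re.compile(r"[0-9A-Fa-f]{1,4}(?::[0-9A-Fa-f]{1,4}){7}")
# Assumption: a "::" shorthand only counts if the whole string is made of
# hex digits, colons and (for an embedded IPv4 address) dots.
_ADDRESS_CHARS = re.compile(r"[0-9A-Fa-f:.]+")


def looks_like_bare_ipv6(text: str) -> bool:
    """Return True for "::" shorthand or full eight-group notation."""
    if _FULL.fullmatch(text):
        return True
    return "::" in text and _ADDRESS_CHARS.fullmatch(text) is not None


print(looks_like_bare_ipv6("2001:db8::1"))         # True
print(looks_like_bare_ipv6("host:8080"))           # False
print(looks_like_bare_ipv6("2001:db8:1:2:3:4:5"))  # False
```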

   Addresses presented with a CIDR prefix-length suffix (e.g.,
   "192.0.2.0/24" or "2001:db8::/32") are handled by applying the host-
   level transformations to the address portion only; the "/" and the
   prefix-length digits are preserved verbatim.  For example,
   "192.0.2.0/24" becomes "192[.]0[.]2[.]0/24" and "2001:db8::/32"
   becomes "2001[:]db8[:][:]/32".
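
The CIDR handling can be illustrated with a short Python sketch. The name obfuscate_address is hypothetical. The sketch assumes the input is a bare IPv4 or IPv6 address with an optional prefix length and no port. Tokens that are already bracketed are matched first and passed through unchanged.

```python
import re

# Match an existing "[.]" / "[:]" token first, otherwise a raw delimiter.
_TOKEN_OR_DELIM = re.compile(r"\[[.:]\]|[.:]")


def obfuscate_address(text: str) -> str:
    """Bracket "." and ":" in the address part; keep "/prefix" verbatim.

    Assumption: `text` is a bare address (no port, no surrounding URI).
    """
    address, slash, prefix = text.partition("/")

    def bracket(m: re.Match) -> str:
        s = m.group(0)
        return s if s.startswith("[") else "[" + s + "]"

    return _TOKEN_OR_DELIM.sub(bracket, address) + slash + prefix


print(obfuscate_address("192.0.2.0/24"))   # 192[.]0[.]2[.]0/24
print(obfuscate_address("2001:db8::/32"))  # 2001[:]db8[:][:]/32
```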

   For schemes whose content after the scheme delimiter is not a
   hierarchical authority (e.g., "mailto:" per [RFC6068], or "urn:"),
   implementations apply Step 2 to every "@" and Step 3 to every "."
   within the scheme body, up to the first "?" or "#" delimiter.

   When a scheme is present, host subcomponent boundaries are determined
   by the URI syntax defined in [RFC3986].  For bare indicators without
   a scheme, implementations identify the host by matching the input
   against one of the following productions, in order:

   *  A dotted-decimal IPv4 address per the IPv4address production of
      Section 3.2.2 of [RFC3986];

   *  A bare IPv6 literal as defined above (containing "::" or the fully
      expanded eight-group colon-hexadecimal form);


   *  A domain name matching the obf-host production from Section 5
      after temporary removal of any bracketing already present (one or
      more dot-separated labels, each matching obf-label).

   Inputs that match none of these productions are not Safe-IOC inputs
   under this specification and are left unchanged.

4.4.  Step 4: Nested Indicators

   Implementations MUST NOT apply Steps 2 and 3 wholesale to the Path,
   Query, or Fragment of the primary URI; doing so would alter semantics
   (for example, by bracketing dots in file extensions or query values)
   and is not reversible without ambiguity.

   Implementations locate nested indicators by scanning the literal
   (undecoded) text of the Path, Query, and Fragment for spans matching
   one of the grammars listed below; each matched span is obfuscated in
   place by recursively applying Steps 1 through 3.  For the purposes of
   this step, a nested indicator is one of the following:

   *  A URI with a scheme (including "mailto:" URIs).

   *  A bare email address (an addr-spec per Section 3.4.1 of
      [RFC5322]).

   *  A bare IPv4 or IPv6 address literal.

   Only spans matching one of the grammars above are obfuscated; dots,
   "@", and ":" characters that appear elsewhere in the Path, Query, or
   Fragment MUST be preserved verbatim.
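
A deliberately narrow sketch of Step 4 follows. It handles only one of the three nested-indicator kinds, a literal "scheme://host" span, and the name obfuscate_nested and its pattern are hypothetical. The input is assumed to be the undecoded Path, Query, and Fragment text of the primary URI. Percent-encoded nested URIs are therefore not matched, consistent with scanning the literal text.

```python
import re

# Hypothetical narrow pattern: a nested "scheme://host" span with a
# domain-name or IPv4 host. Email addresses and IPv6 are out of scope here.
_NESTED_URI = re.compile(r"([A-Za-z][A-Za-z0-9+.-]*)://([A-Za-z0-9.-]+)")


def obfuscate_nested(tail: str) -> str:
    """Obfuscate nested scheme://host spans; leave everything else as is."""

    def repl(m: re.Match) -> str:
        scheme, host = m.group(1), m.group(2)
        return "[" + scheme + "]://" + host.replace(".", "[.]")

    return _NESTED_URI.sub(repl, tail)


print(obfuscate_nested("/redir?u=http://evil.example/a.php&x=1.2"))
# /redir?u=[http]://evil[.]example/a.php&x=1.2
```

Dots outside the matched span ("a.php", "1.2") are preserved, as the step requires.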

5.  Formal ABNF Grammar

   The following uses Augmented BNF (ABNF) per [RFC5234] to define the
   tokens used in obfuscated IOC strings.  The scheme production matches
   the syntax in Section 3.1 of [RFC3986].  An implementation MAY use
   this grammar to validate whether a string is already obfuscated or
   still requires processing.


   safe-scheme    = "[" scheme "]"
   scheme         = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
   legacy-scheme  = "hxxp" / "hxxps"
   safe-dot       = "[" "." "]"
   safe-at        = "[" "@" "]"
   safe-colon     = "[" ":" "]"

   obf-label      = 1*( ALPHA / DIGIT / "-" )
   obf-host       = obf-label 1*( safe-dot obf-label )
   obf-local      = 1*( ALPHA / DIGIT / "." / "_" / "%"
                        / "+" / "-" )
   obf-email      = obf-local safe-at obf-host

                                  Figure 1

   The obf-label production permits any character allowed in a DNS label
   but does not enforce DNS label structure rules; applications that
   will resolve the recovered host SHOULD additionally enforce the no-
   leading-or-trailing-hyphen and 63-octet-maximum rules per Section 3.5
   of [RFC1035].  The obf-host production matches a host of two or more
   labels; a single-label host has no dots to bracket and is left
   unchanged by Step 3.  The composition of a full URI (scheme,
   authority, path, query, fragment) and of an obfuscated IPv6 literal
   is not restated here; both are defined algorithmically in Section 4
   and rely on the URI grammar of [RFC3986] and the address grammar of
   [RFC4291].

   A compliant implementation MUST recognize a string as obfuscated when
   it contains any of the tokens safe-scheme, legacy-scheme, safe-dot,
   safe-at, or safe-colon in the roles described in this document.
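
The recognition requirement can be approximated with a single regular expression built from the tokens in Figure 1. This is a heuristic sketch: the name is_obfuscated is hypothetical, and requiring a ":" after a scheme token is an assumption used to approximate "in the roles described in this document". Only the lowercase legacy spellings are matched.

```python
import re

_TOKENS = re.compile(
    r"\[[A-Za-z][A-Za-z0-9+.-]*\](?=:)"  # safe-scheme, followed by ":"
    r"|\bhxxps?(?=:)"                    # legacy-scheme, followed by ":"
    r"|\[[.@:]\]"                        # safe-dot / safe-at / safe-colon
)


def is_obfuscated(text: str) -> bool:
    """Return True if `text` contains at least one obfuscation token."""
    return _TOKENS.search(text) is not None


print(is_obfuscated("[http]://example[.]example"))  # True
print(is_obfuscated("http://[2001:db8::1]:8080/"))  # False
```

The second call shows that an ordinary URI IP-literal is not mistaken for a token, because its bracketed content is neither a scheme name followed by ":" nor a single delimiter.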

6.  De-obfuscation Techniques

   Tools designed to ingest obfuscated data SHOULD automatically reverse
   these transformations in a deterministic manner:

   *  Strip enclosing brackets from "[scheme]" to recover the original
      scheme name (e.g., "[https]" becomes "https").

   *  Convert legacy token "hxxps" back to "https".

   *  Convert legacy token "hxxp" back to "http".

   *  Convert "[.]" back to ".".

   *  Convert "[@]" back to "@".

   *  Convert "[:]" back to ":".


   The substitutions listed above are non-overlapping and may be applied
   in any order, with one exception: for legacy tokens, longer strings
   MUST be reversed before shorter prefixes that are substrings of them
   (e.g., reverse "hxxps" before "hxxp").  De-obfuscation MUST maintain
   the original semantics of the data to avoid misinterpretation.
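
The reverse substitutions can be sketched in one pass with Python's re module. The name deobfuscate is hypothetical, and the sketch matches only the lowercase legacy spellings. Because "hxxps?" is greedy, the longer legacy token is tried before the shorter one. The returned string is a live indicator and is assumed to be kept in a non-rendering context.

```python
import re

_TOKEN = re.compile(
    r"\[([A-Za-z][A-Za-z0-9+.-]*)\](?=:)"  # group 1: bracketed scheme
    r"|\b(hxxps?)(?=:)"                    # group 2: legacy scheme token
    r"|\[([.@:])\]"                        # group 3: bracketed delimiter
)
_LEGACY = {"hxxp": "http", "hxxps": "https"}


def deobfuscate(text: str) -> str:
    """Reverse the obfuscation tokens; leave all other text untouched."""

    def restore(m: re.Match) -> str:
        if m.group(1):
            return m.group(1)
        if m.group(2):
            return _LEGACY[m.group(2)]
        return m.group(3)

    return _TOKEN.sub(restore, text)


print(deobfuscate("hxxps://evil[.]example"))              # https://evil.example
print(deobfuscate("[http]://[2001[:]db8[:][:]1]:8080/"))  # http://[2001:db8::1]:8080/
```

In the second call the outer IP-literal brackets survive because they match none of the three token patterns.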

   The percent-decoding of delimiter forms required during obfuscation
   (Section 4) is not restored by de-obfuscation: an input such as
   "evil%2eexample" is obfuscated to "evil[.]example", and the inverse
   produces "evil.example".  The de-obfuscated value is semantically
   equivalent to the original under [RFC3986] but is not necessarily
   byte-for-byte identical.

   Safe-IOC strings may contain two kinds of square brackets: the URI
   IP-literal brackets (Section 3.2.2 of [RFC3986]) and the obfuscation
   tokens "[scheme]", "[.]", "[@]", and "[:]".  When an IPv6 URI is
   obfuscated, adjacent "[" or "]" characters may appear (e.g.,
   "[http]://[[:][:]1]").  These are adjacent tokens, not nested
   brackets.  Parsers SHOULD tokenize left-to-right, attempting to match
   the obfuscation tokens first and treating any remaining "[" or "]" as
   a URI IP-literal delimiter.
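
   A minimal Python sketch of such a left-to-right tokenizer is shown
   below.  The token kind names ("delim", "scheme", "bracket", "text")
   and both function names are assumptions made for illustration, and
   legacy tokens are deliberately not handled here.

```python
import re
from typing import List, Tuple

# Alternatives are tried in order at each position, so obfuscation tokens
# win over a plain IP-literal bracket.
_TOKEN = re.compile(
    r"(?P<delim>\[[.@:]\])"                           # "[.]", "[@]", "[:]"
    r"|(?P<scheme>\[[A-Za-z][A-Za-z0-9+.-]*\](?=:))"  # "[scheme]" before ":"
    r"|(?P<bracket>[\[\]])"                           # IP-literal "[" or "]"
    r"|(?P<text>[^\[\]]+)"                            # any other characters
)


def tokenize(text: str) -> List[Tuple[str, str]]:
    """Split a Safe-IOC string into (kind, value) pairs, left to right."""
    return [(m.lastgroup, m.group()) for m in _TOKEN.finditer(text)]


def deobfuscate_tokens(text: str) -> str:
    """Rebuild the string, unwrapping only the obfuscation tokens."""
    out = []
    for kind, value in tokenize(text):
        if kind in ("delim", "scheme"):
            out.append(value[1:-1])  # drop the token's own brackets
        else:
            out.append(value)        # IP-literal brackets and text are kept
    return "".join(out)
```

   Applied to "[http]://[[:][:]1]", the tokenizer yields the scheme
   token, the text "://", an IP-literal "[", two "[:]" tokens, the text
   "1", and an IP-literal "]", so the adjacent brackets are never
   treated as nested.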

6.1.  Safety Check for Reversibility

   De-obfuscation MUST only be performed when the output is written to a
   non-executable buffer (e.g., a variable, string, or file) that cannot
   be automatically interpreted, executed, or rendered as a clickable
   link by the system or application.  The tool MUST NOT de-obfuscate a
   string if it is currently being rendered in a "live" environment
   (e.g., a web browser preview, an active document viewer, or any
   context where the resulting string could be automatically executed,
   resolved, or displayed as a clickable link).

   De-obfuscation SHOULD only occur in controlled contexts such as:

   *  Command-line tools with explicit user confirmation

   *  Isolated analysis environments (sandboxes)

   *  Backend processing pipelines that do not render output to users
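
   One way a tool might enforce this check is to require the caller to
   declare the output context before any de-obfuscation takes place.
   The Python sketch below is illustrative only: the OutputContext
   labels and the guarded_deobfuscate function are hypothetical, and
   the sketch relies on the caller declaring the context truthfully.

```python
from enum import Enum
from typing import Callable


class OutputContext(Enum):
    """Hypothetical labels for where de-obfuscated output will be written."""
    VARIABLE = "variable"
    FILE = "file"
    BACKEND_PIPELINE = "backend-pipeline"
    BROWSER_PREVIEW = "browser-preview"
    DOCUMENT_VIEWER = "document-viewer"


# Contexts that this sketch treats as non-executable buffers.
_NON_EXECUTABLE = {
    OutputContext.VARIABLE,
    OutputContext.FILE,
    OutputContext.BACKEND_PIPELINE,
}


def guarded_deobfuscate(
    text: str,
    context: OutputContext,
    deobfuscate: Callable[[str], str],
) -> str:
    """Run the supplied de-obfuscation function only for safe contexts."""
    if context not in _NON_EXECUTABLE:
        raise PermissionError(
            f"refusing to de-obfuscate for live context {context.value!r}"
        )
    return deobfuscate(text)
```

   Refusing by default, rather than listing the live contexts to
   reject, means that a newly added context is treated as unsafe until
   it is explicitly allowed.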

7.  Example Use Cases

   Common scenarios include:

   *  *OSINT Sharing:* A report lists obfuscated URLs (e.g.,
      "[http]://malware[.]example/payload") to prevent accidental
      clicks.

   *  *Email Communication:* Security teams share obfuscated IOCs like
      "attacker[@]example[.]example" or
      "[mailto]:attacker[@]example[.]com" in email threads.

   *  *Threat Intelligence Platforms:* Automated ingestion of obfuscated
      IPs (e.g., "192[.]0[.]2[.]1" or
      "[http]://[2001[:]db8[:][:]1]:8080/") for blocklist updates.

   Safe-IOC strings are intended for human-readable contexts (reports,
   email, tickets, chat) and for free-text fields in structured formats
   such as STIX 2.1 [STIX21]; they are not a substitute for the typed
   observable and indicator objects those formats provide.

8.  Implementation Guidance

   Software designed to parse threat intelligence feeds should
   explicitly support these obfuscation and de-obfuscation conventions.
   Implementations SHOULD verify correct behavior through unit tests and
   validation scripts using the test vectors in Section 10.

9.  Internationalized Domain Names

   Apply Step 3 (Section 4.3) to the Internationalized Domain Name as
   presented, in either A-label (ASCII-Compatible Encoding, "xn--"
   prefixed) or U-label (native Unicode) form per [RFC5890]; for
   example, "xn--n3h.example" becomes "xn--n3h[.]example".  The
   transformation operates on the displayed form of the label and does
   not perform IDNA conversion between A-label and U-label
   representations.
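
   The following Python sketch illustrates that the dot replacement
   leaves labels untouched in either form.  The function name is
   hypothetical, only the ASCII full stop is handled, and the input is
   assumed not to be obfuscated already.

```python
def obfuscate_dots(name: str) -> str:
    """Replace each ASCII "." with "[.]"; labels are copied unchanged."""
    return name.replace(".", "[.]")


a_label = "xn--n3h.example"   # A-label form
u_label = "\u2603.example"    # U-label form (U+2603) of the same name

a_result = obfuscate_dots(a_label)   # "xn--n3h[.]example"
u_result = obfuscate_dots(u_label)   # "\u2603[.]example", no IDNA conversion
```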

<mglt>

If the core algorithm can be expressed concisely (under ~100 lines), a reference 
implementation in an appendix would aid interoperability and serve as an additional 
check against ambiguity in the prose description.

</mglt>

10.  Test Vectors

   Figure 2 provides a "golden set" of inputs and expected outputs
   in JSON form, intended to be consumed directly by automated test
   harnesses.  Each entry has a "label", an "input", an "operation"
   ("obfuscate" for the forward transformation defined in Section 4,
   "deobfuscate" for the inverse transformation defined in Section 6),
   and the "expected" output.  Domain names use the "example" reserved
   space per [RFC2606]; IPv4 addresses use documentation ranges per
   [RFC5737]; IPv6 addresses use the documentation prefix per [RFC3849].

   The forward URI rows exercise Steps 1 through 3 on full URIs and on
   bare indicators, including IPv4, IPv6 in every form covered by
   Section 4.3, zone identifiers, userinfo, ports, CIDR prefix-length
   suffixes, and percent-encoded delimiter forms in the Host and
   Userinfo.  The Step 4 rows (nested-url-in-query, nested-email-in-
   query) exercise in-place obfuscation of nested indicators.  The
   idempotency rows confirm that applying the transformation to already-
   obfuscated input produces no change.  The deobfuscate rows exercise
   the inverse transformation defined in Section 6, including the legacy
   "hxxps" recognition rule.

   [
     {"label": "standard-url", "operation": "obfuscate",
      "input": "https://bad.example",
      "expected": "[https]://bad[.]example"},
     {"label": "url-with-path", "operation": "obfuscate",
      "input": "https://evil.example/path",
      "expected": "[https]://evil[.]example/path"},
     {"label": "deep-link-url", "operation": "obfuscate",
      "input": "https://bad.example/path/to/page?q=1#frag",
      "expected": "[https]://bad[.]example/path/to/page?q=1#frag"},
     {"label": "http-url", "operation": "obfuscate",
      "input": "http://attacker.example",
      "expected": "[http]://attacker[.]example"},
     {"label": "ftp-url", "operation": "obfuscate",
      "input": "ftp://files.example/",
      "expected": "[ftp]://files[.]example/"},
     {"label": "mailto", "operation": "obfuscate",
      "input": "mailto:user@example.com",
      "expected": "[mailto]:user[@]example[.]com"},
     {"label": "ipv4-address", "operation": "obfuscate",
      "input": "198.51.100.1",
      "expected": "198[.]51[.]100[.]1"},
     {"label": "ipv4-in-url", "operation": "obfuscate",
      "input": "http://192.0.2.1",
      "expected": "[http]://192[.]0[.]2[.]1"},
     {"label": "ipv6-in-url", "operation": "obfuscate",
      "input": "http://[2001:db8::1]:8080",
      "expected": "[http]://[2001[:]db8[:][:]1]:8080"},
     {"label": "ipv6-full-form", "operation": "obfuscate",
      "input": "http://[2001:db8:0:0:0:0:0:1]/",
      "expected": "[http]://[2001[:]db8[:]0[:]0[:]0[:]0[:]0[:]1]/"},
     {"label": "ipv4-mapped-ipv6", "operation": "obfuscate",
      "input": "http://[::ffff:192.0.2.1]",
      "expected": "[http]://[[:][:]ffff[:]192[.]0[.]2[.]1]"},
     {"label": "ipv6-with-zone-id", "operation": "obfuscate",
      "input": "http://[2001:db8::1%25eth0]/",
      "expected": "[http]://[2001[:]db8[:][:]1%25eth0]/"},
     {"label": "bare-ipv6", "operation": "obfuscate",
      "input": "2001:db8::1",
      "expected": "2001[:]db8[:][:]1"},
     {"label": "bracketed-bare-ipv6", "operation": "obfuscate",
      "input": "[2001:db8::1]",
      "expected": "[2001[:]db8[:][:]1]"},
     {"label": "bare-ipv6-with-zone", "operation": "obfuscate",
      "input": "2001:db8::1%eth0",
      "expected": "2001[:]db8[:][:]1%eth0"},
     {"label": "email-address", "operation": "obfuscate",
      "input": "phish@target.example",
      "expected": "phish[@]target[.]example"},
     {"label": "punycode-domain", "operation": "obfuscate",
      "input": "xn--n3h.example",
      "expected": "xn--n3h[.]example"},
     {"label": "url-with-userinfo", "operation": "obfuscate",
      "input": "http://user:pass@attacker.example",
      "expected": "[http]://user:pass[@]attacker[.]example"},
     {"label": "bare-domain-with-port", "operation": "obfuscate",
      "input": "evil.example:443",
      "expected": "evil[.]example:443"},
     {"label": "scheme-case-preserved", "operation": "obfuscate",
      "input": "HTTPS://bad.example",
      "expected": "[HTTPS]://bad[.]example"},
     {"label": "percent-encoded-dot-in-host", "operation": "obfuscate",
      "input": "http://evil%2eexample",
      "expected": "[http]://evil[.]example"},
     {"label": "percent-encoded-dot-upper", "operation": "obfuscate",
      "input": "http://evil%2Eexample",
      "expected": "[http]://evil[.]example"},
     {"label": "percent-encoded-at-userinfo", "operation": "obfuscate",
      "input": "http://user%40attacker.example",
      "expected": "[http]://user[@]attacker[.]example"},
     {"label": "percent-encoded-non-delim", "operation": "obfuscate",
      "input": "http://%65vil.example",
      "expected": "[http]://%65vil[.]example"},
     {"label": "ipv4-cidr", "operation": "obfuscate",
      "input": "192.0.2.0/24",
      "expected": "192[.]0[.]2[.]0/24"},
     {"label": "ipv6-cidr", "operation": "obfuscate",
      "input": "2001:db8::/32",
      "expected": "2001[:]db8[:][:]/32"},
     {"label": "nested-url-in-query", "operation": "obfuscate",
      "input":
        "http://example.com/r?url=http://evil.example",
      "expected":
        "[http]://example[.]com/r?url=[http]://evil[.]example"},
     {"label": "nested-email-in-query", "operation": "obfuscate",
      "input":
        "http://example.com/?contact=abuse@evil.example",
      "expected":
        "[http]://example[.]com/?contact=abuse[@]evil[.]example"},
     {"label": "idempotency-check", "operation": "obfuscate",
      "input": "[https]://bad[.]example",
      "expected": "[https]://bad[.]example"},
     {"label": "ipv6-idempotency", "operation": "obfuscate",
      "input": "[http]://[2001[:]db8[:][:]1]:8080",
      "expected": "[http]://[2001[:]db8[:][:]1]:8080"},
     {"label": "legacy-deobfuscation", "operation": "deobfuscate",
      "input": "hxxps://bad[.]example",
      "expected": "https://bad.example"},
     {"label": "legacy-deobfuscation-bare", "operation": "deobfuscate",
      "input": "hxxps://bad.example",
      "expected": "https://bad.example"},
     {"label": "case-preserved-roundtrip", "operation": "deobfuscate",
      "input": "[HTTPS]://bad[.]example",
      "expected": "HTTPS://bad.example"},
     {"label": "ipv6-uri-deobfuscation", "operation": "deobfuscate",
      "input": "[http]://[2001[:]db8[:][:]1]:8080",
      "expected": "http://[2001:db8::1]:8080"},
     {"label": "mailto-deobfuscation", "operation": "deobfuscate",
      "input": "[mailto]:user[@]example[.]com",
      "expected": "mailto:user@example.com"}
   ]

                                  Figure 2
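
   A test harness might consume these vectors along the following
   lines.  This Python sketch is an assumption-laden illustration: the
   run_vectors function and its "operations" mapping are hypothetical,
   only the five "deobfuscate" rows of Figure 2 are embedded, and the
   deobfuscate function is a stand-in that is sufficient for those rows
   rather than a complete implementation of Section 6.

```python
import json
import re
from typing import Callable, Dict, List

# The five "deobfuscate" rows copied from Figure 2.
VECTORS = r"""
[
  {"label": "legacy-deobfuscation", "operation": "deobfuscate",
   "input": "hxxps://bad[.]example",
   "expected": "https://bad.example"},
  {"label": "legacy-deobfuscation-bare", "operation": "deobfuscate",
   "input": "hxxps://bad.example",
   "expected": "https://bad.example"},
  {"label": "case-preserved-roundtrip", "operation": "deobfuscate",
   "input": "[HTTPS]://bad[.]example",
   "expected": "HTTPS://bad.example"},
  {"label": "ipv6-uri-deobfuscation", "operation": "deobfuscate",
   "input": "[http]://[2001[:]db8[:][:]1]:8080",
   "expected": "http://[2001:db8::1]:8080"},
  {"label": "mailto-deobfuscation", "operation": "deobfuscate",
   "input": "[mailto]:user[@]example[.]com",
   "expected": "mailto:user@example.com"}
]
"""


def run_vectors(vectors_json: str,
                operations: Dict[str, Callable[[str], str]]) -> List[str]:
    """Return the labels of rows whose output differs from "expected"."""
    failures = []
    for row in json.loads(vectors_json):
        func = operations.get(row["operation"])
        if func is None:
            continue  # operation not supplied by the caller; skip the row
        if func(row["input"]) != row["expected"]:
            failures.append(row["label"])
    return failures


def deobfuscate(text: str) -> str:
    """Stand-in inverse transformation, sufficient for the rows above."""
    text = re.sub(r"\[([A-Za-z][A-Za-z0-9+.-]*)\](?=:)", r"\1", text)
    text = re.sub(r"\bhxxps(?=:)", "https", text)
    text = re.sub(r"\bhxxp(?=:)", "http", text)
    return re.sub(r"\[([.@:])\]", r"\1", text)


failures = run_vectors(VECTORS, {"deobfuscate": deobfuscate})
```

   An empty "failures" list indicates that every supplied row matched.
   Passing an "obfuscate" callable in the same mapping would let the
   harness cover the forward rows as well.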

11.  Security Considerations

   While these obfuscation techniques reduce the risk of accidental
   activation, systems processing obfuscated indicators SHOULD apply the
   safeguards described in this section.

11.1.  Relationship to the URI Scheme Registry

   The canonical "[scheme]" form does not occupy the URI scheme
   namespace (see Section 4).  Software that encounters bracketed
   schemes MUST NOT attempt to resolve or dereference them.

   The legacy tokens "hxxp" and "hxxps" do occupy the syntactic position
   of a URI scheme.  Provisional entries for both names appear in the
   "Uniform Resource Identifier (URI) Schemes" registry per [RFC7595].
   Implementations should consider surrounding context when deciding
   whether such a token is an obfuscated "http" or "https" scheme or a
   literal use of the registered scheme name.

11.2.  Partial Obfuscation

   When a scheme is present, a compliant tool MUST obfuscate both the
   scheme (Step 1) and the host-level delimiters (Steps 2-3).  Partial
   obfuscation (for example, replacing only "." with "[.]" while leaving
   "https" unchanged) creates a false sense of security: the scheme
   remains active and could still trigger automatic linkification or
   execution.  Implementations MUST NOT produce partially obfuscated
   output when full obfuscation is intended.  For bare indicators
   without a scheme (e.g., a standalone IP address or email address),
   host-level and userinfo obfuscation (Steps 2-3) alone constitutes
   valid Safe-IOC output.

11.3.  Parser Confusion

   Implementations that parse Safe-IOC strings may become confused by
   malformed or inconsistently obfuscated input.  For example,
   "[https]://example.example" (scheme obfuscated but dots not) or
   "https://example[.]example" (dots obfuscated but scheme not) are not
   canonical Safe-IOC output and SHOULD be treated as partially
   obfuscated: such strings retain activation risk and implementations
   SHOULD either complete the obfuscation before display or flag the
   input as non-conforming.
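
   A tool could flag such input with a check along the lines of the
   Python sketch below.  The classify function and its return labels
   are hypothetical.  The sketch handles only strings containing "://",
   treats the legacy tokens as an obfuscated scheme, and inspects only
   "." and "@" in the authority, so unobfuscated IPv6 colons are not
   detected.

```python
import re

_DELIM_TOKEN = re.compile(r"\[[.@:]\]")
_LEGACY_SCHEMES = ("hxxp", "hxxps")


def classify(text: str) -> str:
    """Return "obfuscated", "plain", "partial", or "unknown"."""
    scheme, sep, rest = text.partition("://")
    if not sep:
        return "unknown"  # bare indicators and "mailto:" are out of scope
    # The authority ends at the first "/", "?", or "#".
    authority = re.split(r"[/?#]", rest, maxsplit=1)[0]
    scheme_obfuscated = (
        (scheme.startswith("[") and scheme.endswith("]"))
        or scheme in _LEGACY_SCHEMES
    )
    has_tokens = _DELIM_TOKEN.search(authority) is not None
    # Any "." or "@" left after removing the tokens is still active.
    stripped = _DELIM_TOKEN.sub("", authority)
    has_bare = "." in stripped or "@" in stripped
    if scheme_obfuscated and not has_bare:
        return "obfuscated"
    if not scheme_obfuscated and not has_tokens:
        return "plain"
    return "partial"
```

   With this sketch, both "[https]://example.example" and
   "https://example[.]example" are reported as "partial", so a caller
   can either complete the obfuscation or reject the input.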

11.4.  De-obfuscation in Non-Executable Contexts

   The non-executable-buffer requirement on de-obfuscation is stated
   normatively in Section 6.1.  A non-executable buffer is one that
   cannot be automatically interpreted by the system (for example, as a
   URI to fetch, a command to run, or a link to display).  Writing de-
   obfuscated output into a live document, rich-text editor, or browser
   address bar before explicit user action creates an unacceptable risk
   of accidental activation.

11.5.  Additional Considerations

   *  Implementations that do not follow the canonical transformation
      rule (e.g., by not treating "[.]", "[@]", and "[:]" as opaque) may
      produce nested or non-reversible output when obfuscation is
      applied repeatedly.

   *  Obfuscated URLs embedded in PDFs may still be rendered as
      hyperlinks by the viewer; authors should use plain-text
      formatting to avoid this.

   *  Systems processing obfuscated indicators MUST treat them as
      untrusted data, applying sandboxing or isolated environments for
      analysis.

   *  Credentials (e.g., _username:password_) SHOULD NOT be shared, even
      in obfuscated form, due to inherent security risks.

   *  Analysts accessing de-obfuscated indicators for investigation
      SHOULD do so within protected environments such as containers or
      dedicated analysis VMs to limit exposure to exploits.

   *  If a de-obfuscated indicator must be forwarded to a system that is
      not prepared to handle unsafe input, it SHOULD be re-obfuscated
      before being passed on.
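
   The first consideration above, treating existing tokens as opaque,
   can be illustrated with the following Python sketch.  Both function
   names are hypothetical, and the sketch is limited to the "." and "@"
   delimiters of a bare domain name or email address.

```python
import re

# An existing token is matched first so that it is consumed whole;
# only then is a bare delimiter considered.
_TOKEN_OR_DELIM = re.compile(r"\[[.@:]\]|[.@]")


def obfuscate_opaque(text: str) -> str:
    """Obfuscate "." and "@" while leaving existing tokens untouched."""
    def replace(match: "re.Match[str]") -> str:
        value = match.group()
        # A three-character match is an existing token; keep it as is.
        return value if len(value) == 3 else "[" + value + "]"
    return _TOKEN_OR_DELIM.sub(replace, text)


def obfuscate_naive(text: str) -> str:
    """Counter-example: blind replacement that is not idempotent."""
    return text.replace(".", "[.]").replace("@", "[@]")


once = obfuscate_opaque("phish@target.example")
twice = obfuscate_opaque(once)              # identical to "once"
nested = obfuscate_naive(obfuscate_naive("phish@target.example"))
# nested is "phish[[@]]target[[.]]example", which no longer reverses cleanly.
```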

<mglt>

The Security Considerations section would benefit from a discussion of potential 
collisions: can a legitimate, non-obfuscated string accidentally match the Safe-IOC 
pattern (e.g., could "[.]" or "[:]" appear naturally in a non-IOC context and be 
mistakenly interpreted as obfuscation)? Conversely, could a Safe-IOC string be confused 
with a valid but malformed URI? A brief analysis of why such collisions are unlikely — 
or guidance on how to handle them if they arise — would strengthen this section.

</mglt>

12.  IANA Considerations

   This document has no IANA actions.

   [RFC Editor: Please remove this section before publication.]

13.  References

13.1.  Normative References

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, November 1987,
              <https://www.rfc-editor.org/rfc/rfc1035>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997,
              <https://www.rfc-editor.org/rfc/rfc2119>.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66,
              RFC 3986, January 2005,
              <https://www.rfc-editor.org/rfc/rfc3986>.

   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing
              Architecture", RFC 4291, February 2006,
              <https://www.rfc-editor.org/rfc/rfc4291>.

   [RFC5234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008,
              <https://www.rfc-editor.org/rfc/rfc5234>.

   [RFC5322]  Resnick, P., Ed., "Internet Message Format", RFC 5322,
              October 2008, <https://www.rfc-editor.org/rfc/rfc5322>.

   [RFC6068]  Duerst, M., Masinter, L., and J. Zawinski, "The 'mailto'
              URI Scheme", RFC 6068, October 2010,
              <https://www.rfc-editor.org/rfc/rfc6068>.

   [RFC7595]  Thaler, D., Ed., Hansen, T., and T. Hardie, "Guidelines
              and Registration Procedures for URI Schemes", BCP 35,
              RFC 7595, June 2015,
              <https://www.rfc-editor.org/rfc/rfc7595>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, May 2017,
              <https://www.rfc-editor.org/rfc/rfc8174>.

13.2.  Informative References

   [RFC4007]  Deering, S., Haberman, B., Jinmei, T., Nordmark, E., and
              B. Zill, "IPv6 Scoped Address Architecture", RFC 4007,
              March 2005, <https://www.rfc-editor.org/rfc/rfc4007>.

   [RFC6874]  Carpenter, B., Cheshire, S., and R. Hinden, "Representing
              IPv6 Zone Identifiers in Address Literals and Uniform
              Resource Identifiers", RFC 6874, February 2013,
              <https://www.rfc-editor.org/rfc/rfc6874>.

   [RFC9844]  Carpenter, B. and R. Hinden, "Entering IPv6 Zone
              Identifiers in User Interfaces", RFC 9844, August 2025,
              <https://www.rfc-editor.org/rfc/rfc9844>.

   [RFC2606]  Eastlake, D. and A. Panitz, "Reserved Top Level DNS
              Names", BCP 32, RFC 2606, June 1999,
              <https://www.rfc-editor.org/rfc/rfc2606>.

   [RFC3849]  Huston, G., Lord, A., and P. Smith, "IPv6 Address Prefix
              Reserved for Documentation", RFC 3849, July 2004,
              <https://www.rfc-editor.org/rfc/rfc3849>.

   [RFC5737]  Arkko, J., Cotton, M., and L. Vegoda, "IPv4 Address Blocks
              Reserved for Documentation", RFC 5737, January 2010,
              <https://www.rfc-editor.org/rfc/rfc5737>.

   [RFC5890]  Klensin, J., "Internationalized Domain Names for
              Applications (IDNA): Definitions and Document Framework",
              RFC 5890, August 2010,
              <https://www.rfc-editor.org/rfc/rfc5890>.

   [STIX21]   Jordan, B., Ed., Piazza, R., Ed., and T. Darley, Ed.,
              "STIX Version 2.1", OASIS Standard stix-v2.1-os, 10 June
              2021, <https://docs.oasis-open.org/cti/stix/v2.1/os/stix-
              v2.1-os.html>.

   [TAXII21]  Jordan, B., Ed. and D. Varner, Ed., "TAXII Version 2.1",
              OASIS Standard taxii-v2.1-os, 10 June 2021,
              <https://docs.oasis-open.org/cti/taxii/v2.1/os/taxii-
              v2.1-os.html>.

   [MISP]     MISP Project, "MISP - Malware Information Sharing Platform
              and Threat Sharing", <https://www.misp-project.org/>.

Acknowledgements

   The author thanks Tim Bray, Frank Denis, Martin J. Dürst, Ted
   Hardie, Graham Klyne, Eliot Lear, Barry Leiba, Kathleen Moriarty, and
   Martin Thomson for their review and feedback during the development
   of this document.  The author also thanks the participants of the
   IETF SecDispatch and practical-cybersecurity discussions for early
   input on this work.

Author's Address

   Stefan Grimminck (editor)
   Email: ietf@stefangrimminck.nl