Document: draft-ietf-oauth-selective-disclosure-jwt-18
Title: Selective Disclosure for JWTs (SD-JWT)
Reviewer: Henry S. Thompson
Review Date: 2025-05-02

*Summary*

The substance of this is, as far as I can tell as a non-specialist,
in good shape.  There are a few nits and editorial points at the end
of this review, but as will be evident from its length, there is one
essentially presentational issue, classified as Minor because a
specialist in this area will shrug and say "yes, but I see what
they're getting at".  I hope the authors will nonetheless find it
useful and address the points I raise, because I do think that as
things stand there's a genuine risk of misunderstanding exactly
what's required of an implementation.

*Minor points*

 4.2.1
     This bullet

      "JSON-encode the array, producing [a] UTF-8 string"

     looks simple, but ended up taking me several days to sort out.
     
     For the rest of things to work, you must mean "Serialize the array
     to the corresponding UTF-8 byte sequence", but that's not
     exactly trivial in the JSON-native context you've adopted in this
     document.

     In the end I think you should include one extra step in the
     Disclosure construction example, namely what that byte
     sequence looks like as (what [RFC8259] calls) "UTF-8 encoded JSON
     text", immediately after the array creation display:

   ["_26bc4LT-ac6q2KI6cBW5es", "family_name", "Möbius"]         [1]

   ["_26bc4LT-ac6q2KI6cBW5es", "family_name", "M%xC3%xB6bius"]  [2]

     It would also be good at this point to clarify notation and
     terminology, following [RFC8259].  That is, to emphasise the
     distinction that [1] is a "JSON text" per the RFC, whose final
     value is a six-character Unicode string, while [2] is a UTF-8
     byte sequence, the result of what you call "JSON-encoding".

     It's true that they are both valid JSON texts, per RFC8259, but
     you have to apply a JSON parser to them to get to
     indistinguishable JSON objects.
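
     The distinction between [1] and [2] can be sketched in Python.
     This is only an illustration, assuming Python's standard json
     module with its default separators; the variable names are
     hypothetical and do not come from the draft.

```python
import json

# In-memory value: neither JSON text nor bytes yet.
array = ["_26bc4LT-ac6q2KI6cBW5es", "family_name", "Möbius"]

# [1] JSON text: a sequence of Unicode characters.
# ensure_ascii=False keeps the o-umlaut as one character
# instead of emitting the \u00f6 escape.
json_text = json.dumps(array, ensure_ascii=False)

# [2] UTF-8 encoded JSON text: a sequence of bytes.
json_bytes = json_text.encode("utf-8")

# The o-umlaut is one character in [1] but two bytes in [2].
print(len(json_text), len(json_bytes))
print(json_bytes)

# Only after decoding and parsing do we get back an equal value.
round_tripped = json.loads(json_bytes.decode("utf-8"))
print(round_tripped == array)
```

     The point of the sketch is that json_text and json_bytes are
     different kinds of thing, even though both print as something
     that looks like the array.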

     To be more specific, since you use "JSON-encode" a number of
     times in later sections, I would _strongly recommend_ that you

      a) Add the following to section 1.2, immediately after the
         definition of *base64url*:

         *JSON-encode* denotes the conversion of a JSON object to
         "JSON text" and encoding that text in UTF-8, as defined in
         [RFC8259].  That is, mapping a JSON object to a UTF-8 byte
         sequence which when decoded and parsed will reconstruct an
         object indistinguishable from the original.

      b) Replace the first two bullets in the algorithm description with

         * JSON-encode the array, producing a UTF-8 byte sequence.

         * base64url-encode the resulting byte sequence. The resulting
           string is the Disclosure.

      c) Be careful never to use "string" when "(UTF-8) byte sequence"
         is meant, starting in 4.2.2 with

           The Disclosure string is created by JSON-encoding this array
           and base64url-encoding the resulting byte sequence as
           described in Section 4.2.1

      d) In the second media type registration in 12.2
           "represented as a JSON Object" ->
           'represented as UTF-8 encoded "JSON text" as defined in [RFC8259]'

      e) Include [RFC8259] in 13.1
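
     The two bullets proposed in (b) can be sketched as follows.
     This is a minimal sketch, not the draft's reference code: the
     function names are hypothetical, and it assumes that
     "base64url" means the URL-safe alphabet with trailing "="
     padding removed, and that the JSON text keeps non-ASCII
     characters unescaped.

```python
import base64
import json


def json_encode(value) -> bytes:
    """Serialize to JSON text, then encode that text as UTF-8 bytes."""
    json_text = json.dumps(value, ensure_ascii=False)
    return json_text.encode("utf-8")


def base64url_encode(data: bytes) -> str:
    """base64url-encode a byte sequence, dropping '=' padding (assumed)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_disclosure(salt: str, claim_name: str, claim_value) -> str:
    """Hypothetical helper: byte sequence in the middle, string at the end."""
    return base64url_encode(json_encode([salt, claim_name, claim_value]))


disclosure = make_disclosure(
    "_26bc4LT-ac6q2KI6cBW5es", "family_name", "Möbius"
)
print(disclosure)
```

     Note that json_encode returns bytes and base64url_encode
     accepts only bytes, which mirrors the "string" versus "byte
     sequence" care asked for in (c).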

 Appendices A and B.

     The above problem resurfaces here, with confusion between three
     possible interpretations, in the terms of [RFC8259], of what is
     displayed at various points:

        * a JSON object, that is, structured data composed of
          instances of the six types (four primitive, two
          structured) which JSON can represent.  It is _not_ to be
          understood as a string, byte sequence or file contents;

        * a possible JSON text for some JSON object;

        * a UTF-8 encoding of some JSON text, aka a "JSON encoding".

     I'll use the first example figure in Appendix B to go through
     this in detail, expanding on the discussion above about 4.2.1.

     The first figure is labelled as a JSON object, which is OK.

     But it is indistinguishable from one of the possible JSON texts
     corresponding to that object, and that should be explicitly
     acknowledged.

     The next figure purports to present two alternative "JSON
     encodings", the second of which is problematic.

     Its first line appears indistinguishable from that shown for the
     JSON object in the preceding figure, but is in fact different.

     In the first figure, construed as "JSON text", the o-umlaut glyph
     denotes a single Unicode character in a six-character
     representation of a six-character object member string value.

     However in the second figure, second alternative, the o-umlaut
     corresponds to a _two_-byte UTF-8 sub-part of the JSON encoding of
     that value as a seven-byte UTF-8 byte sequence, either in some
     internal representation or an external stream or file.

     What to do?  First, add something similar to the notation
     section of draft-bray-unichars-14:

     https://www.ietf.org/archive/id/draft-bray-unichars-14.html#name-notation

     Then, whenever presenting JSON, always indicate whether what is
     being shown is JSON text or JSON-encoded text (that is, UTF-8
     byte sequences).  In JSON text, always include a version using the
     U+xxxx notation whenever the underlying string contains non-ASCII
     characters.  In JSON-encoded text, _always_ use the %xnn notation
     for non-ASCII characters.

     Some examples of a possible way of indicating JSON text (*JT*)
     and JSON-encoded text (*JUBS*), from section 4.2.1

     Replace the first figure with these two:

     _________________________________________________________
     |*JT*                                                    |
     |                                                        |
     |  ["_26bc4LT-ac6q2KI6cBW5es", "family_name", "Möbius"]  |
     |                                               ^        |
     |                                               |        |
     |                                             U+00F6     |
     |                                                        |
     |________________________________________________________|

     _______________________________________________________________
     |*JUBS*                                                         |
     |                                                               |
     |  ["_26bc4LT-ac6q2KI6cBW5es", "family_name", "M%xC3%xB6bius"]  |
     |                                                               |
     |_______________________________________________________________|

  and the first bullet of the three alternatives which follow with

     * A different way to encode the Unicode o-umlaut:

     ______________________________________________________________
     |*JT*                                                         |
     |                                                             |
     |  ["_26bc4LT-ac6q2KI6cBW5es", "family_name", "M\u00f6bius"]  |
     |                                                             |
     |_____________________________________________________________|

     ______________________________________________________________
     |*JUBS*                                                       |
     |                                                             |
     |  ["_26bc4LT-ac6q2KI6cBW5es", "family_name", "M\u00f6bius"]  |
     |                                                             |
     |_____________________________________________________________|

     The corresponding Disclosure is then

       WyJfMjZiYzRMVC1hYzZxMktJNmNCVzVlcyIsICJmYW1pbHlfbmFtZSIsICJNX
      HUwMGY2Yml1cyJd
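
     The relationship between this base64url string and the escaped
     form of the array can be checked by running the steps in
     reverse.  This is a minimal sketch with a hypothetical helper
     name, assuming unpadded base64url.

```python
import base64
import json

# The base64url string shown above, joined across its line break.
disclosure = (
    "WyJfMjZiYzRMVC1hYzZxMktJNmNCVzVlcyIsICJmYW1pbHlfbmFtZSIsICJNX"
    "HUwMGY2Yml1cyJd"
)


def base64url_decode(text: str) -> bytes:
    """Hypothetical helper: restore any omitted '=' padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


json_bytes = base64url_decode(disclosure)  # UTF-8 byte sequence (*JUBS*)
json_text = json_bytes.decode("utf-8")     # JSON text (*JT*)
array = json.loads(json_text)              # parsed array

print(json_bytes)  # pure ASCII: the o-umlaut appears as a \u escape
print(array[2])    # the parser turns the escape back into one character
```

     The decoded bytes contain the six ASCII bytes of the escape
     sequence, yet the parsed value is the same six-character string
     as in the unescaped variant.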

  And throughout the examples in Appendices A and B, label the initial
  figure with *JT* and the 'Content' boxes with *JUBS*.  You don't
  need to gloss every Chinese/German string with its U+xxxx version,
  but it would help to say at the top of A that, where non-ASCII
  characters appear in any of the initial examples, the actual
  Unicode character is what is meant.

  The Appendix B example then looks like this, along with some small
  changes to the text:

     Usually, JSON-based formats transport claim values as simple
     properties of a JSON object such as this:

     _________________________________________
     |*JT*                                    |
     |                                        |
     |  ...                                   |
     |    "family_name": "Möbius",            | ö is the single character
     |    "address": {                        |   LATIN SMALL LETTER O
     |      "street_address": "Schulstr. 12", |   WITH DIAERESIS
     |      "locality": "Schulpforta"         |
     |     }                                  |
     |  ...                                   |
     |________________________________________|


     [In first para, change "byte string" to "byte sequence"
      twice, and three more times further down]

     JSON, however, does not prescribe a unique representation for
     data, allowing for variations in how it is presented. The JSON
     text above is only one possibility.  Other possible
     representations include
     ________________________________________
     |*JT* and *JUBS*                        |
     |                                       |
     |  ...                                  |
     |   "family_name": "M\u00f6bius",       |
     |   "address": {                        |
     |     "street_address": "Schulstr. 12", |
     |     "locality": "Schulpforta"         |
     |   }                                   |
     |  ...                                  |
     |_______________________________________|

     and

     __________________________________________________________________________
     |*JT*                                                                     |
     |                                                                         |
     | ...                                                                     |
     |  "family_name": "Möbius",                                               |
     |  "address": {"locality":"Schulpforta", "street_address":"Schulstr. 12"} |
     | ...                                                                     |
     |_________________________________________________________________________|

         ö is the single character LATIN SMALL LETTER O WITH DIAERESIS

     __________________________________________________________________________
     |*JUBS*                                                                   |
     |                                                                         |
     | ...                                                                     |
     |  "family_name": "M%xC3%xB6bius",                                        |
     |  "address": {"locality":"Schulpforta", "street_address":"Schulstr. 12"} |
     | ...                                                                     |
     |_________________________________________________________________________|


     The two representations of the value in family_name are very
     different on the byte-level, but when decoded from UTF-8 byte
     sequences to JSON texts, those texts would be parsed into
     indistinguishable JSON objects.  The same goes for ...

     The variations in white space, ordering of object properties, and
     representation of Unicode characters are all explicitly allowed
     in [RFC8259].  There are further variations, e.g. for floating
     point values ([RFC8785]) and Unicode combining characters
     ([UNICODE]).
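
     The claim that these variations differ at the byte level yet
     parse to indistinguishable objects can be sketched in Python.
     The three byte sequences below are hypothetical completions of
     the fragments shown in the figures (the "..." parts are left
     out), and the variable names are my own.

```python
import json

# Unescaped o-umlaut, spaces after separators.
variant_a = (
    '{"family_name": "Möbius", "address": '
    '{"street_address": "Schulstr. 12", "locality": "Schulpforta"}}'
).encode("utf-8")

# Same layout, but the o-umlaut written as a \u escape (pure ASCII).
variant_b = (
    b'{"family_name": "M\\u00f6bius", "address": '
    b'{"street_address": "Schulstr. 12", "locality": "Schulpforta"}}'
)

# Different white space and a different order of object members.
variant_c = (
    '{"family_name":"Möbius",'
    '"address":{"locality":"Schulpforta", "street_address":"Schulstr. 12"}}'
).encode("utf-8")

variants = (variant_a, variant_b, variant_c)

# Decode each UTF-8 byte sequence to JSON text, then parse it.
parsed = [json.loads(v.decode("utf-8")) for v in variants]

print(len(set(variants)))                 # number of distinct byte sequences
print(parsed[0] == parsed[1] == parsed[2])  # whether the parsed objects agree
```

     Any step that hashes or signs must therefore say which of the
     byte sequences it operates on, since the parsed objects alone
     cannot tell them apart.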

*Nits*

 4. "(for those who celebrate)" will be anywhere from obscure to
     confusing for many readers from many cultures -- best to remove it.

 4.2.1. "an UTF-8" -> "a UTF-8" [overtaken above]

        "However, the digest is calculated over the respective
         base64url-encoded value itself, which effectively signs"

         ->

        "Because the digest is calculated over the respective
         base64url-encoded value itself, this effectively signs"
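
        A short sketch of why hashing the base64url-encoded value
        pins down the exact encoding.  It assumes SHA-256 and
        unpadded base64url for the digest, which is my reading of
        the draft rather than something stated in this review; the
        function name is hypothetical.

```python
import base64
import hashlib


def disclosure_digest(disclosure: str) -> str:
    """Hash the ASCII bytes of the Disclosure string itself (assumed SHA-256)."""
    digest = hashlib.sha256(disclosure.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


# Two Disclosures for the same array: unescaped and \u-escaped o-umlaut.
unescaped = (
    "WyJfMjZiYzRMVC1hYzZxMktJNmNCVzVlcyIsICJmYW1pbHlfbmFtZSIsICJNw7ZiaXVzIl0"
)
escaped = (
    "WyJfMjZiYzRMVC1hYzZxMktJNmNCVzVlcyIsICJmYW1pbHlfbmFtZSIsICJNXHUwMGY2Yml1cyJd"
)

digest_unescaped = disclosure_digest(unescaped)
digest_escaped = disclosure_digest(escaped)

print(digest_unescaped)
print(digest_escaped)
print(digest_unescaped == digest_escaped)
```

        Because the input to the hash is the Disclosure string, two
        encodings of the same array produce unrelated digests, so a
        verifier must hash what it received rather than re-encode.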

 4.3.1.  I'd recommend
        "The bytes of the digest MUST" -> "The bytes of the sd_hash value MUST"

 6. I have decoded a few of the Disclosures and they're fine, but you
    might want to ask a friendly 3rd party to double-check all the
    Disclosures and digests, at least here and in Appendix A.

 9. "Security considerations in this section help achieve the
     following properties:"

    This confused me for a while.  I think what you mean to say here
    is something like

      This spec aims to provide two security guarantees:

      *Selective Disclosure*: ...

      *Integrity*: ...

    The following sub-sections show how the various aspects of the
    design presented here combine to achieve this.