I have reviewed this document as part of the security directorate's
ongoing effort to review all IETF documents being processed by the
IESG. These comments were written primarily for the benefit of the
security area directors. Document editors and WG chairs should treat
these comments just like any other last call comments.

Summary: Ready with Issues

This draft makes me uneasy. It operates in a nebulous security zone:
it attempts to provide cryptographic privacy through a trusted relay,
but only approximately. Interactive chunking permits timing
side-channels that let the gateway fingerprint clients, fundamentally
degrading the unlinkability OHTTP was designed to provide.

CRITICAL ISSUES

1. Nonce Counter Wraparound (Section 6.2)

The XOR construction chunk_nonce = aead_nonce XOR encode(Nn, counter)
for deriving per-chunk nonces has a critical failure mode when the
counter wraps at 256^Nn. I'd add explicit text in Section 6.2 (a
short sketch of the check appears after the issue list below):

   "Implementations MUST maintain a chunk counter and verify it never
   reaches 256^Nn. If the counter would overflow, implementations
   MUST terminate the response and return an error to prevent nonce
   reuse."

2. Interactive Timing Side-Channels (Section 7.2)

   Client <-> Relay <-> Gateway <-> Target
          (encrypted)          (decrypted)

The Gateway never sees the Client's IP; that is the whole point of
OHTTP. It only sees the Relay. But there is still an interactivity
timing channel, because the Gateway can measure round trips.

What actually leaks when the Gateway measures timing:

   t0: Gateway sends response chunk #1   -> Relay -> Client
   t1: Gateway receives request chunk #2 <- Relay <- Client

   Measured time: (t1 - t0)

This measurement includes:

   - RTT(Gateway <-> Relay): the Gateway already knows this
   - RTT(Relay <-> Client): this is what leaks
   - Client processing time

The attack: the Gateway learns the Client-to-Relay RTT, which
enables:

   - Fingerprinting: the same RTT pattern across requests -> the same
     client -> breaks unlinkability
   - Geographic inference: if the Relay's location is known, the RTT
     constrains the Client's geography
   - Anonymity set partitioning: divide users into timing buckets

So interactive exchanges leak RTT to the Gateway, violating the core
OHTTP unlinkability goal. I'd suggest mitigation guidance be
provided:

   "Interactive mode fundamentally degrades unlinkability. Gateway
   timing measurements can fingerprint clients through Relay-Client
   RTT patterns. Clients requiring strong privacy guarantees SHOULD
   NOT use interactive chunked OHTTP. Applications that require
   interactivity (e.g., 100-continue) MUST document the reduced
   privacy properties compared to base OHTTP."

3. Message Size Limits Not Normative (Section 7.3)

An example calculation is provided, but the limit is never stated
normatively. Suggested text:

   "Implementations MUST enforce maximum message sizes appropriate to
   their cipher suite to maintain security margins. For
   AEAD_AES_GCM_128, individual requests and responses MUST NOT
   exceed 2^30 bytes (approximately 2^16 chunks at 2^14 bytes per
   chunk)."

MUST FIX

4. Incremental Header (Section 3)

Change "SHOULD include" to "MUST include" for the Incremental header
field. Without it, intermediaries might buffer the entire message,
defeating the purpose of chunking.

5. HPKE Sequence Limit (Section 6.1)

Specify an explicit chunk limit: "Maximum request chunks are limited
to 2^96 for typical HPKE cipher suites (Nn = 12)."

SHOULD FIX

6. Replay Attack Expansion (Section 7.1)

Add discussion of how chunking expands the replay attack surface
(partial replay, cross-session attacks); see Appendix A and the
scenarios below.
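To make issue 1 concrete, here is a minimal sketch of the per-chunk
nonce derivation with the overflow check I am asking for. The helper
name and the Nn = 12 value are my own illustration (matching a
12-byte AEAD nonce), not text or code from the draft.

    # Minimal sketch of the Section 6.2 nonce derivation with an explicit
    # wraparound check. Names and the Nn value are illustrative only.
    NN = 12                    # AEAD nonce length in bytes (Nn)
    MAX_COUNTER = 256 ** NN    # the counter must never reach this value

    def chunk_nonce(aead_nonce: bytes, counter: int) -> bytes:
        """chunk_nonce = aead_nonce XOR encode(Nn, counter), refusing to wrap."""
        if counter >= MAX_COUNTER:
            # Wrapping would repeat an earlier nonce, which is catastrophic
            # for AES-GCM; terminate the message instead.
            raise OverflowError("chunk counter exhausted; terminate and signal an error")
        encoded = counter.to_bytes(NN, "big")
        return bytes(a ^ b for a, b in zip(aead_nonce, encoded))

The check costs one comparison per chunk and removes any dependence
on implementations "never getting there" in practice.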
To expand on issue 6, here are the replay scenarios I walked through
(Appendix A gives the same analysis as an analogy).

Scenario: mix old chunks with new chunks

   Original request (captured):
      [Chunk 1: "GET /account/balance"]
      [Chunk 2: "Authorization: Bearer token_old"]
      [Chunk 3: "final"]

   Attacker constructs:
      [Chunk 1: "GET /account/balance"]            <- replayed from the old request
      [Chunk 2: "Authorization: Bearer token_new"] <- new, stolen token
      [Chunk 3: "final"]                           <- newly generated

   Does the HPKE sequence binding prevent this? Yes. But an attacker
   can still replay the entire sequence of chunks as a unit, exactly
   as a whole request can be replayed in base OHTTP.

Scenario: replay chunks across different OHTTP sessions

   Session A (Monday):  HPKE context_A -> [Chunk 1_A] [Chunk 2_A] [final_A]
   Session B (Tuesday): HPKE context_B -> [Chunk 1_B] [Chunk 2_B] [final_B]

   Can the attacker send [Chunk 1_A] [Chunk 2_B] [final_A]? No. Each
   session has a unique HPKE context:
      - different ephemeral keys (enc)
      - different sequence counters
      - Chunk 1_A cannot be decrypted with context_B
   HPKE key encapsulation binds all chunks to the same session.

Scenario: replay a request without the final chunk

   Captured request:
      [Header + enc]
      [Chunk 1: "DELETE /account"]
      [Chunk 2: "?confirm=yes"]
      [final chunk with AAD="final"]

   Attacker replays:
      [Header + enc]
      [Chunk 1: "DELETE /account"]   <- stops here, no final chunk

   Impact: the server processes a partial request and may execute the
   DELETE before it realizes the request was truncated. This is the
   main concern from Section 7.1.

   Protection: the final chunk uses AAD="final", and the server MUST
   NOT act on the request until it receives a valid final chunk. But
   what if the server processes incrementally and has side effects?

Scenario: small captured chunks replayed massively

   Attacker captures: [Chunk 1: "SELECT * FROM users"] (1 KB) [final]
   Attacker replays it 1000x in parallel -> the database executes an
   expensive query 1000 times.

Perhaps Section 7.1 should add:

   "Chunked OHTTP does not introduce new full-request replay
   vulnerabilities beyond base OHTTP. However, incremental processing
   introduces new truncation risks:

   - Chunks from different HPKE sessions cannot be mixed (protected
     by key encapsulation).
   - Chunks within a session cannot be reordered (protected by HPKE
     sequence numbers).
   - Truncated requests (missing the final chunk) may be processed
     partially."

In the same vein, I would define "final" as a protocol constant with
an explicit byte encoding, and, since you're in there, consider
adding a version field to the AAD for future extensibility. If
nothing else, my recommendation is to expand the discussion.
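To show what I mean by a protocol constant, here is a purely
illustrative sketch of one way the final-chunk AAD could be pinned
down, with an added version octet. The concrete values and the layout
(version on every chunk, label only on the last) are my assumptions,
not the draft's encoding.

    # Illustrative only: an explicit byte encoding of the "final" sentinel
    # plus a version field. Values and layout are hypothetical.
    AAD_VERSION = b"\x01"     # version octet for future extensibility
    FINAL_LABEL = b"final"    # explicit byte encoding of the sentinel

    def chunk_aad(is_final: bool) -> bytes:
        """AAD for a chunk: version octet on every chunk, 'final' label on the last."""
        return AAD_VERSION + FINAL_LABEL if is_final else AAD_VERSION

Spelling the bytes out removes any ambiguity about the encoding
(ASCII, no length prefix, no terminator) and gives the version octet
a defined place to live.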
Derrell

Appendix A: "The Locked Mailbox with Multi-Part Letters"

Base OHTTP (non-chunked): imagine a locked mailbox system. You drop
one sealed envelope (the encrypted OHTTP request) into a public
mailbox. The mail carrier (relay) can't open it, but delivers it to
the destination server. A thief who copies that envelope can drop it
into the mailbox again later; the post office will dutifully deliver
it again, and the recipient will process it twice.

Chunked OHTTP: multi-part envelopes. Now the sender doesn't send one
big envelope. Instead, the sender sends a series of smaller
envelopes, each labeled "Part 1 of 3," "Part 2 of 3," and so on, all
under a unique session key. Each envelope is locked and numbered, so
they must be opened in order. This enables streaming, but it also
creates new ways for attackers to misbehave.

1. Partial Replay (mix-and-match envelopes)

   An attacker tries to swap envelopes between letters, taking Part 1
   from yesterday's letter and pairing it with Part 2 from today's.
   But each letter series has its own lock and numbering sequence
   (HPKE context + sequence number). The keys don't match, so the
   receiver can't open a hybrid series.

2. Cross-Session Replay

   The thief takes Part 1 from one day's delivery and Part 2 from
   another, pretending they belong together. Different letters,
   different keys; the lock won't open.

3. Truncation + Replay (missing the last envelope)

   Here's a more subtle one. The thief replays Part 1 and Part 2 but
   leaves out "Part 3 of 3," the final piece that means "end of
   letter." If the receiver starts acting on the partial message,
   say, deleting an account before reading the closing "confirm=yes",
   they've been tricked. That is the truncation risk, and it is a new
   vulnerability introduced by chunking that is not present when the
   message arrives in one atomic envelope.

4. Amplification

   Because the envelopes are small, a thief can cheaply copy and
   resend them thousands of times, forcing the recipient to
   repeatedly perform expensive tasks; it is like mailing the same
   tiny instruction slip 1,000 times. The replay volume risk worsens
   as chunks get smaller.

Chunking doesn't make OHTTP less secure in key-exchange terms; the
locks and numbering protect against mixing parts. But it creates new
operational hazards if servers act on incomplete deliveries. Until
the "final envelope" arrives and is verified, the server must not
commit any transaction.
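In implementation terms, that closing rule looks roughly like the
sketch below, where decrypt_chunk and handle_request are hypothetical
stand-ins for the receiver's AEAD open and request dispatch, not APIs
from the draft or any particular library.

    # Sketch of "do not commit until the final envelope is verified".
    # decrypt_chunk() and handle_request() are hypothetical stand-ins.
    def process_chunks(chunks, decrypt_chunk, handle_request):
        """Buffer decrypted chunks; dispatch only after the final chunk verifies."""
        body = bytearray()
        for ciphertext, is_final in chunks:
            body.extend(decrypt_chunk(ciphertext, is_final))  # raises on auth failure
            if is_final:
                # Only now is the request known to be complete and authentic.
                return handle_request(bytes(body))
        # The stream ended without an authenticated final chunk: truncation.
        raise ValueError("truncated request: final chunk never received")

This trades away incremental processing, which is exactly the tension
issue 6 is about: a server that dispatches earlier must be prepared
to roll back side effects.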