This document has been reviewed as part of the transport area review team's
ongoing effort to review key IETF documents. These comments were written
primarily for the transport area directors, but are copied to the document's
authors and WG to allow them to address any issues raised and also to the IETF
discussion list for information.

When done at the time of IETF Last Call, the authors should consider this
review as part of the last-call comments they receive. Please always CC
tsv-art@ietf.org if you reply to or forward this review.

Section 4. "MTU size detection prior to a test is out of scope of this document. A method to detect the path MTU size are proposed by [RFC8899]." 

Using DPLPMTUD requires specifying how to probe and how to receive feedback. Varying the size of the Load PDUs during a test appears to be a bad idea from a measurement integrity perspective. Thus, the probing needs to be done in a phase before the test, and therefore needs to be defined and integrated into the protocol.

If you actually want to support MTU discovery, I would define additional test activation messages that run a path MTU discovery per RFC 8899, using this new request message type and either a response message or the Status PDU. What is required is a message type with a sequence number and something that can acknowledge receipt of the various packet sizes sent, plus an implemented search algorithm that determines a working MTU prior to running the capacity test. But maybe this would be better as an extension to this protocol.

To get an understanding of what is required to apply DPLPMTUD to a protocol, you can review https://datatracker.ietf.org/doc/draft-ietf-tsvwg-udp-options-dplpmtud/ or, for the minimum needed even when you have a complete transport protocol's worth of mechanisms, Section 14.3 of QUIC: https://www.rfc-editor.org/rfc/rfc9000.html#name-datagram-packetization-laye
Note that I think the first is more applicable to what needs to be specified than the QUIC example.
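
As an illustration of the search step mentioned above, here is a minimal Python sketch. RFC 8899 leaves the search strategy to the implementation, so the binary search here is just one possible choice. The callback probe_ok is hypothetical: it stands for "send a probe of this size with a sequence number and report whether it was acknowledged", including whatever retries are needed to tell loss apart from an oversized probe. The base and ceiling values are assumptions as well.

```python
from typing import Callable


def search_pmtu(probe_ok: Callable[[int], bool],
                base: int = 1200, ceiling: int = 1500) -> int:
    """Return the largest probe size in [base, ceiling] that was acknowledged.

    probe_ok(size) is a hypothetical callback that sends one probe of the
    given size and returns True if the peer acknowledged it.
    """
    if not probe_ok(base):
        raise RuntimeError("base size could not be confirmed on this path")
    lo, hi = base, ceiling
    while lo < hi:
        mid = (lo + hi + 1) // 2   # round up so the loop always makes progress
        if probe_ok(mid):
            lo = mid               # mid works, search above it
        else:
            hi = mid - 1           # mid failed, search below it
    return lo
```

The result would then be fixed for the whole capacity test, so that the Load PDU size does not vary during the measurement.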

Section 4.2:
"An OPTIONAL Unauthenticated mode for all messages shall only be allowed when all other modes requiring authentication (or Partial Encryption) are blocked or unavailable for use"

The motivation for this mode appears strange. I think the relevant aspect is that this mode is only acceptable in a lab or limited domain [RFC8799]. And why would one use it in those domains; what is simplified? Lack of clock synchronization, or something else?

Section 4.2.2: This assumes that authUnixTime will be unique for all status messages. What about replay protection? The text appears to give a large time window (10 seconds) within which all messages will be processed. I will note that without any type of sequence number that enables replay protection, an attacker can attempt forgeries during a significantly long window, which makes the attack easier. Also, to my understanding, even a later duplicate will be processed, so nothing prevents a late message from triggering a response that the attacker can observe when a forgery is successful.
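
To illustrate the kind of replay protection I have in mind, here is a minimal Python sketch of a sliding-window check over a per-session sequence number, in the style used by IPsec and DTLS. The sequence number field is hypothetical, since the draft does not define one, and the class name and window size of 64 are assumptions for illustration only.

```python
class ReplayWindow:
    """Sliding-window replay check over a per-session sequence number."""

    def __init__(self, size: int = 64) -> None:
        self.size = size
        self.highest = 0   # highest sequence number accepted so far
        self.bitmap = 0    # bit i set means (highest - i) was already seen

    def accept(self, seq: int) -> bool:
        """Return True exactly once per fresh sequence number.

        Intended to be called only after the message authentication
        check has succeeded.
        """
        if seq <= 0:
            return False
        if seq > self.highest:
            shift = seq - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False               # too old to be tracked: reject
        if self.bitmap & (1 << offset):
            return False               # duplicate: reject
        self.bitmap |= 1 << offset     # late but new: accept and record
        return True
```

With something like this, a duplicated or late-replayed message is dropped silently instead of being processed again.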

Which cipher algorithms must be supported by an implementation? RFC 6234 specifies several SHA-2 variants; which of them are supported? There needs to be a better specification of which part of the key is used if the length doesn't match.

Section 4.2.3: immediacy of authUnixTime
What is the definition of immediacy? Is the timestamp compared against the receiver's own wall clock, or only against prior timestamps? If the former, what implications does that have for requiring synchronized clocks?

"The sender SHALL populate the initVector field with a random 16-octet Initialization Vector (IV)." To my understanding the IVs used need to be cryptographically random, as discussed in RFC 4086 (https://datatracker.ietf.org/doc/html/rfc4086). So adding a clarification and a reference is probably good.

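As a small illustration of the IV point above, this Python sketch draws the 16-octet IV from the operating system's cryptographically secure random source via the standard secrets module, rather than from a general-purpose pseudo-random generator. The function name is made up for this example.

```python
import secrets

IV_LEN = 16  # octets, matching the 16-octet initVector quoted above


def new_init_vector() -> bytes:
    """Return a fresh IV from the OS cryptographically secure generator."""
    return secrets.token_bytes(IV_LEN)
```

A general-purpose generator such as Python's random module would not be a suitable substitute here.
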
Section 4.3: Automatic Key-management. What is the motivation for not providing an automatic mechanism? Let's look at the arguments in RFC 4107: https://datatracker.ietf.org/doc/rfc4107/

To me this usage appears to fall more towards the MUST use automatic key management side of the BCP. First, there will be a fairly large number of keys if one expects to have different test clients for all network endpoints in an access network. Secondly, no short-term keys are derived. Although the IV is full length, the key will have a limited lifetime. The ANSSI recommendation for AES-CBC gives 2^59 blocks as the limit before rekeying. Tracking that could become a problem in actual deployments.

A sketch of automatic key management for this protocol: the initial control and setup phase could be performed over DTLS, followed by a key export to obtain keys for the integrity protection of the status messages and activation control on the data plane. That would make it possible to use a public certificate for the test server. And if one was anyway ready to give clients a secret key, why not give them a certificate and do mutually authenticated DTLS, if client authentication is needed to prevent misuse of the test server?

I would also note that the required key length depends on the cipher, and that you have two different mechanisms.

Section 4.4:

"Assuming that the firewall administration at the server does not allow an open UDP ephemeral port range, then the server MUST send a Null Request to the client from the ephemeral port communicated to the client in the Test Setup Response. The Null Request may not reach the client: it may be discarded by the client's firewall."

I think the client-side port to send to needs to be made clearer here, as to my understanding it is a different port than the one used for the Test Setup Request. If one sends the Null Request to the client's setup port, then the firewall may still block the client's traffic, as this doesn't match the reversed 5-tuple. More on this below.

Section 4.5:

"excluding the Payload Content of the Load PDU and, to be clear, also the IP header)."

This appears to indicate that the UDP header is included in the checksum. That will cause issues if there is a NAT on the path that rewrites the UDP ports. I think a clearer definition is needed of which parts of the PDU are actually covered. A figure showing which field rows are covered would simplify this.

Section 5.1:

Why aren't the Control ID and the protocolVer defined? Having read to the end, I understand that the first field should actually be named PDU Type/ID. The protocolVer should also be defined as a field.

mcIdent appears to be too small a field for avoiding collisions between test session requests from multiple clients, especially as it appears to be the only binding when doing multi-connection tests: nothing in IP/UDP can necessarily be used to determine whether different mcIndex values belong to different sessions or to the same session, other than the mcIdent. I would recommend using more of the padding field to include a larger random request identifier.

Section 5.2.2: Null Request PDU.

This message appears to have very limited utility, as in most cases it will not have the desired effect. Consider the following exchange through a NAT:


Client -> SETUP request (C:P1 -> S:SP) -> NAT (N1:P2 -> S:SP) -> routable Network -> S:SP (Test Server Port)
Client <- SETUP Response(C:P1 <- S:SP) <- NAT(N1:P2 <- S:SP) <- routable network  <- Server
                                       <- NAT(N1:P2 <- S:S1) <- routable network  <- Null Request PDU(N1:P2 <- S:S1) <-  Server

As the NAT has no 5-tuple mapping from the client port towards the server's ephemeral port S:S1 (C:P1 -> S:S1) for external address N1:P2, this packet is dropped by the NAT.

In addition, if there is a firewall between the server and the routable network, then that firewall needs to be configured with a filtering policy corresponding to RFC 4787 Address-Dependent Filtering or Endpoint-Independent Filtering for this message to help. However, the most common pattern for firewalls is, after all, Address and Port-Dependent Filtering, in which case this message would not help.
In addition, this specification is very unclear on whether it is okay to send the test activation request from another client-side ephemeral port. Assuming that one counts on NATs existing between client and server, it MUST be allowed, as a NAT that has what RFC 4787 defines as Address and Port-Dependent Mapping could cause the source port that the activation request arrives from to differ from N1:P2 in the above and instead be N1:P3.

I will note that for the Null Request to function for all filtering types, it would need to be sent to N1:P3 rather than N1:P2. But the client does not know N1:P3, and normally can't learn it unless the intended destination can receive the message and echo it back.
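
To make the Null Request problem concrete, here is a toy Python model of a NAT with endpoint-independent mapping and Address and Port-Dependent Filtering in RFC 4787 terms. The class ToyNat, the addresses and the port numbers are all made up for illustration, and real NAT behaviour varies.

```python
from typing import Dict, Optional, Set, Tuple

Endpoint = Tuple[str, int]  # (IP address, UDP port)


class ToyNat:
    """Endpoint-independent mapping, Address and Port-Dependent Filtering."""

    def __init__(self, external_ip: str) -> None:
        self.external_ip = external_ip
        self.next_port = 40000                      # arbitrary starting port
        self.ext_port: Dict[Endpoint, int] = {}     # internal -> external port
        self.permitted: Set[Tuple[int, Endpoint]] = set()

    def outbound(self, src: Endpoint, dst: Endpoint) -> Endpoint:
        """Client sends src -> dst; returns the external source endpoint."""
        if src not in self.ext_port:
            self.ext_port[src] = self.next_port
            self.next_port += 1
        port = self.ext_port[src]
        self.permitted.add((port, dst))   # only this exact remote may reply
        return (self.external_ip, port)

    def inbound(self, src: Endpoint, dst_port: int) -> Optional[Endpoint]:
        """Packet from src to the external port; None means it is dropped."""
        if (dst_port, src) not in self.permitted:
            return None
        for internal, port in self.ext_port.items():
            if port == dst_port:
                return internal
        return None


nat = ToyNat("203.0.113.1")
client = ("10.0.0.2", 5001)             # C:P1
server_setup = ("198.51.100.7", 25000)  # S:SP
server_test = ("198.51.100.7", 32768)   # S:S1

n1_p2 = nat.outbound(client, server_setup)                   # Setup Request
setup_response = nat.inbound(server_setup, n1_p2[1])         # reaches C:P1
null_request = nat.inbound(server_test, n1_p2[1])            # dropped (None)
```

In this model the packet from S:S1 only gets through after the client itself has sent something to S:S1, which is the opposite direction from what the Null Request tries to achieve.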

Section 5: What about retransmission of requests? As this uses UDP, loss has to be expected, so the control protocol will have to support retransmissions. Still, I don't find any mention of retransmission in this specification. I think it is necessary to define how it is handled. Also consider whether one needs some type of sequence number in the setup request to determine that it is a retransmission of a request that might already have been processed and where the response was lost. But in general the setup protocol and test activation messages can operate based on a lock-step retransmission scheme, as long as the endpoint can determine that a request is repeated and that it just needs to retransmit the old response.
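
As a sketch of the server side of such a lock-step scheme, the following Python fragment caches the response per client and request identifier, so that a repeated request only triggers a retransmission of the old response. The request_id field is hypothetical (the draft has no such field today), and the class and handler names are made up for illustration.

```python
from typing import Callable, Dict, Tuple

Addr = Tuple[str, int]  # (client IP address, client UDP port)


class SetupResponder:
    """Answer each (client, request_id) once; resend the cached answer after."""

    def __init__(self, handler: Callable[[bytes], bytes]) -> None:
        self.handler = handler          # does the real setup processing
        self.cache: Dict[Tuple[Addr, int], bytes] = {}
        self.processed = 0              # number of requests actually processed

    def on_request(self, peer: Addr, request_id: int, payload: bytes) -> bytes:
        key = (peer, request_id)
        if key in self.cache:
            # Retransmission of a request we already handled: the earlier
            # response was probably lost, so resend it without reprocessing.
            return self.cache[key]
        response = self.handler(payload)
        self.processed += 1
        self.cache[key] = response
        return response
```

A real implementation would also need to expire cache entries, and the client side needs a retransmission timer and a retry limit.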

Section 6.1:

"A new registry will be needed for modifierBitmap assignments; see the IANA Considerations section."

So, to my understanding, a registry will be created by Section 11.2.6, and thus new assignments need a codepoint (entry) from that registry. Please also include section references.

Section 7.1: The first field in the figures is poorly labeled. Instead of saying loadID, I think it should say "PDU ID = loadID", with a corresponding change where controlID is used.

lpduSeqNo: Load PDU sequence number (starting at 1). 
Should the field description be explicit that the sequence number can wrap during a long, high-rate session? I calculate that it takes roughly 4000 seconds at 10 Gbps to wrap this field using 1200-byte packets. With smaller packets it wraps faster.
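
The estimate above can be reproduced with a few lines of Python. This assumes a 32-bit sequence number field, which is consistent with the roughly 4000 second figure, and it counts the 1200 bytes as the full per-packet size at the given rate. The function name is made up for this example.

```python
def seconds_to_wrap(rate_bps: float, packet_bytes: int,
                    seq_bits: int = 32) -> float:
    """Time until a per-packet sequence number of seq_bits bits wraps."""
    packets_per_second = rate_bps / (packet_bytes * 8)
    return (2 ** seq_bits) / packets_per_second


wrap_10g = seconds_to_wrap(10e9, 1200)   # about 4123 seconds
```

Halving the packet size halves the time to wrap, since twice as many packets are sent per second at the same rate.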


Section 11.2.2:

First of all, the section heading does not match the name of the field that the registry is for.

Secondly, it is strange that Table 3 indicates registration rules for values that are assigned by Table 4. To my understanding the first entry in Table 3 should be 21-40960, or shouldn't this be 21-40599 to match the bit boundaries?