I have reviewed this document as part of the security directorate's ongoing effort to review all IETF documents being processed by the IESG. These comments were written primarily for the benefit of the security area directors. Document editors and WG chairs should treat these comments just like any other last call comments.

The summary of the review is Ready with Nits.

The protocol seems robust, but the document is in need of an editing pass, particularly around "may" vs "MAY", and the presentation of the security-related concepts could be clearer. Most of my comments can be ignored if the authors are short on time. I have flagged the most important ones with “******”.

Security-related comments:
----------------------------------

Section 2.2: “Note that the Pledge only sends the CSR Attributes request to the entity acting as the EST server as per [RFC7030] section 2.6, and MUST NOT send the CSR Attributes request to the Cloud Registrar.” Why? Is there a security / privacy / operational reason for this MUST NOT? Or is it simply a “this won’t do anything”? If there is a security reason, then please be clear about what “bad thing” happens.

Last paragraph of Section 3.1.2: there are other DoS scenarios against which the Cloud Registrar might wish to protect itself; for example, if a large number of requests come from known-malicious IP addresses, exhibit DoS-style behaviours, etc. Suggested text: "Cloud Registrars MAY implement rate-limiting, incremental backoffs, predictive filtering, or any other applicable DoS mitigation techniques. BRSKI clients SHOULD be equipped with retry mechanisms appropriate to the DoS mitigation techniques used by their manufacturer's Cloud Registrar." Remember that legitimate devices can become compromised and exhibit malicious behaviour, so just because a Pledge device successfully completes TLS client-auth does not mean that it should be fully trusted.

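
To illustrate the kind of client retry behaviour I have in mind for the Section 3.1.2 comment, here is a minimal sketch. It is not proposed draft text. The function name, parameters, and default values are my own assumptions and do not come from the draft or from RFC 8995. It applies capped exponential backoff with full jitter, and honours a server-supplied minimum wait (for example, an HTTP Retry-After value) if one is available.

```python
import random


def retry_delay(attempt, retry_after_s=None, base_s=1.0, cap_s=3600.0, rng=random):
    """Seconds a Pledge waits before retry number `attempt` (0-based).

    All names and defaults here are hypothetical, chosen for illustration only.
    """
    # Exponentially growing window, capped so that a long outage does not
    # push the delay up without bound.
    window = min(cap_s, base_s * (2 ** attempt))
    # Full jitter: pick uniformly inside the window so that a fleet of
    # devices powered on together does not retry in lock-step.
    delay = rng.uniform(0.0, window)
    # Never retry sooner than the server asked for.
    if retry_after_s is not None:
        delay = max(delay, float(retry_after_s))
    return delay
```

A Pledge would call this once per failed attempt, sleep for the returned number of seconds, and then try the Cloud Registrar again. The jitter matters more than the exact base and cap: it keeps a Registrar's rate limiter from seeing synchronized bursts from devices that are behaving legitimately.
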
****** Section 3.3.1: The security of the jump from Cloud Registrar to Owner / VAR Registrar relies quite heavily on the BRSKI Provisional TLS mechanism. Even after skimming RFC 8995, the details of the Provisional TLS state did not become clear to me until I reached somewhere around section 8 of this document. Since so much of the security of this document relies upon the Pledge correctly handling the Provisional TLS state, I highly suggest adding a section specifically about the Provisional TLS state. Section 8.2 has some really good stuff buried in it about what a Pledge MAY and MUST NOT do while it is in the "Provisional TLS" state. Also, I think it would be helpful to explicitly spell out what validation the Pledge MUST perform in order to get itself out of the Provisional TLS state. RFC 8995 section 5.6.2 sorta outlines this, but in my opinion, not well enough to base the entire security of this document on.

Section 5 should include some discussion of the management lifecycle of the TLS certificates used by VAR / Owner Registrars and EST servers; i.e., once a certificate has been pinned, either in Pledge devices or in the vouchers of upstream Registrars, the operator of such infrastructure needs to coordinate with their upstream Registrar in order to change their certificates.

Section 8.1, last sentence: “...but do not constitute a security risk, as the Pledge is correctly verifying all TLS connections as per [BRSKI].” I agree, but I would strengthen that to “..., so long as the Pledge is correctly verifying all TLS connections as per [BRSKI]” to highlight that it is tempting for Pledge manufacturers to be loose with TLS checking around captive portals, but that doing so will likely introduce exploitable security holes, such as an attacker simulating a captive portal scenario in order to feed the Pledge a forged Voucher.

Section 8.2: "There are additional considerations regarding TLS certificate validation that must be accounted" and "The Pledge should check whether the identity of the Registrar". Should those be normative MUST and SHOULD? Or is some other intent meant?

****** Section 9.1: In addition to trust anchor update, there is another huge security reason to do firmware updates as the first step after waking up: to apply any available security patches to the OS, TCP / HTTP stack, etc., to prevent the device from being exploited by malicious network infrastructure. I think this ought to be “Pledges SHOULD attempt to contact the manufacturer and apply any available firmware patches (with any appropriate firmware signing), and networks SHOULD allow this”. In my own professional work, there are some classes of devices where it’s preferable for the device to brick itself in situations like that rather than to allow itself to become compromised, but advice like that is probably too strong for this document.

****** Section 9.2: I’m not sure that this section has drawn the through-line to the security implications clearly enough. The advice in this section feels more like “It won’t work” type advice than “It’ll be insecure” type advice. If it’s not really security, then maybe move it to another section?

****** Same comment about 9.3. What is the security consideration here? What is a Pledge developer or a Registrar operator supposed to do with the text in this section?

Non-security comments & Nits
--------------------------------------

I feel like the use case explanations in the Introduction and Architecture sections are overly verbose and repetitive. These could be shortened.

****** The draft contains 47 “may” and only 12 “MAY”.
I suggest that each of the 47 be checked to decide whether it carries the meaning of “is allowed to”, in which case it should be a normative “MAY”, or carries the meaning of “it could happen that..”, in which case I suggest that some other wording be found instead of “may”.

"For instance, a SIP phone might have a client certificate to be used with a SIP proxy." Define or reference "SIP", please.

In the Terminology section: I would add some more words for OEM and VAR to describe in plain English how those things differ from each other.

The way it is currently presented, it sounds like 4.1 (redirect to another Registrar) and 4.2 (redirect to an EST server) are distinct cases, but I assume that eventually the Pledge needs to end up at an EST server, so 4.2 (redirect to an EST server) must happen eventually, with zero or more iterations of 4.1 (redirect to another Registrar) before it?

Section 2: “there are a number of parties involve” should be “involved”.

Section 3.2: “The Cloud Registrar must determine Pledge ownership”. Should that be a normative MUST? Exactly what technical action is involved in “determining Pledge ownership”? Ah, this is explained below in 3.2.1. Then I think the opening sentence should be “The Cloud Registrar must determine Pledge ownership, see [3.2.1]”.

Section 4.2: typo: “If the est-domain was provided by with an IP address literal”. The “by with” seems like a grammar mistake.

Section 4.2: “The Pledge also has the details it needs to be able to create the CSR request to send to the RA based on the details provided in the voucher.” But that’s not quite true, is it? It might also need to make a /csrattrs call to the EST server, right?

Section 4.2: “In steps 5.a and 5.b, the Pledge may optionally notify the Cloud Registrar/MASA of the success or failure of its attempt to establish a secure TLS channel with the EST server.” It might be helpful to mention the purpose of doing this; i.e., how does communicating this information benefit the Pledge?
What will the Cloud Registrar/MASA do with this information?

****** Section 4.2: “The Pledge must verify that the issued certificate in step 7 has the expected identifier obtained from the Cloud Registrar/MASA in step 3.” I feel like this needs to describe some error handling. If the certificate does not contain the expected identifier, then what is the Pledge supposed to do? Is it supposed to discard the cert and start over? Is it supposed to trigger revocation of the mis-issued cert? If so, how?

Section 5: “The well-known URL that is used is specified by the manufacturer when designing it's firmware”: “it's” should be “its”.

Section 8.1: “A Pledge may be deployed in a network where a captive portal or an intelligent home gateway that provides access control on all connections is also deployed.” I think here you don’t mean “may” as in “is allowed to” (which is the 2119 meaning of the word) but rather “might find itself”. I suggest that “A Pledge might find itself deployed…” would be clearer.

Section 9: “This internet accessible service may be operated by the manufacturer and/or by” and “a Pledge that may have been in a dusty box”. Again, I think each of these wants to be a “might” rather than a “may”, since we want the non-2119 meaning of the word here.

Section 9.2: “The Cloud Registrar may have a certificate”, “it is recommended to limit the number”, “the Cloud Registrar may have a certificate that can”, and “The Pledge may have any kind of Trust Anchor built in”. These feel like they want to be normative MAY and RECOMMENDED.

There are a couple of instances of markdown section references that did not build properly, such as “{bootstrapping-with-no-owner-registrar}”.

Section 9.4: “the Cloud Registrar actually does all the voucher processing as specified in [BRSKI].” Should that be a “MUST”?