# secdir review of draft-ietf-emu-eap-arpa-06

CC kaduk

I have reviewed this document as part of the security directorate's ongoing effort to review all IETF documents being processed by the IESG. These comments were written primarily for the benefit of the security area directors. Document editors and WG chairs should treat these comments just like any other last call comments.

The summary of the review is almost ready -- the general idea of defining predefined identifiers under eap.arpa to signal a type of provisioning EAP access request is sound and, in hindsight, long overdue. Many of my most significant comments will be probing at the boundaries of what we expect future implementations/documents to do, and the statements we make about existing implementations and deployments.

That said, if I were still on the IESG I would ballot DISCUSS due to a few specific points that might impact the current and/or future interoperability of the protocol, and some internal-consistency issues.

I also made a GitHub PR (https://github.com/FreeRADIUS/eap-arpa/pull/2) with some editorial suggestions for things I noticed while reviewing that (probably) do not need discussion here.

## Discuss

### (Non-)Permanence of domain registrations

Section 6.6 (and others) describes self-assignment of identifiers under the "v." subdomain, with an organization being able to use an FQDN they have registered as the domain prefix. But such domain registrations are not permanent, and implementations using such names in software may persist after the registration has lapsed. I think we should have some text in the document discussing this mismatch in timescales, which might entail guidance to domain owners to ensure they keep the domain registered or some guidance to implementors/users that such self-registrations may become stale if the domain ownership changes (or some other solution, of course).
(For example, the claim in §3.2 that such self-assigned identifiers "cannot conflict with other identifiers" is not true if the domain name used to construct the identifiers gets reassigned.)

### authenticate the server or not

It looks like there's some internal inconsistency in what we expect to happen for using EAP-TLS for provisioning w.r.t. server authentication. In toplevel §5 we say that EAP-TLS has the advantage of authenticating the EAP server, but in §5.1 we say that the device "SHOULD ignore" the server certificate (but that the device likely has web CAs present and could use those to authenticate the EAP server). Is there some subtlety I'm missing that makes these cases different? If not, it seems like we need to have a consistent message on what EAP-TLS for provisioning is supposed to provide (and if there is a subtle distinction, we should call it out clearly).

If we do end up keeping the statement that peers could use web CAs to authenticate the EAP server, I would strongly recommend providing some commentary about when it would or would not be a good idea to actually do so, or what factors would come into play in deciding whether or not to do so.

### Does TLS-PSK need to be handled separately from regular EAP-TLS?

The final paragraph of §5.1 mentions that TLS-PSK can technically be used with EAP-TLS for provisioning purposes, but in all the TLS stacks I know of, using TLS-PSK is effectively a distinct operation from doing a certificate-based handshake, and I would not generally expect either peers or servers to be prepared to handle both for the same TLS connection (i.e., letting the other endpoint pick which to use). To me, that suggests that interoperability would benefit from defining a distinct provisioning NAI to indicate that TLS-PSK should be used with EAP-TLS, leaving portal@tls.eap.arpa for certificate-based (server) authentication.
Do we have reason to believe that the current specification will be interoperable in the face of peers/servers that do and do not want to attempt TLS-PSK "authentication"?

I would probably also say something to clarify that the (lowercase!) raw ASCII byte string of the NAI name is used directly as the PSK, without other processing, but that's just at a comment level.

### NAIs for TLS-based EAP methods

The rules for the registry seem to say that there must be a 1:1 correspondence (or at least N:1) between provisioning NAI and EAP method. So I'm really confused as to why we have any discussion of TTLS and PEAP (in §5.2) but say to use the same NAI (portal@tls.eap.arpa) as for EAP-TLS. Why do we not need to define distinct NAIs to provide the semantics indicated here? If the intent is to explicitly not define such NAIs to align with our recommendation to use EAP-TLS in preference to other TLS-based EAP methods, then I think we need a clear disclaimer that portal@tls.eap.arpa MUST NOT be used for those methods.

## Comments

### division of responsibility between this doc and provisioning methods

In §5 we have some discussion about how our predefined provisioning NAIs will interact with existing EAP types, including a statement that where TLS-based methods have inner identity/authentication, those credentials "MUST be the provisioning identifier", among other requirements. I'm not sure I understand why we need to tie our hands so strongly in this document, when any given provisioning identifier is going to be specific to a single EAP method (per §6.2 and §3.4.1). Why is it necessary for the core protocol framework specifically to impose this requirement, vs the individual provisioning methods doing so (with guidance from the framework as a useful default)? I do see that the registration procedure is merely "expert review" so there may not always be a document that would be able to hold such a requirement.
But it seems like we could say "unless otherwise specified, assume that the password is the provisioning identifier" and leave room for future evolution.

### TEAP

There's a lot of mention of TEAP in this doc, including using teap.eap.arpa as an example NAI realm, and discussion of using in-EAP provisioning via TEAP. But this document does not actually specify/register any teap.eap.arpa NAIs. Why not? I see that there's an rfc7170bis in progress, but the current -21 does not contain the string "arpa".

### terminology

My reading of RFC 3748 is that we should prefer "peer" over the IEEE 802.1X "supplicant", but "supplicant" appears a few times. Please check whether or not those uses should be changed to "peer" for consistency.

### table formatting

The prose in Section 6.2.1 suggests that there is a "table" but neither the TXT nor the HTML version is rendering as such for me; I think either the doc-generation toolchain or the prose needs an update to become consistent. The HTML version in particular does not even have a line separator between what are two different lines in the TXT version (which I infer is intended to be a row separator in the table).

### PIE is tasty but perhaps out of scope

While I appreciate the levity in "[t]he choice of "Provisioning Identifiers for EAP" (PIE) was considered and rejected", it feels more suited to an I-D than a final RFC; please consider dropping that sentence.

### Concepts vs protocol

It seems to me that there are large swathes of Section 3 that have grown past just describing "concepts" to going into substantial detail on protocol operation, making the arrival of Section 4 as an "overview" a bit of a surprise. It is even more of a surprise to see that the "overview" is just a review of existing functionality and a rationale for the class of approach taken, while saying essentially nothing about how the actual protocol works.
While it is perhaps a bit late to propose a drastic reshuffling of content, perhaps retitling the two sections would still be useful.

### not routing .arpa for AAA

Section 3 notes:

> The realm is one which should not be automatically proxied by any Authentication, Authorization, and Accounting (AAA) proxy framework as defined in [RFC7542], Section 3.

I think it would be helpful to be more clear about what we mean by "should not", here -- are we making an interpretation of the requirements already present in RFC 7542, an interpretation of the preexisting rules around use of .arpa (I did not try to pull the sources for where that is specified), a statement based on knowledge of current implementation behavior, or something else? I do see there is a bit more discussion in §3.7, but there's no forward reference from here so this text should either gain such a reference or do more to stand on its own.

Similarly, when we say:

> The realm is also one which will not return results for [RFC7585] dynamic discovery.

I assume the premise is that there are no S-NAPTR records in the .arpa zone at all, and that's the basis for the claim. It seems helpful to the reader to include our reasoning in this instance as well.

### Enumerating implementations

In Section 3 we say:

> We note that this specification is fully compatible with all existing EAP implementations, so it is fail-safe.

which is making a statement about "all existing EAP implementations". While I have pretty high confidence in the statement, it remains impossible for us to prove the absence of some private EAP implementation that is incompatible with this specification. So we probably want to hedge a bit about "known implementations" or point to a list of them or something like that.
(We could in theory also go into more detail on what exactly we mean about "compatible" in terms of existing servers behaving as expected in the face of updated peers, and existing peers not doing anything that would trigger the new functionality in upgraded servers, but I don't actually think that would add real value in this case and so do not recommend it.)

### Coordinating method type names and subdomain names

In §3.2:

> Where it is not possible to make a direct mapping between the EAP Method Type name (e.g. "TEAP" for the Tunneled EAP method), and a subdomain (e.g. "teap.eap.arpa"), the name used in the realm registry SHOULD be similar enough to allow the average reader to understand which EAP Method Type is being used.

There's a (probably theoretical) risk of an EAP Method Type that's not a valid domain name being translated to a name, call it foo, and then some future EAP Method Type being created that's named "foo" as well, so the preferred mapping is no longer possible. We could probably avert that by updating the Method Types registry to have a note to not register such conflicting names, though I'm not entirely convinced that it's worth the effort to do so, since new method types are pretty rare and Joe (the DE) would probably flag it anyway.

### Anonymous not recommended

Can we say something in §3.3 about why we say that a username of "anonymous" is "NOT RECOMMENDED"?

### Direct configuration of NAI

In §3.4.1 we say:

> EAP peers MUST NOT allow these NAIs to be configured directly by end users. Instead the user (or some other process) chooses a provisioning method, and the peer then chooses a predefined NAI which matches that provisioning method.

I agree with the goal here, but are there or could there be existing situations where implementations already allow the user to directly enter the NAI (along with the associated credentials)? If so, we probably want some discussion about what might happen if a user (maliciously?)
enters a predefined NAI in such a way, along with guidance that implementations that do allow this behavior need to check for eap.arpa entries and reject them.

### re-authentication process

In §3.4.1:

> When all goes well, running EAP with the provisioning NAI results in new authentication credentials being provisioned. The peer then drops its network connection, and re-authenticates using the newly provisioned credentials.

Do we expect any user involvement in this drop+reauthenticate scenario? Is the user supposed to have access to/knowledge of credentials that are provisioned?

### Allow for server upgrades

In §3.4.1:

> There are a number of ways in which provisioning can fail. One way is when the server does not implement the provisioning method. EAP peers therefore MUST track which provisioning methods have been tried, and not repeat the same method to the same EAP server when receiving a an EAP Nak. EAP peers MUST rate limit attempts at provisioning, in order to avoid overloading the server.

We may want to say something about the not repeating being bound to some large-ish but not-infinite timeframe, to allow for another attempt much later to succeed if the server has been upgraded in the interim. (We also don't want requirements on peers to have unbounded local storage requirements!)

(We could also give some guidance on what good rate limiting might look like, even if that takes the form of factors to consider rather than specific values. Note that rate limiting also comes up in §3.4.2.)

### Large amounts of data and PQC

In §3.4.2:

> A limited network SHOULD also limit the amount of data being transferred by devices being provisioned, and SHOULD limit the network services which are available to those devices. The provisioning process generally does not need to download large amounts of data, and similarly does not need access to a large number of services.

Do you have a sense for what people might take "large amounts of data" to mean?
As we start transitioning to post-quantum cryptography with its larger key sizes, it would be unfortunate if the total data limit for provisioning was too small to admit transfer of credentials using PQC algorithms (but I'm not sure if we actually need to say something, if the limits in practice will be fine without us doing so). (There is some related discussion in §5.1 that might want a section reference back to any new content added here.)

### EST, ACME, and CMP

Section 3.6.2 uses EST and ACME as examples of provisioning protocols, but ACME was a bit surprising for me to see there, since it is most often used for TLS server certificates and where the entity getting a certificate has a DNS name for it, which does not seem like it would generally be the case for an EAP peer. I would find something like CMP (RFC 4210) more analogous to EST as a good example to use here.

### More on AAA Routability

In addition to saying that administrators "will not have statically configured AAA proxy routes for this domain [at the time of this writing]" do we want to say anything about "there is generally no reason for administrators to add such proxy routes, and if they do it would be in service of using this specification"?

### TEAP details

The final paragraph of §4.1 discusses TEAP, but manages to leave enough unsaid that it's hard to discern what point we're trying to make by mentioning it. For example: (1) are we trying to contrast the "server unauthenticated provisioning mode" (presumably for the outer tunnel) with the "inner TLS exchange requires that both end [sic] authenticate each other" apparent requirement that the server can in fact authenticate itself, or to highlight that the peer still needs some credentials for that inner tunnel? (2) The final sentence seems to contrast "ways to provision a certificate" with a need to have preexisting credentials, but the apparent conclusion that the ways to provision a certificate are not very useful is left unsaid.
I would recommend fleshing out this discussion a bit more to make the message more readily apparent.

### Rationale

We have §4.2.1 to give a rationale for provisioning inside EAP, but no corresponding section with a rationale for provisioning inside a captive portal, yet we do not specifically recommend provisioning inside EAP. This leaves me unsure what the purpose of the section is, if we're going to spend time justifying something that's just one option to choose from with no other special status. (I can infer that using a captive portal facilitates reuse of existing provisioning protocols and/or deployments, but the document doesn't tell me that.)

### EAP-TLS clarifications

The final sentence of toplevel §5 provides some commentary about what EAP-TLS allows, but I find myself unclear both about why this information is being added and what scenarios are being described. My current theory is that it's saying that an EAP peer can use EAP-TLS-based provisioning via captive portal with only a small amount of pre-provisioned or factory-provisioned information (the CAs that are locally configured), and we're mentioning this to support our argument that using EAP-TLS for provisioning (whether with in-EAP provisioning or captive-portal provisioning) provides advantages and is generally recommended. Is that correct?

### reference for EAP-TLS

In §6.2.1 we seem to only list RFC 9190 and this document as references for EAP-TLS, skipping RFC 5216. Is that what we want?

### guidance to the experts

Generally we treat "SHOULD NOT" as "MUST NOT, with exceptions". If NAIs in the registry SHOULD NOT contain more than one subdomain, what kind of exceptions might make sense?

Relatedly, I think the guidance should say that NAIs with any "v." subdomain, leading or otherwise, MUST NOT be registered, in order to preserve the purpose of that prefix.

Do we need to specifically include in this section the content from §6.4 that the Method Type must provide MSK and EMSK?
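
To make the two registry checks discussed in this comment concrete (at most one subdomain under eap.arpa, and no "v" label anywhere in the realm), here is a minimal Python sketch of what a designated expert's sanity check might look like. The function name, the messages, and the example NAIs other than portal@tls.eap.arpa are all hypothetical and not taken from the draft; the "v" rule reflects my suggestion above rather than current text.

```python
from typing import List


def check_registry_nai(nai: str) -> List[str]:
    """Return a list of problems with a candidate registry NAI.

    An empty list means the candidate passes these (illustrative) checks.
    """
    _username, sep, realm = nai.rpartition("@")
    if not sep:
        return ["not of the form user@realm"]

    labels = realm.lower().split(".")
    if labels[-2:] != ["eap", "arpa"]:
        return ["realm is not under eap.arpa"]

    # Everything to the left of "eap.arpa" is the subdomain part.
    sub = labels[:-2]
    problems = []
    if not sub:
        problems.append("no subdomain naming an EAP method")
    if "v" in sub:
        # Suggested rule: "v" is reserved for self-assignment,
        # whether it is the leading label or not.
        problems.append('contains a "v" label (MUST NOT be registered)')
    if len(sub) > 1:
        problems.append("more than one subdomain (SHOULD NOT)")
    return problems
```

With this sketch, `check_registry_nai("portal@tls.eap.arpa")` returns an empty list, while a hypothetical name with a "v" label and several labels is flagged on both counts. The open question from the comment remains: what would justify letting the "more than one subdomain" case through.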
### expanded method types

The registry guidance (§6.4) says that the "Method Type" column must either be from the EAP Method types registry or "be an Expanded Type". How would we expect an expanded type that is not in the Method Types registry to appear in our registry?

### specifying designated experts

AFAIK, "For registration requests where a Designated Expert should be consulted, the responsible IESG area director should appoint the Designated Expert" is implied by the use of the "Expert Review" registration procedure and can safely be omitted from this document.

## Nits

### EAP working group

Section 6.5 describes a process including "publish a notice of the decision to the EAP WG mailing list or its successor" which seems stale even as it is written (EMU is "EAP Method Update", not "EAP"). Is this really the statement we want to make?

### pre-defined

Both "pre-defined" and "predefined" (no hyphen) appear, 3 and 6 times, respectively. I removed the three hyphens in my PR but both versions appear in recent-ish RFCs, so the best we can achieve is self-consistency and I don't care which way we achieve that.

### IAB coordination

In §3.1 we have:

> NOTE: the "arpa" domain is controlled by the IAB. Allocation of "eap.arpa" requires agreement from the IAB.

We should probably leave an RFC-Editor note to change the text to reflect an approval, assuming one is granted.

### method vs type

In §4.2 we talk about "provisioning done within the EAP type" (as with EAP-NOOB). Is it better to talk about it being within the method than the type, with the method being a better match for the actual operations performed, as contrasted to the type being an identifier for the method?

### section references for peer unauthenticated access

In §5.1 we reference Section 2.1.1 of RFC 5216 for mention of "peer unauthenticated access" but give no section reference for RFC 9190, which seems like a lack of parallelism.
### naming

If we're going to call the registry "EAP Provisioning Identifiers", we might want to try to bias toward using that term in the running prose; I see a lot of "predefined NAI"/"predefined identifier" and related phrasing that doesn't quite line up with the registry's terminology.

### privacy and more privacy

What's the expected difference between §7 and §8.3 (both of which are titled "Privacy Considerations")?

## Notes

This review is formatted in the "IETF Comments" Markdown format, see https://github.com/mnot/ietf-comments.