Enke/Srihari: Hi! How are you? I have been selected to do a BGP Directorate review of this draft. I'm very happy to see this work revived! In summary, I have some significant concerns with the document. I put in-line comments (below), but want to highlight some major issues here: (1) Implementations of Previous Versions Do implementations exist for previous versions of this document? Have they been deployed? I ask because some of the text intends to work with those implementations (by maintaining the "Ack Request", or the "Unknown Sequence Number" error subcode, for example), or even interoperate with them (by suggesting that a NOTIFICATION be ignored (!), or all of Section 8). This concerns me for two reasons: the obvious squatting and the added complexity to the specification. (2) Network-wide Capabilities While capabilities are negotiated on a session-by-session basis, the functionality they signal may need to be deployed across the entire network. For example, adding a new AF requires support beyond a single hop. This document should discuss this type of deployment and the risks of not being able to revise a capability across multiple hops. The procedures in Section 4 may result in incomplete or inconsistent instantiation of e2e services, without a failsafe. (3) Receiver Demarcation A specific case of inconsistency is the demarcation for the Receiver (§4.2), where it can take action without the loop being closed with the Initiator. Specifically: If the Init/Ack bit is set to 0...the Receiver MUST treat it as an indication of demarcation for that capability revision. The operation, as described, may result in an open loop: the Initiator may never receive the Ack (or timeout, etc.), which could lead to an inconsistent state. Note that if the Initiator tries again, the Receiver will ignore them and the loop will never close: The Receiver SHALL ignore such a capability revision that either results in no change to an existing capability, or removes a capability that was not advertised previously. (4) Security Considerations The current text says that this specification "does not change the underlying security or confidentiality issues inherent in the existing BGP". Even if that were true, the attack surface is extended by the addition of a new message type, new functionality, etc. See the comments below for specific risks. (5) FSM The FSM from Appendix 3 should be moved into the main text. It should indicate any changes or additions to the FSM in RFC 4271. Also, the current text needs a better description and should use terminology similar to that in rfc4271. To the Chairs: this draft should be marked as replacing draft-chen-bgp-dynamic-cap. Thanks! Alvaro. [Line numbers from idnits.] ... 11 Abstract 13 This document defines a new BGP capability termed "Dynamic 14 Capability", which would allow the dynamic update of capabilities 15 over an established BGP session. This capability would facilitate 16 non-disruptive capability changes by BGP speakers. [nit] This document is moving towards being a specification, not a proposal. The use of "would" makes the specification sound like a proposal. Suggestion> This document defines a new BGP capability termed "Dynamic Capability", which allows the dynamic update of capabilities over an established BGP session. This capability facilitates non-disruptive capability changes by BGP speakers. ... 86 1. Introduction ... 98 Most capabilities define just one instance of the capability (also 99 known as single-instance capabilities). Certain other capabilities 100 have multiple instances of capability (also known as multi-instance 101 capabilities). Route Refresh capability [RFC2918], is an example for 102 single-instance capability and the Multiprotocol extensions for BGP-4 103 capability [RFC2858] is a multi-instance capability as it can list 104 one or more individual address-family, sub-address-family 105 capabilities. [nit] s/instances of capability/instances of the capability [nit] s/Route Refresh capability/The Route Refresh capability/g [nit] s/example for/example of a [major] s/RFC2858/RFC4760 [minor] s/individual address-family, sub-address-family capabilities/ Address Family Identifier and Subsequent Address Family Identifier pairs 107 The IANA BGP protocol registry lists the capabilities that a BGP 108 speaker can advertise during session establishment phase. It would 109 benefit network operations if each of these capabilities can be 110 revised dynamically without resetting the session. [minor] "The IANA BGP protocol registry lists the capabilities that a BGP speaker can advertise during session establishment phase." The registry is just a table: it doesn't specify how/when the capabilities can be advertised (that is left to rfc5492 and documents like this). If this document moves forward, it should add a column to the registry to indicate when the capability can be advertised. [nit] The Capabilities Codes registry is technically not part of the IANA BGP Parameters. https://www.iana.org/assignments/capability-codes/capability-codes.xhtml All this is to say that I believe this paragraph can be deleted. 112 This document defines a new BGP capability termed "Dynamic 113 Capability", which would allow the dynamic update of capabilities 114 over an established BGP session. This capability would facilitate 115 non-disruptive capability changes by BGP speakers. [] See the comments in the Abstract. ... 125 2. Enabling a Capability on a peering session [] This section talks about the current operation. Some of the text describes behavior at a high-level without being specific ("certain capabilities"). IMO, the text could be replaced by a simple statement pointing to rfc5492: [RFC5492] specifies how capabilities can be advertised in the BGP OPEN message. All the currently-defined capabilities adhere to that specification, including the need to restart a BGP session if one of the BGP speakers wishes to enable new functionality during the lifetime of the BGP session. [Only the first sentence is really needed.] 127 [RFC5492] specifies Capabilities advertisement for BGP as an optional 128 parameter in OPEN message. By announcing the capability via OPEN 129 message a BGP speaker conveys that it is capable of receiving and 130 properly handling messages related to that capability. A BGP speaker 131 may publish one or more capabilities during the session 132 establishment. This document extends the usage of capability. [nit] s/OPEN message/in the OPEN message/g ... 143 The type of capability advertised by a BGP speaker will determine its 144 behavior during the peering session. For example, Route Refresh 145 Capability can be advertised by only one BGP speaker and by doing so, 146 it interprets and handles incoming Route-Refresh messages. A BGP 147 speaker that supports the Multiprotocol Extensions for BGP-4 148 capability, may use it after determining that the peer also supports 149 this capability. if an operator wishes to enable new functionality 150 during the lifetime of a BGP session, and if that requires a BGP 151 speaker to revise one or more capabilities, it can do so by resetting 152 the session and advertise the capabilities during OPEN as per 153 [RFC5492] [nit] s/if an operator...per [RFC5492]/If an operator...per [RFC5492]. 155 2.1. Dynamic revision of a Capability via BGP Dynamic Capability [] Perhaps rename this section to "Dynamic Capability BGP Capability" (or something that makes it clear the specification of the new capability is done here). 157 This document proposes a new BGP capability called Dynamic 158 Capability, defined as per BGP capability [RFC5492]. The Capability 159 Code for this capability is specified in the "IANA Considerations" 160 section of this document. The Capability Value field consists of a 161 list of capability codes (one-octet for each) that specify the 162 capabilities that MAY be revised dynamically by the remote BGP 163 speaker during the lifetime of a session. The list of capabilities 164 in the value field of Dynamic Capability TLV s hereby referred as 165 DCAP list. [minor] s/proposes a new BGP capability/defines a new BGP capability [minor] No need for an indirection...the code point is already assigned. s/is specified in the "IANA Considerations" section of this document/is 67 [major] The Capability Length is not specified. It is easy to assume what it may be, but we shouldn't have to assume. [minor] It would be nice to add a drawing of what the capability looks like. [nit] Even though the Capabilities format is effectively a TLV, I've never seen a capability be called a TLV. s/in the value field of Dynamic Capability TLV s/in the Capability Value field is [minor] "DCAP list" DCAP is mentioned in §3 as "DYNAMIC CAPABILITY (hereby referred to as DCAP) Message". To avoid confusion with DCAP being associated with the new message, introduce the terminology earlier. [major] To confirm, the set of capabilities in the DCAL List may include more capabilities than the ones advertised in the OPEN message, right? In fact, the lists (OPEN Capability Negotiation vs DCAP List) may be completely different. 167 By advertising the Dynamic Capability to a peer in the OPEN, a BGP 168 speaker conveys to the peer that the speaker is capable of receiving 169 and properly handling the DYNAMIC CAPABILITY message (as defined in 170 the next Section) from the peer after the BGP session has been 171 established. A BGP speaker may announce Dynamic Capability in the 172 OPEN message and during the lifetime of the session, and it may 173 revise one or more capabilities or capability-instances. The BGP 174 speaker that revises its capability by sending the Dynamic Capability 175 message is hereby referred to as Initiator. The remote BGP speaker 176 that responds to the Dynamic Capability message is hereby referred to 177 as Receiver. [major] "A BGP speaker may announce Dynamic Capability in the OPEN message and during the lifetime of the session..." The ability to (initially) advertise the Dynamic Capability remains in the OPEN only. IOW, if it is not included in the OPEN, it cannot be advertised later on. It can be revised, but not advertised (for the first time). 179 Via the Dynamic Capability message, a BGP speaker (Initiator) will 180 revise its capabilities. The receiving BGP speaker (Receiver) will 181 make a note of the Initiator's capability revisions and sends 182 messages to the Initiator pertaining to that capability. While many 183 capabilities enable information exchange via existing BGP messages, 184 some require a change in the format of the message. For example, the 185 add-path capability [RFC7911] or 4-octet AS capability [RFC6793] 186 change the structure of BGP update messages. [minor]"sends messages to the Initiator pertaining to that capability" This sounds as if the Receiver sends more than one message, but it only sends an Ack. 188 This document limits the scope of the dynamic revision of 189 capabilities and following capability revisions are allowed. [nit] s/and following capability /and the following capability ... 197 This document describes procedures for a 2-way handshake for 198 capability revision. Given the underlying TCP's reliable transport, 199 the 2-way handshake procedure is sufficient to create consistent 200 state on both Initiator and Receiver as they implement the capability 201 revision. The Initiator will initiate the handshaking process with a 202 capability revision Init message. The Receiver will acknowledge the 203 capability revision by sending a Ack message. The receipt of the ack 204 message completes the 2-way handshaking procedure and the two 205 speakers can then put the revised capability into effect. The 206 capabilities that do not result in change in the format of any 207 existing message structure can be revised dynamically via 2-way 208 handshake. The capabilities that change the format of existing 209 message structure can be revised during OPEN message or dynamically 210 via [I-D.chen-idr-enhanced-dynamic-cap] which proposes additional 211 protocol procedures. [nit] s/underlying TCP's reliable transport/underlying reliable transport [major] "Init Message...Ack message" These types of messages are not defined anywhere. Yes, after reading Section 3, I can guess that it refers to the setting of the Init/Ack bit. But we shouldn't have to guess -- please add a definition in Section 3. [major] "The Receiver will acknowledge the capability revision by sending a Ack message." The definition in Section 3 says that "The Ack Request bit indicates whether an acknowledgment is requested (when set to 1), or not (when set to 0) for a capability revision being initiated." How is that related to the text in this section, where it sounds like an Ack is always needed? [major] "The capabilities that change the format of existing message structure can be revised during OPEN message or dynamically via [I-D.chen-idr-enhanced-dynamic-cap] which proposes additional protocol procedures." This text should be removed because the document's scope is clear above. When the other ID advances, then it can refer to this document. [] The text in this section that refers to the new DCAP message may fit better *after* the new message is defined. 213 3. BGP Dynamic Capability Message ... 220 +------------------------------+ 221 | Init/Ack (1 bit) | 222 +------------------------------+ 223 | Ack Request (1 bit) | 224 +------------------------------+ 225 | Reserved (5 bits) | 226 +------------------------------+ 227 | Action (1 bit) | 228 +------------------------------+ 229 | Sequence Number (4 octets) | 230 +------------------------------+ 231 | Capability Code (1 octet) | 232 +------------------------------+ 233 | Capability Length (2 octets) | 234 +------------------------------+ 235 | Capability Value (variable) | 236 +------------------------------+ 238 Table 1 [nit] s/Table/Figure Also, please add a description. 240 The Init/Ack bit indicates whether a capability revision is being 241 initiated (when set to 0), or being acknowledged (when set to 1). 243 The Ack Request bit indicates whether an acknowledgment is requested 244 (when set to 1), or not (when set to 0) for a capability revision 245 being initiated. [minor] Given the text in §2.1 about enabling a 2-way handshake, when should this bit not be set? In fact, it looks like this bit is not needed. ... 250 The Action bit is 0 for advertising a capability, and 1 for removing 251 a capability. [major] Much of the text talks about "revising" a capability, which (to me) means "changing". Sure, advertising or removing changes the capability. Can a capability be changed (the time for GR, for example), or does it need to be removed and then advertised? 253 The Sequence Number field MAY be used by a BGP speaker to co-relate 254 the responses to the capability revision that the speaker initiated 255 previously for debugging purposes. [minor] I was going to ask about the use of the Sequence Number, and then found this text in §4.1 -- consider moving it here. The assignment of the Sequence Number is a local matter, and may be used to correlate the responses from the Receiver. This can be helpful during troubleshooting any problems in capability revision. 257 Conceptually the triple is the same as the one defined in [RFC5492], and it 259 specifies a capability for which the "Action" shall be applied. The 260 Capability Length field, though, is larger than the one specified in 261 [RFC5492]. [major] Why are you defining a longer Capability Length? How should the length be aligned so that it matched the Capability Length defined in rfc5492? [major] I know that there's an error code to indicate "Invalid Capability Length". Indicate here the valid lengths. ... 277 4. Procedures for handling BGP Dynamic Capability Revisions 279 A BGP speaker that is willing to receive the DCAP message for a 280 capability from its peer SHOULD use the BGP Capabilities 281 Advertisement [RFC5492] to advertise the Dynamic Capability 282 containing the capability code. A DCAP message MAY be received only 283 in the Established state. Receiving a DCAP message in any other 284 state is a Finite State Machine Error as defined in [RFC4271]. A BGP 285 speaker SHOULD reset the HoldTimer upon receiving a DCAP message from 286 its peer. [major] "A DCAP message MAY be received" In this case, "MAY" is not indicating a normative option, but a fact: s/MAY/may [major] Any changes/updates to the FSM should be explicitly included. See the comments in Appendix 3. [major] "SHOULD reset the HoldTimer upon receiving a DCAP message" Why is this action recommended and not required? The FSM (rfc4271) resets the HoldTimer when any other message is received. Why is the DCAP message different? ... 295 4.1. Procedures for the Initiator 297 The Initiator MUST only proceed with following steps if the Receiver 298 has advertised Dynamic Capability indicating that it is capable of 299 handling DCAP message. For the capability 'c' that the Initiator is 300 going to revise, it MUST also verify if the Receiver has listed 301 capability 'c' in the DCAP list. If the Receiver is not capable of 302 Dynamic Capability or if 'c' is not in the DCAP list, the Initiator 303 should log and discard the capability revision. [nit] s/with following steps...advertised Dynamic Capability...handling DCAP message/ with the following steps...advertised the Dynamic Capability...handling the DCAP message [major] The summary of this first paragraph is: "the Initiator MUST NOT send a DCAP message is the indented Receiver did not advertise the capability to be revised in its DCAP list." IMO, that is a lot clearer than the current text. Among other reasons: (1) there is no interoperability need to use "MUST also verify" because that is an internal operation, and (2) it is not clear what action is taken when "discard the capability revision" is mentioned. [major] What should the Receiver do if it receives a DCAP message for a capability it doesn't support? 305 When the Initiator sends a DCAP message to its peer to initiate a 306 capability revision, the Init/Ack bit for the capability revision in 307 the message MUST be set to 0 indicating that the capability revision 308 has been initiated. The Ack Request bit MUST be set to 1 indicating 309 that capability MUST be acknowledged. The assignment of the Sequence 310 Number is a local matter, and may be used to correlate the responses 311 from the Receiver. This can be helpful during troubleshooting any 312 problems in capability revision. The capability that is being 313 revised will be encoded as per IANA BGP Protocol registry capability 314 codes. While a capability revision is in progress, the Initiator 315 MUST NOT initiate another revision of the same capability (or the 316 same capability instance for a multi-instance capability). [major] "The capability that is being revised will be encoded as per IANA BGP Protocol registry capability codes." The registry is just a list of numbers; it doesn't explain the encoding. Suggestion> The revised capability is encoded per the corresponding specification that defines it. [major] "the Initiator MUST NOT initiate another revision" What action should the Receiver take if the Initiator does? ... 325 * If the Initiator intends to add a new capability, it must not 326 enable the capability or send messages based on the new capability 327 revision. As there is no prior state, it MUST discard any 328 received messages pertaining to that capability. 330 * If the Initiator intends to remove an existing capability, it must 331 not disable the capability but continue to send and process the 332 received messages pertaining to that capability. [nit] Should the "MUST NOT" in the second bullet also be an rfc2119 keyword? To match the style of the first bullet. 334 After receiving the DCAP message carrying capability acknowledgement 335 with Init/Ack bit set to 1 from the Receiver, the Initiator MUST 336 validate the DCAP message verifying the Capability code that was 337 revised. This is the demarcation indicator for the Initiator. With 338 this, the Initiator's capability revision finite state machine is 339 complete and it can then function in accordance with the new 340 capability revision as follows: [minor] "the Initiator's capability revision finite state machine" Please put a reference to the Appendix. 342 * If the Initiator added the capability, it can now process any new 343 messages received, based on the revised capability. 345 * If the capability was withdrawn by the Initiator, it may reset the 346 internal state. Since the prior state is cleared, it may begin to 347 discard the new messages that may be received from the Receiver 348 pertaining to the removed capability. 350 To put an upper bound on the amount of time for capability revision, 351 an implementation MUST support a (configurable) timer 352 CapabilityRevisionTimer that imposes this upper bound. The Initiator 353 starts the CapabilityRevisionTimer when it starts the capability 354 revision by sending the Init message. The timer is stopped with 355 Initiator receives the Ack message from the Receiver. When 356 CapabilityRevisionTimer times out and the capability revision is 357 still in progress, the dynamic capability revision MUST be discarded. 358 This document recommends logging this error condition for 359 troubleshooting purpose and no further attempts for dynamic 360 capability revision should be made without administrator 361 intervention. This document recommends 10 minute timeout value or a 362 similar large value to avoid premature discard of capability 363 revision. [nit] s/The timer is stopped with/The timer is stopped when the [major] "recommends 10 minute timeout value or a similar large value" This document should specify the default value and allow it to be changed. If the default is not specified, the timers may be incongruent, with one side following the recommendation for a "large value" while the other may consider 1 min (for example) to be "large" enough. Suggestion> CapabilityRevisionTimer MUST be set to 10 minutes by default. If the operator decides to change the value, it should be the same on both sides of the BGP session. 365 4.2. Procedures for the Receiver 367 The Receiver should expect more than one capability tuple in the DCAP 368 message and should process each capability revision independently. 369 In the received DCAP message, if the Init/Ack bit is set to 1, it 370 SHOULD silently discard the capability revision. For troubleshooting 371 purposes, the unexpected acknowledgement may be logged. [major] "The Receiver should expect more than one capability tuple in the DCAP message and should process each capability revision independently." Does this mean that the Initiator may include more than one capability "block" (what is in Table 1) in a DCAP message? If so, then it is not clear in Section 3. In fact, this seems to be the only place where that is implied. Am I reading too much into it? [major] This text made me think of this text in §3: If multiple capability instances..., then each capability instance MUST be revised individually, one capability instance at a time. ...and its relationship to this from §4.1: ...the Initiator MUST NOT initiate another revision of the same capability (or the same capability instance for a multi-instance capability). ...and the text above. If multiple capabilities (or instances) are included in the same DCAP message, how does that fit with §3? Is that operation considered to be at different times since the Receiver may Ack them individually (can they?)? 373 If the Init/Ack bit is set to 0, the Receiver MUST first validate the 374 capability code. If the capability code is not listed in the Dynamic 375 Capability (DCAP list) advertised by the Receiver itself, and the 376 Receiver MUST send a NOTIFICATION message back to the Initiator as 377 specified in the Error Handling section. For a valid capability 378 code, the Receiver MUST treat it as an indication of demarcation for 379 that capability revision. [major] See the comment above about "MUST first validate". Suggestion> If a DCAP message is received with the Init/Ack bit set to 0 and containing a Capability Code not advertised by the Receiver, it MUST send a NOTIFICATION... [See the comments in the Error Handling section about this type of NOTIFICATION.] [major] "the Receiver MUST treat it as an indication of demarcation for that capability revision" AFAIU, the demarcation means that the Receiver can start processing the revision. But what exactly does that mean? If the capability is being removed, for example, should the Receiver start ignoring related UPDATES from its Peer? Note that the Initiator has not received the Ack from the Receiver (and may never receive it), so the state could become inconsistent. ... 386 The Receiver SHALL update the capability previously received from the 387 Initiator based on the Action bit in the message, and then function 388 in accordance with the revised capability for the peer. The Receiver 389 SHALL ignore such a capability revision that either results in no 390 change to an existing capability, or removes a capability that was 391 not advertised previously. The procedures specified in the "Error 392 Handling" section SHOULD be followed when an error is detected in 393 processing the CAPABILITY message. [major] "SHOULD be followed" When is it ok not to follow the Error Handling section? IOW, why is this action recommended and not required? It is not necessary to include a Normative statement here. Instead, just point at the section. Suggestion> The "Error Handling" section (Section 7) provides more details in cases when an error is detected. 395 5. Revising capabilities via Dynamic Capability ... 404 A BGP speaker may choose to advertise one of more capabilities. If 405 it has advertised Dynamic Capability (via OPEN or dynamically) it can 406 accept Dynamic Capability message from remote BGP speaker, The value 407 of Dynamic Capability TLV is DCAP list. By having a capability in 408 the DCAP list, the local BGP speaker is indicating that it has the 409 support and ability to handle the revision (add or delete) of that 410 capability. The remote BGP speaker makes a note of the list of 411 capabilities in the DCAP list and performs the revision during the 412 lifetime of the peering session. [] Everything in this paragraph is already covered elsewhere. [major] "has advertised Dynamic Capability (via OPEN or dynamically)" The ability to (initially) advertise the Dynamic Capability remains in the OPEN only. IOW, if it is not included in the OPEN, it cannot be advertised later on. It can be revised, but not advertised (for the first time). [nit] s/Dynamic Capability/the Dynamic Capability/g [nit] s/from remote BGP speaker/from a BGP speaker [nit] ", The value of Dynamic Capability TLV is DCAP list." This sounds like an orphan fragment. 414 It is quite possible to list Dynamic Capability itself in DCAP list. 415 This means that local BGP speaker can handle the revision of Dynamic 416 Capability itself, thereby allowing add/delete capabilities from DCAP 417 list. This document recommends that BGP speakers list the Dynamic 418 Capability Code in Dynamic Capability. This will allow a BGP speaker 419 to revise the list capability instances during the lifetime of the 420 peering session by sending a DCAP message with Dynamic Capability 421 revising DCAP list. [nit] s/DCAP list/the DCAP list/g [nit] s/that local BGP speaker/that the local BGP speaker [nit] The last sentence repeats the first three. 423 6. Limitations of the BGP Dynamic Capability 425 If the capability results in change of the format of the messages, it 426 is important to have tighter co-ordination. For example, the 427 procedures specified in this document does not provide demarcation 428 enough for the Receiver to know when the Initiator will advertise the 429 messages based on the revised capability. Hence, the capabilities 430 that have bi-directional capability dependency requiring 3-way 431 handshake will not function accurately. With this limitation, 432 following capabilities can be revised using the procedures mentioned 433 in this document. [?] "bi-directional capability dependency requiring 3-way handshake will not function accurately" Which are these? Can you provide an example? [major] The definition/specification in this paragraph is mentioned as an example. Suggestion> Not all capabilities can be revised using the procedures in this document. Capabilities that result in a change in the message format, or that require bi-directional advertisement are not supported. Only the following capabilities can be revised using the specification in this document: 435 * Multiprotocol Extensions for BGP-4 437 * Route Refresh Capability for BGP-4 439 * BGP Role 441 * Graceful Restart 443 * Enhanced Route Refresh 445 * Long-Lived Graceful Restart 447 * Routing Policy Distribution 449 * FQDN [major] FQDN is documented in an expired individual draft. If you list it, you should also consider other capabilities that are currently documented in a draft. For example, Multisession, software version, paths-limit, or metadata…. In fact, because the registration policy of part of the registry is FCFS, all capabilities in that space should be covered. [major] The Dynamic Capability itself is not listed. [major] What about ORF? [major] The BGP Extended Message capability doesn't strictly result in changing the message format...why is it not included? [major] Please add a statement requiring future capabilities to include a statement of whether they can be dynamically revised. [major] It would also be a good idea to include the ability to revise a capability in the registry. This would also require a change in the registration policy to at least include DE review. [] It would be nice to include a table that includes the Capability name, code point, and whether it is supported...and if not, why. 451 The remaining capabilities may only be advertised via OPEN message 452 during session establishment. [major] What should a receiver do if one of those capabilities is included in the DCAP message? 454 7. Error Handling ... 464 Subcode Description 465 1 Unknown Sequence Number (deprecated) 466 2 Invalid Capability Length 467 3 Malformed Capability Value 468 4 Unsupported Capability Code [major] This document establishes this set of subcodes. It doesn't need to indicate anything as "deprecated" unless implementations have already been deployed with that codepoint. Do implementations exist?? 470 If a BGP speaker detects an error while processing a CAPABILITY 471 message, it MUST send a NOTIFICATION message with Error Code 472 CAPABILITY Message Error. If any of the defined error subcode is 473 applicable, the Data field of the NOTIFICATION message MUST contain 474 the tuple for the capability revision that causes the speaker to send 475 the message. On the receipt of such a NOTIFICATION message, the BGP 476 speaker should log for troubleshooting purposes. The ongoing 477 capability revision MUST be discarded by both Initiator and Receiver. 478 No new capability revisions can be initiated until administrator 479 intervention. [major] "The ongoing capability revision MUST be discarded by both Initiator and Receiver." This sentence is unnecessary because if a NOTIFICATION is sent, then the session is reset, so whatever was going on there is discarded. [major] "No new capability revisions can be initiated until administrator intervention." What does this mean? The session will restart—is that considered an administrative event, or do you expect a knob to allow revisions through session restarts? I'm assuming any revision would result from a configuration change, which I would call "administrator intervention". 481 This document revises the usage of Sequence Number. The DCAP message 482 fields are sufficient to correlate the message between the Initiator 483 and the Receiver, hence the Sequence Number usage is limited to 484 diagnostic purposes. The BGP speaker MUST not validate the Sequence 485 Number received and MUST not send NOTIFICATION message with Unknown 486 Sequence Number. If a NOTIFICATION message with Error code Unknown 487 Sequence Number is received, it MUST be logged for troubleshooting 488 purposes before silently discarding it. [major] "This document revises the usage of Sequence Number." See the comment about the deprecated value... [major] "If a NOTIFICATION message with Error code Unknown Sequence Number is received, it MUST be logged for troubleshooting purposes before silently discarding it." Once a NOTIFICATION is received, the session is reset, no matter what. Even if the local speaker could ignore the event, the sender should have already reset the session. :-( IOW, the reason for the reset is irrelevant. [major] s/MUST not/MUST NOT ... 498 If the Capability Code in the CAPABILITY message is not any of the 499 capability codes advertised in the Dynamic Capability by the speaker, 500 then the error subcode is set to Unsupported Capability Code. [major] Why should the session be reset in this case? Note that rfc5493 is explicit about an unknown capability not resulting in a session reset: If a BGP speaker receives from its peer a capability that it does not itself support or recognize, it MUST ignore that capability. In particular, the Unsupported Capability NOTIFICATION message MUST NOT be generated and the BGP session MUST NOT be terminated in response to reception of a capability that is not supported by the local speaker. 502 8. Backward compatibility with existing deployment [nit] s/deployment/deployments 504 The new protocol procedures can work with existing implementations of 505 Initiator and Receiver. The following section describes the 506 different scenarios. [major] "existing implementations of Initiator and Receiver" Existing implementations?? This is the first specification -- any pre-specification implementation should be out of scope! ... 538 9. IANA Considerations 540 This document introduces a new CAPABILITY message type for BGP. IANA 541 is requested to allocate the message type. [major] Indicate the registry that this value will come from. 543 This document proposes NOTIFICATION message to handle errors during 544 capability revision. IANA is requested to allocate NOTIFICATION 545 error code for handling such errors. [nit] s/propose/specify/g [major] Indicate the registry that this value will come from. 547 This document proposes Dynamic Capability that BGP speaker announces 548 in the OPEN message. IANA has assigned code point 67 for Dynamic 549 Capability. [major] This has already been assigned. Mention the value and the registry instead, and request that the entry point to this document. [major] This section should also include the request for the new registry (Error Subcodes), the initial values, references, and allocation policy [rfc8126]. 551 10. Security Considerations 553 The extension proposed in this document does not change the 554 underlying security or confidentiality issues inherent in the 555 existing BGP [RFC4271]. [minor] Also mention that the considerations from rfc5492 apply. [major] Several things come to mind: - Dynamically enabling a capability has similar characteristics to enabling it using the OPEN message. For the different supported capabilities, their Security Considerations should apply here as well. - Having a new message type increases the potential attack surface of the protocol. A rogue node may send the new message with incorrect information to cause a session reset. Yes, this can be done with current messages, but the new message increases that risk. Look at the message analysis in rfc4272. - A (rogue) Receiver may not ACK a message, resulting in a timeout and a delay (or complete failure) of the capability revision. The practical result is that a service in the network may not be deployable, which implies the ability to perform a denial-of-service attack. - A downgrade attack is also possible: a rogue Initiator may remove a capability. For example, removing RR/ERR. - While capabilities are negotiated on a session basis, some functionality has to be enabled network-wide -- for example, support for a new AF. If the chain of BGP sessions doesn't all successfully revise a capability, the service in the network may be inconsistent (some support it, but some don't), which could also result in unexpected forwarding, increased congestion, transient loops/dropped packets (as the revision is propagated), etc... - Before dynamic capabilities, a rogue BGP speaker could, for example, inject or withdraw routes. Now it can also change the rules of the session -- "session compromise". The blast radius may expand beyond a single peering session. - A rogue Initiator can repeatedly add/remove a capability, which will force reconfiguration, maybe exchange of UPDATES, RIBs built/torn down repeatedly, control plane instability, etc.. - Except for ADD_PATH, which is not supported, I don't remember a capability that alters the route selection semantics. But if there was one in the future, a dynamic revision could result in bypassing policy or other unintended behavior. This is a general risk... ... 628 12.3. Appendix 3: Operational states for Capability Revision 630 A BGP speaker MUST maintain states about whether a capability has 631 been advertised, or received during the lifetime of the BGP session. 632 For a multi-instance capability, the states of the capability and its 633 revision MUST be instance specific. [major] Is the FSM shown here intended to be Normative/Mandatory (as indicated by the MUSTs above)? I suggest that his part of the appendix be moved to the main body of the document, and an introductory paragraph similar to rfc4271 be included: The data structures and FSM described in this document are conceptual and do not have to be implemented precisely as described here, as long as the implementations support the described functionality and they exhibit the same externally visible behavior. 635 The following symbols are designated for that purpose: 637 "L:" refers to the local speaker and "R:" refers to the remote speaker 639 L:Cap.True - Capability advertised 640 R:Cap.True - Capability received 642 L:Cap.False - Capability not advertised 643 R:Cap.False - Capability not received [major] Advertised and received in the OPEN or in the DCAP message or ?? 645 L:Dyn.Oper.None/Add/Del - Following Capability revision may be triggered 646 at Idle state 647 Operator adds a capability OR 648 Operator deletes a capability [major] "None/Add/Del" What does "None" mean? [major] "at Idle state" I assume you mean "in the Established State". 650 L.Send.None - Router does not send DCAP message 651 R.Send.None - Router does not send DCAP message 653 L.Recv.None - Router does not receive DCAP messages 654 R.Recv.None - Router does not receive DCAP messages [] I'm assuming this is a state where the routers haven't sent or received. 656 L.Send.Init - Router sends DCAP message with Init/Ack=0 657 R.Recv.Init - Router receives DCAP message with Init/Ack=0 659 L.Recv.Ack - Router receives DCAP message with Init/Ack=1 660 R.Send.Ack - Router sends DCAP message with Init/Ack=1 662 During the dynamic revision of a capability, there are separate 663 states, "Sending State", and "Receiving State" driven by the Dynamic 664 Capability revision. [] Describe the FSM using similar notation as in rfc4271. 666 12.3.1. Appendix 3.1: Initiator Capability Revision 668 States for Initiator as it revises its capability. 670 L.Dyn.Oper.None/Add/Del 671 L:Send.None 672 L:Recv.None 673 L:Send.Init 674 L:Recv.Ack 676 State transitions: 678 L:Cap.True/False 679 L:Dyn.Oper.None/Add/Del 680 L:Send.None ---> L:Send.Init ------+ 681 L:Recv.None | 682 | | 683 ^ v 684 | | 685 |------<--- Timeout ----<----+ 686 | | 687 ^ v 688 | | 689 +----<---- L:Recv.Ack --<----+ [] There should be a description of the events that cause transitions, etc.. ... 712 12.4. Deployment use cases [] Include a pointer in the main text to the Appendix. ... 719 12.4.1. Adding a capability dynamically 721 During session establishment, the Receiver advertises Cap-1, Cap-2 722 and DynCap in OPEN message. The DynCap contains Cap-1 and Cap-2 723 indicating that Receiver is capable of handling dynamic revision of 724 capabilities Cap-1 and Cap-2 dynamically. The Initiator during 725 session establishment advertises Cap-1 and DynCap in OPEN message. 726 After session is established, sometime during the lifetime of the 727 peering session, the Operator enables a functionality on the 728 Initiator that requires Cap-2. Since Receiver allows dynamic 729 revision of Cap-2, the Initiator sends DCAP message with Action "add" 730 for Cap-2. The Receiver acknowledges addition of Cap-2 and after 731 receiving the ack, the Initiator puts additions of Cap-2 into effect. [minor] Only this section uses the "DynCap" term. Please be consistent. [minor] s/DynCap list/DCAP list/g 733 12.4.2. Deleting a capability dynamically [minor] In the main body, the text talks about "removing" (not deleting) a capability. Be consistent. ... 745 12.4.3. Network upgrade without Non-Stop-Routing [] Reference for NSR?? 747 The Initiator is upgraded and during session establishment, it 748 advertises Cap-1 and DynCap in OPEN message. The Receiver is not 749 upgraded and during the session establishment, it advertises Cap-1 750 and DynCap in OPEN message. The DynCap lists Cap-1 indicating that 751 Receiver is capable of habdling dynamic revision of Cap-1 only. As 752 part of the upgrade, the Initiator supports new functionality and new 753 capability Cap-2. The Operator wishes to enable new functionality 754 during the lifetime of the peering session and as a result the 755 Initiator wants to "add" Cap-2. It cannot send DCAP message with 756 Action "add" for Cap-2 because the Receiver does not have support to 757 handle Cap-2. The Initiator can add Cap-2 only when Receiver allows 758 the dynamic revision of Cap-2. [] This example seems obvious (and unnecessary) to me: if the peer doesn't support the dynamic revision of a capability, it cannot be revised... 760 12.4.4. Network upgrade with Non-Stop-Routing 762 The Initiator has advertised Cap-1 and DynCap capabilities in the 763 OPEN message. The Receiver has advertised Cap-1 and DynCap 764 capabilities in the OPEN message. The DynCap lists contains Cap-1 765 and DynCap capabilities indicating that Receiver is capable of 766 handling dynamic revision of Cap-1 and DynCap list. The Initiator is 767 upgraded without resetting the BGP session and the Initiator now has 768 additional capability Cap-2 that it wants to revise. The Initiator 769 will revise DynCap capability with action "add" and the DynCap list 770 will contain Cap-1, Cap-2 and DynCap capabilities. The receiver 771 acknowledges that revision of DynCap capability. When the Receiver 772 is updated without resetting the session, it may revise the DynCap 773 capability adding Cap-1, Cap-2 and DynCap capabilities. With this, 774 the Receiver is indicating that it is capable of handling Cap-2 775 capability revision. The Initiator may send DCAP message with Action 776 "add" for Cap-2. The Receiver may follow the similar process 777 independently, thus allowing asynchronous network upgrade without 778 resetting the BGP session. [] I may be missing something obvious, but I see no difference in these examples: until both have the support for Cap-2, it cannot be revised. [EoR-19]