Hello

I have been selected to do a routing directorate “IETF Last Call review” of this draft.
https://datatracker.ietf.org/doc/html/draft-ietf-grow-bmp-bgp-rib-stats-10


Document: draft-ietf-grow-bmp-bgp-rib-stats-10
Reviewer: Bruno Decraene
Review Date: 2025-10-15
Intended Status: Standards Track
Review result: Has issues

I have been selected as the Routing Directorate reviewer for this draft. For background, see the RtgDir wiki.

Summary: I have some concerns about this document that I think should be resolved before it is submitted to the IESG.


Comments:
Draft is well written and clear.
Note that I'm not familiar with BMP, hence I may be missing something.


===== Major:
The gauges defined in this document rely on terms which, in my opinion, are not always clearly specified, e.g., primary path, Adj-RIB-In, pre-policy Adj-RIB-In, post-policy Adj-RIB-In...
I would call for adding a "definition" section to normatively specify the terms used for specifying gauges, in a clear and unambiguous way, possibly pointing to references in existing Standards Track RFCs.

More details are to be found in the "minor comment" section of this review, but some examples below:
- Adj-RIB-In. Not defined, no reference provided. A reader would assume RFC 4271, which says: "Adj-RIBs-In: The Adj-RIBs-In stores routing information learned from inbound UPDATE messages that were received from other BGP speakers." and "In summary, the Adj-RIBs-In contains unprocessed routing information that has been advertised to the local BGP speaker by its peers;". These seem to be the routes received in UPDATE messages, so it's not clear that there is a stage/state before receiving UPDATEs. So what is "pre-policy Adj-RIB-In", which is presumably before Adj-RIB-In? In addition, "pre-policy Adj-RIB-In" points to RFC 7854, which says that "pre-policy Adj-RIB-In" is equal to "Adj-RIB-In"... Given this, some of the text is difficult to understand, e.g., "Type = 18: (64-bit Gauge) Current number of routes in pre-policy Adj-RIB-In [RFC7854]. This gauge updates stats type 7 defined in [RFC7854] and makes it an explicit for pre-policy Adj-RIB-In."

- pre-policy Adj-RIB-In. Not defined, no reference provided. E.g., are routes received but discarded as per Treat-as-Withdraw (RFC 7606) counted or not? On one hand, it seems obviously yes, since they have been received. On the other hand, they are likely not in the Adj-RIB-In, so no. Readers and implementers shouldn't have to guess.

- post-policy Adj-RIB-In. What is the difference between "post-policy Adj-RIB-In" (type 21) and "accepted by inbound policy" (type 23)?
What about routes filtered by "draft-haas-idr-path-attribute-filtering"? Would they be in pre-policy Adj-RIB-In, Adj-RIB-In, post-policy Adj-RIB-In?
 
- post-policy Adj-RIB-Out. Not defined, no reference provided. Should we assume that "post-policy Adj-RIB-Out" on the upstream BGP router (exactly) equals "pre-policy Adj-RIB-In" on the downstream BGP router (given that there is essentially a direct link/TCP socket between both)? If so, could you please say so. If not, could you please define the difference. What about Constrained Route Distribution (RTC, RFC 4684)? Are the routes filtered by RTC counted in post-policy Adj-RIB-Out (because it's not "policy" filtering) or not? What about ORF (which is not a local policy but could be read as a "policy" from my downstream neighbor)?


===== Minor:
Type 18 & 19
"This gauge updates stats type 7 defined in [RFC7854] and makes it an explicit for pre-policy Adj-RIB-In."

What do you mean by that? Are you redefining type 7? If so, shouldn't this document state that it updates RFC 7854?
Or do you just define type 18 and explain why type 7 was not enough, or not specified clearly enough?

Also, statistics seem to be sent with a per-peer header which contains the L flag, indicating whether the message reflects the post-policy Adj-RIB-In or the pre-policy Adj-RIB-In. Should we understand that these type definitions override the L flag, i.e., they advertise pre-policy Adj-RIB-In even if the L flag is set? Or can they not be advertised when the L flag is set? If so, what is the error handling (ignore)?
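
To make the question concrete, here is a minimal sketch of the case a monitoring station would have to handle. The 0x40 bit position of the L flag is from RFC 7854 Section 4.2; the set of "explicitly pre-policy" types reflects my reading of types 18 and 19 in this draft, and the function and enum names are hypothetical. The draft does not currently say what a receiver should do in the CONFLICT case.

```python
from enum import Enum

# Per-peer header L flag (RFC 7854, Section 4.2): set means post-policy Adj-RIB-In.
L_FLAG = 0x40

# Assumption: stat types this draft describes as explicitly pre-policy.
EXPLICIT_PRE_POLICY_TYPES = {18, 19}


class Reading(Enum):
    CONSISTENT = "consistent"
    CONFLICT = "conflict"  # header says post-policy, stat type says pre-policy


def classify(peer_flags: int, stat_type: int) -> Reading:
    """Flag the combination whose handling the draft leaves unspecified."""
    post_policy = bool(peer_flags & L_FLAG)
    if post_policy and stat_type in EXPLICIT_PRE_POLICY_TYPES:
        return Reading.CONFLICT
    return Reading.CONSISTENT
```

For example, classify(0x40, 18) yields Reading.CONFLICT, which is the case I would like the draft to cover.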

---
Type 18 & 19

"When the monitoring station supports both type 7 and type 18, the monitored router SHOULD send only one of these types."
Why not specify which type is to be sent (presumably type 18)?
Why a SHOULD and not a MUST?
My reading is that type 7 was not specified clearly enough. If so, do you want to re-specify type 7 to make it unambiguous, or deprecate it, since the draft seems to say that both are not needed?

---
"Current number of routes in pre-policy Adj-RIB-In [RFC7854]. This gauge updates stats type 7 defined in [RFC7854] and makes it an explicit for pre-policy Adj-RIB-In."

Adj-RIB-In, pre-policy Adj-RIB-In, post-policy Adj-RIB-In... Does this document normatively define those terms, or could you point to definitions in Standards Track RFCs?
In particular, as per RFC 4271, "the Adj-RIBs-In contains unprocessed routing information that has been advertised to the local BGP speaker by its peers". This looks the same as what this document calls "pre-policy Adj-RIB-In". If so, why use two different terms? Why was RFC 7854 type 7 ("Number of routes in Adj-RIBs-In") not clear enough for implementations?
If not, what is the difference?
What about routes received in a BGP UPDATE but discarded as per Treat-as-Withdraw (RFC 7606)? In which type of RIB are they counted/not counted?
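
To illustrate why this matters for implementers, here is a toy sketch of the two readings, which give different gauge values for the same input. The class and function names are mine (hypothetical), not from the draft or any RFC; the prefixes are documentation ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReceivedRoute:
    prefix: str
    treat_as_withdraw: bool  # malformed UPDATE handled per RFC 7606


def count_received(routes):
    """Reading A: every route carried in an UPDATE is counted."""
    return len(routes)


def count_stored(routes):
    """Reading B: only routes actually kept in the Adj-RIB-In are counted."""
    return sum(1 for r in routes if not r.treat_as_withdraw)


routes = [
    ReceivedRoute("192.0.2.0/24", False),
    ReceivedRoute("198.51.100.0/24", True),
    ReceivedRoute("203.0.113.0/24", False),
]
```

With this input, reading A counts 3 routes and reading B counts 2. Two implementations following the same text could therefore report different values.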

---

"A primary route is a recursive or non-recursive path whose next-hop resolution ends with an adjacency (see, e.g., [I-D.ietf-rtgwg-bgp-pic])."
Can you point to a Standards Track RFC defining what a primary route is? If not, can you provide a formal definition?
If bgp-pic is the source of the definition, it would need to be a normative reference. But note that bgp-pic is 1) an informational document, 2) a 10-year-old WG document which may never progress.
Are the "primary routes" the same as the multipath routes defined in draft-ietf-idr-add-paths-guidelines?
---

Some gauges are defined as "Current number of routes" while others are defined as "Number of routes", e.g., types 21 and 23.
If there is no reason, please consider consistency. If there is a reason, please consider making the difference explicit, e.g., at the beginning of section 2.

---
Type 24 & 25 & 26 & 27 & 28 & 31 & 32
"This statistic would apply to Loc-RIB view as well."
I've quickly read RFC 7854 and I haven't seen the ability to specify or distinguish the type of RIB those statistics refer to. Hence I'm not sure what it means for a statistic to apply to both Adj-RIB-In and Loc-RIB. Presumably you are not counting the route twice?
Why use "would" for a normative definition?
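
For reference, this is my understanding of the stat TLV layout in RFC 7854 Section 4.8, as a small parsing sketch: each TLV carries only a Stat Type, a Stat Len, and the Stat Data, with no field naming the RIB view. The function name is hypothetical, and the 8-byte value used for type 24 in the sample is an assumption for illustration.

```python
import struct


def parse_stat_tlvs(payload: bytes):
    """Yield (stat_type, stat_data) pairs from the TLV part of a Statistics Report."""
    offset = 0
    while offset + 4 <= len(payload):
        # Each TLV starts with a 2-byte Stat Type and a 2-byte Stat Len.
        stat_type, stat_len = struct.unpack_from("!HH", payload, offset)
        offset += 4
        data = payload[offset:offset + stat_len]
        if len(data) != stat_len:
            raise ValueError("truncated stat TLV")
        yield stat_type, data
        offset += stat_len


# Sample TLV: type 24 with an assumed 64-bit gauge value of 1000.
sample = struct.pack("!HHQ", 24, 8, 1000)
parsed = list(parse_stat_tlvs(sample))
```

Nothing in the parsed TLV says whether the value refers to the Adj-RIB-In or the Loc-RIB view, hence my question.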
---

Type 25
"A backup path is also installed in the Loc-RIB, but it is not used until some or all primary paths become unreachable. Backup paths are used for fast convergence in the event of failures."
Could you normatively define "backup path"? Are all paths not selected as best or primary to be considered as "backup"? Or do you only mean the Next-Best path (as per draft-ietf-idr-add-paths-guidelines)? Or is this related to FIB installation...?

As per RFC 4271, I'm not sure such backup routes would be in the Loc-RIB (RFC 4271: "Phase 2 is invoked on completion of phase 1.  It is responsible for choosing the best route out of all those available for each distinct destination, and for installing each chosen route into the Loc-RIB."). With "A backup path is also installed in the Loc-RIB", is your intention to update RFC 4271? I.e., does a BGP speaker implementing this specification now need to install backup paths in the Loc-RIB?

---
type 23
"Type = 23: (64-bit Gauge) Number of routes in per-AFI/SAFI accepted by inbound policy. "
Does "accepted by inbound policy" (in type 23) mean the same as "post-policy Adj-RIB-In" (type 21)? If so, can a single terminology be used? In that case types 21 and 23 also seem to be the same. If not, please explicitly clarify the differences.

---
type 23
"Some implementations, or configurations in implementations, may discard routes that do not match policy and thus the accepted count and the Adj-RIB-In counts will be identical in such cases"

For clarity, could you please add the type numbers that you are referring to? I'm assuming NEW "Some implementations, or configurations in implementations, may discard routes that do not match policy and thus the accepted count (type 23) and the Adj-RIB-In counts (type 9 ??) will be identical in such cases."

Essentially, you are saying that the gauge value will differ depending on the implementation? I.e., this Standards Track specification does not provide for interoperability?
If an implementation can't make the distinction between two types (e.g., type 23 and type 9), I think it would be preferable for that implementation not to advertise the stats it does not support, rather than changing the specification of this type.

---
type 30
"Type = 30: (64-bit Gauge) Number of routes in per-AFI/SAFI left until reaching the received route threshold as defined in Section 6.7 of [RFC4271]. "
RFC 4271 couldn't define a per-AFI/SAFI threshold as it does not define multiple AFI/SAFI. So it may be better to rephrase, e.g., :s/as defined in/following the model of
Also maybe :s/in per-/per-   (alternatively specify in _what_)
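
As a sketch of how I read type 30: one threshold and one gauge per AFI/SAFI, which RFC 4271 Section 6.7 alone does not define. The names, the numbers, and the clamping at zero once the threshold is exceeded are my assumptions, not taken from the draft.

```python
def routes_left(threshold: int, received: int) -> int:
    """Routes left before the received-route threshold is reached (never negative)."""
    return max(threshold - received, 0)


# Hypothetical per-AFI/SAFI thresholds and current counts, keyed by (AFI, SAFI).
thresholds = {(1, 1): 1000, (2, 1): 500}
received = {(1, 1): 950, (2, 1): 620}

# One gauge value per AFI/SAFI.
gauge_30 = {key: routes_left(thresholds[key], received[key]) for key in thresholds}
```

If the draft intends something else (e.g., a different behavior once the threshold is exceeded), it would help to say so.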

---
type 31
"Number of routes left until reaching a license-customized route threshold. This value is affected by whether a customized license exists for the relevant address family,"
For this type, it's independent of AFI/SAFI. I would suggest removing "for the relevant address family"
---
type 38
"This counter only considers routes distributed from Loc-RIB into the Adj-RIB-Out and does not include cases like BGP add-paths [RFC7911]."
Could you clarify what this means when add-path is configured for the outbound peer?
To me, add-path routes are also distributed from Loc-RIB into Adj-RIB-Out.

Maybe :s/This counter/This gauge

---
type 41
"Current number of routes in per-AFI/SAFI post-policy Adj-RIB-Out"
Can you point to a definition of "post-policy Adj-RIB-Out"?
In particular, does RT constraint filtering (RFC 4684) count as policy filtering? I.e., are routes filtered by RT constraint counted in the post-policy Adj-RIB-Out?
Same question for ORF.

---
The number of types defined is significant. It's not easy for a reader to have a global view / summary.
As a first time/occasional reader, I would have appreciated a short summary to quickly highlight the differences and what I'm looking for.
e.g. a new section before §4 with 1 line description per type
...
21: routes in per-AFI/SAFI post-policy Adj-RIB-In
22: routes in per-AFI/SAFI rejected by inbound policy.
23: routes in per-AFI/SAFI accepted by inbound policy.
...

Do you think this would be helpful & doable?


===== Nits:
Type 18 & 19
"This gauge updates stats type 7 defined in [RFC7854] and makes it an explicit for pre-policy Adj-RIB-In."
Possibly a typo around "an explicit for". 
Maybe :s/an explicit/explicitly
---
"These routes are active routes which should otherwise would have been advertised in absence of outbound policy which rejected them."
Maybe :s/should otherwise would/otherwise would