This is RTGDIR review of draft-ietf-bier-oam-requirements-18.

Status:
  Has issues.

Summary: 
  Thanks a lot for the work on this draft.

  This document is overall very useful and on a good path. Nevertheless, there
  are a couple of topics that deserve more attention. Explanations and suggested
  changes are provided inline. Some feedback addresses hardware support issues
  and would benefit from review and opinions by vendor implementers.

Details:

  The review consists of the numbered original lines from idnits, interspersed with inline
  comments that are numbered (for easier reference) and classified as nit/minor/major.

2        BIER Working Group                                        G. Mirsky, Ed.
3        Internet-Draft                                                  Ericsson
4        Intended status: Informational                                  N. Kumar
5        Expires: 28 March 2026                               Cisco Systems, Inc.
6                                                                         M. Chen
7                                                             Huawei Technologies
8                                                              S. Pallagatti, Ed.
9                                                                          VMware
10                                                               24 September 2025

12         Operations, Administration and Maintenance (OAM) Requirements for Bit
13                        Index Explicit Replication (BIER) Layer

1. nit:

Suggest moving (OAM) behind (BIER) to have (BIER OAM), because that is the term
you define in the terminology section and candidate readers might look for
that term specifically.

14                          draft-ietf-bier-oam-requirements-18

16        Abstract

18           This document describes a list of functional requirements toward
19           Operations, Administration and Maintenance (OAM) toolset in Bit Index

2. nit:
s/toolset/protocols, methods, and tools/

Explanation: To be consistent with terminology section definition of "BIER OAM"

20           Explicit Replication (BIER) layer of a network.

22        Status of This Memo

24           This Internet-Draft is submitted in full conformance with the
25           provisions of BCP 78 and BCP 79.

27           Internet-Drafts are working documents of the Internet Engineering
28           Task Force (IETF).  Note that other groups may also distribute
29           working documents as Internet-Drafts.  The list of current Internet-
30           Drafts is at https://datatracker.ietf.org/drafts/current/.

32           Internet-Drafts are draft documents valid for a maximum of six months
33           and may be updated, replaced, or obsoleted by other documents at any
34           time.  It is inappropriate to use Internet-Drafts as reference
35           material or to cite them other than as "work in progress."

37           This Internet-Draft will expire on 28 March 2026.

39        Copyright Notice

41           Copyright (c) 2025 IETF Trust and the persons identified as the
42           document authors.  All rights reserved.

44           This document is subject to BCP 78 and the IETF Trust's Legal
45           Provisions Relating to IETF Documents (https://trustee.ietf.org/
46           license-info) in effect on the date of publication of this document.
47           Please review these documents carefully, as they describe your rights
48           and restrictions with respect to this document.  Code Components
49           extracted from this document must include Revised BSD License text as
50           described in Section 4.e of the Trust Legal Provisions and are
51           provided without warranty as described in the Revised BSD License.

53        Table of Contents

55           1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
56             1.1.  Conventions used in this document . . . . . . . . . . . .   2
57               1.1.1.  Terminology . . . . . . . . . . . . . . . . . . . . .   2
58               1.1.2.  Requirements Language . . . . . . . . . . . . . . . .   3
59               1.1.3.  Acronyms  . . . . . . . . . . . . . . . . . . . . . .   3
60           2.  Requirements  . . . . . . . . . . . . . . . . . . . . . . . .   3
61           3.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   5
62           4.  Security Considerations . . . . . . . . . . . . . . . . . . .   5
63           5.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .   5
64           6.  Normative References  . . . . . . . . . . . . . . . . . . . .   5
65           7.  Informative References  . . . . . . . . . . . . . . . . . . .   5
66           Contributors' Addresses . . . . . . . . . . . . . . . . . . . . .   7
67           Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .   8

69        1.  Introduction

71           [RFC8279] introduces and explains Bit Index Explicit Replication
72           (BIER) architecture and how it supports forwarding of multicast data
73           packets.

3. nit:

s/introduces and explains/specifies/

note: After all, it is the actual forwarding plane specification except for the encap part.


75           This document lists the OAM requirements for the BIER layer (see
76           Section 4.2 of [RFC8279]) of the multicast domain.  The list can
77           further be used for gap analysis of available OAM tools to identify
78           possible enhancements of existing or whether new OAM tools are
79           required to support proactive and on-demand path monitoring and
80           service validation.

4. minor:

Given that there are OAM bits in RFC8296, i am puzzled why they are not
mentioned here, even if just as an explanation.

82        1.1.  Conventions used in this document

84        1.1.1.  Terminology

86           The term "BIER OAM" is used in this document interchangeably with a
87           more extended version, "set of OAM protocols, methods, and tools for
88           BIER layer".

90           *  In-band OAM is an active OAM or hybrid OAM method [RFC7799] in
91              which OAM packets traverse the same set of links and interfaces,
92              and receive the same QoS treatment, as the monitored BIER flow.

5. major:

I find the "active or hybrid" (according to RFC7799) to be redundant because active is
already included in the definition of hybrid. So, one option would be to remove the mentioning
of active. But i think that makes it even more terse...

The other problem is that without further elaboration in this document, the definitions
of RFC7799 will not be interpreted unambiguously by readers of this document, leading to
confusion/disagreements about how to interpret this document.

For example: "Passive" is effectively described in RFC7799 section 3.7 as "ipfix measurement".
To me, it seems clear that this also includes any possible signaling mechanisms to set up
such ipfix measurements for pre-existing BIER traffic flows. Including in-band signaling of 
the desire to set up such IPFIX observation - including sending BIER packets with OAM
bits set and indicating a particular match of BIER specific keys. Aka: Specific BFIR, TOS/DSCP
and SI/SD/Bitstring in the BIER header - aka: the (pre-existing) BIER flow of interest.

Is this what the authors think "passive" is ? If yes, it would be good to explain "passive"
accordingly, e.g.: with the type of example text i used in the above paragraph.

RFC7799 active to me simply means that the traffic of interest is OAM generated. And hybrid is
measurements of two or more flows. And i have not seen further references to requirements against
hybrid cases in this spec, so maybe that's best kept for future work.

To me also, the OAM packets consist of OAM flow data packets for passive flows and
OAM signaling packets such as BIER header with OAM bits, and those OAM signaling packets
do not need to receive the same QoS treatment.

"BIER flow" is used but not specified; it needs to be added to the terminology IMHO.

So here is my suggested replacement:

     *  A BIER flow (of interest) is a set of BIER non-OAM packets with a specified combination
        of packet header values, for example the following tuple if RFC8296 encapsulation is
        used: (BIFT-id, TC, BSL, Entropy, DSCP, Proto, BFIR-id, BitString).

     *  In-band OAM for BIER is the subset of active OAM ([RFC7799], section 3.4)
        or passive OAM ([RFC7799], Section 3.6) methods (applied to BIER), in which 
        OAM signaling packets traverse the same set of links and interfaces
        as the BIER flow(s) of interest. In-band OAM for BIER may also include OAM
        indications in BIER data packets, such as RFC8321 methods. These BIER data+OAM
        packets need to receive the same QoS treatment as the other data packets of
        the BIER flow of interest.

     Explanations: In RFC7799 passive OAM, OAM is used to monitor
     non-OAM generated BIER data flows. In RFC7799 active OAM, the BIER data flow is
     generated by OAM too. 

     Note: In RFC7799, hybrid OAM involves two or more flows, for example a non-OAM
     generated flow being observed passively, and an actively generated BIER flow to
     measure the impact of its additional traffic on the non-OAM flow. This document
     does not introduce specific requirements for hybrid use cases, but these may be
     constructed by combining multiple active and/or passive OAM sessions.
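
The flow definition suggested above can be illustrated with a small matching sketch. This is only a minimal illustration under assumptions: the class and field names are hypothetical (bfir_id stands for the BFIR-id header field of RFC8296), the BitString is held as a plain integer, and a key value of None is taken to mean "match any value".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BierHeaderFields:
    """Header values of one BIER packet (hypothetical container)."""
    bift_id: int
    tc: int
    bsl: int
    entropy: int
    dscp: int
    proto: int
    bfir_id: int
    bitstring: int  # BitString as an integer, bit 1 is the low-order bit


@dataclass(frozen=True)
class BierFlowKey:
    """Hypothetical flow-of-interest filter; None means 'match any value'."""
    bift_id: Optional[int] = None
    tc: Optional[int] = None
    bsl: Optional[int] = None
    entropy: Optional[int] = None
    dscp: Optional[int] = None
    proto: Optional[int] = None
    bfir_id: Optional[int] = None
    bitstring: Optional[int] = None
    bitstring_mask: Optional[int] = None  # None means compare all bits

    def matches(self, pkt: BierHeaderFields) -> bool:
        # Compare every scalar field that the key actually specifies.
        for name in ("bift_id", "tc", "bsl", "entropy", "dscp", "proto", "bfir_id"):
            wanted = getattr(self, name)
            if wanted is not None and wanted != getattr(pkt, name):
                return False
        # Compare the BitString under the optional mask.
        if self.bitstring is not None:
            mask = -1 if self.bitstring_mask is None else self.bitstring_mask
            if (pkt.bitstring & mask) != (self.bitstring & mask):
                return False
        return True
```

A key such as BierFlowKey(bfir_id=5, dscp=46) would then select all packets from one BFIR in one traffic class, regardless of the receiver set.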

6. major:

The definition "same set of links and interfaces" that you provide is not the one which
i remember to be used for "in-band". https://en.wikipedia.org/wiki/In-band_signaling is not
"very good" but i think it captures the essence of what i remember: that OAM signaling
is multiplexed into the traffic of interest.

So, IMHO, there needs to be a further explanation of what type of mux/demux points are
considered to be in-band for the purpose of these requirements:

a) ONLY ? demux based on a BIER header OAM indication (like RFC8296 OAM bits ?)

b) Also via "outer header" OAM indications, e.g.: MPLS MNA with BIER payload or
   IPv6+(e.g.:IOAM) ExtHdr with BIER payload ?

c) Also via "inner header" OAM indications, such as MNA when BIER payload is MPLS or
IPv6 ExtHdr when payload is IPv6 ? See also further below on the issues/benefits of such
"DPI" (if we assume we do want to observe/OAM-monitor traffic on midpoint-BFR which without
OAM would not look further into the packet). Hint: IPFIX is often used to DPI into TCP
headers on nodes that otherwise are only routers (e.g.: should not look into TCP headers).

d) If you want to permit for the purpose of this document also OAM signaling packets
that do not even need to include a BIER header, but for example only carry signaling
for a BIER flow via MPLS MNA or IPv6 ExtHdr, then the whole solution is IMHO effectively
not "in-band", but it is "on-path". This is like RSVP. 

I think the text should enumerate these options (with the IPv6/MPLS example options mentioned)
and explain whether or not a particular type is within scope and if not, why not.

7. major:

I am of course a fan of the OAM bits in the BIER header, but i also think that we should
not constrain ourselves to believe that there always will be only one such BIER header (in
the spirit of RFC8279 which did not draw this conclusion), so i would very much like to see the following requirement:

  Requirement: Any encapsulation for BIER MUST support signaling methods to allow differentiating
  in-band OAM messages from non-OAM messages. Note: RFC8296 supports this requirement via two OAM
  bits.

  Explanation: This requirement allows to perform BIER OAM without new dependencies against outer
  encapsulations used (potentially differing per-hop) or the need to inspect packets beyond the
  BIER header solely for OAM signaling.
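
To illustrate how little packet inspection this requires, here is a minimal sketch that extracts the OAM field from the first 12 bytes of an RFC8296 BIER header. The field layout follows RFC8296; the function and class names are hypothetical, and treating any non-zero OAM field as an OAM indication is an assumption, because RFC8296 leaves the non-default use of these bits to other documents.

```python
import struct
from typing import NamedTuple


class BierFixedHeader(NamedTuple):
    bift_id: int
    tc: int
    s: int
    ttl: int
    nibble: int
    ver: int
    bsl: int
    entropy: int
    oam: int
    dscp: int
    proto: int
    bfir_id: int


def parse_bier_fixed_header(data: bytes) -> BierFixedHeader:
    """Decode the three 32-bit words that precede the BitString."""
    if len(data) < 12:
        raise ValueError("need at least 12 bytes of BIER header")
    w0, w1, w2 = struct.unpack("!III", data[:12])
    return BierFixedHeader(
        bift_id=w0 >> 12,          # 20 bits
        tc=(w0 >> 9) & 0x7,        # 3 bits
        s=(w0 >> 8) & 0x1,         # 1 bit
        ttl=w0 & 0xFF,             # 8 bits
        nibble=w1 >> 28,           # 4 bits
        ver=(w1 >> 24) & 0xF,      # 4 bits
        bsl=(w1 >> 20) & 0xF,      # 4 bits
        entropy=w1 & 0xFFFFF,      # 20 bits
        oam=w2 >> 30,              # 2 bits, followed by 2 reserved bits
        dscp=(w2 >> 22) & 0x3F,    # 6 bits
        proto=(w2 >> 16) & 0x3F,   # 6 bits
        bfir_id=w2 & 0xFFFF,       # 16 bits
    )


def has_oam_indication(data: bytes) -> bool:
    # Assumption: any non-zero OAM field marks the packet for OAM handling.
    return parse_bier_fixed_header(data).oam != 0
```

The demux decision depends only on the BIER header itself, independent of the outer encapsulation and of the payload.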

8. major:

There really needs to be a description of how the target of "in-band" in this document
relates to "in-situ" as specified in RFC9378. If the idea is that "in-band" is exactly like
"in-situ", just specific to BIER, then using a different term is confusing and in-band
should be changed to in-situ.

If "in-band" is meant to include other options beside those specified in RFC9378, then
there really needs to be an example given of such an option to motivate the "superset".
And if there is only partial overlap, then of course explain that as well.

If the authors chose "in-band" because they were unclear whether BIER OAM needs to invent something
beyond the in-situ solution, but it is unclear if or what that could be, then that's a perfect
explanation of the relationship too.

94           *  Out-of-band OAM refers to an active OAM method in which the path
95              traversed through the BIER domain is not topologically identical
96              to that of the monitored BIER flow, or in which the OAM test
97              packets receive different QoS treatment, or both.

9. major:

Topological seems redundant. Or else there would be an example of "path traversed ... not identical"
which is not topological. Is there any such example?

The explanation is confusing because it does not distinguish between OAM signaling and
OAM data packets. These need to be separated because those are two different cases.
For example, as mentioned before, even in in-band signaling, not all OAM signaling
packets need to receive the same QoS treatment as the BIER flow of interest.

There is a larger distinction between in-band signaling and out-of-band signaling
than the same path, and that is that the encap for signaling can be arbitrary. That's
important to mention.

The out-of-band signaling also applies to passive OAM.

Suggested rewrite:

           *  Out-of-band OAM signaling refers to an OAM method where OAM signaling
              does not need to be BIER packets with OAM indications, as in in-band, and
              where those OAM signaling packets can traverse through the BIER domain
              not necessarily on the same paths as the monitored BIER flow. Out-of-band
              OAM signaling applies to both active and passive OAM.

           The typical case of Out-of-band OAM signaling consists of signaling from
           a central controller to all or a subset of the BFR along the path of the monitored
           BIER flow.

           *  Out-of-band active OAM (data traffic) refers to an active OAM method in which
              the OAM test packets are not meant to emulate or be actual BIER user traffic
              of interest. These packets may receive different QoS treatments and performance
              measurements than BIER user traffic.

           Out-of-band active OAM may use in-band or out-of-band OAM signaling.
           A typical use case for Out-of-band active OAM is for basic BFIR to BFER
           connectivity testing and potentially path tracing, but foregoing latency and/or
           BER/congestion based packet loss measurements.
        
99           *  OAM session is a communication established between network nodes
100              to perform OAM functions like fault detection, performance
101              monitoring, and localization [RFC7276].  These sessions can be
102              proactive (continuous, persistent configuration) or on-demand
103              (manual, temporary diagnostics).

105        1.1.2.  Requirements Language

107           The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
108           "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
109           "OPTIONAL" in this document are to be interpreted as described in BCP
110           14 [RFC2119] [RFC8174] when, and only when, they appear in all
111           capitals, as shown here.

113           The requirements language is used in Section 2 and applies to
114           implementations of BIER OAM conformant to the listed requirements.

116        1.1.3.  Acronyms

118           BFD: Bidirectional Forwarding Detection [RFC8562]

120           BFR: Bit-Forwarding Router [RFC8279]

122           BFER: Bit-Forwarding Egress Router [RFC8279]

124           BIER: Bit Index Explicit Replication [RFC8279]

126           OAM: Operations, Administration, and Maintenance [RFC6291]

128           p2mp: Point-to-Multipoint [RFC8562]

130           STAMP: Simple Two-way Active Measurement Protocol [RFC8762]

132        2.  Requirements

134           This section lists the requirements for OAM of the BIER layer:

136           1.   The listed requirements MUST be supported with any transport
137                layer over which the BIER layer can be realized.

10. nit:

"transport layer" seems like a confusing choice. TCP ? QUIC ? RFC8279 avoided any such term,
there is no such "transport layer" in RFC8279!

How about something more descriptive derived from RFC8279:

  Any BIER packet encapsulation/transmission option enabled through the BIER routing underlay.

139           2.   It MUST be possible to initialize a BIER OAM session from any
140                Bit-Forwarding Router (BFR) of the given BIER domain.

142           3.   It SHOULD be possible to initialize a BIER OAM session from a
143                centralized controller.

145           4.   BIER OAM MUST support proactive and on-demand OAM monitoring and
146                measurement methods.

148           5.   BIER OAM MUST support unidirectional OAM methods, both
149                continuity check (e.g., Bidirectional Forwarding Detection (BFD)
150                [RFC8562]) and performance measurement (e.g., Simple Two-way
151                Active Measurement Protocol (STAMP) [RFC8762]).

11. major:

No definition what "continuity check" means. I guess the term is taken from RFC5860
which is the reference for RFC6428 that you reference later. Unfortunately, RFC5860
is also not really specifying the semantic of "continuity check".

From what i figure from RFC5860/RFC6428, continuity check is knowing that an LSP reaches
a destination that "accepts" the LSP - but you don't even know if it's the right LSR.
Which makes sense because of PHP, so the target node may not even see whether it was
addressed correctly. BIER is a bit different, in that the BFER will still see its own
bit in the BitString, but that's not necessarily unambiguous. So i think BIER could
indeed have similar problems to MPLS - you reach a destination, but it's the wrong one.

And seemingly in MPLS-TP, this problem is resolved through connectivity verification, which,
if i read RFC5860 correctly, adds the identification of the receiving node.

If i understand all this correctly, then i think BIER OAM definitely needs "Connectivity
Verification" as a requirement, with an explanation of what this means.

So, here is what i would suggest:

  Requirements:

  * BIER OAM MUST support a method for connectivity verification
    (as used in RFC5860/RFC6428) from a chosen BFIR to a chosen BFER which can verify 
    what BFR-id and BFR-prefix of sender and receiver are. 

  Connectivity verification allows verifying correct BFR-id/BFR-prefix configuration and
  forwarding behavior in a BIER domain.

  * BIER OAM MUST support a unidirectional proactive method for continuity check (as used
    in RFC5860/RFC6428) in which a BFER can check the ability of BIER to deliver packets
    from a specific BFIR as indicated by BFR-id/BFR-prefix. Detection of continuity failure
    in less than 100 msec should be supported.

  Continuity check allows triggering service failover or other network management operations
  upon loss or recovery of continuity.

  Connectivity Verification and Continuity Check may use or extend for BIER existing methods
  like Bidirectional Forwarding Detection (BFD) [RFC8562].

12. minor:

RFC8762 is not a sufficient reference by itself because it does not cover
P2MP. If the document continues to reference that RFC for BIER, it should be amended with
an explanation that some BIER-adapted version of RFC8762 is needed, potentially including
P2MP functionality.

13. major:

I am worried about a requirement for a functionality like STAMP in the BIER layer and
would suggest rethinking how to cover the desired functionality. Either remove it, describe
its limitations or expand the scope of the document to include overlay flow layer functionality.

Explanations:

If i am not mistaken, STAMP is an IETF variation of prior proprietary vendor technologies
like IP SLA. I was working on specifying and deploying IP SLA multicast and the experience
was that it was painful to integrate a functionality like this into routers. Sending and
receiving packets to measure feasibility of a path for user traffic may require high-performance
sending and receiving of packets and measuring latency and loss and creating statistics.

Back in the time when IP SLA was implemented and deployed, this was only possible in
CPU on routers. Which was a great tool, but of course very much limited: The latency to send
and receive from/to user land in routers was already larger than the total (HW forwarding)
latency through the network. So latency measurement was not very useful. Likewise, it was
taking router CPU, so operators even resorted to running it on dedicated add-on routers. And
later on, dedicated non-router monitoring/performance measurement devices.

Of course, today, some high-speed forwarding engines can actually act as very powerful packet
sender/receivers (such as implemented on Intel Tofino as shown by Michael Menth's team), but before
we put a requirement like this with a MUST and no further performance limitations into this spec,
i would like to have a round call with all the relevant forwarding plane vendors contributing to
BIER. Or else downgrade/performance-limit/remove the requirement.

[ And if the vendors in BIER think that their PE are great for sending high performance
BIER traffic then i think the same should be true for unicast, and i would love to have
some IETF unicast OAM performance sending/receiving requirements against PE routers/LSR ;-)) ]

The biggest challenge with BIER performance measurement compared to IP Multicast performance
measurement is that it likely will never be a great solution to add external monitoring
devices to the network that operate at the BIER level. Those devices would eat up bits in
BIER Bitstrings and hence reduce the scalability of BIER. Therefore i think that the best
solution for any type of active performance measurement is to:

1. Use external IP Multicast traffic generators - e.g.: directly connected to BFIR/BFER
2. Use out-of-band signaling to establish passive measurement on the BFIR/BFER

And 2. of course would be the same for passive OAM against a flow injected by an actual IP
multicast application.

I think it is also important to understand, which type of monitoring hardware routers are
known to support and which IMHO could well be leveraged by BIER OAM. And that is passive
measurement of traffic including the ability to skip past headers. So i think it is
perfectly valid to ask for the following.

Suggested requirements:

    * BFIR/BFER SHOULD support BIER level traffic generation and reception/measurement
      for simple, lightweight path performance measurement of the type of
      STAMP (Simple Two-way Active Measurement Protocol, [RFC8762]). This MUST be able
      to operate without requiring additional BFR-id/BFR-prefix allocation to that already
      required for non-OAM BIER operations. BFIR/BFER MAY support higher performances.

    With SHOULD level performance, most BFIR/BFER platforms can resort to implementing this
    functionality in software on a CPU such as a control plane CPU, potentially conflicting with
    control plane operations and achieving only limited performance accuracy, such as
    lower accuracy latency incurred by sending/receiving from control plane. With MAY level
    requirement, platforms would likely need to support this packet level sending/receiving
    in a high-speed forwarding plane engine where it may not always be possible to implement.

    * BFR SHOULD support passive monitoring of traffic flows of specific BIER payloads,
      such as simple RTP flows (RFC3550), monitoring the flows throughput, sender-to-monitoring
      node latency/jitter and packet loss/reordering. The ability to specify flow keys SHOULD
      include all relevant BIER traffic attributes such as SI/SD or BIFT-id, BitString (with mask),
      TOS/DSCP, payload proto and Entropy.

    Passive monitoring of video flows has been supported for a long time in MPLS and IP Multicast routers.
    If additional headers for sequence number and timestamps (or equivalent) are available in a
    header/payload (such as when RTP or STAMP are used), then more performance metrics including
    loss, latency, jitter can be measured, otherwise only basic metrics such as throughput,
    inter-packet arrival time distribution and packet size distribution can be measured.
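
    The dependency on sequence numbers and timestamps can be sketched as follows. This is a
    minimal illustration under assumptions, not a description of any router implementation:
    the Observation and FlowStats types and summarize_flow are hypothetical names, sequence
    number wrap-around and duplicates are not handled, and the jitter smoothing follows the
    interarrival jitter estimator of RFC3550. Because jitter only uses differences of transit
    times, it does not need synchronized clocks, whereas absolute one-way latency would.

```python
from typing import Iterable, NamedTuple, Optional


class Observation(NamedTuple):
    seq: int        # sender sequence number (wrap-around is not handled here)
    sent_ts: float  # sender timestamp carried in the payload, in seconds
    recv_ts: float  # arrival time at the monitoring BFR, in seconds


class FlowStats(NamedTuple):
    received: int
    lost: int
    reordered: int
    jitter: float   # smoothed interarrival jitter, in seconds


def summarize_flow(observations: Iterable[Observation]) -> FlowStats:
    received = 0
    reordered = 0
    jitter = 0.0
    lowest: Optional[int] = None
    highest: Optional[int] = None
    prev_transit: Optional[float] = None
    for obs in observations:
        received += 1
        if highest is None or lowest is None:
            lowest = highest = obs.seq
        elif obs.seq < highest:
            reordered += 1  # arrived after a packet with a higher sequence number
            lowest = min(lowest, obs.seq)
        else:
            highest = obs.seq
        transit = obs.recv_ts - obs.sent_ts
        if prev_transit is not None:
            # RFC3550 style smoothing: J += (|D| - J) / 16
            jitter += (abs(transit - prev_transit) - jitter) / 16.0
        prev_transit = transit
    if highest is None or lowest is None:
        return FlowStats(0, 0, 0, 0.0)
    expected = highest - lowest + 1
    return FlowStats(received, max(expected - received, 0), reordered, jitter)
```

    Without the seq and sent_ts inputs, only the packet count and the arrival times remain,
    which matches the reduced metric set described above.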
    
153           6.   BIER OAM packets in the forward direction (i.e., from the
154                ingress toward the egress endpoint(s) of the OAM test session)
155                MUST be transmitted in-band, as defined in Section 1.1.1.

14. minor:

This is missing the detail of whether we're talking OAM data packets in case of active
measurement or actual OAM signaling packets. E.g.: fix to "BIER OAM signaling packets" -
assuming we're talking about signaling packets.

15. major:

But how about when the session is created by a controller ? Then this requirement will
fail, because the controller will first send out-of-band to the OAM session ingress endpoint,
and the egress endpoint may directly send (out-of-band) to the controller.

Aka: add new requirement about how to support signaling for controllers ?

16. major:

I think there are also major requirements missing for active measurements related to
"in-band":

Consider the OAM operator wants to understand how a particular user flow would be treated
by the BIER network. But that flow is currently not present, simply because it is a user flow
that does not always exist when the operator wants to troubleshoot.

Likewise, there can be a hybrid OAM test setup where one wants to measure the impact of
another flow through the network competing with an actually existing user-flow (as described
also in rfc7799). This too requires the ability to generate an active flow that runs along
a predefined path.

So the OAM operator needs to be able to emulate the user flow to the extent that the emulated/active
flow will use the same path/qos through the network as the user flow would.

Requirements:

  * If a BFIR supports setting the Entropy of BIER packets to values other than 0, then it
    SHOULD support the ability to set the entropy of specific BIER overlay flows to
    specific Entropy values determined by OAM operations. The BFIR SHOULD then also support
    determining the Entropy that would be assigned to a non-existing overlay flow with
    known parameters and provide that Entropy value to the OAM system (such as an external
    controller).

  These requirements allow the OAM system to establish an active overlay flow that uses
  the same paths towards BFER as either a pre-existing overlay flow or a potential
  overlay flow.


157           7.   BIER OAM MUST support bi-directional OAM methods.  Such methods
158                MAY combine in-band monitoring or measurement in the forward
159                direction with out-of-band notification, as defined in
160                Section 1.1.1, in the reverse direction (i.e., from the egress
161                toward the ingress endpoint of the OAM test session, as in
162                Point-to-Multipoint (p2mp) BFD with active tail [RFC9780]).

17. minor:

I would again like to up-level the requirement. What seems to be the root requirement
is that the OAM workflow needs the OAM measurement result not only in the place
where it is taken, but also in other places. What are those places ? Is it enough
to ask that this information is available in the node that initiated the OAM session ?

Requirement:

    * BIER OAM MUST support retrieving of the OAM measurement results in the OAM session
      initiator. Explanation: This can be achieved by proactive or reactive signalling of those
      results from the OAM nodes taking the measurements back to the initiator node.

18. minor:

I think it would be very useful to ask for minimizing dependencies on unicast forwarding
in BIER OAM, because that eliminates false positives when (only) unicast does not work.
This is easily possible in BIER whenever the session initiator is not only a BFER but also
BFIR with a BFR-id, because then the return packets can be sent as BIER "unicast" (just with
the session initiator bit set).

Requirement:

    * BIER OAM SHOULD support using BIER "unicast" instead of actual unicast for reverse
      direction messages whenever the ingress endpoint can be reached via BIER, e.g. when it has
      a BFR-id. BIER "unicast" means a BIER packet where only that one bit of the ingress endpoint
      is set.
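
    A minimal sketch of what BIER "unicast" means at the BitString level, using the
    BFR-id to SI:BitPosition mapping of RFC8279 (SI is the integer part of (N-1)/BSL, the
    bit position is ((N-1) modulo BSL)+1, with bit 1 being the low-order bit). The function
    name is hypothetical and the BitString is returned as big-endian bytes only for
    illustration.

```python
from typing import Tuple

VALID_BSL_BITS = (64, 128, 256, 512, 1024, 2048, 4096)


def bier_unicast_target(bfr_id: int, bsl_bits: int = 256) -> Tuple[int, bytes]:
    """Return (SI, BitString) with only the bit of the given BFR-id set."""
    if not 1 <= bfr_id <= 65535:
        raise ValueError("BFR-id must be in the range 1..65535")
    if bsl_bits not in VALID_BSL_BITS:
        raise ValueError("unsupported BitStringLength")
    si = (bfr_id - 1) // bsl_bits
    bit_position = (bfr_id - 1) % bsl_bits + 1  # 1 is the low-order (rightmost) bit
    bitstring = (1 << (bit_position - 1)).to_bytes(bsl_bits // 8, "big")
    return si, bitstring
```

    With a BitStringLength of 256, BFR-id 27 maps to SI 0 with bit 27 set, and BFR-id 260
    maps to SI 1 with bit 4 set.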
   
19. minor:

These requirements do not, i think, fully address the situation where there is an
out-of-band controller. Does this document for example want to express any opinions
about whether or not responses should be sent directly to the controller (unicast only)
or whether they should be sent back through the BIER ingress (unicast or BIER unicast)....

164           8.   BIER OAM MUST support proactive monitoring of BFER availability
165                by a BFR in the given BIER domain, e.g., p2mp BFD active tail
166                support [RFC9780].

20. major:

Totally unclear requirement: what does "BFER availability" mean? Just aliveness?
Aliveness plus some degree of working BIER forwarding plane? Reachability of the BFER
from the "BFR" via BIER? That requirement can only be met if that BFR can be BFIR,
so then write "reachability of a BFER from a BFIR via BIER".


168           9.   BIER OAM MUST support Path Maximum Transmission Unit discovery
169                [RFC1191].

21. minor:

Refinement, with a more detailed rewrite suggestion:

  BIER OAM MUST support discovery of Path Maximum Transmission Unit (MTU) between a BFIR
  and one BFER. This discovery MUST support setting all different values that can impact
  the chosen path across the BIER domain, specifically Entropy (RFC8296). MTU discovery
  SHOULD use an efficient mechanism relying on specific OAM support in the BFR midpoints,
  such as ICMP DF support in IP (RFC1191). For backward compatibility, it SHOULD also
  support a mechanism that works without BFR midpoint OAM support, such as the probing in RFC4821.
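
  The probing fallback can be sketched as a simple search over probe sizes. This is a
  simplified binary search under assumptions, not the RFC4821 state machine: send_probe is
  a hypothetical callback that sends one BIER OAM probe of the given size (with the Entropy
  of interest) to one BFER and reports whether it was acknowledged, and the sketch assumes
  that the lower bound is known to work and that success is monotonic in size. A real
  implementation would also have to retry, because a lost probe does not prove an MTU limit.

```python
from typing import Callable


def probe_path_mtu(send_probe: Callable[[int], bool], low: int, high: int) -> int:
    """Return the largest size in [low, high] for which send_probe succeeds.

    Assumes 'low' is already known to be deliverable.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if send_probe(mid):
            low = mid        # mid fits, search the upper half
        else:
            high = mid - 1   # mid does not fit, search the lower half
    return low
```

  The number of probes grows only logarithmically with the search range, which matters when
  the discovery has to be repeated per Entropy value.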

171           10.  BIER OAM MUST support Remote Defect Indication [RFC6428]
172                notification of the source of continuity checking BFR by Bit-
173                Forwarding Egress Routers (BFERs), e.g., by using the Diagnostic
174                field in p2mp BFD with active tail support, as described in
175                Section 5 of [RFC9780].

22. minor:

RFC5860 and RFC6428 are really terrible in explaining their terms. I have not been able
to figure out exactly what type of failure "RDI" can indicate that other methods can
not indicate. I would appreciate if this document could include an explanation of those
cases, or else i will claim this requirement is not needed.

177           11.  BIER OAM MUST support active and passive performance measurement
178                methods [RFC7799].

nit:

I would move this far up as the first or early requirement. 

23. minor:

Some high-level justification/explanation/examples for each would be nice.

Also a note that further requirements refine this high-level requirement with further
details specific to BIER.


180           12.  BIER OAM MUST support unidirectional performance measurement
181                methods to calculate throughput, loss, delay, and delay
182                variation metrics [RFC6374].  STAMP ([RFC8762] and [RFC8972]) is
183                an example of an active performance measurement method and
184                performance metrics that may be applied in a BIER domain.  The
185                Alternate Marking Method, described in [RFC9341] and [RFC9342],
186                is an example of a hybrid measurement method ([RFC7799]) that
187                may be applied in a BIER domain.

24. nit:

See the earlier suggested requirements text and see if/how these are duplicates and
how to best deal with them.

Also, this requirement sounds like it can only be supported for active OAM unless
the user flow does include timestamp/sequence numbers, so say "MUST support active unidirectional ..."

From STAMP on, i think those are explanations and should be separated from the actual
requirement. Maybe instead of using a numbered list it may be easier to use explicitly
tagged requirements paragraphs and number them manually (formatting issues when separating
requirements from notes/explanations).

189           13.  BIER OAM MUST support defect notification mechanism, like Alarm
190                Indication Signal [RFC6427].  Any BFR in the given BIER domain
191                MAY originate a fault management message [RFC6427] addressed to
192                any subset of BFRs within the domain.

25. major:

Again, these MPLS RFCs are IMHO terrible in explaining what all those fancy OAM features
actually do. So, thanks to a Nokia vendor document (popped up first in Google ;-) for a good
explanation:

https://infocenter.nokia.com/public/7750SR217R1A/index.jsp?topic=%2Fcom.nokia.MPLS_Guide_21.7.R1%2Falarm_indicatio-ai9emdyo4q.html

I got the basic mechanism - i hope:

1) LSP on a midpoint LSR with a known ingress interface and known support for some form of OAM.
2) Link failure on the known ingress interface (e.g: Ethernet link failure indication).
3) MPLS-TP OAM generates an AIS packet towards the downstream LSP
4) Endpoint LSR for this LSP receives AIS and somehow changes its OAM behavior.

First of all, i would like to see a more explanatory writeup of the overall use-case,
e.g.: what the useful magic (4) on the receiver endpoint(s) (BFER) could be.

Secondly: I wonder how feasible this type of functionality is, given that in MPLS-TP it seems
to depend on the pre-existing LSP state, especially a known ingress. BIER state on a midpoint
BFR does not have a known ingress. And as much as i am a fan of establishing on-demand or
proactively additional state across BIER routers for passive monitoring of traffic - when
it comes to additional state for error discovery, that's something whose value i'd first
love to see discussed more.

Aka: Right now i'd say remove it unless we get a better result discussing on the list.

194           14.  BIER OAM MUST support methods to enable the survivability of a
195                BIER layer.  These recovery methods MAY use protection switching
196                and restoration.

26. major:

I am really a big fan of resilience, but i see way too much effort wasted on
hop-by-hop forwarding layer methods, including all the (IP)FRR solutions we have. Of all of
them, BIER-FRR is actually the coolest due to the way BIER operates.

But nevertheless: I think it is much more important to put more focus on survivability via
solutions that use dual transmission with duplicate elimination on the receiver edge.
Only these solutions can provide zero packet loss during any type of single link/node
failure/recovery. And we do have a whole IETF WG that has been working on this solution for
a long time already and has good enough specifications to do this: DetNet. And i am not
talking about any form of QoS to guarantee no-loss or latency; i am only talking about
PREOF as the best form of survivability without interruption.

In addition: In my > 15 years of work deploying IP multicast with high-value applications
across service provider core and access networks, the experience was:

- Customers who came from a protection switching background (TDM or ATM with sub-50 msec
  failover for voice) also raised this requirement against IP Multicast.

- When they then started to deploy more IP Multicast and recognized that sub-50 msec
  protection was not any more useful than sub-500 msec for their (e.g.: video) traffic, they
  also figured out that what matters is often not the absolute duration of an outage but the
  total number of outages that customers experience. And those could be reduced a lot more
  easily with very simple network designs: simply IP Multicast with fast reconvergence
  through well implemented routers and well configured routing protocols. KISS - Keep It Simple
  Stupid. Those solutions typically gave < 500 msec failover, and that was equivalent in
  TV viewer experience. On the other hand, there were those dual transmission customers that
  used appropriate routing mechanisms for path diversity and often merged traffic only in
  specific CE devices.

- BIER actually makes fast reconvergence for multicast much faster already because it
  eliminates multicast tree state convergence and just needs (like unicast) adjacency
  reconvergence if the path to next-hops changes. I would assume that BIER recovery is, like
  unicast, more like < 100 msec for high priority destinations.

So, long story short: Not all BIER networks or services need resilience, and for many of those
that do need it, the options are too varied for a requirement like this to make a lot
of sense; it would mostly bias the readers towards one option that is likely not the most
beneficial.

So, i would like to see this requirement removed or replaced by a broader discussion in
some form of appendix or non-normative section about survivability/resilience.
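
For readers less familiar with dual transmission, the receiver-edge duplicate
elimination mentioned above can be sketched as follows. This is only an
illustration under assumptions: it is not the DetNet PREOF specification, the
class name and the window size are hypothetical, and sequence number wraparound
is deliberately not handled.

```python
class DuplicateEliminator:
    """Accept each sequence number once, regardless of which path delivered it."""

    def __init__(self, window=1024):
        self.window = window   # how far behind the highest sequence number is tracked
        self.highest = None    # highest sequence number accepted so far
        self.seen = set()      # recently accepted sequence numbers

    def accept(self, seq):
        """Return True if the packet should be delivered, False if dropped."""
        if self.highest is not None and seq <= self.highest - self.window:
            return False       # too old to tell apart from a duplicate
        if seq in self.seen:
            return False       # the copy from the other path already arrived
        self.seen.add(seq)
        if self.highest is None or seq > self.highest:
            self.highest = seq
            # Forget sequence numbers that fell out of the window.
            self.seen = {s for s in self.seen if s > self.highest - self.window}
        return True


if __name__ == "__main__":
    eliminator = DuplicateEliminator(window=4)
    # Packets 1..4 arrive over two paths; one path delivers 1 and 2 late.
    print([eliminator.accept(s) for s in [1, 1, 2, 3, 2, 4]])
```

As long as one of the two paths delivers each packet, the receiver sees no loss,
which is the property hop-by-hop repair cannot give during the repair interval.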

27. major:

I think there is a whole topic area missing about establishing passive monitoring across
multiple hops along the path of traffic to isolate where along the path traffic fails.
Of course, there is something to be said about trying to deduce this based on comparing
the performance numbers across the different BFER for a flow and then correlating
at which point in the BIER replication tree the errors would have had to occur to only
impact the subset of BFER that see the error (e.g.: loss, latency too high,...), but 
whenever the tree is too sparse to know this exactly, then passive monitoring
setup on midpoints is necessary to isolate a problem. 
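
The correlation idea above can be sketched as follows, assuming a single failed
link and a replication tree that is known to the analysis tool. All names
(children, candidate_links, the node labels) are hypothetical. A link is a
candidate if exactly the BFERs that report the error sit behind it; more than
one candidate means the edge measurements alone cannot isolate the fault.

```python
def descendant_bfers(children, bfers, node):
    """Return the set of BFERs at or below node in the replication tree."""
    found = {node} if node in bfers else set()
    for child in children.get(node, []):
        found |= descendant_bfers(children, bfers, child)
    return found


def candidate_links(children, root, bfers, failing):
    """Return all (parent, child) links whose failure alone explains `failing`."""
    candidates = []
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in children.get(parent, []):
            if descendant_bfers(children, bfers, child) == failing:
                candidates.append((parent, child))
            stack.append(child)
    return sorted(candidates)


if __name__ == "__main__":
    # BFIR -> A; A replicates to B and E1; B -> C without branching; C -> E2, E3.
    children = {"BFIR": ["A"], "A": ["B", "E1"], "B": ["C"], "C": ["E2", "E3"]}
    bfers = {"E1", "E2", "E3"}
    print(candidate_links(children, "BFIR", bfers, {"E1"}))
    print(candidate_links(children, "BFIR", bfers, {"E2", "E3"}))
```

In the second call both A->B and B->C are returned, because no replication
happens at B. This is the sparse-tree case where monitoring on a midpoint (here
B or C) is needed to tell the two links apart.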

28. major:

I think there may be additional requirements beneficial to auto-diagnose any type of
misconfiguration or inconsistent configurations of the BIER layer in a BIER domain.
And/or the ask to support autoconfiguration of such misconfigurable parameters.

For example, RFC8444 talks ONLY about reporting an error (arguably an OAM operation)
when misconfiguring BAR/IPA. But it should equally be possible to automatically raise
errors when seeing other configuration errors, such as duplicate assignment of a BFR-prefix
(e.g.: detected from the whole topology by one of the BFRs that sees the same BFR-prefix
announced by another router), or duplicate assignment of bits (BFR-id) to multiple BFRs.

And of course at least a MAY for automatically assigning free BFR-id to BFR would be nice.

    * BIER implementations SHOULD discover as many BIER layer misconfigurations as feasible
      in a distributed fashion without the help of a non-BIER-domain node such as an
      external, out-of-band controller. This specifically includes determining duplicate
      assignment of BFR-prefix or BFR-id as well as inconsistent configuration of SI and SI
      or BIFT-id in general.

    * BIER implementations MAY employ mechanisms to automatically assign any of the
      necessary configuration parameters for a BIER domain, such as determining the
      BFR-prefix from the unicast router ID, and assigning an unused BFR-id through distributed
      algorithms such as determining all the other assigned BFR-ids from the link-state IGP
      routing information.
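
Both suggested bullets can be sketched on top of what every BFR already learns
from the IGP. The sketch below is only an illustration under assumptions: the
advertisement tuples (router, BFR-prefix, BFR-id) are a hypothetical, simplified
view of the flooded BIER information for one sub-domain, a BFR-id of 0 is
treated as "not assigned", and the race between two BFRs picking the same free
BFR-id at the same time is not addressed.

```python
from collections import defaultdict

MAX_BFR_ID = 65535  # BFR-ids are in the range 1..65535 (RFC8279)


def find_conflicts(adverts):
    """Return (duplicate BFR-prefixes, duplicate BFR-ids) with the routers involved."""
    routers_by_prefix = defaultdict(set)
    routers_by_id = defaultdict(set)
    for router, prefix, bfr_id in adverts:
        routers_by_prefix[prefix].add(router)
        if bfr_id != 0:  # assumption: 0 means no BFR-id assigned
            routers_by_id[bfr_id].add(router)
    dup_prefix = {p: sorted(r) for p, r in routers_by_prefix.items() if len(r) > 1}
    dup_id = {i: sorted(r) for i, r in routers_by_id.items() if len(r) > 1}
    return dup_prefix, dup_id


def lowest_free_bfr_id(adverts):
    """Return the lowest BFR-id not advertised by any router, or None if all are used."""
    used = {bfr_id for _, _, bfr_id in adverts}
    for candidate in range(1, MAX_BFR_ID + 1):
        if candidate not in used:
            return candidate
    return None


if __name__ == "__main__":
    adverts = [
        ("R1", "10.0.0.1/32", 1),
        ("R2", "10.0.0.2/32", 2),
        ("R3", "10.0.0.2/32", 4),  # same BFR-prefix as R2
        ("R4", "10.0.0.4/32", 2),  # same BFR-id as R2
        ("R5", "10.0.0.5/32", 0),  # no BFR-id assigned yet
    ]
    print(find_conflicts(adverts))
    print(lowest_free_bfr_id(adverts))
```

Because every BFR sees the same flooded information, every BFR can run the same
check and raise the same error without an external controller.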


198        3.  IANA Considerations

200           This document does not propose any IANA consideration.  This section
201           may be removed.

203        4.  Security Considerations

205           This document lists the OAM requirement for a BIER-enabled domain and
206           thus inherits security considerations discussed in [RFC8279] and
207           [RFC8296].  Another general security aspect results from using active
208           OAM protocols, according to the [RFC7799], in a multicast network.

29. nit:

Paragraph break before this sentence, add "as follows" to end of sentence.

209           Active OAM protocols inject specially constructed test packets, and
210           some active OAM protocols are based on the echo request/reply
211           principle.  In the multicast network, test packets are replicated as
212           data packets, thus creating a possible amplification effect of
213           multiple echo responses being transmitted to the sender of the echo
214           request.  Thus, an implementation of BIER OAM MUST protect the
215           control plane from spoofed replies.  Also, an implementation of BIER

30. minor:

I don't think this is a correct representation of the real or perceived issues with multicast
in request/reply OAM applications. If i remember correctly, the root problem was that packets
with a spoofed sender address, sent to a multicast group/port, would create replies
back to the spoofed sender address, hence creating an amplification attack against another
device (the owner of the spoofed IP address).

Aka:

               Active OAM protocols inject specially constructed test packets, and
               some active OAM protocols then send unicast replies back to the
               sender. If the BIER domain operates such that attackers can modify a BFIR
               to send such BIER multicast packets with spoofed sender / reply-to
               addresses, then such OAM protocols need to employ sender address authentication
               (such as specified for STAMP and other protocols) to prevent such reply
               amplification by the BIER service.
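
To make the suggested text more concrete, here is a minimal sketch of a
responder that only replies when the request, including its reply-to address,
carries a valid keyed hash. This is an illustration under assumptions, not the
STAMP packet format: the payload layout, the function names and the pre-shared
key are hypothetical. The only detail borrowed from STAMP's authenticated mode
is the use of HMAC-SHA-256 truncated to 128 bits.

```python
import hashlib
import hmac

TAG_LEN = 16  # 128-bit truncated HMAC-SHA-256 tag


def sign_request(key: bytes, payload: bytes) -> bytes:
    """Append an authentication tag computed over the whole request payload."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()[:TAG_LEN]
    return payload + tag


def should_reply(key: bytes, packet: bytes) -> bool:
    """Return True only if the tag matches, i.e. the reply-to address is genuine."""
    if len(packet) <= TAG_LEN:
        return False
    payload, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()[:TAG_LEN]
    return hmac.compare_digest(tag, expected)


if __name__ == "__main__":
    key = b"example-shared-key"
    genuine = sign_request(key, b"reply-to=192.0.2.1;seq=1")
    # Attacker rewrites the reply-to address but cannot recompute the tag.
    forged = b"reply-to=198.51.100.7;seq=1" + genuine[-TAG_LEN:]
    print(should_reply(key, genuine), should_reply(key, forged))
```

Each BFER that receives the replicated request performs the check itself, so a
forged request produces no replies, no matter how many BFERs it reaches.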


End of Review comments.

216           OAM MUST provide control of the number of BIER OAM messages sent to
217           the control plane.

219        5.  Acknowledgements

221           The authors would like to thank the comments and suggestions from
222           Gunter van de Velde that helped improve this document.

224        6.  Normative References

226           [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
227                      Requirement Levels", BCP 14, RFC 2119,
228                      DOI 10.17487/RFC2119, March 1997,
229                      <https://www.rfc-editor.org/info/rfc2119>.

231           [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
232                      2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
233                      May 2017, <https://www.rfc-editor.org/info/rfc8174>.

235        7.  Informative References

237           [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
238                      DOI 10.17487/RFC1191, November 1990,
239                      <https://www.rfc-editor.org/info/rfc1191>.

241           [RFC6291]  Andersson, L., van Helvoort, H., Bonica, R., Romascanu,
242                      D., and S. Mansfield, "Guidelines for the Use of the "OAM"
243                      Acronym in the IETF", BCP 161, RFC 6291,
244                      DOI 10.17487/RFC6291, June 2011,
245                      <https://www.rfc-editor.org/info/rfc6291>.

247           [RFC6374]  Frost, D. and S. Bryant, "Packet Loss and Delay
248                      Measurement for MPLS Networks", RFC 6374,
249                      DOI 10.17487/RFC6374, September 2011,
250                      <https://www.rfc-editor.org/info/rfc6374>.

252           [RFC6427]  Swallow, G., Ed., Fulignoli, A., Ed., Vigoureux, M., Ed.,
253                      Boutros, S., and D. Ward, "MPLS Fault Management
254                      Operations, Administration, and Maintenance (OAM)",
255                      RFC 6427, DOI 10.17487/RFC6427, November 2011,
256                      <https://www.rfc-editor.org/info/rfc6427>.

258           [RFC6428]  Allan, D., Ed., Swallow, G., Ed., and J. Drake, Ed.,
259                      "Proactive Connectivity Verification, Continuity Check,
260                      and Remote Defect Indication for the MPLS Transport
261                      Profile", RFC 6428, DOI 10.17487/RFC6428, November 2011,
262                      <https://www.rfc-editor.org/info/rfc6428>.

264           [RFC7276]  Mizrahi, T., Sprecher, N., Bellagamba, E., and Y.
265                      Weingarten, "An Overview of Operations, Administration,
266                      and Maintenance (OAM) Tools", RFC 7276,
267                      DOI 10.17487/RFC7276, June 2014,
268                      <https://www.rfc-editor.org/info/rfc7276>.

270           [RFC7799]  Morton, A., "Active and Passive Metrics and Methods (with
271                      Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799,
272                      May 2016, <https://www.rfc-editor.org/info/rfc7799>.

274           [RFC8279]  Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A.,
275                      Przygienda, T., and S. Aldrin, "Multicast Using Bit Index
276                      Explicit Replication (BIER)", RFC 8279,
277                      DOI 10.17487/RFC8279, November 2017,
278                      <https://www.rfc-editor.org/info/rfc8279>.

280           [RFC8296]  Wijnands, IJ., Ed., Rosen, E., Ed., Dolganow, A.,
281                      Tantsura, J., Aldrin, S., and I. Meilik, "Encapsulation
282                      for Bit Index Explicit Replication (BIER) in MPLS and Non-
283                      MPLS Networks", RFC 8296, DOI 10.17487/RFC8296, January
284                      2018, <https://www.rfc-editor.org/info/rfc8296>.

286           [RFC8562]  Katz, D., Ward, D., Pallagatti, S., Ed., and G. Mirsky,
287                      Ed., "Bidirectional Forwarding Detection (BFD) for
288                      Multipoint Networks", RFC 8562, DOI 10.17487/RFC8562,
289                      April 2019, <https://www.rfc-editor.org/info/rfc8562>.

291           [RFC8762]  Mirsky, G., Jun, G., Nydell, H., and R. Foote, "Simple
292                      Two-Way Active Measurement Protocol", RFC 8762,
293                      DOI 10.17487/RFC8762, March 2020,
294                      <https://www.rfc-editor.org/info/rfc8762>.

296           [RFC8972]  Mirsky, G., Min, X., Nydell, H., Foote, R., Masputra, A.,
297                      and E. Ruffini, "Simple Two-Way Active Measurement
298                      Protocol Optional Extensions", RFC 8972,
299                      DOI 10.17487/RFC8972, January 2021,
300                      <https://www.rfc-editor.org/info/rfc8972>.

302           [RFC9341]  Fioccola, G., Ed., Cociglio, M., Mirsky, G., Mizrahi, T.,
303                      and T. Zhou, "Alternate-Marking Method", RFC 9341,
304                      DOI 10.17487/RFC9341, December 2022,
305                      <https://www.rfc-editor.org/info/rfc9341>.

307           [RFC9342]  Fioccola, G., Ed., Cociglio, M., Sapio, A., Sisto, R., and
308                      T. Zhou, "Clustered Alternate-Marking Method", RFC 9342,
309                      DOI 10.17487/RFC9342, December 2022,
310                      <https://www.rfc-editor.org/info/rfc9342>.

312           [RFC9780]  Mirsky, G., Mishra, G., and D. Eastlake 3rd,
313                      "Bidirectional Forwarding Detection (BFD) for Multipoint
314                      Networks over Point-to-Multipoint MPLS Label Switched
315                      Paths (LSPs)", RFC 9780, DOI 10.17487/RFC9780, May 2025,
316                      <https://www.rfc-editor.org/info/rfc9780>.

318        Contributors' Addresses

320           Erik Nordmark
321           Email: nordmark@acm.org

323           Sam Aldrin
324           Google
325           Email: aldrin.ietf@gmail.com

327           Lianshu Zheng
328           Email: veronique_cheng@hotmail.com

330           Nobo Akiya
331           Email: nobo.akiya.dev@gmail.com

333        Authors' Addresses

335           Greg Mirsky (editor)
336           Ericsson
337           Email: gregimirsky@gmail.com

339           Nagendra Kumar
340           Cisco Systems, Inc.
341           Email: naikumar@cisco.com

343           Mach Chen
344           Huawei Technologies
345           Email: mach.chen@huawei.com

347           Santosh Pallagatti (editor)
348           VMware
349           Email: santosh.pallagatti@gmail.com