Hi,

I have been selected as the Operational Directorate (opsdir) reviewer for this Internet-Draft.

The Operational Directorate reviews all operational and management-related Internet-Drafts to ensure alignment with operational best practices and that adequate operational considerations are covered. A complete set of _"Guidelines for Considering Operations and Management in IETF Specifications"_ can be found at https://datatracker.ietf.org/doc/draft-ietf-opsawg-rfc5706bis/.

While these comments are primarily for the Operations and Management Area Directors (Ops ADs), the authors should consider them alongside other feedback received.

- Document: [Internet-Draft Name and Revision]
- Reviewer: Gyan Mishra
- Review Date: 2/12/2026
- Intended Status: Informational

---

## Summary

This draft proposes the CATS framework and architecture for the CATS Working Group.

- Has Issues: I have some major concerns about this document that I think should be resolved before publication.

Terminology section:

- Service: Described as virtual or physical. Virtual refers to a VM and not a container, so I think containers should be included so that cloud-native use cases with Kubernetes and Red Hat OpenShift are covered.
- Metric: Is it a compute metric or a network metric?
- Ingress CATS forwarder: Would this be an RSVP-TE or SR headend source node? If so, I believe it should be stated.
- Egress CATS forwarder: Would this be the RSVP-TE or SR tail end of the tunnel? If so, it should be stated.
- C-PS: Would this be an SR candidate path or an RSVP-TE static or dynamic CSPF path? If so, it should be stated.
- C-SMA and C-NMA: Would the data collected by the service or network agent be sent as streaming telemetry to an NMS or SDN controller? If so, I think it should be mentioned. If not, where would the streaming telemetry be sent? Also, how would this work in a distributed scenario with no centralized controller, and where would the telemetry be sent in that case?
- C-TS: Is the forwarding class QoS or DSCP aware, similar to DS-TE, path-based QoS, or the SR forwarding class concept? If so, it should be mentioned.
- CS-ID: This is missing from the terminology section.

Section 3.4.7 (Underlay infrastructure): Should this include both PE and P nodes, which make up the network underlay for MPLS or SR? If true, this section should state that the CATS ingress forwarder is the SR headend and the CATS egress forwarder is the SR tail end. It should also mention MPLS, RSVP-TE, SR-MPLS, and SRv6 as underlay technologies.

The draft says that the underlay is not necessarily CATS aware, which makes the CATS framework more abstract and less tangible to the reader as to how all of this would come to fruition. If you decide to leave it as-is, then you have to explain in more detail this new overlay paradigm with CATS and its interactions with the network underlay as well as the overlay network layer, for both the network control plane and the data plane. That is, if CATS is using a completely separate control plane and data plane. My suggestion is to state that the network, both underlay and overlay, is CATS aware and that there may be IGP, SR, MPLS, and RSVP-TE extensions to support the CATS framework.

I don't see any mention in the draft of load balancing: no use of anycast, no ADC load balancers for L4 load balancing of flows to backend server instances, no L7 content load balancers, and no GEO load balancing that uses DNS and a GEO database keyed on source IP to serve the DNS name in closest proximity to the client. I think that would be part of the CS-ID or CSCI-ID.

CSCI-ID: The CATS service contact instance ID in the example given could be an IP address or DNS name. I would think that this would be the ADC load balancer VIP that the client connects to, which is L4 load balanced or L7 content load balanced to the backend server service instances, which are the CS-ID. I have it backwards here, but on purpose, as the names are confusing as written.
To correct what I stated: the CS-ID is what the client connects to, that is, the service VIP IP address or DNS name, and the CSCI-ID is the backend service instance IP address or DNS name.

Section 4.1 (Service announcement): If you agree, this section and the CS-ID definition should mention that the load balancer service VIP, either L4 or L7, is advertised to the network through BGP or an IGP, using any related IGP extension if memory or CPU resource information is carried as metadata. The draft should also cover the case where the DC fabric is extended to the host, so that the host (container, VM, or physical) has the IGP or BGP underlay and/or the VXLAN, SR-MPLS, or SRv6 L2 or L3 VPN overlay extended to the CATS compute node.

Section 4.2 (Metric distribution): Are the compute metrics here injected into the IGP to track resource usage and, when resources are running hot, to notify the network dynamically through a cost-related metric change injected into the IGP or BGP using a new routing extension? If that is the case, it should be mentioned.

Section 5.2 (CATS OAM): It seems that this is a completely new overlay infrastructure with completely new OAM telemetry, so it cannot leverage any existing network OAM technologies. I think some of the concepts, like the forwarders, are in fact using IETF network technologies. If not, it complicates matters, as everything is net new. If the CATS control plane and data plane discussed in this section are completely disjoint from the existing network layer, then this gets quite complicated on the network side, and a lot more work will be needed to figure out how the network will work with CATS. If this is all net new at the network layer for CATS, then I would say this draft has major issues. If it is not, and CATS reuses the network layer, then OAM would use the existing network OAM mechanisms of whichever IETF technology is deployed.
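To make the Section 4.2 question concrete, here is a minimal sketch of one way a compute metric could be folded into a cost that the network sees. It is an illustration under stated assumptions, not something the draft specifies: the `ServiceSite` fields, the `compute_cost` mapping, the thresholds, and the additive combination with the network path cost are all hypothetical.

```python
from dataclasses import dataclass

MAX_COST = 65535  # hypothetical upper bound for an advertised cost value


@dataclass
class ServiceSite:
    """Hypothetical view of one service site behind an egress CATS forwarder."""
    egress: str        # name of the egress CATS forwarder (illustrative)
    network_cost: int  # path cost from the ingress forwarder to this egress
    cpu_util: float    # reported CPU utilization, 0.0 to 1.0


def compute_cost(cpu_util: float, hot_threshold: float = 0.8,
                 base: int = 10, hot_penalty: int = 1000) -> int:
    """Map utilization to an additive cost; add a penalty when running hot."""
    if not 0.0 <= cpu_util <= 1.0:
        raise ValueError("cpu_util must be between 0.0 and 1.0")
    cost = base + round(cpu_util * 100)
    if cpu_util >= hot_threshold:
        cost += hot_penalty  # steer new flows away from a hot site
    return min(cost, MAX_COST)


def select_site(sites: list[ServiceSite]) -> ServiceSite:
    """Pick the site with the lowest combined network and compute cost."""
    return min(sites, key=lambda s: s.network_cost + compute_cost(s.cpu_util))


sites = [
    ServiceSite(egress="egress-A", network_cost=20, cpu_util=0.9),
    ServiceSite(egress="egress-B", network_cost=50, cpu_util=0.3),
]
chosen = select_site(sites)
print(chosen.egress)
```

In this sketch the nearer site is passed over while it is running hot and becomes preferred again once its utilization drops. Whether such a cost would be carried in an IGP or BGP extension, or computed only at the C-PS, is exactly the detail the draft should spell out.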
## General Operational Comments

Alignment with RFC 5706bis

> This document defines the framework for CATS (Computing-Aware Traffic Steering). While the technical approach is sound, the draft lacks clarity on how the mechanism relates, from a network perspective, to IETF technologies such as Segment Routing.

> The Operational Considerations section should be expanded to address how CATS integrates with existing IETF technologies. How is the new CATS data plane going to function and operate? The draft lacks a lot of detail on the CATS control plane and data plane and on how they integrate with the network control plane and data plane.

Compliance with the operational guidelines checklist:

- Fault Management: Are failure detection/recovery mechanisms specified? None are specified. They need to be considered and specified.
- Configuration Management: Are configuration changes to enable/disable the feature clearly defined? Not specified. The details of how this would work end to end, and how to enable or disable CATS features, are missing.
- Performance Monitoring: Are metrics (e.g., latency, resource usage) clearly identified? They are defined only very generically. I would recommend that a YANG data model be defined for CATS.

| Review Item | RFC 5706 Considerations |
|---------------------------------|-------------------------|
| Deployment | Does the document include a description of how this protocol or technology is going to be deployed and managed? No. This needs to be defined and developed in detail, along with any related extensions to existing IETF technologies. |
| Installation and Initial Setup | Are configuration parameters clearly identified and do they have reasonable default values? This is not defined at all and should be defined. |
| Migration Path | Is a path to migrate existing configuration clearly articulated? Are there any backward compatibility issues? Details on how to migrate greenfield or brownfield deployments to CATS should be defined. |
| Requirements on Other Protocols | What other protocol operations are expected to be performed relative to the new protocol or technology? New extensions may need to be developed for IGP and BGP, and all of that needs to be defined. |
| Impact on Network Operation | Will the new protocol significantly increase traffic load on existing networks or affect the control plane? How will load balancing work, and will there be an uptick in traffic utilization or will traffic be more balanced with CATS? |
| Verifying Correct Operation | For example, how can one test end-to-end connectivity and throughput? How can end-to-end throughput be tested with CATS? |

For routing protocols, see for example [RFC 6123 – Inclusion of Manageability Sections in Path Computation Element (PCE) Working Group Drafts](https://www.rfc-editor.org/rfc/rfc6123.html).