Network Working Group                                          R. Wilton
Internet-Draft                                                 N. Corran
Intended status: Informational                             Cisco Systems
Expires: 22 May 2025                                    18 November 2024

Device Network Management - Current Status, and Future Direction
draft-wilton-nemops-net-mgmt-future-latest

Abstract

This document gives a perspective of where we believe the industry is with regard to network management and telemetry, based on Rob's experience as a recent IETF OPS Area Director for Network Management and our joint experience designing and implementing network management technologies for large IP/MPLS Internet scale backbone routers.

About This Document

This note is to be removed before publishing as an RFC.

The latest revision of this draft can be found at https://rgwilton.github.io/network-mgmt-future/draft-wilton-nemops-net-mgmt-future.html. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-wilton-nemops-net-mgmt-future/.

Source for this draft and an issue tracker can be found at https://github.com/rgwilton/network-mgmt-future.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 22 May 2025.

Copyright Notice

Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

Table of Contents

1. Introduction
2. End Goal
  2.1. Long term vision
  2.2. Strategic Medium Term Goals
    2.2.1. Small improvements to existing network management protocols
    2.2.2. Improvements to the YANG language
    2.2.3. Better Data Models
    2.2.4. Continued engagement with operators
    2.2.5. More efficient execution, smaller steps
    2.2.6. Availability of Open Source solutions
3. Security Considerations
4. IANA Considerations
Acknowledgments
Informative References
Appendix A. Current technology and solutions
  A.1. CLIs interfaces
  A.2. SNMP & MIBS
  A.3. YANG based Network Management Protocols
    A.3.1. NETCONF
    A.3.2. RESTCONF
    A.3.3. gNMI Protocol Suite
  A.4. YANG & YANG Data Models
    A.4.1. YANG
    A.4.2. Network Device YANG Data Models
    A.4.3. Network and Service YANG Models
Authors' Addresses

1. Introduction

The focus of this document is predominantly on the technology requirements on network devices rather than network management agents or network wide controllers that hold a wider view of the network.

In addition, the document focusses more on the historical network management configuration rather than YANG telemetry solutions such as YANG Push [RFC8639] [RFC8641] or gNMI. Any complete telemetry solution is also likely to be interested in IPFIX and BMP, both of which should be considered alongside YANG based solutions when monitoring network devices and subscribing to telemetry data.
The main body of this document provides suggestions for what problems the IETF should focus on for evolving network configuration. The appendices explore the current landscape of network management tools, protocols, and models for network devices.

2. End Goal

2.1. Long term vision

The obvious and cliched long term goal would be for a network to be completely self-managed, automatically recovering from any failures or configuration errors, with strong projections of future capacity planning and network evolution. Such a network would be configured through high level statements of intent, with the network or controller(s) making intelligent and automatic decisions as to how to enact that intent with little or no human involvement. The network would also self-monitor, continuously comparing the actual network state with the desired intent, reconfiguring on the fly to meet the service level requirements.

Although this may be an achievable goal in the medium to long term, we believe that such a goal remains a reasonable time away, and there are significant areas of unknown complexity that would need to be solved before being able to achieve this. Generative AI and other machine learning techniques may help, and these technologies appear to be evolving rapidly, but it is still unclear if they will be able to manage this level of complexity in a robust and comprehensible way, and even if they are, what the resource and financial cost of relying on such technologies would be. In short, we believe that it would be foolish to assume that AI will solve all network management problems and that no further short or medium term technology investment is required.

Hence, the following sections present our view of the most pragmatic and effective improvements for Network Management technologies in the short/medium term.

2.2. Strategic Medium Term Goals

The overall theme of this section is that IETF energies would be better spent making the existing functionality a bit better rather than trying to come up with the next big idea. Hence, this section contains the authors' views of improvements that would likely help vendors and operators move to more network automation and to extract maximum value over the next 5 to 10 years.

The summary of recommendations:

1. Improve the existing network management protocols, to make them easier to implement, use, and deploy. E.g., the NETCONF RFC is specified to support arbitrary XML data that is not modelled in YANG, causing the RFC to be harder to specify and harder to understand. Conversely, the YANG RFC contains normative text regarding how it should behave with NETCONF.

2. The IETF should carefully update the YANG language to make it a little bit better, but without a large cost to updating tool chains. The update should focus on small meaningful improvements rather than turning YANG into a much bigger language.

3. Control the proliferation of YANG data models. Ideally we would have one industry supported external data model for devices rather than both IETF and OpenConfig. The IETF should decide whether to focus only on network and service YANG models, or whether to also provide complete device YANG models.

4. The IETF should maintain continued engagement with operators (as the NMOP WG already does) to ensure that the IETF's network management focus is on solving the most urgent and most important problems that network operators are currently facing.

5. Solutions should be delivered in a reasonable time frame. I.e., it is better to get to 90% functionality in 1-2 years than 100% in 5+ years. An agile iterative approach is best.

6. This is somewhat outside the scope of the IETF, but it would help to have freely available open source implementations of the protocols, particularly for the client code.
Some additional detail of the recommendations is provided in the following sections.

2.2.1. Small improvements to existing network management protocols

For devices using YANG data models, there has been strong industry adoption of NETCONF as the protocol for editing and querying the configuration of devices. However, this protocol has various scenarios that are not clearly specified, increasing the cost of implementation and the risk of incompatible implementations arising, and making the protocol more complicated than it needs to be.

A 2.0 version of NETCONF could:

* be optimized to specify the minimum functionality required to manage network devices using YANG. E.g., mandate consistent with-defaults handling for all server implementations.

* make all extra functionality optional, perhaps moving it to a separate document (e.g., XPath filtering)

* consider whether there are any legacy features that are no longer useful and could be removed altogether (e.g., shared candidate)

* model all NETCONF RPC operations in YANG data models.

* support JSON encoding of YANG data by default, while also allowing support for CBOR and XML.

2.2.2. Improvements to the YANG language

The YANG language specification should be updated. However, it is important that there is a clear focus and strategy for updating the language. A large number of issues, tracked on github, have been analyzed, but a very critical eye is needed to ensure that the language doesn't evolve into an overly complex second version.

The next version of YANG should focus on:

* merging in the core versioning changes

* any small changes to the language that significantly improve modelling of difficult cases

* any small generalizations to the language that make it more widely usable (e.g., add a base float type)

* deprecation of functionality that adds unnecessary complexity, to be removed in a future version (e.g., sub-modules)

* any bug fixes or omissions from the existing specification.

2.2.3. Better Data Models

It would be better for both vendors and operators if there was a single set of standard/open data models rather than the competing sets from IETF and OpenConfig that are incompatible ecosystems.

However, from the authors' perspective it is hard to see how these two ecosystems can combine - if anything there seems to be a hardening of the gulf between the two ecosystems, with very different designs to the data models, and diverging protocol specifications that are being optimized for the different styles of data models. E.g., the IETF protocols are being developed in a direction with direct and explicit datastore support, whereas the OpenConfig models combine intended configuration and operational state into a single data model.

Having different data models and protocols greatly increases the cost for vendors, and adds indecision in the market because nobody wants to heavily invest in a technology that may have a limited lifetime.

Even though there may not be any way to converge the IETF and OpenConfig ecosystems, the IETF should try hard to ensure that there is no further fracturing of the YANG ecosystem.

The IETF could, strategically, decide that it doesn't want to invest in device YANG models, but given that it has already published a large number of them, this may not be the best strategy.

Assuming that the IETF still wants to develop and improve the ecosystem of IETF YANG data models, there should be more effort to ensure that the data models work well together and function as a cohesive API. The IETF should:

* Develop a mechanism to define sets of IETF and other SDO YANG models that are known to work well together, e.g., perhaps via defining YANG packages [I-D.draft-ietf-netmod-yang-packages].

* Define a more efficient mechanism for evolving YANG data models. Rather than having all of the YANG modules residing in RFCs, which are slow and expensive to update, it would be better to have a working copy of the IETF YANG models with fixes and enhancements applied, stored in github and readily available for use. Over time, as these models become stable, they could be published in RFCs, if necessary.

* Consider whether assets, such as YANG models, should be specified in documents at all, or whether the RFCs should only document the abstract overview of the YANG data model structure, with the details of the code assets versioned within a git repository (perhaps backed by IANA).

* Check whether the YANG data models are complete enough to cover particular standard deployments and configurations. E.g., are all the required IETF YANG models available to configure an L3VPN service, or are there basic bits of functionality that would also be needed but are missing? The IETF should aim to fill in any gaps in the models to ensure that at least the basic functionality can be defined in a vendor agnostic way.

2.2.4. Continued engagement with operators

The NMOP WG was deliberately chartered to encourage and bring more operator engagement into the IETF, and although the WG was only recently chartered, it is currently working well, particularly as there is increased energy from some operators towards more network automation and towards making significant improvements to the tools and techniques that are available, e.g., see [I-D.ietf-nmop-network-anomaly-architecture] and [I-D.ietf-nmop-yang-message-broker-integration].

This equitable collaboration between operators, vendors, and universities is great for the IETF, and should be used as an example of a collaborative project within the IETF sphere that is working well.
This close collaboration means that the focus is directly on solving the most critical problems, with the running code being developed by multiple vendors at the same time to ensure that the solution is efficiently implementable.

2.2.5. More efficient execution, smaller steps

The IETF has a reputation for being slow to standardize new protocols and features, and partly that is the cost of a full consensus based approach. One beneficial aspect of the increased time is that it allows for more reviews and more implementation experience before the specification is finalized.

However, the IETF also needs to understand that the slowness comes at a cost, and for network management it would be better to have a solution driven approach, embracing the IETF mantra of "rough consensus and running code". E.g., it is arguably better to have a solution that achieves all the critical functionality and 90% of the desired functionality delivered in 1-2 years, rather than a full solution that achieves all of the desired functionality but takes 5 years to achieve consensus and be standardized.

This is the approach that is being taken with the YANG Push Telemetry work. Driven by a group of operators, the focus is on staged "minimum-viable-product" deliverables, where each deliverable is specified to require the minimum agreed functionality to meet a set of goals. The vendors who are participating are developing implementations at the same time as the drafts are being standardized, which quickly highlights potential problems in the proposed standards. This means that those issues can be mitigated more quickly, and it also gives high confidence as the drafts progress towards RFC.

2.2.6. Availability of Open Source solutions

More focus should also be given to the availability of open source solutions that are easy for operators to adopt and that are shown to interoperate well with vendor implementations.
Although the IETF is not in the game of conformance checking of implementations, it could still be helpful for the Network Management related working groups to collectively invest in supporting an open source "reference" implementation that keeps pace with the standards, and robustly implements the core functionality defined in the specifications.

The goal here is to reduce the barrier to entry for operators and vendors making better use of the existing network management configuration and telemetry solutions.

3. Security Considerations

Security of network management operations is of high importance due to the sensitive nature of the information.

4. IANA Considerations

This document has no IANA actions.

Acknowledgments

Informative References

[I-D.draft-ietf-netmod-yang-packages] Wilton, R., Rahman, R., Clarke, J., Sterne, J., and B. Wu, "YANG Packages", Work in Progress, Internet-Draft, draft-ietf-netmod-yang-packages-04, 21 October 2024.

[I-D.ietf-nmop-network-anomaly-architecture] Graf, T., Du, W., and P. Francois, "An Architecture for a Network Anomaly Detection Framework", Work in Progress, Internet-Draft, draft-ietf-nmop-network-anomaly-architecture-01, 20 October 2024.

[I-D.ietf-nmop-yang-message-broker-integration] Graf, T. and A. Elhassany, "An Architecture for YANG-Push to Message Broker Integration", Work in Progress, Internet-Draft, draft-ietf-nmop-yang-message-broker-integration-05, 19 October 2024.

[RFC6241] Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed., and A. Bierman, Ed., "Network Configuration Protocol (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011.

[RFC7011] Claise, B., Ed., Trammell, B., Ed., and P. Aitken, "Specification of the IP Flow Information Export (IPFIX) Protocol for the Exchange of Flow Information", STD 77, RFC 7011, DOI 10.17487/RFC7011, September 2013.

[RFC7854] Scudder, J., Ed., Fernando, R., and S. Stuart, "BGP Monitoring Protocol (BMP)", RFC 7854, DOI 10.17487/RFC7854, June 2016.
[RFC8040] Bierman, A., Bjorklund, M., and K. Watsen, "RESTCONF Protocol", RFC 8040, DOI 10.17487/RFC8040, January 2017.

[RFC8199] Bogdanovic, D., Claise, B., and C. Moberg, "YANG Module Classification", RFC 8199, DOI 10.17487/RFC8199, July 2017.

[RFC8299] Wu, Q., Ed., Litkowski, S., Tomotaki, L., and K. Ogaki, "YANG Data Model for L3VPN Service Delivery", RFC 8299, DOI 10.17487/RFC8299, January 2018.

[RFC8342] Bjorklund, M., Schoenwaelder, J., Shafer, P., Watsen, K., and R. Wilton, "Network Management Datastore Architecture (NMDA)", RFC 8342, DOI 10.17487/RFC8342, March 2018.

[RFC8345] Clemm, A., Medved, J., Varga, R., Bahadur, N., Ananthakrishnan, H., and X. Liu, "A YANG Data Model for Network Topologies", RFC 8345, DOI 10.17487/RFC8345, March 2018.

[RFC8466] Wen, B., Fioccola, G., Ed., Xie, C., and L. Jalil, "A YANG Data Model for Layer 2 Virtual Private Network (L2VPN) Service Delivery", RFC 8466, DOI 10.17487/RFC8466, October 2018.

[RFC8526] Bjorklund, M., Schoenwaelder, J., Shafer, P., Watsen, K., and R. Wilton, "NETCONF Extensions to Support the Network Management Datastore Architecture", RFC 8526, DOI 10.17487/RFC8526, March 2019.

[RFC8527] Bjorklund, M., Schoenwaelder, J., Shafer, P., Watsen, K., and R. Wilton, "RESTCONF Extensions to Support the Network Management Datastore Architecture", RFC 8527, DOI 10.17487/RFC8527, March 2019.

[RFC8528] Bjorklund, M. and L. Lhotka, "YANG Schema Mount", RFC 8528, DOI 10.17487/RFC8528, March 2019.

[RFC8639] Voit, E., Clemm, A., Gonzalez Prieto, A., Nilsen-Nygaard, E., and A. Tripathy, "Subscription to YANG Notifications", RFC 8639, DOI 10.17487/RFC8639, September 2019.

[RFC8641] Clemm, A. and E. Voit, "Subscription to YANG Notifications for Datastore Updates", RFC 8641, DOI 10.17487/RFC8641, September 2019.

[RFC9182] Barguil, S., Gonzalez de Dios, O., Ed., Boucadair, M., Ed., Munoz, L., and A. Aguado, "A YANG Network Data Model for Layer 3 VPNs", RFC 9182, DOI 10.17487/RFC9182, February 2022.
[RFC9291] Boucadair, M., Ed., Gonzalez de Dios, O., Ed., Barguil, S., and L. Munoz, "A YANG Network Data Model for Layer 2 VPNs", RFC 9291, DOI 10.17487/RFC9291, September 2022.

Appendix A. Current technology and solutions

This section of the document gives a perspective of the current landscape of existing network management solutions that may be found on network devices, along with a brief mention of network and service YANG models, and lists some of the issues with those existing technologies.

A.1. CLIs interfaces

Most vendors offer a command line interface (CLI) for configuring devices and reporting the operational state of devices (e.g., via _show commands_). For some devices, these CLIs offer interactive text based interfaces to underlying external management models, whereas for other devices, the CLIs are defined independently of any programmatic external data model, which can make it hard for network engineers to migrate from a familiar CLI to a very different programmatic data model.

Generally, these CLI based interfaces offer configuration for the full capabilities of the device, including all optional functionality and features. They also generally ensure that the device can be configured in an efficient way (e.g., consider scenarios where ad hoc data model specific templates allow a particular configuration to be represented both concisely on the CLI and also implemented in the device's hardware in a resource efficient way).

In some cases, network controllers are used as a bridge between offering a north bound programmatic data model and a south bound interface for configuring and managing the device by CLI. In all cases, the CLIs are generally not designed or optimized to be manipulated programmatically, lacking consistent structure and typing, making the solution more fragile when used in this manner.
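As a rough illustration of this fragility, the following Python sketch scrapes a single line of CLI text with a regular expression. The two output strings and the pattern are invented for illustration and do not correspond to any particular vendor's CLI; the function name parse_interface is likewise hypothetical.

```python
import re

# Hypothetical CLI output lines (assumptions, not taken from a real device).
# The second line differs only slightly in wording and letter case.
SHOW_V1 = "GigabitEthernet0/0/0 is up, MTU 1514 bytes"
SHOW_V2 = "GigabitEthernet0/0/0 is administratively down, mtu 1514"

# The client has to guess the text layout; nothing in the CLI declares it.
PATTERN = re.compile(
    r"^(?P<name>\S+) is (?P<state>up|down), MTU (?P<mtu>\d+) bytes$"
)


def parse_interface(line):
    """Return a dict of typed fields, or None if the text layout is unexpected."""
    match = PATTERN.match(line)
    if match is None:
        return None
    # Typing is imposed by the client: the CLI only ever provides strings.
    return {
        "name": match["name"],
        "state": match["state"],
        "mtu": int(match["mtu"]),
    }


if __name__ == "__main__":
    print(parse_interface(SHOW_V1))
    print(parse_interface(SHOW_V2))  # the small wording change defeats the parser
```

The first line parses into typed fields, whereas the second, which carries the same information in a slightly different layout, is rejected. A client of a modelled interface would instead receive named and typed leaves, and would not depend on the text layout at all.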
Most devices also offer _show commands_, or the equivalent, for reporting operational state in a text based format, either using tables, or free-form text reporting of relevant fields. Automated parsing of this output can be very fragile, particularly for values that are occasionally outside the anticipated ranges, which may skew a table's formatting or be truncated.

For some devices, these show commands are available in different variants that control the level of detail reported, and often the same fundamental information may be reported in multiple separate show commands. I.e., there can be a level of duplication in the data that is reported.

Device vendors seem to more commonly apply version management to the configuration aspects of the CLI rather than operational show commands.

Generally, access to the show commands is via a synchronous operation that queries the device and waits for it to collate, sort, and format the data before returning it. These mechanisms are likely to be less efficient than mechanisms that push the operational state off the device, particularly if only changes in the data are pushed, and if those changes don't occur at high frequency (or some form of efficient dampening mechanism is employed).

A.2. SNMP & MIBS

Although there is currently a wide deployed base of SNMP and MIBs used for monitoring operational data via periodic polling, we expect there to be a significant decrease in deployments over the next 10 years, at least when used for network management, although there will inevitably be a long tail of deployments before everyone has migrated to newer technologies.

SDOs, like the IETF, have generally stopped writing new MIBs, or making significant updates to existing MIBs or to the SNMP protocol. Similarly, it appears that most vendors are investing much more heavily in more recent network management protocols and data models rather than investing in SNMP, updating existing MIBs with new OIDs, or creating new MIBs.
For distributed servers, the design of the SNMP protocol is inherently expensive to implement, generally requiring lots of information to be cached before it can be returned.

SNMP & MIBs never achieved significant traction in the industry for configuration of core network devices, with operators required to use either the CLI, YANG based management protocols, or other proprietary management interfaces or APIs.

A.3. YANG based Network Management Protocols

This section describes the current modern network management protocols, which are predominantly YANG based, or optimized for use with YANG.

A.3.1. NETCONF

NETCONF [RFC6241] is an XML based network management protocol. Originally it was specified to work with generic XML based network management data, but now it is generally expected to be used in conjunction with YANG modeled configuration and operational data.

More recently, NETCONF was extended to support the NMDA [RFC8342]. This hasn't yet seen wide adoption, but there is gradually increasing interest.

NETCONF is one of the main network management protocols used for configuring devices, used alongside the CLI and gNMI.

Most of the NETCONF protocol is reasonably well specified, but there are aspects of the protocol that have more patchy implementation support, including a shared candidate datastore, confirmed commit capability, and XPath based filtering.

Some areas of the specification are unclear or hard to understand because the definition of the expected behaviour is split between the NETCONF and YANG RFCs.

Some aspects of the NETCONF specification give flexibility for the server to implement the behaviour in different ways (e.g., different YANG defaults handling and reporting, startup configuration handling, writable running vs candidate configuration). This flexibility makes it easier for device implementations, but increases the complexity for clients because they must be able to interoperate with different server behaviour.
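To make the client-side cost concrete, the following Python sketch shows the kind of branching a client typically needs before it can edit configuration: it inspects the capabilities advertised in the server's hello message to discover which optional behaviours are in play. This is a minimal sketch only. The hello message is a made-up example, the function name server_behaviour and the keys of the returned dictionary are hypothetical, and the capability URNs are assumed to follow the forms used by the NETCONF base and with-defaults specifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs

NC_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"
CAP_PREFIX = "urn:ietf:params:netconf:capability:"

# Made-up example of a server hello; a real server advertises its own set.
HELLO = """<hello xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <capabilities>
    <capability>urn:ietf:params:netconf:base:1.1</capability>
    <capability>urn:ietf:params:netconf:capability:candidate:1.0</capability>
    <capability>urn:ietf:params:netconf:capability:confirmed-commit:1.1</capability>
    <capability>urn:ietf:params:netconf:capability:with-defaults:1.0?basic-mode=explicit&amp;also-supported=report-all</capability>
  </capabilities>
</hello>"""


def server_behaviour(hello_xml):
    """Summarise the optional behaviours a client must adapt to."""
    root = ET.fromstring(hello_xml)
    found = {}
    for element in root.iter(f"{{{NC_NS}}}capability"):
        uri = (element.text or "").strip()
        if not uri.startswith(CAP_PREFIX):
            continue  # base protocol version or a YANG module URI
        base, _, query = uri[len(CAP_PREFIX):].partition("?")
        name = base.rsplit(":", 1)[0]  # drop the trailing version, e.g. ":1.0"
        found[name] = parse_qs(query)
    defaults = found.get("with-defaults", {})
    return {
        "candidate": "candidate" in found,
        "writable_running": "writable-running" in found,
        "startup": "startup" in found,
        "confirmed_commit": "confirmed-commit" in found,
        "xpath": "xpath" in found,
        "defaults_basic_mode": defaults.get("basic-mode", [None])[0],
    }


if __name__ == "__main__":
    print(server_behaviour(HELLO))
```

For this example server, a client would have to edit the candidate datastore and commit, could not rely on XPath filtering, and would need to interpret retrieved data according to the "explicit" defaults mode. A different server could require a different code path for each of these.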
There is some early work within the NETCONF WG to update the NETCONF protocol. This could be a good opportunity to drive for more baseline conformity in behaviour across all network devices that support the new protocol version.

A.3.2. RESTCONF

RESTCONF [RFC8040] is a newer _REST_ style network management protocol that runs over HTTP and uses YANG data models. Broadly, RESTCONF offers similar functionality to NETCONF.

RESTCONF has achieved more traction as a northbound interface to network controllers, whereas the main programmatic network management interfaces to devices remain NETCONF and gNMI.

RESTCONF initially offered a "simulated combined datastore" view of the data, done as an effort to simplify the interface. However, the NMDA effectively changed this to a datastore aware architecture, more closely mirroring NETCONF.

There seems to be more support for ensuring that RESTCONF maintains feature parity with NETCONF. It has been suggested that RESTCONF could just replace NETCONF as the single IETF protocol to network devices, but there doesn't appear to be strong industry backing for going in that direction.

RESTCONF supports encoding the data in both JSON and XML. The RFC specifies XML as the mandatory-to-support encoding, but it seems likely that, in the seven years since the RFC was published, the JSON encoding has become much more popular than XML.

Some enhancements to RESTCONF, in many cases mirroring similar enhancements being made for NETCONF, are being considered for standardization within the IETF.

A.3.3. gNMI Protocol Suite

gNMI is a newer, industry defined, gRPC based network management protocol that carries data modelled in YANG, encoded via JSON or Protobuf.

The _gNMI family_ of protocols includes other related protocols for the management and orchestration of devices (e.g., gNOI, gNSI, gRIBI, Bootz).
These are often modelled using gRPC and separate Protobuf definitions rather than leveraging YANG's RPC, Action, and Notification mechanisms.

The stewardship of this protocol suite predominantly falls on the operator community, but with strong leadership by a principal network operator. This allows the protocol to evolve more quickly, although potentially in non-compatible ways that could break existing deployments.

The specifications tend to be less precisely specified than the equivalent IETF protocols, and generally have a lower level of technical review, meaning that there are more likely to be interoperability issues between different implementations. There are efforts underway to improve interoperability via a conformance test suite that is being collectively maintained.

A.4. YANG & YANG Data Models

A.4.1. YANG

The YANG data modelling language exists in two versions, YANG 1 and YANG 1.1. The effective differences between the two versions are relatively minor and YANG models using both versions are deployed.

The IETF NETMOD working group is at the early stages of considering a new version of the YANG language, considering over 100 potential issues and enhancements! At the time of this draft's publication, it is unclear whether consensus will converge around a relatively small update to the language, or a more significant new version. It is anticipated that any new version of the YANG language would likely take several years to specify and gain consensus.

Care must be taken to strike the right balance between making enough improvements to the language to make an upgrade worthwhile and bloating the language with too many features, i.e., suffering from second system syndrome.
A future version of the language should be framed clearly around the set of problems it is aiming to solve, e.g., minor fixes to the existing specification, ease of use improvements, or making it easier to model specific problem domains, hopefully without introducing too much additional complexity.

A.4.2. Network Device YANG Data Models

There appears to be somewhat of a fracture in the industry as to whether YANG models should be modelled using datastores (as per the IETF Network Management Datastore Architecture), or whether they should adopt OpenConfig's style, where a single data model contains intended configuration, applied configuration, and operational state in a combined data tree, using a structural naming convention.

In some ways, the OpenConfig style leads to a simpler combined data tree, but the YANG files themselves, through the frequent use of groupings, are generally much harder to read than the NMDA equivalent, unless compiled into a more readable format.

The OpenConfig style doesn't lend itself well to modeling special configuration, e.g., boot configuration or ephemeral configuration, both of which can be modelled cleanly using the NMDA datastore architecture. Further, there are aspects of the YANG language that somewhat conflict with the OpenConfig style, meaning that there are various YANG language constructs, e.g., presence containers or choice & case statements, that are problematic to use with OpenConfig modelling.

Conversely, using models designed around the NMDA effectively requires extensions to the NETCONF [RFC8526] and RESTCONF [RFC8527] protocols, which require the target datastore to be specified during operations. Further, this means that operations and requests act on either configuration or operational data, not both together.
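The structural difference between the two styles can be sketched in a few lines of Python. The example below takes a small OpenConfig-style tree, in which each node carries sibling "config" and "state" containers, and splits it into two datastore-like views in which the same leaf path appears in both. This is a deliberately simplified sketch: the function name split_views and the sample data are hypothetical, and list keys, namespaces, and the distinction between applied configuration and derived state are all ignored.

```python
# Illustrative OpenConfig-style data: intended values live under "config",
# applied values and derived state live under "state" (sample values only).
OC_INTERFACES = {
    "eth0": {
        "config": {"enabled": True, "mtu": 9000},
        "state": {"enabled": True, "mtu": 1500, "oper-status": "UP"},
    }
}


def split_views(node):
    """Split a combined tree into (intended, operational) views.

    Leaves under "config" move up one level into the intended view, and
    leaves under "state" move up one level into the operational view, so
    that both views share the same paths, as datastore based models do.
    """
    intended, operational = {}, {}
    for key, value in node.items():
        if key == "config" and isinstance(value, dict):
            intended.update(value)
        elif key == "state" and isinstance(value, dict):
            operational.update(value)
        elif isinstance(value, dict):
            sub_intended, sub_operational = split_views(value)
            if sub_intended:
                intended[key] = sub_intended
            if sub_operational:
                operational[key] = sub_operational
    return intended, operational


if __name__ == "__main__":
    intended_view, operational_view = split_views(OC_INTERFACES)
    print("intended:   ", intended_view)
    print("operational:", operational_view)
```

In the combined tree, a single request for "eth0" returns both the intended MTU and the applied MTU. After the split, a client has to name a datastore to say which of the two it wants, which mirrors the trade-off described above.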
In terms of implementation, many network devices store and manage configuration data separately from operational data due to the different constraints and requirements on the different data sets, e.g., configuration must be transactional and fully consistent, whereas the operational data is generally only ever eventually consistent. This means that queries or subscriptions that require both configuration and operational state in a single response require the system to fetch the information from two different subsystems and to merge the data into a single response before returning. Depending on the system design, this may be required when combining 'applied configuration' and system defined operational state (e.g., counters and protocol network state), depending on where the applied configuration is tracked in the system.

A.4.2.1. Standards based YANG models (IETF, IEEE, BBF, 3GPP)

Various SDOs, e.g., IETF, IEEE, BBF, and 3GPP, are all in the process of defining YANG models to define network management interfaces for the network protocols that they are responsible for.

For the IETF, these data models are designed around the NMDA, allowing the same models to be used for configuration and extended to cover operational state, aligning on the same paths and definitions wherever possible. This approach allows flexibility for other views (i.e., datastores) on the data to be provided (e.g., factory-default, startup configuration, system defaults, or ephemeral configuration).

The IETF has already produced RFCs defining network device YANG data models covering many of the key network protocols defined by the IETF. Where published, the YANG models generally provide good coverage of the protocol in question, including optional functionality.
The problems with the set of IETF YANG models published so far are that they have taken a very long time to reach standardization, that they make use of YANG extensions that are not yet widely implemented (e.g., Schema Mount [RFC8528]), and that there are significant gaps in the YANG modules that have been published to date, e.g., the IETF doesn't yet have a published RFC for BGP, L2VPN or EVPN functionality.

A.4.2.2. Industry based YANG Models (OpenConfig)

The OpenConfig industry consortium also defines a set of YANG models for configuring and monitoring devices.

Like the gNMI protocol, the stewardship of these models predominantly falls on the operator community, but with strong leadership by a principal operator. Vendors implementing the models can also make suggestions and provide comments on proposed changes and additions to the data models, particularly when they would be hard, or impossible, to implement effectively on the network devices. However, decisions are generally less open than, say, the IETF's consensus based procedures.

The OpenConfig models are focussed on solving the configuration requirements of those operators who participate in the forums, and hence they are somewhat more focussed on solving particular network designs and protocol choices. This can mean that some technologies may not currently be covered by the OpenConfig YANG models at all, and it can be harder to get additions accepted, or those additions could undergo significant breaking changes if more operators start to pick a particular technology and collectively decide that a different approach to modelling would be better.

The OpenConfig models evolve at a much faster rate than those in the IETF, with a lower bar to review and more willingness to make breaking changes to just fix issues, or improve the models, and then move on.
There are efforts to restrict those breaking changes to an annual basis, but this will still likely mean that many deployments that move between software releases more slowly will see breaking changes in the management model whenever they update. Models are likely to gain more stability over time, but it is still very likely that there will be issues with version skew in the models, the burden of which is likely to fall on the clients or controllers using the models.

Generally, OpenConfig models are restricted to using YANG 1, rather than using the updated YANG 1.1 specification.

A.4.2.3. Vendor specific YANG models

Most large Internet routers expose YANG data models for configuring and monitoring the device. There are various choices for the sources of these data models:

* based on an existing internal data model
* based on the CLI (or show commands)
* based on existing published or draft models (e.g., IETF or OpenConfig)
* designed from scratch.

Each of these choices has advantages and disadvantages.

Generating or basing the external model on an internal model normally has the advantage that it is easy to translate the configuration for consumption by the system. However, it has the disadvantages that it may leak internal details and structures into the external model, may not leverage the full capabilities of YANG, and may not be as easy to use. If the internal model is quite different from the CLI then network operators familiar with the CLI must still learn the new model structure. It probably also forces some level of versioning on the internal data structures or, alternatively, the ability to handle version skew between the generated models and the internal data model.

Basing the vendor device YANG model on the CLI makes the models more familiar, but the structure and extensibility of the CLI and YANG differ somewhat, potentially making for somewhat less well structured YANG models (compared to designing the YANG models from scratch).
One strong advantage of this approach is that it allows a clean bijective conversion between the CLI and the equivalent YANG.

Basing the vendor device YANG model on existing SDO or industry YANG models potentially allows for network operator familiarity (but not with respect to the CLI) and conformance. However, unless the device is a greenfield development, the way particular features are modelled in the external model may differ significantly from the internal device representation, requiring more complex, and potentially less efficient, mapping and internal representation (e.g., expansion of config and less efficient use of hardware resources). Hence, it is likely that deviations and augmentations to the external models will be required to ensure that the external model can be mapped reasonably cleanly into internal representations. A further concern is version skew if the published models change over time but more stability is required in the vendor's external model to support existing customer deployments. A final concern here is trying to predict the right public model family to base the models on, i.e., which YANG models will likely end up succeeding in the market in the medium term.

The final choice is to define the model entirely from scratch. This potentially allows for a better solution, but at a greater development cost. How closely the model maps to the existing CLI, internal model, or industry or SDO models generally determines which of the advantages and disadvantages described above apply to this approach.

Generally, in all cases, one would desire and expect the vendor models to have full parity with the configuration that can be expressed via the CLI, leveraging all of the device configuration capabilities.

A different set of choices may be made for the operational data (e.g., show command equivalents), although many of the same advantages and disadvantages equally apply.

A.4.2.4.
Problems with the YANG model ecosystem

One of the biggest problems slowing the adoption of YANG and automated network management is the fracture between the standard network management models for managing devices, documented in Appendix A.4.2.1 and Appendix A.4.2.2:

* OpenConfig YANG is more cohesive and complete for various deployments.
* IETF YANG is more complete for some specific protocols, but it may not be sufficient to be deployed on its own, retaining some large gaps that must be filled with draft models, or augmented with vendor proprietary models.

In addition to this, every vendor has its own legacy CLI and its own data models, which may be entirely independent, based on the CLI, or perhaps based on an internal data model. Most devices are likely to have separate internal data models that differ from the external data models, and that won't necessarily even be defined in YANG.

All of these data model families define their properties in different ways that are not completely compatible with each other.

Further, it is unclear which external YANG data models, if any, will end up dominating the market. Hence, reworking or aligning a device's internal data models (which are perhaps based on previous non-YANG technologies) to better suit the style of a single external model family is likely to be a risky strategy if the wrong model family is chosen. Instead, some form of 'mapping' of data in external model families into internal model families is generally required, which has its own set of challenges and complexities; see Appendix A.4.2.5.

A.4.2.5. Problems with mapping between internal and external data model families

Mapping between external and internal data model families brings its own set of issues.
The first obvious problem occurs when the external and internal data models are not defined in the same modelling language, and where equivalent concepts are modelled in different ways. For example, the concept of how filtering is performed can be specified in an optimized form in the data model, or it can be defined purely as a protocol operation.

Secondly, even when both external and internal models are represented in the same domain language (e.g., YANG), there is a fundamental choice about how to map data (configuration or operational) between the external and internal model families, and what represents the source of truth of configuration data for the device.

The perhaps naive, and most obvious, approach is to convert configuration data in the external model to configuration data in the internal data model, and then store the configuration in that internal format. Whenever a request is made to read the current configuration, the device converts its internal configuration back to the requested external representation. For the device, the source of truth for the configuration is always stored in the internal native format. Such a choice would allow clients to query the configuration in different formats (e.g., device-native, OpenConfig, or IETF), or send in separate configuration requests in different families (e.g., the bulk of the configuration could be defined as OpenConfig YANG, but overridden with native CLI or YANG to cover the parts of the configuration that are not expressible in OpenConfig).

Alas, this approach also brings significant problems.
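
As a minimal sketch of the round-trip approach described above, the following Python maps an external configuration into an internal form and back again. Everything in it is hypothetical and chosen only for illustration: the leaf names, the internal default MTU, and the lower-casing of names are assumptions, not taken from any real device or YANG model.

```python
# Hypothetical internal default; real devices and models will differ.
INTERNAL_DEFAULT_MTU = 1500


def external_to_internal(ext: dict) -> dict:
    """Map an external-model interface config into a made-up internal form."""
    return {
        # Assumption: this internal model stores interface names in lower
        # case, so the client's original spelling is not preserved.
        "ifname": ext["name"].lower(),
        # Assumption: the internal model always stores an MTU, filling in
        # its own default when the client did not configure one.
        "mtu": ext.get("mtu", INTERNAL_DEFAULT_MTU),
        "admin_up": ext.get("enabled", True),
    }


def internal_to_external(internal: dict) -> dict:
    """Map the made-up internal form back into the external model."""
    return {
        "name": internal["ifname"],
        "mtu": internal["mtu"],
        "enabled": internal["admin_up"],
    }


def round_trip(ext: dict) -> dict:
    """Write external config, store it internally, then read it back."""
    return internal_to_external(external_to_internal(ext))


written = {"name": "Uplink-1", "enabled": True}
read_back = round_trip(written)

print("written:  ", written)
print("read back:", read_back)
print("identical:", read_back == written)
```

In this sketch the configuration read back is not the configuration that was written: the interface name has changed case and an MTU leaf that the client never configured has appeared. Neither mapping function is wrong in isolation; the difference arises because the internal form is the only stored copy.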
Unless the internal and external data models are very closely aligned (and this isn't generally possible when different incompatible external model families exist), exact bijective mappings are not possible, since there is always a loss of data. When a client requests to read the configuration back, even in the same model family as first configured, it will receive a slightly different version of that configuration data, perhaps with default values added or removed, or with differences in the names of arbitrary identifiers. It is the authors' opinion that this is not the best way of trying to solve this problem.

The alternative solution, for configuration, is to only map the external configuration down into the internal configuration in a single direction (but allow for configuration errors to be correctly propagated back). The device persists the configuration in the external format as the source of truth, so any queries for the applied configuration are able to return the exact configuration originally provided. This approach allows for more complex mappings than the bidirectional mapping approach described above, but requires that the external client manage configuration in different model families effectively.

A.4.2.6. Problems with how the IETF creates and manages YANG models

It is hard to argue that the IETF has been anything less than very successful at encouraging and advancing interoperability between devices over the last four decades.

Some aspects that make the IETF process very successful also somewhat act to its detriment. One key observation is that new technology and advances generally move fairly slowly in the IETF, and once standardized, are often even slower to change further. Generally, it is much easier to slow down or block work within the IETF than it is to introduce new ideas.
Although the slow pace of initial standards development and subsequent evolution can be frustrating, it has the benefit that once the technology becomes mature and is implemented, those protocols and implementations can be stable over a relatively long time period. For some operators and deployments this isn't necessarily important; for others, it can reduce long-term costs.

A.4.3. Network and Service YANG Models

The IETF has also specified various YANG models that exist at the Service or Network-wide layer rather than models for managing specific devices. For example, L3VPN [RFC8299] and L2VPN [RFC8466] define _Service_ YANG models. [RFC9182] and [RFC9291] define _Network-wide_ YANG models. In addition, network wide topologies can be modelled using [RFC8345], along with many augmentations that have been published or are being developed.

[RFC8199] helps characterize the difference between service and device (element) YANG models, but doesn't cover the network-wide layer classification.

There has been somewhat stronger adoption of the network and service IETF YANG models by operators, sometimes used in conjunction with OpenConfig YANG models for configuring elements or, otherwise, with device-native CLI or YANG models.

These models generally fall outside the scope of the YANG models discussed in the rest of this document, because they do not directly apply to network elements.

We are not aware of other industry attempts at defining Network or Service YANG models, but MEF has been working on defining APIs at various management layers, mostly built around OpenAPI specifications rather than YANG.

Authors' Addresses

Robert Wilton
Cisco Systems
Email: rwilton@cisco.com

Nick Corran
Cisco Systems
Email: ncorran@cisco.com