Network Working Group                                          R. Wilton
Internet-Draft                                                 N. Corran
Intended status: Informational                             Cisco Systems
Expires: 22 May 2025                                    18 November 2024


    Device Network Management - Current Status, and Future Direction
               draft-wilton-nemops-net-mgmt-future-latest

Abstract

   This document gives a perspective of where we believe the industry is
   with regard to network management and telemetry, based on Rob's
   experience as a recent IETF OPS Area Director for Network Management
   and our joint experience designing and implementing network
   management technologies for large IP/MPLS Internet scale backbone
   routers.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://rgwilton.github.io/network-mgmt-future/draft-wilton-nemops-
   net-mgmt-future.html.  Status information for this document may be
   found at https://datatracker.ietf.org/doc/draft-wilton-nemops-net-
   mgmt-future/.

   Source for this draft and an issue tracker can be found at
   https://github.com/rgwilton/network-mgmt-future.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 22 May 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  End Goal
     2.1.  Long term vision
     2.2.  Strategic Medium Term Goals
       2.2.1.  Small improvements to existing network management
               protocols
       2.2.2.  Improvements to the YANG language
       2.2.3.  Better Data Models
       2.2.4.  Continued engagement with operators
       2.2.5.  More efficient execution, smaller steps
       2.2.6.  Availability of Open Source solutions
   3.  Security Considerations
   4.  IANA Considerations
   Acknowledgments
   Informative References
   Appendix A.  Current technology and solutions
     A.1.  CLI interfaces
     A.2.  SNMP & MIBS
     A.3.  YANG based Network Management Protocols
       A.3.1.  NETCONF
       A.3.2.  RESTCONF
       A.3.3.  gNMI Protocol Suite
     A.4.  YANG & YANG Data Models
       A.4.1.  YANG
       A.4.2.  Network Device YANG Data Models
       A.4.3.  Network and Service YANG Models
   Authors' Addresses

1.  Introduction

   The focus of this document is predominantly on the technology
   requirements on network devices rather than network management agents
   or network wide controllers that hold a wider view of the network.

   In addition, the document focusses more on the historical network
   management configuration rather than YANG telemetry solutions such as
   YANG Push [RFC8639] [RFC8641] or gNMI.

   Any complete telemetry solution is also likely to make use of IPFIX
   [RFC7011] and BMP [RFC7854], both of which should be considered
   alongside YANG based solutions when monitoring network devices and
   subscribing to telemetry data.

   The main body of this document provides suggestions for which
   problems the IETF should focus on when evolving network
   configuration.  The appendices explore the current landscape of
   network management tools, protocols, and models for network devices.

2.  End Goal

2.1.  Long term vision

   The obvious and cliched long term goal would be for a network to be
   completely self-managed, automatically recovering from any failures
   or configuration errors, with strong projections of future capacity
   planning and network evolution.  Such a network would be configured
   through high level statements of intent, with the network or
   controller(s) making intelligent and automatic decisions as to how to
   enact that intent with little or no human involvement.  The network
   would also self-monitor, continuously comparing the actual network
   state with the desired intent, reconfiguring on the fly to meet the
   service level requirements.
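
   As a rough illustration of the comparison step only, the following
   Python sketch reports where observed state has drifted from intent.
   The function name, the flat key/value representation, and the sample
   data are all hypothetical simplifications introduced here; real
   intent is far richer than a dictionary of settings, and deciding how
   to act on the drift is the hard part.

```python
from typing import Any, Dict


def intent_drift(intended: Dict[str, Any],
                 actual: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Return every intended setting whose observed value differs."""
    drift: Dict[str, Dict[str, Any]] = {}
    for key, wanted in intended.items():
        observed = actual.get(key)  # None if nothing is reported
        if observed != wanted:
            drift[key] = {"intended": wanted, "actual": observed}
    return drift


# Hypothetical intent and observed state for a single uplink.
intended = {"uplink-1/admin-state": "up", "uplink-1/mtu": 9000}
actual = {"uplink-1/admin-state": "up", "uplink-1/mtu": 1500}

for key, values in intent_drift(intended, actual).items():
    print(f"{key}: intended={values['intended']} "
          f"actual={values['actual']}")
```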

   Although this may be an achievable goal in the medium to long term,
   we believe that such a goal remains a reasonable time away, and there
   are significant areas of unknown complexity that would need to be
   solved before being able to achieve this.  Generative AI and other
   machine learning techniques may help, and these technologies appear
   to be evolving rapidly, but it is still unclear if they will be able
   to manage this level of complexity in a robust and comprehensible
   way, and even if they are, what the resource and financial cost of
   relying on such technologies would be.  In short, we believe that it
   would be foolish to assume that AI will solve all network management
   problems and no further short or medium term technology investment is
   required.

   Hence, the following sections present our view of the most pragmatic
   and effective improvements for Network Management technologies in the
   short/medium term.

2.2.  Strategic Medium Term Goals

   The overall theme of this section is that IETF energies would be
   better spent making the existing functionality a bit better rather
   than trying to come up with the next big idea.  Hence, this section
   contains the authors' views of improvements that would likely help
   vendors and operators move to more network automation and to extract
   maximum value over the next 5 to 10 years.

   The summary of recommendations:

   1.  Improve the existing network management protocols, to make them
       easier to implement, deploy, and use.  E.g., the NETCONF RFC is
       specified to support arbitrary XML data that is not modelled in
       YANG, making the RFC harder to specify and harder to understand.
       Conversely, the YANG RFC contains normative text regarding how it
       should behave with NETCONF.

   2.  The IETF should carefully update the YANG language to make it a
       little bit better, but without a large cost to updating tool
       chains.  The update should focus on small meaningful
       improvements rather than turning YANG into a much bigger
       language.

   3.  Control the proliferation of YANG data models.  Ideally we would
       have one industry supported external data model for devices
       rather than both IETF and OpenConfig.  The IETF should decide
       whether to focus only on network and service YANG models, or
       whether to also provide complete device YANG models.

   4.  The IETF should maintain continued engagement with operators (as
       NMOP WG already is) to ensure that IETF's network management
       focus is on solving the most urgent and most important problems
       that network operators are currently facing.

   5.  Solutions should be delivered in a reasonable time frame.  I.e.,
       it is better to get to 90% functionality in 1-2 years than to
       100% in 5+ years.  An agile iterative approach is best.

   6.  This is somewhat outside the scope of the IETF, but it would help
       to have freely available open source implementations of the
       protocols, particularly for the client code.

   Some additional detail of the recommendations is provided in the
   following sections.

2.2.1.  Small improvements to existing network management protocols

   For devices using YANG data models, there has been strong industry
   adoption of using NETCONF as the protocol for editing and querying
   configuration of devices.  However, this protocol has various
   scenarios that are not clearly specified, increasing the cost of
   implementation and the risk of incompatible implementations arising,
   and making the protocol more complicated than it needs to be.

   A 2.0 version of NETCONF could:

   *  be optimized to specify the minimum functionality required to
      manage network devices using YANG.  E.g., mandate consistent with-
      defaults handling for all server implementations.

   *  make all extra functionality optional, perhaps moving it to a
      separate document (e.g., XPath filtering)

   *  consider whether there are any legacy features that are no longer
      useful and could be removed altogether (e.g., shared candidate)

   *  model all NETCONF RPC operations in YANG data models.

   *  support JSON encoding of YANG data by default, while also
      allowing support for CBOR and XML.
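
   To illustrate the encoding point in the last bullet, the sketch below
   serializes the same small piece of YANG modelled data both as JSON
   (with a module-name qualified top-level member) and as XML (using the
   module's XML namespace).  The "ietf-interfaces" module is used purely
   as a familiar example; the helper names are hypothetical, and the
   data is deliberately incomplete (e.g., the mandatory interface "type"
   leaf is omitted for brevity).

```python
import json
import xml.etree.ElementTree as ET

# XML namespace of the ietf-interfaces YANG module.
IF_NS = "urn:ietf:params:xml:ns:yang:ietf-interfaces"

# Serialize IF_NS as the default namespace rather than as "ns0:".
ET.register_namespace("", IF_NS)


def to_json(name: str, enabled: bool) -> str:
    """Encode one interface entry as JSON."""
    doc = {
        "ietf-interfaces:interfaces": {
            "interface": [{"name": name, "enabled": enabled}]
        }
    }
    return json.dumps(doc, indent=2)


def to_xml(name: str, enabled: bool) -> str:
    """Encode the same interface entry as XML."""
    root = ET.Element(f"{{{IF_NS}}}interfaces")
    entry = ET.SubElement(root, f"{{{IF_NS}}}interface")
    ET.SubElement(entry, f"{{{IF_NS}}}name").text = name
    ET.SubElement(entry, f"{{{IF_NS}}}enabled").text = (
        "true" if enabled else "false")
    return ET.tostring(root, encoding="unicode")


print(to_json("eth0", True))
print(to_xml("eth0", True))
```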

2.2.2.  Improvements to the YANG language

   The YANG language specification should be updated.  However, it is
   important that there is a clear focus and strategy for updating the
   language.  A large number of issues, tracked on GitHub, have been
   analyzed, but a very critical eye is needed to ensure that the
   language doesn't evolve into an overly complex second version.

   The next version of YANG should focus on:

   *  merging in the core versioning changes

   *  any small changes to the language that significantly improve
      modelling of difficult cases

   *  any small generalizations to the language that make it more widely
      usable (e.g., add a base float type)

   *  deprecation of functionality that adds unnecessary complexity, to
      be removed in a future version (e.g., sub-modules)

   *  fixes for any bugs or omissions in the existing specification.

2.2.3.  Better Data Models

   It would be better for both vendors and operators if there were a
   single set of standard/open data models rather than the competing
   sets from IETF and OpenConfig that are incompatible ecosystems.
   However, from the authors' perspective it is hard to see how these
   two ecosystems can combine - if anything there seems to be a widening
   of the gulf between the two ecosystems, with very different designs
   to the data models, and diverging protocol specifications that are
   being optimized for the different styles of data models.  E.g., the
   IETF protocols are being developed in a direction with direct and
   explicit datastore support, whereas the OpenConfig models combine
   intended configuration and operational state into a single data
   model.

   Having different data models and protocols greatly increases the cost
   for vendors, and adds indecision in the market because nobody wants
   to heavily invest in a technology that may have a limited lifetime.

   Even though there may not be any way to converge the IETF and
   OpenConfig ecosystems, the IETF should try hard to ensure that there
   is no further fracturing of the YANG ecosystem.

   The IETF could, strategically, decide that it doesn't want to invest
   in device YANG models, but given that it has already published a
   large number of them, this may not be the best strategy.  Assuming
   that the IETF still wants to develop and improve the ecosystem of
   IETF YANG data models then there should be more efforts to ensure
   that the data models work well together and function as a cohesive
   API.  The IETF should:

   *  Develop a mechanism to define sets of IETF and other SDO YANG
      models that are known to work well together, e.g., perhaps via
      defining YANG packages [I-D.draft-ietf-netmod-yang-packages].

   *  Define a more efficient mechanism for evolving YANG data models.
      Rather than having all of the YANG modules residing in RFCs, which
      are slow and expensive to update, it would be better to have a
      working copy of the IETF YANG models with fixes and enhancements
      applied, stored in GitHub and readily available for use.  Over
      time, as these models become stable, they could be published in
      RFCs, if necessary.

   *  Consider whether assets, such as YANG models, should be specified
      in documents at all, or whether the RFCs should only document the
      abstract overview of the YANG data model structure with the
      details of the code assets versioned within a git repository
      (perhaps backed by IANA).

   *  Check whether the YANG data models are complete enough to support
      particular standard deployments and configurations.  E.g., are all
      the required IETF YANG models available to configure an L3VPN
      service, or are there basic bits of functionality that would also
      be needed that are missing?  The IETF should aim to fill in any
      gaps in the models to ensure that at least the basic functionality
      can be defined in a vendor agnostic way.

2.2.4.  Continued engagement with operators

   The NMOP WG was deliberately chartered to encourage and bring more
   operator engagement into the IETF, and although the WG was only
   recently chartered, it is currently working well, particularly as
   there is increased energy by some operators towards more network
   automation, and trying to make significant improvements to the tools
   and techniques that are available, e.g., see
   [I-D.ietf-nmop-network-anomaly-architecture] and
   [I-D.ietf-nmop-yang-message-broker-integration].

   This equitable collaboration between operators, vendors, and
   universities is great for the IETF, and should be used as an example
   of a collaborative project within the IETF sphere that is working
   well.  This close collaboration means that the focus is directly on
   solving the most critical problems, with the running code being
   developed by multiple vendors at the same time to ensure that the
   solution is efficiently implementable.

2.2.5.  More efficient execution, smaller steps

   The IETF has a reputation for being slow to standardize new protocols
   and features, and partly that is the cost of a full consensus based
   approach.  One beneficial aspect of the increased time is that it
   allows for more reviews and more implementation experience before the
   specification is finalized.  However, the IETF also needs to
   understand that the slowness also comes at a cost, and for network
   management it would be better to have a solution driven approach,
   embracing the IETF mantra of "rough consensus and running code".
   E.g., it is arguably better to have a solution that achieves all the
   critical functionality and 90% of the desired functionality delivered
   in 1-2 years, rather than a full solution that achieves all of the
   desired functionality, but that takes 5 years to achieve consensus
   and be standardized.

   This is the approach that is being taken with the YANG Push Telemetry
   work.  Driven by a group of operators, the focus is on staged
   "minimum-viable-product" deliverables, where each deliverable is
   specified to require the minimum agreed functionality to meet a set
   of goals.  The vendors who are participating are developing
   implementations at the same time as the drafts are being
   standardized.  This quickly highlights potential problems in the
   proposed standards, so those issues can be mitigated sooner, and it
   gives us high confidence as the drafts progress towards RFC.

2.2.6.  Availability of Open Source solutions

   More focus should also be given to the availability of open source
   solutions that are easy for operators to adopt and that are shown to
   interoperate well with vendor implementations.  Although the IETF is
   not in the game of conformance checking of implementations, it could
   still be helpful for the Network Management related working groups to
   collectively invest in supporting an open source "reference"
   implementation that keeps pace with the standards, and robustly
   implements the core functionality defined in the specifications.

   The goal here is to lower the barrier to entry for operators and
   vendors, so that they can make better use of the existing network
   management configuration and telemetry solutions.

3.  Security Considerations

   Security of network management operations is of high importance due
   to the sensitive nature of the information.

4.  IANA Considerations

   This document has no IANA actions.

Acknowledgments

Informative References

   [I-D.draft-ietf-netmod-yang-packages]
              Wilton, R., Rahman, R., Clarke, J., Sterne, J., and B. Wu,
              "YANG Packages", Work in Progress, Internet-Draft, draft-
              ietf-netmod-yang-packages-04, 21 October 2024,
              <https://datatracker.ietf.org/doc/html/draft-ietf-netmod-
              yang-packages-04>.

   [I-D.ietf-nmop-network-anomaly-architecture]
              Graf, T., Du, W., and P. Francois, "An Architecture for a
              Network Anomaly Detection Framework", Work in Progress,
              Internet-Draft, draft-ietf-nmop-network-anomaly-
              architecture-01, 20 October 2024,
              <https://datatracker.ietf.org/doc/html/draft-ietf-nmop-
              network-anomaly-architecture-01>.

   [I-D.ietf-nmop-yang-message-broker-integration]
              Graf, T. and A. Elhassany, "An Architecture for YANG-Push
              to Message Broker Integration", Work in Progress,
              Internet-Draft, draft-ietf-nmop-yang-message-broker-
              integration-05, 19 October 2024,
              <https://datatracker.ietf.org/doc/html/draft-ietf-nmop-
              yang-message-broker-integration-05>.

   [RFC6241]  Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed.,
              and A. Bierman, Ed., "Network Configuration Protocol
              (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011,
              <https://www.rfc-editor.org/rfc/rfc6241>.

   [RFC7011]  Claise, B., Ed., Trammell, B., Ed., and P. Aitken,
              "Specification of the IP Flow Information Export (IPFIX)
              Protocol for the Exchange of Flow Information", STD 77,
              RFC 7011, DOI 10.17487/RFC7011, September 2013,
              <https://www.rfc-editor.org/rfc/rfc7011>.

   [RFC7854]  Scudder, J., Ed., Fernando, R., and S. Stuart, "BGP
              Monitoring Protocol (BMP)", RFC 7854,
              DOI 10.17487/RFC7854, June 2016,
              <https://www.rfc-editor.org/rfc/rfc7854>.

   [RFC8040]  Bierman, A., Bjorklund, M., and K. Watsen, "RESTCONF
              Protocol", RFC 8040, DOI 10.17487/RFC8040, January 2017,
              <https://www.rfc-editor.org/rfc/rfc8040>.

   [RFC8199]  Bogdanovic, D., Claise, B., and C. Moberg, "YANG Module
              Classification", RFC 8199, DOI 10.17487/RFC8199, July
              2017, <https://www.rfc-editor.org/rfc/rfc8199>.

   [RFC8299]  Wu, Q., Ed., Litkowski, S., Tomotaki, L., and K. Ogaki,
              "YANG Data Model for L3VPN Service Delivery", RFC 8299,
              DOI 10.17487/RFC8299, January 2018,
              <https://www.rfc-editor.org/rfc/rfc8299>.

   [RFC8342]  Bjorklund, M., Schoenwaelder, J., Shafer, P., Watsen, K.,
              and R. Wilton, "Network Management Datastore Architecture
              (NMDA)", RFC 8342, DOI 10.17487/RFC8342, March 2018,
              <https://www.rfc-editor.org/rfc/rfc8342>.

   [RFC8345]  Clemm, A., Medved, J., Varga, R., Bahadur, N.,
              Ananthakrishnan, H., and X. Liu, "A YANG Data Model for
              Network Topologies", RFC 8345, DOI 10.17487/RFC8345, March
              2018, <https://www.rfc-editor.org/rfc/rfc8345>.

   [RFC8466]  Wen, B., Fioccola, G., Ed., Xie, C., and L. Jalil, "A YANG
              Data Model for Layer 2 Virtual Private Network (L2VPN)
              Service Delivery", RFC 8466, DOI 10.17487/RFC8466, October
              2018, <https://www.rfc-editor.org/rfc/rfc8466>.

   [RFC8526]  Bjorklund, M., Schoenwaelder, J., Shafer, P., Watsen, K.,
              and R. Wilton, "NETCONF Extensions to Support the Network
              Management Datastore Architecture", RFC 8526,
              DOI 10.17487/RFC8526, March 2019,
              <https://www.rfc-editor.org/rfc/rfc8526>.

   [RFC8527]  Bjorklund, M., Schoenwaelder, J., Shafer, P., Watsen, K.,
              and R. Wilton, "RESTCONF Extensions to Support the Network
              Management Datastore Architecture", RFC 8527,
              DOI 10.17487/RFC8527, March 2019,
              <https://www.rfc-editor.org/rfc/rfc8527>.

   [RFC8528]  Bjorklund, M. and L. Lhotka, "YANG Schema Mount",
              RFC 8528, DOI 10.17487/RFC8528, March 2019,
              <https://www.rfc-editor.org/rfc/rfc8528>.

   [RFC8639]  Voit, E., Clemm, A., Gonzalez Prieto, A., Nilsen-Nygaard,
              E., and A. Tripathy, "Subscription to YANG Notifications",
              RFC 8639, DOI 10.17487/RFC8639, September 2019,
              <https://www.rfc-editor.org/rfc/rfc8639>.

   [RFC8641]  Clemm, A. and E. Voit, "Subscription to YANG Notifications
              for Datastore Updates", RFC 8641, DOI 10.17487/RFC8641,
              September 2019, <https://www.rfc-editor.org/rfc/rfc8641>.

   [RFC9182]  Barguil, S., Gonzalez de Dios, O., Ed., Boucadair, M.,
              Ed., Munoz, L., and A. Aguado, "A YANG Network Data Model
              for Layer 3 VPNs", RFC 9182, DOI 10.17487/RFC9182,
              February 2022, <https://www.rfc-editor.org/rfc/rfc9182>.

   [RFC9291]  Boucadair, M., Ed., Gonzalez de Dios, O., Ed., Barguil,
              S., and L. Munoz, "A YANG Network Data Model for Layer 2
              VPNs", RFC 9291, DOI 10.17487/RFC9291, September 2022,
              <https://www.rfc-editor.org/rfc/rfc9291>.

Appendix A.  Current technology and solutions

   This section of the document gives a perspective of the current
   landscape of existing network management solutions that may be found
   on network devices, along with a brief mention of network and service
   YANG models, and lists some of the issues with those existing
   technologies.

A.1.  CLI interfaces

   Most vendors offer a command line interface (CLI) for configuring
   devices and reporting the operational state of devices (e.g., via
   _show commands_).

   For some devices, these CLIs offer interactive text based interfaces
   to underlying external management models, whereas for other devices,
   the CLIs are defined independently of any programmatic external data
   model, which can make it hard for network engineers to migrate from a
   familiar CLI to using a very different programmatic data model.

   Generally, these CLI based interfaces offer configuration for the
   full capabilities of the device, including all optional functionality
   and features.  They also generally ensure that the device can be
   configured in an efficient way (e.g., consider scenarios where ad hoc
   data model specific templates allow a particular configuration to be
   represented both concisely on the CLI and also implemented in the
   device's hardware in a resource efficient way).

   In some cases, network controllers are used as a bridge between
   offering a north bound programmatic data model and a south bound
   interface for configuring and managing the device by CLI.

   In all cases, the CLIs are generally not designed or optimized to be
   manipulated programmatically, lacking consistent structure and
   typing, making the solution more fragile when used in this manner.

   Most devices also offer _show commands_, or the equivalent, for
   reporting operational state in a text based format, either using
   tables, or free-form text reporting of relevant fields.  Automated
   parsing of this output can be very fragile, particularly for values
   that are occasionally outside the anticipated ranges, which may skew
   the table formatting or be truncated.  For some devices, these show
   commands are available in different variants that control the level
   of detail reported, and often the same fundamental information may be
   reported in multiple separate show commands.  I.e., there can be a
   level of duplication in the data that is reported.  Device vendors
   seem to more commonly apply version management to the configuration
   aspects of the CLI rather than operational show commands.
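
   The fragility of parsing tabular show command output can be seen in
   a small Python sketch.  The table layout, column widths, and helper
   names below are invented for illustration and do not correspond to
   any particular vendor's output; the device-side formatter is assumed
   to pad columns without truncating them.

```python
from typing import Dict


def parse_fixed_width(line: str) -> Dict[str, str]:
    """Naive parser that trusts fixed column widths of 12, 8, and 10."""
    return {
        "name": line[0:12].strip(),
        "status": line[12:20].strip(),
        "in_pkts": line[20:30].strip(),
    }


def render_row(name: str, status: str, in_pkts: int) -> str:
    """Hypothetical device formatter: pads columns, never truncates."""
    return f"{name:<12}{status:<8}{in_pkts:<10}"


short_row = render_row("Gi0/0/1", "up", 1200)
long_row = render_row("HundredGigE0/0/0/12", "up", 1200)

# The short name parses as intended.
print(parse_fixed_width(short_row))
# The long name shifts every later column, so the parser silently
# returns a wrong status and a non-numeric packet count.
print(parse_fixed_width(long_row))
```

   A structured encoding of the same data avoids this class of error
   because field boundaries do not depend on the width of any value.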

   Generally, access to the show commands is via a synchronous
   operation that queries the device and waits for it to collate, sort,
   and format the data before returning it.  These mechanisms are likely
   to be less efficient than mechanisms that push the operational state
   off the device, particularly if only changes in the data are pushed,
   and if those changes don't occur at high frequency (or some form of
   efficient dampening mechanism is employed).
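
   The difference between polling and pushing only changes can be
   sketched as a simple filter that passes a value on only when it has
   changed and a minimum interval has elapsed since the last update.
   This is a deliberately simplified illustration with hypothetical
   names and sample data; it is not the dampening algorithm of any
   specific protocol, and a suppressed change is only reported when the
   value is next observed after the interval has passed.

```python
from typing import Any, List, Optional, Tuple


class OnChangeFilter:
    """Pass on a value only if it changed and the interval elapsed."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._last_sent: Optional[Tuple[float, Any]] = None

    def observe(self, now: float, value: Any) -> bool:
        """Return True if `value` should be pushed at time `now`."""
        if self._last_sent is not None:
            sent_at, sent_value = self._last_sent
            if value == sent_value:
                return False  # nothing changed since the last update
            if now - sent_at < self.min_interval:
                return False  # changed, but too soon after last update
        self._last_sent = (now, value)
        return True


# Hypothetical samples of an interface's status: (time, value).
samples = [(0, "up"), (1, "up"), (2, "down"), (3, "down"),
           (8, "down"), (9, "down")]

change_filter = OnChangeFilter(min_interval=5)
pushed: List[Tuple[int, str]] = [
    (t, v) for t, v in samples if change_filter.observe(t, v)
]
print(f"polled {len(samples)} samples, pushed {len(pushed)} updates")
```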

A.2.  SNMP & MIBS

   Although there is currently a widely deployed base of SNMP and MIBs
   used for monitoring operational data via periodic polling, we expect
   there to be a significant decrease in deployments over the next 10
   years, at least when used for network management, although there will
   inevitably be a long tail of deployments before everyone has migrated
   to newer technologies.  SDOs, like IETF, have generally stopped
   writing new MIBs, or making significant updates to existing MIBs, or
   the SNMP protocol.  Similarly, it appears that most vendors are
   investing much more heavily in more recent network management
   protocols and data models rather than investing in either SNMP,
   updating existing MIBs with new OIDs, or creating new MIBs.

   For distributed servers, the design of the SNMP protocol is
   inherently expensive to implement, generally requiring lots of
   information to be cached before it can be returned.

   SNMP & MIBs never achieved significant traction in the industry for
   configuration of core network devices, with operators required to
   either use the CLI, YANG based management protocols, or other
   proprietary management interfaces or APIs.

A.3.  YANG based Network Management Protocols

   This section describes the current modern network management
   protocols, which are predominantly YANG based, or optimized for use
   with YANG.

A.3.1.  NETCONF

   NETCONF [RFC6241] is an XML based network management protocol.
   Originally it was specified to work with generic XML based network
   management data, but now it is generally expected to be used in
   conjunction with YANG modeled configuration and operational data.
   More recently, NETCONF was extended to support the NMDA [RFC8342].
   This hasn't yet seen wide adoption, but there is gradually increasing
   interest.
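
   As a reminder of what the XML based protocol looks like on the wire,
   the following Python sketch builds a minimal <get-config> request
   using only the standard library.  The helper name is hypothetical,
   and the sketch covers message construction only: session setup,
   capability exchange, and message framing are all omitted.

```python
import xml.etree.ElementTree as ET

# Base NETCONF XML namespace.
NC_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"


def build_get_config(message_id: str,
                     datastore: str = "running") -> str:
    """Build a <get-config> request for the named datastore."""
    rpc = ET.Element(f"{{{NC_NS}}}rpc", {"message-id": message_id})
    get_config = ET.SubElement(rpc, f"{{{NC_NS}}}get-config")
    source = ET.SubElement(get_config, f"{{{NC_NS}}}source")
    # The datastore is named by an empty element, e.g. <running/>.
    ET.SubElement(source, f"{{{NC_NS}}}{datastore}")
    return ET.tostring(rpc, encoding="unicode")


print(build_get_config("101"))
```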

   NETCONF is one of the main network management protocols used for
   configuring devices, used alongside the CLI and gNMI.

   Most of the NETCONF protocol is reasonably well specified, but there
   are aspects of the protocol that have more patchy implementation
   support, including a shared candidate datastore, confirmed commit
   capability, and XPath based filtering.  Some areas of the
   specification are unclear or hard to understand because the
   definition of the expected behaviour is split between the NETCONF and
   YANG RFCs.

   Some aspects of the NETCONF specification give flexibility for the
   server to implement the behaviour in different ways (e.g., different
   YANG defaults handling and reporting, startup configuration handling,
   writable running vs candidate configuration).  This flexibility makes
   it easier for device implementations, but increases the complexity
   for clients because they must be able to interoperate with different
   server behaviour.
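
   The impact of differing defaults handling on clients can be sketched
   as follows.  The three modes are loosely modelled on the "report-
   all", "trim", and "explicit" modes of the NETCONF with-defaults
   capability (RFC 6243, which is not otherwise referenced by this
   document); the schema defaults, function name, and sample
   configuration are hypothetical.

```python
from typing import Any, Dict

# Hypothetical schema defaults for a few interface leaves.
SCHEMA_DEFAULTS: Dict[str, Any] = {"enabled": True, "mtu": 1500}

_NO_DEFAULT = object()  # sentinel for leaves with no schema default


def report(explicit_config: Dict[str, Any], mode: str) -> Dict[str, Any]:
    """Return the leaves a server would report under the given mode."""
    if mode == "report-all":
        # Every leaf is reported, including unset ones with defaults.
        return {**SCHEMA_DEFAULTS, **explicit_config}
    if mode == "trim":
        # Leaves whose value equals the schema default are omitted.
        return {k: v for k, v in explicit_config.items()
                if SCHEMA_DEFAULTS.get(k, _NO_DEFAULT) != v}
    if mode == "explicit":
        # Only client-set leaves are reported, defaults included.
        return dict(explicit_config)
    raise ValueError(f"unknown mode: {mode}")


# The client explicitly set "mtu" to a value that is also the default.
client_config = {"name": "eth0", "mtu": 1500}
for mode in ("report-all", "trim", "explicit"):
    print(mode, report(client_config, mode))
```

   A client that must work against servers using any of these modes has
   to handle three different answers to the same query.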

   There is some early work within the NETCONF WG to update the NETCONF
   protocol.  This could be a good opportunity to drive for more
   baseline conformity in behaviour across all network devices that
   support the new protocol version.

A.3.2.  RESTCONF

   RESTCONF [RFC8040] is a newer _REST_ style network management
   protocol that runs over HTTP and uses YANG data models.  Broadly,
   RESTCONF offers similar functionality to NETCONF.  RESTCONF has
   achieved more traction as a Northbound interface to network
   controllers, whereas the main programmatic network management
   interfaces to devices remain NETCONF or gNMI.

   RESTCONF initially offered a "simulated combined datastore" view of
   the data, done as an effort to simplify the interface.  However, the
   NMDA architecture effectively changed this to a datastore aware
   architecture, more closely mirroring NETCONF.  There seems to be more
   support for ensuring that RESTCONF maintains feature parity with
   NETCONF.  It has been suggested that RESTCONF could just replace
   NETCONF as the single IETF protocol to network devices, but there
   doesn't appear to be a strong industry backing for going in that
   direction.

   RESTCONF supports encoding the data in both JSON and XML.  The RFC
   specifies XML as the mandatory-to-support encoding, but it seems
   likely that, in the seven years since the RFC was published, the
   JSON encoding has become much more popular than XML.
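
   In RESTCONF, the choice of encoding is made through ordinary HTTP
   content negotiation.  The Python sketch below prepares, but does not
   send, a GET request for a single interface entry.  The helper name
   and the "https://router.example/restconf" API root are placeholders
   (a real client would discover the API root from the server), and the
   "ietf-interfaces" module is used only as a familiar example.

```python
import urllib.request
from urllib.parse import quote


def build_restconf_get(api_root: str, if_name: str,
                       encoding: str = "json") -> urllib.request.Request:
    """Prepare (but do not send) a GET for one interface list entry."""
    if encoding not in ("json", "xml"):
        raise ValueError("encoding must be 'json' or 'xml'")
    # Percent-encode the key so "Gi0/0/1" stays a single path segment.
    key = quote(if_name, safe="")
    url = (f"{api_root}/data/"
           f"ietf-interfaces:interfaces/interface={key}")
    headers = {"Accept": f"application/yang-data+{encoding}"}
    return urllib.request.Request(url, headers=headers, method="GET")


# Placeholder API root; nothing is sent over the network.
request = build_restconf_get("https://router.example/restconf",
                             "Gi0/0/1")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```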

   Some enhancements to RESTCONF, in many cases, mirroring similar
   enhancements being made for NETCONF, are being considered for
   standardization within IETF.

A.3.3.  gNMI Protocol Suite

   gNMI is a newer, industry defined, gRPC based network management
   protocol that carries data modelled in YANG, encoded via JSON or
   Protobuf.  The _gNMI family_ of protocols includes other related
   protocols for the management and orchestration of devices (e.g.,
   gNOI, gNSI, gRIBI, Bootz).  These are often modelled using gRPC and
   separate Protobuf definitions rather than leveraging YANG's RPC,
   Action, and Notification mechanisms.

   The stewardship of this protocol suite predominantly falls on the
   operator community, but with strong leadership by a principal network
   operator.  This allows the protocol to evolve more quickly, although
   potentially in non-compatible ways that could break existing
   deployments.  The specifications tend to be less precise than those
   of the equivalent IETF protocols, and generally receive a lower level
   of technical review, meaning that there are more likely to be
   interoperability issues between different implementations.

   There are efforts underway to improve interoperability via a
   conformance test suite that is being collectively maintained.

A.4.  YANG & YANG Data Models

A.4.1.  YANG

   The YANG data modelling language exists in two versions, YANG 1 and
   YANG 1.1.  The effective differences between the two versions are
   relatively minor and YANG models using both versions are deployed.

   The IETF NETMOD working group is at the early stages of considering a
   new version of the YANG language, with over 100 potential issues and
   enhancements under discussion.  At the time of this draft's
   publication, it is unclear whether consensus will converge around a
   relatively small update to the language, or a more significant new
   version.  It is
   anticipated that any new version of the YANG language would likely
   take several years to specify and gain consensus.  Care must be taken
   to strike the right balance of making enough improvements to the
   language to make an upgrade worthwhile, vs bloating the language with
   too many features, i.e., suffering from second system syndrome.  A
   future version of the language should be framed clearly around the
   set of problems it is aiming to solve, e.g., minor fixes to the
   existing specification, ease of use improvements, or making it easier
   to model specific problem domains, hopefully without introducing too
   much additional complexity.

A.4.2.  Network Device YANG Data Models

   There appears to be somewhat of a fracture in the industry as to
   whether YANG models should be modelled using datastores (as per the
   IETF Network Management Datastore Architecture (NMDA)), or whether
   they should adopt OpenConfig's style, where a single data model
   contains intended configuration, applied configuration, and
   operational state in a combined data tree, using a structural naming
   convention.
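
   As a rough illustration of the two styles, the following Python
   sketch represents the same interface setting first as a single
   OpenConfig-style tree and then as per-datastore views in the NMDA
   style.  Nested dictionaries merely stand in for YANG instance data;
   the interface name, the values, and the two helper functions are
   hypothetical and heavily simplified.

```python
# Illustrative only: nested dicts stand in for YANG instance data, and
# the interface name and values are made up.

# OpenConfig style: a single tree in which sibling "config" and "state"
# containers hold the intended and applied values respectively.
openconfig_tree = {
    "interfaces": {
        "interface": {
            "eth0": {
                "config": {"enabled": True},       # intended config
                "state": {
                    "enabled": False,              # applied config
                    "oper-status": "DOWN",         # operational state
                },
            }
        }
    }
}

# NMDA style: a single schema path whose value depends on the datastore
# that is read.
nmda_datastores = {
    "intended": {
        "interfaces": {"interface": {"eth0": {"enabled": True}}}
    },
    "operational": {
        "interfaces": {
            "interface": {
                "eth0": {"enabled": False, "oper-status": "down"}
            }
        }
    },
}


def pending_openconfig(tree, name, leaf):
    """True if the intended value differs from the applied value."""
    entry = tree["interfaces"]["interface"][name]
    return entry["config"].get(leaf) != entry["state"].get(leaf)


def pending_nmda(datastores, name, leaf):
    """True if <intended> and <operational> disagree on the leaf."""
    def read(datastore):
        return datastores[datastore]["interfaces"]["interface"][name]
    return read("intended").get(leaf) != read("operational").get(leaf)


print(pending_openconfig(openconfig_tree, "eth0", "enabled"))
print(pending_nmda(nmda_datastores, "eth0", "enabled"))
```

   In the first form a single read of the tree returns both values; in
   the second, a client compares two reads, one per datastore.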

   In some ways, the OpenConfig style leads to a simpler combined data
   tree, but the YANG files themselves, through the frequent use of
   groupings, are generally much harder to read than the NMDA
   equivalent, unless compiled into a more readable format.  The
   OpenConfig style doesn't lend itself well to modelling special
   configuration, e.g., boot configuration or ephemeral configuration,
   both of which can be modelled cleanly using the NMDA.  Further,
   there are aspects of the YANG language that somewhat conflict with
   the OpenConfig style, meaning that various YANG language constructs,
   e.g., presence containers or choice & case statements, are
   problematic to use with OpenConfig modelling.

   Conversely, using models designed around the NMDA effectively
   requires the NMDA extensions to the NETCONF [RFC8526] and RESTCONF
   [RFC8527] protocols, which require the target datastore to be
   specified during operations.  Further, this means that operations
   and requests act on either configuration or operational data, not
   both together.

   In terms of implementation, many network devices store and manage
   configuration data separately from operational data due to the
   different constraints and requirements on the different data sets,
   e.g., configuration must be transactional and fully consistent,
   whereas the operational data is generally only ever eventually
   consistent.  This means that queries or subscriptions that require
   both configuration and operational state in a single response require
   the system to fetch the information from two different subsystems and
   to merge the data into a single response before returning.  Depending
   on where the applied configuration is tracked in the system, this may
   also be required when combining 'applied configuration' and system
   defined operational state (e.g., counters and protocol network
   state).
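
   The merge step described above can be sketched as follows.  This is
   a minimal Python illustration, assuming two hypothetical in-memory
   stores that stand in for the separate configuration and operational
   subsystems; the store contents and function names are invented for
   the example, and a real device would also need to deal with timing
   and consistency between the two reads.

```python
# Hypothetical stand-ins for two separate device subsystems.
config_store = {
    "eth0": {"mtu": 9000, "description": "uplink"},
}
oper_store = {
    "eth0": {"oper-status": "up", "counters": {"in-octets": 1234}},
}


def merge_trees(config, oper):
    """Recursively merge two nested dicts into one combined tree.

    Where both sides define the same leaf, the operational value wins;
    that is an arbitrary choice made for this sketch.  Subtrees present
    on only one side are shared with the input rather than copied.
    """
    merged = dict(config)
    for key, value in oper.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_trees(merged[key], value)
        else:
            merged[key] = value
    return merged


def get_combined(name):
    """Fetch from both subsystems, then build a single response."""
    config = config_store.get(name, {})   # first subsystem read
    oper = oper_store.get(name, {})       # second subsystem read
    return merge_trees(config, oper)


print(get_combined("eth0"))
```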

A.4.2.1.  Standards based YANG models (IETF, IEEE, BBF, 3GPP)

   Various SDOs, e.g., IETF, IEEE, BBF, and 3GPP, are all in the process
   of defining YANG models that specify network management interfaces
   for the network protocols that they are responsible for.

   For the IETF, these data models are designed around the NMDA,
   allowing the same models to be used for configuration and to be
   extended to cover operational state, aligning the same paths and
   definitions wherever possible.  This approach also allows other
   views (i.e., datastores) on the data to be provided (e.g.,
   factory-default, startup configuration, system defaults, or ephemeral
   configuration).

   IETF has already produced RFCs defining network device YANG data
   models covering many of the key network protocols defined by the
   IETF.  Where published, the YANG models generally provide good
   coverage of the protocol in question, including optional
   functionality.  However, the set of IETF YANG models published so far
   has three problems: the models have taken a very long time to reach
   standardization; they make use of YANG extensions that are not yet
   widely implemented (e.g., Schema mount [RFC8528]); and there are
   significant gaps in the YANG modules that have been published to
   date, e.g., the IETF doesn't yet have a published RFC for BGP, L2VPN,
   or EVPN functionality.

A.4.2.2.  Industry based YANG Models (OpenConfig)

   The OpenConfig industry consortium also defines a set of YANG models
   for configuring and monitoring devices.  Like the gNMI protocol, the
   stewardship of these models predominantly falls on the operator
   community, but with strong leadership by a principal operator.
   Vendors implementing the models can also make suggestions and provide
   comments on proposed changes and additions to the data models,
   particularly when they would be hard, or impossible, to implement
   effectively on the network devices.  However, decisions are generally
   less open than, say, IETF's consensus-based procedures.

   The OpenConfig models are focussed on solving the configuration
   requirements of those operators who participate in the forums, and
   hence they are somewhat more focussed on particular network designs
   and protocol choices.  This can mean that some technologies may not
   currently be covered by the OpenConfig YANG models at all, and it can
   be harder to get additions accepted, or those additions could undergo
   significant breaking changes if more operators start to pick a
   particular technology and collectively decide that a different
   approach to modelling would be better.

   The OpenConfig models evolve at a much faster rate than those in the
   IETF with a lower bar to review and more willingness to make breaking
   changes to just fix issues, or improve the models, and then move on.
   There are efforts to restrict those breaking changes to an annual
   basis, but this will still likely mean that many deployments that
   move between software releases more slowly would see breaking changes
   in the management model whenever they update.  Models are likely to
   gain more stability over time, but it is still very likely that there
   will be issues with version skew in the models, and the burden of
   handling that skew is likely to fall on the clients or controllers
   using the models.
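
   One way a client or controller might detect such skew is sketched
   below in Python.  The sketch assumes that each model carries a
   "major.minor.patch" version string and that a change of major
   version signals a potentially breaking change; the module names,
   version numbers, and function names are hypothetical.

```python
def parse_version(text):
    """Split a 'major.minor.patch' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible(client_version, device_version):
    """Assume only a differing major version is potentially breaking."""
    client_major = parse_version(client_version)[0]
    device_major = parse_version(device_version)[0]
    return client_major == device_major


def skewed_modules(client_models, device_models):
    """Return modules the device lacks or offers at a breaking version."""
    skewed = []
    for module, client_version in sorted(client_models.items()):
        device_version = device_models.get(module)
        if device_version is None:
            skewed.append(module)
        elif not is_compatible(client_version, device_version):
            skewed.append(module)
    return skewed


# Versions the client was built against (hypothetical values).
client = {"example-interfaces": "2.4.1", "example-bgp": "6.0.0"}
# Versions advertised by the device after an upgrade (hypothetical).
device = {"example-interfaces": "3.0.0", "example-bgp": "6.1.0"}

print(skewed_modules(client, device))
```

   A controller could run such a check whenever a device is upgraded
   and fall back to a different mapping for any module that is flagged.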

   Generally, OpenConfig models are restricted to using YANG 1, rather
   than using the updated YANG 1.1 specification.

A.4.2.3.  Vendor specific YANG models

   Most large Internet routers expose YANG data models for configuring
   and monitoring the device.

   There are various choices for the sources of these data models:

   *  based on an existing internal data model

   *  based on the CLI (or show commands)

   *  based on existing published or draft models (e.g., IETF or
      OpenConfig)

   *  designed from scratch.

   Each of these designs has advantages and disadvantages.

   Generating or basing the external model on an internal model normally
   has the advantage that it is easy to translate the configuration for
   consumption by the system.  However, it has the disadvantages that it
   may leak internal details and structures into the external model, may
   not leverage the full capabilities of YANG, and may not be as easy to
   use.  If the internal model is quite different from the CLI then
   network operators familiar with the CLI must still learn the new
   model structure.  It probably also forces some level of versioning on
   the internal data structures or, alternatively, the ability to handle
   version skew between the generated models and the internal data
   model.

   Basing the vendor device YANG model on the CLI makes the models more
   familiar, but the structure and extensibility of the CLI and YANG
   differ somewhat, potentially making for less well structured YANG
   models (compared to designing the YANG models from scratch).  One
   strong advantage of this approach is that it allows a clean bijective
   conversion between the CLI and the equivalent YANG.

   Basing the vendor device YANG model on existing SDO or Industry YANG
   models potentially allows for network operator familiarity (but not
   with respect to the CLI) and conformability, but unless the device is
   a green field development, the way particular features are modelled
   in the external model may differ significantly from the internal
   device representation, requiring more complex, and potentially less
   efficient, mapping and internal representation (e.g., expansion of
   config and less efficient use of hardware resources).  Hence, it is
   likely that deviations and augmentations to the external models will
   be required to ensure that the external model can be mapped
   reasonably cleanly into internal representations.  A further concern
   is version skew if the published models change over time but more
   stability is required in the vendor's external model to support
   existing customer deployments.  A final concern here is trying to
   predict the right public model family to base the models on, i.e.,
   which YANG models will likely end up succeeding in the market in the
   medium term.

   The final choice is to define the model entirely from scratch.  This
   potentially allows for a better solution, but at a greater
   development cost.  How closely the model maps to the existing CLI,
   internal model, or industry or SDO models generally determines which
   of the advantages and disadvantages described above apply to this
   approach.

   Generally, in all cases, you would desire and expect the vendor
   models to have full parity with the configuration that can be
   expressed via the CLI, leveraging all of the device configuration
   capabilities.

   A different set of choices may be made for the operational data
   (e.g., show command equivalents), although many of the same
   advantages and disadvantages equally apply.

A.4.2.4.  Problems with the YANG model ecosystem

   One of the biggest problems that is slowing the adoption of YANG and
   automated network management is the fracture between standard network
   management models for managing devices, documented in
   Appendix A.4.2.1 and Appendix A.4.2.2:

   *  OpenConfig YANG is more cohesive and complete for various
      deployments.

   *  IETF YANG is more complete for some specific protocols, but it may
      not be sufficient to be deployed on its own, retaining some large
      gaps that must be filled with draft models, or augmented with
      vendor proprietary models.

   In addition to this, every vendor has their own legacy CLI and their
   own data models, which may be entirely independent, be based on the
   CLI, or perhaps be based on an internal data model.  Most devices are
   likely to have separate internal data models that differ from the
   external data models, and that won't necessarily even be defined in
   YANG.

   All of these data model families define their properties in different
   ways that are not completely compatible with each other.  Further, it
   is unclear which external YANG data models, if any, will end up
   dominating the marketplace, and hence reworking or aligning a
   device's internal data models (perhaps based on previous non-YANG
   technologies) to better suit the style of a single external model
   family is likely to be a risky strategy if the wrong data model is
   chosen.  Hence, this generally requires some form of 'mapping' of
   data in external model families into internal model families, which
   has its own set of challenges and complexities, see
   Appendix A.4.2.5.

A.4.2.5.  Problems with mapping between internal and external data model
          families

   Mapping between external and internal data model families brings its
   own set of issues.

   The first obvious problem occurs when the external and internal data
   models are not fundamentally defined in the same modelling language
   and where equivalent concepts are modelled in different ways.  For
   example, the concept of how filtering is performed can be specified
   in an optimized form in the data model, or it can be defined purely
   as a protocol operation.

   Secondly, even when both external and internal models are represented
   in the same domain language (e.g., YANG) then there is a fundamental
   choice about how to map data (configuration or operational) between
   the external and internal model families, and what represents the
   source of truth of configuration data for the device.

   The perhaps naive, and most obvious, approach is to try to convert
   configuration data in the external model to configuration data in
   the internal data model, and then store the configuration in that
   internal format.  Whenever a request is made to read the current
   configuration, the device converts from its internal configuration
   back to the requested external representation.  For the device, the
   source of truth for the configuration is always stored in the
   internal native format.  Such a choice would allow clients to query
   the configuration in different formats (e.g., device-native,
   OpenConfig, or IETF), or send in separate configuration requests in
   different families (e.g., the bulk of the configuration could be
   defined as OpenConfig YANG, but overridden with native CLI or YANG
   to cover the parts of the configuration that are not expressible in
   OpenConfig).  Alas, this approach also brings significant problems.
   Unless the internal and external data models are very closely
   aligned (and this isn't generally possible when different
   incompatible external model families exist), exact bijective
   mappings are not possible, since there is always a loss of data.
   When you request to read the configuration back, even in the same
   model family as first configured, you will receive a slightly
   different version of that configuration data, perhaps with default
   values added/removed, or differences in the names of arbitrary
   identifiers.  It is the authors' opinion that this is not the best
   way of trying to solve this problem.
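
   The round-trip problem can be illustrated with a small Python
   sketch.  The two mapping functions, the field names, and the
   default MTU value are all hypothetical; the point is only that a
   mapping which fills in defaults and drops fields that the internal
   model cannot represent does not return the configuration that was
   originally supplied.

```python
INTERNAL_DEFAULT_MTU = 1500   # hypothetical internal default


def external_to_internal(external):
    """Map a (hypothetical) external interface config to internal form."""
    return {
        "ifname": external["name"],
        # A missing MTU is stored as an explicit internal default.
        "mtu": external.get("mtu", INTERNAL_DEFAULT_MTU),
        # The internal model has no description field, so any
        # description supplied by the client is dropped here.
    }


def internal_to_external(internal):
    """Map the stored internal form back to the external model."""
    return {"name": internal["ifname"], "mtu": internal["mtu"]}


original = {"name": "eth0", "description": "uplink"}
stored = external_to_internal(original)      # source of truth
round_trip = internal_to_external(stored)    # what a client reads back

print(original)
print(round_trip)
print(round_trip == original)
```

   In this sketch the configuration read back has gained an explicit
   "mtu" value and lost its "description", so a client comparing the
   two would see a difference that it did not configure.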

   The alternative solution, for configuration, is to only map the
   external configuration down into the internal configuration in a
   single direction (but allow for configuration errors to be correctly
   propagated back).  The device persists the configuration in the
   external format as the source of truth, so that any queries to
   return the applied configuration are able to return the exact
   configuration originally provided.  This approach allows for more
   complex mappings than the bidirectional mapping approach described
   above, but requires that the external client manage configuration in
   different model families effectively.
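
   A minimal Python sketch of this one-way approach follows.  The
   class, exception, field names, and MTU limits are hypothetical and
   chosen only for illustration: the external configuration is kept
   verbatim as the source of truth, the internal form is derived from
   it, and a mapping failure is reported back to the client without
   changing either copy.

```python
import copy

MIN_MTU, MAX_MTU = 68, 9216   # hypothetical limits for this sketch


class MappingError(Exception):
    """The external configuration cannot be mapped to internal form."""


def map_to_internal(external):
    """One-way mapping; there is deliberately no reverse function."""
    mtu = external.get("mtu", 1500)
    if not MIN_MTU <= mtu <= MAX_MTU:
        raise MappingError(f"mtu {mtu} is out of range")
    return {"ifname": external["name"], "mtu": mtu}


class Device:
    def __init__(self):
        self.external_config = {}   # source of truth, stored verbatim
        self.internal_config = {}   # derived; never mapped back

    def apply(self, external):
        """Map first, so a failure leaves the stored config untouched."""
        internal = map_to_internal(external)   # may raise MappingError
        self.external_config = copy.deepcopy(external)
        self.internal_config = internal

    def read_config(self):
        """Return exactly what the client originally provided."""
        return copy.deepcopy(self.external_config)


device = Device()
device.apply({"name": "eth0", "mtu": 9000, "description": "uplink"})
print(device.read_config())
```

   Because the stored external form is never regenerated from the
   internal form, the mapping is free to discard or reshape data
   without affecting what clients read back.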

A.4.2.6.  Problems with how the IETF creates and manages YANG models

   It is hard to argue that IETF has been anything less than very
   successful at encouraging and advancing interoperability between
   devices over the last four decades.  Some aspects that make the IETF
   process very successful also somewhat act to its detriment.  One key
   observation is that new technology and advances generally move fairly
   slowly in the IETF, and once standardized, are often even slower to
   change further.  Generally, it is much easier to slow down or block
   work within the IETF than it is to bring new ideas.  Although the
   slow pace of initial standards development and subsequent evolution
   can be frustrating, it has the benefit that once the technology
   becomes mature and is implemented, those protocols and
   implementations can be stable over a relatively long time period.
   For some operators and deployments this isn't necessarily important;
   for others, it can reduce long term costs.

A.4.3.  Network and Service YANG Models

   The IETF has also specified various YANG models that exist at the
   Service or Network-wide layer rather than models for managing
   specific devices.  For example, L3VPN [RFC8299] and L2VPN [RFC8466]
   define _Service_ YANG models, and [RFC9182] and [RFC9291] define
   _Network-wide_ YANG models.  In addition, network-wide topologies
   can be modelled using [RFC8345], along with many augmentations that
   have been published or are being developed.  [RFC8199] helps
   characterize the difference between service and device (element)
   YANG models, but doesn't cover the network-wide layer
   classification.

   There has been somewhat stronger adoption of the network and service
   IETF YANG models by operators, sometimes used in conjunction with
   OpenConfig YANG models for configuring elements, or otherwise with
   device-native CLI or YANG models.

   These models generally fall outside the scope of the YANG models
   discussed in the rest of this document, because they do not directly
   apply to network elements.

   We are not aware of other industry attempts at defining Network or
   Service YANG models, but MEF has been working on defining APIs at
   various management layers, mostly built around OpenAPI specifications
   rather than YANG.

Authors' Addresses

   Robert Wilton
   Cisco Systems
   Email: rwilton@cisco.com


   Nick Corran
   Cisco Systems
   Email: ncorran@cisco.com