Internet-Draft | Deterministic Networking | December 2022 |
Guo & Wen | Expires 26 June 2023 | [Page] |
This document presents a scheme for planning virtual periodic forwarding channels (VPFCs) based on virtual periodic forwarding paths (VPFPs) in large-scale deterministic networks (LDNs). The scheme resolves the queuing resource conflicts of specified-cycle forwarding in nonlinear topology domains in LDNs, thereby ensuring bounded latency for DetNet flows within the same periodic forwarding domain deployed in an LDN.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 26 June 2023.¶
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
As described in [I-D.ietf-detnet-scaling-requirements], LDNs need to support not only large numbers of flows and devices, but also large single-hop propagation latency, and must accommodate a variety of data plane queuing and forwarding mechanisms to carry App-flows with different levels of SLA. For App-flows with strict requirements on delivery delay and delay variation (jitter), it is necessary to adopt a mechanism based on cyclic queuing and forwarding similar to CQF [IEEE802.1Qch]. These mechanisms are extended for WANs and make forwarding in DetNet transit nodes lightweight, without per-flow and per-hop state, and suitable for high-performance hardware implementation.¶
[I-D.qiang-detnet-large-scale-detnet] presents the overall framework and key methods for LDNs. Owing to their different transport methods, CSQF [I-D.chen-detnet-sr-based-bounded-latency] and TCQF [I-D.eckert-detnet-mpls-tc-tcqf] propose different mechanisms for LDNs. For multi-hop forwarding with these LDN mechanisms, the delay variation (jitter) does not exceed the length of 2 cycles. These mechanisms are all cycle-based queuing and forwarding schemes built on CQF extensions. All of them need to work with some resource reservation scheme, but none of them specifies how to realize that resource reservation in detail. This document proposes a VPFC planning scheme based on VPFPs to meet this demand.¶
The resource reservation method for queuing and forwarding with a specified cycle is very different from traditional resource reservation methods. Traditional methods, such as RSVP-TE [RFC3473], only consider bandwidth availability for best-effort flows; that is, the reserved bandwidth meets the Peak Data Rate (PDR) of the service flow at the macroscopic level, but the injection time of the packets at the microscopic level is not taken into account. Applying these methods to the resource reservation of cyclic queuing and forwarding means that the bandwidth resources meet the transmission demand at the macroscopic level, while there may be no resources available in a specific cycle. If this problem remains unsolved, the premise of CSQF/TCQF bounded latency cannot be satisfied.¶
The prerequisite of CSQF/TCQF is that the data corresponding to a cycle can be forwarded during that cycle. In a single-path model, this prerequisite is easy to meet, but real networks have nonlinear topologies, and more work needs to be done to ensure the precondition there.¶
For example, as shown in Figure 1, three flows (or aggregates of multiple service flows) flow1, flow2, and flow3 are injected at PE1, PE2, and PE3 respectively, and converge at P4 after being forwarded. At the macro level, the combined traffic of flow1, flow2, and flow3 does not exceed the bandwidth of the outgoing interface of P4 (denoted intf3).¶
In a certain scenario (which cannot be avoided in practical applications), the three DetNet flows arrive at P4 within the same cycle interval and need to be forwarded to P5 through intf3. Assume that all physical links have 100 Gbps bandwidth and the cycle interval is planned as 10us; about 125,000 bytes of data can then be transmitted within one interval. For each of the three flows, 125,000 bytes of data reach P4 within 10us, but no data arrives during the following 990us. Viewed at the micro level of a 10us interval, each flow's rate reaches 100 Gbps, but over a 1ms interval each flow's rate is only 1 Gbps.¶
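The macro/micro arithmetic above can be sketched as follows (a minimal illustration assuming the 100 Gbps links, 10us cycle, and 1ms flow period of this example; integer arithmetic is used to keep the figures exact):¶

```python
LINK_BPS = 100_000_000_000      # assumed physical link rate: 100 Gbps
CYCLE_US = 10                   # assumed cycle interval: 10 us
PERIOD_US = 1000                # assumed flow repetition period: 1 ms

# Bytes one cycle interval can carry on the link.
bytes_per_cycle = LINK_BPS * CYCLE_US // (8 * 1_000_000)        # 125,000

# Each flow sends one cycle's worth of data once per period.
micro_rate_bps = bytes_per_cycle * 8 * 1_000_000 // CYCLE_US    # within a cycle
macro_rate_bps = bytes_per_cycle * 8 * 1_000_000 // PERIOD_US   # averaged over 1 ms

# Three such bursts converging in the same cycle of intf3 need three
# cycles' worth of capacity, which a single cycle cannot forward.
demand_bytes = 3 * bytes_per_cycle                              # 375,000
```

The micro rate (100 Gbps within the cycle) versus the macro rate (1 Gbps over 1ms) is exactly the gap that traditional bandwidth-only reservation fails to capture.¶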
In this case, if the traffic arriving at P4 at the same time is scheduled in the same cycle of intf3 of P4, a conflict will occur in this cycle (the deterministic data that arrives cannot all be sent within the specified cycle, resulting in additional random queuing delay and thereby affecting the deterministic forwarding of the next node), and the theoretical conditions of CSQF cannot be guaranteed. Therefore, the theoretical upper bound on the end-to-end jitter of CSQF, which should be less than two cycles, cannot be achieved. Especially after multi-hop accumulation, the jitter will exceed the upper limit that the App-flow can tolerate.¶
For a small-scale deterministic network, conflicts within the domain are not very prominent, but in an LDN, multiple App-flows enter the deterministic domain from different edge devices, and the topology formed by the forwarding paths is nonlinear. After further considering factors such as injection time and differing link bandwidths, the situation becomes very complicated. For an LDN, this document presents a general scheme to avoid resource conflicts in the domain for cycle-based queuing and forwarding.¶
Note: To simplify the description, CSQF is used in the following examples.¶
In the following chapters, Section 2 gives the definitions of relevant terminology; Section 3 specifies VPFP, VPFC, and their configuration data models in detail; Section 4 describes the resource planning and reservation model, in which Section 4.1 describes the relevant principles and Section 4.2.1 describes the resource planning and reservation scheme in detail on the basis of Section 4.1. Owing to its length, the resource reservation process is detailed separately in Section 5, which gives the detailed processing flow related to resources.¶
This document uses the terms defined in [RFC8655], [RFC8938], [I-D.ietf-detnet-controller-plane-framework], and [RFC9320]. Moreover, the following terms are used in this document:¶
In an LDN, multiple intersecting VPFPs form a mesh topology. In order to meet the transmission requirements of a specific DetNet flow, one or more VPFCs need to be planned in one or more VPFPs. This clause specifies VPFP, VPFC, and their configuration data models in detail.¶
When there is a transmission requirement for a deterministic service flow, the forwarding path needs to be calculated in advance. Cycle-based queuing and forwarding capabilities are then added, and cycle-to-cycle mapping relationships are established between adjacent nodes. We further abstract each mapping relationship as a function. When the mapping functions are added to the forwarding path, a VPFP is formed.¶
Virtual Periodic Forwarding Path (VPFP):¶
A virtual forwarding path that forwards data based on cycles and the mapping functions between cycles is called a VPFP. A mapping function is established between an outgoing interface scheduling cycle of an upstream node and an outgoing interface scheduling cycle of the adjacent downstream node. The VPFP has the following characteristics:¶
The cycle in which data is sent from the upstream node, together with the mapping relationship, determines the forwarding cycle in which the data is forwarded on the outbound interface of the downstream node.¶
Taking the adjacency relationship in Figure 2 as an example, the resource description adds a function relationship description:¶
((PE1,Intf0),(P1,Intf3)): f1;¶
((P1,Intf3),(P3,Intf3)): f2;¶
((P3,Intf3),(P4,Intf2)): f3;¶
((P4,Intf2),(PE5,Intf0)): f4;¶
((P4,Intf2),(PE5,Intf1)): f5;¶
((PE2,Intf0),(P1,Intf3)): g1;¶
((PE3,Intf0),(P2,Intf2)): h1;¶
((P2,Intf2),(P3,Intf3)): h2;¶
((P3,Intf3),(P4,Intf1)): h3;¶
((P4,Intf1),(PE4,Intf0)): h4.¶
The controller plane maintains the mapping function between the scheduling cycles of the outgoing interfaces of each pair of adjacent nodes. For a given pair, there may be one candidate function or several. Once the path is determined, the function is also determined. When resource reservation fails, the physical path carrying the VPFP can be changed, or the mapping function in the VPFP can be changed, and the reserved resources can be recalculated (TBD).¶
The forwarding path carrying the VPFP is generated by the MCPE or the network administrator after calculating the path, and the mapping functions are generated by calibration after measurement. As shown in Figure 2, assuming that there are three deterministic flow paths, the VPFPs formed are:¶
VPFP1: (PE1,Intf0) f1 (P1,Intf3) f2 (P3,Intf3) f3 (P4,Intf2) f4 (PE5,Intf0)¶
VPFP2: (PE2,Intf0) g1 (P1,Intf3) f2 (P3,Intf3) f3 (P4,Intf2) f5 (PE5,Intf1)¶
VPFP3: (PE3,Intf0) h1 (P2,Intf2) h2 (P3,Intf3) h3 (P4,Intf1) h4 (PE4,Intf0)¶
Where f1~5, g1, and h1~4 are injective functions; see Section 4.2 for details.¶
After cycle-based resource reservation is completed in one or more VPFPs, one or more VPFCs are planned in those paths.¶
The same forwarding path can carry multiple VPFPs (Note: once all the mapping relationships in the path are determined, a unique VPFP is determined), and multiple VPFCs can be established in the same VPFP.¶
If the cycle mapping mode is stack mode, the VPFP parameters (see Figure 3 for the configuration interface) should be deployed to the head node of the VPFP (e.g., the ingress PE) to generate the information for directing forwarding; the specific process is beyond the scope of this document. If the cycle mapping mode is swap mode, VPFP-related information needs to be deployed to each node of the VPFP, i.e., the Ingress PE, P, and Egress PE nodes.¶
Taking the stack mode as an example, assume the VPFP for a DetNet flow is:¶
(PE1,Intf0) f1(P1,Intf3) f2(P3,Intf3) f3(P4,Intf2) f4(PE5,Intf0)¶
Assuming the successfully reserved result list is:¶
{¶
(VPFP):(intf0,Cycle0,1),¶
(VPFP):(intf0,Cycle1,1),¶
(VPFP):(intf0,Cycle2,1),¶
(VPFP):(intf0,Cycle3,1),¶
(VPFP):(intf0,Cycle4,1),¶
(VPFP):(intf0,Cycle5,1),¶
(VPFP):(intf0,Cycle6,1),¶
(VPFP):(intf0,Cycle7,1),¶
}¶
The VPFC consists of a VPFP and the resources reserved along the VPFP. When the MCPE deploys the VPFC to the head node of the VPFP, the parameters that need to be configured are summarized below.¶
vpfc¶
+-- uint16 vpfcid                  # Virtual Periodic Forwarding Channel Identifier¶
+-- uint16 vpfpid                  # Virtual Periodic Forwarding Path Identifier¶
+-- if_config[oif]                 # Outgoing interface¶
+-- uint16 cycles                  # Number of cycles involved in resource reservation¶
+-- uint8 cycleinfo[0..cycles-1]   # Cycle info¶
   +-- uint16 cycleid              # Cycle ID¶
   +-- uint16 res                  # Number of resources¶
Note: A vpfcid uniquely identifies a VPFC, and a vpfpid uniquely identifies a VPFP. A returned result list may create multiple VPFCs, which are divided according to their different VPFPs.¶
vpfp¶
+-- uint16 vpfpid              # Virtual Periodic Forwarding Path Identifier¶
+-- uint8 cycles               # Number of cycles used across all interfaces in the CSQF/TCQF domain¶
+-- policy_info[policy]        # Policy information¶
+-- pipe_info[0..cycles-1]     # The scheduling cycle pipeline corresponding to each scheduling cycle on the head node¶
   +-- uint8 hops              # Number of hops¶
   +-- map_info[0..hops-1]     # The mapping target in each pipeline is a specific scheduling cycle¶
      +-- uint8 out_cycle      # Output cycle¶
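As a non-normative illustration, the two configuration trees above might be represented as follows (Python dataclasses whose field names mirror the trees; integer widths such as uint16 are not enforced in this sketch):¶

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CycleInfo:
    cycleid: int                 # Cycle ID (uint16)
    res: int                     # Number of reserved resource units (uint16)

@dataclass
class Vpfc:
    vpfcid: int                  # Virtual Periodic Forwarding Channel Identifier
    vpfpid: int                  # Virtual Periodic Forwarding Path Identifier
    oif: str                     # Outgoing interface
    cycleinfo: List[CycleInfo] = field(default_factory=list)

@dataclass
class MapInfo:
    out_cycle: int               # Output cycle at one hop (uint8)

@dataclass
class PipeInfo:
    hops: int                    # Number of hops
    map_info: List[MapInfo] = field(default_factory=list)

@dataclass
class Vpfp:
    vpfpid: int                  # Virtual Periodic Forwarding Path Identifier
    cycles: int                  # Cycles used across interfaces in the domain
    pipe_info: List[PipeInfo] = field(default_factory=list)

# One channel reserving 1 unit in cycles 0..7 on intf0, matching the
# reserved result list shown earlier.
vpfc = Vpfc(vpfcid=1, vpfpid=1, oif="intf0",
            cycleinfo=[CycleInfo(cycleid=c, res=1) for c in range(8)])
```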
In the head node (e.g., the Ingress PE) of the VPFP, the forwarding information is generated based on the vpfp configuration; it is used for cyclic queuing and forwarding, as well as for packet encapsulation in the CSQF domain. At the same time, according to the configuration information of the VPFC, the selection of the scheduling cycle in the head node is strictly stipulated, which is used to realize a function similar to PSFP [IEEE802.1Qci].¶
With these configuration data models, the creation, deletion, and modification operations of VPFC can be achieved.¶
Establishing a periodic forwarding resource system is a complex engineering task. It is more realistic to build the system on top of the existing best-effort system. The whole process requires the cooperation of the user plane, the management/control plane, and the data plane. To show the overall framework of resource reservation, the content shown in Figure 4 is copied from [RFC9016]. The management/control plane entity (MCPE) is responsible for managing, planning, reserving, and reclaiming cyclic forwarding resources for deterministic service flows. To give a sense of the whole picture, the main planning-related processes in a resource system are listed as follows:¶
As stated in [RFC8557], it is worth investigating whether there is value in a distributed alternative without a PCE; for example, such an alternative could be a solution similar to that in [RFC3209]. But the focus of current DetNet work should be to provide a centralized method first. The solution provided in this document is a centralized one, and it makes the implementation of resource reservation in data plane devices as lightweight and stateless as possible.¶
In the remainder of this chapter, Section 4.1 first describes the basic principles, in which the measurement and calibration of Section 4.1.1 are the prerequisite for establishing the function mappings of a VPFP. Section 4.1.2 describes how the characteristics of the mapping functions are used to resolve scheduling cycle planning conflicts; combined with the theory of Section 4.1.2, Section 4.1.3 briefly introduces the overall process of resource planning. Based on the principles of Section 4.1, Section 4.2 describes the complete resource reservation scheme in detail, including the various data models involved. Owing to its length, the resource reservation process itself is elaborated separately in Section 5.¶
As shown in Figure 5, [RFC9320] gives a highly abstract timing model of the DetNet transit node. In a large-scale deterministic network, some DetNet transit nodes are implemented as distributed architectures, and the processing delay in these nodes varies widely; processing is the operation that contributes most to delay jitter. In cycle-based queuing and forwarding, the jitter introduced by the various operations needs to be fully considered so that the end-to-end transmission delay can reach a definite bound.¶
Taking CSQF as an example, before forwarding, a mapping relationship between the scheduling cycles of the outgoing interfaces of upstream and downstream nodes must be established; this is accomplished by measurement and calibration. (Note: to simplify the description, "Node A" stands for the specific interface of node A, and similarly for Node B.)¶
As shown in Figure 6, taking 8 cycles as an example, Node A and Node B are two CSQF nodes, Node A being the upstream node and Node B the downstream node. To know which cycle is being scheduled in Node B (for example, cycle 2 in Figure 6) when packets sent in a certain cycle of Node A (for example, cycle 0 in Figure 6) reach it, a measurement must be applied. The specific implementation of the measurement is beyond the scope of this document and is not described here. After the measurement is done, the forwarding cycle in Node B for these packets is decided, taking into account the processing delay variation in the device. In this example, for packets sent in cycle 0 of Node A, cycle 6 is chosen as the forwarding cycle in Node B. The packets sent in cycle 1 of Node A are assigned to cycle 7 of Node B, and so on. This is the task performed by calibration.¶
After calibration, the scheduling cycle of A and that of B have the following mapping relationship:¶
0 ----> 6¶
1 ----> 7¶
2 ----> 0¶
3 ----> 1¶
4 ----> 2¶
5 ----> 3¶
6 ----> 4¶
7 ----> 5.¶
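The calibrated mapping above is simply a cyclic shift; a minimal sketch (8 cycles and an offset of 6, as chosen by calibration in this example):¶

```python
N_CYCLES = 8   # cycles per interface in this example
OFFSET = 6     # shift chosen by calibration (cycle 0 of A -> cycle 6 of B)

def calibrated_map(x: int) -> int:
    """Map a sending cycle of Node A to the forwarding cycle of Node B."""
    return (x + OFFSET) % N_CYCLES

table = {x: calibrated_map(x) for x in range(N_CYCLES)}
# table == {0: 6, 1: 7, 2: 0, 3: 1, 4: 2, 5: 3, 6: 4, 7: 5}
```

Because the shift is modular, every cycle of A maps to a distinct cycle of B, which is exactly the injectivity property exploited in Section 4.1.2.¶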
Note 1: In terms of jitter absorption, if the jitter range is 2 (the jitter range is 3 in Figure 6), it is feasible to assign the packets sent in the 0th scheduling cycle of Node A to the queue of the 5th, 6th, or 7th scheduling cycle of Node B. So there is more than one mapping that can be calibrated.¶
As described in Section 4.1.1, after calibration is completed, a definite mapping relationship is established between the scheduling cycles of the outbound interfaces of two adjacent nodes. This relationship can be regarded as a function f: its domain is the scheduling cycle range of Node A, 0~7, and its range is the scheduling cycle range of Node B, 0~7. A further constraint is imposed on the mapping of scheduling cycles between Nodes A and B: during calibration, each scheduling cycle in A is calibrated with one and only one scheduling cycle in B. Under this constraint, the function f becomes an injective function.¶
Cycle planning is further constrained: in the same CSQF domain, all interfaces plan the same number of scheduling cycles. Under this constraint, all mapping functions have the same domain and range.¶
Note: Different interfaces of the same node can belong to different domains; cross-domain processing is beyond the scope of this document.¶
It is assumed that the calibrated mappings between the scheduling cycles of the outbound interfaces of the upstream and downstream nodes shown in Figure 7 also satisfy the injective-function property. Let the calibrated relationship between the scheduling cycles of the outbound interfaces of PE1 and P1 be the function f1, and the calibrated relationship between the scheduling cycles of the outbound interfaces of P1 and P3 be the function f2. The mapping between the scheduling cycles of the PE1 and P3 outbound interfaces then satisfies the composite function f:¶
f = f2 o f1¶
According to the property of injective function, f is also an injective function.¶
Similarly, we can get:¶
g = g2 o g1¶
h = h2 o h1¶
Both g and h are also injective functions, where g2 and h2 are the mapping relationships between the scheduling cycles of the outbound interface of P2 and the outbound interface of P3.¶
Therefore, under the above constraints, suppose the flows from PE1, PE2, and PE3 conflict at P3, that is, they are mapped to the same scheduling cycle. Let the scheduling cycles of the flows at PE1, PE2, and PE3 be a, b, and c respectively; that is, the following situation occurs:¶
The function values f(a), g(b), and h(c) turn out to be the same, which is a conflict. According to the nature of the injective function, it suffices to change any one input; for example, if a is changed to a' (a' != a), then f(a') != f(a), and since the conflict means f(a) = g(b), also f(a') != g(b). Similarly, changing the input of function g or h can eliminate the conflict.¶
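This conflict-elimination argument can be checked with a small sketch (the shift offsets below are illustrative choices, not taken from any figure):¶

```python
N = 8
f1 = lambda x: (x + 3) % N        # PE1 -> P1 (illustrative offset)
f2 = lambda x: (x + 1) % N        # P1 -> P3 (illustrative offset)
f = lambda x: f2(f1(x))           # composite f = f2 o f1, still injective

g1 = lambda x: (x + 5) % N        # PE2 -> P1 (illustrative offset)
g = lambda x: f2(g1(x))           # composite for the PE2 path

a, b = 0, 6
assert f(a) == g(b)               # conflict: both land in the same P3 cycle

a_new = 1                         # choose a different head-node cycle
assert f(a_new) != f(a)           # injectivity of f
assert f(a_new) != g(b)           # since g(b) == f(a), the conflict is gone
```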
At the same time, it is easy to draw the following conclusions:¶
For f = f2 o f1, when c != b, then f(c) != f(b) and f1(c) != f1(b).¶
Further generalizing this conclusion: suppose the ordered set <f1, f2, ..., fn> is composed of injective functions (where n belongs to N), whose domain and range are both A, A = {x | 0 <= x < k, k belongs to Z}. The composite function composed of the first i (1 <= i <= n) functions is denoted as:¶
f[i] = fi o fi-1 o ... o f1.¶
Let the proper subsets B and C of set A satisfy the following conditions: A is the union of B and C, and the intersection of B and C is the empty set. Then for any b belonging to B and any c belonging to C, f[i](b) != f[i](c).¶
When planning the usage of scheduling cycles, the scheduling cycles of the head node of the VPFP (e.g., the Ingress PE) that have already been traversed are regarded as set B, and those not yet traversed as set C. When a scheduling cycle conflict occurs where the path converges with other paths, a new scheduling cycle c is selected from C; as inputs, this new c and the already-traversed elements of set B produce different results. That is, no cycle planning conflicts can arise among cycles starting from the same head node along the VPFP currently being planned. Thus it is only necessary to check for conflicts with other converging paths; if there is none, the scheduling cycle planning succeeds.¶
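The traversed/untraversed partition property above can be verified with a quick sketch (a chain of injective shift mappings with illustrative offsets):¶

```python
N = 8
# A chain of injective shift mappings along one VPFP (illustrative offsets).
funcs = [lambda x, k=k: (x + k) % N for k in (3, 1, 6)]

def compose(i: int, x: int) -> int:
    """Apply f[i] = fi o ... o f1 to head-node cycle x."""
    for fn in funcs[:i]:
        x = fn(x)
    return x

B = {0, 1, 2}                     # head-node cycles already traversed
C = set(range(N)) - B             # cycles not yet traversed
for i in range(1, len(funcs) + 1):
    outs_B = {compose(i, b) for b in B}
    outs_C = {compose(i, c) for c in C}
    assert not outs_B & outs_C    # f[i](b) != f[i](c) for all b in B, c in C
```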
Note 1: For paths that have already established a mapping relationship, only the scheduling cycles within the path can be adjusted; there are many constraints on changing the mapping relationship of the path. Further discussion of this issue is TBD.¶
Note 2: When there are multiple mapping relationships that can be calibrated, each calibrated relationship corresponds to a function. In repeated planning attempts, a different function mapping can be used each time.¶
Under the premise of rational utilization of resources, the key to ensuring that the theoretical conditions of CSQF are satisfied is to plan for a given VPFC to be scheduled during a certain cycle of the interface of the corresponding node. In other words, the forwarding capability of the specified cycle interval on that interface is allocated to the VPFC. The common feature of cycle-based forwarding is that all data to be forwarded in a cycle interval is first buffered and then forwarded in a specific cycle interval. Given this feature, the abstract forwarding capability during a cycle can be converted into the buffer resources required for the data that can be forwarded during that cycle. The problem is thus transformed into a buffer resource reservation problem: buffer resources are reserved for the VPFCs that are allowed to be scheduled during the cycle (referred to as resource reservation for cyclic forwarding), and VPFCs without reserved buffer resources are not scheduled during the cycle.¶
According to the conclusion of Section 4.1.2, the resources in the CSQF domain can be reasonably planned. When a scheduling cycle conflict with traffic entering from other paths occurs at a convergence point or at a low-bandwidth outbound interface, change the planning cycle of the head node, then perform the cycle calculation along the VPFP and attempt the resource reservation again. Once the attempt succeeds, the conflict is eliminated.¶
At the same time, because the mapping functions along the VPFP are injective functions, we can regard the shared scheduling cycles and buffering resources of a non-head node's aggregation interface or low-bandwidth outbound interface as a common resource, and allocate this common resource to the head nodes with deterministic transmission requirements. The resources allocated on the head node, together with the VPFP, constitute the VPFC of our scheme.¶
The injective functions also strictly constrain the correspondence between head node and convergence node resources. Therefore, the head node only needs to save the resources allocated to its own node, and does not need to save the allocated resources of non-head nodes (including the sink node). A non-head node does not need to save any resource allocation state; that state is saved by the MCPE, achieving a lightweight implementation of resource reservation for non-head nodes (e.g., P nodes).¶
As long as the head node of the VPFP performs VPFC scheduling strictly according to the resources allocated to the VPFC, conflicts can be avoided on the shared transmission of a non-head node's aggregation interface or low-bandwidth outbound interface. Scheduling strictly according to the allocated resources on the head node is a key issue, which will be studied further in other documents.¶
According to [RFC8655], service flows can be aggregated and resources can be reserved for the aggregated flows. With our solution, the aggregated flows share the scheduling resources reserved on the edge nodes, and resource competition is localized. The delay jitter caused by the aggregated flows is also local, making bounded delay easier to achieve. To further optimize the jitter of the member service flows in an aggregate flow, only the scheduling resources allocated to the edge nodes of the aggregate flow need to be further refined, and this arrangement does not change the global resource allocation.¶
By separating resource planning from measurement and calibration, there is no need to consider the problem of path aggregation when performing calibration after measurement, which can greatly reduce the complexity of calibration.¶
In Section 4.1, we gave a highly abstract description of the principle and method of our resource reservation scheme. This section discusses the scheme in detail, including the quantitative representation of the forwarding resources corresponding to a scheduling cycle, and a description of the main elements involved. The detailed resource reservation process of this scheme is described in Section 5.¶
Forwarding resources are a relatively vague concept. They include not only bandwidth resources but also device storage resources. For example, the high-speed on-chip caches inside ASICs are often measured in bytes rather than bps. It is not enough to consider bandwidth only when reserving resources; we have to establish a new resource metric in bytes, or in blocks of a certain byte length; for example, 64 bytes can be one resource unit. The comprehensive capability of one cycle is measured in these resource units, covering cache capacity, cycle interval time, and interface bandwidth. In this way, the three dimensions of cycle duration, cache capacity, and physical bandwidth are reduced to one dimension, the number of resource units, which simplifies the implementation of resource reservation.¶
For example, assuming that the backbone network is uniformly divided into 10us scheduling cycles and one resource unit is 64 bytes, the data that can be transmitted in each scheduling cycle is about 7812 resource units on a 400G interface, 1953 on a 100G interface, 195 on a 10G interface, and 19 on a 1G interface. The amount of resources a scheduling cycle of an interface can provide is given by a comprehensive evaluation of device implementation specifications such as interface bandwidth and storage resources. Resource planning is done based on this quantity, which simplifies implementation.¶
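The unit arithmetic above can be sketched as follows (64-byte units and 10us cycles as assumed in this example; results round down):¶

```python
UNIT_BYTES = 64     # one resource unit, as assumed above
CYCLE_US = 10       # scheduling cycle length: 10 us

def units_per_cycle(link_bps: int) -> int:
    """Resource units one scheduling cycle can forward on a link."""
    bytes_per_cycle = link_bps * CYCLE_US // (8 * 1_000_000)
    return bytes_per_cycle // UNIT_BYTES

# Matches the figures quoted above:
# 400G -> 7812, 100G -> 1953, 10G -> 195, 1G -> 19 units per cycle.
```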
This section extends the description of the method described in Section 4.1 in conjunction with the constraints of Section 4.2.1.¶
According to the theory in Section 4.1, if scheduling cycle a of PE1 is mapped to scheduling cycle c of P1, and scheduling cycle b of PE2 is also mapped to scheduling cycle c of P1, then a cycle conflict occurs. The conflict can be resolved by adjusting scheduling cycle a or b; however, if P1 has multiple units of resources in the same scheduling cycle, resources in that cycle may be allocated to both PE1 and PE2 as needed, thereby increasing the utilization of the cycle's resources.¶
Based on the resource metric proposed in Section 4.2.1, the resources of the scheduling cycle are uniformly quantified. As shown in Figure 8, assuming the data rates of all physical links are 100Gbps and the length of the scheduling cycle is 10us, each scheduling cycle can transmit 1953 64-byte resource units. Because multiple resource units belong to one cycle, instead of explicitly judging whether a scheduling cycle conflict occurs, we judge whether the resources corresponding to the scheduling cycles on the path meet the demand. If the demand can be satisfied, the corresponding resources are reserved. Otherwise, the MCPE chooses another scheduling cycle as the input of the function, and tries the iterative calculation and resource reservation again.¶
According to the resource demand of a DetNet flow, starting from the head node of the VPFP, the functions along the path are called in sequence. First, the first value in the domain of the function associated with the head node is used as the input of that function; then each function's output is used as the input of the next function, iterating along the VPFP. At the same time, it is checked whether the resources of the cycle corresponding to each input and output value meet the demand. If they do not, the current iteration is terminated, the next value in the domain of the head node's function is used as a new input, and the process continues. If all values in the domain of the head node's function are traversed without meeting the demand, no VPFC is successfully planned for the DetNet flow. If the resources meet the demand, a VPFC is successfully planned, and the remaining resources of the corresponding scheduling cycles are updated along the planned VPFC. The controller plane records the VPFC and delivers it to the head node of the VPFP.¶
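The iterative planning loop just described might be sketched as follows. This is a simplification under stated assumptions: shift mapping functions per hop, a per-hop table of free units on the relevant outgoing interface (index 0 being the head node), and a fixed demand in resource units; all names are illustrative, not normative:¶

```python
N = 8   # scheduling cycles per interface in the domain

def plan_vpfc(funcs, resources, demand):
    """Try to plan a VPFC along one VPFP.

    funcs:     injective cycle mappings, one per hop along the VPFP.
    resources: resources[i][cycle] = free units on the i-th outgoing
               interface; len(resources) == len(funcs) + 1.
    demand:    required units per cycle.
    Returns the reserved cycle at each interface, or None on failure.
    """
    for head_cycle in range(N):                 # traverse head-node domain
        cycles = [head_cycle]
        for fn in funcs:                        # iterate the mapping chain
            cycles.append(fn(cycles[-1]))
        if all(resources[i][c] >= demand for i, c in enumerate(cycles)):
            for i, c in enumerate(cycles):      # reserve along the VPFC
                resources[i][c] -= demand
            return cycles
    return None                                 # no VPFC could be planned

funcs = [lambda x: (x + 3) % N, lambda x: (x + 1) % N]  # illustrative
resources = [[180] * N, [1900] * N, [1900] * N]
print(plan_vpfc(funcs, resources, demand=20))   # prints [0, 3, 4]
```

On failure (None), the MCPE would change the physical path or the mapping functions and retry, as described in Section 3.¶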
To optimize resource usage, multiple DetNet flows can be aggregated to share a VPFC. For example, suppose the cycle duration is 10us and 10 cycles are used; each cycle is then scheduled once every 100us. If 1 unit of resources is allocated to a VPFC for a DetNet flow that sends 1 unit of data every 1ms, then only 1/10 of the allocated resources is really used. If another DetNet flow regularly sends 1 unit of data every 3ms and has the same forwarding path as the VPFC, then this VPFC can be shared. Of course, this inevitably introduces jitter, but the jitter is bounded, imperceptible to or tolerable by most applications, and can be eliminated by other methods, which are beyond the scope of this document.¶
Scheduling cycle resource description:¶
(node, interface, scheduling cycle): (number of available resources, number of initial resources), where the number of resources is measured in uniform resource units. For example, in Figure 8, assume that the total number of cycles in the domain is uniformly 8, the cycle interval is 10us, and PE1's Intf0 is a 10Gbps interface; each cycle can then forward about 195 units of resources. To allow for factors that have not been considered, some capacity needs to be held in reserve, so the number of resources can be initialized as 180. The resource numbers of PE1's Intf0 are initialized as follows:¶
(PE1,Intf0,Cycle0): (180,180);¶
(PE1,Intf0,Cycle1): (180,180);¶
(PE1,Intf0,Cycle2): (180,180);¶
(PE1,Intf0,Cycle3): (180,180);¶
(PE1,Intf0,Cycle4): (180,180);¶
(PE1,Intf0,Cycle5): (180,180);¶
(PE1,Intf0,Cycle6): (180,180);¶
(PE1,Intf0,Cycle7): (180,180);¶
P1's Intf3 is a 100Gbps interface, and each cycle can forward about 1953 units of resources. To allow for factors that have not been considered, some capacity needs to be held in reserve, so the number of resources can be initialized as 1900. The resource numbers of P1's Intf3 are initialized as follows:¶
(P1,Intf3,Cycle0): (1900, 1900);¶
(P1,Intf3,Cycle1): (1900, 1900);¶
(P1,Intf3,Cycle2): (1900, 1900);¶
(P1,Intf3,Cycle3): (1900, 1900);¶
(P1,Intf3,Cycle4): (1900, 1900);¶
(P1,Intf3,Cycle5): (1900, 1900);¶
(P1,Intf3,Cycle6): (1900, 1900);¶
(P1,Intf3,Cycle7): (1900, 1900);¶
...¶
The controller plane maintains the resource information of the scheduling cycle of the interface of each node, and reduces the number of available resources in the corresponding scheduling cycle of the interface of the corresponding node after the DetNet flow path resource reservation is performed.¶
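The resource table and its update step can be sketched, for illustration, as follows (all names are assumptions, not defined by this document):¶

```python
# Illustrative controller-plane resource table.
# Key: (node, interface, cycle) -> [available, initial] resource units.
resources = {("PE1", "Intf0", c): [180, 180] for c in range(8)}
resources.update({("P1", "Intf3", c): [1900, 1900] for c in range(8)})

def reserve(node, intf, cycle, units):
    """Reduce the available resources after a reservation succeeds."""
    entry = resources[(node, intf, cycle)]
    if entry[0] < units:
        raise ValueError("insufficient resources in this cycle")
    entry[0] -= units

# Reserving 10 units in Cycle0 of PE1's Intf0 leaves (170, 180).
reserve("PE1", "Intf0", 0, 10)
```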
Assuming that each node is divided into n equal-length scheduling cycles, after measurement and calibration, there are n function mapping relationships:¶
y = (x+k) mod n, where the domain of definition is {x | 0<=x<n, x belongs to N, n belongs to N}, and the range of the constant k is {k | 0<=k<n, k belongs to N, n belongs to N and n>3}. (In principle, the number of scheduling cycles and of corresponding CSQF queues could be 3 or less, but in large-scale deterministic networks such a value is unrealistic and is not conducive to resource planning.)¶
Taking the network in Figure 7 as an example, assume that each node is divided into 8 (n=8) equal-length scheduling cycles. After measurement, the function mapping relationships (such as f1~5, g1, and h1~4) are obtained by calibration as one of the following 8 functions:¶
y = (x+0) mod 8, {x | 0<=x<8, x belongs to N};¶
y = (x+1) mod 8, {x | 0<=x<8, x belongs to N};¶
y = (x+2) mod 8, {x | 0<=x<8, x belongs to N};¶
y = (x+3) mod 8, {x | 0<=x<8, x belongs to N};¶
y = (x+4) mod 8, {x | 0<=x<8, x belongs to N};¶
y = (x+5) mod 8, {x | 0<=x<8, x belongs to N};¶
y = (x+6) mod 8, {x | 0<=x<8, x belongs to N};¶
y = (x+7) mod 8, {x | 0<=x<8, x belongs to N};¶
As a specific example, these functions can be finally decided as:¶
f1(x) = (x+3) mod 8, {x | 0<=x<8, x belongs to N};¶
f2(x) = (x+1) mod 8, {x | 0<=x<8, x belongs to N};¶
f3(x) = (x+2) mod 8, {x | 0<=x<8, x belongs to N};¶
g1(x) = (x+4) mod 8, {x | 0<=x<8, x belongs to N};¶
g2(x) = (x+5) mod 8, {x | 0<=x<8, x belongs to N};¶
g3(x) = (x+2) mod 8, {x | 0<=x<8, x belongs to N};¶
h1(x) = (x+6) mod 8, {x | 0<=x<8, x belongs to N};¶
h2(x) = (x+7) mod 8, {x | 0<=x<8, x belongs to N};¶
h3(x) = (x+2) mod 8, {x | 0<=x<8, x belongs to N};¶
...¶
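Each calibrated mapping is fully determined by its offset k, so a small factory function can produce them; the composition of consecutive hops then gives the end-to-end cycle shift. The helper name make_cycle_map is an assumption for illustration:¶

```python
def make_cycle_map(k, n=8):
    """Per-hop cycle mapping y = (x + k) mod n, as calibrated above."""
    return lambda x: (x + k) % n

# f1 and f2 as decided in the example above.
f1 = make_cycle_map(3)   # f1(x) = (x+3) mod 8
f2 = make_cycle_map(1)   # f2(x) = (x+1) mod 8

# Data sent in cycle 0 at the head node arrives in cycle 4 two hops later.
assert f2(f1(0)) == 4
```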
The controller plane saves the mapping functions, which are one of the components used to describe a VPFP.¶
The resource demand here is the input for reservation processing after conversion of the app-flow's requirements (see [RFC9016]), not the original transmission requirement of the app-flow.¶
Resource demands are described as a list of requirements:¶
{sub-demand 1, sub-demand 2, ...}¶
Each sub-demand looks like:¶
(Virtual Periodic Forwarding Path): (Outbound Interface, Scheduling Cycle, Number of Resource Units Required, Minimum Allocation Granularity Per Cycle).¶
The components of the above requirements are described as follows:¶
For app-flows with a strict jitter upper bound, resources in specified cycles may be allocated. For example, a certain DetNet flow has a transmission requirement of one resource unit, but it must be transmitted promptly with jitter of less than 2 scheduling cycles. CSQF can meet this requirement: a unit of resources is allocated from each cycle along the VPFP, so that no matter when the data of the service flow arrives, a unit of resources is always ready.¶
In order to facilitate the following description, the demand list is further symbolized, and the resource demand is described as the demand list DemandList:¶
{SubDemand1, SubDemand2, ...}¶
SubDemand for each specified cycle is in the form of:¶
(VPFP):(oif,cycle,res,min);¶
That is, SubDemand1 is set to "(VPFP): (oif1, cycle1, res1, min1)".¶
The sub-demand for each non-specified cycle is in the form of:¶
(VPFP): (oif, InvalidCycle, res, min);¶
That is, SubDemand1 is set to "(VPFP): (oif1, InvalidCycle, res1, min1)".¶
where VPFP is described in Section 3.3, for example, VPFP is:¶
(PE1,Intf0) f1(P1,Intf3) f2(P3,Intf3) f3(P4,Intf2) f4(PE5,Intf0)¶
For example, for a flow, its path is the above VPFP, and for its specified cycle allocation, its resource demand list DemandList can be expressed as:¶
{¶
(VPFP):(intf0,Cycle0,1,1),¶
(VPFP):(intf0,Cycle1,1,1),¶
(VPFP):(intf0,Cycle2,1,1),¶
(VPFP):(intf0,Cycle3,1,1),¶
(VPFP):(intf0,Cycle4,1,1),¶
(VPFP):(intf0,Cycle5,1,1),¶
(VPFP):(intf0,Cycle6,1,1),¶
(VPFP):(intf0,Cycle7,1,1)¶
}¶
The above allocation indicates that the resources of cycles 0 to 7 are allocated to the flow, one unit of resources from each cycle. If the scheduling cycle is not specified, the demand list DemandList can be expressed as:¶
{(VPFP):(intf0, InvalidCycle, 10, 2)}, where InvalidCycle is the invalid cycle, defined by the implementation.¶
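The two DemandList forms above can be encoded, for illustration, as simple tuples; the names and the InvalidCycle sentinel value are assumptions, since the draft leaves them to the implementation:¶

```python
# Hypothetical encoding of a sub-demand: (VPFP, oif, cycle, res, min).
INVALID_CYCLE = -1  # implementation-defined sentinel for "no cycle"

# Specified-cycle demand: one unit from each of cycles 0..7.
specified = [("VPFP", "intf0", c, 1, 1) for c in range(8)]

# Non-specified-cycle demand: 10 units, minimum 2 units per cycle.
unspecified = [("VPFP", "intf0", INVALID_CYCLE, 10, 2)]
```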
The format of the resource allocation result for a specified scheduling cycle is exactly the same as that for a non-specified cycle; it is described as a result list:¶
{subresult 1, subresult 2, ...}¶
Each sub-result looks like:¶
(path information): (outbound interface, scheduling cycle, number of resource units).¶
In order to facilitate the following description, the resource allocation result list is further symbolized, and the resource allocation result is the result list ResultList:¶
{SubResult1, SubResult2, ...}¶
SubResult of each specified cycle sub-requirement is as follows:¶
(VPFP): (oif, cycle, res);¶
That is, SubResult1 is set to (VPFP): (oif1, cycle1, res1).¶
where the VPFP is described in Section 3.3. For example, the VPFP is:¶
(PE1,Intf0) f1(P1,Intf3) f2(P3,Intf3) f3(P4,Intf2) f4(PE5,Intf0)¶
List of results after the specified cycle reservation method is successful:¶
{¶
(VPFP):(intf0,Cycle0,1),¶
(VPFP):(intf0,Cycle1,1),¶
(VPFP):(intf0,Cycle2,1),¶
(VPFP):(intf0,Cycle3,1),¶
(VPFP):(intf0,Cycle4,1),¶
(VPFP):(intf0,Cycle5,1),¶
(VPFP):(intf0,Cycle6,1),¶
(VPFP):(intf0,Cycle7,1)¶
}¶
After the reservation of non-specified cycle resources is successful, the resulting list contains the actually reserved cycles and their corresponding resources. Suppose the requirements of Flow1 are as follows:¶
{(VPFP):(intf0, InvalidCycle, 10, 2)}¶
After the allocation is successful, the result list may be:¶
{(VPFP):(intf0,Cycle0,10)}¶
It may also be as follows: when one cycle cannot meet the transmission demand, resources are allocated from multiple cycles:¶
{¶
(VPFP): (intf0, Cycle0, 5),¶
(VPFP): (intf0, Cycle1, 5)¶
}¶
Note: When resources are allocated across multiple scheduling cycles, ensure that the resources allocated in each cycle can carry a full packet of the service.¶
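A minimal sketch of such a multi-cycle allocation, assuming the outbound interface's per-cycle availability is known and min_gran is the per-cycle minimum granularity (the function name is an assumption):¶

```python
# Hedged sketch: split a non-specified-cycle demand across cycles,
# honoring the per-cycle minimum granularity so every cycle that is
# used can still carry a full packet.
def allocate_unspecified(available, res, min_gran):
    """available: free units per cycle on the outbound interface.
    Returns {cycle: units} summing to res, or None on failure."""
    result, remaining = {}, res
    for cycle, free in enumerate(available):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take < min_gran:          # too small to carry a full packet
            continue
        result[cycle] = take
        remaining -= take
    return result if remaining == 0 else None
```

With the Flow1 demand above (10 units, minimum 2 per cycle), one cycle with 10 free units yields {0: 10}, while two cycles with 5 free units each yield {0: 5, 1: 5}, matching the two result lists shown.¶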
After the resources are successfully reserved, the MCPE needs to record the VPFC planned for the DetNet flow, which will later be used for VPFC recycling, modification, and so on. Since the topology may change, resource reclamation cannot rely on topology information. Therefore, it is necessary to save the VPFC allocated on the PE on the controller plane.¶
This section gives a detailed resource-related processing flow. The complex issue of resource recycling is also briefly covered, along with discussions of unresolved issues related to resource reservation. These discussions do not prescribe to solution developers how forwarding resource recycling should be done, but point out that these issues should not be ignored in an implementation.¶
For simplicity, this solution is based on the existing best-effort forwarding mechanism with some extensions. In terms of network topology and path planning, it directly inherits current implementations. For example, collecting topology through IGP and BGP-LS, measuring inter-node delay through NQA or TWAMP, collecting link delay through NETCONF, and planning paths that meet application delay requirements based on CSPF are all existing technologies. The specific implementation is beyond the scope of this document.¶
In order to implement this solution, some new functions must be added on top of the existing implementation, including establishing a resource database in the periodic forwarding domain. The database needs to include the associations among interfaces, cycles, and forwarding resources, as well as the mapping relationship between the outgoing interfaces of upstream and downstream nodes. For reference, a newly added process can be:¶
When a deterministic service flow session needs to be established, a request is sent to the MCPE. The MCPE selects a path and allocates resources according to the flow characteristics. The overall process is as follows:¶
After a VPFC is established, it is scheduled based on its buffer resources in the scheduling cycles of the VPFP head node, so that conflicts of periodic data forwarding do not occur in the CSQF domain and non-deterministic scheduling in the domain is avoided.¶
Assuming the format of the path information calculated by MCPE is as described in Section 3.1, then VPFP1 is:¶
(PE1,Intf0) f1(P1,Intf3) f2(P3,Intf3) f3(P4,Intf2) f4(PE5,Intf0)¶
The resource demand list for specified cycle is:¶
{¶
(VPFP1):(intf0,Cycle0,1,1),¶
(VPFP1):(intf0,Cycle1,1,1),¶
(VPFP1):(intf0,Cycle2,1,1),¶
(VPFP1):(intf0,Cycle3,1,1),¶
(VPFP1):(intf0,Cycle4,1,1),¶
(VPFP1):(intf0,Cycle5,1,1),¶
(VPFP1):(intf0,Cycle6,1,1),¶
(VPFP1):(intf0,Cycle7,1,1)¶
}¶
For convenience, the node, the outgoing interface, and the available resources are abbreviated as cn (Current Node), ci, and ar, respectively, so that cn.ci.ar[c] denotes the available resources in cycle c of the outgoing interface of the current node. For the above resource demands with a specified scheduling cycle, perform the following resource reservation calculation and reservation processing:¶
Assuming the format of the path calculated by MCPE is as described in Section 3.1, and VPFP1 is:¶
(PE1,Intf0) f1(P1,Intf3) f2(P3,Intf3) f3(P4,Intf2) f4(PE5,Intf0),¶
and VPFP2 is:¶
(PE2,Intf0) g1(P1,Intf3) f2(P3,Intf3) f3(P4,Intf2) f5(PE5,Intf1)¶
The resource demand list for unspecified cycle is:¶
{¶
(VPFP1):(intf0, InvalidCycle, 10,2);¶
(VPFP2):(intf0, InvalidCycle, 8,2)¶
}¶
where InvalidCycle is the invalid cycle, whose value is defined by the implementation.¶
As before, cn.ci.ar[c] denotes the available resources in cycle c of the outgoing interface of the current node. For the above resource demands with a non-specified scheduling cycle, perform the following resource reservation calculation and reservation processing:¶
After the resource reservation calculation, the resource reservation is executed. The MCPE traverses the ResultList and performs the following resource reservation operations:¶
For a PREOF implementation, each resource reservation demand on a VPFP forms a sub-demand (see Section 4.2.5). Multiple sub-demands form a demand list for resource reservation calculation and reservation (see Section 4.2.2 and Section 4.2.3). For example, suppose there is a deterministic service flow that requires two member paths to form a compound path to increase reliability. One of the member paths is VPFP1:¶
(PE1,Intf0) f1(P1,Intf3)f2(P3,Intf3) f3(P4,Intf2) f4(PE5,Intf0)¶
Another Member Path is VPFP2:¶
(PE1,Intf1) g1(P2,Intf3) f2(P5,Intf3) f3(P6,Intf2) f5(PE5,Intf0)¶
In this example, the DetNet flow is injected at PE1 and replicated on PE1. The original flow and the copy are sent from Intf0 and Intf1 respectively, are finally merged on PE5, and the merged data flows out from Intf0 of PE5 after processing. The bandwidth requirement of this service flow is 10 resource units. Due to the multiple paths, the jitter caused by unequal path lengths is greater than the jitter caused by the access PE scheduling cycle; therefore, for the PREOF deployment, the non-specified-cycle resource reservation method is more practical. Assuming the resource demand of the service flow is 10 resource units and the minimum granularity of resource allocation in each cycle is 2 resource units, the following non-specified-cycle resource demand list is formed:¶
{¶
(VPFP1):(intf0, InvalidCycle, 10,2);¶
(VPFP2):(intf0, InvalidCycle, 8,2)¶
}¶
where InvalidCycle is the invalid cycle, whose value is defined by the implementation.¶
When the bandwidth demand of a service flow increases, convert the newly added bandwidth demand into resource demand to form the demand list described in Section 4.2.5, and execute the combined processing flow of Section 4.2.1 and Section 4.2.4 or Section 4.2.2 and Section 4.2.4.¶
For a single path change, the MCPE recycles the old path resources and reserves demanded resources along the new path.¶
For the PREOF implementation, the process for one VPFP change is the same as the process for a single path change.¶
Resource recycling is a key issue, and the recycling process is relatively complex. In an LDN, resources that have been allocated may, for various reasons, no longer be used. If they are not recycled, resource "leakage" will occur, reducing the effective utilization of the network.¶
The reasons that may trigger resource recycling include:¶
As a common resource, scheduling cycle resources should be correlated with the OAM module. When OAM detects a failure or abnormality, recycling of the scheduling cycle resources should be triggered. Therefore, scheduling cycle resource recovery is also a part of the OAM that needs to be enhanced.¶
The security considerations related to resource reservation are the same as those described in [I-D.ietf-detnet-controller-plane-framework]. In addition, it is necessary to handle the errors mentioned in [IEEE802.1Qci], such as exceeding the maximum SDU size. Such processing includes discarding and counting the offending packets, and is usually implemented on the forwarding plane.¶
This document makes no IANA requests.¶
The authors express their appreciation and gratitude to Min Liu for the review and helpful comments.¶
The editor wishes to thank and acknowledge the following authors for contributing text to this document.¶
Zhou Lei, New H3C Technologies, 100094, Email: zhou.leiH@h3c.com¶
Zhu Shiyin, New H3C Technologies, 100094, Email: zhushiyin@h3c.com¶
Cheng Zuopin, New H3C Technologies, 100094, Email: czp@h3c.com¶
Pan Ning, New H3C Technologies, 100094, Email: panning@h3c.com¶
Xu Shenchao, New H3C Technologies, 100094, Email: xushenchao@h3c.com¶
Chen Xusheng, New H3C Technologies, 100094, Email: cxs@h3c.com¶
Wu Pin, New H3C Technologies, 100094, Email: wupin@h3c.com¶
Chu Jun, New H3C Technologies, 100094, Email: chu.jun@h3c.com¶
Wei Wang, New H3C Technologies, 100094, Email: david_wang@h3c.com¶
Liu Xinmin, New H3C Technologies, 100094, Email: liuxinmin@h3c.com¶