Reported by Steve Casner/USC-ISI

Minutes of the Audio/Video Transport Working Group (AVT)

Overview

At the previous AVT meeting in Toronto, version 2 of the Real-time Transport Protocol (RTP) was presented as documented in the July version (-05) of the specification. It was agreed to proceed with this version of RTP as soon as the missing example algorithms and the ``to be determined'' points in the specification were completed. In the interim, the authors have prepared a new draft (-06) filling in these items and introducing a few small changes to address items missed before.

At this meeting, Steve Casner presented a report on the new draft and there were no objections to these changes. However, there was a surprising amount of debate about the jitter parameter in the Reception Report, which had not changed except that the algorithm had been defined. As a result, a second session was scheduled so that the planned presentations could be completed. During the second session, a compromise solution was devised: the jitter field remains unchanged, but packet loss will be reported as both cumulative and short term, as described below. With this issue settled, the draft editing will be completed and the draft will be submitted for Area Directorate review and IESG Last Call as a Proposed Standard.

An interesting aspect of this meeting was that for the first time we heard reports on implementations of RTP version 2, as outlined below. There was also a presentation on the new draft Packet Format for Encapsulation of MPEG in RTP.

The slides for the presentations are available from ftp://ftp.isi.edu/mbone/avt/sanjose-dec94/, as is the file transcript.94dec, a more complete report on the meeting including a rough transcript.

Changes in RTP Since July Draft

The two primary items left ``to be determined'' in the July RTP draft (-05) were the algorithm for calculating the RTCP reception report interval based on the observed session size and the algorithm for the ``jitter'' measure in the reception report. Algorithms for both of these have been included in the -06 draft. However, in the rush to meet the cutoff for Internet-Draft submissions before the IETF meeting, a couple of details were left uncorrected in the report interval calculation algorithm. Also, two jitter algorithms were under discussion; what is in -06 is the algorithm that Henning Schulzrinne has been using in Nevot. Instead, we have decided to use the algorithm that Steve McCanne and Van Jacobson have implemented in vic because it is simpler and a more straightforward first-order estimator. See the discussion in the next section.

There were several additional small changes, either from agreements at the Toronto meeting, or to fix problems discovered in the interim:

o As agreed in Toronto, the draft now specifies that the data and control ports are to be an even/odd pair.

o The group decided not to partition the RTCP packet type space for profile- and payload-specific definitions because there is a problem with synchronization between the control and data streams. Instead, experiments with new packet types can use the APP type, and registration of successful types with IANA is encouraged.

o It was unclear in -05 whether the numeric type values assigned to RTCP packet types began with 0 or 1. In a note preceding this meeting, it was proposed that the RTCP packets be assigned type values 201-205 to enhance the probability of successful header validation or invalidation when comparing RTP versus RTCP which might be sent on the wrong port, or for comparison against some random or incorrectly decrypted packet. No strong objections to renumbering were given.
There was some discussion of what values should be chosen, with 224-228 being another suggestion, but that is closer to all-1s, which should be avoided just as 0 should. After more thought subsequent to the meeting, Casner suggests 200-204 rather than 201-205 so that the SR/RR pair differ in only one bit, which makes the check simpler. Also, following a suggestion from Wieland Holfelder, we will reserve the two payload type values in the RTP data packet header that would correspond to the low 7 bits of the RTCP packet types SR and RR. Since any stack of RTCP packets is to begin with SR or RR, only these need to be reserved.

o In the -05 draft, most length fields had been changed to be zero-based, but the SDES item length was missed and is now changed.

o Three new SDES items were added: PHONE, TOOL, PRIV.

o A one-octet length field was prefixed to the optional BYE reason string since its length was not defined before.

Note that the changes in RTCP type values, SDES item length, and BYE reason length introduce incompatibilities with previous drafts.

The group also agreed on three additional points where the RTP specification will be made more specific:

o It will be specified that SR rather than RR will be sent if data was transmitted during the last interval or the previous one. This provides some redundancy for loss of the last SR.

o When no data has been heard from any source during a reporting interval, the receiver should still send an RR packet containing zero reception reports rather than omit the RR. This is so that the RTCP packet stack always begins with SR or RR.

o In order to calculate the RTP timestamp to go in the SR packet, and in order to calculate jitter, it is necessary that the clock from which RTP timestamps are derived be monotonic and linear in time. Note that this refers to the clock, not the sequence of timestamps generated. In particular, it does not preclude the sending of timestamps out of order in the MPEG encoding.
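
As a rough illustration of the renumbering rationale above, the following Python sketch checks the properties claimed for the proposed values 200-204. The constant and function names are hypothetical, and the assignment of SR to 200 and RR to 201 is an assumption inferred from the remark that the SR/RR pair should differ in only one bit; the minutes do not list the individual values.

```python
# Hypothetical constants: the minutes propose 200-204 for the RTCP packet
# types; SR=200 and RR=201 is ASSUMED from the "SR/RR pair" remark.
RTCP_SR = 200
RTCP_RR = 201


def differing_bits(a, b):
    """Count the bit positions in which two octets differ."""
    return bin((a ^ b) & 0xFF).count("1")


def starts_rtcp_stack(packet_type):
    """True if the octet is SR or RR.

    Because the two values differ only in the lowest bit, a single
    mask-and-compare covers both.
    """
    return (packet_type & 0xFE) == (RTCP_SR & 0xFE)


def reserved_payload_types():
    """Low 7 bits of SR and RR: the RTP payload types to be reserved."""
    return (RTCP_SR & 0x7F, RTCP_RR & 0x7F)


if __name__ == "__main__":
    print(differing_bits(200, 201))   # 1  (proposed 200-204 numbering)
    print(differing_bits(201, 202))   # 2  (earlier 201-205 numbering)
    print(reserved_payload_types())   # (72, 73)
```

Under these assumptions, the first two values of the 201-205 numbering differ in two bits, while 200 and 201 differ in one, which is what allows the single masked comparison in starts_rtcp_stack().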

Discussion of the Jitter Algorithm

The reception report in the RTCP SR/RR packet reports packet loss and jitter. The algorithm that will be specified in the RTP draft for calculating the jitter parameter is based on the difference in packet spacing at the receiver compared to the sender, or equivalently, the difference in relative transit times, for a pair of packets:

   D(i,j) = (Rj - Ri) - (Sj - Si) = (Rj - Sj) - (Ri - Si)

Here S is the RTP timestamp from the packet, and R is the time of arrival in RTP timestamp units. Jitter is defined to be the mean deviation (smoothed absolute value) of this difference:

   J = J + (|D(i-1,i)| - J)/16

Two issues regarding the jitter parameter were debated in this meeting:

1. Whether this algorithm, and in particular the gain parameter of 1/16, was the correct choice, or whether the algorithm should be left to be profile specific. Some went even further to suggest that the jitter parameter should be relegated to a profile-specific section of the reception report or left out entirely since its usefulness has not been demonstrated yet.

2. Whether a short-term packet loss measure, useful for feedback control, should be reported instead of or in addition to the jitter measure. In large sessions, the requirement to keep state on all the receivers to take differences between reports could become a problem. Furthermore, the long interval between reports could mean that only one report is received from some receivers.

On the first issue, Van Jacobson emphasized that the jitter measure is for network diagnostic purposes as well as for algorithms that adapt to the behavior of the network. Since this requirement is common across all applications, we want to allow profile-independent monitors to be able to interpret the jitter numbers, and therefore the jitter parameter cannot be in a profile-dependent section of the report.
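
The estimator defined at the start of this section can be sketched in Python as follows. This is a minimal sketch, not the code used in vic: the class and method names are hypothetical, floating-point state is used for readability, timestamp wraparound is ignored, and starting J at zero and skipping the first packet (which has no predecessor) are assumptions not stated in these minutes.

```python
class JitterEstimator:
    """Running mean deviation of the transit-time difference D(i-1,i)."""

    GAIN = 1.0 / 16.0  # gain parameter from the formula in the text

    def __init__(self):
        self.jitter = 0.0          # J, in RTP timestamp units (assumed start)
        self._prev_transit = None  # R(i-1) - S(i-1); None before first packet

    def update(self, rtp_timestamp, arrival):
        """Feed one packet and return the updated J.

        rtp_timestamp is S, arrival is R; both must be in the same
        RTP timestamp units.
        """
        transit = arrival - rtp_timestamp              # Ri - Si
        if self._prev_transit is not None:
            d = transit - self._prev_transit           # D(i-1,i)
            self.jitter += self.GAIN * (abs(d) - self.jitter)
        self._prev_transit = transit
        return self.jitter


if __name__ == "__main__":
    est = JitterEstimator()
    # (S, R) pairs: packets sent 160 units apart, the third arrives 16 late.
    for s, r in [(0, 100), (160, 260), (320, 436), (480, 580)]:
        print(est.update(s, r))
    # prints 0.0, 0.0, 1.0, 1.9375
```

Only the previous transit time and J itself are kept per source, which is part of what makes the first-order form inexpensive for a receiver to maintain.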

The usefulness of the jitter parameter has not been proven, but the same is true for all the other parameters in the reception report. On the other hand, experience with the MBone has shown a pressing need for mechanisms to monitor distribution, and getting reports from the participants seems to be the only practical means. Observations of the local statistics in the vat program for packet loss and playout time variation, which is derived from the jitter calculation, have shown a strong correlation with the signal quality and establish a reasonable basis for inclusion of these statistics in the reception report. Packet loss tracks persistent congestion while jitter tracks transient congestion. If our best guess turns out to be wrong with more experience, we can use the report extension mechanism to test additional information, and once we have a much better guess we can field RTP version 3 with a revised report format.

Furthermore, to allow profile-independent monitors to make valid interpretations of reports coming from different implementations, we must also specify the algorithm and its parameters as part of the main RTP protocol. This algorithm is the optimal first-order estimator and the parameter 1/16 is the optimal noise power reduction ratio for situations where there is no model of the system.

To address the second issue, Ron Frederick suggested a compromise that was accepted by the group as a whole. The cumulative number of packets received will be replaced by the cumulative number of packets lost (calculated by the receiver as ``packets expected'' minus ``packets received''). Since this number is typically around two orders of magnitude less than the number of packets expected, a comparable range will be maintained if the packets lost field is reduced from 32 to 24 bits. The top 8 bits will then be used to carry a relative measure of packet loss that provides short-term information from a single report packet.
This will be expressed as an 8-bit fraction of packets lost during the last reporting interval.

A companion change was made to allow correlation between the single reception reports from multiple recipients: the ``cumulative number of packets expected'' is replaced by ``extended last sequence number received.'' The difference between these two values is only that the initial sequence number received is subtracted from the latter to calculate the former. Not subtracting the initial sequence number means that the ratio of the two words above will no longer produce an accurate overall loss rate. However, an accurate calculation of the loss rate for nearly the full session is possible by taking the difference in these fields between the first and last reception reports from a particular receiver, and then calculating the ratio.

Reports on RTP Version 2 Implementations

Two presentations were given on implementations of RTP version 2 in video tools.

Steve McCanne from LBL reported that the implementation of RTP in vic was mostly straightforward; vic is producing reception reports per the -06 specification, but not yet analyzing them. Sources for vic were released before the IETF meeting. Both nv and ``Robust H.261'' encodings are implemented, but Steve identified some problems with the current specification for H.261 fragmentation. He proposed to make macroblocks the unit of processing by putting enough state information into the header so each packet can be processed independently.

Frank Kastenholz from FTP Software presented Loki, a new payload format for RTP to carry the video formats of ``Video for Windows'' targeted at the PC/Windows environment. Processing load is shifted to the transmitter where possible because transmitters are fewer and can run on more powerful machines, whereas receivers may be slow machines such as 286s. The protocol includes some additional application-specific RTCP control packets, including some that are sent via unicast to a source.
This will not work through RTP translators since RTPv2 no longer has the ``reverse control'' mechanism, so this is an issue to study. The Loki specification will be available as an Internet-Draft, and the implementation will be available for anonymous FTP.

New Internet-Drafts on Video Payload Formats

Three new Internet-Drafts specifying how to encapsulate Cell-B, JPEG and MPEG video in RTP were posted before the IETF meeting. They are:

   draft-ietf-avt-cellb-profile-02.txt
   draft-ietf-avt-jpeg-00.txt
   draft-hoffman-rtp-mpeg-encap-01.txt

Gerard Fernando from Sun gave a presentation on the MPEG draft. Since Don Hoffman has made presentations on MPEG at previous AVT meetings, Gerard concentrated on the RTP aspects. Sun has implemented the MPEG Elementary Streams encapsulation, which is the second of two defined in the specification and the one targeted for use over the Internet. For this encapsulation, the header is now always 32 bits rather than a minimum of 16 with an optional second 16, following the recommendation made at the July AVT meeting in Toronto.

Gerard brought up one scenario of concern: there will be cases where MPEG signals are received from satellite transmissions in MPEG2 Transport Systems (MTS) format, which does not provide slice/macroblock fragmentation information, and then retransmitted over the Internet. How can this be made more robust to packet loss? Van Jacobson suggested that it is not very expensive to parse the stream to find the macroblock boundaries. This would allow translation to the MPEG ES encapsulation format.

Future Activities

The RTP specification will be edited to produce a -07 draft incorporating the changes outlined above along with additional explanatory text for sections that readers have found unclear, for example, on how to use extension mechanisms. The draft will then be submitted for Area Directorate review and IESG Last Call as a Proposed Standard. Future working group tasks and meetings will be considered as needs arise.