From mary.ietf.barnes@gmail.com Mon Aug 1 14:18:30 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D127821F8CD7 for ; Mon, 1 Aug 2011 14:18:30 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -103.482 X-Spam-Level: X-Spam-Status: No, score=-103.482 tagged_above=-999 required=5 tests=[AWL=0.116, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 1-1P0CkEX9ge for ; Mon, 1 Aug 2011 14:18:29 -0700 (PDT) Received: from mail-vw0-f44.google.com (mail-vw0-f44.google.com [209.85.212.44]) by ietfa.amsl.com (Postfix) with ESMTP id AC7C921F8C74 for ; Mon, 1 Aug 2011 14:18:29 -0700 (PDT) Received: by vws12 with SMTP id 12so5741359vws.31 for ; Mon, 01 Aug 2011 14:18:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; bh=bg9p2h5+/NFSC3IVMkclZfALz0ByTkUDyP6k5BfItLw=; b=DJAx4fBX80kzI8B6bJs6PwQnvJtUdpCqoBJJ2/d0iPGWsdywDqqPHaVjTTQt8XlHty blQx61jGb8MkUl2vWjpEGyLwOgzmU0S4YpEzMyRijlv8Xj7ArMEzZNI1S5LfdvTA+OBZ 5zYKsym87wxBZw4YqZbVNgUJ3fSSnJySdBbag= MIME-Version: 1.0 Received: by 10.52.21.65 with SMTP id t1mr450194vde.183.1312233515384; Mon, 01 Aug 2011 14:18:35 -0700 (PDT) Received: by 10.52.167.34 with HTTP; Mon, 1 Aug 2011 14:18:35 -0700 (PDT) Date: Mon, 1 Aug 2011 16:18:35 -0500 Message-ID: From: Mary Barnes To: CLUE Content-Type: multipart/alternative; boundary=20cf307d05a68bfed404a9782cdd Subject: [clue] Doodle poll for CLUE virtual Interim Meeting - Poll closes Thursday, August 4th, 5pm Central X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 01 Aug 2011 21:18:30 -0000

Hi all,

Per discussion during the f2f meeting last week, an interim meeting is being planned in a few weeks' time so that we can finish the discussion of the framework document, Brian Baldino's section on examples in particular. I have set up a doodle poll to find the best time for the majority: http://www.doodle.com/vmhdczdcrc7ck2k5

To determine the time for your timezone, please use the Timezone feature in Doodle (I set the times using Central time). The poll is Yes/Maybe/No, so please consider whether you have any flexibility in your schedule in responding, in case there is no clear majority for a specific day/time. Per the subject, we need to get this arranged ASAP, so the poll closes on Thursday, August 4th at 5pm Central time.

Regards,
Mary
CLUE WG co-chair

From Mark.Duckworth@polycom.com Fri Aug 5 14:02:10 2011 From: "Duckworth, Mark" To: clue@ietf.org Date: Fri, 5 Aug 2011 14:02:19 -0700 Message-ID: <44C6B6B2D0CF424AA90B6055548D7A61AE9B48AD@CRPMBOXPRD01.polycom.com> Subject: [clue] continuing "layout" discussion

I'd like to continue the discussion about layout and rendering issues. There are many separate but related things involved.
I want to break it down into separate topics, and see how the topics are related to each other. And then we can discuss what CLUE needs to deal with and what is not in scope.

I don't know if I'm using the best terms for each topic. If not, please suggest better terms. My use of the term "layout" here is not consistent with draft-wenger-clue-definitions-01, because I don't limit it to the rendering side. But my use of the terms "render" and "source selection" is consistent with that draft.

1 - video layout composed arrangement within a stream - when multiple video sources are composed into one stream, they are arranged in some way. Typical examples are 2x2 grid, 3x3 grid, 1+5 (1 large plus 5 small), 1+PiP (1 large plus one or more picture-in-picture). These arrangements can be selected automatically or based on user input. Arrangements can change over time. Identifying this composed arrangement is separate from identifying or selecting which video images are used to fill in the composition. These arrangements can be constructed by an endpoint sending video, by an MCU, or by an endpoint receiving video as it renders to a display.

2 - source selection and identification - when a device is composing a stream made up of other sources, it needs some way to choose which sources to use, and some way of choosing how to combine them or where to place video images in the composed arrangement. Various automatic algorithms may be used, or selections can be made based on user input. Selections can change over time. One example is "select the two most recent talkers". It may also be desirable to identify which sources are used and where they are placed, for example so the receiving side can use this information in the user interface. Source selection can be done by an endpoint as it sends media, by an MCU, or by an endpoint receiving media.
3 - spatial relation among streams - how multiple streams are related to each other spatially, to be rendered such that the spatial arrangement is consistent. The examples we've been using have multiple video streams that are related in an ordered row from left to right. Audio is also included when it is desirable to match spatial audio to video.

4 - multi stream media format - what the streams mean with respect to each other, regardless of the actual content on the streams. For audio, examples are stereo, 5.1 surround, binaural, linear array. (linear array is described in the clue framework document). Perhaps 3D video formats would also fit in this category. This information is needed in order to properly render the media into light and sound for human observers. I see this at the same level as identifying a codec, independent of the audio or video content carried on the streams, and independent of how any composition of sources is done.

I think there is general agreement that items 3 and 4 are in scope for CLUE, as they specifically deal with multiple streams to and from an endpoint. And the framework draft includes these. Items 1 and 2 are not new, those topics exist for traditional single stream videoconferencing. I'm not sure what aspects of 1 and 2 should be in scope for CLUE. It is hard to tell from the use cases and requirements. The framework draft includes them only to a very limited extent.
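[Editorial note: the four topics above can be sketched as simple data structures. This is an illustrative sketch only; the class and field names are invented for this note and do not appear in any CLUE draft.]

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ComposedArrangement:
    """Topic 1: how images are arranged inside one composed video stream."""
    name: str  # e.g. "2x2", "3x3", "1+5", "1+PiP" (hypothetical labels)
    cells: List[Tuple[float, float, float, float]]  # (x, y, w, h) as fractions of the frame


@dataclass
class SourceSelection:
    """Topic 2: which sources fill the arrangement, and how they were chosen."""
    policy: str  # e.g. "two-most-recent-talkers", "user-selected"
    cell_to_source: Dict[int, str] = field(default_factory=dict)  # cell index -> source id


@dataclass
class SpatialRelation:
    """Topic 3: spatial ordering among separate streams (leftmost first)."""
    ordered_streams: List[str]


@dataclass
class MultiStreamFormat:
    """Topic 4: what the streams mean with respect to each other, independent of content."""
    kind: str  # e.g. "stereo", "5.1-surround", "binaural", "linear-array"
```

In this sketch, a 2x2 grid filled by the two most recent talkers would be an (arrangement, selection) pair, while topics 3 and 4 describe relations across separate streams rather than composition within one.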
Mark Duckworth

From stephane.cazeaux@orange-ftgroup.com Mon Aug 8 08:57:32 2011 From: stephane.cazeaux@orange-ftgroup.com Cc: clue@ietf.org Date: Mon, 8 Aug 2011 15:56:48 +0000 In-Reply-To: <4E314AD3.1030406@alum.mit.edu> Subject: Re: [clue] Comment on the presentation use case

Hi,

To me, it should be part of the telepresence protocols at least to enable the interoperability, as the presentation based on video stream allows it.

The point is that video stream is not convenient for the use cases I suggested. But it does not necessarily mean that we should bundle a full data sharing protocol with telepresence. Something simpler, like the RFB option of draft-garcia-mmusic-sdp-collaboration, could be a candidate.

Stephane.

-----Message d'origine----- De : clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] De la part de Paul Kyzivat Envoyé : jeudi 28 juillet 2011 13:41 À : CHATRAS Bruno RD-CORE-ISS Cc : clue@ietf.org Objet : Re: [clue] Comment on the presentation use case

On 7/28/11 2:37 AM, bruno.chatras@orange-ftgroup.com wrote: > I think we should take a look to > http://tools.ietf.org/html/draft-garcia-mmusic-sdp-collaboration-00

Maybe. But that is almost orthogonal to what I was suggesting.
Thanks, Paul (as individual) > Bruno > >> -----Message d'origine----- >> De : clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] De la part de >> Paul Kyzivat >> Envoyé : jeudi 28 juillet 2011 06:05 >> À : Roni Even >> Cc : clue@ietf.org >> Objet : Re: [clue] Comment on the presentation use case >> >> On 7/27/11 11:28 PM, Roni Even wrote: >>> Hi, >>> HTTP is not defining a common data sharing protocol. WebEx may be >> carried >>> over HTTP but the data sharing application is not standard. What I >> meant is >>> that it can either be something that is a common data sharing >> protocol or >>> something that is carried as an RTP payload which require some common >>> defined protocol on top. >> >> I was being a bit tongue in cheek, though not entirely. >> Of course you are right - that if you want to push data to everybody >> you >> need more. >> >> But data sharing by pointing a video camera at a piece of paper is a >> tad >> out of date. Connecting the video port on a user's computer as a video >> source and distributing it with the other video is better than that. >> But >> its not nearly as convenient as webex or any of its competitors. >> >> It isn't entirely clear that its *necessary* to bundle the data sharing >> application with the telepresence protocols. Its kind of limiting since >> the web apps evolve very rapidly. Perhaps we should be doing the >> opposite of that: providing a way to embed the control of the >> telepresence system into a web app. >> >> Thanks, >> Paul >> >>>> -----Original Message----- >>>> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf >> Of >>>> Paul Kyzivat >>>> Sent: Thursday, July 28, 2011 1:29 AM >>>> To: clue@ietf.org >>>> Subject: Re: [clue] Comment on the presentation use case >>>> >>>> On 7/27/11 6:07 PM, Roni Even wrote: >>>>> Hi Stephane, >>>>> >>>>> Is there a standard protocol that is used for conveying this >>>>> information, is it RTP based. >>>> >>>> AFAIK this is often http. (E.g.
webex) >>>> >>>>> To me this is a separate application that can be integrated in the >>>>> application level and not as part of the multistream. >>>> >>>> I guess this depends on whether the support for it is integrated >> into >>>> the "room", or is just incidental equipment brought by the users, >> not >>>> formally related to the telepresence session. >>>> >>>> Thanks, >>>> Paul >>>> (speaking as an individual) >>>> >>>>> Roni >>>>> >>>>> *clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] *On Behalf Of >>>>> *stephane.cazeaux@orange-ftgroup.com >>>>> *Sent:* Thursday, July 21, 2011 2:33 PM >>>>> *To:* clue@ietf.org >>>>> *Subject:* [clue] Comment on the presentation use case* >>>>> >>>>> ** >>>>> >>>>> *Hi,* >>>>> >>>>> ** >>>>> >>>>> *The presentation use case as described in the use-cases document >> is >>>>> based on the assumption that the presentation stream relies on a >>>> video >>>>> stream, and is limited to usage of presentation video streams. But >> we >>>>> could also consider collaborative use cases, meaningful for >>>>> telepresence, which are not covered by the existing text.* >>>>> >>>>> *I propose to complete the existing text as follows:* >>>>> >>>>> ** >>>>> >>>>> *Furthermore, although most today's systems use video streams for >>>>> presentations, there are use cases where this is not suitable. For >>>> example:* >>>>> >>>>> *- The professor which shares an electronic whiteboard (could be a >>>>> whiteboard application on a PC, with screen capture of the PC) >> where >>>> all >>>>> students can participate. Students will take control of the shared >>>>> whiteboard in turns.* >>>>> >>>>> *- In a multipoint meeting, a shared document can be kept always >>>> visible >>>>> in a screen, while other documents are presented on other screens >>>> (with >>>>> possible in turns presentation). For instance, for the purpose of >>>> shared >>>>> design document, notes taking, polls, etc. 
A shared document >> implies >>>>> that all participants can modify it in turns.* >>>>> >>>>> *Stephane.* >>>>> >>>>> *_______________________________________________ >>>>> clue mailing list >>>>> clue@ietf.org >>>>> https://www.ietf.org/mailman/listinfo/clue >>>>> * >>>> >>>> _______________________________________________ >>>> clue mailing list >>>> clue@ietf.org >>>> https://www.ietf.org/mailman/listinfo/clue >>> >>> >> >> _______________________________________________ >> clue mailing list >> clue@ietf.org >> https://www.ietf.org/mailman/listinfo/clue > _______________________________________________ clue mailing list clue@ietf.org https://www.ietf.org/mailman/listinfo/clue

From pkyzivat@alum.mit.edu Mon Aug 8 12:22:20 2011 From: Paul Kyzivat To: stephane.cazeaux@orange-ftgroup.com Cc: clue@ietf.org Date: Mon, 08 Aug 2011 15:22:43 -0400 Message-ID: <4E403783.70706@alum.mit.edu> Subject: Re: [clue] Comment on the presentation use case

On 8/8/11 11:56 AM, stephane.cazeaux@orange-ftgroup.com wrote: > Hi, > > To me, it should be part of the telepresence protocols at least to enable the interoperability, as the presentation based on video stream allows it. > > The point is that video stream is not convenient for the use cases I suggested. But it does not necessarily mean that we should bundle a full data sharing protocol with telepresence. Something simpler, like the RFB option of draft-garcia-mmusic-sdp-collaboration, could be a candidate.

What I was suggesting is that maybe it would make sense to turn those relationships "inside out".

For instance when I use a collaboration tool like Webex, the collaboration is set up first via the web, and defines the set of participants. Then a voice session can be added. For pragmatic reasons, the voice conferencing seems to be pretty distinct. (I'm not certain how webex handles video. I suspect it is doing it via the web, not the telephony conference.)
Its not hard to imagine the same sort of setup, but with a multiparty telepresence session instead of the traditional voice conference. In such a case, you would probably want the collaboration tool (webex, or whatever) to mediate the UI for the web collaboration, the roster, etc. It might delegate a lot of that to the telepresence infrastructure. But that does raise some questions about how all the components fit together. Is there a single screen for the collaboration session in a telepresence room? What about input to that - keyboard, mouse, etc.? Or do we assume that each person in the room has their own computer with input, display, etc. and maybe a way to slave the web collaboration session to one or more of the big displays in the room? What probably *can't* be done right now is nail down a particular web collaboration service (e.g. Webex) or protocol. That does complicate slaving the collaboration session to a screen, unless its done by having someone connect a video connection to their own computer. Thanks, Paul > Stephane. > > > -----Message d'origine----- > De : clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] De la part de Paul Kyzivat > Envoyé : jeudi 28 juillet 2011 13:41 > À : CHATRAS Bruno RD-CORE-ISS > Cc : clue@ietf.org > Objet : Re: [clue] Comment on the presentation use case > > On 7/28/11 2:37 AM, bruno.chatras@orange-ftgroup.com wrote: >> I think we should take a look to >> http://tools.ietf.org/html/draft-garcia-mmusic-sdp-collaboration-00 > > Maybe. But that is almost orthogonal to what I was suggesting. > > THanks, > Paul > (as individual) > >> Bruno >> >>> -----Message d'origine----- >>> De : clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] De la part de >>> Paul Kyzivat >>> Envoyé : jeudi 28 juillet 2011 06:05 >>> À : Roni Even >>> Cc : clue@ietf.org >>> Objet : Re: [clue] Comment on the presentation use case >>> >>> On 7/27/11 11:28 PM, Roni Even wrote: >>>> Hi, >>>> HTTP is not defining a common data sharing protocol. 
WebEx may be >>> carried >>>> over HTTP but the data sharing application is not standard. What I >>> meant is >>>> that it can either be something that is a common data sharing >>> protocol or >>>> something that is carried as an RTP payload which require some common >>>> defined protocol on top. >>> >>> I was being a bit tongue in cheek, though not entirely. >>> Of course you are right - that if you want to push data to everybody >>> you >>> need more. >>> >>> But data sharing by pointing a video camera at a piece of paper is a >>> tad >>> out of date. Connecting the video port on a user's computer as a video >>> source and distributing it with the other video is better than that. >>> But >>> its not nearly as convenient as webex or any of its competitors. >>> >>> It isn't entirely clear that its *necessary* to bundle the data sharing >>> application with the telepresence protocols. Its kind of limiting since >>> the web apps evolve very rapidly. Perhaps we should be doing the >>> opposite of that: providing a way to embed the control of the >>> telepresence system into a web app. >>> >>> Thanks, >>> Paul >>> >>>>> -----Original Message----- >>>>> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf >>> Of >>>>> Paul Kyzivat >>>>> Sent: Thursday, July 28, 2011 1:29 AM >>>>> To: clue@ietf.org >>>>> Subject: Re: [clue] Comment on the presentation use case >>>>> >>>>> On 7/27/11 6:07 PM, Roni Even wrote: >>>>>> Hi Stephane, >>>>>> >>>>>> Is there a standard protocol that is used for conveying this >>>>>> information, is it RTP based. >>>>> >>>>> AFAIK this is often http. (E.g. webex) >>>>> >>>>>> To me this is a separate application that can be integrated in the >>>>>> application level and not as part of the multistream. >>>>> >>>>> I guess this depends on whether the support for it is integrated >>> into >>>>> the "room", or is just incidental equipment brought by the users, >>> not >>>>> formally related to the telepresence session. 
>>>>> >>>>> Thanks, >>>>> Paul >>>>> (speaking as an individual) >>>>> >>>>>> Roni >>>>>> >>>>>> *clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] *On Behalf Of >>>>>> *stephane.cazeaux@orange-ftgroup.com >>>>>> *Sent:* Thursday, July 21, 2011 2:33 PM >>>>>> *To:* clue@ietf.org >>>>>> *Subject:* [clue] Comment on the presentation use case* >>>>>> >>>>>> ** >>>>>> >>>>>> *Hi,* >>>>>> >>>>>> ** >>>>>> >>>>>> *The presentation use case as described in the use-cases document >>> is >>>>>> based on the assumption that the presentation stream relies on a >>>>> video >>>>>> stream, and is limited to usage of presentation video streams. But >>> we >>>>>> could also consider collaborative use cases, meaningful for >>>>>> telepresence, which are not covered by the existing text.* >>>>>> >>>>>> *I propose to complete the existing text as follows:* >>>>>> >>>>>> ** >>>>>> >>>>>> *Furthermore, although most today's systems use video streams for >>>>>> presentations, there are use cases where this is not suitable. For >>>>> example:* >>>>>> >>>>>> *- The professor which shares an electronic whiteboard (could be a >>>>>> whiteboard application on a PC, with screen capture of the PC) >>> where >>>>> all >>>>>> students can participate. Students will take control of the shared >>>>>> whiteboard in turns.* >>>>>> >>>>>> *- In a multipoint meeting, a shared document can be kept always >>>>> visible >>>>>> in a screen, while other documents are presented on other screens >>>>> (with >>>>>> possible in turns presentation). For instance, for the purpose of >>>>> shared >>>>>> design document, notes taking, polls, etc. A shared document >>> implies >>>>>> that all participants can modify it in turns.* >>>>>> >>>>>> *"* >>>>>> >>>>>> ** >>>>>> >>>>>> ** >>>>>> >>>>>> *St ephane. 
v> >>>>>> * >>>>>> >>>>>> * >>>>>> * >>>>>> >>>>>> *_______________________________________________ >>>>>> clue mailing list >>>>>> clue@ietf.org >>>>>> https://www.ietf.org/mailman/listinfo/clue >>>>>> * >>>>> >>>>> _______________________________________________ >>>>> clue mailing list >>>>> clue@ietf.org >>>>> https://www.ietf.org/mailman/listinfo/clue >>>> >>>> >>> >>> _______________________________________________ >>> clue mailing list >>> clue@ietf.org >>> https://www.ietf.org/mailman/listinfo/clue >> > > _______________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/listinfo/clue > From pkyzivat@alum.mit.edu Tue Aug 9 06:03:03 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id EE9BB21F850E for ; Tue, 9 Aug 2011 06:03:03 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -2.578 X-Spam-Level: X-Spam-Status: No, score=-2.578 tagged_above=-999 required=5 tests=[AWL=0.021, BAYES_00=-2.599] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id BRLOx5RduTYg for ; Tue, 9 Aug 2011 06:03:03 -0700 (PDT) Received: from qmta12.westchester.pa.mail.comcast.net (qmta12.westchester.pa.mail.comcast.net [76.96.59.227]) by ietfa.amsl.com (Postfix) with ESMTP id D811421F8AA9 for ; Tue, 9 Aug 2011 06:03:02 -0700 (PDT) Received: from omta24.westchester.pa.mail.comcast.net ([76.96.62.76]) by qmta12.westchester.pa.mail.comcast.net with comcast id JCds1h00B1ei1Bg5CD3Yqa; Tue, 09 Aug 2011 13:03:32 +0000 Received: from Paul-Kyzivats-MacBook-Pro.local ([24.62.109.41]) by omta24.westchester.pa.mail.comcast.net with comcast id JD3X1h00A0tdiYw3kD3XRT; Tue, 09 Aug 2011 13:03:32 +0000 Message-ID: <4E413021.3010509@alum.mit.edu> Date: Tue, 09 Aug 2011 09:03:29 -0400 From: Paul Kyzivat User-Agent: Mozilla/5.0 
(Macintosh; Intel Mac OS X 10.7; rv:5.0) Gecko/20110624 Thunderbird/5.0 MIME-Version: 1.0 To: clue@ietf.org References: <44C6B6B2D0CF424AA90B6055548D7A61AE9B48AD@CRPMBOXPRD01.polycom.com> In-Reply-To: <44C6B6B2D0CF424AA90B6055548D7A61AE9B48AD@CRPMBOXPRD01.polycom.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [clue] continuing "layout" discussion X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Aug 2011 13:03:04 -0000 On 8/5/11 5:02 PM, Duckworth, Mark wrote: > I'd like to continue the discussion about layout and rendering issues. There are many separate but related things involved. I want to break it down into separate topics, and see how the topics are related to each other. And then we can discuss what CLUE needs to deal with and what is not in scope. > > I don't know if I'm using the best terms for each topic. If not, please suggest better terms. My use of the term "layout" here is not consistent with draft-wenger-clue-definitions-01, because I don't limit it to the rendering side. But my use of the terms "render" and "source selction" is consistent with that draft. > > 1- video layout composed arrangement within a stream - when multiple video sources are composed into one stream, they are arranged in some way. Typical examples are 2x2 grid, 3x3 grid, 1+5 (1 large plus 5 small), 1+PiP (1 large plus one or more picture-in-picture). These arrangements can be selected automatically or based on user input. Arrangements can change over time. Identifying this composed arrangement is separate from identifying or selecting which video images are used to fill in the composition. These arrangements can be constructed by an endpoint sending video, by an MCU, or by an endpoint receiving video as it renders to a display. 
> > 2 - source selection and identification - when a device is composing a stream made up of other sources, it needs some way to choose which sources to use, and some way of choosing how to combine them or where to place video images in the composed arrangement. Various automatic algorithms may be used, or selections can be made based on user input. Selections can change over time. One example is "select the two most recent talkers". It may also be desirable to identify which sources are used and where they are placed, for example so the receiving side can use this information in the user interface. Source selection can be done by an endpoint as it sends media, by an MUC, or by an endpoint receiving media. > > 3 - spatial relation among streams - how multiple streams are related to each other spatially, to be rendered such that the spatial arrangement is consistent. The examples we've been using have multiple video streams that are related in an ordered row from left to right. Audio is also included when it is desirable to match spatial audio to video. > > 4 - multi stream media format - what the streams mean with respect to each other, regardless of the actual content on the streams. For audio, examples are stereo, 5.1 surround, binaural, linear array. (linear array is described in the clue framework document). Perhaps 3D video formats would also fit in this category. This information is needed in order to properly render the media into light and sound for human observers. I see this at the same level as identifying a codec, independent of the audio or video content carried on the streams, and independent of how any composition of sources is done. I was with you all the way until 4. That one I don't understand. 
The name you chose for this has connotations for me, but isn't fully in harmony with the definitions you give:

If we consider audio, it makes sense that multiple streams can be rendered as if they came from different physical locations in the receiving room. That can be done by the receiver if it gets those streams separately, and has information about their intended relationships. It can also be done by the sender or MCU and passed on to the receiver as a single stream with stereo or binaural coding.

So it seems to me you have two concepts here, not one. One has to do with describing the relationships between streams, and the other has to do with the encoding of spatial relationships *within* a single stream.

Or, are you asserting that stereo and binaural are simply ways to encode multiple logical streams in one RTP stream, together with their spatial relationships?

Thanks,
Paul

> I think there is general agreement that items 3 and 4 are in scope for CLUE, as they specifically deal with multiple streams to and from an endpoint. And the framework draft includes these. Items 1 and 2 are not new, those topics exist for traditional single stream videoconferencing. I'm not sure what aspects of 1 and 2 should be in scope for CLUE. It is hard to tell from the use cases and requirements. The framework draft includes them only to a very limited extent.
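[Editor's note: topic 3, streams related in an ordered row rendered left to right with audio matched to video, can be sketched as follows. The names and the simple linear pan law are illustrative assumptions, not from the framework draft.]

```python
from typing import Dict, List

# Hypothetical sketch of topic 3: video streams related in an ordered row,
# rendered left to right, with stereo audio panned to match each stream's
# horizontal position so the spatial arrangement stays consistent.

def row_positions(stream_ids: List[str]) -> Dict[str, float]:
    """Map each stream (ordered left to right) to the horizontal center
    of its slot, as a fraction of the total display width (0.0-1.0)."""
    n = len(stream_ids)
    return {sid: (i + 0.5) / n for i, sid in enumerate(stream_ids)}

def pan_for(position: float) -> float:
    """Stereo pan in [-1.0, +1.0] matching a horizontal video position
    (a simple linear pan law, assumed here for illustration)."""
    return 2.0 * position - 1.0
```

For three camera streams ordered left to right, the middle stream lands at position 0.5 and pans to center (0.0), while the outer streams pan left and right, keeping audio and video spatially aligned.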
> > Mark Duckworth > _______________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/listinfo/clue > From mary.ietf.barnes@gmail.com Wed Aug 10 08:33:15 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 7E99921F863C for ; Wed, 10 Aug 2011 08:33:15 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -103.371 X-Spam-Level: X-Spam-Status: No, score=-103.371 tagged_above=-999 required=5 tests=[AWL=0.227, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id h8rC6GQidGqB for ; Wed, 10 Aug 2011 08:33:14 -0700 (PDT) Received: from mail-vx0-f172.google.com (mail-vx0-f172.google.com [209.85.220.172]) by ietfa.amsl.com (Postfix) with ESMTP id 7750721F8634 for ; Wed, 10 Aug 2011 08:33:14 -0700 (PDT) Received: by vxi29 with SMTP id 29so1160443vxi.31 for ; Wed, 10 Aug 2011 08:33:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; bh=O8gPR4N6UBZxyU+tkJP6T3Q7PJehXJ7AS4f65yhMJ18=; b=iFvE90v8TBGDVb2AvKsZqvrEsy5L+fwP6Fql4EdPZUgmz9tBJA9Hc5yacJ8p/WRB6m GXzMuCRGz22gAPHBqV8BS8ZlhHZgS0246xz7ZsrePg35PYFAoc6huKLLI0yoz90ac0HO pmIeYIWN6B4NIAH79zhsnkGaviifhhsdp/Hd4= MIME-Version: 1.0 Received: by 10.52.100.99 with SMTP id ex3mr2193247vdb.116.1312990426021; Wed, 10 Aug 2011 08:33:46 -0700 (PDT) Received: by 10.52.167.34 with HTTP; Wed, 10 Aug 2011 08:33:45 -0700 (PDT) Date: Wed, 10 Aug 2011 10:33:45 -0500 Message-ID: From: Mary Barnes To: CLUE Content-Type: multipart/alternative; boundary=20cf307f3286efbe2504aa286786 Subject: [clue] CLUE virtual Interim Meeting - August 23rd, 2011 X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 
Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2011 15:33:15 -0000 --20cf307f3286efbe2504aa286786 Content-Type: text/plain; charset=ISO-8859-1 Hi all, The doodle poll showed that Tuesday, August 23rd (11:00 am central) is the optimal date and time for the Interim meeting. I'll send the Webex info shortly. http://www.doodle.com/vmhdczdcrc7ck2k5 > > To determine the time for your timezone, please use the Timezone feature in > Doodle (I set the times using Central time). > > Regards, > Mary > CLUE WG co-chair > --20cf307f3286efbe2504aa286786 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Hi all,
--20cf307f3286efbe2504aa286786-- From wwwrun@ietfa.amsl.com Wed Aug 10 12:52:15 2011 Return-Path: X-Original-To: clue@ietf.org Delivered-To: clue@ietfa.amsl.com Received: by ietfa.amsl.com (Postfix, from userid 30) id 64BA511E8073; Wed, 10 Aug 2011 12:52:14 -0700 (PDT) From: IESG Secretary To: IETF Announcement list Content-Type: text/plain; charset="utf-8" Mime-Version: 1.0 Message-Id: <20110810195215.64BA511E8073@ietfa.amsl.com> Date: Wed, 10 Aug 2011 12:52:14 -0700 (PDT) Cc: clue@ietf.org Subject: [clue] CLUE WG Virtual Interim Meeting: August 23, 2011 X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2011 19:52:15 -0000 The CLUE WG will hold an interim virtual meeting on: 2011-08-23, 16.00-18.00 GMT (starting at 9.00 Pacific, 11.00 Central, 12.00 Eastern) Agenda and details will be announced on the CLUE WG mailing list (http://www.ietf.org/mail-archive/web/clue/) as soon as available. 
From Mark.Duckworth@polycom.com Wed Aug 10 14:48:55 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D356711E80BD for ; Wed, 10 Aug 2011 14:48:55 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -6.569 X-Spam-Level: X-Spam-Status: No, score=-6.569 tagged_above=-999 required=5 tests=[AWL=0.030, BAYES_00=-2.599, RCVD_IN_DNSWL_MED=-4] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id LjpZ50MB2aWw for ; Wed, 10 Aug 2011 14:48:55 -0700 (PDT) Received: from crpehubprd02.polycom.com (crpehubprd01.polycom.com [140.242.64.158]) by ietfa.amsl.com (Postfix) with ESMTP id 1C5A811E80AA for ; Wed, 10 Aug 2011 14:48:53 -0700 (PDT) Received: from Crpmboxprd01.polycom.com ([fe80::e001:c7b0:91a1:9443]) by crpehubprd02.polycom.com ([fe80::5efe:10.236.0.154%12]) with mapi; Wed, 10 Aug 2011 14:49:25 -0700 From: "Duckworth, Mark" To: "clue@ietf.org" Date: Wed, 10 Aug 2011 14:49:36 -0700 Thread-Topic: [clue] continuing "layout" discussion Thread-Index: AcxWlOS7i9j5CgQ1RMm1TCHmVfuG/wBCYeww Message-ID: <44C6B6B2D0CF424AA90B6055548D7A61AEA65C62@CRPMBOXPRD01.polycom.com> References: <44C6B6B2D0CF424AA90B6055548D7A61AE9B48AD@CRPMBOXPRD01.polycom.com> <4E413021.3010509@alum.mit.edu> In-Reply-To: <4E413021.3010509@alum.mit.edu> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: en-US Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [clue] continuing "layout" discussion X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2011 21:48:56 -0000 > -----Original 
Message----- > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of > Paul Kyzivat > Sent: Tuesday, August 09, 2011 9:03 AM > To: clue@ietf.org > Subject: Re: [clue] continuing "layout" discussion
> > 4 - multi stream media format - what the streams mean with respect to each other, regardless of the actual content on the streams. For audio, examples are stereo, 5.1 surround, binaural, linear array. (linear array is described in the clue framework document). Perhaps 3D video formats would also fit in this category. This information is needed in order to properly render the media into light and sound for human observers. I see this at the same level as identifying a codec, independent of the audio or video content carried on the streams, and independent of how any composition of sources is done.
>
> I was with you all the way until 4. That one I don't understand.
> The name you chose for this has connotations for me, but isn't fully in harmony with the definitions you give:

I'm happy to change the name if you have a suggestion.

> If we consider audio, it makes sense that multiple streams can be rendered as if they came from different physical locations in the receiving room. That can be done by the receiver if it gets those streams separately, and has information about their intended relationships. It can also be done by the sender or MCU and passed on to the receiver as a single stream with stereo or binaural coding.

Yes. It could also be done by the sender using the "linear array" audio channel format. Maybe it is true that stereo or binaural audio channels would always be sent as a single stream, but I was not assuming that yet, at least not in general when you consider other types too, such as linear array channels.

> So it seems to me you have two concepts here, not one. One has to do with describing the relationships between streams, and the other has to do with the encoding of spatial relationships *within* a single stream.

Maybe that is a better way to describe it, if you assume multi-channel audio is always sent with all the channels in the same RTP stream. Is that what you mean?

I was considering the linear array format to be another type of multi-channel audio, and I know people want to be able to send each channel in a separate RTP stream. So it doesn't quite fit with how you separate the two concepts. In my view, identifying the separate channels by what they mean is the same concept for linear array and stereo. For example "this channel is left, this channel is center, this channel is right". To me, that is the same concept for identifying channels whether or not they are carried in the same RTP stream.

Maybe we are thinking the same thing but getting confused by terminology about channels vs. streams.

> Or, are you asserting that stereo and binaural are simply ways to encode multiple logical streams in one RTP stream, together with their spatial relationships?

No, that is not what I'm trying to say.
Mark From pkyzivat@alum.mit.edu Thu Aug 11 06:01:20 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D3D2A21F86AE for ; Thu, 11 Aug 2011 06:01:20 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -2.574 X-Spam-Level: X-Spam-Status: No, score=-2.574 tagged_above=-999 required=5 tests=[AWL=0.025, BAYES_00=-2.599] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id N1t-umC3EUQu for ; Thu, 11 Aug 2011 06:01:20 -0700 (PDT) Received: from qmta08.westchester.pa.mail.comcast.net (qmta08.westchester.pa.mail.comcast.net [76.96.62.80]) by ietfa.amsl.com (Postfix) with ESMTP id 07D8E21F8596 for ; Thu, 11 Aug 2011 06:01:19 -0700 (PDT) Received: from omta14.westchester.pa.mail.comcast.net ([76.96.62.60]) by qmta08.westchester.pa.mail.comcast.net with comcast id K0xi1h0041HzFnQ5811uwU; Thu, 11 Aug 2011 13:01:54 +0000 Received: from Paul-Kyzivats-MacBook-Pro.local ([24.62.109.41]) by omta14.westchester.pa.mail.comcast.net with comcast id K11t1h02v0tdiYw3a11upJ; Thu, 11 Aug 2011 13:01:54 +0000 Message-ID: <4E43D2BE.5010102@alum.mit.edu> Date: Thu, 11 Aug 2011 09:01:50 -0400 From: Paul Kyzivat User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:5.0) Gecko/20110624 Thunderbird/5.0 MIME-Version: 1.0 To: clue@ietf.org References: <44C6B6B2D0CF424AA90B6055548D7A61AE9B48AD@CRPMBOXPRD01.polycom.com> <4E413021.3010509@alum.mit.edu> <44C6B6B2D0CF424AA90B6055548D7A61AEA65C62@CRPMBOXPRD01.polycom.com> In-Reply-To: <44C6B6B2D0CF424AA90B6055548D7A61AEA65C62@CRPMBOXPRD01.polycom.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [clue] continuing "layout" discussion X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Aug 2011 13:01:20 -0000 Inline On 8/10/11 5:49 PM, Duckworth, Mark wrote: >> -----Original Message----- >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of >> Paul Kyzivat >> Sent: Tuesday, August 09, 2011 9:03 AM >> To: clue@ietf.org >> Subject: Re: [clue] continuing "layout" discussion > >>> 4 - multi stream media format - what the streams mean with respect to >> each other, regardless of the actual content on the streams. For >> audio, examples are stereo, 5.1 surround, binaural, linear array. >> (linear array is described in the clue framework document). Perhaps 3D >> video formats would also fit in this category. This information is >> needed in order to properly render the media into light and sound for >> human observers. I see this at the same level as identifying a codec, >> independent of the audio or video content carried on the streams, and >> independent of how any composition of sources is done. >> >> I was with you all the way until 4. That one I don't understand. >> The name you chose for this has connotations for me, but isn't fully in >> harmony with the definitions you give: > > I'm happy to change the name if you have a suggestion Not yet. Maybe once the concepts are more clearly defined I will have an opinion. >> If we consider audio, it makes sense that multiple streams can be >> rendered as if they came from different physical locations in the >> receiving room. That can be done by the receiver if it gets those >> streams separately, and has information about their intended >> relationships. It can also be done by the sender or MCU and passed on >> to >> the receiver as a single stream with stereo or binaural coding. > > Yes. It could also be done by the sender using the "linear array" audio channel format. 
Maybe it is true that stereo or binaural audio channels would always be sent as a single stream, but I was not assuming that yet, at least not in general when you consider other types too, such as linear array channels.

>> So it seems to me you have two concepts here, not one. One has to do with describing the relationships between streams, and the other has to do with the encoding of spatial relationships *within* a single stream.
>
> Maybe that is a better way to describe it, if you assume multi-channel audio is always sent with all the channels in the same RTP stream. Is that what you mean?
>
> I was considering the linear array format to be another type of multi-channel audio, and I know people want to be able to send each channel in a separate RTP stream. So it doesn't quite fit with how you separate the two concepts. In my view, identifying the separate channels by what they mean is the same concept for linear array and stereo. For example "this channel is left, this channel is center, this channel is right". To me, that is the same concept for identifying channels whether or not they are carried in the same RTP stream.
>
> Maybe we are thinking the same thing but getting confused by terminology about channels vs. streams.

Maybe. Let me try to restate what I now think you are saying:

The audio may consist of several "channels". Each channel may be sent over its own RTP stream, or multiple channels may be multiplexed over an RTP stream. I guess much of this can also apply to video.

When there are exactly two audio channels, they may be encoded as "stereo" or "binaural", which then affects how they should be rendered by the recipient. In these cases the primary info that is required about the individual channels is which is left and which is right. (And which perspective to use in interpreting left and right.)

For other multi-channel cases more information is required about the role of each channel in order to properly render them.
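[Editor's note: the restatement above, channels carrying a rendering role independent of how they are packed into RTP streams, can be sketched as a minimal model. The type and field names are hypothetical, not from any CLUE document.]

```python
from dataclasses import dataclass
from typing import List

# Minimal sketch: an audio "channel" carries a role (left, center, right,
# ...) that is meaningful whether each channel rides in its own RTP stream
# or several channels are multiplexed into one stream.

@dataclass(frozen=True)
class Channel:
    role: str                  # e.g. "left", "center", "right"

@dataclass
class RtpStream:
    ssrc: int
    channels: List[Channel]    # one entry, or several if multiplexed

def roles(streams: List[RtpStream]) -> List[str]:
    """Channel roles in order, independent of how they were packed."""
    return [ch.role for s in streams for ch in s.channels]

# Stereo multiplexed into one stream, and a 3-channel linear array split
# across three streams, both yield the same kind of role list for the
# renderer:
stereo = [RtpStream(1, [Channel("left"), Channel("right")])]
array = [RtpStream(i, [Channel(r)])
         for i, r in enumerate(["left", "center", "right"])]
```

This illustrates the point that identifying channels by what they mean is one concept, and channel-to-stream packing is a separate one.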
Thanks, Paul >> Or, are you asserting that stereo and binaural are simply ways to >> encode >> multiple logical streams in one RTP stream, together with their spacial >> relationships? > > No, that is not what I'm trying to say. > > Mark > _______________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/listinfo/clue > From mary.ietf.barnes@gmail.com Thu Aug 11 08:19:14 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 16EC921F85CA for ; Thu, 11 Aug 2011 08:19:14 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -103.39 X-Spam-Level: X-Spam-Status: No, score=-103.39 tagged_above=-999 required=5 tests=[AWL=0.208, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id AERi-uA1UKcy for ; Thu, 11 Aug 2011 08:19:13 -0700 (PDT) Received: from mail-vw0-f44.google.com (mail-vw0-f44.google.com [209.85.212.44]) by ietfa.amsl.com (Postfix) with ESMTP id 08C3521F85B9 for ; Thu, 11 Aug 2011 08:19:12 -0700 (PDT) Received: by vws12 with SMTP id 12so2191252vws.31 for ; Thu, 11 Aug 2011 08:19:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:cc:content-type; bh=xWseIDoeIPF0y1lJecsrb7lEe7D+hG92FqZEzQBBGT4=; b=VY9ciidO7dtqZUY0+4+eGHJriZd/7+1mNIpUoLMNy2NEXoJ9UoS9MVhZWl53tNRMGu L4Rrt6xI1iNWDiiYt0mBoqbAjexzxdTF/N6PTSKVFED/zk3uSXpUrRsJRugcJ6p7k4AN dXk/mSDN69LCMz77OFbYedMWFKoWZczuVGwd4= MIME-Version: 1.0 Received: by 10.52.69.194 with SMTP id g2mr6470223vdu.451.1313075986110; Thu, 11 Aug 2011 08:19:46 -0700 (PDT) Received: by 10.52.160.71 with HTTP; Thu, 11 Aug 2011 08:19:45 -0700 (PDT) Date: Thu, 11 Aug 2011 10:19:45 -0500 Message-ID: From: 
Mary Barnes To: CLUE Content-Type: multipart/alternative; boundary=20cf307cffd8b71a5504aa3c534b Subject: [clue] Webex Details: CLUE WG Virtual Interim Meeting X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Aug 2011 15:19:14 -0000 --20cf307cffd8b71a5504aa3c534b Content-Type: text/plain; charset=ISO-8859-1 Hello , IETF Secretariat invites you to attend this online meeting. Topic: CLUE WG Virtual Interim Meeting Date: Tuesday, August 23, 2011 Time: 9:00 am, Pacific Daylight Time (San Francisco, GMT-07:00) Meeting Number: 963 755 542 Meeting Password: (This meeting does not require a password.) ------------------------------------------------------- To join the online meeting (Now from mobile devices!) ------------------------------------------------------- 1. Go to https://workgreen.webex.com/workgreen/j.php?ED=181742197&UID=1249097532&RT=MiM0 2. If requested, enter your name and email address. 3. If a password is required, enter the meeting password: (This meeting does not require a password.) 4. Click "Join". To view in other time zones or languages, please click the link: https://workgreen.webex.com/workgreen/j.php?ED=181742197&UID=1249097532&ORT=MiM0 ------------------------------------------------------- To join the audio conference only ------------------------------------------------------- To receive a call back, provide your phone number when you join the meeting, or call the number below and enter the access code. Call-in toll number (US/Canada): 1-408-792-6300 Global call-in numbers: https://workgreen.webex.com/workgreen/globalcallin.php?serviceType=MC&ED=181742197&tollFree=0 Access code:963 755 542 ------------------------------------------------------- For assistance ------------------------------------------------------- 1. 
Go to https://workgreen.webex.com/workgreen/mc 2. On the left navigation bar, click "Support". You can contact me at: amorris@amsl.com 1-510-492-4081 To add this meeting to your calendar program (for example Microsoft Outlook), click this link: https://workgreen.webex.com/workgreen/j.php?ED=181742197&UID=1249097532&ICS=MI&LD=1&RD=2&ST=1&SHA2=1sO7X9GoItG7qDII-/DUsH2iEIlMx8cUMEWOoPlBrjY=&RT=MiM0 The playback of UCF (Universal Communications Format) rich media files requires appropriate players. To view this type of rich media files in the meeting, please check whether you have the players installed on your computer by going to https://workgreen.webex.com/workgreen/systemdiagnosis.php. Sign up for a free trial of WebEx http://www.webex.com/go/mcemfreetrial http://www.webex.com CCP:+14087926300x963755542# IMPORTANT NOTICE: This WebEx service includes a feature that allows audio and any documents and other materials exchanged or viewed during the session to be recorded. By joining this session, you automatically consent to such recordings. If you do not consent to the recording, discuss your concerns with the meeting host prior to the start of the recording or do not join the session. Please note that any such recordings may be subject to discovery in the event of litigation. --20cf307cffd8b71a5504aa3c534b Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable
--20cf307cffd8b71a5504aa3c534b-- From mary.ietf.barnes@gmail.com Thu Aug 11 15:57:47 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A837F11E8092 for ; Thu, 11 Aug 2011 15:57:47 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -103.394 X-Spam-Level: X-Spam-Status: No, score=-103.394 tagged_above=-999 required=5 tests=[AWL=0.204, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id WhwnMcN5eDT3 for ; Thu, 11 Aug 2011 15:57:46 -0700 (PDT) Received: from mail-vx0-f172.google.com (mail-vx0-f172.google.com [209.85.220.172]) by ietfa.amsl.com (Postfix) with ESMTP id 865EC11E809C for ; Thu, 11 Aug 2011 15:57:46 -0700 (PDT) Received: by vxi29 with SMTP id 29so2512775vxi.31 for ; Thu, 11 Aug 2011 15:58:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=IYwT+QqErEHTVUGqh9Q2aQngGQqqn3KHqD0DoJDR9TI=; b=kRU/e6V4OBcWkk/uR1xvkKWthNpj9MmjggNm1r6LojvUb7GeaUzgqeh2OOqjGqKL17 8jPGh0pddY9D+ke65Wv82iYgTtS0lA/Vr4dStY/4cCoEKlSNIphgQTGEnWc5ak/kdokK 7jRRb/j5BA1a5i3XRHVGW9NQbs2mKlUqSTAO4= MIME-Version: 1.0 Received: by 10.52.93.72 with SMTP id cs8mr160322vdb.518.1313103501746; Thu, 11 Aug 2011 15:58:21 -0700 (PDT) Received: by 10.52.160.71 with HTTP; Thu, 11 Aug 2011 15:58:21 -0700 (PDT) In-Reply-To: References: Date: Thu, 11 Aug 2011 17:58:21 -0500 Message-ID: From: Mary Barnes To: CLUE Content-Type: multipart/alternative; boundary=20cf307cfdd4c660c204aa42bbc9 Subject: Re: [clue] Webex Details: CLUE WG Virtual Interim Meeting X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for 
TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Aug 2011 22:57:47 -0000 --20cf307cfdd4c660c204aa42bbc9 Content-Type: text/plain; charset=ISO-8859-1 As a reminder, all the materials for the meeting will be available on the CLUE WG wiki: http://trac.tools.ietf.org/wg/clue/trac/wiki There is a tentative agenda available at this time. Regards, Mary. On Thu, Aug 11, 2011 at 10:19 AM, Mary Barnes wrote: > Hello , > > IETF Secretariat invites you to attend this online meeting. > > Topic: CLUE WG Virtual Interim Meeting > Date: Tuesday, August 23, 2011 > Time: 9:00 am, Pacific Daylight Time (San Francisco, GMT-07:00) > Meeting Number: 963 755 542 > Meeting Password: (This meeting does not require a password.) > > > ------------------------------------------------------- > To join the online meeting (Now from mobile devices!) > ------------------------------------------------------- > 1. Go to > https://workgreen.webex.com/workgreen/j.php?ED=181742197&UID=1249097532&RT=MiM0 > 2. If requested, enter your name and email address. > 3. If a password is required, enter the meeting password: (This meeting > does not require a password.) > 4. Click "Join". > > To view in other time zones or languages, please click the link: > > https://workgreen.webex.com/workgreen/j.php?ED=181742197&UID=1249097532&ORT=MiM0 > > ------------------------------------------------------- > To join the audio conference only > ------------------------------------------------------- > To receive a call back, provide your phone number when you join the > meeting, or call the number below and enter the access code. 
> Call-in toll number (US/Canada): 1-408-792-6300 > Global call-in numbers: > https://workgreen.webex.com/workgreen/globalcallin.php?serviceType=MC&ED=181742197&tollFree=0 > > Access code:963 755 542 > > ------------------------------------------------------- > For assistance > ------------------------------------------------------- > 1. Go to https://workgreen.webex.com/workgreen/mc > 2. On the left navigation bar, click "Support". > > You can contact me at: > amorris@amsl.com > 1-510-492-4081 > > To add this meeting to your calendar program (for example Microsoft > Outlook), click this link: > > https://workgreen.webex.com/workgreen/j.php?ED=181742197&UID=1249097532&ICS=MI&LD=1&RD=2&ST=1&SHA2=1sO7X9GoItG7qDII-/DUsH2iEIlMx8cUMEWOoPlBrjY=&RT=MiM0 > > The playback of UCF (Universal Communications Format) rich media files > requires appropriate players. To view this type of rich media files in the > meeting, please check whether you have the players installed on your > computer by going to > https://workgreen.webex.com/workgreen/systemdiagnosis.php. > > Sign up for a free trial of WebEx > http://www.webex.com/go/mcemfreetrial > > http://www.webex.com > > CCP:+14087926300x963755542# > > IMPORTANT NOTICE: This WebEx service includes a feature that allows audio > and any documents and other materials exchanged or viewed during the session > to be recorded. By joining this session, you automatically consent to such > recordings. If you do not consent to the recording, discuss your concerns > with the meeting host prior to the start of the recording or do not join the > session. Please note that any such recordings may be subject to discovery in > the event of litigation. > > --20cf307cfdd4c660c204aa42bbc9 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable As a reminder, all the materials for the meeting will be available on the C= LUE WG wiki:

Th= ere is a tentative agenda available at this time.

Regards,
Mary.=A0

On Thu, Aug 11, 2011 at 10:19 AM, Mary Barnes <= mary.ietf.barnes@gmail.com> wrote:
Hello ,

IE= TF Secretariat invites you to attend this online meeting.

Topic: CLUE WG Virtual Interim Meeting
Date: Tuesday, August 23, 2011
Time: 9:00 am, Pacific Daylight Time (= San Francisco, GMT-07:00)
Meeting Number: 963 755 542
Meeting Pas= sword: (This meeting does not require a password.)


---------= ----------------------------------------------
To join the online meeting (Now from mobile devices!)
---------------= ----------------------------------------
1. Go to
https://workgreen.webex.com/workgreen/j.php?ED= =3D181742197&UID=3D1249097532&RT=3DMiM0
2. If requested, enter your name and email address.
3. If a password = is required, enter the meeting password: (This meeting does not require a p= assword.)
4. Click "Join".

To view in other time z= ones or languages, please click the link:
https://workgreen.webex.= com/workgreen/j.php?ED=3D181742197&UID=3D1249097532&ORT=3DMiM0 =

-------------------------------------------------------
To join the audio conference only
-----------------------------------= --------------------
To receive a call back, provide your phone number= when you join the meeting, or call the number below and enter the access c= ode.
Call-in toll number (US/Canada): 1-408-792-6300
Global call-in numbe= rs: https://= workgreen.webex.com/workgreen/globalcallin.php?serviceType=3DMC&ED=3D18= 1742197&tollFree=3D0

Access code:963 755 542

-----------------------------------= --------------------
For assistance
-----------------------------= --------------------------
1. Go to https://workgreen.webex.com/workgreen/= mc
2. On the left navigation bar, click "Support".

You can contact me at:
amorris@amsl.com
1-510-492-4081

To add this meeting to your calendar program (for example Microsoft Outlook), click this link:
https://workgreen.webex.com/workgreen/j.php?ED=181742197&UID=1249097532&ICS=MI&LD=1&RD=2&ST=1&SHA2=1sO7X9GoItG7qDII-/DUsH2iEIlMx8cUMEWOoPlBrjY=&RT=MiM0

The playback of UCF (Universal Communications Format) rich media files requires appropriate players. To view this type of rich media files in the meeting, please check whether you have the players installed on your computer by going to https://workgreen.webex.com/workgreen/systemdiagnosis.php.

Sign up for a free trial of WebEx
http://www.webex.com/go/mcemfreetrial

http://www.webex.com

CCP:+14087926300x963755542#

IMPORTANT NOTICE: This WebEx service includes a feature that allows audio and any documents and other materials exchanged or viewed during the session to be recorded. By joining this session, you automatically consent to such recordings. If you do not consent to the recording, discuss your concerns with the meeting host prior to the start of the recording or do not join the session. Please note that any such recordings may be subject to discovery in the event of litigation.


--20cf307cfdd4c660c204aa42bbc9-- From stephane.cazeaux@orange-ftgroup.com Fri Aug 12 02:04:24 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 7902421F863A for ; Fri, 12 Aug 2011 02:04:24 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -3.249 X-Spam-Level: X-Spam-Status: No, score=-3.249 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, HELO_EQ_FR=0.35, RCVD_IN_DNSWL_LOW=-1] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8wqMJmN4qEbV for ; Fri, 12 Aug 2011 02:04:23 -0700 (PDT) Received: from p-mail1.rd.francetelecom.com (p-mail1.rd.francetelecom.com [195.101.245.15]) by ietfa.amsl.com (Postfix) with ESMTP id D389B21F867A for ; Fri, 12 Aug 2011 02:04:22 -0700 (PDT) Received: from p-mail1.rd.francetelecom.com (localhost.localdomain [127.0.0.1]) by localhost (Postfix) with SMTP id 721608B8008; Fri, 12 Aug 2011 10:49:34 +0200 (CEST) Received: from ftrdsmtp1.rd.francetelecom.fr (unknown [10.192.128.46]) by p-mail1.rd.francetelecom.com (Postfix) with ESMTP id 133B98B8007; Fri, 12 Aug 2011 10:49:34 +0200 (CEST) Received: from FTRDCH02.rd.francetelecom.fr ([10.194.32.13]) by ftrdsmtp1.rd.francetelecom.fr with Microsoft SMTPSVC(6.0.3790.4675); Fri, 12 Aug 2011 10:48:42 +0200 Received: from FTRDMB03.rd.francetelecom.fr ([fe80::4c06:6ece:ed2d:797e]) by FTRDCH02.rd.francetelecom.fr ([::1]) with mapi id 14.01.0270.001; Fri, 12 Aug 2011 10:48:42 +0200 From: To: Thread-Topic: [clue] Comment on the presentation use case Thread-Index: AcxHlL8VW+C7KhbYQU2HJX/U8MKZpAFFKK/A///k6YCAAFO3gIAACg4AgAAqsQCAAFTQgP/uhI7ggCNGD4D/+k3aoA== Date: Fri, 12 Aug 2011 08:48:41 +0000 Message-ID: References: <00ec01cc4ca9$9cc14e80$d643eb80$%roni@huawei.com><4E309135.7070408@alum.mit.edu><001b01cc4cd6$77d28760$67779620$%roni@huawei.com> 
<4E30DFDE.3040601@alum.mit.edu> <9ECCF01B52E7AB408A7EB85352642141031F2E5E@ftrdmel0.rd.francetelecom.fr> <4E314AD3.1030406@alum.mit.edu> <4E403783.70706@alum.mit.edu> In-Reply-To: <4E403783.70706@alum.mit.edu> Accept-Language: fr-FR, en-US Content-Language: fr-FR X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.193.193.104] Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginalArrivalTime: 12 Aug 2011 08:48:42.0948 (UTC) FILETIME=[A3ACB840:01CC58CC] Cc: clue@ietf.org Subject: Re: [clue] Comment on the presentation use case X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2011 09:04:24 -0000 Hi Paul, I understand your suggestion, and I am aware that it could be a solution. I= t certainly makes sense, but I am not sure of how it could be integrated in= CLUE. To me, this is a matter of application integration, at application l= evel. My suggestion is that we should think of a solution that provides more sati= sfying collaboration within telepresence, with the same level of interopera= bility and multistream interaction as presentation video stream provides, b= ut with more satisfying quality (video is not suitable for all kinds of sha= red documents) and more collaborative features. I don't think that the kind of relationship that you suggest is the only so= lution. 
Stephane.=20 -----Message d'origine----- De=A0: Paul Kyzivat [mailto:pkyzivat@alum.mit.edu]=20 Envoy=E9=A0: lundi 8 ao=FBt 2011 21:23 =C0=A0: CAZEAUX Stephane RD-BIZZ-CAE Cc=A0: clue@ietf.org Objet=A0: Re: [clue] Comment on the presentation use case On 8/8/11 11:56 AM, stephane.cazeaux@orange-ftgroup.com wrote: > Hi, > > To me, it should be part of the telepresence protocols at least to enable= the interoperability, as the presentation based on video stream allows it. > > The point is that video stream is not convenient for the use cases I sugg= ested. But it does not necessarily mean that we should bundle a full data s= haring protocol with telepresence. Something simpler, like the RFB option o= f draft-garcia-mmusic-sdp-collaboration, could be a candidate. What I was suggesting is that maybe it would make sense to turn those=20 relationships "inside out". For instance when I use a collaboration tool like Webex, the=20 collaboration is set up first via the web, and defines the set of=20 participants. Then a voice session can be added. For pragmatic reasons,=20 the voice conferencing seems to be pretty distinct. (I'm not certain how=20 webex handles video. I suspect it is doing it via the web, not the=20 telephony conference.) Its not hard to imagine the same sort of setup, but with a multiparty=20 telepresence session instead of the traditional voice conference. In=20 such a case, you would probably want the collaboration tool (webex, or=20 whatever) to mediate the UI for the web collaboration, the roster, etc.=20 It might delegate a lot of that to the telepresence infrastructure. But that does raise some questions about how all the components fit=20 together. Is there a single screen for the collaboration session in a=20 telepresence room? What about input to that - keyboard, mouse, etc.? Or=20 do we assume that each person in the room has their own computer with=20 input, display, etc. 
and maybe a way to slave the web collaboration=20 session to one or more of the big displays in the room? What probably *can't* be done right now is nail down a particular web=20 collaboration service (e.g. Webex) or protocol. That does complicate=20 slaving the collaboration session to a screen, unless its done by having=20 someone connect a video connection to their own computer. Thanks, Paul > Stephane. > > > -----Message d'origine----- > De : clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] De la part de P= aul Kyzivat > Envoy=E9 : jeudi 28 juillet 2011 13:41 > =C0 : CHATRAS Bruno RD-CORE-ISS > Cc : clue@ietf.org > Objet : Re: [clue] Comment on the presentation use case > > On 7/28/11 2:37 AM, bruno.chatras@orange-ftgroup.com wrote: >> I think we should take a look to >> http://tools.ietf.org/html/draft-garcia-mmusic-sdp-collaboration-00 > > Maybe. But that is almost orthogonal to what I was suggesting. > > THanks, > Paul > (as individual) > >> Bruno >> >>> -----Message d'origine----- >>> De : clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] De la part de >>> Paul Kyzivat >>> Envoy=E9 : jeudi 28 juillet 2011 06:05 >>> =C0 : Roni Even >>> Cc : clue@ietf.org >>> Objet : Re: [clue] Comment on the presentation use case >>> >>> On 7/27/11 11:28 PM, Roni Even wrote: >>>> Hi, >>>> HTTP is not defining a common data sharing protocol. WebEx may be >>> carried >>>> over HTTP but the data sharing application is not standard. What I >>> meant is >>>> that it can either be something that is a common data sharing >>> protocol or >>>> something that is carried as an RTP payload which require some common >>>> defined protocol on top. >>> >>> I was being a bit tongue in cheek, though not entirely. >>> Of course you are right - that if you want to push data to everybody >>> you >>> need more. >>> >>> But data sharing by pointing a video camera at a piece of paper is a >>> tad >>> out of date. 
Connecting the video port on a user's computer as a video >>> source and distributing it with the other video is better than that. >>> But >>> its not nearly as convenient as webex or any of its competitors. >>> >>> It isn't entirely clear that its *necessary* to bundle the data sharing >>> application with the telepresence protocols. Its kind of limiting since >>> the web apps evolve very rapidly. Perhaps we should be doing the >>> opposite of that: providing a way to embed the control of the >>> telepresence system into a web app. >>> >>> Thanks, >>> Paul >>> >>>>> -----Original Message----- >>>>> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf >>> Of >>>>> Paul Kyzivat >>>>> Sent: Thursday, July 28, 2011 1:29 AM >>>>> To: clue@ietf.org >>>>> Subject: Re: [clue] Comment on the presentation use case >>>>> >>>>> On 7/27/11 6:07 PM, Roni Even wrote: >>>>>> Hi Stephane, >>>>>> >>>>>> Is there a standard protocol that is used for conveying this >>>>>> information, is it RTP based. >>>>> >>>>> AFAIK this is often http. (E.g. webex) >>>>> >>>>>> To me this is a separate application that can be integrated in the >>>>>> application level and not as part of the multistream. >>>>> >>>>> I guess this depends on whether the support for it is integrated >>> into >>>>> the "room", or is just incidental equipment brought by the users, >>> not >>>>> formally related to the telepresence session. 
>>>>> >>>>> Thanks, >>>>> Paul >>>>> (speaking as an individual) >>>>> >>>>>> Roni >>>>>> >>>>>> *clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] *On Behalf Of >>>>>> *stephane.cazeaux@orange-ftgroup.com >>>>>> *Sent:* Thursday, July 21, 2011 2:33 PM >>>>>> *To:* clue@ietf.org >>>>>> *Subject:* [clue] Comment on the presentation use case* >>>>>> >>>>>> ** >>>>>> >>>>>> *Hi,* >>>>>> >>>>>> ** >>>>>> >>>>>> *The presentation use case as described in the use-cases document >>> is >>>>>> based on the assumption that the presentation stream relies on a >>>>> video >>>>>> stream, and is limited to usage of presentation video streams. But >>> we >>>>>> could also consider collaborative use cases, meaningful for >>>>>> telepresence, which are not covered by the existing text.* >>>>>> >>>>>> *I propose to complete the existing text as follows:* >>>>>> >>>>>> ** >>>>>> >>>>>> *Furthermore, although most today's systems use video streams for >>>>>> presentations, there are use cases where this is not suitable. For >>>>> example:* >>>>>> >>>>>> *- The professor which shares an electronic whiteboard (could be a >>>>>> whiteboard application on a PC, with screen capture of the PC) >>> where >>>>> all >>>>>> students can participate. Students will take control of the shared >>>>>> whiteboard in turns.* >>>>>> >>>>>> *- In a multipoint meeting, a shared document can be kept always >>>>> visible >>>>>> in a screen, while other documents are presented on other screens >>>>> (with >>>>>> possible in turns presentation). For instance, for the purpose of >>>>> shared >>>>>> design document, notes taking, polls, etc. A shared document >>> implies >>>>>> that all participants can modify it in turns.* >>>>>> >>>>>> *"* >>>>>> >>>>>> ** >>>>>> >>>>>> ** >>>>>> >>>>>> *St ephane. 
v> >>>>>> * >>>>>> >>>>>> * >>>>>> * >>>>>> >>>>>> *_______________________________________________ >>>>>> clue mailing list >>>>>> clue@ietf.org >>>>>> https://www.ietf.org/mailman/listinfo/clue >>>>>> * >>>>> >>>>> _______________________________________________ >>>>> clue mailing list >>>>> clue@ietf.org >>>>> https://www.ietf.org/mailman/listinfo/clue >>>> >>>> >>> >>> _______________________________________________ >>> clue mailing list >>> clue@ietf.org >>> https://www.ietf.org/mailman/listinfo/clue >> > > _______________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/listinfo/clue > From pkyzivat@alum.mit.edu Fri Aug 12 08:30:09 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1CEE321F888A for ; Fri, 12 Aug 2011 08:30:09 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -2.545 X-Spam-Level: X-Spam-Status: No, score=-2.545 tagged_above=-999 required=5 tests=[AWL=0.054, BAYES_00=-2.599] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id JmM6B9CIkCks for ; Fri, 12 Aug 2011 08:30:08 -0700 (PDT) Received: from qmta05.westchester.pa.mail.comcast.net (qmta05.westchester.pa.mail.comcast.net [76.96.62.48]) by ietfa.amsl.com (Postfix) with ESMTP id D240121F8888 for ; Fri, 12 Aug 2011 08:30:07 -0700 (PDT) Received: from omta21.westchester.pa.mail.comcast.net ([76.96.62.72]) by qmta05.westchester.pa.mail.comcast.net with comcast id KT7r1h00D1ZXKqc55TWlGa; Fri, 12 Aug 2011 15:30:45 +0000 Received: from Paul-Kyzivats-MacBook-Pro.local ([24.62.109.41]) by omta21.westchester.pa.mail.comcast.net with comcast id KTWk1h01E0tdiYw3hTWlHi; Fri, 12 Aug 2011 15:30:45 +0000 Message-ID: <4E454723.4080501@alum.mit.edu> Date: Fri, 12 Aug 2011 11:30:43 -0400 From: Paul Kyzivat User-Agent: Mozilla/5.0 
(Macintosh; Intel Mac OS X 10.7; rv:5.0) Gecko/20110624 Thunderbird/5.0 MIME-Version: 1.0 To: stephane.cazeaux@orange-ftgroup.com References: <00ec01cc4ca9$9cc14e80$d643eb80$%roni@huawei.com><4E309135.7070408@alum.mit.edu><001b01cc4cd6$77d28760$67779620$%roni@huawei.com> <4E30DFDE.3040601@alum.mit.edu> <9ECCF01B52E7AB408A7EB85352642141031F2E5E@ftrdmel0.rd.francetelecom.fr> <4E314AD3.1030406@alum.mit.edu> <4E403783.70706@alum.mit.edu> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: clue@ietf.org Subject: Re: [clue] Comment on the presentation use case X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Aug 2011 15:30:09 -0000 On 8/12/11 4:48 AM, stephane.cazeaux@orange-ftgroup.com wrote: > Hi Paul, > > I understand your suggestion, and I am aware that it could be a solution. It certainly makes sense, but I am not sure of how it could be integrated in CLUE. To me, this is a matter of application integration, at application level. > > My suggestion is that we should think of a solution that provides more satisfying collaboration within telepresence, with the same level of interoperability and multistream interaction as presentation video stream provides, but with more satisfying quality (video is not suitable for all kinds of shared documents) and more collaborative features. > I don't think that the kind of relationship that you suggest is the only solution. IMO the main thing is to not constrain the mechanism used for collaboration/web sharing. That stuff is evolving at "web speed". Anything you nail down that is constraining will be obsolete before it gets out. (I've been looking more at RTCWEB recently, and I'm wondering if perhaps CLUE ought to be based on that.) Thanks, Paul (as individual) > Stephane. 
> > > -----Message d'origine----- > De : Paul Kyzivat [mailto:pkyzivat@alum.mit.edu] > Envoyé : lundi 8 août 2011 21:23 > À : CAZEAUX Stephane RD-BIZZ-CAE > Cc : clue@ietf.org > Objet : Re: [clue] Comment on the presentation use case > > On 8/8/11 11:56 AM, stephane.cazeaux@orange-ftgroup.com wrote: >> Hi, >> >> To me, it should be part of the telepresence protocols at least to enable the interoperability, as the presentation based on video stream allows it. >> >> The point is that video stream is not convenient for the use cases I suggested. But it does not necessarily mean that we should bundle a full data sharing protocol with telepresence. Something simpler, like the RFB option of draft-garcia-mmusic-sdp-collaboration, could be a candidate. > > What I was suggesting is that maybe it would make sense to turn those > relationships "inside out". > > For instance when I use a collaboration tool like Webex, the > collaboration is set up first via the web, and defines the set of > participants. Then a voice session can be added. For pragmatic reasons, > the voice conferencing seems to be pretty distinct. (I'm not certain how > webex handles video. I suspect it is doing it via the web, not the > telephony conference.) > > Its not hard to imagine the same sort of setup, but with a multiparty > telepresence session instead of the traditional voice conference. In > such a case, you would probably want the collaboration tool (webex, or > whatever) to mediate the UI for the web collaboration, the roster, etc. > It might delegate a lot of that to the telepresence infrastructure. > > But that does raise some questions about how all the components fit > together. Is there a single screen for the collaboration session in a > telepresence room? What about input to that - keyboard, mouse, etc.? Or > do we assume that each person in the room has their own computer with > input, display, etc. 
and maybe a way to slave the web collaboration > session to one or more of the big displays in the room? > > What probably *can't* be done right now is nail down a particular web > collaboration service (e.g. Webex) or protocol. That does complicate > slaving the collaboration session to a screen, unless its done by having > someone connect a video connection to their own computer. > > Thanks, > Paul > >> Stephane. >> >> >> -----Message d'origine----- >> De : clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] De la part de Paul Kyzivat >> Envoyé : jeudi 28 juillet 2011 13:41 >> À : CHATRAS Bruno RD-CORE-ISS >> Cc : clue@ietf.org >> Objet : Re: [clue] Comment on the presentation use case >> >> On 7/28/11 2:37 AM, bruno.chatras@orange-ftgroup.com wrote: >>> I think we should take a look to >>> http://tools.ietf.org/html/draft-garcia-mmusic-sdp-collaboration-00 >> >> Maybe. But that is almost orthogonal to what I was suggesting. >> >> THanks, >> Paul >> (as individual) >> >>> Bruno >>> >>>> -----Message d'origine----- >>>> De : clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] De la part de >>>> Paul Kyzivat >>>> Envoyé : jeudi 28 juillet 2011 06:05 >>>> À : Roni Even >>>> Cc : clue@ietf.org >>>> Objet : Re: [clue] Comment on the presentation use case >>>> >>>> On 7/27/11 11:28 PM, Roni Even wrote: >>>>> Hi, >>>>> HTTP is not defining a common data sharing protocol. WebEx may be >>>> carried >>>>> over HTTP but the data sharing application is not standard. What I >>>> meant is >>>>> that it can either be something that is a common data sharing >>>> protocol or >>>>> something that is carried as an RTP payload which require some common >>>>> defined protocol on top. >>>> >>>> I was being a bit tongue in cheek, though not entirely. >>>> Of course you are right - that if you want to push data to everybody >>>> you >>>> need more. >>>> >>>> But data sharing by pointing a video camera at a piece of paper is a >>>> tad >>>> out of date. 
Connecting the video port on a user's computer as a video >>>> source and distributing it with the other video is better than that. >>>> But >>>> its not nearly as convenient as webex or any of its competitors. >>>> >>>> It isn't entirely clear that its *necessary* to bundle the data sharing >>>> application with the telepresence protocols. Its kind of limiting since >>>> the web apps evolve very rapidly. Perhaps we should be doing the >>>> opposite of that: providing a way to embed the control of the >>>> telepresence system into a web app. >>>> >>>> Thanks, >>>> Paul >>>> >>>>>> -----Original Message----- >>>>>> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf >>>> Of >>>>>> Paul Kyzivat >>>>>> Sent: Thursday, July 28, 2011 1:29 AM >>>>>> To: clue@ietf.org >>>>>> Subject: Re: [clue] Comment on the presentation use case >>>>>> >>>>>> On 7/27/11 6:07 PM, Roni Even wrote: >>>>>>> Hi Stephane, >>>>>>> >>>>>>> Is there a standard protocol that is used for conveying this >>>>>>> information, is it RTP based. >>>>>> >>>>>> AFAIK this is often http. (E.g. webex) >>>>>> >>>>>>> To me this is a separate application that can be integrated in the >>>>>>> application level and not as part of the multistream. >>>>>> >>>>>> I guess this depends on whether the support for it is integrated >>>> into >>>>>> the "room", or is just incidental equipment brought by the users, >>>> not >>>>>> formally related to the telepresence session. 
>>>>>> >>>>>> Thanks, >>>>>> Paul >>>>>> (speaking as an individual) >>>>>> >>>>>>> Roni >>>>>>> >>>>>>> *clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] *On Behalf Of >>>>>>> *stephane.cazeaux@orange-ftgroup.com >>>>>>> *Sent:* Thursday, July 21, 2011 2:33 PM >>>>>>> *To:* clue@ietf.org >>>>>>> *Subject:* [clue] Comment on the presentation use case* >>>>>>> >>>>>>> ** >>>>>>> >>>>>>> *Hi,* >>>>>>> >>>>>>> ** >>>>>>> >>>>>>> *The presentation use case as described in the use-cases document >>>> is >>>>>>> based on the assumption that the presentation stream relies on a >>>>>> video >>>>>>> stream, and is limited to usage of presentation video streams. But >>>> we >>>>>>> could also consider collaborative use cases, meaningful for >>>>>>> telepresence, which are not covered by the existing text.* >>>>>>> >>>>>>> *I propose to complete the existing text as follows:* >>>>>>> >>>>>>> ** >>>>>>> >>>>>>> *Furthermore, although most today's systems use video streams for >>>>>>> presentations, there are use cases where this is not suitable. For >>>>>> example:* >>>>>>> >>>>>>> *- The professor which shares an electronic whiteboard (could be a >>>>>>> whiteboard application on a PC, with screen capture of the PC) >>>> where >>>>>> all >>>>>>> students can participate. Students will take control of the shared >>>>>>> whiteboard in turns.* >>>>>>> >>>>>>> *- In a multipoint meeting, a shared document can be kept always >>>>>> visible >>>>>>> in a screen, while other documents are presented on other screens >>>>>> (with >>>>>>> possible in turns presentation). For instance, for the purpose of >>>>>> shared >>>>>>> design document, notes taking, polls, etc. A shared document >>>> implies >>>>>>> that all participants can modify it in turns.* >>>>>>> >>>>>>> *"* >>>>>>> >>>>>>> ** >>>>>>> >>>>>>> ** >>>>>>> >>>>>>> *St ephane. 
v> >>>>>>> * >>>>>>> >>>>>>> * >>>>>>> * >>>>>>> >>>>>>> *_______________________________________________ >>>>>>> clue mailing list >>>>>>> clue@ietf.org >>>>>>> https://www.ietf.org/mailman/listinfo/clue >>>>>>> * >>>>>> >>>>>> _______________________________________________ >>>>>> clue mailing list >>>>>> clue@ietf.org >>>>>> https://www.ietf.org/mailman/listinfo/clue >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> clue mailing list >>>> clue@ietf.org >>>> https://www.ietf.org/mailman/listinfo/clue >>> >> >> _______________________________________________ >> clue mailing list >> clue@ietf.org >> https://www.ietf.org/mailman/listinfo/clue >> > > From Even.roni@huawei.com Sun Aug 14 03:12:58 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id B685C21F8661 for ; Sun, 14 Aug 2011 03:12:58 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -105.622 X-Spam-Level: X-Spam-Status: No, score=-105.622 tagged_above=-999 required=5 tests=[AWL=-0.356, BAYES_00=-2.599, FRT_FOLLOW1=1.332, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id DMZ8zFXvcGMp for ; Sun, 14 Aug 2011 03:12:58 -0700 (PDT) Received: from szxga04-in.huawei.com (szxga04-in.huawei.com [119.145.14.67]) by ietfa.amsl.com (Postfix) with ESMTP id AC1AB21F85BB for ; Sun, 14 Aug 2011 03:12:57 -0700 (PDT) Received: from huawei.com (szxga04-in [172.24.2.12]) by szxga04-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0LPW00LUXXQR6B@szxga04-in.huawei.com> for clue@ietf.org; Sun, 14 Aug 2011 18:13:39 +0800 (CST) Received: from huawei.com ([172.24.2.119]) by szxga04-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id 
<0LPW0065AXQRUR@szxga04-in.huawei.com> for clue@ietf.org; Sun, 14 Aug 2011 18:13:39 +0800 (CST) Received: from windows8d787f9 (bzq-79-180-16-191.red.bezeqint.net [79.180.16.191]) by szxml12-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTPA id <0LPW00I1AXQKT8@szxml12-in.huawei.com> for clue@ietf.org; Sun, 14 Aug 2011 18:13:39 +0800 (CST) Date: Sun, 14 Aug 2011 13:13:17 +0300 From: Roni Even To: clue@ietf.org Message-id: <02c701cc5a6a$cd8bdbb0$68a39310$%roni@huawei.com> MIME-version: 1.0 X-Mailer: Microsoft Office Outlook 12.0 Content-type: multipart/alternative; boundary="Boundary_(ID_uf15amNaWv9+zvhSjz96LA)" Content-language: en-us Thread-index: Acxaasg6Cyi54ZGPRR+etvD1LpiaSA== Subject: [clue] Capture Scene and system description X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 14 Aug 2011 10:12:59 -0000 This is a multi-part message in MIME format. --Boundary_(ID_uf15amNaWv9+zvhSjz96LA) Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7BIT Hi, The way I read the framework is that it assumes one model of a generic endpoint described by the cameras having a left to right spatial relation. My view from the charter is that the description should also address the displays and camera positions which are not the same for all endpoint. Cameras can be centrally located or on top of each screen. The screen may be very close to one other or at some distance from one others. All this information is relevant of you want to convey the "being there" experience which is why we chartered this endpoint and not to achieve just a simple multi-stream connection. 
>From the charter: This working group is chartered to specify the following information about media streams from one entity to another entity: * Spatial relationships of cameras, displays, microphones, and loudspeakers - relative to each other and to likely positions of participants * Viewpoint, field of view/capture for camera/microphone/display/loudspeaker - so that senders and intermediate devices can understand how best to compose streams for receivers, and the receiver will know the characteristics of its received streams I think that the current base model does not address this two bullets from the charter. My preference is to define the "Capture Scene" so it will have parameters that will enable the advertisement of the camera positions and the number of displays and their relative position. As for the camera viewpoint I think this is being discussed in a separate thread on the layout and I will address my comments there. BR Roni Even --Boundary_(ID_uf15amNaWv9+zvhSjz96LA) Content-type: text/html; charset=us-ascii Content-transfer-encoding: 7BIT

Hi,

The way I read the framework, it assumes a single model of a generic endpoint, described by the cameras having a left-to-right spatial relation.

My view from the charter is that the description should also address the display and camera positions, which are not the same for all endpoints. Cameras can be centrally located or on top of each screen. The screens may be very close to one another or at some distance from one another. All this information is relevant if you want to convey the "being there" experience, which is why we chartered this work and not just a simple multi-stream connection.

 

From the charter:

This working group is chartered to specify the following information about media streams from one entity to another entity:

 

  * Spatial relationships of cameras, displays, microphones, and

    loudspeakers - relative to each other and to likely positions of

    participants

 

  * Viewpoint, field of view/capture for

    camera/microphone/display/loudspeaker - so that senders and

    intermediate devices can understand how best to compose streams for

    receivers, and the receiver will know the characteristics of its

    received streams

 

I think that the current base model does not address these two bullets from the charter.

My preference is to define the "Capture Scene" so that it has parameters enabling the advertisement of the camera positions and of the number of displays and their relative positions.
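To make the kind of advertisement being proposed concrete, here is a minimal sketch. It is purely illustrative: the class names, coordinate convention, and `advertisement` helper are all invented for this example and do not come from any CLUE draft.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical sketch of a "Capture Scene" that advertises camera and
# display positions; all names and the (x, y, z) meter convention are
# assumptions for illustration, not part of any CLUE document.

@dataclass
class Camera:
    device_id: str
    position: Tuple[float, float, float]  # x, y, z in meters, room-relative

@dataclass
class Display:
    device_id: str
    position: Tuple[float, float, float]
    width_m: float

@dataclass
class CaptureScene:
    cameras: List[Camera] = field(default_factory=list)
    displays: List[Display] = field(default_factory=list)

    def advertisement(self) -> dict:
        """Flatten the scene into a structure an endpoint could advertise."""
        return {
            "cameras": [(c.device_id, c.position) for c in self.cameras],
            "displays": [(d.device_id, d.position, d.width_m)
                         for d in self.displays],
        }

# A three-screen room with one camera centered above each display.
scene = CaptureScene(
    cameras=[Camera(f"cam{i}", (i * 1.5, 1.2, 0.0)) for i in range(3)],
    displays=[Display(f"disp{i}", (i * 1.5, 0.8, 0.0), 1.4) for i in range(3)],
)
ad = scene.advertisement()
```

A receiver consuming such an advertisement could then tell, for example, that the cameras sit above their displays rather than centrally, which is exactly the distinction the left-to-right-only model cannot express.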

 

As for the camera viewpoint, I think this is being discussed in a separate thread on the layout, and I will address my comments there.

 

BR

Roni Even

 

 

--Boundary_(ID_uf15amNaWv9+zvhSjz96LA)-- From bbaldino@cisco.com Mon Aug 15 13:20:51 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 568B421F8D4B for ; Mon, 15 Aug 2011 13:20:51 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -1.266 X-Spam-Level: X-Spam-Status: No, score=-1.266 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, FRT_FOLLOW1=1.332, HTML_MESSAGE=0.001] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id GKIR27uUFMHJ for ; Mon, 15 Aug 2011 13:20:49 -0700 (PDT) Received: from rcdn-iport-6.cisco.com (rcdn-iport-6.cisco.com [173.37.86.77]) by ietfa.amsl.com (Postfix) with ESMTP id 7CB0921F8D4A for ; Mon, 15 Aug 2011 13:20:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=cisco.com; i=bbaldino@cisco.com; l=12355; q=dns/txt; s=iport; t=1313439696; x=1314649296; h=mime-version:subject:date:message-id:in-reply-to: references:from:to; bh=1Hl3nmo3Im/JMxD86OqZIxnKAqD0h+7sW8jGCkQednE=; b=lpGImvIvAonEW1fPugXHrmRTLtV0n4IGv9nnjhQAdh7w6igfFnykEpDm 7b99STSgGooAeDHKM8aj6ejA+g3UxQ8SrHisxRyYw5XNj+EhPLwx7JDLV Ye8wEZHsrmwIaR/rukQqp6OR0OtQaq+7Ha8Eq1xjY6vLWyQm2mCS5XE05 A=; X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AtMAAEZ/SU6rRDoJ/2dsb2JhbABBgk2Ve49Od4FAAQEBAQMSAQkRA0IXAgEIEQQBAQsGFwEGAUUJCAEBBAESCBqiJgGfBIVoXwSHX5BIjAA X-IronPort-AV: E=Sophos;i="4.67,375,1309737600"; d="scan'208,217";a="13322659" Received: from mtv-core-4.cisco.com ([171.68.58.9]) by rcdn-iport-6.cisco.com with ESMTP; 15 Aug 2011 20:21:35 +0000 Received: from xbh-sjc-211.amer.cisco.com (xbh-sjc-211.cisco.com [171.70.151.144]) by mtv-core-4.cisco.com (8.14.3/8.14.3) with ESMTP id p7FKLZTR029844; Mon, 15 Aug 2011 20:21:35 GMT Received: from xmb-sjc-233.amer.cisco.com ([128.107.191.88]) by xbh-sjc-211.amer.cisco.com with Microsoft 
SMTPSVC(6.0.3790.4675); Mon, 15 Aug 2011 13:21:35 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01CC5B88.ED8D4BF4" Date: Mon, 15 Aug 2011 13:21:34 -0700 Message-ID: In-Reply-To: <02c701cc5a6a$cd8bdbb0$68a39310$%roni@huawei.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [clue] Capture Scene and system description thread-index: Acxaasg6Cyi54ZGPRR+etvD1LpiaSABHcgjw References: <02c701cc5a6a$cd8bdbb0$68a39310$%roni@huawei.com> From: "Brian Baldino (bbaldino)" To: "Roni Even" , X-OriginalArrivalTime: 15 Aug 2011 20:21:35.0119 (UTC) FILETIME=[EDDBD9F0:01CC5B88] Subject: Re: [clue] Capture Scene and system description X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Aug 2011 20:20:51 -0000 This is a multi-part message in MIME format. ------_=_NextPart_001_01CC5B88.ED8D4BF4 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Hey Roni, I agree that the current description of the framework doesn't provide a mechanism to describe the concepts you mentioned; we plan on adding support for them and the mechanisms for doing so will be added to the framework soon. Once we take our best shot at them we can make sure they cover the use cases you described. =20 -Brian =20 From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Roni Even Sent: Sunday, August 14, 2011 3:13 AM To: clue@ietf.org Subject: [clue] Capture Scene and system description =20 Hi, The way I read the framework is that it assumes one model of a generic endpoint described by the cameras having a left to right spatial relation. 
My view from the charter is that the description should also address the
display and camera positions, which are not the same for all endpoints.
Cameras can be centrally located or on top of each screen. The screens may
be very close to one another or at some distance from one another. All this
information is relevant if you want to convey the "being there" experience,
which is why we chartered this work and not just a simple multi-stream
connection.

From the charter:

This working group is chartered to specify the following information about
media streams from one entity to another entity:

  * Spatial relationships of cameras, displays, microphones, and
    loudspeakers - relative to each other and to likely positions of
    participants

  * Viewpoint, field of view/capture for
    camera/microphone/display/loudspeaker - so that senders and
    intermediate devices can understand how best to compose streams for
    receivers, and the receiver will know the characteristics of its
    received streams

I think that the current base model does not address these two bullets from
the charter.

My preference is to define the "Capture Scene" so it will have parameters
that will enable the advertisement of the camera positions and the number
of displays and their relative position.

As for the camera viewpoint, I think this is being discussed in a separate
thread on the layout, and I will address my comments there.

BR
Roni Even
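For illustration, the kind of parameters Roni is asking for could be modeled roughly as below. This is a hypothetical sketch only — the names `CaptureScene`, `Camera`, `Display` and the coordinate convention are invented here, not part of the framework draft — but it shows a scene advertisement carrying camera and display positions rather than assuming a fixed left-to-right camera row.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Camera:
    index: int                              # left-to-right order, 0 = leftmost
    position_m: Tuple[float, float, float]  # (x, y, z) in meters, room coordinates
    on_display: Optional[int] = None        # display index if mounted on top of a screen

@dataclass
class Display:
    index: int
    position_m: Tuple[float, float, float]
    width_m: float

@dataclass
class CaptureScene:
    # Parameters covering the two charter bullets: spatial relationships of
    # cameras and displays, advertised from one entity to another.
    cameras: List[Camera] = field(default_factory=list)
    displays: List[Display] = field(default_factory=list)

# A three-screen endpoint with one camera centered on top of each screen:
scene = CaptureScene(
    cameras=[Camera(i, (1.6 * i, 0.0, 1.2), on_display=i) for i in range(3)],
    displays=[Display(i, (1.6 * i, 0.0, 1.0), width_m=1.5) for i in range(3)],
)
```

An endpoint with centrally located cameras would simply advertise different `position_m`/`on_display` values, letting the receiver reconstruct the sender's geometry instead of assuming it.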
From eckelcu@cisco.com Mon Aug 15 13:21:08 2011
From: "Charles Eckel (eckelcu)"
To: "Paul Kyzivat", clue@ietf.org
Date: Mon, 15 Aug 2011 13:21:44 -0700
Subject: Re: [clue] continuing "layout" discussion

Please see inline.

> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Paul Kyzivat
> Sent: Thursday, August 11, 2011 6:02 AM
> To: clue@ietf.org
> Subject: Re: [clue] continuing "layout" discussion
>
> Inline
>
> On 8/10/11 5:49 PM, Duckworth, Mark wrote:
> >> -----Original Message-----
> >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> >> Paul Kyzivat
> >> Sent: Tuesday, August 09, 2011 9:03 AM
> >> To: clue@ietf.org
> >> Subject: Re: [clue] continuing "layout" discussion
> >
> >>> 4 - multi stream media format - what the streams mean with respect to
> >> each other, regardless of the actual content on the streams. For
> >> audio, examples are stereo, 5.1 surround, binaural, linear array.
> >> (linear array is described in the clue framework document). Perhaps 3D
> >> video formats would also fit in this category. This information is
> >> needed in order to properly render the media into light and sound for
> >> human observers. I see this at the same level as identifying a codec,
> >> independent of the audio or video content carried on the streams, and
> >> independent of how any composition of sources is done.

I do not think this is necessarily true. Taking audio as an example, you
could have two audio streams that are mixed to form a single stereo audio
stream, or you could have them as two independent (not mixed) streams that
are associated with each other by some grouping mechanism. This group would
be categorized as being stereo audio, with one audio stream being the left
and the other the right. The codec used for each could be different, though
I agree they would typically be the same. Consequently, I think of an
attribute such as "stereo" as being more of a grouping concept, where the
group may consist of:
- multiple independent streams, each with potentially its own spatial
  orientation, codec, bandwidth, etc.
- a single mixed stream

Cheers,
Charles

> >> I was with you all the way until 4. That one I don't understand.
> >> The name you chose for this has connotations for me, but isn't fully in
> >> harmony with the definitions you give:
> >
> > I'm happy to change the name if you have a suggestion
>
> Not yet. Maybe once the concepts are more clearly defined I will have an
> opinion.
>
> >> If we consider audio, it makes sense that multiple streams can be
> >> rendered as if they came from different physical locations in the
> >> receiving room. That can be done by the receiver if it gets those
> >> streams separately, and has information about their intended
> >> relationships. It can also be done by the sender or MCU and passed on
> >> to the receiver as a single stream with stereo or binaural coding.
> >
> > Yes.
> > It could also be done by the sender using the "linear array" audio
> > channel format. Maybe it is true that stereo or binaural audio channels
> > would always be sent as a single stream, but I was not assuming that
> > yet, at least not in general when you consider other types too, such as
> > linear array channels.
>
> >> So it seems to me you have two concepts here, not one. One has to do
> >> with describing the relationships between streams, and the other has to
> >> do with the encoding of spatial relationships *within* a single stream.
> >
> > Maybe that is a better way to describe it, if you assume multi-channel
> > audio is always sent with all the channels in the same RTP stream. Is
> > that what you mean?
> >
> > I was considering the linear array format to be another type of
> > multi-channel audio, and I know people want to be able to send each
> > channel in a separate RTP stream. So it doesn't quite fit with how you
> > separate the two concepts. In my view, identifying the separate channels
> > by what they mean is the same concept for linear array and stereo. For
> > example "this channel is left, this channel is center, this channel is
> > right". To me, that is the same concept for identifying channels whether
> > or not they are carried in the same RTP stream.
> >
> > Maybe we are thinking the same thing but getting confused by terminology
> > about channels vs. streams.
>
> Maybe. Let me try to restate what I now think you are saying:
>
> The audio may consist of several "channels".
>
> Each channel may be sent over its own RTP stream,
> or multiple channels may be multiplexed over an RTP stream.
>
> I guess much of this can also apply to video.
>
> When there are exactly two audio channels, they may be encoded as
> "stereo" or "binaural", which then affects how they should be rendered
> by the recipient. In these cases the primary info that is required about
> the individual channels is which is left and which is right.
> (And which perspective to use in interpreting left and right.)
>
> For other multi-channel cases more information is required about the
> role of each channel in order to properly render them.
>
> Thanks,
> Paul
>
> >> Or, are you asserting that stereo and binaural are simply ways to
> >> encode multiple logical streams in one RTP stream, together with their
> >> spatial relationships?
> >
> > No, that is not what I'm trying to say.
> >
> > Mark
>
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue

From stephen.botzko@gmail.com Mon Aug 15 14:12:51 2011
From: Stephen Botzko
To: "Charles Eckel (eckelcu)"
Cc: clue@ietf.org
Date: Mon, 15 Aug 2011 17:13:36 -0400
Subject: Re: [clue] continuing "layout" discussion

Inline

On Mon, Aug 15, 2011 at 4:21 PM, Charles Eckel (eckelcu) wrote:

> Please see inline.
> > -----Original Message-----
> > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Paul Kyzivat
> > Sent: Thursday, August 11, 2011 6:02 AM
> > To: clue@ietf.org
> > Subject: Re: [clue] continuing "layout" discussion
> >
> > Inline
> >
> > On 8/10/11 5:49 PM, Duckworth, Mark wrote:
> > >> -----Original Message-----
> > >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> > >> Paul Kyzivat
> > >> Sent: Tuesday, August 09, 2011 9:03 AM
> > >> To: clue@ietf.org
> > >> Subject: Re: [clue] continuing "layout" discussion
> > >
> > >>> 4 - multi stream media format - what the streams mean with respect to
> > >> each other, regardless of the actual content on the streams. For
> > >> audio, examples are stereo, 5.1 surround, binaural, linear array.
> > >> (linear array is described in the clue framework document). Perhaps 3D
> > >> video formats would also fit in this category. This information is
> > >> needed in order to properly render the media into light and sound for
> > >> human observers. I see this at the same level as identifying a codec,
> > >> independent of the audio or video content carried on the streams, and
> > >> independent of how any composition of sources is done.
>
> I do not think this is necessarily true. Taking audio as an example, you
> could have two audio streams that are mixed to form a single stereo audio
> stream, or you could have them as two independent (not mixed) streams that
> are associated with each other by some grouping mechanism. This group would
> be categorized as being stereo audio, with one audio stream being the left
> and the other the right. The codec used for each could be different, though
> I agree they would typically be the same. Consequently, I think of an
> attribute such as "stereo" as being more of a grouping concept, where the
> group may consist of:
> - multiple independent streams, each with potentially its own spatial
>   orientation, codec, bandwidth, etc.
> - a single mixed stream

[sb] I do not understand this distinction. What do you mean when you say
"two audio streams that are mixed to form a single stereo stream", and how
is this different from the left and right grouping?

> Cheers,
> Charles
>
> > >> I was with you all the way until 4. That one I don't understand.
> > >> The name you chose for this has connotations for me, but isn't fully in
> > >> harmony with the definitions you give:
> > >
> > > I'm happy to change the name if you have a suggestion
> >
> > Not yet. Maybe once the concepts are more clearly defined I will have an
> > opinion.
> >
> > >> If we consider audio, it makes sense that multiple streams can be
> > >> rendered as if they came from different physical locations in the
> > >> receiving room. That can be done by the receiver if it gets those
> > >> streams separately, and has information about their intended
> > >> relationships. It can also be done by the sender or MCU and passed on
> > >> to the receiver as a single stream with stereo or binaural coding.
> > >
> > > Yes. It could also be done by the sender using the "linear array"
> > > audio channel format. Maybe it is true that stereo or binaural audio
> > > channels would always be sent as a single stream, but I was not
> > > assuming that yet, at least not in general when you consider other
> > > types too, such as linear array channels.
> > >
> > >> So it seems to me you have two concepts here, not one. One has to do
> > >> with describing the relationships between streams, and the other has to
> > >> do with the encoding of spatial relationships *within* a single stream.
> > >
> > > Maybe that is a better way to describe it, if you assume multi-channel
> > > audio is always sent with all the channels in the same RTP stream. Is
> > > that what you mean?
> > >
> > > I was considering the linear array format to be another type of
> > > multi-channel audio, and I know people want to be able to send each
> > > channel in a separate RTP stream. So it doesn't quite fit with how you
> > > separate the two concepts. In my view, identifying the separate channels
> > > by what they mean is the same concept for linear array and stereo. For
> > > example "this channel is left, this channel is center, this channel is
> > > right". To me, that is the same concept for identifying channels whether
> > > or not they are carried in the same RTP stream.
> > >
> > > Maybe we are thinking the same thing but getting confused by terminology
> > > about channels vs. streams.
> >
> > Maybe. Let me try to restate what I now think you are saying:
> >
> > The audio may consist of several "channels".
> >
> > Each channel may be sent over its own RTP stream,
> > or multiple channels may be multiplexed over an RTP stream.
> >
> > I guess much of this can also apply to video.
> >
> > When there are exactly two audio channels, they may be encoded as
> > "stereo" or "binaural", which then affects how they should be rendered
> > by the recipient. In these cases the primary info that is required about
> > the individual channels is which is left and which is right. (And which
> > perspective to use in interpreting left and right.)
> >
> > For other multi-channel cases more information is required about the
> > role of each channel in order to properly render them.
> >
> > Thanks,
> > Paul
> >
> > >> Or, are you asserting that stereo and binaural are simply ways to
> > >> encode multiple logical streams in one RTP stream, together with their
> > >> spatial relationships?
> > >
> > > No, that is not what I'm trying to say.
> > >
> > > Mark
>
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue
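The distinction Stephen asks about can be made concrete with a toy sketch (illustrative only: plain sample lists stand in for RTP media, and the `group` dictionary is an invented stand-in for whatever grouping mechanism the protocol ends up defining). In the first option, the left/right relationship lives inside a single stream's payload, with the samples interleaved; in the second, it lives in metadata that relates two independent mono streams.

```python
# Two mono captures, e.g. left and right microphones (toy PCM samples).
left = [10, 11, 12]
right = [20, 21, 22]

# Option 1: a single mixed stereo stream -- samples interleaved L,R,L,R,...
# One codec, one stream; the L/R relationship is encoded in the payload itself.
stereo_stream = [s for pair in zip(left, right) for s in pair]

# Option 2: two independent streams tied together by a grouping attribute.
# Each stream could use its own codec and bandwidth; the L/R relationship is
# carried as out-of-band metadata, not in the media.
group = {
    "type": "stereo",
    "streams": [
        {"channel": "left", "samples": left},
        {"channel": "right", "samples": right},
    ],
}

assert stereo_stream == [10, 20, 11, 21, 12, 22]
```

The same labeling idea extends beyond two channels: a linear-array group would carry "left"/"center"/"right" roles per stream, regardless of whether the channels share one RTP stream.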
From Even.roni@huawei.com Mon Aug 15 14:22:27 2011
From: Roni Even
To: "'Charles Eckel (eckelcu)'", 'Paul Kyzivat', clue@ietf.org
Date: Tue, 16 Aug 2011 00:22:28 +0300
Subject: Re: [clue] continuing "layout" discussion

Hi,

It looks to me like I agree with Charles, but taking it to video I see that
we have two separate entities which appear now in the framework as one. We
have three video capture devices or streams (similar to two audio streams),
and we have the grouping, which is left to right. Currently the left-to-right
is assumed.

Roni

> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> Charles Eckel (eckelcu)
> Sent: Monday, August 15, 2011 11:22 PM
> To: Paul Kyzivat; clue@ietf.org
> Subject: Re: [clue] continuing "layout" discussion
>
> Please see inline.
>
> > -----Original Message-----
> > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Paul Kyzivat
> > Sent: Thursday, August 11, 2011 6:02 AM
> > To: clue@ietf.org
> > Subject: Re: [clue] continuing "layout" discussion
> >
> > Inline
> >
> > On 8/10/11 5:49 PM, Duckworth, Mark wrote:
> > >> -----Original Message-----
> > >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> > >> Paul Kyzivat
> > >> Sent: Tuesday, August 09, 2011 9:03 AM
> > >> To: clue@ietf.org
> > >> Subject: Re: [clue] continuing "layout" discussion
> > >
> > >>> 4 - multi stream media format - what the streams mean with respect to
> > >> each other, regardless of the actual content on the streams. For
> > >> audio, examples are stereo, 5.1 surround, binaural, linear array.
> > >> (linear array is described in the clue framework document). Perhaps 3D
> > >> video formats would also fit in this category. This information is
> > >> needed in order to properly render the media into light and sound for
> > >> human observers. I see this at the same level as identifying a codec,
> > >> independent of the audio or video content carried on the streams, and
> > >> independent of how any composition of sources is done.
>
> I do not think this is necessarily true. Taking audio as an example, you
> could have two audio streams that are mixed to form a single stereo audio
> stream, or you could have them as two independent (not mixed) streams that
> are associated with each other by some grouping mechanism. This group would
> be categorized as being stereo audio, with one audio stream being the left
> and the other the right. The codec used for each could be different, though
> I agree they would typically be the same. Consequently, I think of an
> attribute such as "stereo" as being more of a grouping concept, where the
> group may consist of:
> - multiple independent streams, each with potentially its own spatial
>   orientation, codec, bandwidth, etc.
> - a single mixed stream
>
> Cheers,
> Charles
>
> > >> I was with you all the way until 4. That one I don't understand.
> > >> The name you chose for this has connotations for me, but isn't fully in
> > >> harmony with the definitions you give:
> > >
> > > I'm happy to change the name if you have a suggestion
> >
> > Not yet. Maybe once the concepts are more clearly defined I will have an
> > opinion.
> >
> > >> If we consider audio, it makes sense that multiple streams can be
> > >> rendered as if they came from different physical locations in the
> > >> receiving room. That can be done by the receiver if it gets those
> > >> streams separately, and has information about their intended
> > >> relationships. It can also be done by the sender or MCU and passed on
> > >> to the receiver as a single stream with stereo or binaural coding.
> > >
> > > Yes. It could also be done by the sender using the "linear array"
> > > audio channel format. Maybe it is true that stereo or binaural audio
> > > channels would always be sent as a single stream, but I was not
> > > assuming that yet, at least not in general when you consider other
> > > types too, such as linear array channels.
> > >
> > >> So it seems to me you have two concepts here, not one. One has to do
> > >> with describing the relationships between streams, and the other has to
> > >> do with the encoding of spatial relationships *within* a single stream.
> > >
> > > Maybe that is a better way to describe it, if you assume multi-channel
> > > audio is always sent with all the channels in the same RTP stream. Is
> > > that what you mean?
> > >
> > > I was considering the linear array format to be another type of
> > > multi-channel audio, and I know people want to be able to send each
> > > channel in a separate RTP stream. So it doesn't quite fit with how you
> > > separate the two concepts. In my view, identifying the separate channels
> > > by what they mean is the same concept for linear array and stereo. For
> > > example "this channel is left, this channel is center, this channel is
> > > right". To me, that is the same concept for identifying channels whether
> > > or not they are carried in the same RTP stream.
> > >
> > > Maybe we are thinking the same thing but getting confused by terminology
> > > about channels vs. streams.
> >
> > Maybe. Let me try to restate what I now think you are saying:
> >
> > The audio may consist of several "channels".
> >
> > Each channel may be sent over its own RTP stream,
> > or multiple channels may be multiplexed over an RTP stream.
> >
> > I guess much of this can also apply to video.
> >
> > When there are exactly two audio channels, they may be encoded as
> > "stereo" or "binaural", which then affects how they should be rendered
> > by the recipient. In these cases the primary info that is required about
> > the individual channels is which is left and which is right. (And which
> > perspective to use in interpreting left and right.)
> >
> > For other multi-channel cases more information is required about the
> > role of each channel in order to properly render them.
> >
> > Thanks,
> > Paul
> >
> > >> Or, are you asserting that stereo and binaural are simply ways to
> > >> encode multiple logical streams in one RTP stream, together with their
> > >> spatial relationships?
> > >
> > > No, that is not what I'm trying to say.
> > >
> > > Mark
>
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue

From eckelcu@cisco.com Mon Aug 15 14:44:35 2011
4F66121F8CF8 for ; Mon, 15 Aug 2011 14:44:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=cisco.com; i=eckelcu@cisco.com; l=7368; q=dns/txt; s=iport; t=1313444721; x=1314654321; h=mime-version:content-transfer-encoding:subject:date: message-id:in-reply-to:references:from:to:cc; bh=5+bdJbLTTrrV83tAtxLuXdFmj68l9yfx85NVfRig4zU=; b=Q2U/ErYEry/tTOZPioPSHuLJ4lWNxyzNBOPsgwpzEnnBYmPTFzTrfb3T 65dOPbS35schYlmFrjrQvRD/WnKio0fB//+xFJ++wJLhFVd2ARyxJOrqp bpQjBZahaZNvRfsgiI41VP+1zjnJbtXcq57x0NZMNrTFFx981k2kPI4Kd k=; X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AtMAADOTSU6rRDoG/2dsb2JhbABBmEiPTneBQAEBAQEDAQEBDwEdCi0HBAcMBAIBCBEEAQEBCgYXAQYBIAYfCQgBAQQTCBqHUpxLAZ8EhWhfBIdfkEiEYYcf X-IronPort-AV: E=Sophos;i="4.67,376,1309737600"; d="scan'208";a="13347683" Received: from mtv-core-1.cisco.com ([171.68.58.6]) by rcdn-iport-6.cisco.com with ESMTP; 15 Aug 2011 21:45:20 +0000 Received: from xbh-sjc-211.amer.cisco.com (xbh-sjc-211.cisco.com [171.70.151.144]) by mtv-core-1.cisco.com (8.14.3/8.14.3) with ESMTP id p7FLjKT6008568; Mon, 15 Aug 2011 21:45:20 GMT Received: from xmb-sjc-234.amer.cisco.com ([128.107.191.111]) by xbh-sjc-211.amer.cisco.com with Microsoft SMTPSVC(6.0.3790.4675); Mon, 15 Aug 2011 14:45:19 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Date: Mon, 15 Aug 2011 14:45:18 -0700 Message-ID: In-Reply-To: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [clue] continuing "layout" discussion Thread-Index: AcxbkD0Rs5kNuIj8Qvyg+A9Bk9uN5QAA9LrQ References: <44C6B6B2D0CF424AA90B6055548D7A61AE9B48AD@CRPMBOXPRD01.polycom.com><4E413021.3010509@alum.mit.edu><44C6B6B2D0CF424AA90B6055548D7A61AEA65C62@CRPMBOXPRD01.polycom.com><4E43D2BE.5010102@alum.mit.edu> From: "Charles Eckel (eckelcu)" To: "Stephen Botzko" X-OriginalArrivalTime: 15 Aug 2011 21:45:19.0707 (UTC) 
FILETIME=[A0BF22B0:01CC5B94] Cc: clue@ietf.org Subject: Re: [clue] continuing "layout" discussion X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Aug 2011 21:44:36 -0000 > -----Original Message----- > From: Stephen Botzko [mailto:stephen.botzko@gmail.com] > Sent: Monday, August 15, 2011 2:14 PM > To: Charles Eckel (eckelcu) > Cc: Paul Kyzivat; clue@ietf.org > Subject: Re: [clue] continuing "layout" discussion >=20 > Inline >=20 >=20 > On Mon, Aug 15, 2011 at 4:21 PM, Charles Eckel (eckelcu) wrote: >=20 >=20 > Please see inline. >=20 >=20 > > -----Original Message----- > > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf > Of Paul Kyzivat >=20 > > Sent: Thursday, August 11, 2011 6:02 AM >=20 > > To: clue@ietf.org > > Subject: Re: [clue] continuing "layout" discussion > > > > Inline > > > > On 8/10/11 5:49 PM, Duckworth, Mark wrote: > > >> -----Original Message----- > > >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On > Behalf Of > > >> Paul Kyzivat > > >> Sent: Tuesday, August 09, 2011 9:03 AM > > >> To: clue@ietf.org > > >> Subject: Re: [clue] continuing "layout" discussion > > > > > >>> 4 - multi stream media format - what the streams mean with respect > to > > >> each other, regardless of the actual content on the streams. For > > >> audio, examples are stereo, 5.1 surround, binaural, linear array. > > >> (linear array is described in the clue framework document). > Perhaps 3D > > >> video formats would also fit in this category. This information is > > >> needed in order to properly render the media into light and sound > for > > >> human observers. I see this at the same level as identifying a > codec, > > >> independent of the audio or video content carried on the streams, > and > > >> independent of how any composition of sources is done. 
> > > I do not think this is necessarily true. Taking audio as an example, you > could have two audio streams that are mixed to form a single stereo > audio stream, or you could have them as two independent (not mixed) > streams that are associated with each other by some grouping mechanism. > This group would be categorized as being stereo audio with one audio > stream being the left and the other the right. The codec used for each > could be different, though I agree they would typically be the same. > Consequently, I think of an attribute such as "stereo" as being more of a > grouping concept, where the group may consist of: > - multiple independent streams, each with potentially its own spatial > orientation, codec, bandwidth, etc., > - a single mixed stream > > > > [sb] I do not understand this distinction. What do you mean when you say "two audio streams that are > mixed to form a single stereo stream", and how is this different from the left and right grouping? In one case they are mixed by the source of the stream into a single stream, and in another they are sent as two separate streams by the source. The end result once rendered at the receiver may be the same, but what is sent is different. This example with audio is perhaps too simple. If you think of it as video that is composed into a single video stream vs. multiple video streams that are sent individually, the difference may be more clear. Cheers, Charles > > > > Cheers, > Charles > > > > >> I was with you all the way until 4. That one I don't understand. > > >> The name you chose for this has connotations for me, but isn't > fully in > > >> harmony with the definitions you give: > > > > > > I'm happy to change the name if you have a suggestion > > > > Not yet. Maybe once the concepts are more clearly defined I will have > an > > opinion. 
> > > > >> If we consider audio, it makes sense that multiple streams can be > > >> rendered as if they came from different physical locations in the > > >> receiving room. That can be done by the receiver if it gets those > > >> streams separately, and has information about their intended > > >> relationships. It can also be done by the sender or MCU and passed > on > > >> to > > >> the receiver as a single stream with stereo or binaural coding. > > > > > > Yes. It could also be done by the sender using the "linear array" > audio channel format. Maybe it > > is true that stereo or binaural audio channels would always be sent as > a single stream, but I was not > > assuming that yet, at least not in general when you consider other > types too, such as linear array > > channels. > > > > >> So it seems to me you have two concepts here, not one. One has to > do > > >> with describing the relationships between streams, and the other > has to > > >> do with the encoding of spacial relationships *within* a single > stream. > > > > > > Maybe that is a better way to describe it, if you assume > multi-channel audio is always sent with all > > the channels in the same RTP stream. Is that what you mean? > > > > > > I was considering the linear array format to be another type of > multi-channel audio, and I know > > people want to be able to send each channel in a separate RTP stream. > So it doesn't quite fit with > > how you separate the two concepts. In my view, identifying the > separate channels by what they mean is > > the same concept for linear array and stereo. For example "this > channel is left, this channel is > > center, this channel is right". To me, that is the same concept for > identifying channels whether or > > not they are carried in the same RTP stream. > > > > > > Maybe we are thinking the same thing but getting confused by > terminology about channels vs. streams. > > > > Maybe. 
Let me try to restate what I now think you are saying: > > > > The audio may consist of several "channels". > > > > Each channel may be sent over its own RTP stream, > > or multiple channels may be multiplexed over an RTP stream. > > > > I guess much of this can also apply to video. > > > > When there are exactly two audio channels, they may be encoded as > > "stereo" or "binaural", which then affects how they should be rendered > > by the recipient. In these cases the primary info that is required > about > > the individual channels is which is left and which is right. (And > which > > perspective to use in interpretting left and right.) > > > > For other multi-channel cases more information is required about the > > role of each channel in order to properly render them. > > > > Thanks, > > Paul > > > > > > >> Or, are you asserting that stereo and binaural are simply ways to > > >> encode > > >> multiple logical streams in one RTP stream, together with their > spacial > > >> relationships? > > > > > > No, that is not what I'm trying to say. 
> > > > > > Mark > > > _______________________________________________ > > > clue mailing list > > > clue@ietf.org > > > https://www.ietf.org/mailman/listinfo/clue > > > > > > > _______________________________________________ > > clue mailing list > > clue@ietf.org > > https://www.ietf.org/mailman/listinfo/clue > _______________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/listinfo/clue >=20 >=20 From stephen.botzko@gmail.com Tue Aug 16 06:19:17 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2F3AC21F8A91 for ; Tue, 16 Aug 2011 06:19:17 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -3.346 X-Spam-Level: X-Spam-Status: No, score=-3.346 tagged_above=-999 required=5 tests=[AWL=0.252, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id diNnMxP6hUWR for ; Tue, 16 Aug 2011 06:19:15 -0700 (PDT) Received: from mail-vx0-f172.google.com (mail-vx0-f172.google.com [209.85.220.172]) by ietfa.amsl.com (Postfix) with ESMTP id 84F4B21F8A66 for ; Tue, 16 Aug 2011 06:19:15 -0700 (PDT) Received: by vxi29 with SMTP id 29so5824916vxi.31 for ; Tue, 16 Aug 2011 06:20:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=iXyg6FNU8DWbR9MUd987m7cVw+d7rCIaIdwJKUSsVww=; b=VX3ndrkbZ7V8iOHm1ZbJ+jsVItzIkJ0R5mTejtAuw8SnY2TTak55UEQAuXSR70hXwa DOhl3WOFmb8pAYSY8Z4znOBuXSL0OXyUIYWTBB97VQiKKX4+Ra4+dO4SjEbvXZXHR4yh hKZr49r0lQkU+naJvXGMmVM2a5oHcGNgst2OQ= MIME-Version: 1.0 Received: by 10.52.183.37 with SMTP id ej5mr4716438vdc.423.1313500803637; Tue, 16 Aug 2011 06:20:03 -0700 (PDT) Received: by 10.52.115.103 with HTTP; Tue, 16 Aug 
2011 06:20:03 -0700 (PDT) In-Reply-To: References: <44C6B6B2D0CF424AA90B6055548D7A61AE9B48AD@CRPMBOXPRD01.polycom.com> <4E413021.3010509@alum.mit.edu> <44C6B6B2D0CF424AA90B6055548D7A61AEA65C62@CRPMBOXPRD01.polycom.com> <4E43D2BE.5010102@alum.mit.edu> Date: Tue, 16 Aug 2011 09:20:03 -0400 Message-ID: From: Stephen Botzko To: "Charles Eckel (eckelcu)" Content-Type: multipart/alternative; boundary=bcaec548a379d0220a04aa9f3cac Cc: clue@ietf.org Subject: Re: [clue] continuing "layout" discussion X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Aug 2011 13:19:17 -0000 --bcaec548a379d0220a04aa9f3cac Content-Type: text/plain; charset=ISO-8859-1 I guess by "stream" you mean an RTP stream? In which case by "mix" you perhaps mean that the left and right channels are placed in a single RTP stream? What do you mean when you describe some audio captures as "independent" - are you thinking they come from different rooms? I think in many respects audio distribution and spatial audio layout are at least as difficult as video layout, and have some unique issues. For one thing, you need to sort out how you should place the audio from human participants who are not on camera, and what should happen later on if some of those participants are shown. I suggest it is necessary to be very careful with terminology. In particular, I think it is important to distinguish composition from RTP transmission. 
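[Editor's note: the composition-vs-transmission distinction drawn in this thread can be sketched roughly as follows. This is an illustrative model only; every name in it (`mix_to_stereo`, `send_separately`, the dict keys) is invented for the example and is not part of the CLUE framework.]

```python
# Illustrative sketch only: the difference between composing a stereo
# pair at the source (one stream sent) and transmitting the channels
# independently with grouping metadata (two streams sent). All names
# here are invented for this example.

def mix_to_stereo(left, right):
    # Composition at the source: a single stream whose payload already
    # interleaves the two channels; the receiver just plays it out.
    return {"streams": 1, "format": "stereo",
            "payload": list(zip(left, right))}

def send_separately(left, right):
    # Independent transmission: two streams plus a grouping that labels
    # the channel roles, so the receiver does the spatial placement.
    return {"streams": 2,
            "group": {"format": "stereo",
                      "members": [{"role": "left", "payload": left},
                                  {"role": "right", "payload": right}]}}

left, right = [0.1, 0.2], [0.3, 0.4]
mixed = mix_to_stereo(left, right)
separate = send_separately(left, right)

# A receiver can render both identically, but what is sent differs.
assert mixed["streams"] == 1 and separate["streams"] == 2
```

The point of the sketch is that "stereo" labels the group semantics in both cases; only where the mixing happens (sender vs. receiver) changes.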
Regards, Stephen Botzko On Mon, Aug 15, 2011 at 5:45 PM, Charles Eckel (eckelcu) wrote: > > -----Original Message----- > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com] > > Sent: Monday, August 15, 2011 2:14 PM > > To: Charles Eckel (eckelcu) > > Cc: Paul Kyzivat; clue@ietf.org > > Subject: Re: [clue] continuing "layout" discussion > > > > Inline > > > > > > On Mon, Aug 15, 2011 at 4:21 PM, Charles Eckel (eckelcu) > wrote: > > > > > > Please see inline. > > > > > > > -----Original Message----- > > > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On > Behalf > > Of Paul Kyzivat > > > > > Sent: Thursday, August 11, 2011 6:02 AM > > > > > To: clue@ietf.org > > > Subject: Re: [clue] continuing "layout" discussion > > > > > > Inline > > > > > > On 8/10/11 5:49 PM, Duckworth, Mark wrote: > > > >> -----Original Message----- > > > >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] > On > > Behalf Of > > > >> Paul Kyzivat > > > >> Sent: Tuesday, August 09, 2011 9:03 AM > > > >> To: clue@ietf.org > > > >> Subject: Re: [clue] continuing "layout" discussion > > > > > > > >>> 4 - multi stream media format - what the streams mean with > respect > > to > > > >> each other, regardless of the actual content on the > streams. For > > > >> audio, examples are stereo, 5.1 surround, binaural, linear > array. > > > >> (linear array is described in the clue framework document). > > Perhaps 3D > > > >> video formats would also fit in this category. This > information is > > > >> needed in order to properly render the media into light and > sound > > for > > > >> human observers. I see this at the same level as > identifying a > > codec, > > > >> independent of the audio or video content carried on the > streams, > > and > > > >> independent of how any composition of sources is done. > > > > > > I do not think this is necessarily true. 
Taking audio as an > example, you > > could have two audio streams that are mixed to form a single > stereo > > audio stream, or you could have them as two independent (not > mixed) > > streams that are associated with each other by some grouping > mechanism. > > This group would be categorized as being stereo audio with one > audio > > stream being the left and the other the right. The codec used > for each > > could be different, though I agree they would typically be the > same. > > Consequently, I think of an attribute such as "stereo" as being > more of a > > grouping concept, where the group may consist of: > > - multiple independent streams, each with potentially its own > spatial > > orientation, codec, bandwidth, etc., > > - a single mixed stream > > > > > > > > [sb] I do not understand this distinction. What do you mean when you > say "two audio streams that are > > mixed to form a single stereo stream", and how is this different from > the left and right grouping? > > In one case they are mixed by the source of the stream into a single > stream, and in another they are sent as two separate streams by the > source. The end result once rendered at the receiver may be the same, > but what is sent is different. This example with audio is perhaps too > simple. If you think of it as video that is composed into a single video > stream vs. multiple video streams that are sent individually, the > difference may be more clear. > > Cheers, > Charles > > > > > > > > > > Cheers, > > Charles > > > > > > > >> I was with you all the way until 4. That one I don't > understand. > > > >> The name you chose for this has connotations for me, but > isn't > > fully in > > > >> harmony with the definitions you give: > > > > > > > > I'm happy to change the name if you have a suggestion > > > > > > Not yet. Maybe once the concepts are more clearly defined I > will have > > an > > > opinion. 
> > > > > > >> If we consider audio, it makes sense that multiple streams > can be > > > >> rendered as if they came from different physical locations > in the > > > >> receiving room. That can be done by the receiver if it gets > those > > > >> streams separately, and has information about their > intended > > > >> relationships. It can also be done by the sender or MCU and > passed > > on > > > >> to > > > >> the receiver as a single stream with stereo or binaural > coding. > > > > > > > > Yes. It could also be done by the sender using the "linear > array" > > audio channel format. Maybe it > > > is true that stereo or binaural audio channels would always be > sent as > > a single stream, but I was not > > > assuming that yet, at least not in general when you consider > other > > types too, such as linear array > > > channels. > > > > > > >> So it seems to me you have two concepts here, not one. One > has to > > do > > > >> with describing the relationships between streams, and the > other > > has to > > > >> do with the encoding of spacial relationships *within* a > single > > stream. > > > > > > > > Maybe that is a better way to describe it, if you assume > > multi-channel audio is always sent with all > > > the channels in the same RTP stream. Is that what you mean? > > > > > > > > I was considering the linear array format to be another type > of > > multi-channel audio, and I know > > > people want to be able to send each channel in a separate RTP > stream. > > So it doesn't quite fit with > > > how you separate the two concepts. In my view, identifying > the > > separate channels by what they mean is > > > the same concept for linear array and stereo. For example > "this > > channel is left, this channel is > > > center, this channel is right". To me, that is the same > concept for > > identifying channels whether or > > > not they are carried in the same RTP stream. 
> > > > > > > > Maybe we are thinking the same thing but getting confused by > > terminology about channels vs. streams. > > > > > > Maybe. Let me try to restate what I now think you are saying: > > > > > > The audio may consist of several "channels". > > > > > > Each channel may be sent over its own RTP stream, > > > or multiple channels may be multiplexed over an RTP stream. > > > > > > I guess much of this can also apply to video. > > > > > > When there are exactly two audio channels, they may be encoded > as > > > "stereo" or "binaural", which then affects how they should be > rendered > > > by the recipient. In these cases the primary info that is > required > > about > > > the individual channels is which is left and which is right. > (And > > which > > > perspective to use in interpretting left and right.) > > > > > > For other multi-channel cases more information is required > about the > > > role of each channel in order to properly render them. > > > > > > Thanks, > > > Paul > > > > > > > > > >> Or, are you asserting that stereo and binaural are simply > ways to > > > >> encode > > > >> multiple logical streams in one RTP stream, together with > their > > spacial > > > >> relationships? > > > > > > > > No, that is not what I'm trying to say. 
> > > > > > > > Mark > > > > _______________________________________________ > > > > clue mailing list > > > > clue@ietf.org > > > > https://www.ietf.org/mailman/listinfo/clue > > > > > > > > > > _______________________________________________ > > > clue mailing list > > > clue@ietf.org > > > https://www.ietf.org/mailman/listinfo/clue > > _______________________________________________ > > clue mailing list > > clue@ietf.org > > https://www.ietf.org/mailman/listinfo/clue > > > > > >
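[Editor's note: Paul's restatement earlier in the thread (audio consists of named channels; each channel may ride its own RTP stream, or several channels may be multiplexed onto one) can be modeled minimally like this. It is a hypothetical sketch; the class and field names are invented for illustration.]

```python
# Hypothetical model of Paul's restatement: the channel roles ("left",
# "center", "right") are one concept, and how those channels map onto
# RTP streams is a separate, independent choice. Names invented here.

from dataclasses import dataclass, field
from typing import List

@dataclass
class RtpStream:
    ssrc: int                                          # stream identifier
    channels: List[str] = field(default_factory=list)  # channel roles carried

def one_stream_per_channel(roles):
    # e.g. a linear array sent as one RTP stream per channel
    return [RtpStream(ssrc=1000 + i, channels=[r]) for i, r in enumerate(roles)]

def multiplexed(roles):
    # e.g. all channels multiplexed into a single RTP stream
    return [RtpStream(ssrc=1000, channels=list(roles))]

roles = ["left", "center", "right"]
per_channel = one_stream_per_channel(roles)
muxed = multiplexed(roles)

# Same channel semantics either way; only the transport mapping differs.
assert len(per_channel) == 3 and len(muxed) == 1
```

Under this model, identifying which channel is "left" and which is "right" works identically for stereo and linear-array formats, regardless of how the channels are packed into streams, which is the point Mark was making.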
--bcaec548a379d0220a04aa9f3cac--

From eckelcu@cisco.com Tue Aug 16 13:22:18 2011
From: "Charles Eckel (eckelcu)" <eckelcu@cisco.com>
To: "Stephen Botzko" <stephen.botzko@gmail.com>
Cc: clue@ietf.org
Date: Tue, 16 Aug 2011 13:23:01 -0700
Subject: Re: [clue] continuing "layout" discussion

I am distinguishing between:

(1) a single RTP stream that consists of a single stereo audio stream
(2) two RTP streams, one that contains left speaker audio and the other
that contains right speaker audio

(2) could also be transmitted in a single RTP stream using SSRC
multiplexing. Let me call that (2b).
(2) and (2b) are essentially the same; only the RTP mechanism employed
is different.
(1) is different from (2) and (2b) in that the audio signal encoded is
actually different.

Cheers,
Charles

> -----Original Message-----
> From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> Sent: Tuesday, August 16, 2011 6:20 AM
> To: Charles Eckel (eckelcu)
> Cc: Paul Kyzivat; clue@ietf.org
> Subject: Re: [clue] continuing "layout" discussion
>
> I guess by "stream" you mean an RTP stream? In which case by "mix" you
> perhaps mean that the left and right channels are placed in a single
> RTP stream? What do you mean when you describe some audio captures as
> "independent" - are you thinking they come from different rooms?
>
> I think in many respects audio distribution and spatial audio layout
> is at least as difficult as video layout, and they have some unique
> issues. For one thing, you need to sort out how you should place the
> audio from human participants who are not on camera, and what should
> happen later on if some of those participants are shown.
>
> I suggest it is necessary to be very careful with terminology. In
> particular, I think it is important to distinguish composition from
> RTP transmission.
>
> Regards,
> Stephen Botzko
>
> On Mon, Aug 15, 2011 at 5:45 PM, Charles Eckel (eckelcu) wrote:
>
> > -----Original Message-----
> > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > Sent: Monday, August 15, 2011 2:14 PM
> > To: Charles Eckel (eckelcu)
> > Cc: Paul Kyzivat; clue@ietf.org
> > Subject: Re: [clue] continuing "layout" discussion
> >
> > Inline
> >
> > On Mon, Aug 15, 2011 at 4:21 PM, Charles Eckel (eckelcu) wrote:
> >
> > > Please see inline.
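Charles's cases (1) and (2) above can be sketched concretely. This is a
minimal illustrative sketch, not CLUE-agreed behavior: it assumes
L16-style linear PCM, where a two-channel payload interleaves samples
left-first (the RFC 3551 convention), and the stream labels ("ssrc-1",
"ssrc-left", "ssrc-right") are hypothetical names for illustration only.

```python
# Toy packetization sketch (assumption: L16-style PCM; RFC 3551
# interleaves the channels of a stereo payload sample by sample).

def case1_single_stereo_stream(left, right):
    """Case (1): one RTP stream whose payload interleaves both channels."""
    payload = []
    for l, r in zip(left, right):
        payload += [l, r]                 # L R L R ... in one payload
    return {"ssrc-1": payload}            # a single stream

def case2_two_streams(left, right):
    """Case (2): two RTP streams, one mono payload per channel."""
    return {"ssrc-left": list(left), "ssrc-right": list(right)}

left, right = [10, 11, 12], [20, 21, 22]
print(case1_single_stereo_stream(left, right))
# {'ssrc-1': [10, 20, 11, 21, 12, 22]}
print(case2_two_streams(left, right))
# {'ssrc-left': [10, 11, 12], 'ssrc-right': [20, 21, 22]}
```

Case (2b), SSRC multiplexing, carries the same two mono payloads as (2)
but inside one RTP session, which is why Charles treats (2) and (2b) as
the same signal in different transport wrappers.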
> > > -----Original Message-----
> > > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf
> > > Of Paul Kyzivat
> > > Sent: Thursday, August 11, 2011 6:02 AM
> > > To: clue@ietf.org
> > > Subject: Re: [clue] continuing "layout" discussion
> > >
> > > Inline
> > >
> > > On 8/10/11 5:49 PM, Duckworth, Mark wrote:
> > > >> -----Original Message-----
> > > >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> > > >> Paul Kyzivat
> > > >> Sent: Tuesday, August 09, 2011 9:03 AM
> > > >> To: clue@ietf.org
> > > >> Subject: Re: [clue] continuing "layout" discussion
> > > >
> > > >>> 4 - multi stream media format - what the streams mean with respect to
> > > >> each other, regardless of the actual content on the streams. For
> > > >> audio, examples are stereo, 5.1 surround, binaural, linear array.
> > > >> (linear array is described in the clue framework document). Perhaps 3D
> > > >> video formats would also fit in this category. This information is
> > > >> needed in order to properly render the media into light and sound for
> > > >> human observers. I see this at the same level as identifying a codec,
> > > >> independent of the audio or video content carried on the streams, and
> > > >> independent of how any composition of sources is done.
> > >
> > > I do not think this is necessarily true. Taking audio as an example, you
> > > could have two audio streams that are mixed to form a single stereo
> > > audio stream, or you could have them as two independent (not mixed)
> > > streams that are associated with each other by some grouping mechanism.
> > > This group would be categorized as being stereo audio with one audio
> > > stream being the left and the other the right. The codec used for each
> > > could be different, though I agree they would typically be the same.
> > > Consequently, I think an attribute such as "stereo" is more of a
> > > grouping concept, where the group may consist of:
> > > - multiple independent streams, each with potentially its own spatial
> > > orientation, codec, bandwidth, etc.,
> > > - a single mixed stream
> >
> > [sb] I do not understand this distinction. What do you mean when you
> > say "two audio streams that are mixed to form a single stereo stream",
> > and how is this different from the left and right grouping?
>
> In one case they are mixed by the source of the stream into a single
> stream, and in another they are sent as two separate streams by the
> source. The end result once rendered at the receiver may be the same,
> but what is sent is different. This example with audio is perhaps too
> simple. If you think of it as video that is composed into a single video
> stream vs. multiple video streams that are sent individually, the
> difference may be more clear.
>
> Cheers,
> Charles
>
> > Cheers,
> > Charles
> >
> > > >> I was with you all the way until 4. That one I don't understand.
> > > >> The name you chose for this has connotations for me, but isn't fully
> > > >> in harmony with the definitions you give:
> > > >
> > > > I'm happy to change the name if you have a suggestion
> > >
> > > Not yet. Maybe once the concepts are more clearly defined I will have
> > > an opinion.
> > >
> > > >> If we consider audio, it makes sense that multiple streams can be
> > > >> rendered as if they came from different physical locations in the
> > > >> receiving room. That can be done by the receiver if it gets those
> > > >> streams separately, and has information about their intended
> > > >> relationships. It can also be done by the sender or MCU and passed on
> > > >> to the receiver as a single stream with stereo or binaural coding.
> > > >
> > > > Yes. It could also be done by the sender using the "linear array"
> > > > audio channel format. Maybe it is true that stereo or binaural audio
> > > > channels would always be sent as a single stream, but I was not
> > > > assuming that yet, at least not in general when you consider other
> > > > types too, such as linear array channels.
> > > >
> > > >> So it seems to me you have two concepts here, not one. One has to do
> > > >> with describing the relationships between streams, and the other has
> > > >> to do with the encoding of spatial relationships *within* a single
> > > >> stream.
> > > >
> > > > Maybe that is a better way to describe it, if you assume multi-channel
> > > > audio is always sent with all the channels in the same RTP stream. Is
> > > > that what you mean?
> > > >
> > > > I was considering the linear array format to be another type of
> > > > multi-channel audio, and I know people want to be able to send each
> > > > channel in a separate RTP stream. So it doesn't quite fit with how you
> > > > separate the two concepts. In my view, identifying the separate
> > > > channels by what they mean is the same concept for linear array and
> > > > stereo. For example "this channel is left, this channel is center,
> > > > this channel is right". To me, that is the same concept for
> > > > identifying channels whether or not they are carried in the same RTP
> > > > stream.
> > > >
> > > > Maybe we are thinking the same thing but getting confused by
> > > > terminology about channels vs. streams.
> > >
> > > Maybe. Let me try to restate what I now think you are saying:
> > >
> > > The audio may consist of several "channels".
> > >
> > > Each channel may be sent over its own RTP stream,
> > > or multiple channels may be multiplexed over an RTP stream.
> > >
> > > I guess much of this can also apply to video.
> > >
> > > When there are exactly two audio channels, they may be encoded as
> > > "stereo" or "binaural", which then affects how they should be rendered
> > > by the recipient. In these cases the primary info that is required about
> > > the individual channels is which is left and which is right. (And which
> > > perspective to use in interpreting left and right.)
> > >
> > > For other multi-channel cases more information is required about the
> > > role of each channel in order to properly render them.
> > >
> > > Thanks,
> > > Paul
> > >
> > > >> Or, are you asserting that stereo and binaural are simply ways to
> > > >> encode multiple logical streams in one RTP stream, together with
> > > >> their spatial relationships?
> > > >
> > > > No, that is not what I'm trying to say.
> > > >
> > > > Mark
> > > > _______________________________________________
> > > > clue mailing list
> > > > clue@ietf.org
> > > > https://www.ietf.org/mailman/listinfo/clue
> > >
> > > _______________________________________________
> > > clue mailing list
> > > clue@ietf.org
> > > https://www.ietf.org/mailman/listinfo/clue
> > _______________________________________________
> > clue mailing list
> > clue@ietf.org
> > https://www.ietf.org/mailman/listinfo/clue

From stephen.botzko@gmail.com Tue Aug 16 14:13:28 2011
From: Stephen Botzko <stephen.botzko@gmail.com>
To: "Charles Eckel (eckelcu)" <eckelcu@cisco.com>
Cc: clue@ietf.org
Date: Tue, 16 Aug 2011 17:14:01 -0400
Subject: Re: [clue] continuing "layout" discussion

Well, the audio in (1) and (2b) is certainly packetized differently. But
not compressed differently (unless you are assuming that the signal in
(1) is jointly encoded stereo - which it could be, I guess, but it would
be unusual for telepresence systems).
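Stephen's argument that (1), (2), and (2b) differ only in transport
packaging can be sketched as follows. This is an illustrative Python
sketch only: the dictionary keys ("ssrc-1", "left", "right") and the
helper name are hypothetical labels, not CLUE or RTP syntax, and the
single-stream case assumes sample-interleaved PCM rather than jointly
encoded stereo.

```python
def recover_channels(streams):
    """streams maps a stream id to its payload (a list of samples).
    Case (1): one stream with interleaved samples.
    Cases (2)/(2b): separate 'left'/'right' streams, whether carried
    in two RTP sessions or as two SSRCs in one session."""
    if len(streams) == 1:
        interleaved = next(iter(streams.values()))
        return interleaved[0::2], interleaved[1::2]   # de-interleave L, R
    return streams["left"], streams["right"]

# All three packagings hand the renderer the same (left, right) pair:
case1  = {"ssrc-1": [10, 20, 11, 21, 12, 22]}
case2  = {"left": [10, 11, 12], "right": [20, 21, 22]}  # two RTP sessions
case2b = {"left": [10, 11, 12], "right": [20, 21, 22]}  # two SSRCs, one session
assert recover_channels(case1) == recover_channels(case2) == recover_channels(case2b)
```

This is the sense in which "once the streams are received, they are
rendered in precisely the same way": the rendering input is identical.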
Also, the audio in (1) is not mixed, no matter how it is encoded.

In any event, I believe that the difference between (1) and (2) and (2b)
is really a transport question that has nothing to do with layout. The
same information is needed to enable proper rendering, and once the
streams are received, they are rendered in precisely the same way.

Regards,
Stephen Botzko

On Tue, Aug 16, 2011 at 4:23 PM, Charles Eckel (eckelcu) wrote:
> I am distinguishing between:
>
> (1) a single RTP stream that consists of a single stereo audio stream
> (2) two RTP streams, one that contains left speaker audio and the other
> that contains right speaker audio
>
> (2) could also be transmitted in a single RTP stream using SSRC
> multiplexing. Let me call that (2b).
> (2) and (2b) are essentially the same; only the RTP mechanism employed
> is different.
> (1) is different from (2) and (2b) in that the audio signal encoded is
> actually different.
>
> Cheers,
> Charles
>
> > -----Original Message-----
> > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > Sent: Tuesday, August 16, 2011 6:20 AM
> > To: Charles Eckel (eckelcu)
> > Cc: Paul Kyzivat; clue@ietf.org
> > Subject: Re: [clue] continuing "layout" discussion
> >
> > I guess by "stream" you mean an RTP stream? In which case by "mix"
> > you perhaps mean that the left and right channels are placed in a
> > single RTP stream? What do you mean when you describe some audio
> > captures as "independent" - are you thinking they come from
> > different rooms?
> >
> > I think in many respects audio distribution and spatial audio layout
> > is at least as difficult as video layout, and they have some unique
> > issues. For one thing, you need to sort out how you should place the
> > audio from human participants who are not on camera, and what should
> > happen later on if some of those participants are shown.
> >
> > I suggest it is necessary to be very careful with terminology. In
> > particular, I think it is important to distinguish composition from
> > RTP transmission.
> >
> > Regards,
> > Stephen Botzko
> >
> > On Mon, Aug 15, 2011 at 5:45 PM, Charles Eckel (eckelcu) wrote:
> >
> > > -----Original Message-----
> > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > > Sent: Monday, August 15, 2011 2:14 PM
> > > To: Charles Eckel (eckelcu)
> > > Cc: Paul Kyzivat; clue@ietf.org
> > > Subject: Re: [clue] continuing "layout" discussion
> > >
> > > Inline
> > >
> > > On Mon, Aug 15, 2011 at 4:21 PM, Charles Eckel (eckelcu) wrote:
> > >
> > > > Please see inline.
> > > >
> > > > > -----Original Message-----
> > > > > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On
> > > > > Behalf Of Paul Kyzivat
> > > > > Sent: Thursday, August 11, 2011 6:02 AM
> > > > > To: clue@ietf.org
> > > > > Subject: Re: [clue] continuing "layout" discussion
> > > > >
> > > > > Inline
> > > > >
> > > > > On 8/10/11 5:49 PM, Duckworth, Mark wrote:
> > > > > >> -----Original Message-----
> > > > > >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org]
> > > > > >> On Behalf Of Paul Kyzivat
> > > > > >> Sent: Tuesday, August 09, 2011 9:03 AM
> > > > > >> To: clue@ietf.org
> > > > > >> Subject: Re: [clue] continuing "layout" discussion
> > > > > >
> > > > > >>> 4 - multi stream media format - what the streams mean with
> > > > > >> respect to each other, regardless of the actual content on
> > > > > >> the streams. For audio, examples are stereo, 5.1 surround,
> > > > > >> binaural, linear array. (linear array is described in the
> > > > > >> clue framework document). Perhaps 3D video formats would
> > > > > >> also fit in this category. This information is needed in
> > > > > >> order to properly render the media into light and sound for
> > > > > >> human observers. I see this at the same level as
> > > > > >> identifying a codec, independent of the audio or video
> > > > > >> content carried on the streams, and independent of how any
> > > > > >> composition of sources is done.
> > > > >
> > > > I do not think this is necessarily true. Taking audio as an
> > > > example, you could have two audio streams that are mixed to form
> > > > a single stereo audio stream, or you could have them as two
> > > > independent (not mixed) streams that are associated with each
> > > > other by some grouping mechanism. This group would be
> > > > categorized as being stereo audio with one audio stream being
> > > > the left and the other the right. The codec used for each could
> > > > be different, though I agree they would typically be the same.
> > > > Consequently, I think an attribute such as "stereo" is more of a
> > > > grouping concept, where the group may consist of:
> > > > - multiple independent streams, each with potentially its own
> > > > spatial orientation, codec, bandwidth, etc.,
> > > > - a single mixed stream
> > >
> > > [sb] I do not understand this distinction. What do you mean when
> > > you say "two audio streams that are mixed to form a single stereo
> > > stream", and how is this different from the left and right
> > > grouping?
> >
> > In one case they are mixed by the source of the stream into a single
> > stream, and in another they are sent as two separate streams by the
> > source. The end result once rendered at the receiver may be the
> > same, but what is sent is different. This example with audio is
> > perhaps too simple. If you think of it as video that is composed
> > into a single video stream vs. multiple video streams that are sent
> > individually, the difference may be more clear.
> >
> > Cheers,
> > Charles
> >
> > > Cheers,
> > > Charles
> > >
> > > > >> I was with you all the way until 4. That one I don't
> > > > >> understand. The name you chose for this has connotations for
> > > > >> me, but isn't fully in harmony with the definitions you give:
> > > > >
> > > > > I'm happy to change the name if you have a suggestion
> > > >
> > > > Not yet. Maybe once the concepts are more clearly defined I will
> > > > have an opinion.
> > > >
> > > > > >> If we consider audio, it makes sense that multiple streams
> > > > > >> can be rendered as if they came from different physical
> > > > > >> locations in the receiving room. That can be done by the
> > > > > >> receiver if it gets those streams separately, and has
> > > > > >> information about their intended relationships. It can also
> > > > > >> be done by the sender or MCU and passed on to the receiver
> > > > > >> as a single stream with stereo or binaural coding.
> > > > > >
> > > > > Yes. It could also be done by the sender using the "linear
> > > > > array" audio channel format. Maybe it is true that stereo or
> > > > > binaural audio channels would always be sent as a single
> > > > > stream, but I was not assuming that yet, at least not in
> > > > > general when you consider other types too, such as linear
> > > > > array channels.
> > > > >
> > > > > >> So it seems to me you have two concepts here, not one. One
> > > > > >> has to do with describing the relationships between
> > > > > >> streams, and the other has to do with the encoding of
> > > > > >> spatial relationships *within* a single stream.
> > > > > >
> > > > > Maybe that is a better way to describe it, if you assume
> > > > > multi-channel audio is always sent with all the channels in
> > > > > the same RTP stream. Is that what you mean?
> > > > >
> > > > > I was considering the linear array format to be another type
> > > > > of multi-channel audio, and I know people want to be able to
> > > > > send each channel in a separate RTP stream. So it doesn't
> > > > > quite fit with how you separate the two concepts. In my view,
> > > > > identifying the separate channels by what they mean is the
> > > > > same concept for linear array and stereo. For example "this
> > > > > channel is left, this channel is center, this channel is
> > > > > right". To me, that is the same concept for identifying
> > > > > channels whether or not they are carried in the same RTP
> > > > > stream.
> > > > >
> > > > > Maybe we are thinking the same thing but getting confused by
> > > > > terminology about channels vs. streams.
> > > >
> > > > Maybe. Let me try to restate what I now think you are saying:
> > > >
> > > > The audio may consist of several "channels".
> > > >
> > > > Each channel may be sent over its own RTP stream,
> > > > or multiple channels may be multiplexed over an RTP stream.
> > > >
> > > > I guess much of this can also apply to video.
> > > >
> > > > When there are exactly two audio channels, they may be encoded
> > > > as "stereo" or "binaural", which then affects how they should be
> > > > rendered by the recipient. In these cases the primary info that
> > > > is required about the individual channels is which is left and
> > > > which is right. (And which perspective to use in interpreting
> > > > left and right.)
> > > >
> > > > For other multi-channel cases more information is required about
> > > > the role of each channel in order to properly render them.
> > > >
> > > > Thanks,
> > > > Paul
> > > >
> > > > > >> Or, are you asserting that stereo and binaural are simply
> > > > > >> ways to encode multiple logical streams in one RTP stream,
> > > > > >> together with their spatial relationships?
> > > > > >
> > > > > No, that is not what I'm trying to say.
> > > > >
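Mark's point just above, that identifying what each channel *means* is
one concept regardless of RTP carriage, can be sketched as a tiny data
model. The role names and field names here are illustrative assumptions,
not the CLUE framework's actual syntax.

```python
# Illustrative sketch: channel roles are described the same way for
# stereo and linear array; which RTP stream carries each channel is a
# separate, orthogonal fact. Field names are hypothetical.

stereo = [
    {"role": "left",   "rtp_stream": "A", "channel": 0},
    {"role": "right",  "rtp_stream": "A", "channel": 1},  # same stream
]
linear_array = [
    {"role": "left",   "rtp_stream": "A", "channel": 0},
    {"role": "center", "rtp_stream": "B", "channel": 0},  # separate streams
    {"role": "right",  "rtp_stream": "C", "channel": 0},
]

def roles(channels):
    """A renderer keys off roles; carriage (stream, channel index) can vary."""
    return [c["role"] for c in channels]

print(roles(stereo))        # ['left', 'right']
print(roles(linear_array))  # ['left', 'center', 'right']
```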
> > > > > Mark
> > > > > _______________________________________________
> > > > > clue mailing list
> > > > > clue@ietf.org
> > > > > https://www.ietf.org/mailman/listinfo/clue
> > > >
> > > > _______________________________________________
> > > > clue mailing list
> > > > clue@ietf.org
> > > > https://www.ietf.org/mailman/listinfo/clue
> > > _______________________________________________
> > > clue mailing list
> > > clue@ietf.org
> > > https://www.ietf.org/mailman/listinfo/clue
> =A0 =A0 =A0 simple. If you think of it as video that is composed into = a
single video
> =A0 =A0 =A0 stream vs. multiple via streams that are sent individually= , the
> =A0 =A0 =A0 difference may be more clear.
>
> =A0 =A0 =A0 Cheers,
> =A0 =A0 =A0 Charles
>
>
> =A0 =A0 =A0 >
> =A0 =A0 =A0 >
> =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 Cheers,
> =A0 =A0 =A0 > =A0 =A0 =A0 Charles
> =A0 =A0 =A0 >
> =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> I was with you all the way = until 4. That one I
don't
> =A0 =A0 =A0 understand.
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> The name you chose for this= has connotations for
me, but
> =A0 =A0 =A0 isn't
> =A0 =A0 =A0 > =A0 =A0 =A0 fully in
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> harmony with the definition= s you give:
> =A0 =A0 =A0 > =A0 =A0 =A0 > >
> =A0 =A0 =A0 > =A0 =A0 =A0 > > I'm happy to change the nam= e if you have a
suggestion
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > Not yet. Maybe once the concepts are= more clearly
defined I
> =A0 =A0 =A0 will have
> =A0 =A0 =A0 > =A0 =A0 =A0 an
> =A0 =A0 =A0 > =A0 =A0 =A0 > opinion.
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> If we consider audio, it ma= kes sense that multiple
streams
> =A0 =A0 =A0 can be
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> rendered as if they came fr= om different physical
locations
> =A0 =A0 =A0 in the
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> receiving room. That can be= done by the receiver if
it gets
> =A0 =A0 =A0 those
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> streams separately, and has= information about their
> =A0 =A0 =A0 intended
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> relationships. It can also = be done by the sender or
MCU and
> =A0 =A0 =A0 passed
> =A0 =A0 =A0 > =A0 =A0 =A0 on
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> to
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> the receiver as a single st= ream with stereo or
binaural
> =A0 =A0 =A0 coding.
> =A0 =A0 =A0 > =A0 =A0 =A0 > >
> =A0 =A0 =A0 > =A0 =A0 =A0 > > Yes. =A0It could also be done b= y the sender using the
"linear
> =A0 =A0 =A0 array"
> =A0 =A0 =A0 > =A0 =A0 =A0 audio channel format. =A0Maybe it
> =A0 =A0 =A0 > =A0 =A0 =A0 > is true that stereo or binaural audi= o channels would
always be
> =A0 =A0 =A0 sent as
> =A0 =A0 =A0 > =A0 =A0 =A0 a single stream, but I was not
> =A0 =A0 =A0 > =A0 =A0 =A0 > assuming that yet, at least not in g= eneral when you
consider
> =A0 =A0 =A0 other
> =A0 =A0 =A0 > =A0 =A0 =A0 types too, such as linear array
> =A0 =A0 =A0 > =A0 =A0 =A0 > channels.
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> So it seems to me you have = two concepts here, not
one. One
> =A0 =A0 =A0 has to
> =A0 =A0 =A0 > =A0 =A0 =A0 do
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> with describing the relatio= nships between streams,
and the
> =A0 =A0 =A0 other
> =A0 =A0 =A0 > =A0 =A0 =A0 has to
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> do with the encoding of spa= cial relationships
*within* a
> =A0 =A0 =A0 single
> =A0 =A0 =A0 > =A0 =A0 =A0 stream.
> =A0 =A0 =A0 > =A0 =A0 =A0 > >
> =A0 =A0 =A0 > =A0 =A0 =A0 > > Maybe that is a better way to d= escribe it, if you
assume
> =A0 =A0 =A0 > =A0 =A0 =A0 multi-channel audio is always sent with a= ll
> =A0 =A0 =A0 > =A0 =A0 =A0 > the channels in the same RTP stream.= =A0Is that what you
mean?
> =A0 =A0 =A0 > =A0 =A0 =A0 > >
> =A0 =A0 =A0 > =A0 =A0 =A0 > > I was considering the linear ar= ray format to be
another type
> =A0 =A0 =A0 of
> =A0 =A0 =A0 > =A0 =A0 =A0 multi-channel audio, and I know
> =A0 =A0 =A0 > =A0 =A0 =A0 > people want to be able to send each = channel in a
separate RTP
> =A0 =A0 =A0 stream.
> =A0 =A0 =A0 > =A0 =A0 =A0 So it doesn't quite fit with
> =A0 =A0 =A0 > =A0 =A0 =A0 > how you separate the two concepts. = =A0In my view,
identifying
> =A0 =A0 =A0 the
> =A0 =A0 =A0 > =A0 =A0 =A0 separate channels by what they mean is > =A0 =A0 =A0 > =A0 =A0 =A0 > the same concept for linear array an= d stereo. =A0For
example
> =A0 =A0 =A0 "this
> =A0 =A0 =A0 > =A0 =A0 =A0 channel is left, this channel is
> =A0 =A0 =A0 > =A0 =A0 =A0 > center, this channel is right".= =A0To me, that is the
same
> =A0 =A0 =A0 concept for
> =A0 =A0 =A0 > =A0 =A0 =A0 identifying channels whether or
> =A0 =A0 =A0 > =A0 =A0 =A0 > not they are carried in the same RTP= stream.
> =A0 =A0 =A0 > =A0 =A0 =A0 > >
> =A0 =A0 =A0 > =A0 =A0 =A0 > > Maybe we are thinking the same = thing but getting
confused by
> =A0 =A0 =A0 > =A0 =A0 =A0 terminology about channels vs. streams. > =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > Maybe. Let me try to restate what I = now think you are
saying:
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > The audio may consist of several &qu= ot;channels".
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > Each channel may be sent over its ow= n RTP stream,
> =A0 =A0 =A0 > =A0 =A0 =A0 > or multiple channels may be multiple= xed over an RTP
stream.
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > I guess much of this can also apply = to video.
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > When there are exactly two audio cha= nnels, they may be
encoded
> =A0 =A0 =A0 as
> =A0 =A0 =A0 > =A0 =A0 =A0 > "stereo" or "binaural= ", which then affects how they
should be
> =A0 =A0 =A0 rendered
> =A0 =A0 =A0 > =A0 =A0 =A0 > by the recipient. In these cases the= primary info that
is
> =A0 =A0 =A0 required
> =A0 =A0 =A0 > =A0 =A0 =A0 about
> =A0 =A0 =A0 > =A0 =A0 =A0 > the individual channels is which is = left and which is
right.
> =A0 =A0 =A0 (And
> =A0 =A0 =A0 > =A0 =A0 =A0 which
> =A0 =A0 =A0 > =A0 =A0 =A0 > perspective to use in interpretting = left and right.)
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > For other multi-channel cases more i= nformation is
required
> =A0 =A0 =A0 about the
> =A0 =A0 =A0 > =A0 =A0 =A0 > role of each channel in order to pro= perly render them.
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Thanks,
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Paul
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> Or, are you asserting that = stereo and binaural are
simply
> =A0 =A0 =A0 ways to
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> encode
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> multiple logical streams in= one RTP stream,
together with
> =A0 =A0 =A0 their
> =A0 =A0 =A0 > =A0 =A0 =A0 spacial
> =A0 =A0 =A0 > =A0 =A0 =A0 > >> relationships?
> =A0 =A0 =A0 > =A0 =A0 =A0 > >
> =A0 =A0 =A0 > =A0 =A0 =A0 > > No, that is not what I'm tr= ying to say.
> =A0 =A0 =A0 > =A0 =A0 =A0 > >
> =A0 =A0 =A0 > =A0 =A0 =A0 > > Mark
> =A0 =A0 =A0 > =A0 =A0 =A0 > > _______________________________= ________________
> =A0 =A0 =A0 > =A0 =A0 =A0 > > clue mailing list
> =A0 =A0 =A0 > =A0 =A0 =A0 > > clue@ietf.org
> =A0 =A0 =A0 > =A0 =A0 =A0 > > https://www.ietf.org/mailman/list= info/clue
> =A0 =A0 =A0 > =A0 =A0 =A0 > >
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > ____________________________________= ___________
> =A0 =A0 =A0 > =A0 =A0 =A0 > clue mailing list
> =A0 =A0 =A0 > =A0 =A0 =A0 > clu= e@ietf.org
> =A0 =A0 =A0 > =A0 =A0 =A0 > https://www.ietf.org/mailman/listinfo/= clue
> =A0 =A0 =A0 > =A0 =A0 =A0 _________________________________________= ______
> =A0 =A0 =A0 > =A0 =A0 =A0 clue mailing list
> =A0 =A0 =A0 > =A0 =A0 =A0 clue@iet= f.org
> =A0 =A0 =A0 > =A0 =A0 =A0 https://www.ietf.org/mailman/listinfo/clue<= /a>
> =A0 =A0 =A0 >
> =A0 =A0 =A0 >
>
>
>
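The mixed-versus-grouped distinction debated above can be sketched at the sample level. This is an illustrative sketch only (the helper names below are invented for the example, not anything defined by the CLUE framework): the same left/right capture either leaves the source as one interleaved stereo payload, or as two per-channel payloads that some grouping mechanism outside the stream must relate. What a receiver renders can be identical either way; what is sent differs.

```python
# Hypothetical sketch of the two ways to send a left/right capture.
left = [10, 11, 12, 13]    # PCM samples for the left channel
right = [20, 21, 22, 23]   # PCM samples for the right channel

def single_stereo_payload(l, r):
    """Case 'mixed at the source': one stream, channels interleaved
    L,R,L,R,... (the two-channel sample order RFC 3551 describes)."""
    out = []
    for ls, rs in zip(l, r):
        out.extend([ls, rs])
    return out

def per_channel_payloads(l, r):
    """Case 'grouped': each channel carried separately; something
    outside the payloads must say which is left and which is right."""
    return {"left": list(l), "right": list(r)}

print(single_stereo_payload(left, right))  # [10, 20, 11, 21, 12, 22, 13, 23]
print(per_channel_payloads(left, right))
```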


--bcaec5486194de26c404aaa5dbd8--

From eckelcu@cisco.com Tue Aug 16 14:39:26 2011
Date: Tue, 16 Aug 2011 14:40:11 -0700
From: "Charles Eckel (eckelcu)" <eckelcu@cisco.com>
To: "Stephen Botzko" <stephen.botzko@gmail.com>
Cc: clue@ietf.org
Subject: Re: [clue] continuing "layout" discussion

Agreed. The difference I am trying to point out is that in (1), the
information you need to describe the audio stream for appropriate
rendering is already handled quite well by existing SIP/SDP/RTP and most
implementations, whereas you need CLUE for (2) and (2b).

Cheers,
Charles

> -----Original Message-----
> From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> Sent: Tuesday, August 16, 2011 2:14 PM
> To: Charles Eckel (eckelcu)
> Cc: Paul Kyzivat; clue@ietf.org
> Subject: Re: [clue] continuing "layout" discussion
>
> Well, the audio in (1) and (2b) is certainly packetized differently, but
> not compressed differently (unless you are assuming that the signal in
> (1) is jointly encoded stereo - which it could be, I guess, but it would
> be unusual for telepresence systems). Also, the audio in (1) is not
> mixed, no matter how it is encoded.
>
> In any event, I believe that the difference between (1) and (2) and (2b)
> is really a transport question that has nothing to do with layout. The
> same information is needed to enable proper rendering, and once the
> streams are received, they are rendered in precisely the same way.
>
> Regards,
> Stephen Botzko
>
>
> On Tue, Aug 16, 2011 at 4:23 PM, Charles Eckel (eckelcu) wrote:
>
> I am distinguishing between:
>
> (1) a single RTP stream that consists of a single stereo audio stream
> (2) two RTP streams, one that contains left speaker audio and the other
> that contains right speaker audio
>
> (2) could also be transmitted in a single RTP stream using SSRC
> multiplexing. Let me call that (2b).
> (2) and (2b) are essentially the same; just the RTP mechanism employed
> is different.
> (1) is different from (2) and (2b) in that the audio signal encoded is
> actually different.
>
> Cheers,
> Charles
>
>
> > -----Original Message-----
> > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > Sent: Tuesday, August 16, 2011 6:20 AM
> > To: Charles Eckel (eckelcu)
> > Cc: Paul Kyzivat; clue@ietf.org
> > Subject: Re: [clue] continuing "layout" discussion
> >
> > I guess by "stream" you mean RTP stream? In which case by "mix" you
> > perhaps mean that the left and right channels are placed in a single
> > RTP stream??? What do you mean when you describe some audio captures
> > as "independent" - are you thinking they come from different rooms???
> >
> > I think in many respects audio distribution and spatial audio layout
> > is at least as difficult as video layout, and has some unique issues.
> > For one thing, you need to sort out how you should place the audio
> > from human participants who are not on camera, and what should happen
> > later on if some of those participants are shown.
> >
> > I suggest it is necessary to be very careful with terminology.
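Charles's cases (1) and (2) can be sketched in SDP (an illustrative fragment only; the port numbers and dynamic payload types are arbitrary). Case (1), one RTP stream carrying both channels, leans on the channel count in rtpmap (RFC 3551 section 4.1 defines the two-channel order as left, right):

```
m=audio 49170 RTP/AVP 96
a=rtpmap:96 L16/48000/2
```

Case (2), two RTP streams of one channel each; nothing in SDP/RTP itself says which is left and which is right, which is the gap a grouping mechanism such as CLUE would fill:

```
m=audio 49172 RTP/AVP 97
a=rtpmap:97 L16/48000
m=audio 49174 RTP/AVP 97
a=rtpmap:97 L16/48000
```

Case (2b) would instead carry the two single-channel streams as separate SSRCs within one RTP session rather than on separate ports.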
> > In particular, I think it is important to distinguish composition
> > from RTP transmission.
> >
> > Regards,
> > Stephen Botzko
> >
> >
> > On Mon, Aug 15, 2011 at 5:45 PM, Charles Eckel (eckelcu) wrote:
> >
> > > -----Original Message-----
> > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > > Sent: Monday, August 15, 2011 2:14 PM
> > > To: Charles Eckel (eckelcu)
> > > Cc: Paul Kyzivat; clue@ietf.org
> > > Subject: Re: [clue] continuing "layout" discussion
> > >
> > > Inline
> > >
> > > On Mon, Aug 15, 2011 at 4:21 PM, Charles Eckel (eckelcu) wrote:
> > >
> > > Please see inline.
> > >
> > > > -----Original Message-----
> > > > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf
> > > > Of Paul Kyzivat
> > > > Sent: Thursday, August 11, 2011 6:02 AM
> > > > To: clue@ietf.org
> > > > Subject: Re: [clue] continuing "layout" discussion
> > > >
> > > > Inline
> > > >
> > > > On 8/10/11 5:49 PM, Duckworth, Mark wrote:
> > > > >> -----Original Message-----
> > > > >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On
> > > > >> Behalf Of Paul Kyzivat
> > > > >> Sent: Tuesday, August 09, 2011 9:03 AM
> > > > >> To: clue@ietf.org
> > > > >> Subject: Re: [clue] continuing "layout" discussion
> > > > >
> > > > >>> 4 - multi stream media format - what the streams mean with
> > > > >>> respect to
> > > > >> each other, regardless of the actual content on the streams. For
> > > > >> audio, examples are stereo, 5.1 surround, binaural, linear array.
> > > > >> (linear array is described in the clue framework document).
> > > > >> Perhaps 3D
> > > > >> video formats would also fit in this category. This information is
> > > > >> needed in order to properly render the media into light and sound
> > > > >> for
> > > > >> human observers. I see this at the same level as identifying a
> > > > >> codec,
> > > > >> independent of the audio or video content carried on the streams,
> > > > >> and
> > > > >> independent of how any composition of sources is done.
> > >
> > > I do not think this is necessarily true. Taking audio as an example,
> > > you could have two audio streams that are mixed to form a single
> > > stereo audio stream, or you could have them as two independent (not
> > > mixed) streams that are associated with each other by some grouping
> > > mechanism. This group would be categorized as being stereo audio with
> > > one audio stream being the left and the other the right. The codec
> > > used for each could be different, though I agree they would typically
> > > be the same. Consequently, I think of an attribute such as "stereo"
> > > as being more of a grouping concept, where the group may consist of:
> > > - multiple independent streams, each with potentially its own spatial
> > > orientation, codec, bandwidth, etc.,
> > > - a single mixed stream
> > >
> > >
> > > [sb] I do not understand this distinction. What do you mean when you
> > > say "two audio streams that are mixed to form a single stereo
> > > stream", and how is this different from the left and right grouping?
> >
> > In one case they are mixed by the source of the stream into a single
> > stream, and in another they are sent as two separate streams by the
> > source. The end result once rendered at the receiver may be the same,
> > but what is sent is different. This example with audio is perhaps too
> > simple. If you think of it as video that is composed into a single
> > video stream vs. multiple video streams that are sent individually,
> > the difference may be more clear.
> >
> > Cheers,
> > Charles
> >
> > >
> > > Cheers,
> > > Charles
> > >
> > > > >> I was with you all the way until 4. That one I don't understand.
> > > > >> The name you chose for this has connotations for me, but isn't
> > > > >> fully in
> > > > >> harmony with the definitions you give:
> > > > >
> > > > > I'm happy to change the name if you have a suggestion
> > > >
> > > > Not yet. Maybe once the concepts are more clearly defined I will
> > > > have an opinion.
> > > >
> > > > >> If we consider audio, it makes sense that multiple streams can be
> > > > >> rendered as if they came from different physical locations in the
> > > > >> receiving room. That can be done by the receiver if it gets those
> > > > >> streams separately, and has information about their intended
> > > > >> relationships. It can also be done by the sender or MCU and passed
> > > > >> on to
> > > > >> the receiver as a single stream with stereo or binaural coding.
> > > > >
> > > > > Yes. It could also be done by the sender using the "linear array"
> > > > > audio channel format. Maybe it is true that stereo or binaural
> > > > > audio channels would always be sent as a single stream, but I was
> > > > > not assuming that yet, at least not in general when you consider
> > > > > other types too, such as linear array channels.
> > > > >
> > > > >> So it seems to me you have two concepts here, not one. One has to
> > > > >> do
> > > > >> with describing the relationships between streams, and the other
> > > > >> has to
> > > > >> do with the encoding of spatial relationships *within* a single
> > > > >> stream.
> > > > >
> > > > > Maybe that is a better way to describe it, if you assume
> > > > > multi-channel audio is always sent with all the channels in the
> > > > > same RTP stream. Is that what you mean?
> > > > >
> > > > > I was considering the linear array format to be another type of
> > > > > multi-channel audio, and I know people want to be able to send
> > > > > each channel in a separate RTP stream. So it doesn't quite fit
> > > > > with how you separate the two concepts. In my view, identifying
> > > > > the separate channels by what they mean is the same concept for
> > > > > linear array and stereo. For example "this channel is left, this
> > > > > channel is center, this channel is right". To me, that is the
> > > > > same concept for identifying channels whether or not they are
> > > > > carried in the same RTP stream.
> > > > >
> > > > > Maybe we are thinking the same thing but getting confused by
> > > > > terminology about channels vs. streams.
> > > >
> > > > Maybe. Let me try to restate what I now think you are saying:
> > > >
> > > > The audio may consist of several "channels".
> > > >
> > > > Each channel may be sent over its own RTP stream,
> > > > or multiple channels may be multiplexed over an RTP stream.
> > > >
> > > > I guess much of this can also apply to video.
> > > >
> > > > When there are exactly two audio channels, they may be encoded as
> > > > "stereo" or "binaural", which then affects how they should be
> > > > rendered by the recipient. In these cases the primary info that is
> > > > required about the individual channels is which is left and which is
> > > > right. (And which perspective to use in interpreting left and right.)
> > > >
> > > > For other multi-channel cases more information is required about the
> > > > role of each channel in order to properly render them.
> > > >
> > > > Thanks,
> > > > Paul
> > > >
> > > > >> Or, are you asserting that stereo and binaural are simply ways to
> > > > >> encode
> > > > >> multiple logical streams in one RTP stream, together with their
> > > > >> spatial relationships?
> > > > >
> > > > > No, that is not what I'm trying to say.
> > > > >
> > > > > Mark
> > > > > _______________________________________________
> > > > > clue mailing list
> > > > > clue@ietf.org
> > > > > https://www.ietf.org/mailman/listinfo/clue
> > > >
> > > > _______________________________________________
> > > > clue mailing list
> > > > clue@ietf.org
> > > > https://www.ietf.org/mailman/listinfo/clue
> > > _______________________________________________
> > > clue mailing list
> > > clue@ietf.org
> > > https://www.ietf.org/mailman/listinfo/clue

From Even.roni@huawei.com Tue Aug 16 16:34:48 2011
Date: Wed, 17 Aug 2011 02:34:45 +0300
From: Roni Even <Even.roni@huawei.com>
To: "'Charles Eckel (eckelcu)'" <eckelcu@cisco.com>, 'Stephen Botzko' <stephen.botzko@gmail.com>
Cc: clue@ietf.org
Subject: Re: [clue] continuing "layout" discussion

Hi guys,
In case 1, according to RFC 3551 (section 4.1), 2 channels in the rtpmap
means left and right channels, described as stereo. Are you saying that
for the 2 and 2b cases you also assume stereo capture, or can it be any
other way of creating the two audio streams from the same room (binaural
recording (not common), or some other arrangement of the microphones)?

But this addresses the capture side. I think that Christer talked about
the rendering side and not only the capture side.

Roni

> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> Charles Eckel (eckelcu)
> Sent: Wednesday, August 17, 2011 12:40 AM
> To: Stephen Botzko
> Cc: clue@ietf.org
> Subject: Re: [clue] continuing "layout" discussion
>
> Agreed. The difference I am trying to point out is that in (1), the
> information you need to describe the audio stream for appropriate
> rendering is already handled quite well by existing SIP/SDP/RTP and most
> implementations, whereas you need CLUE for (2) and (2b).
>
> Cheers,
> Charles
>
> > -----Original Message-----
> > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > Sent: Tuesday, August 16, 2011 2:14 PM
> > To: Charles Eckel (eckelcu)
> > Cc: Paul Kyzivat; clue@ietf.org
> > Subject: Re: [clue] continuing "layout" discussion
> >
> > Well, the audio in (1) and (2b) is certainly packetized differently,
> > but not compressed differently (unless you are assuming that the
> > signal in (1) is jointly encoded stereo - which it could be, I guess,
> > but it would be unusual for telepresence systems). Also, the audio in
> > (1) is not mixed, no matter how it is encoded.
> >
> > In any event, I believe that the difference between (1) and (2) and
> > (2b) is really a transport question that has nothing to do with
> > layout. The same information is needed to enable proper rendering,
> > and once the streams are received, they are rendered in precisely the
> > same way.
> >
> > Regards,
> > Stephen Botzko
> >
> > On Tue, Aug 16, 2011 at 4:23 PM, Charles Eckel (eckelcu) wrote:
> >
> > I am distinguishing between:
> >
> > (1) a single RTP stream that consists of a single stereo audio stream
> > (2) two RTP streams, one that contains left speaker audio and the
> > other that contains right speaker audio
> >
> > (2) could also be transmitted in a single RTP stream using SSRC
> > multiplexing. Let me call that (2b).
> > (2) and (2b) are essentially the same; just the RTP mechanism
> > employed is different.
> > (1) is different from (2) and (2b) in that the audio signal encoded
> > is actually different.
> >
> > Cheers,
> > Charles
> >
> > > -----Original Message-----
> > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > > Sent: Tuesday, August 16, 2011 6:20 AM
> > > To: Charles Eckel (eckelcu)
> > > Cc: Paul Kyzivat; clue@ietf.org
> > > Subject: Re: [clue] continuing "layout" discussion
> > >
> > > I guess by "stream" you mean RTP stream? In which case by "mix" you
> > > perhaps mean that the left and right channels are placed in a single
> > > RTP stream??? What do you mean when you describe some audio captures
> > > as "independent" - are you thinking they come from different rooms???
> > >
> > > I think in many respects audio distribution and spatial audio layout
> > > is at least as difficult as video layout, and has some unique issues.
> > > For one thing, you need to sort out how you should place the audio
> > > from human participants who are not on camera, and what should happen
> > > later on if some of those participants are shown.
> > >
> > > I suggest it is necessary to be very careful with terminology. In
> > > particular, I think it is important to distinguish composition from
> > > RTP transmission.
> > >
> > > Regards,
> > > Stephen Botzko
> > >
> > > On Mon, Aug 15, 2011 at 5:45 PM, Charles Eckel (eckelcu) wrote:
> > >
> > > > -----Original Message-----
> > > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > > > Sent: Monday, August 15, 2011 2:14 PM
> > > > To: Charles Eckel (eckelcu)
> > > > Cc: Paul Kyzivat; clue@ietf.org
> > > > Subject: Re: [clue] continuing "layout" discussion
> > > >
> > > > Inline
> > > >
> > > > On Mon, Aug 15, 2011 at 4:21 PM, Charles Eckel (eckelcu) wrote:
> > > >
> > > > Please see inline.
> > > >
> > > > > -----Original Message-----
> > > > > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On
> > > > > Behalf Of Paul Kyzivat
> > > > > Sent: Thursday, August 11, 2011 6:02 AM
> > > > > To: clue@ietf.org
> > > > > Subject: Re: [clue] continuing "layout" discussion
> > > > >
> > > > > Inline
> > > > >
> > > > > On 8/10/11 5:49 PM, Duckworth, Mark wrote:
> > > > > >> -----Original Message-----
> > > > > >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On
> > > > > >> Behalf Of Paul Kyzivat
> > > > > >> Sent: Tuesday, August 09, 2011 9:03 AM
> > > > > >> To: clue@ietf.org
> > > > > >> Subject: Re: [clue] continuing "layout" discussion
> > > > > >
> > > > > >>> 4 - multi stream media format - what the streams mean with
> > > > > >>> respect to
> > > > > >> each other, regardless of the actual content on the streams.
> > > > > >> For audio, examples are stereo, 5.1 surround, binaural, linear
> > > > > >> array. (linear array is described in the clue framework
> > > > > >> document). Perhaps 3D video formats would also fit in this
> > > > > >> category. This information is needed in order to properly
> > > > > >> render the media into light and sound for human observers. I
> > > > > >> see this at the same level as identifying a codec, independent
> > > > > >> of the audio or video content carried on the streams, and
> > > > > >> independent of how any composition of sources is done.
> > > >
> > > > I do not think this is necessarily true. Taking audio as an
> > > > example, you could have two audio streams that are mixed to form a
> > > > single stereo audio stream, or you could have them as two
> > > > independent (not mixed) streams that are associated with each other
> > > > by some grouping mechanism. This group would be categorized as
> > > > being stereo audio with one audio stream being the left and the
> > > > other the right. The codec used for each could be different, though
> > > > I agree they would typically be the same. Consequently, I think of
> > > > an attribute such as "stereo" as being more of a grouping concept,
> > > > where the group may consist of:
> > > > - multiple independent streams, each with potentially its own
> > > > spatial orientation, codec, bandwidth, etc.,
> > > > - a single mixed stream
> > > >
> > > > [sb] I do not understand this distinction. What do you mean when
> > > > you say "two audio streams that are mixed to form a single stereo
> > > > stream", and how is this different from the left and right
> > > > grouping?
> > >
> > > In one case they are mixed by the source of the stream into a single
> > > stream, and in another they are sent as two separate streams by the
> > > source. The end result once rendered at the receiver may be the same,
> > > but what is sent is different. This example with audio is perhaps too
> > > simple. If you think of it as video that is composed into a single
> > > video stream vs. multiple video streams that are sent individually,
> > > the difference may be more clear.
> > >
> > > Cheers,
> > > Charles
> > >
> > > >
> > > > Cheers,
> > > > Charles
> > > >
> > > > > >> I was with you all the way until 4. That one I don't
> > > > > >> understand. The name you chose for this has connotations for
> > > > > >> me, but isn't fully in harmony with the definitions you give:
> > > > > >
> > > > > > I'm happy to change the name if you have a suggestion
> > > > >
> > > > > Not yet. Maybe once the concepts are more clearly defined I will
> > > > > have an opinion.
> > > > >
> > > > > >> If we consider audio, it makes sense that multiple streams can
> > > > > >> be rendered as if they came from different physical locations
> > > > > >> in the receiving room. That can be done by the receiver if it
> > > > > >> gets those streams separately, and has information about their
> > > > > >> intended relationships. It can also be done by the sender or
> > > > > >> MCU and passed on to the receiver as a single stream with
> > > > > >> stereo or binaural coding.
> > > > > >
> > > > > > Yes. It could also be done by the sender using the "linear
> > > > > > array" audio channel format. Maybe it is true that stereo or
> > > > > > binaural audio channels would always be sent as a single
> > > > > > stream, but I was not assuming that yet, at least not in
> > > > > > general when you consider other types too, such as linear
> > > > > > array channels.
> > > > > >
> > > > > >> So it seems to me you have two concepts here, not one. One has
> > > > > >> to do with describing the relationships between streams, and
> > > > > >> the other has to do with the encoding of spatial relationships
> > > > > >> *within* a single stream.
> > > > > >
> > > > > > Maybe that is a better way to describe it, if you assume
> > > > > > multi-channel audio is always sent with all the channels in the
> > > > > > same RTP stream. Is that what you mean?
> > > > > >
> > > > > > I was considering the linear array format to be another type of
> > > > > > multi-channel audio, and I know people want to be able to send
> > > > > > each channel in a separate RTP stream. So it doesn't quite fit
> > > > > > with how you separate the two concepts. In my view, identifying
> > > > > > the separate channels by what they mean is the same concept for
> > > > > > linear array and stereo.
> For > > example > > > "this > > > > channel is left, this channel is > > > > > center, this channel is right". To me, that > is the > > same > > > concept for > > > > identifying channels whether or > > > > > not they are carried in the same RTP stream. > > > > > > > > > > > > Maybe we are thinking the same thing but > getting > > confused by > > > > terminology about channels vs. streams. > > > > > > > > > > Maybe. Let me try to restate what I now think > you are > > saying: > > > > > > > > > > The audio may consist of several "channels". > > > > > > > > > > Each channel may be sent over its own RTP > stream, > > > > > or multiple channels may be multiplexed over > an RTP > > stream. > > > > > > > > > > I guess much of this can also apply to video. > > > > > > > > > > When there are exactly two audio channels, > they may be > > encoded > > > as > > > > > "stereo" or "binaural", which then affects how > they > > should be > > > rendered > > > > > by the recipient. In these cases the primary > info that > > is > > > required > > > > about > > > > > the individual channels is which is left and > which is > > right. > > > (And > > > > which > > > > > perspective to use in interpretting left and > right.) > > > > > > > > > > For other multi-channel cases more information > is > > required > > > about the > > > > > role of each channel in order to properly > render them. > > > > > > > > > > Thanks, > > > > > Paul > > > > > > > > > > > > > > > >> Or, are you asserting that stereo and > binaural are > > simply > > > ways to > > > > > >> encode > > > > > >> multiple logical streams in one RTP stream, > > together with > > > their > > > > spacial > > > > > >> relationships? > > > > > > > > > > > > No, that is not what I'm trying to say. 
> > > > > > > > > > > > Mark > > > > > > > _______________________________________________ > > > > > > clue mailing list > > > > > > clue@ietf.org > > > > > > https://www.ietf.org/mailman/listinfo/clue > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > clue mailing list > > > > > clue@ietf.org > > > > > https://www.ietf.org/mailman/listinfo/clue > > > > _______________________________________________ > > > > clue mailing list > > > > clue@ietf.org > > > > https://www.ietf.org/mailman/listinfo/clue > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/listinfo/clue From stephen.botzko@gmail.com Tue Aug 16 18:19:40 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id CE98E21F8B05 for ; Tue, 16 Aug 2011 18:19:40 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -3.371 X-Spam-Level: X-Spam-Status: No, score=-3.371 tagged_above=-999 required=5 tests=[AWL=0.227, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Iet2Hf71+L5G for ; Tue, 16 Aug 2011 18:19:39 -0700 (PDT) Received: from mail-vw0-f44.google.com (mail-vw0-f44.google.com [209.85.212.44]) by ietfa.amsl.com (Postfix) with ESMTP id A578721F8AFA for ; Tue, 16 Aug 2011 18:19:38 -0700 (PDT) Received: by vws12 with SMTP id 12so313128vws.31 for ; Tue, 16 Aug 2011 18:20:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=/2WkaNoxVi9ETCLE5rZxjBSP6jP72qI6Pb1rQ3J2UPs=; b=C9H/fPfnuCT9EltMZYikORCq43Uv46n966OyBpmIpm0jSNmT7pMkHy5g9OdVB3T/xk 
fDa+uNx1fQAatRuDr7awopqzw4jeGJRiqfIOhJexf/v/Lj1jh4a+oheA1pDqt1emyM/K VMYZK6a0B9kUAfc8Xnr5VvMFW1t7snZ7nkIEw= MIME-Version: 1.0 Received: by 10.52.23.20 with SMTP id i20mr365925vdf.356.1313544026308; Tue, 16 Aug 2011 18:20:26 -0700 (PDT) Received: by 10.52.115.103 with HTTP; Tue, 16 Aug 2011 18:20:26 -0700 (PDT) In-Reply-To: References: <44C6B6B2D0CF424AA90B6055548D7A61AE9B48AD@CRPMBOXPRD01.polycom.com> <4E413021.3010509@alum.mit.edu> <44C6B6B2D0CF424AA90B6055548D7A61AEA65C62@CRPMBOXPRD01.polycom.com> <4E43D2BE.5010102@alum.mit.edu> Date: Tue, 16 Aug 2011 21:20:26 -0400 Message-ID: From: Stephen Botzko To: "Charles Eckel (eckelcu)" Content-Type: multipart/alternative; boundary=20cf3079b87015c09804aaa94d16 Cc: clue@ietf.org Subject: Re: [clue] continuing "layout" discussion X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Aug 2011 01:19:40 -0000 --20cf3079b87015c09804aaa94d16 Content-Type: text/plain; charset=ISO-8859-1 That is not a correct conclusion. In order to render correctly (for 1, 2, and 2b), you need to align the apparent sound location with the rendered video image. This requires CLUE, it is not covered at all by existing SIP/SDP/RTP signaling. Regards, Stephen On Tue, Aug 16, 2011 at 5:40 PM, Charles Eckel (eckelcu) wrote: > Agreed. The difference I am trying to point out is that in (1), the > information you need to describe the audio stream for appropriate > rendering is already handled quite well by existing SIP/SDP/RTP and most > implementations, whereas you need CLUE for (2) and (2b). 
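[Editor's note] Charles's (1)/(2) distinction below can be sketched in SDP terms. This is an illustrative fragment only: the port numbers, payload types, and the RFC 5888 "LS" (lip sync) grouping are assumptions, and lines beginning with "#" are annotations, not valid SDP. Notably, pre-CLUE SDP has no standard way to say which channel is left and which is right, which is part of the point being argued in this thread.

```
# (1) one RTP stream carrying a jointly coded two-channel (stereo) signal
m=audio 49170 RTP/AVP 96
a=rtpmap:96 L16/48000/2

# (2) two RTP streams, one channel each, associated via an RFC 5888 group;
# which mid is "left" and which is "right" is not expressible here
a=group:LS 1 2
m=audio 49172 RTP/AVP 97
a=rtpmap:97 L16/48000
a=mid:1
m=audio 49174 RTP/AVP 97
a=rtpmap:97 L16/48000
a=mid:2
```

In (2b), the same two channels would share one m-line and be distinguished only by their RTP SSRCs, which plain SDP describes even less well.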
> > Cheers, > Charles > > > -----Original Message----- > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com] > > Sent: Tuesday, August 16, 2011 2:14 PM > > To: Charles Eckel (eckelcu) > > Cc: Paul Kyzivat; clue@ietf.org > > Subject: Re: [clue] continuing "layout" discussion > > > > Well, the audio in (1) and (2b) is certainly packetized differently. > But not compressed differently > > (unless you are assuming that the signal in (1) is jointly encoded > stereo - which it could be I guess, > > but it would be unusual for telepresence systems). Also, the audio in > (1) is not mixed, no matter how > > it is encoded. > > > > In any event, I believe that the difference between (1) and (2) and > (2b) is really a transport > > question that has nothing to do with layout. The same information is > needed to enable proper > > rendering, and once the streams are received, they are rendered in > precisely the same way. > > > > Regards, > > Stephen Botzko > > > > > > On Tue, Aug 16, 2011 at 4:23 PM, Charles Eckel (eckelcu) > wrote: > > > > > > I am distinguishing between: > > > > (1) a single RTP stream that consists of a single stereo audio > stream > > (2) two RTP streams, one that contains left speaker audio and > the other > > that contains right speaker audio > > > > (2) could also be transmitted in a single RTP stream using SSRC > > multiplexing. Let me call that (2b). > > (2) and (2b) are essentially the same. Just the RTP mechanism > employed > > is different. > > (1) is different from (2) and (2b) in that the audio signal > encoded is > > actually different. > > > > Cheers, > > Charles > > > > > > > -----Original Message----- > > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com] > > > > > Sent: Tuesday, August 16, 2011 6:20 AM > > > To: Charles Eckel (eckelcu) > > > Cc: Paul Kyzivat; clue@ietf.org > > > Subject: Re: [clue] continuing "layout" discussion > > > > > > I guess by "stream" you are meaning RTP stream?
in which case > by > > "mix" you perhaps mean that the left > > > and right channels are placed in a single RTP stream??? What > do you > > mean when you describe some audio > > > captures as "independent" - are you thinking they come from > different > > rooms???. > > > > > > I think in many respects audio distribution and spatial audio > layout > > is at least as difficult as video > > > layout, and have some unique issues. For one thing, you need > to sort > > out how you should place the > > > audio from human participants who are not on camera, and what > should > > happen later on if some of those > > > participants are shown. > > > > > > I suggest it is necessary to be very careful with terminology. > In > > particular, I think it is important > > > to distinguish composition from RTP transmission. > > > > > > Regards, > > > Stephen Botzko > > > > > > > > > > > > On Mon, Aug 15, 2011 at 5:45 PM, Charles Eckel (eckelcu) > > wrote: > > > > > > > > > > -----Original Message----- > > > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com] > > > > Sent: Monday, August 15, 2011 2:14 PM > > > > To: Charles Eckel (eckelcu) > > > > Cc: Paul Kyzivat; clue@ietf.org > > > > Subject: Re: [clue] continuing "layout" discussion > > > > > > > > Inline > > > > > > > > > > > > On Mon, Aug 15, 2011 at 4:21 PM, Charles Eckel > (eckelcu) > > > wrote: > > > > > > > > > > > > Please see inline. 
> > > > > > > > > > > > > -----Original Message----- > > > > > From: clue-bounces@ietf.org > > [mailto:clue-bounces@ietf.org] On > > > Behalf > > > > Of Paul Kyzivat > > > > > > > > > Sent: Thursday, August 11, 2011 6:02 AM > > > > > > > > > To: clue@ietf.org > > > > > Subject: Re: [clue] continuing "layout" > discussion > > > > > > > > > > Inline > > > > > > > > > > On 8/10/11 5:49 PM, Duckworth, Mark wrote: > > > > > >> -----Original Message----- > > > > > >> From: clue-bounces@ietf.org > > [mailto:clue-bounces@ietf.org] > > > On > > > > Behalf Of > > > > > >> Paul Kyzivat > > > > > >> Sent: Tuesday, August 09, 2011 9:03 AM > > > > > >> To: clue@ietf.org > > > > > >> Subject: Re: [clue] continuing "layout" > discussion > > > > > > > > > > > >>> 4 - multi stream media format - what the > streams > > mean with > > > respect > > > > to > > > > > >> each other, regardless of the actual > content on the > > > streams. For > > > > > >> audio, examples are stereo, 5.1 surround, > binaural, > > linear > > > array. > > > > > >> (linear array is described in the clue > framework > > document). > > > > Perhaps 3D > > > > > >> video formats would also fit in this > category. > > This > > > information is > > > > > >> needed in order to properly render the > media into > > light and > > > sound > > > > for > > > > > >> human observers. I see this at the same > level as > > > identifying a > > > > codec, > > > > > >> independent of the audio or video content > carried > > on the > > > streams, > > > > and > > > > > >> independent of how any composition of > sources is > > done. > > > > > > > > > > > > I do not think this is necessarily true. Taking > audio as > > an > > > example, you > > > > could have two audio streams that are mixed to > form a > > single > > > stereo > > > > audio stream, or you could have them as two > independent > > (not > > > mixed) > > > > streams that are associate with each other by > some > > grouping > > > mechanism. 
> > > > This group would be categorized as being stereo > audio > > with one > > > audio > > > > stream being the left and the other the right. > The codec > > used > > > for each > > > > could be different, though I agree they would > typically > > be the > > > same. > > > > Consequently, I think at attribute such as > "stereo" as > > being > > > more of a > > > > grouping concept, where the group may consist > of: > > > > - multiple independent streams, each with > potentially > > its own > > > spatial > > > > orientation, codec, bandwidth, etc., > > > > - a single mixed stream > > > > > > > > > > > > > > > > [sb] I do not understand this distinction. What do > you mean > > when you > > > say "two audio streams that are > > > > mixed to form a single stereo stream", and how is this > > different from > > > the left and right grouping? > > > > > > > > > In one case they are mixed by the source of the stream > into a > > single > > > stream, and in another they are sent as two separate > streams by > > the > > > source. The end result once rendered at the receiver may > be the > > same, > > > but what is sent is different. This example with audio > is > > perhaps too > > > simple. If you think of it as video that is composed > into a > > single video > > > stream vs. multiple via streams that are sent > individually, the > > > difference may be more clear. > > > > > > Cheers, > > > Charles > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > Charles > > > > > > > > > > > > > >> I was with you all the way until 4. That > one I > > don't > > > understand. > > > > > >> The name you chose for this has > connotations for > > me, but > > > isn't > > > > fully in > > > > > >> harmony with the definitions you give: > > > > > > > > > > > > I'm happy to change the name if you have a > > suggestion > > > > > > > > > > Not yet. Maybe once the concepts are more > clearly > > defined I > > > will have > > > > an > > > > > opinion. 
> > > > > > > > > > >> If we consider audio, it makes sense that > multiple > > streams > > > can be > > > > > >> rendered as if they came from different > physical > > locations > > > in the > > > > > >> receiving room. That can be done by the > receiver if > > it gets > > > those > > > > > >> streams separately, and has information > about their > > > intended > > > > > >> relationships. It can also be done by the > sender or > > MCU and > > > passed > > > > on > > > > > >> to > > > > > >> the receiver as a single stream with stereo > or > > binaural > > > coding. > > > > > > > > > > > > Yes. It could also be done by the sender > using the > > "linear > > > array" > > > > audio channel format. Maybe it > > > > > is true that stereo or binaural audio channels > would > > always be > > > sent as > > > > a single stream, but I was not > > > > > assuming that yet, at least not in general > when you > > consider > > > other > > > > types too, such as linear array > > > > > channels. > > > > > > > > > > >> So it seems to me you have two concepts > here, not > > one. One > > > has to > > > > do > > > > > >> with describing the relationships between > streams, > > and the > > > other > > > > has to > > > > > >> do with the encoding of spacial > relationships > > *within* a > > > single > > > > stream. > > > > > > > > > > > > Maybe that is a better way to describe it, > if you > > assume > > > > multi-channel audio is always sent with all > > > > > the channels in the same RTP stream. Is that > what you > > mean? > > > > > > > > > > > > I was considering the linear array format to > be > > another type > > > of > > > > multi-channel audio, and I know > > > > > people want to be able to send each channel in > a > > separate RTP > > > stream. > > > > So it doesn't quite fit with > > > > > how you separate the two concepts. In my > view, > > identifying > > > the > > > > separate channels by what they mean is > > > > > the same concept for linear array and stereo. 
> For > > example > > > "this > > > > channel is left, this channel is > > > > > center, this channel is right". To me, that > is the > > same > > > concept for > > > > identifying channels whether or > > > > > not they are carried in the same RTP stream. > > > > > > > > > > > > Maybe we are thinking the same thing but > getting > > confused by > > > > terminology about channels vs. streams. > > > > > > > > > > Maybe. Let me try to restate what I now think > you are > > saying: > > > > > > > > > > The audio may consist of several "channels". > > > > > > > > > > Each channel may be sent over its own RTP > stream, > > > > > or multiple channels may be multiplexed over > an RTP > > stream. > > > > > > > > > > I guess much of this can also apply to video. > > > > > > > > > > When there are exactly two audio channels, > they may be > > encoded > > > as > > > > > "stereo" or "binaural", which then affects how > they > > should be > > > rendered > > > > > by the recipient. In these cases the primary > info that > > is > > > required > > > > about > > > > > the individual channels is which is left and > which is > > right. > > > (And > > > > which > > > > > perspective to use in interpretting left and > right.) > > > > > > > > > > For other multi-channel cases more information > is > > required > > > about the > > > > > role of each channel in order to properly > render them. > > > > > > > > > > Thanks, > > > > > Paul > > > > > > > > > > > > > > > >> Or, are you asserting that stereo and > binaural are > > simply > > > ways to > > > > > >> encode > > > > > >> multiple logical streams in one RTP stream, > > together with > > > their > > > > spacial > > > > > >> relationships? > > > > > > > > > > > > No, that is not what I'm trying to say. 
> > > > > > > > > > > > Mark > > > > > > > _______________________________________________ > > > > > > clue mailing list > > > > > > clue@ietf.org > > > > > > https://www.ietf.org/mailman/listinfo/clue > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > clue mailing list > > > > > clue@ietf.org > > > > > https://www.ietf.org/mailman/listinfo/clue > > > > _______________________________________________ > > > > clue mailing list > > > > clue@ietf.org > > > > https://www.ietf.org/mailman/listinfo/clue > > > > > > > > > > > > > > > > > > > > > > > > > --20cf3079b87015c09804aaa94d16 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable That is not a correct conclusion.=A0 In order to render correctly (for 1, 2= , and 2b), you need to align the apparent sound location with the rendered = video image.=A0 This requires CLUE, it is not covered at all by existing SI= P/SDP/RTP signaling.

Regards,
Stephen

On Tue, Aug 16, 2= 011 at 5:40 PM, Charles Eckel (eckelcu) <eckelcu@cisco.com> wrote:
Agreed. The difference I am trying to point out is that in (1), the
information you need to describe the audio stream for appropriate
rendering is already handled quite well by existing SIP/SDP/RTP and most implementations, whereas you need CLUE for (2) and (2b).

Cheers,
Charles

> -----Original Message-----
> From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> Sent: Tuesday, August 16, 2011= 2:14 PM
> To: Charles Eckel (eckelcu)
> Cc: Paul Kyzivat; clue@ietf.org > Subject: Re: [clue] continuing "layout" discussion
>
> Well, the audio in (1) and (2b) is certainly packetized differently. But not compressed differently
> (unless you are assuming that the signal in (1) is jointly encoded
stereo - which it could be I guess,
> but it would be unusual for telepresence systems). Also, the audio in<= br> (1) is not mixed, no matter how
> it is encoded.
>
> In any event, I believe that the difference between (1) and (2) and (2b) is really a transport
> question that has nothing to do with layout. The same information is needed to enable proper
> rendering, and once the streams are received, they are rendered in
precisely the same way.
>
> Regards,
> Stephen Botzko
>
>
> On Tue, Aug 16, 2011 at 4:23 PM, Charles Eckel (eckelcu)
<eckelcu@cisco.com> wrote: >
>
> =A0 =A0 =A0 I am distinguishing between:
>
> =A0 =A0 =A0 (1) a single RTP stream that consists of a single stereo a= udio
stream
> =A0 =A0 =A0 (2) two RTP streams, one that contains left speaker audio = and
the other
> =A0 =A0 =A0 than contains right speaker audio
>
> =A0 =A0 =A0 (2) could also be transmitted in a single RTP stream using= SSRC
> =A0 =A0 =A0 multiplexing. Let me call that (2b).
> =A0 =A0 =A0 (2) and (2b) are essentially the same. Just the RTP mechan= ism
employed
> =A0 =A0 =A0 is difference.
> =A0 =A0 =A0 (1) is different from (2) and (2b) in that the audio signa= l
encoded is
> =A0 =A0 =A0 actually different.
>
> =A0 =A0 =A0 Cheers,
> =A0 =A0 =A0 Charles
>
>
> =A0 =A0 =A0 > -----Original Message-----
> =A0 =A0 =A0 > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
>
> =A0 =A0 =A0 > Sent: Tuesday, August 16, 2011 6:20 AM
> =A0 =A0 =A0 > To: Charles Eckel (eckelcu)
> =A0 =A0 =A0 > Cc: Paul Kyzivat; cl= ue@ietf.org
> =A0 =A0 =A0 > Subject: Re: [clue] continuing "layout" dis= cussion
> =A0 =A0 =A0 >
> =A0 =A0 =A0 > I guess by "stream" you are meaning RTP str= eam? =A0in which case
by
> =A0 =A0 =A0 "mix" you perhaps mean that the left
> =A0 =A0 =A0 > and right channels are placed in a single RTP stream?= ?? =A0What
do you
> =A0 =A0 =A0 mean when you describe some audio
> =A0 =A0 =A0 > captures as "independent" - are you thinkin= g they come from
different
> =A0 =A0 =A0 rooms???.
> =A0 =A0 =A0 >
> =A0 =A0 =A0 > I think in many respects audio distribution and spati= al audio
layout
> =A0 =A0 =A0 is at least as difficult as video
> =A0 =A0 =A0 > layout, and have some unique issues. =A0For one thing= , you need
to sort
> =A0 =A0 =A0 out how you should place the
> =A0 =A0 =A0 > audio from human participants who are not on camera, = and what
should
> =A0 =A0 =A0 happen later on if some of those
> =A0 =A0 =A0 > participants are shown.
> =A0 =A0 =A0 >
> =A0 =A0 =A0 > I suggest it is necessary to be very careful with ter= minology.
In
> =A0 =A0 =A0 particular, I think it is important
> =A0 =A0 =A0 > to distinguish composition from RTP transmission.
> =A0 =A0 =A0 >
> =A0 =A0 =A0 > Regards,
> =A0 =A0 =A0 > Stephen Botzko
> =A0 =A0 =A0 >
> =A0 =A0 =A0 >
> =A0 =A0 =A0 >
> =A0 =A0 =A0 > On Mon, Aug 15, 2011 at 5:45 PM, Charles Eckel (eckel= cu)
> =A0 =A0 =A0 <eckelcu@cisco.com= > wrote:
> =A0 =A0 =A0 >
> =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > -----Original Message-----
> =A0 =A0 =A0 > =A0 =A0 =A0 > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> =A0 =A0 =A0 > =A0 =A0 =A0 > Sent: Monday, August 15, 2011 2:14 P= M
> =A0 =A0 =A0 > =A0 =A0 =A0 > To: Charles Eckel (eckelcu)
> =A0 =A0 =A0 > =A0 =A0 =A0 > Cc: Paul Kyzivat; clue@ietf.org
> =A0 =A0 =A0 > =A0 =A0 =A0 > Subject: Re: [clue] continuing "= ;layout" discussion
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > Inline
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > On Mon, Aug 15, 2011 at 4:21 PM, Cha= rles Eckel
(eckelcu)
> =A0 =A0 =A0 > =A0 =A0 =A0 <= eckelcu@cisco.com> wrote:
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Please see inline.
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > -----Original Messa= ge-----
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > From: clue-bounces@ietf.org
> =A0 =A0 =A0 [mailto:clue-boun= ces@ietf.org] On
> =A0 =A0 =A0 > =A0 =A0 =A0 Behalf
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Of Paul Kyzivat
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > Sent: Thursday, Aug= ust 11, 2011 6:02 AM
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > To: clue@ietf.org
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > Subject: Re: [clue]= continuing "layout"
discussion
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > Inline
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > On 8/10/11 5:49 PM,= Duckworth, Mark wrote:
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> -----Origi= nal Message-----
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> From: clue-bounces@ietf.org
> =A0 =A0 =A0 [mailto:clue-boun= ces@ietf.org]
> =A0 =A0 =A0 > =A0 =A0 =A0 On
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Behalf Of
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> Paul Kyziv= at
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> Sent: Tues= day, August 09, 2011 9:03 AM
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> To: clue@ietf.org
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> Subject: R= e: [clue] continuing "layout"
discussion
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >>> 4 - mu= lti stream media format - what the
streams
> =A0 =A0 =A0 mean with
> =A0 =A0 =A0 > =A0 =A0 =A0 respect
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 to
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> each other= , regardless of the actual
content on the
> =A0 =A0 =A0 > =A0 =A0 =A0 streams. =A0For
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> audio, exa= mples are stereo, 5.1 surround,
binaural,
> =A0 =A0 =A0 linear
> =A0 =A0 =A0 > =A0 =A0 =A0 array.
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> (linear ar= ray is described in the clue
framework
> =A0 =A0 =A0 document).
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Perhaps 3D
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> video form= ats would also fit in this
category.
> =A0 =A0 =A0 This
> =A0 =A0 =A0 > =A0 =A0 =A0 information is
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> needed in = order to properly render the
media into
> =A0 =A0 =A0 light and
> =A0 =A0 =A0 > =A0 =A0 =A0 sound
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 for
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> human obse= rvers. =A0I see this at the same
level as
> =A0 =A0 =A0 > =A0 =A0 =A0 identifying a
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 codec,
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> independen= t of the audio or video content
carried
> =A0 =A0 =A0 on the
> =A0 =A0 =A0 > =A0 =A0 =A0 streams,
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 and
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> independen= t of how any composition of
sources is
> =A0 =A0 =A0 done.
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 I do not think this is n= ecessarily true. Taking
audio as
> =A0 =A0 =A0 an
> =A0 =A0 =A0 > =A0 =A0 =A0 example, you
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 could have two audio str= eams that are mixed to
form a
> =A0 =A0 =A0 single
> =A0 =A0 =A0 > =A0 =A0 =A0 stereo
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 audio stream, or you cou= ld have them as two
independent
> =A0 =A0 =A0 (not
> =A0 =A0 =A0 > =A0 =A0 =A0 mixed)
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 streams that are associa= te with each other by
some
> =A0 =A0 =A0 grouping
> =A0 =A0 =A0 > =A0 =A0 =A0 mechanism.
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 This group would be cate= gorized as being stereo
audio
> =A0 =A0 =A0 with one
> =A0 =A0 =A0 > =A0 =A0 =A0 audio
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 stream being the left an= d the other the right.
The codec
> =A0 =A0 =A0 used
> =A0 =A0 =A0 > =A0 =A0 =A0 for each
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 could be different, thou= gh I agree they would
typically
> =A0 =A0 =A0 be the
> =A0 =A0 =A0 > =A0 =A0 =A0 same.
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Consequently, I think at= attribute such as
"stereo" as
> =A0 =A0 =A0 being
> =A0 =A0 =A0 > =A0 =A0 =A0 more of a
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 grouping concept, where = the group may consist
of:
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 - multiple independent s= treams, each with
potentially
> =A0 =A0 =A0 its own
> =A0 =A0 =A0 > =A0 =A0 =A0 spatial
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 orientation, codec, band= width, etc.,
> =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 - a single mixed stream<= br> > =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 >
> =A0 =A0 =A0 > =A0 =A0 =A0 > [sb] I do not understand this distin= ction. =A0What do
you mean
> =A0 =A0 =A0 when you
> =A0 =A0 =A0 > =A0 =A0 =A0 say "two audio streams that are
> =A0 =A0 =A0 > =A0 =A0 =A0 > mixed to form a single stereo stream= ", and how is this
> =A0 =A0 =A0 different from
> =A0 =A0 =A0 > =A0 =A0 =A0 the left and right grouping?
>
>       In one case they are mixed by the source of the stream into a
>       single stream, and in another they are sent as two separate
>       streams by the source. The end result once rendered at the
>       receiver may be the same, but what is sent is different. This
>       example with audio is perhaps too simple. If you think of it as
>       video that is composed into a single video stream vs. multiple
>       video streams that are sent individually, the difference may be
>       more clear.
>
>       Cheers,
>       Charles
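The distinction Charles draws can be sketched in a few lines of Python. This is illustrative only; the function names and data layout are invented here, not taken from any CLUE draft. The same two mono captures can either be mixed by the source into one interleaved stereo stream, or sent as two separate streams tied together by a grouping attribute; what the receiver renders may be identical, but what is sent differs.

```python
def source_mixed(left, right):
    """Source-side mix: interleave L/R samples into one stereo stream."""
    stereo = []
    for l_samp, r_samp in zip(left, right):
        stereo.extend([l_samp, r_samp])
    return {"streams": [stereo], "format": "stereo-interleaved"}

def source_grouped(left, right):
    """Alternative: two independent streams plus grouping metadata."""
    return {
        "streams": [left, right],
        "group": {"type": "stereo", "roles": ["left", "right"]},
    }

left = [10, 20, 30]
right = [11, 21, 31]
one_stream = source_mixed(left, right)     # one stream on the wire
two_streams = source_grouped(left, right)  # two streams + a grouping attribute
```

Either form can carry the same audio; the grouping metadata in the second form is what lets the receiver reassemble the spatial relationship itself.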
>
> > Cheers,
> > Charles
> >
> > >> I was with you all the way until 4. That one I don't understand.
> > >> The name you chose for this has connotations for me, but isn't
> > >> fully in harmony with the definitions you give:
> > >
> > > I'm happy to change the name if you have a suggestion
> >
> > Not yet. Maybe once the concepts are more clearly defined I will
> > have an opinion.
> >
> > >> If we consider audio, it makes sense that multiple streams can be
> > >> rendered as if they came from different physical locations in the
> > >> receiving room. That can be done by the receiver if it gets those
> > >> streams separately, and has information about their intended
> > >> relationships. It can also be done by the sender or MCU and passed
> > >> on to the receiver as a single stream with stereo or binaural
> > >> coding.
> > >
> > > Yes. It could also be done by the sender using the "linear array"
> > > audio channel format. Maybe it is true that stereo or binaural
> > > audio channels would always be sent as a single stream, but I was
> > > not assuming that yet, at least not in general when you consider
> > > other types too, such as linear array channels.
> >
> > >> So it seems to me you have two concepts here, not one. One has to
> > >> do with describing the relationships between streams, and the
> > >> other has to do with the encoding of spatial relationships
> > >> *within* a single stream.
> > >
> > > Maybe that is a better way to describe it, if you assume
> > > multi-channel audio is always sent with all the channels in the
> > > same RTP stream. Is that what you mean?
> > > I was considering the linear array format to be another type of
> > > multi-channel audio, and I know people want to be able to send
> > > each channel in a separate RTP stream. So it doesn't quite fit
> > > with how you separate the two concepts. In my view, identifying
> > > the separate channels by what they mean is the same concept for
> > > linear array and stereo. For example "this channel is left, this
> > > channel is center, this channel is right". To me, that is the same
> > > concept for identifying channels whether or not they are carried
> > > in the same RTP stream.
> > >
> > > Maybe we are thinking the same thing but getting confused by
> > > terminology about channels vs. streams.
> >
> > Maybe. Let me try to restate what I now think you are saying:
> >
> > The audio may consist of several "channels".
> >
> > Each channel may be sent over its own RTP stream, or multiple
> > channels may be multiplexed over an RTP stream.
> >
> > I guess much of this can also apply to video.
> >
> > When there are exactly two audio channels, they may be encoded as
> > "stereo" or "binaural", which then affects how they should be
> > rendered by the recipient. In these cases the primary info that is
> > required about the individual channels is which is left and which is
> > right. (And which perspective to use in interpreting left and
> > right.)
> >
> > For other multi-channel cases more information is required about the
> > role of each channel in order to properly render them.
> >
> > Thanks,
> > Paul
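Paul's restatement separates two ideas that the thread keeps conflating, and a small sketch makes the separation concrete. The data layout below is invented for illustration only: logical channels are one concept, and the RTP streams that carry them are another; the same channel set can map onto the transport either way.

```python
def one_stream_per_channel(channels):
    """Each channel travels in its own RTP stream."""
    return [{"rtp_stream": i, "channels": [ch]} for i, ch in enumerate(channels)]

def multiplexed(channels):
    """All channels share a single RTP stream."""
    return [{"rtp_stream": 0, "channels": list(channels)}]

chans = ["left", "right"]
separate = one_stream_per_channel(chans)  # two RTP streams, one channel each
shared = multiplexed(chans)               # one RTP stream carrying both channels
```

In both mappings the channels and their roles are unchanged; only the transport-level packaging differs, which is Stephen's "composition vs. RTP transmission" point as well.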
> >
> > >> Or, are you asserting that stereo and binaural are simply ways to
> > >> encode multiple logical streams in one RTP stream, together with
> > >> their spatial relationships?
> > >
> > > No, that is not what I'm trying to say.
> > >
> > > Mark
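Mark's point about identifying channels "by what they mean" can be sketched as follows. The identifiers here are hypothetical, not from the framework draft: labeling each channel by its role is one concept, applicable unchanged to stereo and to a linear array, and independent of whether the channels share an RTP stream.

```python
def label_channels(roles, payloads):
    """Attach a role name to each channel payload, independent of transport."""
    if len(roles) != len(payloads):
        raise ValueError("one role per channel required")
    return dict(zip(roles, payloads))

# The same labeling works for stereo and for a three-element linear array:
stereo = label_channels(("left", "right"), [b"L-pcm", b"R-pcm"])
array = label_channels(("left", "center", "right"), [b"L", b"C", b"R"])
```

Whether the labeled channels are then sent one per RTP stream or multiplexed into one is a separate decision that this mapping does not constrain.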
> >
> > _______________________________________________
> > clue mailing list
> > clue@ietf.org
> > https://www.ietf.org/mailman/listinfo/clue


--20cf3079b87015c09804aaa94d16--

From stephen.botzko@gmail.com Tue Aug 16 18:35:52 2011
From: Stephen Botzko <stephen.botzko@gmail.com>
To: Roni Even
Cc: clue@ietf.org
Date: Tue, 16 Aug 2011 21:36:39 -0400
Subject: Re: [clue] continuing "layout" discussion

Hi Roni

For this particular discussion, all of the two-channel transmissions are
"stereo"; they are just transported differently.

As far as the framework draft is concerned, the various microphone
arrangements are accounted for by the signaling of the 1-100 indices for
each channel.

Binaural is something else: either an HRTF function is applied to the two
channels prior to rendering (which was Christer's case with the central
rendering server), or you have a dummy head with microphones in the ears
in the telepresence room to make the capture. Not sure if we need to
distinguish the capture and render cases right now.

Regards,
Stephen

On Tue, Aug 16, 2011 at 7:34 PM, Roni Even <Even.roni@huawei.com> wrote:
> Hi guys,
> In case 1, according to RFC 3551 (section 4.1), 2 channels in the
> rtpmap means left and right channels, described as stereo. Are you
> saying that for the 2 and 2b case you also assume stereo capture, or
> can it be any other way of creating the two audio streams from the
> same room (binaural recording (not common), or some other arrangement
> of the microphones)? But this talks about the capture side.
>
> I think that Christer talked about the rendering side and not only
> about the capture side.
> Roni
>
> > -----Original Message-----
> > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf
> > Of Charles Eckel (eckelcu)
> > Sent: Wednesday, August 17, 2011 12:40 AM
> > To: Stephen Botzko
> > Cc: clue@ietf.org
> > Subject: Re: [clue] continuing "layout" discussion
> >
> > Agreed. The difference I am trying to point out is that in (1), the
> > information you need to describe the audio stream for appropriate
> > rendering is already handled quite well by existing SIP/SDP/RTP and
> > most implementations, whereas you need CLUE for (2) and (2b).
> >
> > Cheers,
> > Charles
> >
> > > -----Original Message-----
> > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > > Sent: Tuesday, August 16, 2011 2:14 PM
> > > To: Charles Eckel (eckelcu)
> > > Cc: Paul Kyzivat; clue@ietf.org
> > > Subject: Re: [clue] continuing "layout" discussion
> > >
> > > Well, the audio in (1) and (2b) is certainly packetized
> > > differently. But not compressed differently (unless you are
> > > assuming that the signal in (1) is jointly encoded stereo - which
> > > it could be I guess, but it would be unusual for telepresence
> > > systems). Also, the audio in (1) is not mixed, no matter how it
> > > is encoded.
> > >
> > > In any event, I believe that the difference between (1) and (2)
> > > and (2b) is really a transport question that has nothing to do
> > > with layout. The same information is needed to enable proper
> > > rendering, and once the streams are received, they are rendered
> > > in precisely the same way.
> > >
> > > Regards,
> > > Stephen Botzko
> > >
> > > On Tue, Aug 16, 2011 at 4:23 PM, Charles Eckel (eckelcu)
> > > <eckelcu@cisco.com> wrote:
> > >
> > > > I am distinguishing between:
> > > >
> > > > (1) a single RTP stream that consists of a single stereo audio
> > > > stream
> > > > (2) two RTP streams, one that contains left speaker audio and
> > > > the other that contains right speaker audio
> > > >
> > > > (2) could also be transmitted in a single RTP stream using SSRC
> > > > multiplexing. Let me call that (2b).
> > > > (2) and (2b) are essentially the same; just the RTP mechanism
> > > > employed is different.
> > > > (1) is different from (2) and (2b) in that the audio signal
> > > > encoded is actually different.
> > > >
> > > > Cheers,
> > > > Charles
> > > >
> > > > > -----Original Message-----
> > > > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > > > > Sent: Tuesday, August 16, 2011 6:20 AM
> > > > > To: Charles Eckel (eckelcu)
> > > > > Cc: Paul Kyzivat; clue@ietf.org
> > > > > Subject: Re: [clue] continuing "layout" discussion
> > > > >
> > > > > I guess by "stream" you are meaning RTP stream? In which case
> > > > > by "mix" you perhaps mean that the left and right channels
> > > > > are placed in a single RTP stream? What do you mean when you
> > > > > describe some audio captures as "independent" - are you
> > > > > thinking they come from different rooms?
> > > > >
> > > > > I think in many respects audio distribution and spatial audio
> > > > > layout is at least as difficult as video layout, and has some
> > > > > unique issues. For one thing, you need to sort out how you
> > > > > should place the audio from human participants who are not
> > > > > on camera, and what should happen later on if some of those
> > > > > participants are shown.
> > > > >
> > > > > I suggest it is necessary to be very careful with
> > > > > terminology. In particular, I think it is important to
> > > > > distinguish composition from RTP transmission.
> > > > > > > > Regards, > > > > Stephen Botzko > > > > > > > > > > > > > > > > On Mon, Aug 15, 2011 at 5:45 PM, Charles Eckel (eckelcu) > > > wrote: > > > > > > > > > > > > > -----Original Message----- > > > > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com] > > > > > Sent: Monday, August 15, 2011 2:14 PM > > > > > To: Charles Eckel (eckelcu) > > > > > Cc: Paul Kyzivat; clue@ietf.org > > > > > Subject: Re: [clue] continuing "layout" discussion > > > > > > > > > > Inline > > > > > > > > > > > > > > > On Mon, Aug 15, 2011 at 4:21 PM, Charles Eckel > > (eckelcu) > > > > wrote: > > > > > > > > > > > > > > > Please see inline. > > > > > > > > > > > > > > > > -----Original Message----- > > > > > > From: clue-bounces@ietf.org > > > [mailto:clue-bounces@ietf.org] On > > > > Behalf > > > > > Of Paul Kyzivat > > > > > > > > > > > Sent: Thursday, August 11, 2011 6:02 AM > > > > > > > > > > > To: clue@ietf.org > > > > > > Subject: Re: [clue] continuing "layout" > > discussion > > > > > > > > > > > > Inline > > > > > > > > > > > > On 8/10/11 5:49 PM, Duckworth, Mark wrote: > > > > > > >> -----Original Message----- > > > > > > >> From: clue-bounces@ietf.org > > > [mailto:clue-bounces@ietf.org] > > > > On > > > > > Behalf Of > > > > > > >> Paul Kyzivat > > > > > > >> Sent: Tuesday, August 09, 2011 9:03 AM > > > > > > >> To: clue@ietf.org > > > > > > >> Subject: Re: [clue] continuing "layout" > > discussion > > > > > > > > > > > > > >>> 4 - multi stream media format - what the > > streams > > > mean with > > > > respect > > > > > to > > > > > > >> each other, regardless of the actual > > content on the > > > > streams. For > > > > > > >> audio, examples are stereo, 5.1 surround, > > binaural, > > > linear > > > > array. > > > > > > >> (linear array is described in the clue > > framework > > > document). > > > > > Perhaps 3D > > > > > > >> video formats would also fit in this > > category. 
> > > This > > > > information is > > > > > > >> needed in order to properly render the > > media into > > > light and > > > > sound > > > > > for > > > > > > >> human observers. I see this at the same > > level as > > > > identifying a > > > > > codec, > > > > > > >> independent of the audio or video content > > carried > > > on the > > > > streams, > > > > > and > > > > > > >> independent of how any composition of > > sources is > > > done. > > > > > > > > > > > > > > > I do not think this is necessarily true. Taking > > audio as > > > an > > > > example, you > > > > > could have two audio streams that are mixed to > > form a > > > single > > > > stereo > > > > > audio stream, or you could have them as two > > independent > > > (not > > > > mixed) > > > > > streams that are associate with each other by > > some > > > grouping > > > > mechanism. > > > > > This group would be categorized as being stereo > > audio > > > with one > > > > audio > > > > > stream being the left and the other the right. > > The codec > > > used > > > > for each > > > > > could be different, though I agree they would > > typically > > > be the > > > > same. > > > > > Consequently, I think at attribute such as > > "stereo" as > > > being > > > > more of a > > > > > grouping concept, where the group may consist > > of: > > > > > - multiple independent streams, each with > > potentially > > > its own > > > > spatial > > > > > orientation, codec, bandwidth, etc., > > > > > - a single mixed stream > > > > > > > > > > > > > > > > > > > > [sb] I do not understand this distinction. What do > > you mean > > > when you > > > > say "two audio streams that are > > > > > mixed to form a single stereo stream", and how is this > > > different from > > > > the left and right grouping? > > > > > > > > > > > > In one case they are mixed by the source of the stream > > into a > > > single > > > > stream, and in another they are sent as two separate > > streams by > > > the > > > > source. 
The end result once rendered at the receiver may > > be the > > > same, > > > > but what is sent is different. This example with audio > > is > > > perhaps too > > > > simple. If you think of it as video that is composed > > into a > > > single video > > > > stream vs. multiple via streams that are sent > > individually, the > > > > difference may be more clear. > > > > > > > > Cheers, > > > > Charles > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > Charles > > > > > > > > > > > > > > > > >> I was with you all the way until 4. That > > one I > > > don't > > > > understand. > > > > > > >> The name you chose for this has > > connotations for > > > me, but > > > > isn't > > > > > fully in > > > > > > >> harmony with the definitions you give: > > > > > > > > > > > > > > I'm happy to change the name if you have a > > > suggestion > > > > > > > > > > > > Not yet. Maybe once the concepts are more > > clearly > > > defined I > > > > will have > > > > > an > > > > > > opinion. > > > > > > > > > > > > >> If we consider audio, it makes sense that > > multiple > > > streams > > > > can be > > > > > > >> rendered as if they came from different > > physical > > > locations > > > > in the > > > > > > >> receiving room. That can be done by the > > receiver if > > > it gets > > > > those > > > > > > >> streams separately, and has information > > about their > > > > intended > > > > > > >> relationships. It can also be done by the > > sender or > > > MCU and > > > > passed > > > > > on > > > > > > >> to > > > > > > >> the receiver as a single stream with stereo > > or > > > binaural > > > > coding. > > > > > > > > > > > > > > Yes. It could also be done by the sender > > using the > > > "linear > > > > array" > > > > > audio channel format. 
Maybe it > > > > > > is true that stereo or binaural audio channels > > would > > > always be > > > > sent as > > > > > a single stream, but I was not > > > > > > assuming that yet, at least not in general > > when you > > > consider > > > > other > > > > > types too, such as linear array > > > > > > channels. > > > > > > > > > > > > >> So it seems to me you have two concepts > > here, not > > > one. One > > > > has to > > > > > do > > > > > > >> with describing the relationships between > > streams, > > > and the > > > > other > > > > > has to > > > > > > >> do with the encoding of spacial > > relationships > > > *within* a > > > > single > > > > > stream. > > > > > > > > > > > > > > Maybe that is a better way to describe it, > > if you > > > assume > > > > > multi-channel audio is always sent with all > > > > > > the channels in the same RTP stream. Is that > > what you > > > mean? > > > > > > > > > > > > > > I was considering the linear array format to > > be > > > another type > > > > of > > > > > multi-channel audio, and I know > > > > > > people want to be able to send each channel in > > a > > > separate RTP > > > > stream. > > > > > So it doesn't quite fit with > > > > > > how you separate the two concepts. In my > > view, > > > identifying > > > > the > > > > > separate channels by what they mean is > > > > > > the same concept for linear array and stereo. > > For > > > example > > > > "this > > > > > channel is left, this channel is > > > > > > center, this channel is right". To me, that > > is the > > > same > > > > concept for > > > > > identifying channels whether or > > > > > > not they are carried in the same RTP stream. > > > > > > > > > > > > > > Maybe we are thinking the same thing but > > getting > > > confused by > > > > > terminology about channels vs. streams. > > > > > > > > > > > > Maybe. Let me try to restate what I now think > > you are > > > saying: > > > > > > > > > > > > The audio may consist of several "channels". 
> > > > > > > > > > > > Each channel may be sent over its own RTP > > stream, > > > > > > or multiple channels may be multiplexed over > > an RTP > > > stream. > > > > > > > > > > > > I guess much of this can also apply to video. > > > > > > > > > > > > When there are exactly two audio channels, > > they may be > > > encoded > > > > as > > > > > > "stereo" or "binaural", which then affects how > > they > > > should be > > > > rendered > > > > > > by the recipient. In these cases the primary > > info that > > > is > > > > required > > > > > about > > > > > > the individual channels is which is left and > > which is > > > right. > > > > (And > > > > > which > > > > > > perspective to use in interpretting left and > > right.) > > > > > > > > > > > > For other multi-channel cases more information > > is > > > required > > > > about the > > > > > > role of each channel in order to properly > > render them. > > > > > > > > > > > > Thanks, > > > > > > Paul > > > > > > > > > > > > > > > > > > >> Or, are you asserting that stereo and > > binaural are > > > simply > > > > ways to > > > > > > >> encode > > > > > > >> multiple logical streams in one RTP stream, > > > together with > > > > their > > > > > spacial > > > > > > >> relationships? > > > > > > > > > > > > > > No, that is not what I'm trying to say. 
> > > > > > > Mark
> >
> > _______________________________________________
> > clue mailing list
> > clue@ietf.org
> > https://www.ietf.org/mailman/listinfo/clue

--bcaec54861941e6cc804aaa98716
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

[HTML alternative part omitted; duplicate of the plain-text body above]
> streams
> > =A0 =A0 mean with
> > =A0 =A0 > =A0 =A0 =A0 respect
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 to
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> each othe= r, regardless of the actual
> content on the
> > =A0 =A0 > =A0 =A0 =A0 streams. =A0For
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> audio, ex= amples are stereo, 5.1 surround,
> binaural,
> > =A0 =A0 linear
> > =A0 =A0 > =A0 =A0 =A0 array.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> (linear a= rray is described in the clue
> framework
> > =A0 =A0 document).
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Perhaps 3D
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> video for= mats would also fit in this
> category.
> > =A0 =A0 This
> > =A0 =A0 > =A0 =A0 =A0 information is
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> needed in= order to properly render the
> media into
> > =A0 =A0 light and
> > =A0 =A0 > =A0 =A0 =A0 sound
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 for
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> human obs= ervers. =A0I see this at the same
> level as
> > =A0 =A0 > =A0 =A0 =A0 identifying a
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 codec,
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> independe= nt of the audio or video content
> carried
> > =A0 =A0 on the
> > =A0 =A0 > =A0 =A0 =A0 streams,
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 and
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> independe= nt of how any composition of
> sources is
> > =A0 =A0 done.
> > =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 I do not think this is = necessarily true. Taking
> audio as
> > =A0 =A0 an
> > =A0 =A0 > =A0 =A0 =A0 example, you
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 could have two audio st= reams that are mixed to
> form a
> > =A0 =A0 single
> > =A0 =A0 > =A0 =A0 =A0 stereo
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 audio stream, or you co= uld have them as two
> independent
> > =A0 =A0 (not
> > =A0 =A0 > =A0 =A0 =A0 mixed)
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 streams that are associ= ate with each other by
> some
> > =A0 =A0 grouping
> > =A0 =A0 > =A0 =A0 =A0 mechanism.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 This group would be cat= egorized as being stereo
> audio
> > =A0 =A0 with one
> > =A0 =A0 > =A0 =A0 =A0 audio
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 stream being the left a= nd the other the right.
> The codec
> > =A0 =A0 used
> > =A0 =A0 > =A0 =A0 =A0 for each
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 could be different, tho= ugh I agree they would
> typically
> > =A0 =A0 be the
> > =A0 =A0 > =A0 =A0 =A0 same.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Consequently, I think a= t attribute such as
> "stereo" as
> > =A0 =A0 being
> > =A0 =A0 > =A0 =A0 =A0 more of a
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 grouping concept, where= the group may consist
> of:
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 - multiple independent = streams, each with
> potentially
> > =A0 =A0 its own
> > =A0 =A0 > =A0 =A0 =A0 spatial
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 orientation, codec, ban= dwidth, etc.,
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 - a single mixed stream=
> > =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > [sb] I do not understand this disti= nction. =A0What do
> you mean
> > =A0 =A0 when you
> > =A0 =A0 > =A0 =A0 =A0 say "two audio streams that are
> > =A0 =A0 > =A0 =A0 =A0 > mixed to form a single stereo strea= m", and how is this
> > =A0 =A0 different from
> > =A0 =A0 > =A0 =A0 =A0 the left and right grouping?
> > =A0 =A0 >
> > =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 In one case they are mixed by the source= of the stream
> into a
> > =A0 =A0 single
> > =A0 =A0 > =A0 =A0 =A0 stream, and in another they are sent as = two separate
> streams by
> > =A0 =A0 the
> > =A0 =A0 > =A0 =A0 =A0 source. The end result once rendered at = the receiver may
> be the
> > =A0 =A0 same,
> > =A0 =A0 > =A0 =A0 =A0 but what is sent is different. This exam= ple with audio
> is
> > =A0 =A0 perhaps too
> > =A0 =A0 > =A0 =A0 =A0 simple. If you think of it as video that= is composed
> into a
> > =A0 =A0 single video
> > =A0 =A0 > =A0 =A0 =A0 stream vs. multiple via streams that are= sent
> individually, the
> > =A0 =A0 > =A0 =A0 =A0 difference may be more clear.
> > =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 Cheers,
> > =A0 =A0 > =A0 =A0 =A0 Charles
> > =A0 =A0 >
> > =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Cheers,
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Charles
> > =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> I was wit= h you all the way until 4. That
> one I
> > =A0 =A0 don't
> > =A0 =A0 > =A0 =A0 =A0 understand.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> The name = you chose for this has
> connotations for
> > =A0 =A0 me, but
> > =A0 =A0 > =A0 =A0 =A0 isn't
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 fully in
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> harmony w= ith the definitions you give:
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > > I'm happy= to change the name if you have a
> > =A0 =A0 suggestion
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > Not yet. Maybe onc= e the concepts are more
> clearly
> > =A0 =A0 defined I
> > =A0 =A0 > =A0 =A0 =A0 will have
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 an
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > opinion.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> If we con= sider audio, it makes sense that
> multiple
> > =A0 =A0 streams
> > =A0 =A0 > =A0 =A0 =A0 can be
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> rendered = as if they came from different
> physical
> > =A0 =A0 locations
> > =A0 =A0 > =A0 =A0 =A0 in the
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> receiving= room. That can be done by the
> receiver if
> > =A0 =A0 it gets
> > =A0 =A0 > =A0 =A0 =A0 those
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> streams s= eparately, and has information
> about their
> > =A0 =A0 > =A0 =A0 =A0 intended
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> relations= hips. It can also be done by the
> sender or
> > =A0 =A0 MCU and
> > =A0 =A0 > =A0 =A0 =A0 passed
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 on
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> to
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> the recei= ver as a single stream with stereo
> or
> > =A0 =A0 binaural
> > =A0 =A0 > =A0 =A0 =A0 coding.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > > Yes. =A0It co= uld also be done by the sender
> using the
> > =A0 =A0 "linear
> > =A0 =A0 > =A0 =A0 =A0 array"
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 audio channel format. = =A0Maybe it
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > is true that stere= o or binaural audio channels
> would
> > =A0 =A0 always be
> > =A0 =A0 > =A0 =A0 =A0 sent as
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 a single stream, but I = was not
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > assuming that yet,= at least not in general
> when you
> > =A0 =A0 consider
> > =A0 =A0 > =A0 =A0 =A0 other
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 types too, such as line= ar array
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > channels.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> So it see= ms to me you have two concepts
> here, not
> > =A0 =A0 one. One
> > =A0 =A0 > =A0 =A0 =A0 has to
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 do
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> with desc= ribing the relationships between
> streams,
> > =A0 =A0 and the
> > =A0 =A0 > =A0 =A0 =A0 other
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 has to
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> do with t= he encoding of spacial
> relationships
> > =A0 =A0 *within* a
> > =A0 =A0 > =A0 =A0 =A0 single
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 stream.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > > Maybe that is= a better way to describe it,
> if you
> > =A0 =A0 assume
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 multi-channel audio is = always sent with all
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > the channels in th= e same RTP stream. =A0Is that
> what you
> > =A0 =A0 mean?
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > > I was conside= ring the linear array format to
> be
> > =A0 =A0 another type
> > =A0 =A0 > =A0 =A0 =A0 of
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 multi-channel audio, an= d I know
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > people want to be = able to send each channel in
> a
> > =A0 =A0 separate RTP
> > =A0 =A0 > =A0 =A0 =A0 stream.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 So it doesn't quite= fit with
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > how you separate t= he two concepts. =A0In my
> view,
> > =A0 =A0 identifying
> > =A0 =A0 > =A0 =A0 =A0 the
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 separate channels by wh= at they mean is
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > the same concept f= or linear array and stereo.
> For
> > =A0 =A0 example
> > =A0 =A0 > =A0 =A0 =A0 "this
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 channel is left, this c= hannel is
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > center, this chann= el is right". =A0To me, that
> is the
> > =A0 =A0 same
> > =A0 =A0 > =A0 =A0 =A0 concept for
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 identifying channels wh= ether or
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > not they are carri= ed in the same RTP stream.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > > Maybe we are = thinking the same thing but
> getting
> > =A0 =A0 confused by
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 terminology about chann= els vs. streams.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > Maybe. Let me try = to restate what I now think
> you are
> > =A0 =A0 saying:
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > The audio may cons= ist of several "channels".
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > Each channel may b= e sent over its own RTP
> stream,
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > or multiple channe= ls may be multiplexed over
> an RTP
> > =A0 =A0 stream.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > I guess much of th= is can also apply to video.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > When there are exa= ctly two audio channels,
> they may be
> > =A0 =A0 encoded
> > =A0 =A0 > =A0 =A0 =A0 as
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > "stereo"= or "binaural", which then affects how
> they
> > =A0 =A0 should be
> > =A0 =A0 > =A0 =A0 =A0 rendered
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > by the recipient. = In these cases the primary
> info that
> > =A0 =A0 is
> > =A0 =A0 > =A0 =A0 =A0 required
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 about
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > the individual cha= nnels is which is left and
> which is
> > =A0 =A0 right.
> > =A0 =A0 > =A0 =A0 =A0 (And
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 which
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > perspective to use= in interpretting left and
> right.)
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > For other multi-ch= annel cases more information
> is
> > =A0 =A0 required
> > =A0 =A0 > =A0 =A0 =A0 about the
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > role of each chann= el in order to properly
> render them.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Thanks= ,
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 Paul > > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> Or, are y= ou asserting that stereo and
> binaural are
> > =A0 =A0 simply
> > =A0 =A0 > =A0 =A0 =A0 ways to
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> encode > > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> multiple = logical streams in one RTP stream,
> > =A0 =A0 together with
> > =A0 =A0 > =A0 =A0 =A0 their
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 spacial
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >> relations= hips?
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > > No, that is n= ot what I'm trying to say.
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > > Mark
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >
> _______________________________________________
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > > clue mailing = list
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > > clue@ietf.org
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > > https://www.iet= f.org/mailman/listinfo/clue
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 >
> _______________________________________________
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > clue mailing list<= br> > > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > clue@ietf.org
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 > https://www.ietf.org= /mailman/listinfo/clue
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 _______________________= ________________________
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 clue mailing list
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 clue@ietf.org
> > =A0 =A0 > =A0 =A0 =A0 > =A0 =A0 =A0 https://www.ietf.org/mail= man/listinfo/clue
> > =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 > =A0 =A0 =A0 >
> > =A0 =A0 >
> > =A0 =A0 >
> > =A0 =A0 >
> >
> >
> >
>
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue
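Paul's restatement in the thread above - audio "channels" may travel one per RTP stream, or multiplexed into one stream - can be sketched concretely. The following is an illustrative sketch only; the function names and metadata fields are hypothetical and do not come from the CLUE framework or any RTP library. It shows the same two PCM channels carried either as one interleaved two-channel payload or as two per-channel payloads, with the channel roles ("left", "right") carried as metadata in both cases, so a receiver recovers identical audio either way:

```python
# Illustrative sketch (hypothetical helpers, not CLUE or RTP library code):
# the same two audio channels can travel as one interleaved payload or as
# two separate payloads; the left/right roles are metadata in both cases.

def pack_single_stream(left, right):
    """Case: one payload with L/R samples interleaved (left sample first,
    as RFC 3551 specifies for two-channel audio)."""
    assert len(left) == len(right)
    samples = []
    for l, r in zip(left, right):
        samples.extend([l, r])
    return {"channels": ("left", "right"), "samples": samples}

def pack_per_channel_streams(left, right):
    """Case: one payload per channel, tied together by a grouping label."""
    return [
        {"channel": "left", "samples": list(left)},
        {"channel": "right", "samples": list(right)},
    ]

def render(received):
    """Receiver view: recover (left, right) regardless of transport choice."""
    if isinstance(received, dict):  # single interleaved payload
        s = received["samples"]
        return s[0::2], s[1::2]
    by_role = {p["channel"]: p["samples"] for p in received}  # per channel
    return by_role["left"], by_role["right"]

left, right = [1, 2, 3], [9, 8, 7]
assert render(pack_single_stream(left, right)) == (left, right)
assert render(pack_per_channel_streams(left, right)) == (left, right)
```

Either packing renders the same, which is the point made in the thread: the split into streams is a transport question, while the channel-role information is needed in both arrangements.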


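The transport arrangements Charles distinguishes in this thread - (1) a single stereo RTP stream versus (2) two single-channel RTP streams - can be illustrated with SDP fragments. This is a hedged sketch using standard SDP/RTP conventions (RFC 4566 and RFC 3551); the ports and payload types are arbitrary examples, and nothing here is CLUE syntax:

```
m=audio 49170 RTP/AVP 96
a=rtpmap:96 L16/48000/2

m=audio 49172 RTP/AVP 97
a=rtpmap:97 L16/48000
m=audio 49174 RTP/AVP 97
a=rtpmap:97 L16/48000
```

The first m-line is case (1): a two-channel L16 stream, where RFC 3551 fixes the channel order as left then right. The last two m-lines are case (2): two single-channel streams, and plain SDP gives no standard way to say which is left, which is right, or that they belong together as one stereo capture - the gap the thread says CLUE must fill. Case (2b) would instead carry the two channels as two SSRCs within one RTP session.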

From Even.roni@huawei.com Tue Aug 16 21:54:08 2011
From: Roni Even <Even.roni@huawei.com>
To: 'Stephen Botzko' <stephen.botzko@gmail.com>
Cc: clue@ietf.org
Date: Wed, 17 Aug 2011 07:51:37 +0300
Subject: Re: [clue] continuing "layout" discussion

Hi Steve,

The two-channel case is a simple special case. Using it to define the
required information from the capture and render sides is like proving a
mathematical induction for n=2 and concluding that it holds for every n. I
see this issue because we are using n=2 audio channels and n=3 left-to-right
cameras as the examples from which to derive a solution that must scale to
any n.

What I was trying to say is that the current way we describe a stream by a
number is not enough if we want to achieve the "being there" experience.

We need to identify the dimensions the model needs in order to convey the
capture information and the rendering capabilities for both audio and
video. I think there are some similarities, since the basic information in
each case is the number of streams, how the capture is done, and what the
rendering device is.

I think that before going to the framework model it may be beneficial to
create a list of the parameters we need to convey, provide a term for each
group of parameters, and have a way to define them in the model.
For example, for the capture side we have the number of capture devices,
their spatial arrangement, the encoding process (including mixing, if there
are multiple inputs), the capture field, and others.

Regards,
Roni

From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
Sent: Wednesday, August 17, 2011 4:37 AM
To: Roni Even
Cc: Charles Eckel (eckelcu); clue@ietf.org
Subject: Re: [clue] continuing "layout" discussion

Hi Roni

For this particular discussion, all of the two-channel transmissions are
"stereo"; they are just transported differently. As far as the framework
draft is concerned, the various microphone arrangements are accounted for
by the signaling of the 1-100 indices for each channel.

Binaural is something else - either an HRTF function is applied to the two
channels prior to rendering (which was Christer's case with the central
rendering server), or you have a dummy head with microphones in the ears in
the telepresence room to make the capture. Not sure if we need to
distinguish the capture and render cases right now.

Regards,
Stephen

On Tue, Aug 16, 2011 at 7:34 PM, Roni Even wrote:

Hi guys,
In case 1, according to RFC 3551 (section 4.1), 2 channels in the rtpmap
means left and right channels, described as stereo. Are you saying that for
the 2 and 2b cases you also assume stereo capture, or can it be any other
way of creating the two audio streams from the same room (binaural
recording (not common), or some other arrangement of the microphones)? But
this talks about the capture side. I think that Christer talked about the
rendering side and not only about the capture side.
Roni

> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> Charles Eckel (eckelcu)
> Sent: Wednesday, August 17, 2011 12:40 AM
> To: Stephen Botzko
> Cc: clue@ietf.org
> Subject: Re: [clue] continuing "layout" discussion
>
> Agreed.
The difference I am trying to point out is that in (1), the > information you need to describe the audio stream for appropriate > rendering is already handled quite well by existing SIP/SDP/RTP and > most > implementations, whereas you need CLUE for (2) and (2b). > > Cheers, > Charles > > > -----Original Message----- > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com] > > Sent: Tuesday, August 16, 2011 2:14 PM > > To: Charles Eckel (eckelcu) > > Cc: Paul Kyzivat; clue@ietf.org > > Subject: Re: [clue] continuing "layout" discussion > > > > Well, the audio in (1) and (2b) is certainly packetized differently. > But not compressed differently > > (unless you are assuming that the signal in (1) is jointly encoded > stereo - which it could be I guess, > > but it would be unusual for telepresence systems). Also, the audio in > (1) is not mixed, no matter how > > it is encoded. > > > > In any event, I believe that the difference between (1) and (2) and > (2b) is really a transport > > question that has nothing to do with layout. The same information is > needed to enable proper > > rendering, and once the streams are received, they are rendered in > precisely the same way. > > > > Regards, > > Stephen Botzko > > > > > > On Tue, Aug 16, 2011 at 4:23 PM, Charles Eckel (eckelcu) > wrote: > > > > > > I am distinguishing between: > > > > (1) a single RTP stream that consists of a single stereo audio > stream > > (2) two RTP streams, one that contains left speaker audio and > the other > > than contains right speaker audio > > > > (2) could also be transmitted in a single RTP stream using SSRC > > multiplexing. Let me call that (2b). > > (2) and (2b) are essentially the same. Just the RTP mechanism > employed > > is difference. > > (1) is different from (2) and (2b) in that the audio signal > encoded is > > actually different. 
> > > > Cheers, > > Charles > > > > > > > -----Original Message----- > > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com] > > > > > Sent: Tuesday, August 16, 2011 6:20 AM > > > To: Charles Eckel (eckelcu) > > > Cc: Paul Kyzivat; clue@ietf.org > > > Subject: Re: [clue] continuing "layout" discussion > > > > > > I guess by "stream" you are meaning RTP stream? in which case > by > > "mix" you perhaps mean that the left > > > and right channels are placed in a single RTP stream??? What > do you > > mean when you describe some audio > > > captures as "independent" - are you thinking they come from > different > > rooms???. > > > > > > I think in many respects audio distribution and spatial audio > layout > > is at least as difficult as video > > > layout, and have some unique issues. For one thing, you need > to sort > > out how you should place the > > > audio from human participants who are not on camera, and what > should > > happen later on if some of those > > > participants are shown. > > > > > > I suggest it is necessary to be very careful with terminology. > In > > particular, I think it is important > > > to distinguish composition from RTP transmission. > > > > > > Regards, > > > Stephen Botzko > > > > > > > > > > > > On Mon, Aug 15, 2011 at 5:45 PM, Charles Eckel (eckelcu) > > wrote: > > > > > > > > > > -----Original Message----- > > > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com] > > > > Sent: Monday, August 15, 2011 2:14 PM > > > > To: Charles Eckel (eckelcu) > > > > Cc: Paul Kyzivat; clue@ietf.org > > > > Subject: Re: [clue] continuing "layout" discussion > > > > > > > > Inline > > > > > > > > > > > > On Mon, Aug 15, 2011 at 4:21 PM, Charles Eckel > (eckelcu) > > > wrote: > > > > > > > > > > > > Please see inline. 
> > > > > > > > > > > > > -----Original Message----- > > > > > From: clue-bounces@ietf.org > > [mailto:clue-bounces@ietf.org] On > > > Behalf > > > > Of Paul Kyzivat > > > > > > > > > Sent: Thursday, August 11, 2011 6:02 AM > > > > > > > > > To: clue@ietf.org > > > > > Subject: Re: [clue] continuing "layout" > discussion > > > > > > > > > > Inline > > > > > > > > > > On 8/10/11 5:49 PM, Duckworth, Mark wrote: > > > > > >> -----Original Message----- > > > > > >> From: clue-bounces@ietf.org > > [mailto:clue-bounces@ietf.org] > > > On > > > > Behalf Of > > > > > >> Paul Kyzivat > > > > > >> Sent: Tuesday, August 09, 2011 9:03 AM > > > > > >> To: clue@ietf.org > > > > > >> Subject: Re: [clue] continuing "layout" > discussion > > > > > > > > > > > >>> 4 - multi stream media format - what the > streams > > mean with > > > respect > > > > to > > > > > >> each other, regardless of the actual > content on the > > > streams. For > > > > > >> audio, examples are stereo, 5.1 surround, > binaural, > > linear > > > array. > > > > > >> (linear array is described in the clue > framework > > document). > > > > Perhaps 3D > > > > > >> video formats would also fit in this > category. > > This > > > information is > > > > > >> needed in order to properly render the > media into > > light and > > > sound > > > > for > > > > > >> human observers. I see this at the same > level as > > > identifying a > > > > codec, > > > > > >> independent of the audio or video content > carried > > on the > > > streams, > > > > and > > > > > >> independent of how any composition of > sources is > > done. > > > > > > > > > > > > I do not think this is necessarily true. Taking > audio as > > an > > > example, you > > > > could have two audio streams that are mixed to > form a > > single > > > stereo > > > > audio stream, or you could have them as two > independent > > (not > > > mixed) > > > > streams that are associate with each other by > some > > grouping > > > mechanism. 
> > > > This group would be categorized as being stereo > audio > > with one > > > audio > > > > stream being the left and the other the right. > The codec > > used > > > for each > > > > could be different, though I agree they would > typically > > be the > > > same. > > > > Consequently, I think at attribute such as > "stereo" as > > being > > > more of a > > > > grouping concept, where the group may consist > of: > > > > - multiple independent streams, each with > potentially > > its own > > > spatial > > > > orientation, codec, bandwidth, etc., > > > > - a single mixed stream > > > > > > > > > > > > > > > > [sb] I do not understand this distinction. What do > you mean > > when you > > > say "two audio streams that are > > > > mixed to form a single stereo stream", and how is this > > different from > > > the left and right grouping? > > > > > > > > > In one case they are mixed by the source of the stream > into a > > single > > > stream, and in another they are sent as two separate > streams by > > the > > > source. The end result once rendered at the receiver may > be the > > same, > > > but what is sent is different. This example with audio > is > > perhaps too > > > simple. If you think of it as video that is composed > into a > > single video > > > stream vs. multiple via streams that are sent > individually, the > > > difference may be more clear. > > > > > > Cheers, > > > Charles > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > Charles > > > > > > > > > > > > > >> I was with you all the way until 4. That > one I > > don't > > > understand. > > > > > >> The name you chose for this has > connotations for > > me, but > > > isn't > > > > fully in > > > > > >> harmony with the definitions you give: > > > > > > > > > > > > I'm happy to change the name if you have a > > suggestion > > > > > > > > > > Not yet. Maybe once the concepts are more > clearly > > defined I > > > will have > > > > an > > > > > opinion. 
> > > > > > > > > > >> If we consider audio, it makes sense that > multiple > > streams > > > can be > > > > > >> rendered as if they came from different > physical > > locations > > > in the > > > > > >> receiving room. That can be done by the > receiver if > > it gets > > > those > > > > > >> streams separately, and has information > about their > > > intended > > > > > >> relationships. It can also be done by the > sender or > > MCU and > > > passed > > > > on > > > > > >> to > > > > > >> the receiver as a single stream with stereo > or > > binaural > > > coding. > > > > > > > > > > > > Yes. It could also be done by the sender > using the > > "linear > > > array" > > > > audio channel format. Maybe it > > > > > is true that stereo or binaural audio channels > would > > always be > > > sent as > > > > a single stream, but I was not > > > > > assuming that yet, at least not in general > when you > > consider > > > other > > > > types too, such as linear array > > > > > channels. > > > > > > > > > > >> So it seems to me you have two concepts > here, not > > one. One > > > has to > > > > do > > > > > >> with describing the relationships between > streams, > > and the > > > other > > > > has to > > > > > >> do with the encoding of spacial > relationships > > *within* a > > > single > > > > stream. > > > > > > > > > > > > Maybe that is a better way to describe it, > if you > > assume > > > > multi-channel audio is always sent with all > > > > > the channels in the same RTP stream. Is that > what you > > mean? > > > > > > > > > > > > I was considering the linear array format to > be > > another type > > > of > > > > multi-channel audio, and I know > > > > > people want to be able to send each channel in > a > > separate RTP > > > stream. > > > > So it doesn't quite fit with > > > > > how you separate the two concepts. In my > view, > > identifying > > > the > > > > separate channels by what they mean is > > > > > the same concept for linear array and stereo. 
> For > > example > > > "this > > > > channel is left, this channel is > > > > > center, this channel is right". To me, that > is the > > same > > > concept for > > > > identifying channels whether or > > > > > not they are carried in the same RTP stream. > > > > > > > > > > > > Maybe we are thinking the same thing but > getting > > confused by > > > > terminology about channels vs. streams. > > > > > > > > > > Maybe. Let me try to restate what I now think > you are > > saying: > > > > > > > > > > The audio may consist of several "channels". > > > > > > > > > > Each channel may be sent over its own RTP > stream, > > > > > or multiple channels may be multiplexed over > an RTP > > stream. > > > > > > > > > > I guess much of this can also apply to video. > > > > > > > > > > When there are exactly two audio channels, > they may be > > encoded > > > as > > > > > "stereo" or "binaural", which then affects how > they > > should be > > > rendered > > > > > by the recipient. In these cases the primary > info that > > is > > > required > > > > about > > > > > the individual channels is which is left and > which is > > right. > > > (And > > > > which > > > > > perspective to use in interpretting left and > right.) > > > > > > > > > > For other multi-channel cases more information > is > > required > > > about the > > > > > role of each channel in order to properly > render them. > > > > > > > > > > Thanks, > > > > > Paul > > > > > > > > > > > > > > > >> Or, are you asserting that stereo and > binaural are > > simply > > > ways to > > > > > >> encode > > > > > >> multiple logical streams in one RTP stream, > > together with > > > their > > > > spacial > > > > > >> relationships? > > > > > > > > > > > > No, that is not what I'm trying to say. 
> > > > > > Mark
>
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue

Hi Steve,

The two-channel case is a simple special case, and using it to define the required information from the capture and render sides is like saying that I proved a mathematical induction for n=2, therefore it works for every n. I see this issue since we are using n=2 audio channels and n=3 left-to-right cameras as the examples from which to derive a solution that must scale to any n.

 

What I was trying to say is that the current way we describe a stream by a number is not enough if we want to get to the "being there" experience.

We need to identify the dimensions the model needs in order to convey the capture information and the rendering capabilities for both audio and video. I think there are some similarities, since in both cases the basic information is the number of streams, how the capture is done, and what the rendering device is.

I think that before going to the framework model it may be beneficial to create a list of the parameters we need to convey, provide a term for each group of parameters, and have a way to define them in the model. For example, for the capture side we have the number of capture devices, the spatial arrangement, the encoding process (including mixing if there are multiple inputs), the capture field, and others.
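As a sketch of how such a parameter list might be grouped, consider the following. The field names here are hypothetical illustrations, not terms defined in the CLUE framework draft:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CaptureDescription:
    """One possible grouping of capture-side parameters.

    All attribute names are illustrative only; none of them are
    defined terms from the CLUE framework.
    """
    num_capture_devices: int     # e.g. number of microphones or cameras
    spatial_arrangement: str     # e.g. "linear array", "left-to-right"
    encoding_process: str        # codec, including any mixing of inputs
    capture_field: str           # the area of the room the capture covers
    other: List[str] = field(default_factory=list)

# Example: the three-camera, left-to-right case discussed in this thread
video = CaptureDescription(
    num_capture_devices=3,
    spatial_arrangement="left-to-right",
    encoding_process="one encoded stream per camera",
    capture_field="full room width",
)
print(video.num_capture_devices)  # → 3
```

Grouping the parameters this way makes the later question concrete: which of these fields does a receiver actually need in order to render, and which (as Stephen argues below for the device count) can be omitted?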

 

Regards

Roni

 

From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
Sent: Wednesday, August 17, 2011 4:37 AM
To: Roni Even
Cc: Charles Eckel (eckelcu); clue@ietf.org
Subject: Re: [clue] continuing "layout" discussion

 

Hi Roni

For this particular discussion, all of the two channel transmissions are "stereo"; they are just transported differently.

As far as the framework draft is concerned, the various microphone arrangements are accounted for by the signaling of the 1-100 indices for each channel.

Binaural is something else: either an HRTF function is applied to the two channels prior to rendering (which was Christer's case with the central rendering server), or you have a dummy head with microphones in the ears in the telepresence room to make the capture. Not sure if we need to distinguish the capture and render cases right now.
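The first case Stephen mentions (an HRTF applied to the channels before rendering) comes down to convolving each source signal with a per-ear head-related impulse response. A minimal sketch, with made-up toy impulse responses purely for illustration:

```python
def convolve(x, h):
    """Direct-form FIR convolution (pure Python, for illustration)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def binauralize(mono, hrir_left, hrir_right):
    """Apply per-ear head-related impulse responses to a mono capture,
    yielding a (left, right) binaural pair.  Real HRIRs are measured
    per source direction; these toy ones just show the mechanics."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy HRIRs: direct path to the left ear, delayed/attenuated to the right
left, right = binauralize([1.0, 0.5], [1.0, 0.0], [0.0, 0.6])
print(left)   # [1.0, 0.5, 0.0]
print(right)  # [0.0, 0.6, 0.3]
```

Whether this convolution happens in a central rendering server or at the endpoint is exactly the capture-versus-render distinction Stephen is leaving open.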

Regards,
Stephen

On Tue, Aug 16, 2011 at 7:34 PM, Roni Even <Even.roni@huawei.com> wrote:

Hi guys,
In case 1, according to RFC 3551 (section 4.1), 2 channels in the rtpmap means
left and right channels, described as stereo. Are you saying that for the 2
and 2b cases you also assume stereo capture, or can the two audio streams
from the same room be created in some other way (binaural recording (not
common), or some other arrangement of the microphones)? But this talks about
the capture side.

I think that Christer talked about the rendering side and not only about the
capture side.

Roni
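The contrast between the cases Roni cites can be written out in SDP. A hedged sketch (the payload types and ports are arbitrary examples, not taken from any CLUE draft): case 1 is a single "m=audio" line whose rtpmap declares two channels, which RFC 3551 section 4.1 orders as left then right:

```
m=audio 49170 RTP/AVP 96
a=rtpmap:96 L16/48000/2
```

Case 2 is two single-channel streams, one per m-line; which one is left, which is right, and the fact that they belong together is exactly the association that plain SDP does not convey and that CLUE would have to signal:

```
m=audio 49172 RTP/AVP 97
a=rtpmap:97 L16/48000
m=audio 49174 RTP/AVP 97
a=rtpmap:97 L16/48000
```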


> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> Charles Eckel (eckelcu)
> Sent: Wednesday, August 17, 2011 12:40 AM
> To: Stephen Botzko
> Cc: clue@ietf.org
> Subject: Re: [clue] continuing "layout" discussion
>
> Agreed. The difference I am trying to point out is that in (1), the
> information you need to describe the audio stream for appropriate
> rendering is already handled quite well by existing SIP/SDP/RTP and
> most implementations, whereas you need CLUE for (2) and (2b).
>
> Cheers,
> Charles
>
> > -----Original Message-----
> > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > Sent: Tuesday, August 16, 2011 2:14 PM
> > To: Charles Eckel (eckelcu)
> > Cc: Paul Kyzivat; clue@ietf.org
> > Subject: Re: [clue] continuing "layout" discussion
> >
> > Well, the audio in (1) and (2b) is certainly packetized differently.
> But not compressed differently
> > (unless you are assuming that the signal in (1) is jointly encoded
> stereo - which it could be I guess,
> > but it would be unusual for telepresence systems). Also, the audio in
> (1) is not mixed, no matter how
> > it is encoded.
> >
> > In any event, I believe that the difference between (1) and (2) and
> (2b) is really a transport
> > question that has nothing to do with layout. The same information is
> needed to enable proper
> > rendering, and once the streams are received, they are rendered in
> precisely the same way.
> >
> = > Regards,
> > Stephen Botzko
> >
> = >
> > On Tue, Aug 16, 2011 at 4:23 PM, Charles Eckel (eckelcu)
> <eckelcu@cisco.com> wrote:
> >
> >
> >     I am distinguishing between:
> >
> >     (1) a single RTP stream that consists of a single stereo audio
> stream
> >     (2) two RTP streams, one that contains left speaker audio and
> the other
> >     that contains right speaker audio
> >
> >     (2) could also be transmitted in a single RTP stream using SSRC
> >     multiplexing. Let me call that (2b).
> >     (2) and (2b) are essentially the same. Just the RTP mechanism
> employed
> >     is different.
> >     (1) is different from (2) and (2b) in that the audio signal
> encoded is
> >     actually different.
> >
> >     Cheers,
> >     Charles
> >
> = >
> >     > -----Original Message-----
> = >     > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]> >
> >     > Sent: Tuesday, August 16, = 2011 6:20 AM
> >     > To: Charles Eckel = (eckelcu)
> >     > Cc: Paul Kyzivat; clue@ietf.org
> >   =   > Subject: Re: [clue] continuing "layout" = discussion
> >     >
> >     = > I guess by "stream" you are meaning RTP stream?  in = which case
> by
> >     "mix" you = perhaps mean that the left
> >     > and right = channels are placed in a single RTP stream???  What
> do = you
> >     mean when you describe some audio
> = >     > captures as "independent" - are you = thinking they come from
> different
> >     = rooms???.
> >     >
> >     = > I think in many respects audio distribution and spatial = audio
> layout
> >     is at least as difficult = as video
> >     > layout, and have some unique = issues.  For one thing, you need
> to sort
> > =     out how you should place the
> >     = > audio from human participants who are not on camera, and = what
> should
> >     happen later on if some = of those
> >     > participants are shown.
> = >     >
> >     > I suggest it is = necessary to be very careful with terminology.
> In
> > =     particular, I think it is important
> >   =   > to distinguish composition from RTP transmission.
> = >     >
> >     > Regards,
> = >     > Stephen Botzko
> >     = >
> >     >
> >     = >
> >     > On Mon, Aug 15, 2011 at 5:45 PM, = Charles Eckel (eckelcu)
> >     <eckelcu@cisco.com> = wrote:
> >     >
> >     = >
> >     >       > = -----Original Message-----
> >     >       > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> >     >       > Sent: Monday, August 15, 2011 2:14 PM
> >     >     =   > To: Charles Eckel (eckelcu)
> >     > =       > Cc: Paul Kyzivat; clue@ietf.org
> >   =   >       > Subject: Re: [clue] continuing = "layout" discussion
> >     >   =     >
> >     >       = > Inline
> >     >       = >
> >     >       >
> = >     >       > On Mon, Aug 15, 2011 = at 4:21 PM, Charles Eckel
> (eckelcu)
> >     = >       <eckelcu@cisco.com> = wrote:
> >     >       >
> = >     >       >
> >   =   >       >       Please see = inline.
> >     >       = >
> >     >       >
> = >     >       >       = > -----Original Message-----
> >     >   =     >       > From: clue-bounces@ietf.org
> = >     [mailto:clue-bounces@ietf.org] = On
> >     >       Behalf
> = >     >       >       = Of Paul Kyzivat
> >     >       = >
> >     >       >   =     > Sent: Thursday, August 11, 2011 6:02 AM
> > =     >       >
> >     = >       >       > To: clue@ietf.org
> >   =   >       >       > Subject: = Re: [clue] continuing "layout"
> discussion
> > =     >       >       = >
> >     >       >   =     > Inline
> >     >     =   >       >
> >     > =       >       > On 8/10/11 5:49 PM, = Duckworth, Mark wrote:
> >     >     =   >       > >> -----Original = Message-----
> >     >       > =       > >> From: clue-bounces@ietf.org
> = >     [mailto:clue-bounces@ietf.org]
> = >     >       On
> >   =   >       >       Behalf = Of
> >     >       >   =     > >> Paul Kyzivat
> >     > =       >       > >> Sent: = Tuesday, August 09, 2011 9:03 AM
> >     >   =     >       > >> To: clue@ietf.org
> >   =   >       >       > >> = Subject: Re: [clue] continuing "layout"
> = discussion
> >     >       > =       > >
> >     >   =     >       > >>> 4 - multi = stream media format - what the
> streams
> >   =   mean with
> >     >       = respect
> >     >       >   =     to
> >     >       = >       > >> each other, regardless of the = actual
> content on the
> >     >   =     streams.  For
> >     >   =     >       > >> audio, examples = are stereo, 5.1 surround,
> binaural,
> >     = linear
> >     >       = array.
> >     >       >   =     > >> (linear array is described in the = clue
> framework
> >     document).
> = >     >       >       = Perhaps 3D
> >     >       > =       > >> video formats would also fit in = this
> category.
> >     This
> > =     >       information is
> > =     >       >       > = >> needed in order to properly render the
> media = into
> >     light and
> >     = >       sound
> >     >   =     >       for
> >     = >       >       > >> human = observers.  I see this at the same
> level as
> > =     >       identifying a
> > =     >       >       = codec,
> >     >       >   =     > >> independent of the audio or video = content
> carried
> >     on the
> > =     >       streams,
> >   =   >       >       and
> = >     >       >       = > >> independent of how any composition of
> sources = is
> >     done.
> >     > =       >
> >     >     =   >
> >     >       > =       I do not think this is necessarily true. = Taking
> audio as
> >     an
> > =     >       example, you
> >   =   >       >       could have = two audio streams that are mixed to
> form a
> >   =   single
> >     >       = stereo
> >     >       >   =     audio stream, or you could have them as two
> = independent
> >     (not
> >     = >       mixed)
> >     >   =     >       streams that are associate with = each other by
> some
> >     grouping
> = >     >       mechanism.
> > =     >       >       This = group would be categorized as being stereo
> audio
> > =     with one
> >     >     =   audio
> >     >       > =       stream being the left and the other the = right.
> The codec
> >     used
> > =     >       for each
> >   =   >       >       could be = different, though I agree they would
> typically
> > =     be the
> >     >     =   same.
> >     >       >       Consequently, I think an attribute such as
> = "stereo" as
> >     being
> > =     >       more of a
> >   =   >       >       grouping = concept, where the group may consist
> of:
> >   =   >       >       - multiple = independent streams, each with
> potentially
> >   =   its own
> >     >       = spatial
> >     >       >   =     orientation, codec, bandwidth, etc.,
> >   =   >       >       - a single = mixed stream
> >     >       = >
> >     >       >
> = >     >       >
> >   =   >       > [sb] I do not understand this = distinction.  What do
> you mean
> >     = when you
> >     >       say = "two audio streams that are
> >     >   =     > mixed to form a single stereo stream", and how = is this
> >     different from
> >   =   >       the left and right grouping?
> = >     >
> >     >
> >     >       In one case they are mixed by the source of the stream
> into a
> >     single
> >     >       stream, and in another they are sent as two separate
> streams by
> >     the
> >     >       source. The end result once rendered at the receiver may
> be the
> >     same,
> >     >       but what is sent is different. This example with audio
> is
> >     perhaps too
> >     >       simple. If you think of it as video that is composed
> into a
> >     single video
> >     >       stream vs. multiple video streams that are sent
> individually, the
> >     >       difference may be more clear.
> >     >
> >     > =       Cheers,
> >     >   =     Charles
> >     >
> > =     >
> >     >       = >
> >     >       >
> = >     >       >
> >   =   >       >       = Cheers,
> >     >       >   =     Charles
> >     >     =   >
> >     >       = >
> >     >       >   =     > >> I was with you all the way until 4. = That
> one I
> >     don't
> >   =   >       understand.
> >     = >       >       > >> The = name you chose for this has
> connotations for
> >   =   me, but
> >     >       = isn't
> >     >       >   =     fully in
> >     >     =   >       > >> harmony with the = definitions you give:
> >     >     =   >       > >
> >     = >       >       > > I'm happy = to change the name if you have a
> >     = suggestion
> >     >       > =       >
> >     >     =   >       > Not yet. Maybe once the concepts = are more
> clearly
> >     defined I
> = >     >       will have
> > =     >       >       = an
> >     >       >   =     > opinion.
> >     >   =     >       >
> >     = >       >       > >> If we = consider audio, it makes sense that
> multiple
> >   =   streams
> >     >       can = be
> >     >       >   =     > >> rendered as if they came from = different
> physical
> >     locations
> = >     >       in the
> >   =   >       >       > >> = receiving room. That can be done by the
> receiver if
> > =     it gets
> >     >     =   those
> >     >       > =       > >> streams separately, and has = information
> about their
> >     >   =     intended
> >     >     =   >       > >> relationships. It can = also be done by the
> sender or
> >     MCU = and
> >     >       passed
> = >     >       >       = on
> >     >       >   =     > >> to
> >     >   =     >       > >> the receiver as a = single stream with stereo
> or
> >     = binaural
> >     >       = coding.
> >     >       >   =     > >
> >     >     =   >       > > Yes.  It could also be = done by the sender
> using the
> >     = "linear
> >     >       = array"
> >     >       > =       audio channel format.  Maybe it
> > =     >       >       > = is true that stereo or binaural audio channels
> would
> = >     always be
> >     >   =     sent as
> >     >     =   >       a single stream, but I was not
> = >     >       >       = > assuming that yet, at least not in general
> when you
> = >     consider
> >     >   =     other
> >     >       = >       types too, such as linear array
> > =     >       >       > = channels.
> >     >       > =       >
> >     >     =   >       > >> So it seems to me you = have two concepts
> here, not
> >     one. = One
> >     >       has to
> = >     >       >       = do
> >     >       >   =     > >> with describing the relationships = between
> streams,
> >     and the
> > =     >       other
> >     = >       >       has to
> > =     >       >       > = >> do with the encoding of spacial
> relationships
> = >     *within* a
> >     >   =     single
> >     >     =   >       stream.
> >     > =       >       > >
> > =     >       >       > = > Maybe that is a better way to describe it,
> if you
> = >     assume
> >     >     =   >       multi-channel audio is always sent with = all
> >     >       >   =     > the channels in the same RTP stream.  Is = that
> what you
> >     mean?
> > =     >       >       > = >
> >     >       >   =     > > I was considering the linear array format = to
> be
> >     another type
> > =     >       of
> >     = >       >       multi-channel audio, = and I know
> >     >       > =       > people want to be able to send each channel = in
> a
> >     separate RTP
> >   =   >       stream.
> >     > =       >       So it doesn't quite fit = with
> >     >       >   =     > how you separate the two concepts.  In = my
> view,
> >     identifying
> > =     >       the
> >     = >       >       separate channels by = what they mean is
> >     >       = >       > the same concept for linear array and = stereo.
> For
> >     example
> > =     >       "this
> >   =   >       >       channel is = left, this channel is
> >     >     =   >       > center, this channel is = right".  To me, that
> is the
> >     = same
> >     >       concept = for
> >     >       >   =     identifying channels whether or
> >     = >       >       > not they are = carried in the same RTP stream.
> >     >   =     >       > >
> >   =   >       >       > > = Maybe we are thinking the same thing but
> getting
> > =     confused by
> >     >     =   >       terminology about channels vs. = streams.
> >     >       > =       >
> >     >     =   >       > Maybe. Let me try to restate what = I now think
> you are
> >     saying:
> = >     >       >       = >
> >     >       >   =     > The audio may consist of several = "channels".
> >     >     =   >       >
> >     > =       >       > Each channel may be = sent over its own RTP
> stream,
> >     > =       >       > or multiple channels = may be multiplexed over
> an RTP
> >     = stream.
> >     >       >   =     >
> >     >       = >       > I guess much of this can also apply to = video.
> >     >       >   =     >
> >     >       = >       > When there are exactly two audio = channels,
> they may be
> >     encoded
> = >     >       as
> >   =   >       >       > = "stereo" or "binaural", which then affects = how
> they
> >     should be
> > =     >       rendered
> >   =   >       >       > by the = recipient. In these cases the primary
> info that
> > =     is
> >     >       = required
> >     >       > =       about
> >     >     =   >       > the individual channels is which = is left and
> which is
> >     right.
> = >     >       (And
> >   =   >       >       which
> = >     >       >       = > perspective to use in interpretting left and
> = right.)
> >     >       >   =     >
> >     >       = >       > For other multi-channel cases more = information
> is
> >     required
> > =     >       about the
> >   =   >       >       > role of = each channel in order to properly
> render them.
> > =     >       >       = >
> >     >       >   =     >       Thanks,
> >   =   >       >       >   =     Paul
> >     >       = >       >
> >     >   =     >       >
> >     = >       >       > >> Or, = are you asserting that stereo and
> binaural are
> > =     simply
> >     >     =   ways to
> >     >       > =       > >> encode
> >     = >       >       > >> = multiple logical streams in one RTP stream,
> >     = together with
> >     >       = their
> >     >       >   =     spacial
> >     >     =   >       > >> relationships?
> = >     >       >       = > >
> >     >       > =       > > No, that is not what I'm trying to = say.
> >     >       >   =     > >
> >     >     =   >       > > Mark
> >   =   >       >       > = >
> _______________________________________________
> = >     >       >       = > > clue mailing list
> >     >   =     >       > > clue@ietf.org
> >   =   >       >       > > https://www.ietf.org/mailman/listinfo/clue
> = >     >       >       = > >
> >     >       > =       >
> >     >     =   >       >
> = _______________________________________________
> >   =   >       >       > clue = mailing list
> >     >       > =       > clue@ietf.org
> >   =   >       >       > https://www.ietf.org/mailman/listinfo/clue
> = >     >       >       = _______________________________________________
> >   =   >       >       clue mailing = list
> >     >       >   =     clue@ietf.org
> = >     >       >       = https://www.ietf.org/mailman/listinfo/clue
> = >     >       >
> >   =   >       >
> >     = >
> >     >
> >     = >
> >
> >
> >
>
> = _______________________________________________
> clue mailing = list
> clue@ietf.org
> = https://www.ietf.org/mailman/listinfo/clue

 

From stephen.botzko@gmail.com Wed Aug 17 07:05:07 2011
From: Stephen Botzko <stephen.botzko@gmail.com>
To: Roni Even
Cc: clue@ietf.org
Date: Wed, 17 Aug 2011 10:05:55 -0400
Subject: Re: [clue] continuing "layout" discussion

For audio at least (and probably video) I agree you need the number and placement of *captures*, but I see no value in knowing the number of capture *devices*. For instance, the stereo encoding we started with might be derived from a microphone in front of each of two participants, or it might be derived from a large microphone array. For receivers, there is no difference, so I see no reason to signal it.

The 3D telepresence demonstration technology in the EU used 3 cameras to derive each 3D view (I think), but it could have also been done with a Kinect-style single camera. Again, the number of cameras used to make the capture would make no difference to a receiver.

I don't know what you mean by the "capture field" (or what specifically about it you think we ought to know), so at present I have no opinion as to whether it is needed or not.

I agree that mixing of sources from multiple rooms needs more attention (and I think that is what the "layout" conversation should be chiefly about).

I think the framework draft is not limited to 2 channels for audio and 3 cameras. I haven't seen any issues for an N-image video wall and an associated M-channel audio capture, as long as you stay within the stated assumption in the model that the audio/video are on one wall. Apparently there is work to extend the model to handle multiple video walls, and of course the audio will need to be adjusted for that.
BTW, I would challenge anyone who is either proposing an alternative framework or extending this draft to build the needed rendering equations to get the right sound field from a standard arrangement of speakers. Though rendering itself is out of scope, we do have enablement requirements for rendering. There are lots of things we *could* signal, but if their use in rendering is not easily understood, then we will not achieve interoperability.

Regards,
Stephen

On Wed, Aug 17, 2011 at 12:51 AM, Roni Even <Even.roni@huawei.com> wrote:
> Hi Steve,
>
> The two channel is a simple private case and using this case to define the
> required information from the capture and render side is like saying that I
> proved a mathematical induction for n=2 therefore it works for every n. I
> see this issue since we are using a n=2 channels audio and a n=3 cameras
> left to right examples to provide a solution that will scale to any n.
Maybe it > > > > > > is true that stereo or binaural audio channels > > would > > > always be > > > > sent as > > > > > a single stream, but I was not > > > > > > assuming that yet, at least not in general > > when you > > > consider > > > > other > > > > > types too, such as linear array > > > > > > channels. > > > > > > > > > > > > >> So it seems to me you have two concepts > > here, not > > > one. One > > > > has to > > > > > do > > > > > > >> with describing the relationships between > > streams, > > > and the > > > > other > > > > > has to > > > > > > >> do with the encoding of spacial > > relationships > > > *within* a > > > > single > > > > > stream. > > > > > > > > > > > > > > Maybe that is a better way to describe it, > > if you > > > assume > > > > > multi-channel audio is always sent with all > > > > > > the channels in the same RTP stream. Is that > > what you > > > mean? > > > > > > > > > > > > > > I was considering the linear array format to > > be > > > another type > > > > of > > > > > multi-channel audio, and I know > > > > > > people want to be able to send each channel in > > a > > > separate RTP > > > > stream. > > > > > So it doesn't quite fit with > > > > > > how you separate the two concepts. In my > > view, > > > identifying > > > > the > > > > > separate channels by what they mean is > > > > > > the same concept for linear array and stereo. > > For > > > example > > > > "this > > > > > channel is left, this channel is > > > > > > center, this channel is right". To me, that > > is the > > > same > > > > concept for > > > > > identifying channels whether or > > > > > > not they are carried in the same RTP stream. > > > > > > > > > > > > > > Maybe we are thinking the same thing but > > getting > > > confused by > > > > > terminology about channels vs. streams. > > > > > > > > > > > > Maybe. Let me try to restate what I now think > > you are > > > saying: > > > > > > > > > > > > The audio may consist of several "channels". 
> > > > > > > > > > > > Each channel may be sent over its own RTP > > stream, > > > > > > or multiple channels may be multiplexed over > > an RTP > > > stream. > > > > > > > > > > > > I guess much of this can also apply to video. > > > > > > > > > > > > When there are exactly two audio channels, > > they may be > > > encoded > > > > as > > > > > > "stereo" or "binaural", which then affects how > > they > > > should be > > > > rendered > > > > > > by the recipient. In these cases the primary > > info that > > > is > > > > required > > > > > about > > > > > > the individual channels is which is left and > > which is > > > right. > > > > (And > > > > > which > > > > > > perspective to use in interpretting left and > > right.) > > > > > > > > > > > > For other multi-channel cases more information > > is > > > required > > > > about the > > > > > > role of each channel in order to properly > > render them. > > > > > > > > > > > > Thanks, > > > > > > Paul > > > > > > > > > > > > > > > > > > >> Or, are you asserting that stereo and > > binaural are > > > simply > > > > ways to > > > > > > >> encode > > > > > > >> multiple logical streams in one RTP stream, > > > together with > > > > their > > > > > spacial > > > > > > >> relationships? > > > > > > > > > > > > > > No, that is not what I'm trying to say. 
>>>> Mark
>>>>
>>>> _______________________________________________
>>>> clue mailing list
>>>> clue@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/clue

For audio at least (and probably video) I agree you need the number and placement of captures, but I see no value in knowing the number of capture devices. For instance, the stereo encoding we started with might be derived from a microphone in front of each of two participants, or it might be derived from a large microphone array. For receivers, there is no difference, so I see no reason to signal it. The 3D telepresence demonstration technology in the EU used 3 cameras to derive each 3D view (I think), but it could also have been done with a Kinect-style single camera. Again, the number of cameras used to make the capture would make no difference to a receiver.

I don't know what you mean by the "capture field" (or what specifically about it you think we ought to know), so at present I have no opinion as to whether it is needed or not. I agree that mixing of sources from multiple rooms needs more attention (and I think that is what the "layout" conversation should be chiefly about).

I think the framework draft is not limited to 2 channels for audio and 3 cameras. I haven't seen any issues for an N-image video wall and an associated M-channel audio capture, as long as you stay within the stated assumption in the model that the audio/video are on one wall. Apparently there is work to extend the model to handle multiple video walls, and of course the audio will need to be adjusted for that.

BTW, I would challenge anyone who is either proposing an alternative framework or extending this draft to build the needed rendering equations to get the right sound field from a standard arrangement of speakers. Though rendering itself is out of scope, we do have enablement requirements for rendering. There are lots of things we could signal, but if their use in rendering is not easily understood, then we will not achieve interoperability.

Regards,
Stephen



On Wed, Aug 17, 2011 at 12:51 AM, Roni Even <Even.roni@huawei.com> wrote:

Hi Steve,

The two-channel case is a simple special case, and using it to define the required information from the capture and render sides is like claiming that because a mathematical induction holds for n=2 it works for every n. I raise this issue because we are using n=2 audio channels and n=3 left-to-right cameras as the examples from which to derive a solution that must scale to any n.


What I was trying to say is that the current way we describe a stream, by a number, is not enough if we want to achieve the "being there" experience.

We need to see what dimensions the model needs in order to convey the capture information and the rendering capabilities for both audio and video. I think there are some similarities, since the number of streams, how the capture is done, and what the rendering device is form the basic information in both cases.

I think that before going to the framework model it may be beneficial to create a list of the parameters we need to convey, provide a term for each group of parameters, and have a way to define them in the model. For example, for the capture we have the number of capture devices, the spatial arrangement, the encoding process (including mixing if there are multiple inputs), the capture field, and others.
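As a purely illustrative sketch of such a parameter list, the groups could be collected into a small data model. None of the names below (`CaptureGroup`, `capture_field`, etc.) come from the framework draft; they are assumptions made up for this example:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class CaptureDevice:
    """One microphone or camera; position is a hypothetical (x, y, z) in meters."""
    position: Tuple[float, float, float]

@dataclass
class CaptureGroup:
    """One group of capture parameters, roughly following the list above.
    All field names are illustrative, not terms from the framework draft."""
    media_type: str                          # "audio" or "video"
    devices: List[CaptureDevice] = field(default_factory=list)  # count + spatial arrangement
    encoding: str = ""                       # encoding process, incl. mixing of multiple inputs
    capture_field: Optional[str] = None      # extent of the scene covered (semantics TBD)

# Example: a stereo capture mixed down from two microphones.
stereo = CaptureGroup(
    media_type="audio",
    devices=[CaptureDevice((-0.5, 0.0, 1.2)), CaptureDevice((0.5, 0.0, 1.2))],
    encoding="mixed-stereo",
)
print(len(stereo.devices))  # number of capture devices -> 2
```

The point of grouping is that a receiver can reason about one `CaptureGroup` as a unit, regardless of how many RTP streams eventually carry it.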


Regards

Roni


From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
Sent: Wednesday, August 17, 2011 4:37 AM
To: Roni Even
Cc: Charles Eckel (eckelcu); clue@ietf.org
Subject: Re: [clue] continuing "layout" discussion


Hi Roni

For this particular discussion, all of the two-channel transmissions are "stereo"; they are just transported differently.

As far as the framework draft is concerned, the various microphone arrangements are accounted for by the signaling of the 1-100 indices for each channel.
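One way to see why a per-channel left-to-right index can be enough for a receiver: a renderer can map the index directly to speaker gains. The sketch below uses a constant-power (sin/cos) pan law and a linear index-to-angle mapping; both are assumptions chosen for illustration, not anything specified by the framework draft.

```python
import math

def pan_gains(index: int) -> tuple:
    """Map a 1-100 left-to-right channel index to (left, right) speaker
    gains using a constant-power pan law. Illustrative only."""
    if not 1 <= index <= 100:
        raise ValueError("index must be in 1..100")
    theta = (index - 1) / 99 * (math.pi / 2)  # 1 -> fully left, 100 -> fully right
    return (math.cos(theta), math.sin(theta))

left, right = pan_gains(50)
# Constant power: left**2 + right**2 == 1 for every index, so overall
# loudness stays the same as a capture moves across the sound stage.
```

A real telepresence renderer would of course use the room geometry and speaker arrangement rather than a fixed two-speaker pan, but the signaled index is the only per-channel spatial input it needs.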

Binaural is something else: either an HRTF function is applied to the two channels prior to rendering (which was Christer's case with the central rendering server), or you have a dummy head with microphones in the ears in the telepresence room to make the capture. Not sure if we need to distinguish the capture and render cases right now.

Regards,
Stephen

On Tue, Aug 16, 2011 at 7:34 PM, Roni Even <Even.roni@huawei.com> wrote:

Hi guys,
In case (1), according to RFC 3551 (section 4.1), 2 channels in the rtpmap means left and right channels, described as stereo. Are you saying that for the (2) and (2b) cases you also assume a stereo capture, or can the two audio streams be created from the same room in some other way (binaural recording (not common), or some other arrangement of the microphones)? But this concerns the capture side.
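The rtpmap attribute Roni refers to carries the channel count as the encoding parameter, and per RFC 3551 a count of 2 defaults to left/right (stereo) channel ordering. A minimal parser sketch; the payload type numbers and the L16 codec below are just examples, not values from this thread:

```python
def parse_rtpmap(line: str):
    """Split an SDP 'a=rtpmap:' line into (payload_type, encoding,
    clock_rate, channels). Channels default to 1 when the encoding
    parameter is absent."""
    assert line.startswith("a=rtpmap:")
    pt, rest = line[len("a=rtpmap:"):].split(" ", 1)
    parts = rest.strip().split("/")
    channels = int(parts[2]) if len(parts) > 2 else 1
    return int(pt), parts[0], int(parts[1]), channels

# Case (1): one RTP stream carrying interleaved stereo (2 channels):
print(parse_rtpmap("a=rtpmap:96 L16/48000/2"))  # -> (96, 'L16', 48000, 2)
# Case (2): each of two separate mono streams signals a single channel:
print(parse_rtpmap("a=rtpmap:97 L16/48000"))    # -> (97, 'L16', 48000, 1)
```

This is exactly the gap under discussion: the channel count distinguishes (1) from (2), but nothing in the rtpmap line says how two mono streams relate spatially, which is what CLUE would have to add.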

I think that Christer talked about the rendering side, and not only about the capture side.

Roni


> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> Charles Eckel (eckelcu)
> Sent: Wednesday, August 17, 2011 12:40 AM
> To: Stephen Botzko
> Cc: clue@ietf.org
> Subject: Re: [clue] continuing "layout" discussion
>
> Agreed. The difference I am trying to point out is that in (1), the
> information you need to describe the audio stream for appropriate
> rendering is already handled quite well by existing SIP/SDP/RTP and
> most implementations, whereas you need CLUE for (2) and (2b).
>
> Cheers,
> Charles
>
> > -----Original Message-----
> > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > Sent: Tuesday, August 16, 2011 2:14 PM
> > To: Charles Eckel (eckelcu)
> > Cc: Paul Kyzivat; clue@ietf.org
> > Subject: Re: [clue] continuing "layout" discussion
> >
> > Well, the audio in (1) and (2b) is certainly packetized differently,
> > but not compressed differently (unless you are assuming that the
> > signal in (1) is jointly encoded stereo - which it could be, I guess,
> > but it would be unusual for telepresence systems). Also, the audio in
> > (1) is not mixed, no matter how it is encoded.
> >
> > In any event, I believe that the difference between (1), (2), and
> > (2b) is really a transport question that has nothing to do with
> > layout. The same information is needed to enable proper rendering,
> > and once the streams are received, they are rendered in precisely
> > the same way.
> >
> > Regards,
> > Stephen Botzko

=A0


--20cf307d049cacbf2c04aab3feec--

From Even.roni@huawei.com Wed Aug 17 08:12:45 2011
Date: Wed, 17 Aug 2011 18:12:01 +0300
From: Roni Even <Even.roni@huawei.com>
To: 'Stephen Botzko'
Cc: clue@ietf.org
Message-id: <036801cc5cf0$0d08f8e0$271aeaa0$%roni@huawei.com>
References: <44C6B6B2D0CF424AA90B6055548D7A61AE9B48AD@CRPMBOXPRD01.polycom.com>
 <4E413021.3010509@alum.mit.edu>
 <44C6B6B2D0CF424AA90B6055548D7A61AEA65C62@CRPMBOXPRD01.polycom.com>
 <4E43D2BE.5010102@alum.mit.edu>
 <02a501cc5c6d$1a2bf1e0$4e83d5a0$%roni@huawei.com>
 <02bc01cc5c99$6375adb0$2a610910$%roni@huawei.com>
Subject: Re: [clue] continuing "layout" discussion

Steve,

I also would not agree that we need to specify how rendering is done once
the streams arrive at the receiver. I think the receiver should be able to
provide information about its rendering capabilities, which may help the
sender create better content.

As for my comment on the 2 and 3 case: the current layout discussion is on
audio with 2 channels. I was saying that this is a simple case, and if we
want to discuss layout it should be clear what it means for multi-video and
audio cases.

My comment on the current framework is that it dives immediately into the
model, and the examples talk about a three-camera left-to-right case. I was
questioning how this model scales.

I agree with your comment about the number of capture devices. By "capture
field" I was trying to raise the issue of the view port (is the focus on
the first row, second row, multiview, ...). Maybe it can be described by
the framework, but it is not explained how.
Roni

From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
Sent: Wednesday, August 17, 2011 5:06 PM
To: Roni Even
Cc: Charles Eckel (eckelcu); clue@ietf.org
Subject: Re: [clue] continuing "layout" discussion

For audio at least (and probably video) I agree you need the number and
placement of captures, but I see no value in knowing the number of capture
devices. For instance, the stereo encoding we started with might be derived
from a microphone in front of each of two participants, or it might be
derived from a large microphone array. For receivers there is no
difference, so I see no reason to signal it. The 3D telepresence
demonstration technology in the EU used 3 cameras to derive each 3D view (I
think), but it could also have been done with a Kinect-style single camera.
Again, the number of cameras used to make the capture would make no
difference to a receiver.

I don't know what you mean by the "capture field" (or what specifically
about it you think we ought to know), so at present I have no opinion as to
whether it is needed or not. I agree that mixing of sources from multiple
rooms needs more attention (and I think that is what the "layout"
conversation should chiefly be about).

I think the framework draft is not limited to 2 channels for audio and 3
cameras. I haven't seen any issues for an N-image video wall and an
associated M-channel audio capture, as long as you stay within the stated
assumption in the model that the audio/video are on one wall. Apparently
there is work to extend the model to handle multiple video walls, and of
course the audio will need to be adjusted for that.

BTW, I would challenge anyone who is either proposing an alternative
framework or extending this draft to build the rendering equations needed
to get the right sound field from a standard arrangement of speakers.
Though rendering itself is out of scope, we do have enablement requirements
for rendering.
There are lots of things we could signal, but if their use in rendering is
not easily understood, then we will not achieve interoperability.

Regards,
Stephen

On Wed, Aug 17, 2011 at 12:51 AM, Roni Even <Even.roni@huawei.com> wrote:

Hi Steve,

The two-channel case is a simple private case, and using it to define the
required information from the capture and render side is like saying that I
proved a mathematical induction for n=2, therefore it works for every n. I
see this issue since we are using n=2 audio channels and n=3 left-to-right
cameras as examples to provide a solution that should scale to any n.

What I was trying to say is that the current way we describe the stream by
a number is not enough if we want to get to the "being there" experience.

We need to see what dimensions the model needs in order to convey the
capture information and the rendering capabilities for both audio and
video. I think there are some similarities, since we have the number of
streams, how the capture is done, and what the rendering device is as the
basic information.

I think that before going to the framework model it may be beneficial to
create a list of the parameters we need to convey, provide a term for each
group of parameters, and have a way to define them in the model. For
example, for the capture we have the number of capture devices, the
arrangement (spatial), the encoding process (including mixing if there are
multiple inputs), the capture field, and others.

Regards
Roni

From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
Sent: Wednesday, August 17, 2011 4:37 AM
To: Roni Even
Cc: Charles Eckel (eckelcu); clue@ietf.org
Subject: Re: [clue] continuing "layout" discussion

Hi Roni

For this particular discussion, all of the two-channel transmissions are
"stereo"; they are just transported differently.

As far as the framework draft is concerned, the various microphone
arrangements are accounted for by the signaling of the 1-100 indices for
each channel.
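The 1-100 left-to-right channel index mentioned here can drive a simple rendering equation of the kind the thread asks for. The sketch below is an illustration under stated assumptions, not anything the framework draft specifies: it maps a channel's index to left/right speaker gains using a constant-power pan law, which is one common rendering choice among many.

```python
import math

def pan_gains(index):
    """Map a 1-100 left-to-right spatial index to (left_gain, right_gain)
    using a constant-power pan law: index 1 is hard left, 100 hard right.
    The 1-100 scale follows the framework draft's channel signaling; the
    pan law itself is an assumption made for this sketch."""
    if not 1 <= index <= 100:
        raise ValueError("spatial index must be in 1..100")
    theta = (index - 1) / 99 * (math.pi / 2)  # sweep 0 .. pi/2
    return math.cos(theta), math.sin(theta)

left_gain, right_gain = pan_gains(1)   # hard left
assert abs(left_gain - 1.0) < 1e-9 and abs(right_gain) < 1e-9

# Constant power across the whole image: l^2 + r^2 == 1 for every index.
for i in range(1, 101):
    l, r = pan_gains(i)
    assert abs(l * l + r * r - 1.0) < 1e-9
```

A receiver with a different speaker arrangement would substitute its own gain equations; the point is that the signaled index, not the number of capture devices, is what the equations consume.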
Binaural is something else - either an HRTF function is applied to the two
channels prior to rendering (which was Christer's case with the central
rendering server), or you have a dummy head with microphones in the ears in
the telepresence room to make the capture. Not sure if we need to
distinguish the capture and render cases right now.

Regards,
Stephen

On Tue, Aug 16, 2011 at 7:34 PM, Roni Even <Even.roni@huawei.com> wrote:

Hi guys,

In case 1, according to RFC 3551 (section 4.1), 2 channels in the rtpmap
means left and right channels, described as stereo. Are you saying that for
the 2 and 2b cases you also assume stereo capture, or can the two audio
streams from the same room be created in some other way (binaural recording
(not common), or some other arrangement of the microphones)? But this is
about the capture side.

I think that Christer talked about the rendering side and not only about
the capture side.

Roni

> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> Charles Eckel (eckelcu)
> Sent: Wednesday, August 17, 2011 12:40 AM
> To: Stephen Botzko
> Cc: clue@ietf.org
> Subject: Re: [clue] continuing "layout" discussion
>
> Agreed. The difference I am trying to point out is that in (1), the
> information you need to describe the audio stream for appropriate
> rendering is already handled quite well by existing SIP/SDP/RTP and most
> implementations, whereas you need CLUE for (2) and (2b).
>
> Cheers,
> Charles
>
> > -----Original Message-----
> > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > Sent: Tuesday, August 16, 2011 2:14 PM
> > To: Charles Eckel (eckelcu)
> > Cc: Paul Kyzivat; clue@ietf.org
> > Subject: Re: [clue] continuing "layout" discussion
> >
> > Well, the audio in (1) and (2b) is certainly packetized differently.
> > But not compressed differently (unless you are assuming that the
> > signal in (1) is jointly encoded stereo - which it could be I guess,
> > but it would be unusual for telepresence systems). Also, the audio in
> > (1) is not mixed, no matter how it is encoded.
> >
> > In any event, I believe that the difference between (1) and (2) and
> > (2b) is really a transport question that has nothing to do with
> > layout. The same information is needed to enable proper rendering, and
> > once the streams are received, they are rendered in precisely the same
> > way.
> >
> > Regards,
> > Stephen Botzko
> >
> > On Tue, Aug 16, 2011 at 4:23 PM, Charles Eckel (eckelcu)
> > <eckelcu@cisco.com> wrote:
> >
> > > I am distinguishing between:
> > >
> > > (1) a single RTP stream that consists of a single stereo audio
> > > stream
> > > (2) two RTP streams, one that contains left speaker audio and the
> > > other that contains right speaker audio
> > >
> > > (2) could also be transmitted in a single RTP stream using SSRC
> > > multiplexing. Let me call that (2b).
> > > (2) and (2b) are essentially the same. Just the RTP mechanism
> > > employed is different.
> > > (1) is different from (2) and (2b) in that the audio signal encoded
> > > is actually different.
> > >
> > > Cheers,
> > > Charles
> > >
> > > > -----Original Message-----
> > > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > > > Sent: Tuesday, August 16, 2011 6:20 AM
> > > > To: Charles Eckel (eckelcu)
> > > > Cc: Paul Kyzivat; clue@ietf.org
> > > > Subject: Re: [clue] continuing "layout" discussion
> > > >
> > > > I guess by "stream" you mean RTP stream? In which case by "mix"
> > > > you perhaps mean that the left and right channels are placed in a
> > > > single RTP stream??? What do you mean when you describe some
> > > > audio captures as "independent" - are you thinking they come from
> > > > different rooms???
> > > >
> > > > I think in many respects audio distribution and spatial audio
> > > > layout is at least as difficult as video layout, and has some
> > > > unique issues. For one thing, you need to sort out how you should
> > > > place the audio from human participants who are not on camera,
> > > > and what should happen later on if some of those participants are
> > > > shown.
> > > >
> > > > I suggest it is necessary to be very careful with terminology. In
> > > > particular, I think it is important to distinguish composition
> > > > from RTP transmission.
> > > >
> > > > Regards,
> > > > Stephen Botzko
> > > >
> > > > On Mon, Aug 15, 2011 at 5:45 PM, Charles Eckel (eckelcu)
> > > > <eckelcu@cisco.com> wrote:
> > > >
> > > > > > -----Original Message-----
> > > > > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > > > > > Sent: Monday, August 15, 2011 2:14 PM
> > > > > > To: Charles Eckel (eckelcu)
> > > > > > Cc: Paul Kyzivat; clue@ietf.org
> > > > > > Subject: Re: [clue] continuing "layout" discussion
> > > > > >
> > > > > > Inline
> > > > > >
> > > > > > On Mon, Aug 15, 2011 at 4:21 PM, Charles Eckel (eckelcu)
> > > > > > <eckelcu@cisco.com> wrote:
> > > > > >
> > > > > > > Please see inline.
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: clue-bounces@ietf.org
> > > > > > > > [mailto:clue-bounces@ietf.org] On Behalf Of Paul Kyzivat
> > > > > > > > Sent: Thursday, August 11, 2011 6:02 AM
> > > > > > > > To: clue@ietf.org
> > > > > > > > Subject: Re: [clue] continuing "layout" discussion
> > > > > > > >
> > > > > > > > Inline
> > > > > > > >
> > > > > > > > On 8/10/11 5:49 PM, Duckworth, Mark wrote:
> > > > > > > > >> -----Original Message-----
> > > > > > > > >> From: clue-bounces@ietf.org
> > > > > > > > >> [mailto:clue-bounces@ietf.org] On Behalf Of Paul
> > > > > > > > >> Kyzivat
> > > > > > > > >> Sent: Tuesday, August 09, 2011 9:03 AM
> > > > > > > > >> To: clue@ietf.org
> > > > > > > > >> Subject: Re: [clue] continuing "layout" discussion
> > > > > > > >
> > > > > > > > >>> 4 - multi stream media format - what the streams mean
> > > > > > > > >>> with respect to each other, regardless of the actual
> > > > > > > > >>> content on the streams. For audio, examples are
> > > > > > > > >>> stereo, 5.1 surround, binaural, linear array. (linear
> > > > > > > > >>> array is described in the clue framework document).
> > > > > > > > >>> Perhaps 3D video formats would also fit in this
> > > > > > > > >>> category. This information is needed in order to
> > > > > > > > >>> properly render the media into light and sound for
> > > > > > > > >>> human observers. I see this at the same level as
> > > > > > > > >>> identifying a codec, independent of the audio or
> > > > > > > > >>> video content carried on the streams, and independent
> > > > > > > > >>> of how any composition of sources is done.
> > > > > > >
> > > > > > > I do not think this is necessarily true. Taking audio as an
> > > > > > > example, you could have two audio streams that are mixed to
> > > > > > > form a single stereo audio stream, or you could have them
> > > > > > > as two independent (not mixed) streams that are associated
> > > > > > > with each other by some grouping mechanism. This group
> > > > > > > would be categorized as being stereo audio, with one audio
> > > > > > > stream being the left and the other the right. The codec
> > > > > > > used for each could be different, though I agree they would
> > > > > > > typically be the same. Consequently, I think an attribute
> > > > > > > such as "stereo" is more of a grouping concept, where the
> > > > > > > group may consist of:
> > > > > > > - multiple independent streams, each with potentially its
> > > > > > >   own spatial orientation, codec, bandwidth, etc.,
> > > > > > > - a single mixed stream
> > > > > >
> > > > > > [sb] I do not understand this distinction. What do you mean
> > > > > > when you say "two audio streams that are mixed to form a
> > > > > > single stereo stream", and how is this different from the
> > > > > > left and right grouping?
> > > > >
> > > > > In one case they are mixed by the source of the stream into a
> > > > > single stream, and in another they are sent as two separate
> > > > > streams by the source. The end result once rendered at the
> > > > > receiver may be the same, but what is sent is different. This
> > > > > example with audio is perhaps too simple. If you think of it as
> > > > > video that is composed into a single video stream vs. multiple
> > > > > video streams that are sent individually, the difference may be
> > > > > more clear.
> > > > >
> > > > > Cheers,
> > > > > Charles
> > > > >
> > > > > > > Cheers,
> > > > > > > Charles
> > > > > > >
> > > > > > > > >> I was with you all the way until 4. That one I don't
> > > > > > > > >> understand. The name you chose for this has
> > > > > > > > >> connotations for me, but isn't fully in harmony with
> > > > > > > > >> the definitions you give:
> > > > > > > > >
> > > > > > > > > I'm happy to change the name if you have a suggestion
> > > > > > > >
> > > > > > > > Not yet. Maybe once the concepts are more clearly defined
> > > > > > > > I will have an opinion.
> > > > > > > >
> > > > > > > > >> If we consider audio, it makes sense that multiple
> > > > > > > > >> streams can be rendered as if they came from different
> > > > > > > > >> physical locations in the receiving room. That can be
> > > > > > > > >> done by the receiver if it gets those streams
> > > > > > > > >> separately, and has information about their intended
> > > > > > > > >> relationships. It can also be done by the sender or
> > > > > > > > >> MCU and passed on to the receiver as a single stream
> > > > > > > > >> with stereo or binaural coding.
> > > > > > > > >
> > > > > > > > > Yes. It could also be done by the sender using the
> > > > > > > > > "linear array" audio channel format. Maybe it is true
> > > > > > > > > that stereo or binaural audio channels would always be
> > > > > > > > > sent as a single stream, but I was not assuming that
> > > > > > > > > yet, at least not in general when you consider other
> > > > > > > > > types too, such as linear array channels.
> > > > > > > >
> > > > > > > > >> So it seems to me you have two concepts here, not one.
> > > > > > > > >> One has to do with describing the relationships
> > > > > > > > >> between streams, and the other has to do with the
> > > > > > > > >> encoding of spacial relationships *within* a single
> > > > > > > > >> stream.
> > > > > > > > >
> > > > > > > > > Maybe that is a better way to describe it, if you
> > > > > > > > > assume multi-channel audio is always sent with all the
> > > > > > > > > channels in the same RTP stream. Is that what you mean?
> > > > > > > > > I was considering the linear array format to be another
> > > > > > > > > type of multi-channel audio, and I know people want to
> > > > > > > > > be able to send each channel in a separate RTP stream.
> > > > > > > > > So it doesn't quite fit with how you separate the two
> > > > > > > > > concepts. In my view, identifying the separate channels
> > > > > > > > > by what they mean is the same concept for linear array
> > > > > > > > > and stereo. For example "this channel is left, this
> > > > > > > > > channel is center, this channel is right". To me, that
> > > > > > > > > is the same concept for identifying channels whether or
> > > > > > > > > not they are carried in the same RTP stream.
> > > > > > > > > Maybe we are thinking the same thing but getting
> > > > > > > > > confused by terminology about channels vs. streams.
> > > > > > > >
> > > > > > > > Maybe. Let me try to restate what I now think you are
> > > > > > > > saying:
> > > > > > > >
> > > > > > > > The audio may consist of several "channels".
> > > > > > > >
> > > > > > > > Each channel may be sent over its own RTP stream, or
> > > > > > > > multiple channels may be multiplexed over an RTP stream.
> > > > > > > >
> > > > > > > > I guess much of this can also apply to video.
> > > > > > > >
> > > > > > > > When there are exactly two audio channels, they may be
> > > > > > > > encoded as "stereo" or "binaural", which then affects how
> > > > > > > > they should be rendered by the recipient. In these cases
> > > > > > > > the primary info that is required about the individual
> > > > > > > > channels is which is left and which is right. (And which
> > > > > > > > perspective to use in interpretting left and right.)
> > > > > > > >
> > > > > > > > For other multi-channel cases more information is
> > > > > > > > required about the role of each channel in order to
> > > > > > > > properly render them.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Paul
> > > > > > > >
> > > > > > > > >> Or, are you asserting that stereo and binaural are
> > > > > > > > >> simply ways to encode multiple logical streams in one
> > > > > > > > >> RTP stream, together with their spacial relationships?
> > > > > > > > >
> > > > > > > > > No, that is not what I'm trying to say.
> > > > > > > > >
> > > > > > > > > Mark
> > > > > > > >
> > > > > > > > _______________________________________________
> > > > > > > > clue mailing list
> > > > > > > > clue@ietf.org
> > > > > > > > https://www.ietf.org/mailman/listinfo/clue
>
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue
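For readers following the (1)/(2)/(2b) distinction in the quoted thread, a minimal SDP sketch may help make it concrete. It is illustrative only: it relies on RFC 3551's static payload types (10 = L16 stereo at 44.1 kHz, 11 = L16 mono), the port numbers are arbitrary, and nothing here comes from a CLUE draft. Case (1), a single RTP stream carrying jointly packetized stereo:

```
m=audio 49170 RTP/AVP 10
a=rtpmap:10 L16/44100/2
```

Case (2), two mono RTP streams, one per channel:

```
m=audio 49172 RTP/AVP 11
a=rtpmap:11 L16/44100/1
m=audio 49174 RTP/AVP 11
a=rtpmap:11 L16/44100/1
```

In case (1), RFC 3551's channel-ordering convention already tells the receiver which sample is left and which is right. In case (2), baseline SDP gives no way to say which m-line is the left channel and which is the right; that association (and its spatial meaning) is precisely the kind of information the thread argues CLUE must signal.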

> >     >       >   =     > >>> 4 - multi stream media format - what = the
> streams
> >     mean with
> > =     >       respect
> >   =   >       >       to
> = >     >       >       = > >> each other, regardless of the actual
> content on = the
> >     >       streams. =  For
> >     >       > =       > >> audio, examples are stereo, 5.1 = surround,
> binaural,
> >     linear
> = >     >       array.
> >   =   >       >       > >> = (linear array is described in the clue
> framework
> > =     document).
> >     >     =   >       Perhaps 3D
> >     = >       >       > >> video = formats would also fit in this
> category.
> >   =   This
> >     >       = information is
> >     >       > =       > >> needed in order to properly render = the
> media into
> >     light and
> > =     >       sound
> >     = >       >       for
> > =     >       >       > = >> human observers.  I see this at the same
> level = as
> >     >       identifying = a
> >     >       >   =     codec,
> >     >     =   >       > >> independent of the audio = or video content
> carried
> >     on = the
> >     >       = streams,
> >     >       > =       and
> >     >     =   >       > >> independent of how any = composition of
> sources is
> >     = done.
> >     >       >
> = >     >       >
> >   =   >       >       I do not = think this is necessarily true. Taking
> audio as
> > =     an
> >     >       = example, you
> >     >       > =       could have two audio streams that are mixed = to
> form a
> >     single
> >   =   >       stereo
> >     > =       >       audio stream, or you = could have them as two
> independent
> >     = (not
> >     >       mixed)
> = >     >       >       = streams that are associate with each other by
> some
> > =     grouping
> >     >     =   mechanism.
> >     >       = >       This group would be categorized as being = stereo
> audio
> >     with one
> > =     >       audio
> >     = >       >       stream being the = left and the other the right.
> The codec
> >   =   used
> >     >       for = each
> >     >       >   =     could be different, though I agree they would
> = typically
> >     be the
> >     = >       same.
> >     >   =     >       Consequently, I think at = attribute such as
> "stereo" as
> >   =   being
> >     >       more = of a
> >     >       >   =     grouping concept, where the group may consist
> = of:
> >     >       >   =     - multiple independent streams, each with
> = potentially
> >     its own
> >   =   >       spatial
> >     > =       >       orientation, codec, = bandwidth, etc.,
> >     >       = >       - a single mixed stream
> >   =   >       >
> >     > =       >
> >     >     =   >
> >     >       > = [sb] I do not understand this distinction.  What do
> you = mean
> >     when you
> >     > =       say "two audio streams that are
> > =     >       > mixed to form a single = stereo stream", and how is this
> >     = different from
> >     >       the = left and right grouping?
> >     >
> > =     >
> >     >       = In one case they are mixed by the source of the stream
> into = a
> >     single
> >     > =       stream, and in another they are sent as two = separate
> streams by
> >     the
> > =     >       source. The end result once = rendered at the receiver may
> be the
> >     = same,
> >     >       but what is = sent is different. This example with audio
> is
> > =     perhaps too
> >     >     =   simple. If you think of it as video that is composed
> into = a
> >     single video
> >     = >       stream vs. multiple via streams that are = sent
> individually, the
> >     >   =     difference may be more clear.
> >     = >
> >     >       = Cheers,
> >     >       = Charles
> >     >
> >     = >
> >     >       >
> = >     >       >
> >   =   >       >
> >     > =       >       Cheers,
> > =     >       >       = Charles
> >     >       = >
> >     >       >
> = >     >       >       = > >> I was with you all the way until 4. That
> one = I
> >     don't
> >     > =       understand.
> >     >   =     >       > >> The name you chose = for this has
> connotations for
> >     me, = but
> >     >       isn't
> = >     >       >       = fully in
> >     >       > =       > >> harmony with the definitions you = give:
> >     >       >   =     > >
> >     >     =   >       > > I'm happy to change the name = if you have a
> >     suggestion
> >   =   >       >       >
> = >     >       >       = > Not yet. Maybe once the concepts are more
> clearly
> = >     defined I
> >     >   =     will have
> >     >     =   >       an
> >     > =       >       > opinion.
> = >     >       >       = >
> >     >       >   =     > >> If we consider audio, it makes sense = that
> multiple
> >     streams
> > =     >       can be
> >   =   >       >       > >> = rendered as if they came from different
> physical
> > =     locations
> >     >     =   in the
> >     >       > =       > >> receiving room. That can be done by = the
> receiver if
> >     it gets
> > =     >       those
> >     = >       >       > >> = streams separately, and has information
> about their
> > =     >       intended
> >   =   >       >       > >> = relationships. It can also be done by the
> sender or
> > =     MCU and
> >     >     =   passed
> >     >       > =       on
> >     >     =   >       > >> to
> >   =   >       >       > >> = the receiver as a single stream with stereo
> or
> > =     binaural
> >     >     =   coding.
> >     >       > =       > >
> >     >   =     >       > > Yes.  It could = also be done by the sender
> using the
> >     = "linear
> >     >       = array"
> >     >       > =       audio channel format.  Maybe it
> > =     >       >       > = is true that stereo or binaural audio channels
> would
> = >     always be
> >     >   =     sent as
> >     >     =   >       a single stream, but I was not
> = >     >       >       = > assuming that yet, at least not in general
> when you
> = >     consider
> >     >   =     other
> >     >       = >       types too, such as linear array
> > =     >       >       > = channels.
> >     >       > =       >
> >     >     =   >       > >> So it seems to me you = have two concepts
> here, not
> >     one. = One
> >     >       has to
> = >     >       >       = do
> >     >       >   =     > >> with describing the relationships = between
> streams,
> >     and the
> > =     >       other
> >     = >       >       has to
> > =     >       >       > = >> do with the encoding of spacial
> relationships
> = >     *within* a
> >     >   =     single
> >     >     =   >       stream.
> >     > =       >       > >
> > =     >       >       > = > Maybe that is a better way to describe it,
> if you
> = >     assume
> >     >     =   >       multi-channel audio is always sent with = all
> >     >       >   =     > the channels in the same RTP stream.  Is = that
> what you
> >     mean?
> > =     >       >       > = >
> >     >       >   =     > > I was considering the linear array format = to
> be
> >     another type
> > =     >       of
> >     = >       >       multi-channel audio, = and I know
> >     >       > =       > people want to be able to send each channel = in
> a
> >     separate RTP
> >   =   >       stream.
> >     > =       >       So it doesn't quite fit = with
> >     >       >   =     > how you separate the two concepts.  In = my
> view,
> >     identifying
> > =     >       the
> >     = >       >       separate channels by = what they mean is
> >     >       = >       > the same concept for linear array and = stereo.
> For
> >     example
> > =     >       "this
> >   =   >       >       channel is = left, this channel is
> >     >     =   >       > center, this channel is = right".  To me, that
> is the
> >     = same
> >     >       concept = for
> >     >       >   =     identifying channels whether or
> >     = >       >       > not they are = carried in the same RTP stream.
> >     >   =     >       > >
> >   =   >       >       > > = Maybe we are thinking the same thing but
> getting
> > =     confused by
> >     >     =   >       terminology about channels vs. = streams.
> >     >       > =       >
> >     >     =   >       > Maybe. Let me try to restate what = I now think
> you are
> >     saying:
> = >     >       >       = >
> >     >       >   =     > The audio may consist of several = "channels".
> >     >     =   >       >
> >     > =       >       > Each channel may be = sent over its own RTP
> stream,
> >     > =       >       > or multiple channels = may be multiplexed over
> an RTP
> >     = stream.
> >     >       >   =     >
> >     >       = >       > I guess much of this can also apply to = video.
> >     >       >   =     >
> >     >       = >       > When there are exactly two audio = channels,
> they may be
> >     encoded
> = >     >       as
> >   =   >       >       > = "stereo" or "binaural", which then affects = how
> they
> >     should be
> > =     >       rendered
> >   =   >       >       > by the = recipient. In these cases the primary
> info that
> > =     is
> >     >       = required
> >     >       > =       about
> >     >     =   >       > the individual channels is which = is left and
> which is
> >     right.
> = >     >       (And
> >   =   >       >       which
> = >     >       >       = > perspective to use in interpretting left and
> = right.)
> >     >       >   =     >
> >     >       = >       > For other multi-channel cases more = information
> is
> >     required
> > =     >       about the
> >   =   >       >       > role of = each channel in order to properly
> render them.
> > =     >       >       = >
> >     >       >   =     >       Thanks,
> >   =   >       >       >   =     Paul
> >     >       = >       >
> >     >   =     >       >
> >     = >       >       > >> Or, = are you asserting that stereo and
> binaural are
> > =     simply
> >     >     =   ways to
> >     >       > =       > >> encode
> >     = >       >       > >> = multiple logical streams in one RTP stream,
> >     = together with
> >     >       = their
> >     >       >   =     spacial
> >     >     =   >       > >> relationships?
> = >     >       >       = > >
> >     >       > =       > > No, that is not what I'm trying to = say.
> >     >       >   =     > >
> >     >     =   >       > > Mark
> >   =   >       >       > = >
> _______________________________________________
> = >     >       >       = > > clue mailing list
> >     >   =     >       > > clue@ietf.org
> >     > =       >       > > https://www.ietf.org/mailman/listinfo/clue
> = >     >       >       = > >
> >     >       > =       >
> >     >     =   >       >
> = _______________________________________________
> >   =   >       >       > clue = mailing list
> >     >       > =       > clue@ietf.org
> >     > =       >       > https://www.ietf.org/mailman/listinfo/clue
> = >     >       >       = _______________________________________________
> >   =   >       >       clue mailing = list
> >     >       >   =     clue@ietf.org
> >     > =       >       https://www.ietf.org/mailman/listinfo/clue
> = >     >       >
> >   =   >       >
> >     = >
> >     >
> >     = >
> >
> >
> >
>
> = _______________________________________________
> clue mailing = list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue
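[Editor's note: a minimal sketch, not part of the thread, of the payload difference Charles describes between (1) one stereo stream and (2)/(2b) two per-channel streams. For case (1), RFC 3551 interleaves multi-channel L16 audio sample-by-sample (left, right, left, right, ...) inside one RTP payload; for (2) and (2b) each channel travels in its own payload, and the "this one is left, that one is right" association must be signaled out of band (the role CLUE would play). The function names here are illustrative, not from any draft.]

```python
def interleave_stereo(left, right):
    """Case (1): one payload carrying both channels, samples interleaved
    left-first as in RFC 3551 channel order."""
    assert len(left) == len(right)
    out = []
    for l, r in zip(left, right):
        out.extend((l, r))
    return out

def split_channels(left, right):
    """Cases (2)/(2b): two payloads, one per channel; the left/right
    grouping lives in external signaling, not in the payload."""
    return {"left": list(left), "right": list(right)}

def deinterleave_stereo(payload):
    """Receiver side of case (1): recover the two channels."""
    return payload[0::2], payload[1::2]
```

Either way the receiver ends up with the same two channels; what differs is where the left/right relationship is expressed - inside the payload format in (1), or in out-of-band signaling for (2) and (2b).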

From stephen.botzko@gmail.com Wed Aug 17 08:34:32 2011
From: Stephen Botzko <stephen.botzko@gmail.com>
To: Roni Even
Cc: clue@ietf.org
Date: Wed, 17 Aug 2011 11:35:13 -0400
Subject: Re: [clue] continuing "layout" discussion

I didn't say that we needed to *specify* how rendering is done.  However,
the information we are providing is intended to enable the rendering of an
interoperable "being there experience".  So we do need to know if the
information we signal actually enables such rendering.

The best way to ensure that is to construct a rendering method that would
work, and which would need the information we propose to provide.  We
don't need to specify the rendering method; we don't even have to describe
it on the list.  But if we all can't construct a reasonable method easily,
then CLUE will have failed.

Regards,
Stephen

On Wed, Aug 17, 2011 at 11:12 AM, Roni Even wrote:

> Steve,
>
> I also would not agree that we need to specify how rendering is done once
> the streams arrive at the receiver. I think that the receiver should be
> able to provide information about his rendering capabilities, which may
> be helpful for the sender to create better content.
>
> As for my comment on the 2 and 3 case: the current layout discussion is
> on audio with 2 channels. I was saying that this is a simple case, and if
> we want to discuss layout it should be clear what it means for
> multi-video and audio cases.
>
> My comment on the current framework is that it dives immediately into the
> model, and the examples talk about a three camera left to right case. I
> was questioning how this model scales.
>
> I agree with your comment about the number of capture devices. By capture
> field I was trying to mention the issue of the view port (is the focus on
> the first row, second row, multiview, ...). Maybe it can be described by
> the framework but it is not explained how.
>
> Roni
>
> *From:* Stephen Botzko [mailto:stephen.botzko@gmail.com]
> *Sent:* Wednesday, August 17, 2011 5:06 PM
> *To:* Roni Even
> *Cc:* Charles Eckel (eckelcu); clue@ietf.org
> *Subject:* Re: [clue] continuing "layout" discussion
>
> For audio at least (and probably video) I agree you need the number and
> placement of *captures*, but I see no value in knowing the number of
> capture *devices*.  For instance, the stereo encoding we started with
> might be derived from a microphone in front of each of two participants,
> or it might be derived from a large microphone array.  For receivers,
> there is no difference, so I see no reason to signal it.  The 3D
> telepresence demonstration technology in the EU used 3 cameras to derive
> each 3D view (I think), but it could have also been done with a
> Kinect-style single camera.  Again, the number of cameras used to make
> the capture would make no difference to a receiver.
>
> I don't know what you mean by the "capture field" (or what specifically
> about it you think we ought to know), so at present I have no opinion as
> to whether it is needed or not.  I agree that mixing of sources from
> multiple rooms needs more attention (and I think that is what the
> "layout" conversation should be chiefly about).
>
> I think the framework draft is not limited to 2 channels for audio and 3
> cameras.  I haven't seen any issues for an N-image video wall and an
> associated M-channel audio capture, as long as you stay within the stated
> assumption in the model that the audio/video are on one wall.  Apparently
> there is work to extend the model to handle multiple video walls, and of
> course the audio will need to be adjusted for that.
>
> BTW, I would challenge anyone who is either proposing an alternative
> framework or extending this draft to build the needed rendering equations
> to get the right sound field from a standard arrangement of speakers.
> Though rendering itself is out of scope, we do have enablement
> requirements for rendering.  There are lots of things we *could* signal,
> but if their use in rendering is not easily understood, then we will not
> achieve interoperability.
>
> Regards,
> Stephen
>
> On Wed, Aug 17, 2011 at 12:51 AM, Roni Even wrote:
>
> Hi Steve,
>
> The two channel case is a simple private case, and using it to define the
> required information from the capture and render side is like saying that
> I proved a mathematical induction for n=2, therefore it works for every
> n.  I see this issue since we are using n=2 audio channels and n=3
> cameras (left to right) as examples to provide a solution that will
> scale to any n.
>
> What I was trying to say is that the current way we describe the stream
> by a number is not enough if we want to go to the "being there"
> experience.
>
> We need to see what the dimensions are that the model needs in order to
> be able to convey the capture information and the rendering capabilities
> for both audio and video.  I think that there are some similarities,
> since we have the number of streams, how the capture is done, and what
> the rendering device is as the basic information.
>
> I think that before going to the framework model it may be beneficial to
> create a list of the parameters we need to convey, provide a term for
> each group of parameters, and have a way to define them in the model.
> For example, for the capture we have the number of capture devices, the
> arrangement (spatial), the encoding process (including mixing if there
> are multiple inputs), the capture field, and others.
>
> Regards
>
> Roni
>
> *From:* Stephen Botzko [mailto:stephen.botzko@gmail.com]
> *Sent:* Wednesday, August 17, 2011 4:37 AM
> *To:* Roni Even
> *Cc:* Charles Eckel (eckelcu); clue@ietf.org
> *Subject:* Re: [clue] continuing "layout" discussion
>
> Hi Roni
>
> For this particular discussion, all of the two channel transmissions are
> "stereo"; they are just transported differently.
>
> As far as the framework draft is concerned, the various microphone
> arrangements are accounted for by the signaling of the 1-100 indices for
> each channel.
>
> Binaural is something else - either an HRTF function is applied to the
> two channels prior to rendering (which was Christer's case with the
> central rendering server), or you have a dummy head with microphones in
> the ears in the telepresence room to make the capture.  Not sure if we
> need to distinguish the capture and render cases right now.
>
> Regards,
> Stephen
>
> On Tue, Aug 16, 2011 at 7:34 PM, Roni Even wrote:
>
> Hi guys,
> In case 1, according to RFC 3551 (section 4.1), 2 channels in the rtpmap
> means left and right channels, described as stereo.  Are you saying that
> for the 2 and 2b case you also assume stereo capture, or can it be any
> other way of creating the two audio streams from the same room (binaural
> recording (not common), or some other arrangement of the microphones)?
> But this talks about the capture side.
>
> I think that Christer talked about the rendering side and not only about
> the capture side.
>
> Roni
>
> > -----Original Message-----
> > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> > Charles Eckel (eckelcu)
> > Sent: Wednesday, August 17, 2011 12:40 AM
> > To: Stephen Botzko
> > Cc: clue@ietf.org
> > Subject: Re: [clue] continuing "layout" discussion
> >
> > Agreed. The difference I am trying to point out is that in (1), the
> > information you need to describe the audio stream for appropriate
> > rendering is already handled quite well by existing SIP/SDP/RTP and
> > most implementations, whereas you need CLUE for (2) and (2b).
> >
> > Cheers,
> > Charles
> >
> > > -----Original Message-----
> > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > > Sent: Tuesday, August 16, 2011 2:14 PM
> > > To: Charles Eckel (eckelcu)
> > > Cc: Paul Kyzivat; clue@ietf.org
> > > Subject: Re: [clue] continuing "layout" discussion
> > >
> > > Well, the audio in (1) and (2b) is certainly packetized differently,
> > > but not compressed differently (unless you are assuming that the
> > > signal in (1) is jointly encoded stereo - which it could be, I
> > > guess, but it would be unusual for telepresence systems).  Also,
> > > the audio in (1) is not mixed, no matter how it is encoded.
> > >
> > > In any event, I believe that the difference between (1) and (2) and
> > > (2b) is really a transport question that has nothing to do with
> > > layout.  The same information is needed to enable proper rendering,
> > > and once the streams are received, they are rendered in precisely
> > > the same way.
> > >
> > > Regards,
> > > Stephen Botzko
> > >
> > > On Tue, Aug 16, 2011 at 4:23 PM, Charles Eckel (eckelcu) wrote:
> > >
> > > I am distinguishing between:
> > >
> > > (1) a single RTP stream that consists of a single stereo audio stream
> > > (2) two RTP streams, one that contains left speaker audio and the
> > > other that contains right speaker audio
> > >
> > > (2) could also be transmitted in a single RTP stream using SSRC
> > > multiplexing. Let me call that (2b).
> > > (2) and (2b) are essentially the same. Just the RTP mechanism
> > > employed is different.
> > > (1) is different from (2) and (2b) in that the audio signal encoded
> > > is actually different.
> > >
> > > Cheers,
> > > Charles
> > >
> > > > -----Original Message-----
> > > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > > > Sent: Tuesday, August 16, 2011 6:20 AM
> > > > To: Charles Eckel (eckelcu)
> > > > Cc: Paul Kyzivat; clue@ietf.org
> > > > Subject: Re: [clue] continuing "layout" discussion
> > > >
> > > > I guess by "stream" you mean RTP stream?  In which case by "mix"
> > > > you perhaps mean that the left and right channels are placed in a
> > > > single RTP stream?  What do you mean when you describe some audio
> > > > captures as "independent" - are you thinking they come from
> > > > different rooms?
> > > >
> > > > I think in many respects audio distribution and spatial audio
> > > > layout is at least as difficult as video layout, and has some
> > > > unique issues.  For one thing, you need to sort out how you
> > > > should place the audio from human participants who are not on
> > > > camera, and what should happen later on if some of those
> > > > participants are shown.
> > > >
> > > > I suggest it is necessary to be very careful with terminology.
> > > > In particular, I think it is important to distinguish composition
> > > > from RTP transmission.
> > > >
> > > > Regards,
> > > > Stephen Botzko
> > > >
> > > > On Mon, Aug 15, 2011 at 5:45 PM, Charles Eckel (eckelcu) wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > > > > Sent: Monday, August 15, 2011 2:14 PM
> > > > > To: Charles Eckel (eckelcu)
> > > > > Cc: Paul Kyzivat; clue@ietf.org
> > > > > Subject: Re: [clue] continuing "layout" discussion
> > > > >
> > > > > Inline
> > > > >
> > > > > On Mon, Aug 15, 2011 at 4:21 PM, Charles Eckel (eckelcu) wrote:
> > > > >
> > > > > Please see inline.
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On
> > > > > > Behalf Of Paul Kyzivat
> > > > > > Sent: Thursday, August 11, 2011 6:02 AM
> > > > > > To: clue@ietf.org
> > > > > > Subject: Re: [clue] continuing "layout" discussion
> > > > > >
> > > > > > Inline
> > > > > >
> > > > > > On 8/10/11 5:49 PM, Duckworth, Mark wrote:
> > > > > > >> -----Original Message-----
> > > > > > >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org]
> > > > > > >> On Behalf Of Paul Kyzivat
> > > > > > >> Sent: Tuesday, August 09, 2011 9:03 AM
> > > > > > >> To: clue@ietf.org
> > > > > > >> Subject: Re: [clue] continuing "layout" discussion
> > > > > > >
> > > > > > >>> 4 - multi stream media format - what the streams mean with
> > > > > > >>> respect to each other, regardless of the actual content on
> > > > > > >>> the streams.  For audio, examples are stereo, 5.1
> > > > > > >>> surround, binaural, linear array.  (linear array is
> > > > > > >>> described in the clue framework document).  Perhaps 3D
> > > > > > >>> video formats would also fit in this category.  This
> > > > > > >>> information is needed in order to properly render the
> > > > > > >>> media into light and sound for human observers.  I see
> > > > > > >>> this at the same level as identifying a codec, independent
> > > > > > >>> of the audio or video content carried on the streams, and
> > > > > > >>> independent of how any composition of sources is done.
> > > > >
> > > > > I do not think this is necessarily true. Taking audio as an
> > > > > example, you could have two audio streams that are mixed to form
> > > > > a single stereo audio stream, or you could have them as two
> > > > > independent (not mixed) streams that are associated with each
> > > > > other by some grouping mechanism.  This group would be
> > > > > categorized as being stereo audio, with one audio stream being
> > > > > the left and the other the right.  The codec used for each could
> > > > > be different, though I agree they would typically be the same.
> > > > > Consequently, I think of an attribute such as "stereo" as being
> > > > > more of a grouping concept, where the group may consist of:
> > > > > - multiple independent streams, each with potentially its own
> > > > > spatial orientation, codec, bandwidth, etc.,
> > > > > - a single mixed stream
> > > > >
> > > > > [sb] I do not understand this distinction.  What do you mean
> > > > > when you say "two audio streams that are mixed to form a single
> > > > > stereo stream", and how is this different from the left and
> > > > > right grouping?
> > > >
> > > > In one case they are mixed by the source of the stream into a
> > > > single stream, and in another they are sent as two separate
> > > > streams by the source.  The end result once rendered at the
> > > > receiver may be the same, but what is sent is different. This
> > > > example with audio is perhaps too simple. If you think of it as
> > > > video that is composed into a single video stream vs. multiple
> > > > video streams that are sent individually, the difference may be
> > > > more clear.
> > > >
> > > > Cheers,
> > > > Charles
> > > >
> > > > > Cheers,
> > > > > Charles
> > > > >
> > > > > > >> I was with you all the way until 4. That one I don't
> > > > > > >> understand.
> > > > > > >> The name you chose for this has connotations for me, but
> > > > > > >> isn't fully in harmony with the definitions you give:
> > > > > > >
> > > > > > > I'm happy to change the name if you have a suggestion
> > > > > >
> > > > > > Not yet. Maybe once the concepts are more clearly defined I
> > > > > > will have an opinion.
> > > > > >
> > > > > > >> If we consider audio, it makes sense that multiple streams
> > > > > > >> can be rendered as if they came from different physical
> > > > > > >> locations in the receiving room. That can be done by the
> > > > > > >> receiver if it gets those streams separately, and has
> > > > > > >> information about their intended relationships. It can also
> > > > > > >> be done by the sender or MCU and passed on to the receiver
> > > > > > >> as a single stream with stereo or binaural coding.
> > > > > > >
> > > > > > > Yes.  It could also be done by the sender using the "linear
> > > > > > > array" audio channel format.  Maybe it is true that stereo
> > > > > > > or binaural audio channels would always be sent as a single
> > > > > > > stream, but I was not assuming that yet, at least not in
> > > > > > > general when you consider other types too, such as linear
> > > > > > > array channels.
> > > > > >
> > > > > > >> So it seems to me you have two concepts here, not one. One
> > > > > > >> has to do with describing the relationships between
> > > > > > >> streams, and the other has to do with the encoding of
> > > > > > >> spatial relationships *within* a single stream.
> > > > > > >
> > > > > > > Maybe that is a better way to describe it, if you assume
> > > > > > > multi-channel audio is always sent with all the channels in
> > > > > > > the same RTP stream.  Is that what you mean?
> > > > > > >
> > > > > > > I was considering the linear array format to be another type
> > > > > > > of multi-channel audio, and I know people want to be able to
> > > > > > > send each channel in a separate RTP stream.  So it doesn't
> > > > > > > quite fit with how you separate the two concepts.  In my
> > > > > > > view, identifying the separate channels by what they mean is
> > > > > > > the same concept for linear array and stereo.  For example
> > > > > > > "this channel is left, this channel is center, this channel
> > > > > > > is right".  To me, that is the same concept for identifying
> > > > > > > channels whether or not they are carried in the same RTP
> > > > > > > stream.
> > > > > > >
> > > > > > > Maybe we are thinking the same thing but getting confused by
> > > > > > > terminology about channels vs. streams.
> > > > > >
> > > > > > Maybe. Let me try to restate what I now think you are saying:
> > > > > >
> > > > > > The audio may consist of several "channels".
> > > > > >
> > > > > > Each channel may be sent over its own RTP stream, or multiple
> > > > > > channels may be multiplexed over an RTP stream.
> > > > > >
> > > > > > I guess much of this can also apply to video.
> > > > > >
> > > > > > When there are exactly two audio channels, they may be encoded
> > > > > > as "stereo" or "binaural", which then affects how they should
> > > > > > be rendered by the recipient. In these cases the primary info
> > > > > > that is required about the individual channels is which is
> > > > > > left and which is right. (And which perspective to use in
> > > > > > interpreting left and right.)
> > > > > >
> > > > > > For other multi-channel cases more information is required
> > > > > > about the role of each channel in order to properly render
> > > > > > them.
> > > > > >
> > > > > > Thanks,
> > > > > > Paul
> > > > > >
> > > > > > >> Or, are you asserting that stereo and binaural are simply
> > > > > > >> ways to encode multiple logical streams in one RTP stream,
> > > > > > >> together with their spatial relationships?
> > > > > > >
> > > > > > > No, that is not what I'm trying to say.
> > > > > > > > > > > > > > Mark > > > > > > > > > _______________________________________________ > > > > > > > clue mailing list > > > > > > > clue@ietf.org > > > > > > > https://www.ietf.org/mailman/listinfo/clue > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > clue mailing list > > > > > > clue@ietf.org > > > > > > https://www.ietf.org/mailman/listinfo/clue > > > > > _______________________________________________ > > > > > clue mailing list > > > > > clue@ietf.org > > > > > https://www.ietf.org/mailman/listinfo/clue > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > clue mailing list > > clue@ietf.org > > https://www.ietf.org/mailman/listinfo/clue**** > > **** > > ** ** > --20cf307c9f46103a2304aab53e6b Content-Type: text/html; charset=windows-1252 Content-Transfer-Encoding: quoted-printable I didn't say that we needed to specify how rendering is done.=A0= However, the information we are providing is intended to enable the render= ing of an interoperable "being there experience".
So we do need to know if the information we signal actually enables such rendering. The best way to ensure that is to construct a rendering method that would work, and which would need the information we propose to provide. We don't need to specify the rendering method; we don't even have to describe it on the list. But if we all can't construct a reasonable method easily, then CLUE will have failed.
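In that spirit, the channels-vs-streams separation the earlier thread converged on can be sketched as a toy data model (all names here are invented for illustration, nothing from the framework draft): the rendering-relevant channel roles come out the same whether the channels travel in one RTP stream or several.

```python
from dataclasses import dataclass

@dataclass
class Channel:
    role: str          # e.g. "left", "right", "center"

@dataclass
class RtpStream:
    channels: list     # one entry = dedicated stream; several = multiplexed

@dataclass
class AudioGroup:
    kind: str          # e.g. "stereo", "linear-array"
    streams: list

left, right = Channel("left"), Channel("right")

# Case A: two independent RTP streams, associated only by the group.
independent = AudioGroup("stereo", [RtpStream([left]), RtpStream([right])])

# Case B: both channels carried together in a single RTP stream.
single = AudioGroup("stereo", [RtpStream([left, right])])

def channel_roles(group):
    """The rendering-relevant information is identical either way."""
    return sorted(c.role for s in group.streams for c in s.channels)

print(channel_roles(independent) == channel_roles(single))  # True
```

The point of the sketch is that "how the channels are packed into RTP streams" is a transport detail, while the role labels are what a renderer actually needs.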

Regards,
Stephen

On Wed, Aug 17, 2011 at 11:12 AM, Roni Even <Even.roni@huawei.com> wrote:

Steve,

I also would not agree that we need to specify how rendering is done once the streams arrive at the receiver. I think that the receiver should be able to provide information about its rendering capabilities, which may be helpful for the sender to create better content.

As for my comment on the 2 and 3 case: the current layout discussion is on audio with 2 channels. I was saying that this is a simple case, and if we want to discuss layout it should be clear what it means for multi-video and audio cases.

My comment on the current framework is that it dives immediately into the model, and the examples talk about a three-camera left-to-right case. I was questioning how this model scales.

I agree with your comment about the number of capture devices. By "capture field" I was trying to raise the issue of the view port (is the focus on the first row, the second row, multiview, ...). Maybe it can be described by the framework, but it is not explained how.

Roni

From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
Sent: Wednesday, August 17, 2011 5:06 PM
To: Roni Even
Cc: Charles Eckel (eckelcu); clue@ietf.org
Subject: Re: [clue] continuing "layout" discussion

For audio at least (and probably video) I agree you need the number and placement of captures, but I see no value in knowing the number of capture devices. For instance, the stereo encoding we started with might be derived from a microphone in front of each of two participants, or it might be derived from a large microphone array. For receivers, there is no difference, so I see no reason to signal it. The 3D telepresence demonstration technology in the EU used 3 cameras to derive each 3D view (I think), but it could have also been done with a Kinect-style single camera. Again, the number of cameras used to make the capture would make no difference to a receiver.

I don't know what you mean by the "capture field" (or what specifically about it you think we ought to know), so at present I have no opinion as to whether it is needed or not. I agree that mixing of sources from multiple rooms needs more attention (and I think that is what the "layout" conversation should be chiefly about).

I think the framework draft is not limited to 2 channels for audio and 3 cameras. I haven't seen any issues for an N-image video wall and an associated M-channel audio capture, as long as you stay within the stated assumption in the model that the audio/video are on one wall. Apparently there is work to extend the model to handle multiple video walls, and of course the audio will need to be adjusted for that.

BTW, I would challenge anyone who is either proposing an alternative framework or extending this draft to build the needed rendering equations to get the right sound field from a standard arrangement of speakers. Though rendering itself is out of scope, we do have enablement requirements for rendering. There are lots of things we could signal, but if their use in rendering is not easily understood, then we will not achieve interoperability.

Regards,
Stephen


On Wed, Aug 17, 2011 at 12:51 AM, Roni Even <Even.roni@huawei.com> wrote:

Hi Steve,

The two channel case is a simple private case, and using it to define the required information from the capture and render side is like saying that I proved a mathematical induction for n=2, therefore it works for every n. I see this issue since we are using an n=2 channels audio example and an n=3 cameras left-to-right example to provide a solution that should scale to any n.

What I was trying to say is that the current way we describe the stream by a number is not enough if we want to get to the "being there" experience.

We need to see what the dimensions are that the model needs in order to be able to convey the capture information and the rendering capabilities for both audio and video. I think that there are some similarities, since we have the number of streams, how the capture is done, and what the rendering device is as the basic information.

I think that before going to the framework model it may be beneficial to create a list of the parameters we need to convey, provide a term for each group of parameters, and have a way to define them in the model. For example, for the capture we have the number of capture devices, the arrangement (spatial), the encoding process (including mixing if multiple inputs), the capture field, and others.

Regards

Roni

From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
Sent: Wednesday, August 17, 2011 4:37 AM
To: Roni Even
Cc: Charles Eckel (eckelcu); clue@ietf.org
Subject: Re: [clue] continuing "layout" discussion

Hi Roni

For this particular discussion, all of the two channel transmissions are "stereo"; they are just transported differently.

As far as the framework draft is concerned, the various microphone arrangements are accounted for by the signaling of the 1-100 indices for each channel.

Binaural is something else: either an HRTF function is applied to the two channels prior to rendering (which was Christer's case with the central rendering server), or you have a dummy head with microphones in the ears in the telepresence room to make the capture. Not sure if we need to distinguish the capture and render cases right now.
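The 1-100 left-to-right indices can be made concrete with a toy rendering rule: a constant-power pan law mapping a channel's index to gains for a two-speaker playback room. This is only an illustrative sketch under stated assumptions (the pan law and the function name are mine, not anything the framework draft specifies):

```python
import math

def pan_gains(index: int) -> tuple[float, float]:
    """Constant-power pan law for a two-speaker playback room.

    `index` is a CLUE-framework-style 1-100 left-to-right position for a
    channel; the pan law itself is an illustrative choice, not something
    the framework draft specifies.
    """
    if not 1 <= index <= 100:
        raise ValueError("index must be in 1..100")
    x = (index - 1) / 99.0          # normalize: 0.0 = far left, 1.0 = far right
    theta = x * math.pi / 2
    return (math.cos(theta), math.sin(theta))  # (left gain, right gain)

# A far-left channel drives only the left speaker, and the total power
# (left^2 + right^2) stays constant wherever the channel sits.
gl, gr = pan_gains(1)
print((gl, gr))  # (1.0, 0.0)
```

Any rendering method built on the signaled indices would reduce to some rule of this shape, which is exactly the "can we construct a reasonable method" sanity check discussed above.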

Regards,
Stephen

On Tue, Aug 16, 2011 at 7:34 PM, Roni Even <Even.roni@huawei.com> wrote:

Hi guys,
In case 1, according to RFC 3551 (section 4.1), 2 channels in the rtpmap means left and right channels, described as stereo. Are you saying that for the 2 and 2b cases you also assume stereo capture, or can it be any other way of creating the two audio streams from the same room (binaural recording (not common), or some other arrangement of the microphones)? But this talks about the capture side.

I think that Christer talked about the rendering side and not only the capture side.

Roni


> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> Charles Eckel (eckelcu)
> Sent: Wednesday, August 17, 2011 12:40 AM
> To: Stephen Botzko
> Cc: clue@ietf.org
> Subject: Re: [clue] continuing "layout" discussion
>
> Agreed. The difference I am trying to point out is that in (1), the
> information you need to describe the audio stream for appropriate
> rendering is already handled quite well by existing SIP/SDP/RTP and most
> implementations, whereas you need CLUE for (2) and (2b).
>
> Cheers,
> Charles
>
> > -----Original Message-----
> > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> > Sent: Tuesday, August 16, 2011 2:14 PM
> > To: Charles Eckel (eckelcu)
> > Cc: Paul Kyzivat; clue@ietf.org
> > Subject: Re: [clue] continuing "layout" discussion
> >
> > Well, the audio in (1) and (2b) is certainly packetized differently.
> > But not compressed differently (unless you are assuming that the signal
> > in (1) is jointly encoded stereo - which it could be I guess, but it
> > would be unusual for telepresence systems). Also, the audio in (1) is
> > not mixed, no matter how it is encoded.
> >
> > In any event, I believe that the difference between (1) and (2) and
> > (2b) is really a transport question that has nothing to do with layout.
> > The same information is needed to enable proper rendering, and once the
> > streams are received, they are rendered in precisely the same way.
> >
> > Regards,
> > Stephen Botzko
> >
> > On Tue, Aug 16, 2011 at 4:23 PM, Charles Eckel (eckelcu)
> > <eckelcu@cisco.com> wrote:
> >
> >     I am distinguishing between:
> >
> >     (1) a single RTP stream that consists of a single stereo audio
> >     stream
> >     (2) two RTP streams, one that contains left speaker audio and the
> >     other that contains right speaker audio
> >
> >     (2) could also be transmitted in a single RTP stream using SSRC
> >     multiplexing. Let me call that (2b).
> >     (2) and (2b) are essentially the same; just the RTP mechanism
> >     employed is different.
> >     (1) is different from (2) and (2b) in that the audio signal encoded
> >     is actually different.
> >
> >     Cheers,
> >     Charles
> >
> >     > -----Original Message-----
> >     > From: Stephen Botzko [mailto:stephen.botzko@gmail.com]
> >     > Sent: Tuesday, August 16, 2011 6:20 AM
> >     > To: Charles Eckel (eckelcu)
> >     > Cc: Paul Kyzivat; clue@ietf.org
> >     > Subject: Re: [clue] continuing "layout" discussion
> >     >
> >     > I guess by "stream" you are meaning RTP stream? In which case by
> >     > "mix" you perhaps mean that the left and right channels are
> >     > placed in a single RTP stream? What do you mean when you describe
> >     > some audio captures as "independent" - are you thinking they come
> >     > from different rooms?
> >     >
> >     > I think in many respects audio distribution and spatial audio
> >     > layout is at least as difficult as video layout, and has some
> >     > unique issues. For one thing, you need to sort out how you should
> >     > place the audio from human participants who are not on camera,
> >     > and what should happen later on if some of those participants are
> >     > shown.
> >     >
> >     > I suggest it is necessary to be very careful with terminology. In
> >     > particular, I think it is important to distinguish composition
> >     > from RTP transmission.
> >     >
> >     > Regards,
> >     > Stephen Botzko
> >     >
> >     > On Mon, Aug 15, 2011 at 5:45 PM, Charles Eckel (eckelcu)
> >     > <eckelcu@cisco.com> wrote:
> >     >
> >     >       > >> 4 - multi stream media format - what the streams mean
> >     >       > >> with respect to each other, regardless of the actual
> >     >       > >> content on the streams. For audio, examples are
> >     >       > >> stereo, 5.1 surround, binaural, linear array. (linear
> >     >       > >> array is described in the clue framework document).
> >     >       > >> Perhaps 3D video formats would also fit in this
> >     >       > >> category.
>
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue
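Charles's (1)/(2)/(2b) distinction can be sketched as SDP fragments. These are illustrative only: the ports, payload types, and the L16 codec choice are invented for the example, and (2b) uses the RFC 5576 source-specific attributes.

```
(1) one RTP stream carrying a jointly coded stereo signal
    (RFC 3551 rtpmap channel count of 2, with left/right ordering):

        m=audio 49170 RTP/AVP 96
        a=rtpmap:96 L16/48000/2

(2) two RTP streams, one per channel, each on its own m-line:

        m=audio 49172 RTP/AVP 97
        a=rtpmap:97 L16/48000
        m=audio 49174 RTP/AVP 97
        a=rtpmap:97 L16/48000

(2b) one m-line, two sources multiplexed by SSRC (RFC 5576 attributes):

        m=audio 49176 RTP/AVP 97
        a=rtpmap:97 L16/48000
        a=ssrc:1111 cname:left@example.com
        a=ssrc:2222 cname:right@example.com
```

In (1) the left/right relationship is implicit in the codec's channel ordering; in (2) and (2b) it would have to be signaled separately, which is the gap the CLUE work is meant to fill.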

--20cf307c9f46103a2304aab53e6b-- From mary.ietf.barnes@gmail.com Fri Aug 19 11:34:32 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 6E8F611E80AF for ; Fri, 19 Aug 2011 11:34:32 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -103.431 X-Spam-Level: X-Spam-Status: No, score=-103.431 tagged_above=-999 required=5 tests=[AWL=0.167, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id f+HYMd+PYdw7 for ; Fri, 19 Aug 2011 11:34:31 -0700 (PDT) Received: from mail-vx0-f172.google.com (mail-vx0-f172.google.com [209.85.220.172]) by ietfa.amsl.com (Postfix) with ESMTP id 9E80911E80AE for ; Fri, 19 Aug 2011 11:34:31 -0700 (PDT) Received: by vxi29 with SMTP id 29so3523951vxi.31 for ; Fri, 19 Aug 2011 11:35:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:cc:content-type; bh=CdTdRT2D0SLQinMbLKN2uDI/TMMU0yRbsex6yoTPtPI=; b=VS0phO7CjHMC5unAdXVOALIXqTNC5MlF5pfliAYJK219poz5axBrpYgSijM+iUR3vG M5Ui0Ag27QsgEABlcCxmkETzPJD5ipWhHGf1uQchep4ESw4ixaD5HqHUcve5dcpG5qpr 77KdCwqlzQT+fVGChVQsaEaFxAEbePvgOCAro= MIME-Version: 1.0 Received: by 10.52.173.208 with SMTP id bm16mr110205vdc.49.1313778928825; Fri, 19 Aug 2011 11:35:28 -0700 (PDT) Received: by 10.52.164.197 with HTTP; Fri, 19 Aug 2011 11:35:28 -0700 (PDT) Date: Fri, 19 Aug 2011 13:35:28 -0500 Message-ID: From: Mary Barnes To: CLUE Content-Type: multipart/alternative; boundary=bcaec5196ddd5db70e04aadffea2 Subject: [clue] Pre-WGLC review for draft-ietf-clue-telepresence-use-cases X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Aug 2011 18:34:32 -0000 --bcaec5196ddd5db70e04aadffea2 Content-Type: text/plain; charset=ISO-8859-1 Hi folks, As mentioned during the IETF-81 WG session we would like to start a pre-WGLC review for the use cases: http://datatracker.ietf.org/doc/draft-ietf-clue-telepresence-use-cases/ The objective is for folks to thoroughly review the document to assess how much work (if any) is needed prior to starting an official WGLC. We do not anticipate progressing the document as soon as the WGLC is complete, as we want to wait until the framework is more mature to ensure that the use cases are adequate. Please let the chairs know offlist if you would like to be a dedicated reviewer for this document. We are asking for explicit volunteers (or we will recruit them) to ensure that at least 3 people in the WG have thoroughly reviewed the document. Obviously, all WG members should review and provide comments on the mailing list. The deadline for the reviews is Sept. 9th (3 weeks' time). Thanks, Mary CLUE WG co-chair --bcaec5196ddd5db70e04aadffea2 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Hi folks,

--bcaec5196ddd5db70e04aadffea2-- From pkyzivat@alum.mit.edu Sat Aug 20 15:14:51 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2749121F85B5 for ; Sat, 20 Aug 2011 15:14:51 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -2.547 X-Spam-Level: X-Spam-Status: No, score=-2.547 tagged_above=-999 required=5 tests=[AWL=0.052, BAYES_00=-2.599] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id SF2-ViAKJuXc for ; Sat, 20 Aug 2011 15:14:50 -0700 (PDT) Received: from qmta13.westchester.pa.mail.comcast.net (qmta13.westchester.pa.mail.comcast.net [76.96.59.243]) by ietfa.amsl.com (Postfix) with ESMTP id 5DB5D21F8549 for ; Sat, 20 Aug 2011 15:14:48 -0700 (PDT) Received: from omta24.westchester.pa.mail.comcast.net ([76.96.62.76]) by qmta13.westchester.pa.mail.comcast.net with comcast id NmFL1h0021ei1Bg5DmFpDh; Sat, 20 Aug 2011 22:15:49 +0000 Received: from Paul-Kyzivats-MacBook-Pro.local ([24.62.109.41]) by omta24.westchester.pa.mail.comcast.net with comcast id NmFn1h0080tdiYw3kmFnne; Sat, 20 Aug 2011 22:15:48 +0000 Message-ID: <4E503211.9080504@alum.mit.edu> Date: Sat, 20 Aug 2011 18:15:45 -0400 From: Paul Kyzivat User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:5.0) Gecko/20110624 Thunderbird/5.0 MIME-Version: 1.0 To: CLUE Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: [clue] CLUE minutes for IETF-81 have been posted X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 20 Aug 2011 22:14:51 -0000 The minutes for the CLUE meetings at ietf-81 have now been posted. 
You can find them at: http://www.ietf.org/proceedings/81/minutes/clue.html Please let the chairs know if something is wrong. Thanks, Paul From allyn@cisco.com Sun Aug 21 16:03:38 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 5156D21F86C1 for ; Sun, 21 Aug 2011 16:03:38 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -3.196 X-Spam-Level: X-Spam-Status: No, score=-3.196 tagged_above=-999 required=5 tests=[AWL=-0.598, BAYES_00=-2.599, HTML_MESSAGE=0.001] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id QhKTq4Stqhpc for ; Sun, 21 Aug 2011 16:03:36 -0700 (PDT) Received: from rcdn-iport-5.cisco.com (rcdn-iport-5.cisco.com [173.37.86.76]) by ietfa.amsl.com (Postfix) with ESMTP id 8977A21F86EA for ; Sun, 21 Aug 2011 16:03:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=cisco.com; i=allyn@cisco.com; l=14883; q=dns/txt; s=iport; t=1313967880; x=1315177480; h=mime-version:subject:date:message-id:from:to; bh=s7m08GtZYbfQLeGVi8URKUO3aOIw8z2pxmzJml5WubQ=; b=kslYpoGsCwnrWGUOLejVjloxCFf/oG6vQtdCwuhyFDgHfhYL/RGn/xqp mb2fBmo3r+mbAFfOwWEciN6wXEkLqY7aF7A7ziWMVb4ua+rxqPCswDPV2 bxKl8AmzYz8ekVOERawDQ9J6u9rmPQqyv9ePGCTMEEjCDT7xsPkc0Mo4h 8=; X-Files: ATT3154706.txt : 146 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AtEAANCNUU6rRDoG/2dsb2JhbABBgk2VZI9id4FAAQEBAQMBAQEPAQkRAzcHFwYBCBEDAQEBCwYXAQcBJR8HAQEFBAEEEwgBGYdTliiBIwGdYIVpXwSHYIYvihqEYocf X-IronPort-AV: E=Sophos;i="4.68,260,1312156800"; d="txt'?scan'208,217";a="15144905" Received: from mtv-core-1.cisco.com ([171.68.58.6]) by rcdn-iport-5.cisco.com with ESMTP; 21 Aug 2011 23:04:39 +0000 Received: from xbh-sjc-221.amer.cisco.com (xbh-sjc-221.cisco.com [128.107.191.63]) by mtv-core-1.cisco.com (8.14.3/8.14.3) with ESMTP id p7LN4dJX008803 for ; Sun, 21 Aug 
2011 23:04:39 GMT Received: from xmb-sjc-221.amer.cisco.com ([128.107.191.80]) by xbh-sjc-221.amer.cisco.com with Microsoft SMTPSVC(6.0.3790.4675); Sun, 21 Aug 2011 16:04:38 -0700 x-mimeole: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----_=_NextPart_001_01CC6056.B3C41465" Date: Sun, 21 Aug 2011 16:04:36 -0700 Message-ID: <9AC2C4348FD86B4BB1F8FA9C5E3A5EDC05602916@xmb-sjc-221.amer.cisco.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [AVTCORE] Review request: I-D Action:draft-alvestrand-one-rtp-01.txt Thread-Index: AcxcxOeCFAg1HmUZT8+Q16qlp+SHOQDkbDuA From: "Allyn Romanow (allyn)" To: X-OriginalArrivalTime: 21 Aug 2011 23:04:38.0883 (UTC) FILETIME=[B3EA4330:01CC6056] Subject: [clue] FW: [AVTCORE] Review request: I-D Action:draft-alvestrand-one-rtp-01.txt X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 21 Aug 2011 23:03:38 -0000 This is a multi-part message in MIME format. ------_=_NextPart_001_01CC6056.B3C41465 Content-Type: multipart/alternative; boundary="----_=_NextPart_002_01CC6056.B3C41465" ------_=_NextPart_002_01CC6056.B3C41465 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: quoted-printable Relevant draft, in case you haven't seen it from other lists. From: avt-bounces@ietf.org [mailto:avt-bounces@ietf.org] On Behalf Of Harald Alvestrand Sent: Wednesday, August 17, 2011 3:03 AM To: 'AVT Core WG' Subject: [AVTCORE] Review request: I-D Action:draft-alvestrand-one-rtp-01.txt After discussions in Quebec City, I tried to summarize results of those discussions into an actionable document. I may have succeeded or failed at this.
I have asked the MMUSIC mailing list to review the document (being unsure whether it belonged in MMUSIC or in AVTCORE) after an initial review in RTCWEB. For participants in both MMUSIC and AVTCORE, it may be better to discuss in MMUSIC only. All comments, feedback and suggestions for process welcome! Harald -------- Original Message -------- Subject: I-D Action: draft-alvestrand-one-rtp-01.txt Date: Wed, 17 Aug 2011 01:48:18 -0700 From: internet-drafts@ietf.org Reply-To: internet-drafts@ietf.org To: i-d-announce@ietf.org A New Internet-Draft is available from the on-line Internet-Drafts directories. Title : SDP Grouping for Single RTP Sessions Author(s) : Harald Tveit Alvestrand Filename : draft-alvestrand-one-rtp-01.txt Pages : 12 Date : 2011-08-17 This document describes an extension to the Session Description Protocol (SDP) to describe RTP sessions where media of multiple top level types, for example audio and video, are carried in the same RTP session. This document is presented to the RTCWEB, AVTCORE and MMUSIC WGs for consideration. A URL for this Internet-Draft is: http://www.ietf.org/internet-drafts/draft-alvestrand-one-rtp-01.txt Internet-Drafts are also available by anonymous FTP at: ftp://ftp.ietf.org/internet-drafts/ This Internet-Draft can be retrieved at: ftp://ftp.ietf.org/internet-drafts/draft-alvestrand-one-rtp-01.txt _______________________________________________ I-D-Announce mailing list I-D-Announce@ietf.org https://www.ietf.org/mailman/listinfo/i-d-announce Internet-Draft directories: http://www.ietf.org/shadow.html or ftp://ftp.ietf.org/ietf/1shadow-sites.txt ------_=_NextPart_002_01CC6056.B3C41465 Content-Type: text/html; charset="US-ASCII" Content-Transfer-Encoding: quoted-printable
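Alvestrand's draft, announced above, is about grouping several m= lines (for example audio and video) into a single RTP session via SDP. As a rough illustration of the kind of signalling involved, here is a small parser for RFC 5888-style "a=group" lines; note that the "SINGLE" semantics token and the sample SDP body are invented for this sketch and are not the draft's actual syntax:

```python
# Sketch: parse SDP "a=group" lines (RFC 5888 grouping framework) to see
# which m= lines are grouped together. The "SINGLE" semantics token below
# is hypothetical; draft-alvestrand-one-rtp defines its own mechanism.

def parse_group_lines(sdp: str) -> dict:
    """Return {semantics: [mid, ...]} for every a=group line in the SDP."""
    groups = {}
    for line in sdp.splitlines():
        if line.startswith("a=group:"):
            semantics, *mids = line[len("a=group:"):].split()
            groups[semantics] = mids
    return groups

sdp = """v=0
m=audio 49170 RTP/AVP 0
a=mid:audio
m=video 49172 RTP/AVP 96
a=mid:video
a=group:SINGLE audio video
"""
print(parse_group_lines(sdp))  # {'SINGLE': ['audio', 'video']}
```

The a=mid identifiers are what let a grouping attribute refer back to individual m= lines, which is the basic plumbing any "one RTP session" scheme along these lines would rely on.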

------_=_NextPart_002_01CC6056.B3C41465-- ------_=_NextPart_001_01CC6056.B3C41465 Content-Type: text/plain; name="ATT3154706.txt" Content-Transfer-Encoding: base64 Content-Description: ATT3154706.txt Content-Disposition: inline; filename="ATT3154706.txt" X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18NCkF1ZGlvL1Zp ZGVvIFRyYW5zcG9ydCBDb3JlIE1haW50ZW5hbmNlDQphdnRAaWV0Zi5vcmcNCmh0dHBzOi8vd3d3 LmlldGYub3JnL21haWxtYW4vbGlzdGluZm8vYXZ0DQo= ------_=_NextPart_001_01CC6056.B3C41465-- From internet-drafts@ietf.org Sun Aug 21 18:49:51 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 596C821F8531; Sun, 21 Aug 2011 18:49:51 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -102.531 X-Spam-Level: X-Spam-Status: No, score=-102.531 tagged_above=-999 required=5 tests=[AWL=0.068, BAYES_00=-2.599, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id tdfJwnqYia18; Sun, 21 Aug 2011 18:49:50 -0700 (PDT) Received: from ietfa.amsl.com (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id EABD421F86AB; Sun, 21 Aug 2011 18:49:50 -0700 (PDT) MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable From: internet-drafts@ietf.org To: i-d-announce@ietf.org X-Test-IDTracker: no X-IETF-IDTracker: 3.59 Message-ID: <20110822014950.7146.35459.idtracker@ietfa.amsl.com> Date: Sun, 21 Aug 2011 18:49:50 -0700 Cc: clue@ietf.org Subject: [clue] I-D Action: draft-ietf-clue-telepresence-requirements-00.txt X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Aug 2011 01:49:51 -0000 A New 
Internet-Draft is available from the on-line Internet-Drafts directories. This draft is a work item of the ControLling mUltiple streams for tElepresence Working Group of the IETF. Title : Requirements for Telepresence Multi-Streams Author(s) : Allyn Romanow Stephen Botzko Filename : draft-ietf-clue-telepresence-requirements-00.txt Pages : 11 Date : 2011-08-21 This memo discusses the requirements for a specification that enables telepresence interoperability, by describing the relationship between multiple RTP streams. In addition, the problem statement and definitions are also covered herein. A URL for this Internet-Draft is: http://www.ietf.org/internet-drafts/draft-ietf-clue-telepresence-requirements-00.txt Internet-Drafts are also available by anonymous FTP at: ftp://ftp.ietf.org/internet-drafts/ This Internet-Draft can be retrieved at: ftp://ftp.ietf.org/internet-drafts/draft-ietf-clue-telepresence-requirements-00.txt From mary.ietf.barnes@gmail.com Mon Aug 22 15:52:08 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 4C43621F8B87 for ; Mon, 22 Aug 2011 15:52:08 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -103.442 X-Spam-Level: X-Spam-Status: No, score=-103.442 tagged_above=-999 required=5 tests=[AWL=0.156, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 6Ex3CEpvSSZy for ; Mon, 22 Aug 2011 15:52:07 -0700 (PDT) Received: from mail-vx0-f172.google.com (mail-vx0-f172.google.com [209.85.220.172]) by ietfa.amsl.com (Postfix) with ESMTP id EE42221F8B1D for ; Mon, 22 Aug 2011 15:52:03 -0700 (PDT) Received: by vxi29 with SMTP id 29so5829918vxi.31 for ; Mon, 22 Aug 2011 15:53:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256;
c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:cc:content-type; bh=3ulcjJ7FLRLkl9RoaNSaqgxGAAo2TDac/WIMLAj6TYk=; b=cRL+u9P8CdyZrjkYhDQL513LWVNKjaZnT3mpF+SNYEG1X2MxUDD6ree5CIePYxvOII wYElT+cl7TnCreU9ihDvTHmo/lPVJo7QZzcl2KZu5KVWvWD8aJIw915HcWnHDHCUkOEd cqNuG6RnZ0qNFJCARinksqRX9CbJw3z93fK5I= MIME-Version: 1.0 Received: by 10.52.21.65 with SMTP id t1mr2843413vde.183.1314053589824; Mon, 22 Aug 2011 15:53:09 -0700 (PDT) Received: by 10.52.160.36 with HTTP; Mon, 22 Aug 2011 15:53:09 -0700 (PDT) Date: Mon, 22 Aug 2011 17:53:09 -0500 Message-ID: From: Mary Barnes To: CLUE Content-Type: multipart/alternative; boundary=20cf307d05a66ff69504ab1ff15c Subject: [clue] Reminder: CLUE WG Virtual Interim Meeting X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 22 Aug 2011 22:52:08 -0000 --20cf307d05a66ff69504ab1ff15c Content-Type: text/plain; charset=ISO-8859-1 Hi all, This is a reminder of the interim meeting tomorrow. The meeting will be two hours long. We'll be focusing on the Framework. The authors will first do a summary of the areas presented at IETF-81 and then Brian will get a chance to go through the examples. The chairs do not yet have the updated charts. We'll post them on the Wiki as soon as they're available. Thanks, Mary. On Thu, Aug 11, 2011 at 5:58 PM, Mary Barnes wrote: > As a reminder, all the materials for the meeting will be available on the > CLUE WG wiki: > http://trac.tools.ietf.org/wg/clue/trac/wiki > > There is a tentative agenda available at this time. > > Regards, > Mary. > > > On Thu, Aug 11, 2011 at 10:19 AM, Mary Barnes wrote: > >> Hello , >> >> IETF Secretariat invites you to attend this online meeting. 
>> >> Topic: CLUE WG Virtual Interim Meeting >> Date: Tuesday, August 23, 2011 >> Time: 9:00 am, Pacific Daylight Time (San Francisco, GMT-07:00) >> Meeting Number: 963 755 542 >> Meeting Password: (This meeting does not require a password.) >> >> >> ------------------------------------------------------- >> To join the online meeting (Now from mobile devices!) >> ------------------------------------------------------- >> 1. Go to >> https://workgreen.webex.com/workgreen/j.php?ED=181742197&UID=1249097532&RT=MiM0 >> 2. If requested, enter your name and email address. >> 3. If a password is required, enter the meeting password: (This meeting >> does not require a password.) >> 4. Click "Join". >> >> To view in other time zones or languages, please click the link: >> >> https://workgreen.webex.com/workgreen/j.php?ED=181742197&UID=1249097532&ORT=MiM0 >> >> ------------------------------------------------------- >> To join the audio conference only >> ------------------------------------------------------- >> To receive a call back, provide your phone number when you join the >> meeting, or call the number below and enter the access code. >> Call-in toll number (US/Canada): 1-408-792-6300 >> Global call-in numbers: >> https://workgreen.webex.com/workgreen/globalcallin.php?serviceType=MC&ED=181742197&tollFree=0 >> >> Access code:963 755 542 >> >> ------------------------------------------------------- >> For assistance >> ------------------------------------------------------- >> 1. Go to https://workgreen.webex.com/workgreen/mc >> 2. On the left navigation bar, click "Support". 
>> >> You can contact me at: >> amorris@amsl.com >> 1-510-492-4081 >> >> To add this meeting to your calendar program (for example Microsoft >> Outlook), click this link: >> >> https://workgreen.webex.com/workgreen/j.php?ED=181742197&UID=1249097532&ICS=MI&LD=1&RD=2&ST=1&SHA2=1sO7X9GoItG7qDII-/DUsH2iEIlMx8cUMEWOoPlBrjY=&RT=MiM0 >> >> The playback of UCF (Universal Communications Format) rich media files >> requires appropriate players. To view this type of rich media files in the >> meeting, please check whether you have the players installed on your >> computer by going to >> https://workgreen.webex.com/workgreen/systemdiagnosis.php. >> >> Sign up for a free trial of WebEx >> http://www.webex.com/go/mcemfreetrial >> >> http://www.webex.com >> >> CCP:+14087926300x963755542# >> >> IMPORTANT NOTICE: This WebEx service includes a feature that allows audio >> and any documents and other materials exchanged or viewed during the session >> to be recorded. By joining this session, you automatically consent to such >> recordings. If you do not consent to the recording, discuss your concerns >> with the meeting host prior to the start of the recording or do not join the >> session. Please note that any such recordings may be subject to discovery in >> the event of litigation. >> >> > --20cf307d05a66ff69504ab1ff15c Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Hi all,

--20cf307d05a66ff69504ab1ff15c-- From Christian.Groves@nteczone.com Mon Aug 22 22:57:46 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 5B9EE21F8B74 for ; Mon, 22 Aug 2011 22:57:46 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -2.599 X-Spam-Level: X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zG-CY8XXY9Cv for ; Mon, 22 Aug 2011 22:57:45 -0700 (PDT) Received: from ipmail05.adl6.internode.on.net (ipmail05.adl6.internode.on.net [150.101.137.143]) by ietfa.amsl.com (Postfix) with ESMTP id 40DED21F8B72 for ; Mon, 22 Aug 2011 22:57:44 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApMBADE+U0520T5Q/2dsb2JhbAAMKwqqWQEBAQEDAQEBNRsUBwQGEQsYCRYPCQMCAQIBFTATBgIBAYdxtkWDJ4MhBKQh Received: from ppp118-209-62-80.lns20.mel4.internode.on.net (HELO [127.0.0.1]) ([118.209.62.80]) by ipmail05.adl6.internode.on.net with ESMTP; 23 Aug 2011 15:28:49 +0930 Message-ID: <4E534181.7080705@nteczone.com> Date: Tue, 23 Aug 2011 15:58:25 +1000 From: Christian Groves User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20110812 Thunderbird/6.0 MIME-Version: 1.0 To: clue@ietf.org References: <44C6B6B2D0CF424AA90B6055548D7A61AE9B48AD@CRPMBOXPRD01.polycom.com> In-Reply-To: <44C6B6B2D0CF424AA90B6055548D7A61AE9B48AD@CRPMBOXPRD01.polycom.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [clue] continuing "layout" discussion X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Aug 2011 05:57:46 -0000 
Hello, With regard to spatial relations among streams: suppose I start a telepresence session and only use the centre screen/camera of a three-screen/camera telepresence system, i.e. it's only me in the room and the left and right screens/cameras are off. According to the CLUE framework, what do I send in terms of a capture set? a) VC1 - only one video capture; it doesn't matter which of my screens it came from. b) VC2, VC1, VC3 - VC2 and VC3 would be NULL. Having VC1 in the middle indicates that it is the centre screen. c) both are valid? Now while I'm there a person comes to join me and the left screen and camera are turned on. I guess the capture set now contains two captures, VC2 and VC1. Again, do I have to add VC3 as NULL, to indicate that it is my left and centre cameras/screens being used? I guess the far end doesn't care; it will render the VCs however it wants? Regards, Christian On 6/08/2011 7:02 AM, Duckworth, Mark wrote: > I'd like to continue the discussion about layout and rendering issues. There are many separate but related things involved. I want to break it down into separate topics, and see how the topics are related to each other. And then we can discuss what CLUE needs to deal with and what is not in scope. > > I don't know if I'm using the best terms for each topic. If not, please suggest better terms. My use of the term "layout" here is not consistent with draft-wenger-clue-definitions-01, because I don't limit it to the rendering side. But my use of the terms "render" and "source selection" is consistent with that draft. > > 1- video layout composed arrangement within a stream - when multiple video sources are composed into one stream, they are arranged in some way. Typical examples are 2x2 grid, 3x3 grid, 1+5 (1 large plus 5 small), 1+PiP (1 large plus one or more picture-in-picture). These arrangements can be selected automatically or based on user input. Arrangements can change over time.
Identifying this composed arrangement is separate from identifying or selecting which video images are used to fill in the composition. These arrangements can be constructed by an endpoint sending video, by an MCU, or by an endpoint receiving video as it renders to a display. > > 2 - source selection and identification - when a device is composing a stream made up of other sources, it needs some way to choose which sources to use, and some way of choosing how to combine them or where to place video images in the composed arrangement. Various automatic algorithms may be used, or selections can be made based on user input. Selections can change over time. One example is "select the two most recent talkers". It may also be desirable to identify which sources are used and where they are placed, for example so the receiving side can use this information in the user interface. Source selection can be done by an endpoint as it sends media, by an MCU, or by an endpoint receiving media. > > 3 - spatial relation among streams - how multiple streams are related to each other spatially, to be rendered such that the spatial arrangement is consistent. The examples we've been using have multiple video streams that are related in an ordered row from left to right. Audio is also included when it is desirable to match spatial audio to video. > > 4 - multi stream media format - what the streams mean with respect to each other, regardless of the actual content on the streams. For audio, examples are stereo, 5.1 surround, binaural, linear array. (linear array is described in the clue framework document). Perhaps 3D video formats would also fit in this category. This information is needed in order to properly render the media into light and sound for human observers. I see this at the same level as identifying a codec, independent of the audio or video content carried on the streams, and independent of how any composition of sources is done.
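The ordered-row idea in topic 3, and Christian's VC1/VC2/VC3 question earlier in this thread, can be made concrete with a small sketch. This assumes a purely hypothetical data model in which a capture set is a left-to-right list and a switched-off camera is represented as None (Christian's option b); it is an illustration only, not the CLUE framework's actual encoding:

```python
# Hypothetical model: a capture set as a left-to-right ordered list,
# with None marking an inactive camera/screen. Keeping the placeholder
# preserves the spatial position of the remaining captures.

def active_captures(capture_set):
    """Return (position, capture) pairs for active captures,
    preserving left-to-right spatial order."""
    return [(i, vc) for i, vc in enumerate(capture_set) if vc is not None]

# Only the centre camera of a three-camera system is on:
room = [None, "VC1", None]          # left, centre, right
print(active_captures(room))        # [(1, 'VC1')]

# A second person joins and the left camera turns on:
room = ["VC2", "VC1", None]
print(active_captures(room))        # [(0, 'VC2'), (1, 'VC1')]
```

Under this model the receiver can always recover which physical position each capture belongs to, which is exactly what is lost in Christian's option (a), where only VC1 is sent with no positional context.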
> > I think there is general agreement that items 3 and 4 are in scope for CLUE, as they specifically deal with multiple streams to and from an endpoint, and the framework draft includes them. Items 1 and 2 are not new; those topics exist for traditional single-stream videoconferencing. I'm not sure what aspects of 1 and 2 should be in scope for CLUE. It is hard to tell from the use cases and requirements. The framework draft includes them only to a very limited extent.
> > Mark Duckworth
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue

From Even.roni@huawei.com Tue Aug 23 07:47:57 2011
From: Roni Even <Even.roni@huawei.com>
To: clue@ietf.org
Date: Tue, 23 Aug 2011 17:48:18 +0300
Subject: [clue] Question on framework

Hi,

I was wondering what is the difference between section 7.2.1 and 7.2.1

Roni
From Even.roni@huawei.com Tue Aug 23 07:48:38 2011
From: Roni Even <Even.roni@huawei.com>
To: clue@ietf.org
Date: Tue, 23 Aug 2011 17:49:00 +0300
Subject: [clue] Question on framework - sent too early

Hi,

I was wondering what is the difference between section 7.2.1 and 7.2.2

Roni
From Even.roni@huawei.com Tue Aug 23 08:08:14 2011
From: Roni Even <Even.roni@huawei.com>
To: clue@ietf.org
Date: Tue, 23 Aug 2011 18:08:30 +0300
Subject: [clue] What CLUE is about?

Hi,

Going back through the requirements and framework I noticed the term "satisfactory user experience" being used in both documents. See requirement 1 in the requirements document and the following paragraph from the framework:

"The purpose of this effort is to make it possible to handle multiple streams of media in such a way that a satisfactory user experience is possible even when participants are on different vendor equipment and when they are using devices with different types of communication capabilities."

I am not sure what the term means. The charter talks about "high definition, high quality audio/video enabling a 'being-there' experience". My question is whether satisfactory user experience means satisfactory to achieve a "being there" experience, or whether this term is reducing the charter.

Thanks
Roni Even
From Even.roni@huawei.com Tue Aug 23 11:12:40 2011
From: Roni Even <Even.roni@huawei.com>
To: clue@ietf.org
Date: Tue, 23 Aug 2011 21:12:58 +0300
Subject: [clue] Full mesh conferences

Hi,

During the interim meeting today, when we talked about simultaneous transmission sets, there was a question whether the provider may run into conflicting requests for capture sets. There was a question whether this is relevant for centralized multipoint, and in my view it is not a problem there.

I mentioned that such a problem can occur if we support full mesh conferences. For example, if a provider can send 3 video captures, or by using one of the cameras send a zoomed version of the same scene, a one-screen system may ask for the zoomed version while a three-screen system asks for the three streams. This can happen in a full mesh three-way call and will require some way to resolve the conflict.

My personal view is that we are not doing full mesh but just centralized multipoint conferences.

I am looking for input on whether this is a problem we need to address.

Thanks
Roni Even
From Mark.Duckworth@polycom.com Tue Aug 23 12:02:03 2011
From: "Duckworth, Mark" <Mark.Duckworth@polycom.com>
To: clue@ietf.org
Date: Tue, 23 Aug 2011 12:03:25 -0700
Subject: Re: [clue] Full mesh conferences

I also thought full mesh is not in scope and we don't need to address it.

Mark Duckworth
From marshall.eubanks@gmail.com Tue Aug 23 12:49:03 2011
From: Marshall Eubanks <marshall.eubanks@gmail.com>
To: clue@ietf.org
Date: Tue, 23 Aug 2011 15:50:10 -0400
Subject: [clue] Notes from today's interim meeting.

It ended just in time, as I lost cell service when the Earthquake happened.

Regards
Marshall

-----------------------
Notes:

Clue Interim August 23 2011 Noon EDT.

Attendance via WEBEX

Mary Barnes
Marshall Eubanks
Andy Hutton
Andy Pepperell
Basavaraj
Brian Baldino
3 Call-in Users
Charles Eckel
Claude Lamblin
Dan Romascanu
Espen Berger
John Elwell
Jonathan Lennox
Mark Duckworth
Michael Lundberg
Paul C
Paul Kyzivat
PM
Robert Sparks
Roni Even
Sfry
Sohel
Spencer Dawkins
Stephan Wenger
Tom Kristensen
Allyn Romanow

Started with the usual administrivia

Allyn -

<I missed the beginning of this short presentation>

We would like to encourage people with other use cases to step forward.

Mark Duckworth - Attributes

Audio attributes
Video attributes
Mixed attributes

The sender can tell the receiver a little something about the streams they might want to receive.

The Audio Channel Format (Mono, Stereo, Linear array) could be extended.

In Video, spatial scale is how wide it is in real world units. If there are three people, image width might be 1.5 meters.

Capture scene: Various options for how captures might be done.

Say there are 6 people - might be 3 screens, 2 people each; might be 2 screens, 3 people each; might be 1 screen, switched based on voice with PIP for the rest.

A capture set is used to form capture set rows, each being the screens from one of the previous examples.
Andy Pepperell - Choosing streams

The 3 element handshake:

                                   Media Stream Consumer     Media Stream Provider

Consumer capability advertisement  |----------------------------->

Media Capture Advertisement        <-----------------------------|

Consumer config of provider        |----------------------------->
streams

Roni: How does this relate to the previous part?
Andy: This is more to do with the mechanics to make things happen.
Roni: This is a whole different set of parameters.
Andy: That's correct.
Roni: It's currently not in the document.
Andy: The document is not up to date.

Capabilities are sent by the consumer (at the start of the session): it sends hints about itself, such as the number of screens, software limitations, etc.

Then the MSP (Media Stream Provider) uses that in a media capture advertisement, using facts such as the number of cameras available. Dynamic factors could also cause a new Media Capture Advertisement, such as starting to share a document.

The MSC (Media Stream Consumer) then combines media capture advertisements with its characteristics to send a stream configure message.

Media Capture Advertisement == Provider Capture Advertisement

Capture attributes, simultaneous transmission sets, capture sets, and encoding groups.

Encoding groups - multiple potential encodes - to enable the provider to convey restrictions to the consumer.

Brian Baldino - Examples

If I have only one screen, I want a single capture of the entire video and audio scene. This could come directly from a physical device, or from a composition of some devices.

There is a spatial relationship between elements in a row of a capture set: VC0, VC1, VC2 implies a spatial ordering (left to right).

There is nothing to prohibit a consumer from picking different capture set rows for audio and video.

MCU Scenarios

What might an MCU want? It might want to accommodate everything connecting to it. It may choose to receive ALL captures available to it, or only the raw captures, or only one choice, etc.
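The three-message exchange Andy describes above could be mocked up like this (message contents and field names are my own shorthand for illustration, not the framework's):

```python
# Minimal sketch of the 3-element handshake between a Media Stream
# Consumer (MSC) and a Media Stream Provider (MSP). All names illustrative.

def consumer_capability_advertisement():
    # 1. Consumer -> Provider: hints about itself (screens, limits, etc.)
    return {"screens": 3, "max_decodes": 4}

def media_capture_advertisement(consumer_caps):
    # 2. Provider -> Consumer: the captures it can offer, organized into
    #    capture set rows (consumer_caps could shape this; not used here)
    return {"captures": ["VC0", "VC1", "VC2", "VC3-composed"],
            "capture_sets": [["VC0", "VC1", "VC2"], ["VC3-composed"]]}

def consumer_configure(advertisement, consumer_caps):
    # 3. Consumer -> Provider: picks the capture set row that fits its screens
    for row in advertisement["capture_sets"]:
        if len(row) <= consumer_caps["screens"]:
            return {"configure": row}
    return {"configure": advertisement["capture_sets"][-1]}

caps = consumer_capability_advertisement()
adv = media_capture_advertisement(caps)
cfg = consumer_configure(adv, caps)
print(cfg)  # {'configure': ['VC0', 'VC1', 'VC2']}
```

A three-screen consumer ends up configuring the three-capture row; a one-screen consumer running the same code would fall through to the single composed capture.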
Presentation streams are part of a separate capture scene, with no spatial relation between them.

Questions?

Roni: I am not sure about the concept of the capture set. What can be sent at the same time?
Brian: There is a separate simultaneous transmission set, which says what can be sent at the same time.
Andy: Multiple capture sets represent the same scene.
Roni: Does a capture set mean what can be sent simultaneously?
Brian: They don't convey any information as to what can be sent at the same time. Suppose you have a camera that can be zoomed in to capture 2 people, or zoomed out to capture the entire scene. It can't do both at the same time.
Roni: So, VC3 could be a zoom out, or a composite picture. How do you distinguish between these two cases?
Brian: That cannot be conveyed in the capture set.
Roni: So there is one part to describe capture sets, and another to convey
Paul K.: If you have mutual exclusion, then when you pick one (say zoomed out) that excludes the other.
Stephan: You are excluding a model where the MCU is hiding all attributes of other endpoints.
Brian: If an endpoint can only do one or the other, somehow a decision must be made.
?: The MCU can be a producer or a consumer, and they have very different models.
Roni: If there are physical limitations you have to pick one.
Mary: Roni, if you think this is something people are doing or are likely to do, please write it up. We are not designing for all possible cases.
Andy: We wanted to focus on the simplest MCU cases. There are so many possibilities.
Brian: From my point of view I don't think you should send all possible capture sets. A middlebox might have to make some sort of reasonable decision on what to exclude.
Andy: There might be other constraints, such as middlebox CPU.
Paul K: I am having trouble understanding what "mix" means.
Brian: In video we used the term "composed."
Paul: Would this be conveyed even if you didn't say the term "mixed"?
It could be composed, or produced by zooming out, for example.
Andy: A stream might be happy to receive a composed stream while an MCU might not.
Paul: I would like to see a more precise definition of what it means.
Andy: We are keen not to specify algorithms.
Roni: What is repeated? Can a consumer change its Consumer capability advertisement mid-meeting? In the MCU case it may change mid-call.
Andy: I see no reason why not.
Brian: The word "hint" is a little of a misnomer.
Paul Cloverdale: What we want is: what does each endpoint have?
Andy: The number of physical screens - some people thought this would be useful.
Paul Cloverdale: The users may want to have some control over what they see. The protocol should provide all the information you need to reconstruct a telepresence session at the far end, no more and no less. In the end you don't want to design the whole system... My point is that the key aspect is that all of the key attributes are known to the other end; we are putting all of the hooks in place.
Allyn: That is exactly what we have tried to do. We are trying to find the minimum set of information necessary, not the maximum.
Paul Cloverdale: I clued in on the term "hints".
Allyn: We can certainly take the term out.
Roni: Are we trying to support just some low level system, or are we supporting telepresence? It says something here about "good enough" or strange terms like that. "Satisfactory reproduction."
Paul C.: I think that your letter has opened a big can of worms.
Roni: The manufacturers will always have some secret sauce.
Paul C.: This is way beyond CLUE. CLUE should just be about providing hooks.
Jonathan L: Roni, you should come
Mark: You didn't think the issues are related to use cases
Roni: Today, when systems talk, they know the architecture of the other site, where the cameras are, etc. The current model doesn't address this?
?: Can you provide a use case?
Roni: I will give an example about why this is important.
Paul: Is there a difference between stereo and linear array with 2 elements?
Charles E.: You might have a stereo mike in the middle, rather than two mikes.
Paul: Is there a difference between that and 2 mono mikes?
Charles: I would assume so. I am not an expert.
Mark: A very specific case of stereo could be considered a linear array, but saying that they are always the same isn't quite accurate.

Meeting ended 1346 EDT.

5.9 Earthquake at 1351 EDT.
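The simultaneous transmission set constraint discussed in the notes (a camera that can send either a zoomed-in capture or the whole scene, but not both at once) could be pictured like this; the capture names and set contents are invented for illustration:

```python
# Hypothetical check: a consumer's requested captures are only sendable if
# some single simultaneous transmission set covers them all. Captures that
# never appear together in a set (e.g. zoomed-in VC1 vs. zoomed-out VC3
# from the same camera) are mutually exclusive.

SIMULTANEOUS_SETS = [
    {"VC0", "VC1", "VC2"},  # three per-camera captures, sendable together
    {"VC3"},                # zoomed-out view from the same camera as VC1
]

def request_is_sendable(requested, simultaneous_sets=SIMULTANEOUS_SETS):
    return any(set(requested) <= s for s in simultaneous_sets)

print(request_is_sendable(["VC0", "VC1"]))  # True
print(request_is_sendable(["VC1", "VC3"]))  # False - mutually exclusive
```

This is also why the notes separate capture sets from simultaneous transmission sets: the former describe alternative arrangements of the scene, while only the latter say what a provider can physically transmit at once.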

Regards
Marshall

= -----------------------
Notes :=A0

= Clue Interim August 23 2011 Noon EDT.

Attendance via WEBEX

Mary Barn= es
Marshall Eubanks=A0
Andy Hutton
Andy Peppe= rell=A0
Basavaraj
Brian Baldino=A0
3 Call in = Users
Charles Eckel
Claude Lamblin
Dan Romascanu
Espen Berger
John Elwell
Jonathan Lennox
M= ark Duckworth
Michael Lundberg
Paul C
Paul Ky= zivat
PM
Robert Sparks
Roni Even
Sfry
Sohel
Spencer Dawkins
Stephan Wenger
Tom Kr= istensen
Allyn Romanow

Started with the = usual administrivia=A0

Allyn -=A0

<I missed the be= ginning of this short presentation>

We would li= ke to encourage people with other use cases to step forward.

Mark Duckworth - Attributes

Audio attri= butes
Video attributes=A0
Mixed attributes
The sender can tell the receiver a little something about the = streams they might want to receive.

The Audio Channel Format (Mono, Stereo, Linear array) c= ould be extended=A0

In Video, spatial scale is how= wide it is in real world units.

If there are thre= e people, Image width might be 1.5 meters.

Capture scene : Various options for how captures might = be done.=A0

Say there are 6 people - might be 3 sc= reens, 2 people each, might be 2 screens, 3 people each, might be 1 screen,= switched based on voice with PIP for the rest.=A0

A capture set is used to form capture set rows, each be= ing the screens from one of the previous examples.

Andy Pepperell - Choosing streams

The 3 element h= andshake

=A0=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 = =A0 =A0 =A0 =A0Media Stream Consumer =A0 =A0 Media Stream Provider

Consumer capability advertisement |-----------------------= ------>

Media Capture Advertisement =A0 =A0 =A0 <-----------------------------|<= /div>

Consumer config of provider =A0 =A0 =A0 |---------= -------------------->
streams=A0

Roni= : How does this relate to the previous part ?=A0

Andy : This is more to do with the mechanics to make th= ings happen.

Roni : This is a whole different set = of parameters.

Andy : That's correct.

Roni : It's currently not in the document.=A0
=

Andy : The document is not up to date.=A0
Capabilities sent by consumer (at start of the session)

It sends hints about itself, such as the number of screens, = software limitations, etc.

Then the MSP uses that = in a media capture advertisement, using facts such as the number of cameras= available.=A0

Also dynamic factors could cause a new Media Capture Ad= vertisement, such as starting to share a document.=A0

<= div>The MSC then combines media capture advertisements with its characteris= tics to send a stream configure message

Media Capture Advertisement =3D=3D Provider Capture Adv= ertisement

Capture attributes, simultaneous transm= ission sets, capture sets, and encoding groups.=A0

Encoding groups - multiple potential encodes - to enable provider to conve= y restrictions to the consumer

Brian Baldino

Examples

If I have only one screen, I want a single capture of the= entire video and audio scene

This could come dire= ctly from a physical device, or from a composition of some devices.

There is a spatial relationship between elements and a = ROW of capture set

VC0, VC1, VC2 implies a spatial= ordering (left to right)

There is nothing to proh= ibit a consumer from picking different capture set rows for audio and video=

MCU Scenarios=A0

What might a = MCU want ? It might want to accommodate everything connecting to it. It may= chose to receive ALL captures available to it, or only the raw captures, o= r only one choice, etc.=A0

Presentation streams are part of a separate capture sce= ne, with no spatial relation between them.

Questio= ns ?

Roni : I am not sure about the concept of the= capture set. What can be sent at the same time.

Brian : There is a separate simultaneous transmission s= et, what can be sent at the same time.=A0

Andy : M= ultiple capture sets represent the same scene.=A0

Roni : Does a capture set mean what can be sent simultaneously.=A0

Brian : They don't convey any information as to what c= an be set at the same time.=A0

Suppose you have a = camera that can be zoomed in to capture 2 people, or zoomed out to capture = the entire scene. It can't do both at the same time.=A0

Roni ; So, Vc3 could be a zoom out, or a composite pict= ure. How do you distinguish between these two cases.

Brian : That cannot be conveyed in the capture set.

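The zoomed-in/zoomed-out conflict Brian describes is exactly what simultaneous transmission sets express. A minimal sketch of the idea, where the capture names and set contents are hypothetical, not taken from the framework draft:

```python
# Hypothetical captures: VC0-VC2 are per-camera views; VC3 is the same
# middle camera zoomed out, so VC1 and VC3 are mutually exclusive.
simultaneous_sets = [
    {"VC0", "VC1", "VC2"},   # all three cameras zoomed in
    {"VC0", "VC3", "VC2"},   # middle camera zoomed out instead
]

def is_sendable(requested, sets):
    """A consumer request is valid only if some simultaneous set covers it."""
    return any(requested <= s for s in sets)

print(is_sendable({"VC1", "VC3"}, simultaneous_sets))  # False: same physical camera
print(is_sendable({"VC0", "VC3"}, simultaneous_sets))  # True
```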
Roni : So there is one part to describe capture sets, and another to convey...

Paul K. : If you have mutual exclusion, then when you pick one (say zoomed out) that excludes the other.
Stephan : You are excluding a model where the MCU is hiding all attributes of other endpoints.

Brian : If an endpoint can only do one or the other, somehow a decision must be made.

? : The MCU can be a producer or a consumer, and they have very different models.

Roni : If there are physical limitations, you have to pick one.

Mary : Roni, if you think this is something people are doing or are likely to do, please write it up.

We are not designing for all possible cases.

Andy : We wanted to focus on the simplest MCU cases. There are so many possibilities.

Brian : From my point of view, I don't think you should send all possible capture sets. A middlebox might have to make some sort of reasonable decision on what to exclude.

Andy : There might be other constraints, such as middlebox CPU.

Paul K : I am having trouble understanding what "mix" means.

Brian : In video we used the term "composed."

Paul : Would this be conveyed even if you didn't say the term "mixed"? It could be composed, or produced by zooming out, for example.

Andy : An endpoint might be happy to receive a composed stream while an MCU might not.

Paul : I would like to see a more precise definition of what it means.

Andy : We are keen not to specify algorithms.

Roni : What is repeated? Can a consumer change its consumer capability advertisement mid-meeting? In the MCU case it may change mid-call.

Andy : I see no reason why not.

Brian : The word "hint" is a bit of a misnomer.

Paul Coverdale : What we want is: what does each endpoint have?

Andy : The number of physical screens - some people thought this would be useful.

Paul Coverdale : The users may want to have some control over what they see.

The protocol should provide all the information you need to reconstruct a telepresence session at the far end, no more and no less.

In the end you don't want to design the whole system...

My point is that all of the key attributes must be known to the other end; we are putting all of the hooks in place.

Allyn : That is exactly what we have tried to do. We are trying to find the minimum set of information necessary, not the maximum.

Paul Coverdale : I clued in on the term "hints."

Allyn : We can certainly take the term out.

Roni : Are we trying to support just some low-level system, or are we supporting telepresence?

It says something here about "good enough" or strange terms like that. "Satisfactory reproduction."

Paul C. : I think that your letter has opened a big can of worms.

Roni : The manufacturers will always have some secret sauce.

Paul C. : This is way beyond CLUE. CLUE should just be about providing hooks.

Jonathan L : Roni, you should come...

Mark : You didn't think the issues are related to use cases?

Roni : Today, when systems talk, they know the architecture of the other site: where the cameras are, etc. The current model doesn't address this.

? : Can you provide a use case?

Roni : I will give an example of why this is important.

Paul : Is there a difference between stereo and a linear array with 2 elements?

Charles E. : You might have a stereo mike in the middle, rather than two mikes.

Paul : Is there a difference between that and 2 mono mikes?

Charles : I would assume so. I am not an expert.

Mark : A very specific case of stereo could be considered a linear array, but saying that they are always the same isn't quite accurate.

Meeting ended 1346 EDT.

Magnitude 5.9 earthquake at 1351 EDT.
From stewe@stewe.org Tue Aug 23 13:19:17 2011
From: Stephan Wenger
To: "Duckworth, Mark", "clue@ietf.org"
Date: Tue, 23 Aug 2011 16:19:56 -0400
Subject: Re: [clue] Full mesh conferences
Hi,

Roni's problem is not limited to full mesh. What we are really looking at is transcoder-less topologies. The topology for the media distribution can be anything from full mesh through multicast to technologies similar to the one my employer is using.

Let me suggest we think this through a bit more, keeping in mind that the other videoconferencing-related WG in the IETF (webrtc) certainly views full mesh topologies as an option, and not kill the concept using procedural arguments.

My current viewpoint is that if it were possible to address capture-type conflicts as Roni presented (there are many more; think of codec capability mismatches and similar), we should address those, not just expect that the magic MCU solves all those mismatches for everyone. However, I'm not sure that it is possible at all, at least not without a signaling middlebox that makes smart decisions (regardless of whether that box performs media transcoding or not). At least I have not seen a protocol that would allow for multiparty feature negotiation without involving a middlebox and with reasonable delay constraints. But perhaps others that are closer to signaling have?

Stephan

From: "Duckworth, Mark"
Date: Tue, 23 Aug 2011 12:03:25 -0700
To: "clue@ietf.org"
Subject: Re: [clue] Full mesh conferences

I also thought full mesh is not in scope and we don't need to address it.

Mark Duckworth

From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Roni Even
Sent: Tuesday, August 23, 2011 2:13 PM
To: clue@ietf.org
Subject: [clue] Full mesh conferences

Hi,

During the interim meeting today, when we talked about simultaneous transmission sets, there was a question of whether the provider may run into conflicting requests for capture sets. There was a question of whether this is relevant for centralized multipoint, and in my view it is not a problem there.

I mentioned that such a problem can occur if we support full mesh conferences. For example, if a provider can send 3 video captures, or by using one of the cameras send a zoomed version of the same scene, a one-screen system may ask for the zoomed version while a three-screen system asks for the three streams. This can happen in a full mesh three-way call and will require some way to resolve the conflict.

My personal view is that we are not doing full mesh but just centralized multipoint conferences.

I am looking for input as to whether this is a problem we need to address.

Thanks,
Roni Even

_______________________________________________
clue mailing list
clue@ietf.org
https://www.ietf.org/mailman/listinfo/clue
From coverdale@sympatico.ca Tue Aug 23 16:36:40 2011
From: Paul Coverdale
To: "'Roni Even'", clue@ietf.org
Date: Tue, 23 Aug 2011 19:37:33 -0400
Subject: Re: [clue] What CLUE is about?
Hi Roni,

You've highlighted an interesting point, but I think we may be opening a can of worms if we get too picky in trying to quantify what is meant by "satisfactory user experience". This gets into the realm of subjective assessment, mean opinion scores and all that sort of thing. I think your interpretation, that "satisfactory user experience" means preserving the "being there" experience of telepresence when interworking between equipment from different vendors, is what we are really aiming at.

Cheers,

...Paul

From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Roni Even
Sent: Tuesday, August 23, 2011 11:08 AM
To: clue@ietf.org
Subject: [clue] What CLUE is about?

Hi,

Going back through the requirements and framework, I noticed the term "satisfactory user experience" being used in both documents. See requirement 1 in the requirements document and the following paragraph from the framework:

"The purpose of this effort is to make it possible to handle multiple streams of media in such a way that a satisfactory user experience is possible even when participants are on different vendor equipment and when they are using devices with different types of communication capabilities."

I am not sure what the term means. The charter talks about "high definition, high quality audio/video enabling a 'being-there' experience".

My question is whether satisfactory user experience means satisfactory to achieve a "being there" experience, or is this term reducing the charter.

Thanks,
Roni Even
From Even.roni@huawei.com Wed Aug 24 06:30:14 2011
From: Roni Even
To: clue@ietf.org
Date: Wed, 24 Aug 2011 16:30:09 +0300
Subject: [clue] Questions on basic message flow in the framework

Hi,

In the interim meeting I mentioned that I support the model but think that there are parameters I would like to add. At the meeting it was clear to me that there will be a new revision soon that will support parameters at the capture scene level. Trying to see which parameters I would like to see supported, I looked at the message flow and I have some questions.

Andy presented the basic message flow with three messages:

1. Consumer capability advertisement
2. Provider - media capture advertisement
3. Consumer configuration of provider streams.

I was looking at the use cases of 3 to 3 and 3 to 1 and tried to understand what will be conveyed in the three messages and how we will use the information.

The first question I had was how this relates to SIP. At which stage of the SIP call will the consumer advertise its capabilities?

In the second part, I was looking at a telepresence system that has 3 65" screens where the distance between the screens, including the frames, is 6". The system has three cameras, each mounted at the center of a screen. The system faces a room with three rows; each row seats 6 people, and each camera is capable of capturing a third of the room, but the default views of the cameras do not overlap. The cameras support zoom and pan (local from the application).
The system can decode up to four video streams, where one is presentation (H.239-like). The system can support an internal 4-way multipoint call, meaning that it can receive the three main video streams from one, two or three endpoints.

I think that this is a very standard system, nothing special.

The telepresence application is willing to provide all this information as part of the consumer capability advertisement, and according to Andy's slides the message includes physical factors, user preferences and software limitations.

I am now trying to understand what the purpose of the consumer capability advertisement is, in order to see what information is important to convey.

Is the reason for the consumer capability advertisement to allow the provider to propose a better media capability advertisement, or is it to allow the provider to optimize the content of the media streams it is sending based on the information provided? This will help with looking at which parameters can be used. The slides show that the information is used for the capability advertisements.

The third question I had was whether these three messages can be repeated at any time, or do we see a different message to request a mode change.

Thanks,
Roni Even
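The three-message flow discussed in this thread could be sketched as a toy exchange; the message contents and the selection logic below are illustrative assumptions, not defined by the framework.

```python
def consumer_capability_advertisement():
    # Physical factors, user preferences, software limitations (illustrative fields).
    return {"screens": 3, "max_video_decodes": 4, "presentation": True}

def provider_capture_advertisement(consumer_caps):
    # The provider may (but need not) tailor its advertisement to the consumer.
    rows = [["VC0", "VC1", "VC2"], ["VC3"]]   # VC3: hypothetical zoomed-out composite
    if consumer_caps["screens"] == 1:
        rows = [["VC3"]]                       # offer only the composite view
    return {"capture_sets": rows}

def consumer_configure(advertisement, screens):
    # Pick the largest capture-set row that fits the number of screens.
    return max((r for r in advertisement["capture_sets"] if len(r) <= screens), key=len)

caps = consumer_capability_advertisement()
adv = provider_capture_advertisement(caps)
chosen = consumer_configure(adv, caps["screens"])
print(chosen)  # ['VC0', 'VC1', 'VC2']
```

As the discussion below notes, these messages need not occur in this strict order in practice.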
From eckelcu@cisco.com Wed Aug 24 13:08:24 2011
From: "Charles Eckel (eckelcu)"
To: "Roni Even", clue@ietf.org
Date: Wed, 24 Aug 2011 13:09:33 -0700
Subject: Re: [clue] Questions on basic message flow in the framework

Hi Roni,

Please see inline.

> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Roni Even
> Sent: Wednesday, August 24, 2011 6:30 AM
> To: clue@ietf.org
> Subject: [clue] Questions on basic message flow in the framework
>
> Hi,
>
> In the interim meeting I mentioned that I support the model but think that there are parameters
> I would like to add. At the meeting it was clear to me that there will be a new revision soon
> that will support parameters at the capture scene level. Trying to see which parameters I would like
> to see supported, I looked at the message flow and I have some questions.
>
> Andy presented the basic message flow with three messages:
>
> 1. Consumer capability advertisement
> 2. Provider - media capture advertisement
> 3. Consumer configuration of provider streams.
>
> I was looking at the use cases of 3 to 3 and 3 to 1 and tried to understand what will be conveyed in
> the three messages and how we will use the information.
>
> The first question I had was how this relates to SIP. At which stage of the SIP call will the consumer
> advertise its capabilities?

I think it is best to focus on the framework a bit more before mapping to SIP.

> In the second part, I was looking at a telepresence system that has 3 65" screens where the
> distance between the screens, including the frames, is 6". The system has three cameras, each mounted
> at the center of a screen. The system faces a room with three rows; each row seats 6 people, and each
> camera is capable of capturing a third of the room, but the default views of the cameras do not
> overlap. The cameras support zoom and pan (local from the application).
>
> The system can decode up to four video streams, where one is presentation (H.239-like). The system can
> support an internal 4-way multipoint call, meaning that it can receive the three main video streams from
> one, two or three endpoints.
>
> I think that this is a very standard system, nothing special.
>
> The telepresence application is willing to provide all this information as part of the consumer
> capability advertisement, and according to Andy's slides the message includes physical factors, user
> preferences and software limitations.
>
> I am now trying to understand what the purpose of the consumer capability advertisement is, in order to
> see what information is important to convey.
>
> Is the reason for the consumer capability advertisement to allow the provider to propose a better
> media capability advertisement, or is it to allow the provider to optimize the content of the media
> streams it is sending based on the information provided? This will help with looking at which
> parameters can be used. The slides show that the information is used for the capability
> advertisements.

I viewed it as being primarily for the former, but using it for the latter may make sense as well and should not be excluded. The extent to which the provider actually uses the information is implementation dependent.

> The third question I had was whether these three messages can be repeated at any time, or do we see a
> different message to request a mode change.

My understanding, based on the presentation in the virtual meeting, is that these messages, though shown as an ordered exchange, could theoretically come in any order at any time.

Cheers,
Charles

> Thanks
>
> Roni Even

From Mark.Duckworth@polycom.com Wed Aug 24 13:46:27 2011
From: "Duckworth, Mark"
To: "clue@ietf.org"
Date: Wed, 24 Aug 2011 13:47:44 -0700
<44C6B6B2D0CF424AA90B6055548D7A61AED0BA65@CRPMBOXPRD01.polycom.com>
References: <44C6B6B2D0CF424AA90B6055548D7A61AE9B48AD@CRPMBOXPRD01.polycom.com> <4E534181.7080705@nteczone.com>
In-Reply-To: <4E534181.7080705@nteczone.com>
Subject: Re: [clue] continuing "layout" discussion

Christian,

Thanks for the questions, I'll answer below.

Mark

> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> Christian Groves
> Sent: Tuesday, August 23, 2011 1:58 AM
> To: clue@ietf.org
> Subject: Re: [clue] continuing "layout" discussion
>
> Hello,
>
> With regards to spatial relation among streams, if I start a
> telepresence session and I only use the centre screen/camera of a three
> screen/camera telepresence system, i.e. it's only me in the room, the
> left and right screens/cameras are off.

Just because it is only you in the room doesn't mean you don't want to see all three screens' worth of video coming from the other sites. So I suppose it is possible you would use only one screen, but I don't think it would be typical. Using only one camera when there is nothing interesting for the other cameras to see makes more sense.

> According to the CLUE framework, what
> do I send in terms of capture sets?
> a) VC1 - only one video capture; it doesn't matter which of my screens it
> came from
> b) VC2, VC1, VC3 - VC2 and VC3 would be NULL. By having VC1 in the
> middle it indicates that it's the centre screen.
> c) both are valid?
Choice a), just VC1 by itself, is the simplest and what I think it should do if it wants to send only one video capture, because that is all that is interesting. But I would say this video capture comes from a camera, not from a screen. I don't know what a NULL media capture is; the framework doesn't include that concept.

> Now while I'm there a person comes to join me and the left screen and
> camera are turned on.

Again, I don't understand why your example has a correlation between how many people are in your local room and how many screens you are using to view the people from other locations. I expect you would use all the screens in your room to view the other people, no matter how many people are with you looking at those screens.

> I guess now there's two capture sets VC2, VC1.
> Again do I have to add VC3 as NULL, to indicate that it's my left and
> centre camera/screen being used.

You would just use a capture set with (VC2, VC1), indicating two video captures, one on the left and one on the right.

> I guess for the far end it doesn't care, it will render the VCs how it
> wants??

The far end will know that VC2 should be rendered to the left of VC1. It probably cares about that, otherwise it wouldn't be using CLUE.

> Regards, Christian
>
> On 6/08/2011 7:02 AM, Duckworth, Mark wrote:
> > I'd like to continue the discussion about layout and rendering
> > issues. There are many separate but related things involved. I want
> > to break it down into separate topics, and see how the topics are
> > related to each other. And then we can discuss what CLUE needs to deal
> > with and what is not in scope.
> >
> > I don't know if I'm using the best terms for each topic. If not,
> > please suggest better terms. My use of the term "layout" here is not
> > consistent with draft-wenger-clue-definitions-01, because I don't limit
> > it to the rendering side. But my use of the terms "render" and "source
> > selection" is consistent with that draft.
> > 1 - video layout composed arrangement within a stream - when multiple
> > video sources are composed into one stream, they are arranged in some
> > way. Typical examples are 2x2 grid, 3x3 grid, 1+5 (1 large plus 5
> > small), 1+PiP (1 large plus one or more picture-in-picture). These
> > arrangements can be selected automatically or based on user input.
> > Arrangements can change over time. Identifying this composed
> > arrangement is separate from identifying or selecting which video
> > images are used to fill in the composition. These arrangements can be
> > constructed by an endpoint sending video, by an MCU, or by an endpoint
> > receiving video as it renders to a display.
> >
> > 2 - source selection and identification - when a device is composing
> > a stream made up of other sources, it needs some way to choose which
> > sources to use, and some way of choosing how to combine them or where
> > to place video images in the composed arrangement. Various automatic
> > algorithms may be used, or selections can be made based on user input.
> > Selections can change over time. One example is "select the two most
> > recent talkers". It may also be desirable to identify which sources
> > are used and where they are placed, for example so the receiving side
> > can use this information in the user interface. Source selection can
> > be done by an endpoint as it sends media, by an MCU, or by an endpoint
> > receiving media.
> >
> > 3 - spatial relation among streams - how multiple streams are related
> > to each other spatially, to be rendered such that the spatial
> > arrangement is consistent. The examples we've been using have multiple
> > video streams that are related in an ordered row from left to right.
> > Audio is also included when it is desirable to match spatial audio to
> > video.
> >
> > 4 - multi stream media format - what the streams mean with respect to
> > each other, regardless of the actual content on the streams. For
> > audio, examples are stereo, 5.1 surround, binaural, linear array.
> > (linear array is described in the CLUE framework document). Perhaps 3D
> > video formats would also fit in this category. This information is
> > needed in order to properly render the media into light and sound for
> > human observers. I see this at the same level as identifying a codec,
> > independent of the audio or video content carried on the streams, and
> > independent of how any composition of sources is done.
> >
> > I think there is general agreement that items 3 and 4 are in scope
> > for CLUE, as they specifically deal with multiple streams to and from
> > an endpoint. And the framework draft includes these. Items 1 and 2
> > are not new; those topics exist for traditional single stream
> > videoconferencing. I'm not sure what aspects of 1 and 2 should be in
> > scope for CLUE. It is hard to tell from the use cases and
> > requirements. The framework draft includes them only to a very limited
> > extent.
> >
> > Mark Duckworth
> > _______________________________________________
> > clue mailing list
> > clue@ietf.org
> > https://www.ietf.org/mailman/listinfo/clue
>
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue

From pkyzivat@alum.mit.edu Wed Aug 24 14:12:26 2011
Message-ID: <4E55697C.9070605@alum.mit.edu>
Date: Wed, 24 Aug 2011 17:13:32 -0400
From: Paul Kyzivat
To: clue@ietf.org
Subject: Re: [clue] Full mesh conferences

ISTM that the key distinction here is between "configure an endpoint" and "configure a session between two endpoints". It feels like these have so far been conflated, and should not be.

When viewed that way, it seems obvious to me that a different negotiation approach is needed for the two:

- an "O/A-like" approach is fine for "configure a session between two endpoints"
- a "voting" or "indication of preference" approach, preceding an offer that reflects the current configuration, seems more appropriate for "configure an endpoint"

Thanks,
Paul

On 8/23/11 4:19 PM, Stephan Wenger wrote:
> Hi,
> Roni's problem is not limited to full mesh. What we are really looking
> at is transcoder-less topologies. The topology for the media
> distribution can be anything from full mesh through multicast to
> technologies similar to the one my employer is using.
> Let me suggest we think this through a bit more, keeping in mind that
> the other videoconferencing-related WG in the IETF (webrtc) certainly
> views full mesh topologies as an option, and not kill the concept using
> procedural arguments.
> My current viewpoint is that if it were possible to address capture-type
> conflicts as Roni presented (there are many more; think of codec
> capability mismatches and similar), we should address those, not just
> expect that the magic MCU solves all those mismatches for everyone.
> However, I'm not sure that it is possible at all, at least not without
> a signaling middlebox that makes smart decisions (regardless of whether
> that box performs media transcoding or not). At least I have not seen a
> protocol that would allow for multiparty feature negotiation without
> involving a middlebox and with reasonable delay constraints. But perhaps
> others, who are closer to signaling, have?
> Stephan
>
> From: "Duckworth, Mark"
> Date: Tue, 23 Aug 2011 12:03:25 -0700
> To: "clue@ietf.org"
> Subject: Re: [clue] Full mesh conferences
>
> I also thought full mesh is not in scope and we don't need to address it.
>
> Mark Duckworth
>
> *From:* clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] *On Behalf Of* Roni Even
> *Sent:* Tuesday, August 23, 2011 2:13 PM
> *To:* clue@ietf.org
> *Subject:* [clue] Full mesh conferences
>
> Hi,
>
> During the interim meeting today, when we talked about simultaneous
> transmission sets, there was a question whether the provider may run into
> conflicting requests for capture sets. There was a question whether it is
> relevant for centralized multipoint, and in my view this is not a problem.
>
> I mentioned that such a problem can occur if we support full mesh
> conferences.
> For example, if a provider can send 3 video captures, or by
> using one of the cameras send a zoomed version of the same scene, it may
> cause a one-screen system to ask for the zoomed version and a three-screen
> system to ask for the three streams. This can happen in a full mesh
> three-way call and will require some way to resolve the conflict.
>
> My personal view is that we are not doing full mesh but just centralized
> multipoint conferences.
>
> I am looking for input on whether this is a problem we need to address.
>
> Thanks
>
> Roni Even
>
> _______________________________________________ clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue
>
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue

From Christian.Groves@nteczone.com Wed Aug 24 18:01:09 2011
Message-ID: <4E559EFF.1060008@nteczone.com>
Date: Thu, 25 Aug 2011 11:01:51 +1000
From: Christian Groves
To: "Duckworth, Mark", clue@ietf.org
In-Reply-To: <44C6B6B2D0CF424AA90B6055548D7A61AED0BA65@CRPMBOXPRD01.polycom.com>
Subject: Re: [clue] continuing "layout" discussion

Hello Mark,

Sorry for the "confused" email; it probably matches my reading of the framework. Please see my responses [CNG] below to clarify.

regards, Christian

On 25/08/2011 6:47 AM, Duckworth, Mark wrote:
> Christian,
>
> Thanks for the questions, I'll answer below.
>
> Mark
>
>> -----Original Message-----
>> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
>> Christian Groves
>> Sent: Tuesday, August 23, 2011 1:58 AM
>> To: clue@ietf.org
>> Subject: Re: [clue] continuing "layout" discussion
>>
>> Hello,
>>
>> With regards to spatial relation among streams, if I start a
>> telepresence session and I only use the centre screen/camera of a three
>> screen/camera telepresence system, i.e. it's only me in the room, the
>> left and right screens/cameras are off.
> Just because it is only you in the room doesn't mean you don't want to see all three screens' worth of video coming from the other sites. So I suppose it is possible you would use only one screen, but I don't think it would be typical.
> Using only one camera when there is nothing interesting for the other cameras to see makes more sense.

[CNG] Yes, you're correct, I shouldn't have mixed the camera/screen. The intention is that locally only one camera (centre) is used, as there is only one participant (although the system has a 3-camera capability). The remote system is a 3-screen system (number of cameras not relevant).

>> According to the CLUE framework, what
>> do I send in terms of capture sets?
>> a) VC1 - only one video capture; it doesn't matter which of my screens it
>> came from
>> b) VC2, VC1, VC3 - VC2 and VC3 would be NULL. By having VC1 in the
>> middle it indicates that it's the centre screen.
>> c) both are valid?
>
> Choice a), just VC1 by itself, is the simplest and what I think it should do if it wants to send only one video capture, because that is all that is interesting. But I would say this video capture comes from a camera, not from a screen. I don't know what a NULL media capture is; the framework doesn't include that concept.

[CNG] Yes, the capture comes from a camera, in this case a centre camera. I guess what I was trying to get at was: is a capture related to what is actually being used (i.e. VC1), or is it related to what is being used in terms of the overall system (i.e. VC2, VC1, VC3)? When I refer to a NULL media capture, I'm referring to this case where VC2 and VC3 represent potential video captures, i.e. I don't want to use them now but they may be used in the future.

>> Now while I'm there a person comes to join me and the left screen and
>> camera are turned on.
>
> Again, I don't understand why your example has a correlation between how many people are in your local room and how many screens you are using to view the people from other locations. I expect you would use all the screens in your room to view the other people, no matter how many people are with you looking at those screens.

[CNG] Sorry, my sloppy use of screens. I'll try again.
The remote end has three screens. It only receives one video capture, VC1 (no information about a possible VC2 and VC3 is sent). My question is: what screen is VC1 displayed on? I guess the answer is the centre one, as the remote TPS figures it is talking to a single camera/screen system because there's only one video capture? So whilst the VCs are described left to right, as there is only one, the remote TPS assumes that it relates to a "centre" screen.

Section 10 of the framework draft discusses cases where 1-, 2- and 3-screen systems receive VCs that equal or exceed their capabilities. However, there are no cases describing what systems do when receiving fewer VCs than their capabilities, i.e. a 3-screen system receiving one or two video captures.

>> I guess now there's two capture sets VC2, VC1.
>> Again do I have to add VC3 as NULL, to indicate that it's my left and
>> centre camera/screen being used.
>
> You would just use a capture set with (VC2, VC1), indicating two video captures, one on the left and one on the right.

[CNG] Yes, two video captures could be used. In this case I'm assuming the remote end system would display VC2 on the left screen and VC1 on the centre screen. Now what if (VC1, VC3) was instead sent? The framework says they represent left-to-right video captures. In this case would the TPS display VC1 on the left screen and VC3 on the centre screen, effectively moving the displayed scene from the centre to the left? To ensure VC1 remains on the centre screen, the remote TPS would need to compare the two video capture sets to determine the correlation between the old and new individual captures. It would be difficult to determine this from the position in the VC set, as the position, and the number of positions in the set, may change. Is the intention that the "numbering" mentioned in clause 6.1 of the framework document actually be used in the encoding of the video captures to help with this sort of scenario?
And generally for the "dynamic" behaviour mentioned in section 6.1?

>> I guess for the far end it doesn't care, it will render the VCs how it
>> wants??
>
> The far end will know that VC2 should be rendered to the left of VC1. It probably cares about that, otherwise it wouldn't be using CLUE.
>
>> Regards, Christian
>>
>> On 6/08/2011 7:02 AM, Duckworth, Mark wrote:
>>> I'd like to continue the discussion about layout and rendering
>>> issues. There are many separate but related things involved. I want
>>> to break it down into separate topics, and see how the topics are
>>> related to each other. And then we can discuss what CLUE needs to deal
>>> with and what is not in scope.
>>>
>>> I don't know if I'm using the best terms for each topic. If not,
>>> please suggest better terms. My use of the term "layout" here is not
>>> consistent with draft-wenger-clue-definitions-01, because I don't limit
>>> it to the rendering side. But my use of the terms "render" and "source
>>> selection" is consistent with that draft.
>>>
>>> 1 - video layout composed arrangement within a stream - when multiple
>>> video sources are composed into one stream, they are arranged in some
>>> way. Typical examples are 2x2 grid, 3x3 grid, 1+5 (1 large plus 5
>>> small), 1+PiP (1 large plus one or more picture-in-picture). These
>>> arrangements can be selected automatically or based on user input.
>>> Arrangements can change over time. Identifying this composed
>>> arrangement is separate from identifying or selecting which video
>>> images are used to fill in the composition. These arrangements can be
>>> constructed by an endpoint sending video, by an MCU, or by an endpoint
>>> receiving video as it renders to a display.
>>>
>>> 2 - source selection and identification - when a device is composing
>>> a stream made up of other sources, it needs some way to choose which
>>> sources to use, and some way of choosing how to combine them or where
>>> to place video images in the composed arrangement.
>>> Various automatic
>>> algorithms may be used, or selections can be made based on user input.
>>> Selections can change over time. One example is "select the two most
>>> recent talkers". It may also be desirable to identify which sources
>>> are used and where they are placed, for example so the receiving side
>>> can use this information in the user interface. Source selection can
>>> be done by an endpoint as it sends media, by an MCU, or by an endpoint
>>> receiving media.
>>>
>>> 3 - spatial relation among streams - how multiple streams are related
>>> to each other spatially, to be rendered such that the spatial
>>> arrangement is consistent. The examples we've been using have multiple
>>> video streams that are related in an ordered row from left to right.
>>> Audio is also included when it is desirable to match spatial audio to
>>> video.
>>>
>>> 4 - multi stream media format - what the streams mean with respect to
>>> each other, regardless of the actual content on the streams. For
>>> audio, examples are stereo, 5.1 surround, binaural, linear array.
>>> (linear array is described in the CLUE framework document). Perhaps 3D
>>> video formats would also fit in this category. This information is
>>> needed in order to properly render the media into light and sound for
>>> human observers. I see this at the same level as identifying a codec,
>>> independent of the audio or video content carried on the streams, and
>>> independent of how any composition of sources is done.
>>>
>>> I think there is general agreement that items 3 and 4 are in scope
>>> for CLUE, as they specifically deal with multiple streams to and from
>>> an endpoint. And the framework draft includes these. Items 1 and 2
>>> are not new; those topics exist for traditional single stream
>>> videoconferencing. I'm not sure what aspects of 1 and 2 should be in
>>> scope for CLUE. It is hard to tell from the use cases and
>>> requirements. The framework draft includes them only to a very limited
>>> extent.
>>> Mark Duckworth
>>> _______________________________________________
>>> clue mailing list
>>> clue@ietf.org
>>> https://www.ietf.org/mailman/listinfo/clue
>>>
>> _______________________________________________
>> clue mailing list
>> clue@ietf.org
>> https://www.ietf.org/mailman/listinfo/clue
>
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue

From allyn@cisco.com Wed Aug 24 19:34:13 2011
From: "Allyn Romanow (allyn)"
Date: Wed, 24 Aug 2011 19:35:15 -0700
Message-ID: <9AC2C4348FD86B4BB1F8FA9C5E3A5EDC056035BB@xmb-sjc-221.amer.cisco.com>
Subject: [clue] A second set of notes from today's interim meeting

CLUE Interim, Aug. 23 2011
Notes taken by Allyn Romanow
Beginning 9:30 am; meeting started at 9:00.

Roni - observation that the consumer capability description is different from media capture and not currently covered in the draft. It needs to be put into the draft and made explicit.
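The three message types discussed in this thread (the provider's capture advertisement, the consumer capability advertisement, and the configure message) are, per Charles's reading earlier in the thread, not a strict ordered handshake. A toy sketch of an endpoint accepting them in any order, with later messages superseding earlier ones; the message names come from the discussion, but the handler structure is invented for illustration:

```python
from typing import Dict, Optional

class EndpointState:
    """Tracks the latest copy of each CLUE message type. Any message may
    be (re)sent at any time; a newer message simply replaces the older
    one (hypothetical behaviour, per the discussion above)."""

    MESSAGE_TYPES = (
        "provider_advertisement",
        "consumer_capability_advertisement",
        "configure",
    )

    def __init__(self) -> None:
        self.latest: Dict[str, Optional[dict]] = {t: None for t in self.MESSAGE_TYPES}

    def receive(self, msg_type: str, body: dict) -> None:
        if msg_type not in self.MESSAGE_TYPES:
            raise ValueError(f"unknown CLUE message type: {msg_type}")
        self.latest[msg_type] = body  # later messages supersede earlier ones

state = EndpointState()
# Messages may arrive in any order:
state.receive("configure", {"chosen_captures": ["VC1"]})
state.receive("provider_advertisement", {"captures": ["VC1", "VC2", "VC3"]})
```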
Andy goes over his slides.

Brian - examples. Illustrative rather than real-world examples.
Single camera, single mic.
3-camera endpoint. When captures are in a row, there is a "left of" implied. In a capture set, if there are 2 rows, it means 2 different views of the scene; for example, VC0, VC1, VC2, and VC3 the "switched" view. The consumer can choose whichever rows it wants.
Suppose a 2-screen endpoint - what would it choose, since neither row is perfect? Up to the implementer of the endpoint.

MCU providing for multiple endpoints. It can receive all the captures, or not.
Presentation: a new capture scene. A second, and therefore orthogonal, capture set.
Question - about capture sets. Answer - a simultaneous transmission set describes non-overlapping uses of cameras. A capture set does not convey info about simultaneity - that is in a simultaneity list. It means equivalent views of the same scene.
Andy - a capture set is a set of equivalent information. To get all the info, the consumer gets a set of streams from each capture set.
Jonathan L. - we need to have a model of the MCU as consumer and as producer. Wants an example of each.

Stephan W. left as notetaker.

Discussion of how it is handled if there are 300 endpoints - we don't want each consumer to have 300 views. The MCU will have some way of deciding which streams to offer.
Allyn - isn't this covered by the fact that the provider can offer whatever it wants and the receiver can choose what it wants?
Jonathan - just trying to see if this model works. Maybe he will provide a scenario that can be used for an example.

Paul K - what does "mixed" mean? Wants a more precise definition so provider and consumer can know when to use it. Andy - we don't want to choose algorithms.
Roni - it will be a problem for the MCU to be able to change its advertisements. Andy - yes, it is perfectly valid to send as many of each type of message as things change. The MCU does have to provide what each wants, whether or not capabilities change.

Roni - what is expected from the provider to send as a capture advertisement? What it can do, or what it gets from the receiver? What it can do.

Paul C - what is a hint? It is a confusing designation. Allyn - we can change it.
Action - update the document, using discussion on the mailing list. Roni - see his email written this morning.
Mary - go through the use cases and updated framework and see where you see use cases.
John Elwell - Roni put examples on the list about what he thinks should be in and out of scope.
Roni is waiting to see what the consumer capabilities are; then he can suggest use cases.
Mark - Roni, didn't he think the missing issues are related to use cases? Mark thought they were. Some of us are having trouble understanding what Roni thinks the missing things are without seeing use cases.
Roni - today systems know the details of the other side, so they can provide a picture.

Charles - discussion of the audio model and how it handles true stereo vs. faux stereo.
Paul - linear array and stereo, what's the difference? Mark - there is a case where they are the same, but there are other cases where it is different. You can have a linear array where both channels are in the center.
Jonathan - stereo in one RTP stream. No more discussion?
Mary - agree to have another interim before the November meeting.
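Several threads above turn on how an ordered row of video captures maps onto a consumer's screens, e.g. Christian's question about a 3-screen system receiving only VC1. The sketch below is one hypothetical consumer-side policy; the framework deliberately leaves rendering choices to the consumer, and all names here are invented:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class VideoCapture:
    """One video capture (e.g. "VC1") from a provider's capture set."""
    capture_id: str

def assign_screens(row: List[VideoCapture],
                   num_screens: int) -> List[Optional[VideoCapture]]:
    """Map a left-to-right row of captures onto a row of screens.

    Centring when there are fewer captures than screens is just one
    plausible policy; the framework draft does not prescribe one.
    """
    if len(row) >= num_screens:
        # More captures than screens: show a contiguous run (one option).
        return row[:num_screens]
    screens: List[Optional[VideoCapture]] = [None] * num_screens
    offset = (num_screens - len(row)) // 2  # centre the row of captures
    for i, vc in enumerate(row):
        screens[offset + i] = vc
    return screens

# A 3-screen consumer receiving a single capture VC1:
screens = assign_screens([VideoCapture("VC1")], 3)
# Under this centring policy: [None, VC1, None], i.e. the centre screen.
```

Note that under a purely positional policy like this one, (VC2, VC1) keeps VC1 on the centre screen while (VC1, VC3) moves VC1 to the left screen, which is exactly why Christian asks whether the clause 6.1 numbering should be carried in the encoding.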

CLUE Interim Aug. 23 2011

Notes taken by Allyn Romanow

Beginning 9:30 am, meeting started at = 9:00

Roni-  observation that consumer capability = description is different than media capture and not currently covered in the draft. = Needs to be put into draft and made explicit.

Andy go over his slides

Brian Examples

Illustrative rather than real world = examples

Single camera, single mic

3 camera endpoint

When captures in a row, there is a “left = of” implied

In capture set, if  there are 2 rows, it = means 2 different view of the scene

For example, vc0, vc1, vc2 and VC3 the “switched” view

Consumer can choose whatever rows  = wanted

Suppose a 2 screen endpoint – what would = it choose since neither is perfect? Up to the implementer of the = endpoint

 

MCU providing for multiple endpoints. Can receive = all the captures, or not.

Presentation. New capture scene. A second, and = therefore orthogonal capture set.

Question – about capture = set.

Answer - simultaneous transmission set describes non-overlapping uses of camera

Capture set does not convey info about = simultaneity- that is in a simultaneity list. Means equivalent views of the same = scene.

Andy- a capture set is a set of equivalent = information. To get all the info, consumer gets a set of streams from each capture = set.

Jonathan L. we need to have model of MCU as = consumer and as producer. Wants an example of each. 

Stephan W. Left as notetaker

Discussion of how it is handled if there are 300 endpoints - don't want each consumer to have 300 views. MCU will have some way of deciding which streams to offer.
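The MCU-side decision discussed here might look like the following sketch: rather than advertising one capture per endpoint in a 300-party call, the MCU advertises a bounded set, e.g. switched captures of the most recently active speakers. The policy and names are hypothetical - the notes only say the MCU "will have some way of deciding":

```python
# Hypothetical MCU policy for bounding its capture advertisement.

def mcu_advertised_captures(active_speakers, max_captures=3):
    """Return capture labels for up to max_captures speakers,
    most recently active first (an illustrative selection policy,
    not framework-defined)."""
    return ["switched:" + ep for ep in active_speakers[:max_captures]]

# A 300-endpoint call collapses to a handful of advertised captures.
speakers = ["ep%d" % i for i in range(300)]
```

The consumer never sees 300 views; it chooses among the few captures the MCU elected to offer, which is consistent with the provider offering what it wants and the receiver choosing from that.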

Allyn - isn't this covered by the fact that the provider can offer whatever it wants and the receiver can choose what it wants?

Jonathan - just trying to see if this model works

Maybe he will provide a scenario that can be used for an example

 

Paul K - What does "mixed" mean? Wants a more precise definition so provider and consumer can know when to use it. Andy - we don't want to choose algorithms.

Roni - it will be a problem for the MCU to be able to change its advertisements. Andy - yes, it is perfectly valid to send as many of each type of message as things change. The MCU does have to provide what each consumer wants, whether or not capabilities change.

 

Roni - what is expected from the provider to send as a capture advertisement? What he can do, or what he gets from the receiver? What he can do.

 

Paul C - what is a hint? It is a confusing designation. Allyn - we can change it.

Action - update the document, using discussion on the mailing list.

Roni - see his email written this morning.

Mary - go through the use cases and the updated framework and see where you see use cases.

John Elwell - Roni put examples on the list about what he thinks should be in and out of scope.

Roni is waiting to see what the consumer capabilities are; then he can suggest use cases.

Mark - Roni didn't think the missing issues are related to use cases? Mark thought they were. Some of us are having trouble understanding what Roni thinks the missing things are without seeing use cases.

Roni - today, systems know the details of the other side, so they can provide a picture

 

Charles - discussion of the audio model and how it handles true stereo vs faux stereo

Paul - linear array and stereo, what's the difference?

Mark - there is a case where they are the same, but there are other cases where it is different. Can have a linear array where both are in the center.

Jonathan - stereo in one RTP stream.

No more discussion?

Mary - Agreed to have another interim before the November meeting.

 

 

 

From: Andy Pepperell <apeppere@cisco.com>
Date: Thu, 25 Aug 2011 17:46:36 +0100
To: clue@ietf.org
Subject: Re: [clue] Questions on basic message flow in the framework

Thanks Charles! To follow up:

>>[Roni]
>> I was looking at the use cases of 3 to 3 and 3 to 1 and tried to understand what will be conveyed in the three messages and how we will use the information.
>> The first question I had was how this relates to SIP. At which stage of the SIP call will the consumer advertize its capabilities?
>[Charles]
>I think it is best to focus on the framework a bit more before mapping to SIP.

Yes, that's been our approach so far...

>>[Roni]
>> The third question I had was if these three messages can be repeated at any time or do we see a different message to request a mode change.
>[Charles]
>My understanding, based on the presentation in the virtual meeting, is that these messages, though shown as an ordered exchange, could theoretically come in any order at any time.

While for the purposes of producing robust implementations messages would need to be handled "in any order at any time", within the model as proposed there are some constraints - specifically, a media stream provider must not send a media capture advertisement until it's seen at least one consumer capability advertisement, and a consumer would not be able to send a stream configuration message until it's seen at least one media capture advertisement from the provider.
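The ordering constraint described here can be sketched as two small state machines. The class and method names are assumptions for illustration; the framework discussion defines the constraint, not this API:

```python
# Sketch of the initial-ordering rule: a provider must not send its
# media capture advertisement before seeing a consumer capability
# advertisement, and a consumer must not send a stream configuration
# before seeing a media capture advertisement. After the initial
# exchange, any message may recur in any order.

class ProviderState:
    def __init__(self):
        self.seen_consumer_caps = False

    def on_consumer_capability_advertisement(self, msg):
        # Any later consumer advertisement may also trigger a new
        # media capture advertisement; here we only track gating.
        self.seen_consumer_caps = True

    def may_send_capture_advertisement(self):
        return self.seen_consumer_caps


class ConsumerState:
    def __init__(self):
        self.seen_capture_advertisement = False

    def on_media_capture_advertisement(self, msg):
        self.seen_capture_advertisement = True

    def may_send_configure(self):
        return self.seen_capture_advertisement
```

Both sides start blocked; once the initial exchange has happened, the gates stay open, matching the distinction between call establishment and later updates.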
Andy

On 24/08/2011 21:09, Charles Eckel (eckelcu) wrote:
> Hi Roni,
>
> Please see inline.
>
>> -----Original Message-----
>> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Roni Even
>> Sent: Wednesday, August 24, 2011 6:30 AM
>> To: clue@ietf.org
>> Subject: [clue] Questions on basic message flow in the framework
>>
>> Hi,
>>
>> In the interim meeting I mentioned that I support the model but think that there are parameters that I would like to add. At the meeting it was clear to me that there will be a new revision soon that will support parameters at the capture scene level. Trying to see which parameters I would like to see supported, I looked at the message flow and I have some questions.
>> Andy presented the basic message flow with three messages:
>>
>> 1. Consumer capability advertisement
>> 2. Provider - media capture advertisement
>> 3. Consumer configuration of provider streams.
>>
>> I was looking at the use cases of 3 to 3 and 3 to 1 and tried to understand what will be conveyed in the three messages and how we will use the information.
>>
>> The first question I had was how this relates to SIP. At which stage of the SIP call will the consumer advertize its capabilities?
> I think it is best to focus on the framework a bit more before mapping to SIP.
>
>> In the second part, I was then looking at a telepresence system that has 3 65" screens where the distance between the screens, including the frames, is 6". The system has three cameras, each mounted on the center of a screen. The system faces a room with three rows; each row seats 6 people, and each camera is capable of capturing a third of the room, but the default views of the cameras do not overlap. The cameras support zoom and pan (local from the application).
>> The system can decode up to four video streams, where one is presentation (H.239 like). The system can support an internal 4-way multipoint call, meaning that it can receive the three main video streams from one, two or three endpoints.
>>
>> I think that this is a very standard system, nothing special.
>>
>> The telepresence application is willing to provide all this information as part of the consumer capability advertisement, and according to Andy's slides the message includes physical factors, user preferences and software limitations.
>>
>> I am now trying to understand what the purpose of the consumer capability advertisement is, in order to see what information is important to convey.
>>
>> Is the reason for the consumer capability advertisement to allow the provider to propose a better media capability advertisement, or is it to allow the provider to optimize the content of the media streams he is sending based on the information provided? This will help with looking at which parameters can be used. The slides show that the information is used for the capability advertisements.
> I viewed it as being primarily for the former, but using it for the latter may make sense as well and should not be excluded. The extent to which the provider actually uses the information is implementation dependent.
>
>> The third question I had was if these three messages can be repeated at any time or do we see a different message to request a mode change.
>
> My understanding, based on the presentation in the virtual meeting, is that these messages, though shown as an ordered exchange, could theoretically come in any order at any time.
>
> Cheers,
> Charles
>
>> Thanks
>>
>> Roni Even
>
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue

From: Charles Eckel (eckelcu) <eckelcu@cisco.com>
Date: Thu, 25 Aug 2011 09:58:07 -0700
To: Andrew Pepperell (apeppere), clue@ietf.org
Subject: Re: [clue] Questions on basic message flow in the framework

Hi Andy,

One more question inline to make sure I have it straight.

> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Andrew Pepperell (apeppere)
> Sent: Thursday, August 25, 2011 9:47 AM
> To: clue@ietf.org
> Subject: Re: [clue] Questions on basic message flow in the framework
>
> Thanks Charles! To follow up:
>
> >>[Roni]
> >> I was looking at the use cases of 3 to 3 and 3 to 1 and tried to understand what will be conveyed in the three messages and how we will use the information.
> >> The first question I had was how this relates to SIP. At which stage of the SIP call will the consumer advertize its capabilities?
> >[Charles]
> >I think it is best to focus on the framework a bit more before mapping to SIP.
>
> Yes, that's been our approach so far...
>
> >>[Roni]
> >> The third question I had was if these three messages can be repeated at any time or do we see a different message to request a mode change.
> >[Charles]
> >My understanding, based on the presentation in the virtual meeting, is that these messages, though shown as an ordered exchange, could theoretically come in any order at any time.
>
> While for the purposes of producing robust implementations messages would need to be handled "in any order at any time", within the model as proposed there are some constraints - specifically, a media stream provider must not send a media capture advertisement until it's seen at least one consumer capability advertisement, and a consumer would not be able to send a stream configuration message until it's seen at least one media capture advertisement from the provider.

At call establishment, the message flow must be:

  Consumer                                        Provider
  --------------------------------------------------------
  (1) Consumer capability advertisement -->
                 <-- (2) Provider media capture advertisement
  (3) Consumer configuration of provider streams -->

But after that, an updated version of (1) or (2) or (3) could be sent/rcvd at any time, and it is okay for the provider to send an updated (2) without first receiving an updated (1), etc.
Is that correct?

Thanks,
Charles

From: Paul Kyzivat <pkyzivat@alum.mit.edu>
Date: Thu, 25 Aug 2011 13:06:30 -0400
To: clue@ietf.org
Subject: Re: [clue] Questions on basic message flow in the framework

On 8/25/11 12:46 PM, Andy Pepperell wrote:
> While for the purposes of producing robust implementations messages
> would need to be handled "in any order at any time", within the model as
> proposed there are some constraints - specifically, a media stream
> provider must not send a media capture advertisement until it's seen at
> least one consumer capability advertisement, and a consumer would not be
> able to send a stream configuration message until it's seen at least one
> media capture advertisement from the provider.

While I think a mechanism may end up imposing some limitations such as this, I see no reason to impose such restrictions a priori.
Thanks,
Paul

From: Andy Pepperell <apeppere@cisco.com>
Date: Thu, 25 Aug 2011 18:16:34 +0100
To: Charles Eckel (eckelcu)
Cc: clue@ietf.org
Subject: Re: [clue] Questions on basic message flow in the framework

Hi Charles,

> At call establishment, the message flow must be:
>
>   Consumer                                        Provider
>   --------------------------------------------------------
>   (1) Consumer capability advertisement -->
>                  <-- (2) Provider media capture advertisement
>   (3) Consumer configuration of provider streams -->
>
> But after that, an updated version of (1) or (2) or (3) could be sent/rcvd at any time, and it is okay for the provider to send an updated (2) without first receiving an updated (1), etc.
> Is that correct?

That's the scheme in the framework, yes. It may be that new information in a consumer capability advertisement (1) causes a new media capture advertisement (2) to need to be sent, or a different stimulus might cause a new (2) to be needed (e.g. connection / disconnection of a presentation source).

Regards,
Andy

From: Andy Pepperell <apeppere@cisco.com>
Date: Thu, 25 Aug 2011 18:22:15 +0100
To: clue@ietf.org
Subject: Re: [clue] Questions on basic message flow in the framework

Hi Paul,

> While I think a mechanism may end up imposing some limitations such as this, I see no reason to impose such restrictions a priori.

I see this as more than a mere "restriction"; specifically, if a device intends to use the consumer capability advertisement to determine its media capture advertisement (for instance, by acting on the list of attributes understood by the consumer), then it seems much better to require at least one of these to be sent than to make it optional and force vendors to introduce a timeout scheme in which their devices wait around for a while to see whether they receive a message and, if not, give up after a certain time interval (which would no doubt be inconsistent between different manufacturers).
Regards,
Andy

On 25/08/2011 18:06, Paul Kyzivat wrote:
> On 8/25/11 12:46 PM, Andy Pepperell wrote:
>
>> While, for the purposes of producing robust implementations, messages
>> would need to be handled "in any order at any time", within the model as
>> proposed there are some constraints - specifically, a media stream
>> provider must not send a media capture advertisement until it has seen
>> at least one consumer capability advertisement, and a consumer would not
>> be able to send a stream configuration message until it has seen at
>> least one media capture advertisement from the provider.
>
> While I think a mechanism may end up imposing some limitations such as
> this, I see no reason to impose such restrictions a priori.
>
> Thanks,
> Paul
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue

From Even.roni@huawei.com Thu Aug 25 12:00:40 2011
Message-id: <00a901cc6359$484f0880$d8ed1980$%roni@huawei.com>
Date: Thu, 25 Aug 2011 22:00:32 +0300
From: Roni Even <Even.roni@huawei.com>
To: 'Andy Pepperell', clue@ietf.org
In-reply-to: <4E567C6C.9010504@cisco.com>
Subject: Re: [clue] Questions on basic message flow in the framework

Hi,
During the initial discussion of this work, one of the issues was whether we
are talking about one-stage or two-stage signaling. As far as I remember we
talked about one-stage signaling; is this still the case, or do we still
keep it open? This is why I asked about the mapping to SIP, and why I think
we need to consider it early, to verify that the proposed message flow works
with one-stage signaling.
Regards
Roni

> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> Andy Pepperell
> Sent: Thursday, August 25, 2011 7:47 PM
> To: clue@ietf.org
> Subject: Re: [clue] Questions on basic message flow in the framework
>
> Thanks Charles! To follow up:
>
> >>[Roni]
> >> I was looking at the use cases of 3 to 3 and 3 to 1 and tried to
> understand what will be conveyed in the three messages and how we will
> use the information.
> >> The first question I had was how this relates to SIP. At which stage
> of the SIP call will the consumer advertise its capabilities?
> >[Charles]
> >I think it is best to focus on the framework a bit more before mapping
> to SIP.
>
> Yes, that's been our approach so far...
>
> >>[Roni]
> >> The third question I had was whether these three messages can be
> repeated at any time, or do we see a different message to request a mode
> change.
> >[Charles]
> >My understanding, based on the presentation in the virtual meeting, is
> that these messages, though shown as an ordered exchange, could
> theoretically come in any order at any time.
>
> While, for the purposes of producing robust implementations, messages
> would need to be handled "in any order at any time", within the model as
> proposed there are some constraints - specifically, a media stream
> provider must not send a media capture advertisement until it has seen
> at least one consumer capability advertisement, and a consumer would not
> be able to send a stream configuration message until it has seen at
> least one media capture advertisement from the provider.
>
> Andy
>
> On 24/08/2011 21:09, Charles Eckel (eckelcu) wrote:
> > Hi Roni,
> >
> > Please see inline.
> >
> >> -----Original Message-----
> >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf
> > Of Roni Even
> >> Sent: Wednesday, August 24, 2011 6:30 AM
> >> To: clue@ietf.org
> >> Subject: [clue] Questions on basic message flow in the framework
> >>
> >> Hi,
> >>
> >> In the interim meeting I mentioned that I support the model, but I
> >> think that there are parameters I would like to add. At the meeting
> >> it was clear to me that there will be a new revision soon that will
> >> support parameters at the capture scene level. Trying to see which
> >> parameters I would like to see supported, I looked at the message
> >> flow and I have some questions.
> >> Andy presented the basic message flow with three messages:
> >>
> >> 1. Consumer capability advertisement
> >> 2. Provider - media capture advertisement
> >> 3. Consumer configuration of provider streams
> >>
> >> I was looking at the use cases of 3 to 3 and 3 to 1 and tried to
> >> understand what will be conveyed in the three messages and how we
> >> will use the information.
> >>
> >> The first question I had was how this relates to SIP. At which stage
> >> of the SIP call will the consumer advertise its capabilities?
> > I think it is best to focus on the framework a bit more before
> > mapping to SIP.
> >
> >> In the second part, I was looking at a telepresence system that has
> >> three 65" screens where the distance between the screens, including
> >> the frames, is 6". The system has three cameras, each mounted at the
> >> center of a screen. The system faces a room with three rows; each
> >> row seats 6 people, and each camera is capable of capturing a third
> >> of the room, but the default views of the cameras do not overlap.
> >> The cameras support zoom and pan (local from the application).
> >> The system can decode up to four video streams, where one is
> >> presentation (H.239-like). The system can support an internal 4-way
> >> multipoint call, meaning that it can receive the three main video
> >> streams from one, two or three endpoints.
> >>
> >> I think that this is a very standard system, nothing special.
> >>
> >> The telepresence application is willing to provide all this
> >> information as part of the consumer capability advertisement, and
> >> according to Andy's slides the message includes physical factors,
> >> user preferences and software limitations.
> >>
> >> I am now trying to understand what the purpose of the consumer
> >> capability advertisement is, in order to see what information is
> >> important to convey.
> >>
> >> Is the reason for the consumer capability advertisement to allow the
> >> provider to propose a better media capability advertisement, or is
> >> it to allow the provider to optimize the content of the media
> >> streams it is sending based on the information provided? This will
> >> help with looking at which parameters can be used. The slides show
> >> that the information is used for the capability advertisements.
> > I viewed it as being primarily for the former, but using it for the
> > latter may make sense as well and should not be excluded. The extent
> > to which the provider actually uses the information is implementation
> > dependent.
> >
> >> The third question I had was whether these three messages can be
> >> repeated at any time, or do we see a different message to request a
> >> mode change.
> >>
> > My understanding, based on the presentation in the virtual meeting,
> > is that these messages, though shown as an ordered exchange, could
> > theoretically come in any order at any time.
> >
> > Cheers,
> > Charles
> >
> >> Thanks
> >>
> >> Roni Even
> >>
> > _______________________________________________
> > clue mailing list
> > clue@ietf.org
> > https://www.ietf.org/mailman/listinfo/clue
>
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue

From allyn@cisco.com Thu Aug 25 12:07:04 2011
Message-ID: <9AC2C4348FD86B4BB1F8FA9C5E3A5EDC05603802@xmb-sjc-221.amer.cisco.com>
Date: Thu, 25 Aug 2011 12:08:14 -0700
From: "Allyn Romanow (allyn)" <allyn@cisco.com>
To: "Roni Even", "Andrew Pepperell (apeppere)", clue@ietf.org
In-Reply-To: <00a901cc6359$484f0880$d8ed1980$%roni@huawei.com>
Subject: Re: [clue] Questions on basic message flow in the framework

One option would be to establish CLUE through SIP, and then these messages
are CLUE messages, not SIP messages.
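Whether the three messages run over SIP or over a separately established CLUE channel, the ordering constraints discussed earlier in the thread (a provider waits for a consumer capability advertisement before advertising captures; a consumer waits for a capture advertisement before configuring streams) can be sketched as simple guard conditions. All class, method, and message names below are invented for illustration; they are not taken from any CLUE draft:

```python
# Sketch of the ordering constraints discussed in this thread.
# Names are illustrative only, not from any CLUE document.

class ProtocolError(Exception):
    pass

class Provider:
    def __init__(self):
        self.consumer_caps = None

    def receive_consumer_capability_advertisement(self, caps):
        self.consumer_caps = caps

    def send_media_capture_advertisement(self):
        # Constraint: must not advertise captures before at least one
        # consumer capability advertisement has been seen.
        if self.consumer_caps is None:
            raise ProtocolError("no consumer capability advertisement yet")
        return {"captures": ["VC0", "VC1", "VC2"]}

class Consumer:
    def __init__(self):
        self.capture_adv = None

    def receive_media_capture_advertisement(self, adv):
        self.capture_adv = adv

    def send_stream_configuration(self):
        # Constraint: cannot configure streams before at least one media
        # capture advertisement has been seen.
        if self.capture_adv is None:
            raise ProtocolError("no media capture advertisement yet")
        return {"configure": self.capture_adv["captures"][:1]}

# The three-message exchange, in order:
p, c = Provider(), Consumer()
p.receive_consumer_capability_advertisement({"max_video_decodes": 4})
c.receive_media_capture_advertisement(p.send_media_capture_advertisement())
print(c.send_stream_configuration())
```

Andy's argument above is essentially that these guards should be mandatory in the protocol rather than left to per-vendor timeout heuristics.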
> -----Original Message-----
> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of
> Roni Even
> Sent: Thursday, August 25, 2011 12:01 PM
> To: Andrew Pepperell (apeppere); clue@ietf.org
> Subject: Re: [clue] Questions on basic message flow in the framework
>
> Hi,
> During the initial discussion of this work, one of the issues was whether
> we are talking about one-stage or two-stage signaling. As far as I
> remember we talked about one-stage signaling; is this still the case, or
> do we still keep it open? This is why I asked about the mapping to SIP,
> and why I think we need to consider it early, to verify that the proposed
> message flow works with one-stage signaling.
> Regards
> Roni

_______________________________________________
clue mailing list
clue@ietf.org
https://www.ietf.org/mailman/listinfo/clue

From Even.roni@huawei.com Thu Aug 25 12:48:03 2011
Message-id: <00b001cc635f$f029b590$d07d20b0$%roni@huawei.com>
Date: Thu, 25 Aug 2011 22:48:13 +0300
From: Roni Even <Even.roni@huawei.com>
To: 'Allyn Romanow (allyn)', 'Andrew Pepperell (apeppere)', clue@ietf.org
In-reply-to: <9AC2C4348FD86B4BB1F8FA9C5E3A5EDC05603802@xmb-sjc-221.amer.cisco.com>
Subject: Re: [clue] Questions on basic message flow in the framework

Hi,
This will mean two stage: the initial SIP exchange will require a valid SDP
for backward interoperability that will open one audio, one video and CLUE channels,
and only afterwards will the full telepresence session be added. We can say
that systems that support CLUE will wait for the exchange on the CLUE
channel before establishing media, which adds delay. If we use a multipart
body in the SIP message, we will still need to discuss how to fit the
three-message exchange into an offer/answer dialog.

The way we choose will also affect my third question, about using the
messages after the initial media channels are running. This is why I was
asking whether we need mode change messages.

Roni

> -----Original Message-----
> From: Allyn Romanow (allyn) [mailto:allyn@cisco.com]
> Sent: Thursday, August 25, 2011 10:08 PM
> To: Roni Even; Andrew Pepperell (apeppere); clue@ietf.org
> Subject: RE: [clue] Questions on basic message flow in the framework
>
> One option would be to establish CLUE through SIP, and then these
> messages are CLUE messages, not SIP messages.

From pkyzivat@alum.mit.edu Fri Aug 26 07:12:24 2011
Message-ID: <4E57AA11.6090704@alum.mit.edu>
Date: Fri, 26 Aug 2011 10:13:37 -0400
From: Paul Kyzivat <pkyzivat@alum.mit.edu>
To: clue@ietf.org
In-Reply-To: <00b001cc635f$f029b590$d07d20b0$%roni@huawei.com>
Subject: Re: [clue] Questions on basic message flow in the framework

On 8/25/11 3:48 PM, Roni Even wrote:
> Hi,
> This will mean two stage: the initial SIP exchange will require a valid
> SDP for backward interoperability that will open one audio, one video and
> CLUE channels, and only afterwards will the full telepresence session be
> added. We can say that systems that support CLUE will wait for the
> exchange on the CLUE channel before establishing media, which adds delay.
> If we use a multipart body in the SIP message, we will still need to
> discuss how to fit the three-message exchange into an offer/answer dialog.
>
> The way we choose will also affect my third question, about using the
> messages after the initial media channels are running. This is why I was
> asking whether we need mode change messages.

There are many ways to accomplish this.
Is your concern the call setup delay while more messages are exchanged? Or is it other aspects of user experience, such as establishing one video stream before the others? Delay due to extra message exchange can be hidden so that the user experience isn't diminished (much). E.g. the extra message(s) can be exchanged while "ringing" (on calling side) and before alerting commences (on called side). For instance, there could be multiple O/A exchanges, with preconditions used to delay the alerting. There could then also be exchanges over a "clue" stream between the first o/a and the last one resolving the preconditions. I don't think it's necessary to decide on this mechanism yet. Thanks Paul > Roni > >> -----Original Message----- >> From: Allyn Romanow (allyn) [mailto:allyn@cisco.com] >> Sent: Thursday, August 25, 2011 10:08 PM >> To: Roni Even; Andrew Pepperell (apeppere); clue@ietf.org >> Subject: RE: [clue] Questions on basic message flow in the framework >> >> One option would be to establish CLUE through SIP, and then these >> messages are CLUE messages, not SIP messages. >> >>> -----Original Message----- >>> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf >> Of >>> Roni Even >>> Sent: Thursday, August 25, 2011 12:01 PM >>> To: Andrew Pepperell (apeppere); clue@ietf.org >>> Subject: Re: [clue] Questions on basic message flow in the framework >>> >>> Hi, >>> During the initial discussion on this work one of the issues was if >> we >>> are >>> talking about one or two stage signaling. As far as I remember we >>> talked >>> about one stage signaling, is this still the case or do we still keep >>> it >>> open. This was why I asked about the mapping to SIP and why I think >> we >>> need >>> to consider it early to verify if the proposed message flow works >> with >>> one >>> stage signaling. 
>>> Regards >>> Roni >>> >>>> -----Original Message----- >>>> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On >> Behalf >>> Of >>>> Andy Pepperell >>>> Sent: Thursday, August 25, 2011 7:47 PM >>>> To: clue@ietf.org >>>> Subject: Re: [clue] Questions on basic message flow in the >> framework >>>> >>>> Thanks Charles! To follow up: >>>> >>>> >>[Roni] >>>> >> I was looking at the use cases of 3 to 3 and 3 to 1 and tried >> to >>>> understand what will be conveyed in the three messages and how will >>> we >>>> use the information. >>>> >> The first question I had was how this relates to SIP. At which >>>> stage >>>> of the SIP call will the consumer advertize its capabilities? >>>> >[Charles] >>>> >I think it is best to focus on the framework a bit more before >>>> mapping >>>> to SIP. >>>> >>>> Yes, that's been our approach so far... >>>> >>>> >>[Roni] >>>> >> The third question I had was if these three messages can be >>>> repeated >>>> at any time or do we see a different message to request a mode >>> change. >>>> >[Charles] >>>> >My understanding, based on the presentation in the virtual >> meeting, >>>> is >>>> that these messages, though shown as an ordered exchange, could >>>> theoretically come in any order at any time. >>>> >>>> While for the purposes of producing robust implementations >>> messages >>>> would need to be handled "in any order at any time", within the >> model >>>> as >>>> proposed there are some constraints - specifically, a media stream >>>> provider must not send a media capture advertisement until it's >> seen >>> at >>>> least one consumer capability advertisement, and a consumer would >> not >>>> be >>>> able to send a stream configuration message until it's seen at >> least >>>> one >>>> media capture advertisement from the provider. >>>> >>>> Andy >>>> >>>> >>>> On 24/08/2011 21:09, Charles Eckel (eckelcu) wrote: >>>>> Hi Roni, >>>>> >>>>> Please see inline. 
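Andy's constraints above reduce to two ordering rules: a provider may not send a media capture advertisement (MCA) until it has seen at least one consumer capability advertisement (CCA), and a consumer may not send a stream configuration until it has seen at least one MCA. A minimal sketch of that state tracking (class and method names are illustrative only, not from any CLUE draft):

```python
class ClueEndpointState:
    """Tracks Andy's two ordering constraints on the three CLUE messages."""

    def __init__(self):
        self.cca_seen = False   # provider side: has any CCA arrived yet?
        self.mca_seen = False   # consumer side: has any MCA arrived yet?

    def on_cca(self):
        self.cca_seen = True

    def on_mca(self):
        self.mca_seen = True

    def provider_may_send_mca(self) -> bool:
        # A provider must not advertise captures before seeing a CCA.
        return self.cca_seen

    def consumer_may_send_config(self) -> bool:
        # A consumer cannot configure streams before seeing an MCA.
        return self.mca_seen
```

Beyond these two gates, the messages can still arrive "in any order at any time", which is why repeat CCAs or MCAs mid-session need no extra state here.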
>>>>> >>>>>> -----Original Message----- >>>>>> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On >>> Behalf >>>>> Of Roni Even >>>>>> Sent: Wednesday, August 24, 2011 6:30 AM >>>>>> To: clue@ietf.org >>>>>> Subject: [clue] Questions on basic message flow in the framework >>>>>> >>>>>> Hi, >>>>>> >>>>>> In the interim meeting I mentioned that I that I support the >> model >>>> but >>>>> think that there are parameters >>>>>> that I would like to add. At the meeting it was clear to me that >>>> there >>>>> will be a new revision soon >>>>>> that will support parameters at the capture scene level. Trying >> to >>>> see >>>>> which parameters I would like >>>>>> to see supported I looked at the message flow and I have some >>>>> questions. >>>>>> Andy presented the basic message flow with three messages: >>>>>> >>>>>> >>>>>> >>>>>> 1. Consumer capability advertisement >>>>>> >>>>>> 2. Provider - media capture advertisement >>>>>> >>>>>> 3. Consumer configuration of provider streams. >>>>>> >>>>>> >>>>>> >>>>>> I was looking at the use cases of 3 to 3 and 3 to 1 and tried to >>>>> understand what will be conveyed in >>>>>> the three messages and how will we use the information. >>>>>> >>>>>> The first question I had was how this relates to SIP. At which >>> stage >>>>> of the SIP call will the consumer >>>>>> advertize its capabilities? >>>>> I think it is best to focus on the framework a bit more before >>>> mapping >>>>> to SIP. >>>>> >>>>>> In the second part, I was then looking at a telepresence system >>>> that >>>>> has 3 65" screens where the >>>>>> distance between the screens including the frames is 6". The >>> system >>>>> has three cameras, each mounted on >>>>>> the center of a screen. The system is facing a room with three >>> rows >>>>> each row sits 6 people and each >>>>>> camera is capable of capturing a third of the room but the >> default >>>>> views of each camera does not >>>>>> overlap with the others. 
The cameras support zoom and pan (local >>>> from >>>>> the application). >>>>>> The system can decode up to four video streams where one is >>>>> presentation (H.239 like). The system can >>>>>> support an internal 4-way multipoint call, means that it can >>> receive >>>>> the three main video streams from >>>>>> one, two or three endpoints. >>>>>> >>>>>> I think that this is a very standard system, nothing special. >>>>>> >>>>>> The telepresence application is willing to provide all this >>>>> information as part of the consumer >>>>>> capability advertisement and according to Andy's slides the >>> message >>>>> include physical factors , user >>>>>> preferences and software limitations. >>>>>> >>>>>> I am now trying to understand what the purpose of the consumer >>>>> capability advertisement is in order to >>>>>> see what information is important to convey. >>>>>> >>>>>> Is the reason for the consumer capability advertisement to allow >>> the >>>>> provider to propose a better >>>>>> media capability advertisement, or is it to allow the provider >> to >>>>> optimize the content of the media >>>>>> streams he is sending based on the information provided. This >> will >>>>> help with looking at which >>>>>> parameters can be used. The slides show that the information is >>> used >>>>> for the capability >>>>>> advertisements. >>>>> I viewed it as being primarily for the former, but using it for >> the >>>>> latter may make sense as well and should not be excluded. The >>> extent >>>> to >>>>> which the provider actually uses the information is >> implementation >>>>> dependent. >>>>> >>>>>> >>>>>> >>>>>> The third question I had was if these three messages can be >>> repeated >>>>> at any time or do we see a >>>>>> different message to request a mode change. 
>>>>>> >>>>> My understanding, based on the presentation in the virtual >> meeting, >>>> is >>>>> that these messages, though shown as an ordered exchange, could >>>>> theoretically come in any order at any time. >>>>> >>>>> Cheers, >>>>> Charles >>>>> >>>>>> Thanks >>>>>> >>>>>> Roni Even >>>>>> >>>>>> >>>>> _______________________________________________ >>>>> clue mailing list >>>>> clue@ietf.org >>>>> https://www.ietf.org/mailman/listinfo/clue >>>> >>>> _______________________________________________ >>>> clue mailing list >>>> clue@ietf.org >>>> https://www.ietf.org/mailman/listinfo/clue >>> >>> _______________________________________________ >>> clue mailing list >>> clue@ietf.org >>> https://www.ietf.org/mailman/listinfo/clue > > _______________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/listinfo/clue > From marshall.eubanks@gmail.com Fri Aug 26 08:42:01 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 5899021F8B7B for ; Fri, 26 Aug 2011 08:42:01 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -102.329 X-Spam-Level: X-Spam-Status: No, score=-102.329 tagged_above=-999 required=5 tests=[AWL=-0.397, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1, SARE_HTML_USL_OBFU=1.666, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id mOr0r228DvH9 for ; Fri, 26 Aug 2011 08:41:59 -0700 (PDT) Received: from mail-gx0-f172.google.com (mail-gx0-f172.google.com [209.85.161.172]) by ietfa.amsl.com (Postfix) with ESMTP id 5B28821F8B5C for ; Fri, 26 Aug 2011 08:41:59 -0700 (PDT) Received: by gxk19 with SMTP id 19so3435600gxk.31 for ; Fri, 26 Aug 2011 08:43:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; 
h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=YKKuACI0IX7/eneq/BjowkMqma/77PZufgb8a/orbvo=; b=Wju1q6A6qfLlT1uolxIGsXhJPT89oeP7l2kcXlNznAcXU01T79SAHy/vo2wlrMv0gZ 4eBUksjB3slABvBeZccgozXbZgzFzi/SvJU26pEoYU7hm+5jkPl8EN14hckmfxarazH4 zDHmXYclAtilmQvnyF3Q7yQ+aZMH4L3iKaxFs= MIME-Version: 1.0 Received: by 10.150.164.1 with SMTP id m1mr2532481ybe.297.1314373393179; Fri, 26 Aug 2011 08:43:13 -0700 (PDT) Received: by 10.150.202.16 with HTTP; Fri, 26 Aug 2011 08:43:13 -0700 (PDT) In-Reply-To: <4E57AA11.6090704@alum.mit.edu> References: <033601cc6261$f8487670$e8d96350$%roni@huawei.com> <4E567C6C.9010504@cisco.com> <00a901cc6359$484f0880$d8ed1980$%roni@huawei.com> <9AC2C4348FD86B4BB1F8FA9C5E3A5EDC05603802@xmb-sjc-221.amer.cisco.com> <00b001cc635f$f029b590$d07d20b0$%roni@huawei.com> <4E57AA11.6090704@alum.mit.edu> Date: Fri, 26 Aug 2011 11:43:13 -0400 Message-ID: From: Marshall Eubanks To: Paul Kyzivat Content-Type: multipart/alternative; boundary=000e0cd58ca033e52404ab6a67d4 Cc: clue@ietf.org Subject: Re: [clue] Questions on basic message flow in the framework X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Aug 2011 15:42:01 -0000 --000e0cd58ca033e52404ab6a67d4 Content-Type: text/plain; charset=ISO-8859-1 On Fri, Aug 26, 2011 at 10:13 AM, Paul Kyzivat wrote: > On 8/25/11 3:48 PM, Roni Even wrote: > >> Hi, >> This will mean two stage, the initial SIP exchange will require a valid >> SDP >> for backward interoperability that will open one video, one video and CLUE >> channels and only afterwards the full telepresence sessions will be added. >> We can say that systems that support CLUE will wait for the exchange in >> the >> CLUE channel to establish media which is a delay. 
>> If we use multi body in the SIP message, we will still need to discuss how >> to have the three message exchange in an offer answer dialog. >> >> The way we chose will also affect my third question about using the >> messages >> after the initial media channels are running. This is why I was asking >> about >> maybe we need mode change messages. >> > > There are many ways to accomplish this. > > Is your concern the call setup delay while more messages are exchanged? Or > is it other aspects of user experience, such as establishing one video > stream before the others? > > Delay due to extra message exchange can be hidden so that the user > experience isn't diminished (much). E.g. the extra message(s) can be > exchanged while "ringing" (on calling side) and before alerting commences > (on called side). > > This needs to be looked at but may be acceptable at call setup, which always seems to take a few seconds. I am also worried about changes _during the call_, where a few seconds delay could be bad. Here is an example (I am going to try and capture the worries I expressed in QC, and at the interim). A session starts, with two endpoints separated by (say) 150 msec one way . There is Consumer capability advertisement |-----------------------------> Media Capture Advertisement <-----------------------------| Consumer config of provider |-----------------------------> streams That takes roughly 450 msec + a little, which seems OK. 1.) Won't in practice the provider send a SDP file to the consumer, which in practice the consumer should receive and parse (if only as an error check), so won't that add ANOTHER round trip ? So, won't that take 600 msec plus a little, which is less OK ? And, won't that also mean that, if there is a 1% packet loss, there will be a (1 - (0.99)^4 =) 4% chance of a problem with these handshakes ? And won't a drop on any of these 4 messages mean a considerably longer setup delay ? And, then 2.) 
Suppose, at some point in the session, there is a network blip and inbound becomes congested. The provider needs to throttle back. The consumer detects this, says "I need to advertise less bandwidth," sends a corresponding CCA via UDP. The provider receives this, and sends a new MCA. This gets lost in the congestion. Even if three in a row are sent, they might all be lost, as the link is congested. So, then there is a timer set. Tick, tick, tick. The consumer will send more CCAs (presumably) once the timer times out, but nothing comes back. The consumer can talk to the provider, but it doesn't know it, and it isn't allowed to except through this three way, so the situation never gets resolved. Meanwhile, up at Layer 8, the company CEO is getting pissed off. (And, in some parallel universe, IESG ADs are asking questions about congestion control.) Even a 1 second timer would mean that the time to recover from _one_ control packet loss could be 1.5 seconds, which is not good. This says to me that either - providers SHOULD include information about adapting to congestion in the first handshake, so that the consumer can send an appropriate config message as needed or - consumers should be able to send a "squelch" message to the provider, saying "reduce bandwidth to me now," and let the provider puzzle it out. This could take the form of (say) a not to exceed bandwidth in the config message; a subsequent config could be the same except for a lowered NTE bandwidth. Obviously, in a really severe problem, you might want to set that to zero. or both. All of this seems fairly fundamental to me, the sort of thing that needs to be addressed if we are going to use this 3-way handshake. Regards Marshall > For instance, there could be multiple O/A exchanges, with preconditions > used to delay the alerting. There could then also be exchanges over a "clue" > stream between the first o/a and the last one resolving the preconditions. 
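Marshall's arithmetic above can be checked directly: three serialized one-way messages at 150 ms each come to 450 ms, a fourth leg for the SDP exchange he posits brings it to 600 ms, and with 1% per-message loss the chance that at least one of four messages is dropped is 1 - 0.99^4, about 4%. A quick sketch under those assumptions:

```python
ONE_WAY_MS = 150      # one-way delay Marshall assumes between the endpoints
PER_MSG_LOSS = 0.01   # his assumed 1% per-packet loss rate

def setup_time_ms(messages: int, one_way_ms: int = ONE_WAY_MS) -> int:
    """Serialized one-way messages: each adds one one-way delay."""
    return messages * one_way_ms

def handshake_failure_prob(messages: int, loss: float = PER_MSG_LOSS) -> float:
    """Probability that at least one of the handshake messages is lost."""
    return 1 - (1 - loss) ** messages

print(setup_time_ms(3))                        # 450 (three-message CLUE exchange)
print(setup_time_ms(4))                        # 600 (with the extra SDP leg)
print(round(handshake_failure_prob(4), 4))     # 0.0394, i.e. roughly 4%
```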
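Marshall's second alternative — a not-to-exceed (NTE) bandwidth carried in the consumer's configure message, re-sent with a lower cap when congestion is detected — could look roughly as follows. The message shape is entirely hypothetical; no such field existed in the framework at the time:

```python
def make_config(chosen_streams, nte_kbps):
    """Hypothetical consumer configure message carrying an NTE bandwidth cap."""
    return {"streams": list(chosen_streams), "max_bandwidth_kbps": nte_kbps}

def squelch(previous_config, new_nte_kbps):
    """Marshall's 'reduce bandwidth to me now': resend the same configuration
    with only the NTE bandwidth lowered. Setting it to zero shuts media off
    entirely in a severe congestion event."""
    assert new_nte_kbps <= previous_config["max_bandwidth_kbps"]
    return {**previous_config, "max_bandwidth_kbps": new_nte_kbps}

cfg = make_config(["main1", "main2", "main3"], nte_kbps=4000)
throttled = squelch(cfg, 1000)   # congestion detected: cap inbound at 1 Mbit/s
```

The point of the single-message form is that it avoids the full three-way exchange (and its loss exposure) exactly when the path is least able to carry extra round trips.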
> > I don't think its necessary to decide on this mechanism yet. > > Thanks > Paul > > > Roni >> >> -----Original Message----- >>> From: Allyn Romanow (allyn) [mailto:allyn@cisco.com] >>> Sent: Thursday, August 25, 2011 10:08 PM >>> To: Roni Even; Andrew Pepperell (apeppere); clue@ietf.org >>> Subject: RE: [clue] Questions on basic message flow in the framework >>> >>> One option would be to establish CLUE through SIP, and then these >>> messages are CLUE messages, not SIP messages. >>> >>> -----Original Message----- >>>> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf >>>> >>> Of >>> >>>> Roni Even >>>> Sent: Thursday, August 25, 2011 12:01 PM >>>> To: Andrew Pepperell (apeppere); clue@ietf.org >>>> Subject: Re: [clue] Questions on basic message flow in the framework >>>> >>>> Hi, >>>> During the initial discussion on this work one of the issues was if >>>> >>> we >>> >>>> are >>>> talking about one or two stage signaling. As far as I remember we >>>> talked >>>> about one stage signaling, is this still the case or do we still keep >>>> it >>>> open. This was way I asked about the mapping to SIP and why I think >>>> >>> we >>> >>>> need >>>> to consider it early to verify if the proposed message flow works >>>> >>> with >>> >>>> one >>>> stage signaling. >>>> Regards >>>> Roni >>>> >>>> -----Original Message----- >>>>> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On >>>>> >>>> Behalf >>> >>>> Of >>>> >>>>> Andy Pepperell >>>>> Sent: Thursday, August 25, 2011 7:47 PM >>>>> To: clue@ietf.org >>>>> Subject: Re: [clue] Questions on basic message flow in the >>>>> >>>> framework >>> >>>> >>>>> Thanks Charles! To follow up: >>>>> >>>>> >>[Roni] >>>>> >> I was looking at the use cases of 3 to 3 and 3 to 1 and tried >>>>> >>>> to >>> >>>> understand what will be conveyed in the three messages and how will >>>>> >>>> we >>>> >>>>> use the information. >>>>> >> The first question I had was how this relates to SIP. 
At which >>>>> stage >>>>> of the SIP call will the consumer advertize its capabilities? >>>>> >[Charles] >>>>> >I think it is best to focus on the framework a bit more before >>>>> mapping >>>>> to SIP. >>>>> >>>>> Yes, that's been our approach so far... >>>>> >>>>> >>[Roni] >>>>> >> The third question I had was if these three messages can be >>>>> repeated >>>>> at any time or do we see a different message to request a mode >>>>> >>>> change. >>>> >>>>> >[Charles] >>>>> >My understanding, based on the presentation in the virtual >>>>> >>>> meeting, >>> >>>> is >>>>> that these messages, though shown as an ordered exchange, could >>>>> theoretically come in any order at any time. >>>>> >>>>> While for the purposes of of producing robust implementations >>>>> >>>> messages >>>> >>>>> would need to be handled "in any order at any time", within the >>>>> >>>> model >>> >>>> as >>>>> proposed there are some constraints - specifically, a media stream >>>>> provider must not send a media capture advertisement until it's >>>>> >>>> seen >>> >>>> at >>>> >>>>> least one consumer capability advertisement, and a consumer would >>>>> >>>> not >>> >>>> be >>>>> able to send a stream configuration message until it's seen at >>>>> >>>> least >>> >>>> one >>>>> media capture advertisement from the provider. >>>>> >>>>> Andy >>>>> >>>>> >>>>> On 24/08/2011 21:09, Charles Eckel (eckelcu) wrote: >>>>> >>>>>> Hi Roni, >>>>>> >>>>>> Please see inline. >>>>>> >>>>>> -----Original Message----- >>>>>>> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On >>>>>>> >>>>>> Behalf >>>> >>>>> Of Roni Even >>>>>> >>>>>>> Sent: Wednesday, August 24, 2011 6:30 AM >>>>>>> To: clue@ietf.org >>>>>>> Subject: [clue] Questions on basic message flow in the framework >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> In the interim meeting I mentioned that I that I support the >>>>>>> >>>>>> model >>> >>>> but >>>>> >>>>>> think that there are parameters >>>>>> >>>>>>> that I would like to add. 
At the meeting it was clear to me that >>>>>>> >>>>>> there >>>>> >>>>>> will be a new revision soon >>>>>> >>>>>>> that will support parameters at the capture scene level. Trying >>>>>>> >>>>>> to >>> >>>> see >>>>> >>>>>> which parameters I would like >>>>>> >>>>>>> to see supported I looked at the message flow and I have some >>>>>>> >>>>>> questions. >>>>>> >>>>>>> Andy presented the basic message flow with three messages: >>>>>>> >>>>>>> >>>>>>> >>>>>>> 1. Consumer capability advertisement >>>>>>> >>>>>>> 2. Provider - media capture advertisement >>>>>>> >>>>>>> 3. Consumer configuration of provider streams. >>>>>>> >>>>>>> >>>>>>> >>>>>>> I was looking at the use cases of 3 to 3 and 3 to 1 and tried to >>>>>>> >>>>>> understand what will be conveyed in >>>>>> >>>>>>> the three messages and how will we use the information. >>>>>>> >>>>>>> The first question I had was how this relates to SIP. At which >>>>>>> >>>>>> stage >>>> >>>>> of the SIP call will the consumer >>>>>> >>>>>>> advertize its capabilities? >>>>>>> >>>>>> I think it is best to focus on the framework a bit more before >>>>>> >>>>> mapping >>>>> >>>>>> to SIP. >>>>>> >>>>>> In the second part, I was then looking at a telepresence system >>>>>>> >>>>>> that >>>>> >>>>>> has 3 65" screens where the >>>>>> >>>>>>> distance between the screens including the frames is 6". The >>>>>>> >>>>>> system >>>> >>>>> has three cameras, each mounted on >>>>>> >>>>>>> the center of a screen. The system is facing a room with three >>>>>>> >>>>>> rows >>>> >>>>> each row sits 6 people and each >>>>>> >>>>>>> camera is capable of capturing a third of the room but the >>>>>>> >>>>>> default >>> >>>> views of each camera does not >>>>>> >>>>>>> overlap with the others. The cameras support zoom and pan (local >>>>>>> >>>>>> from >>>>> >>>>>> the application). >>>>>> >>>>>>> The system can decode up to four video streams where one is >>>>>>> >>>>>> presentation (H.239 like). 
The system can >>>>>> >>>>>>> support an internal 4-way multipoint call, means that it can >>>>>>> >>>>>> receive >>>> >>>>> the three main video streams from >>>>>> >>>>>>> one, two or three endpoints. >>>>>>> >>>>>>> I think that this is a very standard system, nothing special. >>>>>>> >>>>>>> The telepresence application is willing to provide all this >>>>>>> >>>>>> information as part of the consumer >>>>>> >>>>>>> capability advertisement and according to Andy's slides the >>>>>>> >>>>>> message >>>> >>>>> include physical factors , user >>>>>> >>>>>>> preferences and software limitations. >>>>>>> >>>>>>> I am now trying to understand what the purpose of the consumer >>>>>>> >>>>>> capability advertisement is in order to >>>>>> >>>>>>> see what information is important to convey. >>>>>>> >>>>>>> Is the reason for the consumer capability advertisement to allow >>>>>>> >>>>>> the >>>> >>>>> provider to propose a better >>>>>> >>>>>>> media capability advertisement, or is it to allow the provider >>>>>>> >>>>>> to >>> >>>> optimize the content of the media >>>>>> >>>>>>> streams he is sending based on the information provided. This >>>>>>> >>>>>> will >>> >>>> help with looking at which >>>>>> >>>>>>> parameters can be used. The slides show that the information is >>>>>>> >>>>>> used >>>> >>>>> for the capability >>>>>> >>>>>>> advertisements. >>>>>>> >>>>>> I viewed it as being primarily for the former, but using it for >>>>>> >>>>> the >>> >>>> latter may make sense as well and should not be excluded. The >>>>>> >>>>> extent >>>> >>>>> to >>>>> >>>>>> which the provider actually uses the information is >>>>>> >>>>> implementation >>> >>>> dependent. >>>>>> >>>>>> >>>>>>> >>>>>>> The third question I had was if these three messages can be >>>>>>> >>>>>> repeated >>>> >>>>> at any time or do we see a >>>>>> >>>>>>> different message to request a mode change. 
>>>>>>> >>>>>>> My understanding, based on the presentation in the virtual >>>>>> >>>>> meeting, >>> >>>> is >>>>> >>>>>> that these messages, though shown as an ordered exchange, could >>>>>> theoretically come in any order at any time. >>>>>> >>>>>> Cheers, >>>>>> Charles >>>>>> >>>>>> Thanks >>>>>>> >>>>>>> Roni Even >>>>>>> >>>>>>> >>>>>>> ______________________________**_________________ >>>>>> clue mailing list >>>>>> clue@ietf.org >>>>>> https://www.ietf.org/mailman/**listinfo/clue >>>>>> >>>>> >>>>> ______________________________**_________________ >>>>> clue mailing list >>>>> clue@ietf.org >>>>> https://www.ietf.org/mailman/**listinfo/clue >>>>> >>>> >>>> ______________________________**_________________ >>>> clue mailing list >>>> clue@ietf.org >>>> https://www.ietf.org/mailman/**listinfo/clue >>>> >>> >> ______________________________**_________________ >> clue mailing list >> clue@ietf.org >> https://www.ietf.org/mailman/**listinfo/clue >> >> > ______________________________**_________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/**listinfo/clue > --000e0cd58ca033e52404ab6a67d4 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable

On Fri, Aug 26, 2011 at 10:13 AM, Paul K= yzivat <pkyzi= vat@alum.mit.edu> wrote:
On 8/25/11 3:48 PM, Roni Even wrote:
Hi,
This will mean two stage, the initial SIP exchange will require a valid SDP=
for backward interoperability that will open one video, one video and CLUE<= br> channels and only afterwards the full telepresence sessions will be added.<= br> We can say that systems that support CLUE will wait for the exchange in the=
CLUE channel to establish media which is a delay.
If we use multi body in the SIP message, we will still need to discuss how<= br> to have the three message exchange in an offer answer dialog.

The way we chose will also affect my third question about using the message= s
after the initial media channels are running. This is why I was asking abou= t
maybe we need mode change messages.

There are many ways to accomplish this.

Is your concern the call setup delay while more messages are exchanged? Or = is it other aspects of user experience, such as establishing one video stre= am before the others?

Delay due to extra message exchange can be hidden so that the user experien= ce isn't diminished (much). E.g. the extra message(s) can be exchanged = while "ringing" (on calling side) and before alerting commences (= on called side).


This needs to be looked at but may be = acceptable at call setup, which always seems to take a few seconds.

I am also worried about changes _during the call_, where = a few seconds delay could be bad.

Here is an example (I am going to try and capture the w= orries I expressed in QC, and at the interim).

A s= ession starts, with two endpoints separated by (say) 150 msec one way . The= re is=A0

Consumer capability advertisement |---------------= -------------->

Media Capture Advertisement =A0= =A0 =A0 <-----------------------------|

Consum= er config of provider =A0 =A0 =A0 |----------------------------->
streams=A0

That takes roughly 450 msec = + a little, which seems OK.

1.) Won't in pract= ice the provider send a SDP file to the consumer, which in practice the con= sumer should receive and parse (if only as an error check), so won't th= at add ANOTHER round trip ? So, won't that take 600 msec plus a little,= which is less OK ? And, won't that also mean that,
if there is a 1% packet loss, there will be a (1 - (0.99)^4 =3D) =A04%= chance of a problem with these handshakes ? And won't a drop on any of= these 4 messages mean a considerably longer setup delay ?=A0
And, then=A0

2.) Suppose, at some point in th= e session, there is a network blip and inbound becomes congested. The provi= der needs to throttle back. The consumer detects this, says "I need to= advertise less bandwidth," sends=A0a corresponding CCA via UDP. The p= rovider receives this, and sends a new MCA.

This gets lost in the congestion. Even if three in a ro= w are sent, they might all be lost, as the link is congested.
So, then there is a timer set. Tick, tick, tick. The consumer w= ill send more CCAs (presumably) once the timer times out, but nothing comes= back. The consumer can talk to the provider, but it doesn't know it, a= nd it isn't allowed to except through
this three way, so the situation never gets resolved. Meanwhile, up at= Layer 8, the company CEO is getting pissed off. (And, in some parallel uni= verse, IESG ADs are asking questions about congestion control.) Even a 1 se= cond timer would mean that the time to recover from _one_ control packet lo= ss could be 1.5 seconds, which is not good.=A0

This says to me that either

- = providers SHOULD include information about adapting to congestion in the fi= rst handshake, so that the consumer can send an appropriate
confi= g message as needed

or=A0

- consumers =A0should be= able to send a "squelch" message to the provider, saying "r= educe bandwidth to me now," and let the provider puzzle it out. This c= ould take the form of (say) a not to exceed bandwidth in the config message= ; a subsequent config could be the same except for a lowered =A0NTE bandwid= th. Obviously, in a really severe problem, you might want to set that to ze= ro.

or both.=A0

All of this seems = fairly fundamental to me, the sort of thing that needs to be addressed if w= e are going to use this 3-way handshake.=A0

Regard= s
Marshall

=A0
For instance, there could be multiple O/A exchanges, with preconditions use= d to delay the alerting. There could then also be exchanges over a "cl= ue" stream between the first o/a and the last one resolving the precon= ditions.

I don't think its necessary to decide on this mechanism yet.

=A0 =A0 =A0 =A0Thanks
=A0 =A0 =A0 =A0Paul


Roni

-----Original Message-----
From: Allyn Romanow (allyn) [mailto:allyn@cisco.com]
Sent: Thursday, August 25, 2011 10:08 PM
To: Roni Even; Andrew Pepperell (apeppere); clue@ietf.org
Subject: RE: [clue] Questions on basic message flow in the framework

One option would be to establish CLUE through SIP, and then these
messages are CLUE messages, not SIP messages.

-----Original Message-----
From: clue-bounc= es@ietf.org [mailto:clue-bounces@ietf.org] On Behalf
Of
Roni Even
Sent: Thursday, August 25, 2011 12:01 PM
To: Andrew Pepperell (apeppere); clue@ietf.org
Subject: Re: [clue] Questions on basic message flow in the framework

Hi,
During the initial discussion on this work one of the issues was if
we
are
talking about one or two stage signaling. As far as I remember we
talked
about one stage signaling, is this still the case or do we still keep
it
open. This was way I asked about the mapping to SIP and why I think
we
need
to consider it early to verify if the proposed message flow works
with
one
stage signaling.
Regards
Roni

-----Original Message-----
From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Andy Pepperell
Sent: Thursday, August 25, 2011 7:47 PM
To: clue@ietf.org
Subject: Re: [clue] Questions on basic message flow in the framework

Thanks Charles! To follow up:

 >>[Roni]
 >> I was looking at the use cases of 3 to 3 and 3 to 1 and tried to understand what will be conveyed in the three messages and how we will use the information.
 >> The first question I had was how this relates to SIP. At which stage of the SIP call will the consumer advertise its capabilities?
 >[Charles]
 >I think it is best to focus on the framework a bit more before mapping to SIP.

Yes, that's been our approach so far...

 >>[Roni]
 >> The third question I had was whether these three messages can be repeated at any time, or do we see a different message to request a mode change.
 >[Charles]
 >My understanding, based on the presentation in the virtual meeting, is that these messages, though shown as an ordered exchange, could theoretically come in any order at any time.

While for the purposes of producing robust implementations messages would need to be handled "in any order at any time", within the model as proposed there are some constraints - specifically, a media stream provider must not send a media capture advertisement until it has seen at least one consumer capability advertisement, and a consumer would not be able to send a stream configuration message until it has seen at least one media capture advertisement from the provider.

Andy
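Andy's two ordering constraints can be sketched as simple guards on each side's state. Class and method names below are invented for illustration; this is not text from any draft, just the "seen at least one before sending" rule made concrete.

```python
# Sketch of the ordering constraints described above (names invented here):
# a provider must not send a media capture advertisement (MCA) before seeing
# at least one consumer capability advertisement (CCA), and a consumer must
# not send a stream configuration before seeing at least one MCA.

class Provider:
    def __init__(self):
        self.seen_cca = False

    def receive_cca(self):
        self.seen_cca = True

    def can_send_mca(self) -> bool:
        return self.seen_cca

class Consumer:
    def __init__(self):
        self.seen_mca = False

    def receive_mca(self):
        self.seen_mca = True

    def can_send_config(self) -> bool:
        return self.seen_mca

p, c = Provider(), Consumer()
assert not p.can_send_mca()     # no CCA seen yet
p.receive_cca()                 # consumer capability advertisement arrives
assert p.can_send_mca()
assert not c.can_send_config()  # no MCA seen yet
c.receive_mca()                 # media capture advertisement arrives
assert c.can_send_config()
```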


On 24/08/2011 21:09, Charles Eckel (eckelcu) wrote:
Hi Roni,

Please see inline.

-----Original Message-----
From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Roni Even
Sent: Wednesday, August 24, 2011 6:30 AM
To: clue@ietf.org
Subject: [clue] Questions on basic message flow in the framework

Hi,

In the interim meeting I mentioned that I support the model but think that there are parameters that I would like to add. At the meeting it was clear to me that there will be a new revision soon that will support parameters at the capture scene level. Trying to see which parameters I would like to see supported, I looked at the message flow and I have some questions.
Andy presented the basic message flow with three messages:



1. Consumer capability advertisement

2. Provider - media capture advertisement

3. Consumer configuration of provider streams.



I was looking at the use cases of 3 to 3 and 3 to 1 and tried to understand what will be conveyed in the three messages and how we will use the information.

The first question I had was how this relates to SIP. At which stage of the SIP call will the consumer advertise its capabilities?
I think it is best to focus on the framework a bit more before mapping to SIP.

In the second part, I was then looking at a telepresence system that has three 65" screens, where the distance between the screens, including the frames, is 6". The system has three cameras, each mounted at the center of a screen. The system faces a room with three rows; each row seats 6 people. Each camera is capable of capturing a third of the room, but the default views of the cameras do not overlap with each other. The cameras support zoom and pan (local from the application).
The system can decode up to four video streams, where one is presentation (H.239-like). The system can support an internal 4-way multipoint call, meaning that it can receive the three main video streams from one, two or three endpoints.

I think that this is a very standard system, nothing special.
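Purely as an illustration of what a consumer capability advertisement might carry for a system like this one: every field name below is invented here, since the framework has not yet defined the advertisement's contents.

```python
# Invented sketch of the capabilities such a system might advertise as a
# consumer; field names are illustrative only, not from any CLUE draft.
consumer_capabilities = {
    "screens": [
        {"diagonal_inches": 65, "camera": "center-mounted"} for _ in range(3)
    ],
    "inter_screen_gap_inches": 6,       # including the frames
    "max_decoded_video_streams": 4,     # one of which may be presentation
    "presentation_stream": "H.239-like",
    "multipoint": {
        "max_endpoints": 3,             # internal 4-way multipoint call
        "main_video_streams": 3,        # from one, two or three endpoints
    },
    "camera_controls": ["zoom", "pan"],  # local to the application
}
```

A structure along these lines would mix physical factors (screens, gaps) with software limits (decode count), which is exactly the mix the question below probes.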

The telepresence application is willing to provide all this information as part of the consumer capability advertisement, and according to Andy's slides the message includes physical factors, user preferences and software limitations.

I am now trying to understand what the purpose of the consumer capability advertisement is, in order to see what information is important to convey.

Is the reason for the consumer capability advertisement to allow the provider to propose a better media capture advertisement, or is it to allow the provider to optimize the content of the media streams he is sending based on the information provided? This will help with looking at which parameters can be used. The slides show that the information is used for the capability advertisements.
I viewed it as being primarily for the former, but using it for the latter may make sense as well and should not be excluded. The extent to which the provider actually uses the information is implementation dependent.



The third question I had was whether these three messages can be repeated at any time, or do we see a different message to request a mode change.

My understanding, based on the presentation in the virtual meeting, is that these messages, though shown as an ordered exchange, could theoretically come in any order at any time.

Cheers,
Charles

Thanks

Roni Even


_______________________________________________
clue mailing list
clue@ietf.org
https://www.ietf.org/mailman/listinfo/clue


--000e0cd58ca033e52404ab6a67d4-- From Even.roni@huawei.com Fri Aug 26 09:37:47 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 3D80221F8CA8 for ; Fri, 26 Aug 2011 09:37:47 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -106.293 X-Spam-Level: X-Spam-Status: No, score=-106.293 tagged_above=-999 required=5 tests=[AWL=0.305, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id WzVKyiGUOTV9 for ; Fri, 26 Aug 2011 09:37:45 -0700 (PDT) Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [119.145.14.66]) by ietfa.amsl.com (Postfix) with ESMTP id CBC5021F8CAA for ; Fri, 26 Aug 2011 09:37:30 -0700 (PDT) Received: from huawei.com (szxga03-in [172.24.2.9]) by szxga03-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0LQJ0009ZNKL3Z@szxga03-in.huawei.com> for clue@ietf.org; Sat, 27 Aug 2011 00:38:45 +0800 (CST) Received: from huawei.com ([172.24.2.119]) by szxga03-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0LQJ008QXNKLV3@szxga03-in.huawei.com> for clue@ietf.org; Sat, 27 Aug 2011 00:38:45 +0800 (CST) Received: from windows8d787f9 ([109.64.200.234]) by szxml11-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTPA id <0LQJ00IRTNJQBS@szxml11-in.huawei.com>; Sat, 27 Aug 2011 00:38:45 +0800 (CST) Date: Fri, 26 Aug 2011 19:37:16 +0300 From: Roni Even In-reply-to: To: 'Marshall Eubanks' , 'Paul Kyzivat' Message-id: <015d01cc640e$6f4ae200$4de0a600$%roni@huawei.com> MIME-version: 1.0 X-Mailer: Microsoft Office Outlook 12.0 Content-type: multipart/alternative; boundary="Boundary_(ID_pXg3QeptBs3txPnXk5ahZA)" Content-language: en-us Thread-index: 
AcxkBvdjoPFGDASBSJWUzKE9wRkV+wABkemA References: <033601cc6261$f8487670$e8d96350$%roni@huawei.com> <4E567C6C.9010504@cisco.com> <00a901cc6359$484f0880$d8ed1980$%roni@huawei.com> <9AC2C4348FD86B4BB1F8FA9C5E3A5EDC05603802@xmb-sjc-221.amer.cisco.com> <00b001cc635f$f029b590$d07d20b0$%roni@huawei.com> <4E57AA11.6090704@alum.mit.edu> Cc: clue@ietf.org Subject: Re: [clue] Questions on basic message flow in the framework X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Aug 2011 16:37:47 -0000 This is a multi-part message in MIME format. --Boundary_(ID_pXg3QeptBs3txPnXk5ahZA) Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7BIT

Hi,

It is not a three-message flow but four, as far as I understand, in order to achieve two-way communication.

The caller will send his Consumer capability advertisement (as a consumer) and an SDP for non-CLUE interoperability.
The called party will send his Consumer capability advertisement and his Media Capture Advertisement.
The calling party will send his Media Capture Advertisement and can send his Consumer config of provider streams (what it wants to receive).
The called party will send his Consumer config of provider streams.

I think we need all these messages, but we should be aware of what it means. In the past we thought about having initial communication with partial streams, and getting all streams after the full exchange. Maybe we can have that as an option that can be part of the offer/answer.
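The four-message, two-way flow can be laid out as a quick sketch; the abbreviations and list structure are invented here for illustration only.

```python
# Sketch of the four-message, two-way exchange described above.
# CCA = consumer capability advertisement, MCA = media capture advertisement,
# CFG = consumer config of provider streams. Illustrative only, not draft text.
exchange = [
    ("caller -> called", ["CCA", "SDP offer (non-CLUE interop)"]),
    ("called -> caller", ["CCA", "MCA"]),
    ("caller -> called", ["MCA", "CFG"]),
    ("called -> caller", ["CFG"]),
]

def messages_from(side):
    """All messages sent by one side over the whole exchange."""
    return [m for direction, msgs in exchange
            if direction.startswith(side) for m in msgs]

# Two-way media requires each side to act as both consumer and provider,
# so each side ends up sending a CCA, an MCA and a CFG somewhere.
assert {"CCA", "MCA", "CFG"} <= set(messages_from("caller"))
assert {"CCA", "MCA", "CFG"} <= set(messages_from("called"))
assert len(exchange) == 4   # four messages, not three
```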
Roni

From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Marshall Eubanks
Sent: Friday, August 26, 2011 6:43 PM
To: Paul Kyzivat
Cc: clue@ietf.org
Subject: Re: [clue] Questions on basic message flow in the framework

On Fri, Aug 26, 2011 at 10:13 AM, Paul Kyzivat wrote:

On 8/25/11 3:48 PM, Roni Even wrote:

Hi,
This will mean two-stage: the initial SIP exchange will require a valid SDP for backward interoperability that will open one audio, one video and CLUE channels, and only afterwards the full telepresence sessions will be added. We can say that systems that support CLUE will wait for the exchange in the CLUE channel to establish media, which is a delay. If we use multiple bodies in the SIP message, we will still need to discuss how to have the three-message exchange in an offer/answer dialog.

The way we choose will also affect my third question about using the messages after the initial media channels are running. This is why I was asking whether maybe we need mode-change messages.

There are many ways to accomplish this.

Is your concern the call setup delay while more messages are exchanged? Or is it other aspects of user experience, such as establishing one video stream before the others?

Delay due to extra message exchange can be hidden so that the user experience isn't diminished (much). E.g. the extra message(s) can be exchanged while "ringing" (on calling side) and before alerting commences (on called side).

This needs to be looked at but may be acceptable at call setup, which always seems to take a few seconds.

I am also worried about changes _during the call_, where a few seconds' delay could be bad.

Here is an example (I am going to try to capture the worries I expressed in QC, and at the interim).

A session starts, with two endpoints separated by (say) 150 msec one way.
There is

Consumer capability advertisement |----------------------------->

Media Capture Advertisement       <-----------------------------|

Consumer config of provider       |----------------------------->
streams

That takes roughly 450 msec + a little, which seems OK.

1.) Won't in practice the provider send an SDP file to the consumer, which the consumer should receive and parse (if only as an error check), so won't that add ANOTHER round trip? So, won't that take 600 msec plus a little, which is less OK? And, won't that also mean that, if there is a 1% packet loss, there will be a (1 - (0.99)^4 =) 4% chance of a problem with these handshakes? And won't a drop on any of these 4 messages mean a considerably longer setup delay?

And, then

2.) Suppose, at some point in the session, there is a network blip and inbound becomes congested. The provider needs to throttle back. The consumer detects this, says "I need to advertise less bandwidth," and sends a corresponding CCA via UDP. The provider receives this, and sends a new MCA. This gets lost in the congestion. Even if three in a row are sent, they might all be lost, as the link is congested.

So, then a timer is set. Tick, tick, tick. The consumer will send more CCAs (presumably) once the timer times out, but nothing comes back. The consumer can talk to the provider, but it doesn't know it, and it isn't allowed to except through this three-way handshake, so the situation never gets resolved. Meanwhile, up at Layer 8, the company CEO is getting pissed off. (And, in some parallel universe, IESG ADs are asking questions about congestion control.) Even a 1-second timer would mean that the time to recover from _one_ control packet loss could be 1.5 seconds, which is not good.
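The arithmetic behind these figures, under the stated assumptions (150 msec one-way delay, 1% independent loss per message), can be checked in a few lines:

```python
# Check of the timing and loss figures in the example above.
# Assumptions are as stated: 150 ms one-way delay, 1% independent packet loss.
ONE_WAY_MS = 150
LOSS = 0.01

three_way_ms = 3 * ONE_WAY_MS      # CCA -> MCA -> config: 450 ms
with_sdp_ms = 4 * ONE_WAY_MS       # plus an extra SDP leg: 600 ms
p_any_loss = 1 - (1 - LOSS) ** 4   # chance any of 4 messages is lost: ~3.9%
```

So the "4%" in the text is 1 - 0.99^4 = 0.0394, rounded up.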
Regards
Marshall

--Boundary_(ID_pXg3QeptBs3txPnXk5ahZA)
Content-type: text/html; charset=us-ascii
Content-transfer-encoding: quoted-printable

Hi,

It is not a three message flow but four as far as I understand in = order to achieve a two way communication.

 

The caller will send his Consumer capability advertisement (as = a consumer) and an SDP for non CLUE interoperability.

The called party will send Consumer capability = advertisement and his Media Capture Advertisement

The calling party will send Media Capture = Advertisement  and can send Consumer config of provider =  streams ( what it wants to receive)

The called party will send Consumer config of provider =  streams.

 

I think we = need all these messages but should be aware of what it = means.

In the past we thought about = having initial communication with partial streams. And get all streams = after the full exchange. Maybe we can have it as an option that can be = part of the offer/answer.

 

Roni

 

From:= = clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of = Marshall Eubanks
Sent: Friday, August 26, 2011 6:43 = PM
To: Paul Kyzivat
Cc: = clue@ietf.org
Subject: Re: [clue] Questions on basic message = flow in the framework

 

 

On Fri, Aug 26, 2011 at 10:13 AM, Paul Kyzivat <pkyzivat@alum.mit.edu> = wrote:

On 8/25/11 3:48 PM, Roni = Even wrote:

Hi,
This will mean two = stage, the initial SIP exchange will require a valid SDP
for backward = interoperability that will open one video, one video and = CLUE
channels and only afterwards the full telepresence sessions will = be added.
We can say that systems that support CLUE will wait for the = exchange in the
CLUE channel to establish media which is a = delay.
If we use multi body in the SIP message, we will still need to = discuss how
to have the three message exchange in an offer answer = dialog.

The way we chose will also affect my third question about = using the messages
after the initial media channels are running. This = is why I was asking about
maybe we need mode change = messages.

 

There are many ways to accomplish = this.

Is your concern the call setup delay while more messages = are exchanged? Or is it other aspects of user experience, such as = establishing one video stream before the others?

Delay due to = extra message exchange can be hidden so that the user experience isn't = diminished (much). E.g. the extra message(s) can be exchanged while = "ringing" (on calling side) and before alerting commences (on = called side).

 

This needs to be looked at but may be acceptable at = call setup, which always seems to take a few = seconds.

 

I = am also worried about changes _during the call_, where a few seconds = delay could be bad.

 

Here is an example (I am going to try and capture the = worries I expressed in QC, and at the = interim).

 

A = session starts, with two endpoints separated by (say) 150 msec one way . = There is 

 

Consumer capability advertisement = |----------------------------->

 

Media Capture Advertisement       = <-----------------------------|

 

Consumer config of provider       = |----------------------------->

streams 

 

That takes roughly 450 msec + a little, which seems = OK.

 

1.) Won't in practice the provider send a SDP file to = the consumer, which in practice the consumer should receive and parse = (if only as an error check), so won't that add ANOTHER round trip ? So, = won't that take 600 msec plus a little, which is less OK ? And, won't = that also mean that,

if = there is a 1% packet loss, there will be a (1 - (0.99)^4 =3D)  4% = chance of a problem with these handshakes ? And won't a drop on any of = these 4 messages mean a considerably longer setup delay = ? 

 

And, then 

 

2.) Suppose, at some point in the session, there is a = network blip and inbound becomes congested. The provider needs to = throttle back. The consumer detects this, says "I need to advertise = less bandwidth," sends a corresponding CCA via UDP. The = provider receives this, and sends a new MCA.

 

This gets lost in the congestion. Even if three in a = row are sent, they might all be lost, as the link is = congested.

 

So, then there is a timer set. Tick, tick, tick. The = consumer will send more CCAs (presumably) once the timer times out, but = nothing comes back. The consumer can talk to the provider, but it = doesn't know it, and it isn't allowed to except = through

this three way, so = the situation never gets resolved. Meanwhile, up at Layer 8, the company = CEO is getting pissed off. (And, in some parallel universe, IESG ADs are = asking questions about congestion control.) Even a 1 second timer would = mean that the time to recover from _one_ control packet loss could be = 1.5 seconds, which is not good. 

 

This says to me that = either

 

- = providers SHOULD include information about adapting to congestion in the = first handshake, so that the consumer can send an = appropriate

config message = as needed

 

or 

 

- = consumers  should be able to send a "squelch" message to = the provider, saying "reduce bandwidth to me now," and let the = provider puzzle it out. This could take the form of (say) a not to = exceed bandwidth in the config message; a subsequent config could be the = same except for a lowered  NTE bandwidth. Obviously, in a really = severe problem, you might want to set that to = zero.

 

or both. 

 

All of this seems fairly fundamental to me, the sort = of thing that needs to be addressed if we are going to use this 3-way = handshake. 

 

Regards

Marshall

 

 

For = instance, there could be multiple O/A exchanges, with preconditions used = to delay the alerting. There could then also be exchanges over a = "clue" stream between the first o/a and the last one resolving = the preconditions.

I don't think its necessary to decide on this = mechanism yet.

       Thanks
      =  Paul

 

Roni

-----Original Message-----
From: Allyn = Romanow (allyn) [mailto:allyn@cisco.com]
Sent: Thursday, August 25, = 2011 10:08 PM
To: Roni Even; Andrew Pepperell (apeppere); clue@ietf.org
Subject: RE: [clue] Questions on = basic message flow in the framework

One option would be to = establish CLUE through SIP, and then these
messages are CLUE = messages, not SIP messages.

-----Original Message-----
From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf

Of

Roni = Even
Sent: Thursday, August 25, 2011 12:01 PM
To: Andrew Pepperell = (apeppere); clue@ietf.org
Subject: Re: [clue] Questions on = basic message flow in the framework

Hi,
During the initial = discussion on this work one of the issues was if

we

are
talking = about one or two stage signaling. As far as I remember = we
talked
about one stage signaling, is this still the case or do = we still keep
it
open. This was way I asked about the mapping to = SIP and why I think

we

need
to = consider it early to verify if the proposed message flow = works

with

one
stage = signaling.
Regards
Roni

-----Original Message-----
From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On

Behalf

Of

Andy = Pepperell
Sent: Thursday, August 25, 2011 7:47 PM
To: clue@ietf.org
Subject: Re: [clue] Questions on = basic message flow in the

framework


Thanks Charles! To follow = up:

 >>[Roni]
 >>  I was looking at = the use cases of 3 to 3 and 3 to 1 and = tried

to

understand what will be conveyed in the three messages = and how will

we

use the = information.
 >>  The first question I had was how = this relates to SIP. At which
stage
of the SIP call will the = consumer advertize its = capabilities?
 >[Charles]
 >I think it is best to = focus on the framework a bit more before
mapping
to = SIP.

Yes, that's been our approach so = far...

 >>[Roni]
 >>  The third = question I had was if these three messages can be
repeated
at any = time or do we see a different message to request a mode

change.

 >[Charles]
 >My understanding, = based on the presentation in the virtual

meeting,

is
that these messages, though shown as an ordered = exchange, could
theoretically come in any order at any = time.

While for the purposes of of producing robust = implementations

messages

would need = to be handled "in any order at any time", within = the

model

as
proposed there are some constraints - = specifically, a media stream
provider must not send a media capture = advertisement until it's

seen

at

least one = consumer capability advertisement, and a consumer would

not

be
able to send a stream configuration message = until it's seen at

least

one
media capture advertisement from the = provider.

Andy


On 24/08/2011 21:09, Charles Eckel = (eckelcu) wrote:

Hi Roni,

Please see = inline.

-----Original = Message-----
From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] = On

Behalf

Of Roni = Even

Sent: Wednesday, August 24, 2011 = 6:30 AM
To: clue@ietf.org
Subject: [clue] Questions on = basic message flow in the framework

Hi,

In the interim = meeting I mentioned that I that I support = the

model

but

think that = there are parameters

that I would = like to add. At the meeting it was clear to me that

there

will be a new = revision soon

that will support = parameters at the capture scene level. = Trying

to

see

which = parameters I would like

to see = supported I looked at the message flow and I have some

questions.

Andy = presented the basic message flow with three messages:



1. =       Consumer capability advertisement

2.   =     Provider - media capture advertisement

3.   =     Consumer configuration of provider = streams.



> I was looking at the use cases of 3 to 3 and 3 to 1 and tried to
> understand what will be conveyed in the three messages and how we
> will use the information.
>
> The first question I had was how this relates to SIP. At which stage
> of the SIP call will the consumer advertise its capabilities?

I think it is best to focus on the framework a bit more before mapping
to SIP.

> In the second part, I was then looking at a telepresence system that
> has 3 65" screens where the distance between the screens, including
> the frames, is 6". The system has three cameras, each mounted on the
> center of a screen. The system is facing a room with three rows; each
> row seats 6 people, and each camera is capable of capturing a third
> of the room, but the default views of the cameras do not overlap with
> each other. The cameras support zoom and pan (local from the
> application).
>
> The system can decode up to four video streams, where one is
> presentation (H.239-like). The system can support an internal 4-way
> multipoint call, meaning that it can receive the three main video
> streams from one, two or three endpoints.
>
> I think that this is a very standard system, nothing special.
>
> The telepresence application is willing to provide all this
> information as part of the consumer capability advertisement, and
> according to Andy's slides the message includes physical factors,
> user preferences and software limitations.
>
> I am now trying to understand what the purpose of the consumer
> capability advertisement is, in order to see what information is
> important to convey. Is the reason for the consumer capability
> advertisement to allow the provider to propose a better media
> capability advertisement, or is it to allow the provider to optimize
> the content of the media streams he is sending based on the
> information provided? This will help with looking at which parameters
> can be used. The slides show that the information is used for the
> capability advertisements.

I viewed it as being primarily for the former, but using it for the
latter may make sense as well and should not be excluded. The extent
to which the provider actually uses the information is implementation
dependent.

> The third question I had was whether these three messages can be
> repeated at any time, or do we see a different message to request a
> mode change.

My understanding, based on the presentation in the virtual meeting, is
that these messages, though shown as an ordered exchange, could
theoretically come in any order at any time.

Cheers,
Charles

Thanks

Roni Even

_______________________________________________
clue mailing list
clue@ietf.org
https://www.ietf.org/mailman/listinfo/clue


 

= --Boundary_(ID_pXg3QeptBs3txPnXk5ahZA)-- From pkyzivat@alum.mit.edu Fri Aug 26 10:24:48 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 3782321F8C80 for ; Fri, 26 Aug 2011 10:24:48 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -2.52 X-Spam-Level: X-Spam-Status: No, score=-2.52 tagged_above=-999 required=5 tests=[AWL=0.079, BAYES_00=-2.599] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 1fb03yktl41Q for ; Fri, 26 Aug 2011 10:24:47 -0700 (PDT) Received: from qmta04.westchester.pa.mail.comcast.net (qmta04.westchester.pa.mail.comcast.net [76.96.62.40]) by ietfa.amsl.com (Postfix) with ESMTP id A4C2521F86F6 for ; Fri, 26 Aug 2011 10:24:46 -0700 (PDT) Received: from omta10.westchester.pa.mail.comcast.net ([76.96.62.28]) by qmta04.westchester.pa.mail.comcast.net with comcast id R5QD1h0030cZkys545S4TT; Fri, 26 Aug 2011 17:26:04 +0000 Received: from Paul-Kyzivats-MacBook-Pro.local ([24.62.109.41]) by omta10.westchester.pa.mail.comcast.net with comcast id R5S01h00i0tdiYw3W5S2ij; Fri, 26 Aug 2011 17:26:03 +0000 Message-ID: <4E57D727.5020409@alum.mit.edu> Date: Fri, 26 Aug 2011 13:25:59 -0400 From: Paul Kyzivat User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:5.0) Gecko/20110624 Thunderbird/5.0 MIME-Version: 1.0 To: Marshall Eubanks References: <033601cc6261$f8487670$e8d96350$%roni@huawei.com> <4E567C6C.9010504@cisco.com> <00a901cc6359$484f0880$d8ed1980$%roni@huawei.com> <9AC2C4348FD86B4BB1F8FA9C5E3A5EDC05603802@xmb-sjc-221.amer.cisco.com> <00b001cc635f$f029b590$d07d20b0$%roni@huawei.com> <4E57AA11.6090704@alum.mit.edu> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: clue@ietf.org Subject: Re: [clue] Questions on basic message flow in the framework X-BeenThere: 
clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Aug 2011 17:24:48 -0000 On 8/26/11 11:43 AM, Marshall Eubanks wrote: > > > On Fri, Aug 26, 2011 at 10:13 AM, Paul Kyzivat > wrote: > > On 8/25/11 3:48 PM, Roni Even wrote: > > Hi, > This will mean two stage, the initial SIP exchange will require > a valid SDP > for backward interoperability that will open one video, one > video and CLUE > channels and only afterwards the full telepresence sessions will > be added. > We can say that systems that support CLUE will wait for the > exchange in the > CLUE channel to establish media which is a delay. > If we use multi body in the SIP message, we will still need to > discuss how > to have the three message exchange in an offer answer dialog. > > The way we chose will also affect my third question about using > the messages > after the initial media channels are running. This is why I was > asking about > maybe we need mode change messages. > > > There are many ways to accomplish this. > > Is your concern the call setup delay while more messages are > exchanged? Or is it other aspects of user experience, such as > establishing one video stream before the others? > > Delay due to extra message exchange can be hidden so that the user > experience isn't diminished (much). E.g. the extra message(s) can be > exchanged while "ringing" (on calling side) and before alerting > commences (on called side). > > > This needs to be looked at but may be acceptable at call setup, which > always seems to take a few seconds. > > I am also worried about changes _during the call_, where a few seconds > delay could be bad. > > Here is an example (I am going to try and capture the worries I > expressed in QC, and at the interim). > > A session starts, with two endpoints separated by (say) 150 msec one way > . 
There is > > Consumer capability advertisement |-----------------------------> > > Media Capture Advertisement <-----------------------------| > > Consumer config of provider |-----------------------------> > streams > > That takes roughly 450 msec + a little, which seems OK. > > 1.) Won't in practice the provider send a SDP file to the consumer, > which in practice the consumer should receive and parse (if only as an > error check), so won't that add ANOTHER round trip ? So, won't that take > 600 msec plus a little, which is less OK ? And, won't that also mean that, > if there is a 1% packet loss, there will be a (1 - (0.99)^4 =) 4% > chance of a problem with these handshakes ? And won't a drop on any of > these 4 messages mean a considerably longer setup delay ? I think it is still unclear if the above messages precede the O/A, are carried in some of the same messages as the O/A, or are actually embedded in the O/A SDP. So its as yet unclear whether there would be 3, 4, 5, or more messages. The impact of message drop will also depend on the specific mechanisms, but certainly there will be *some* impact. (But note that we may well be doing ICE as well, which can require many more messages.) > And, then > > 2.) Suppose, at some point in the session, there is a network blip and > inbound becomes congested. The provider needs to throttle back. The > consumer detects this, says "I need to advertise less bandwidth," > sends a corresponding CCA via UDP. The provider receives this, and sends > a new MCA. Do you think CLUE-specific signaling is the proper way to deal with that? Its not a problem that is unique to CLUE. Perhaps it would be better to use adaptive rate codecs for this. They could presumably respond more quickly. Thanks, Paul > This gets lost in the congestion. Even if three in a row are sent, they > might all be lost, as the link is congested. > > So, then there is a timer set. Tick, tick, tick. 
The consumer will send > more CCAs (presumably) once the timer times out, but nothing comes back. > The consumer can talk to the provider, but it doesn't know it, and it > isn't allowed to except through > this three way, so the situation never gets resolved. Meanwhile, up at > Layer 8, the company CEO is getting pissed off. (And, in some parallel > universe, IESG ADs are asking questions about congestion control.) Even > a 1 second timer would mean that the time to recover from _one_ control > packet loss could be 1.5 seconds, which is not good. > > This says to me that either > > - providers SHOULD include information about adapting to congestion in > the first handshake, so that the consumer can send an appropriate > config message as needed > > or > > - consumers should be able to send a "squelch" message to the provider, > saying "reduce bandwidth to me now," and let the provider puzzle it out. > This could take the form of (say) a not to exceed bandwidth in the > config message; a subsequent config could be the same except for a > lowered NTE bandwidth. Obviously, in a really severe problem, you might > want to set that to zero. > > or both. > > All of this seems fairly fundamental to me, the sort of thing that needs > to be addressed if we are going to use this 3-way handshake. > > Regards > Marshall > > For instance, there could be multiple O/A exchanges, with > preconditions used to delay the alerting. There could then also be > exchanges over a "clue" stream between the first o/a and the last > one resolving the preconditions. > > I don't think its necessary to decide on this mechanism yet. 
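Marshall's back-of-the-envelope figures above (450 ms for the three-message exchange, 600 ms with a fourth message, and a roughly 4% chance that some message in a four-message handshake is dropped) can be checked with a short script. This is purely illustrative; it assumes the 150 ms one-way delay from the example and an independent 1% per-message loss rate:

```python
# Check of the setup-delay and loss figures quoted above (illustrative;
# assumes 150 ms one-way delay and independent 1% per-message loss).

ONE_WAY_MS = 150        # one-way network delay between the two endpoints
LOSS_PER_MSG = 0.01     # assumed independent per-message loss probability

def handshake_delay_ms(n_messages: int) -> int:
    """Total serialized delay for n one-way protocol messages."""
    return n_messages * ONE_WAY_MS

def p_any_loss(n_messages: int, p_loss: float = LOSS_PER_MSG) -> float:
    """Probability that at least one of n messages is lost."""
    return 1 - (1 - p_loss) ** n_messages

print(handshake_delay_ms(3))          # three-message CLUE exchange: 450 ms
print(handshake_delay_ms(4))          # with one extra one-way message: 600 ms
print(round(p_any_loss(4) * 100, 1))  # ~3.9%, i.e. the "4% chance" above
```

The 1 - (0.99)^4 term matches the email's arithmetic: with independent losses, each extra message in the handshake compounds the chance that at least one is dropped.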
> > Thanks > Paul > > > Roni > > -----Original Message----- > From: Allyn Romanow (allyn) [mailto:allyn@cisco.com > ] > Sent: Thursday, August 25, 2011 10:08 PM > To: Roni Even; Andrew Pepperell (apeppere); clue@ietf.org > > Subject: RE: [clue] Questions on basic message flow in the > framework > > One option would be to establish CLUE through SIP, and then > these > messages are CLUE messages, not SIP messages. > > -----Original Message----- > From: clue-bounces@ietf.org > > [mailto:clue-bounces@ietf.org > ] On Behalf > > Of > > Roni Even > Sent: Thursday, August 25, 2011 12:01 PM > To: Andrew Pepperell (apeppere); clue@ietf.org > > Subject: Re: [clue] Questions on basic message flow in > the framework > > Hi, > During the initial discussion on this work one of the > issues was if > > we > > are > talking about one or two stage signaling. As far as I > remember we > talked > about one stage signaling, is this still the case or do > we still keep > it > open. This was way I asked about the mapping to SIP and > why I think > > we > > need > to consider it early to verify if the proposed message > flow works > > with > > one > stage signaling. > Regards > Roni > > -----Original Message----- > From: clue-bounces@ietf.org > > [mailto:clue-bounces@ietf.org > ] On > > Behalf > > Of > > Andy Pepperell > Sent: Thursday, August 25, 2011 7:47 PM > To: clue@ietf.org > Subject: Re: [clue] Questions on basic message flow > in the > > framework > > > Thanks Charles! To follow up: > > >>[Roni] > >> I was looking at the use cases of 3 to 3 and 3 > to 1 and tried > > to > > understand what will be conveyed in the three > messages and how will > > we > > use the information. > >> The first question I had was how this relates > to SIP. At which > stage > of the SIP call will the consumer advertize its > capabilities? > >[Charles] > >I think it is best to focus on the framework a bit > more before > mapping > to SIP. > > Yes, that's been our approach so far... 
> > >>[Roni] > >> The third question I had was if these three > messages can be > repeated > at any time or do we see a different message to > request a mode > > change. > > >[Charles] > >My understanding, based on the presentation in the > virtual > > meeting, > > is > that these messages, though shown as an ordered > exchange, could > theoretically come in any order at any time. > > While for the purposes of of producing robust > implementations > > messages > > would need to be handled "in any order at any time", > within the > > model > > as > proposed there are some constraints - specifically, > a media stream > provider must not send a media capture advertisement > until it's > > seen > > at > > least one consumer capability advertisement, and a > consumer would > > not > > be > able to send a stream configuration message until > it's seen at > > least > > one > media capture advertisement from the provider. > > Andy > > > On 24/08/2011 21:09, Charles Eckel (eckelcu) wrote: > > Hi Roni, > > Please see inline. > > -----Original Message----- > From: clue-bounces@ietf.org > > [mailto:clue-bounces@ietf.org > ] On > > Behalf > > Of Roni Even > > Sent: Wednesday, August 24, 2011 6:30 AM > To: clue@ietf.org > Subject: [clue] Questions on basic message > flow in the framework > > Hi, > > In the interim meeting I mentioned that I > that I support the > > model > > but > > think that there are parameters > > that I would like to add. At the meeting it > was clear to me that > > there > > will be a new revision soon > > that will support parameters at the capture > scene level. Trying > > to > > see > > which parameters I would like > > to see supported I looked at the message > flow and I have some > > questions. > > Andy presented the basic message flow with > three messages: > > > > 1. Consumer capability advertisement > > 2. Provider - media capture advertisement > > 3. Consumer configuration of provider > streams. 
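The ordering constraint Andy describes (a provider must not send a media capture advertisement until it has seen at least one consumer capability advertisement, and a consumer cannot send a stream configuration until it has seen at least one media capture advertisement) can be sketched as a minimal precondition check. The class and message names below are illustrative only, not taken from any CLUE specification:

```python
# Sketch of the CLUE message-ordering preconditions described above.
# Names (Endpoint, "CCA"/"MCA"/"CONFIG") are illustrative, not from a spec.

class Endpoint:
    def __init__(self) -> None:
        self.seen_cca = False  # provider side: consumer capability adv. received?
        self.seen_mca = False  # consumer side: media capture adv. received?

    def can_send(self, msg: str) -> bool:
        if msg == "CCA":     # consumer capability advertisement: always allowed
            return True
        if msg == "MCA":     # provider must first have seen at least one CCA
            return self.seen_cca
        if msg == "CONFIG":  # consumer must first have seen at least one MCA
            return self.seen_mca
        raise ValueError(f"unknown message type: {msg}")

    def receive(self, msg: str) -> None:
        if msg == "CCA":
            self.seen_cca = True
        elif msg == "MCA":
            self.seen_mca = True

# Beyond these two preconditions, messages may repeat in any order at any time.
ep = Endpoint()
print(ep.can_send("MCA"))     # False: no CCA seen yet
ep.receive("CCA")
print(ep.can_send("MCA"))     # True
print(ep.can_send("CONFIG"))  # False: no MCA seen yet
ep.receive("MCA")
print(ep.can_send("CONFIG"))  # True
```

Note how this reconciles the two statements in the thread: robust implementations must tolerate any arrival order, yet the model still imposes these two send-side preconditions.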
> > > > I was looking at the use cases of 3 to 3 and > 3 to 1 and tried to > > understand what will be conveyed in > > the three messages and how will we use the > information. > > The first question I had was how this > relates to SIP. At which > > stage > > of the SIP call will the consumer > > advertize its capabilities? > > I think it is best to focus on the framework a > bit more before > > mapping > > to SIP. > > In the second part, I was then looking at a > telepresence system > > that > > has 3 65" screens where the > > distance between the screens including the > frames is 6". The > > system > > has three cameras, each mounted on > > the center of a screen. The system is facing > a room with three > > rows > > each row sits 6 people and each > > camera is capable of capturing a third of > the room but the > > default > > views of each camera does not > > overlap with the others. The cameras support > zoom and pan (local > > from > > the application). > > The system can decode up to four video > streams where one is > > presentation (H.239 like). The system can > > support an internal 4-way multipoint call, > means that it can > > receive > > the three main video streams from > > one, two or three endpoints. > > I think that this is a very standard system, > nothing special. > > The telepresence application is willing to > provide all this > > information as part of the consumer > > capability advertisement and according to > Andy's slides the > > message > > include physical factors , user > > preferences and software limitations. > > I am now trying to understand what the > purpose of the consumer > > capability advertisement is in order to > > see what information is important to convey. 
> > Is the reason for the consumer capability > advertisement to allow > > the > > provider to propose a better > > media capability advertisement, or is it to > allow the provider > > to > > optimize the content of the media > > streams he is sending based on the > information provided. This > > will > > help with looking at which > > parameters can be used. The slides show that > the information is > > used > > for the capability > > advertisements. > > I viewed it as being primarily for the former, > but using it for > > the > > latter may make sense as well and should not be > excluded. The > > extent > > to > > which the provider actually uses the information is > > implementation > > dependent. > > > > The third question I had was if these three > messages can be > > repeated > > at any time or do we see a > > different message to request a mode change. > > My understanding, based on the presentation in > the virtual > > meeting, > > is > > that these messages, though shown as an ordered > exchange, could > theoretically come in any order at any time. 
> > Cheers, > Charles > > Thanks > > Roni Even > > > _________________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/__listinfo/clue > > > > _________________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/__listinfo/clue > > > > _________________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/__listinfo/clue > > > > _________________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/__listinfo/clue > > > > _________________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/__listinfo/clue > > > From mary.ietf.barnes@gmail.com Mon Aug 29 14:44:21 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 8FDC921F8B00 for ; Mon, 29 Aug 2011 14:44:21 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -103.454 X-Spam-Level: X-Spam-Status: No, score=-103.454 tagged_above=-999 required=5 tests=[AWL=0.144, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id FOBRVq7Lyxx5 for ; Mon, 29 Aug 2011 14:44:20 -0700 (PDT) Received: from mail-vx0-f172.google.com (mail-vx0-f172.google.com [209.85.220.172]) by ietfa.amsl.com (Postfix) with ESMTP id 50E3121F8761 for ; Mon, 29 Aug 2011 14:44:20 -0700 (PDT) Received: by vxi29 with SMTP id 29so5822529vxi.31 for ; Mon, 29 Aug 2011 14:45:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=9SYVdpLr0hYfT+ShC+ab2uUXQm4GO2MalJt8yKwq0jE=; 
b=L+wyXZC4DOpj4FXsZ68Q2eGijwUDhZgztMPgDb16zJwbGhZOapKIBnn/qCnshsx1zF eQPyIcK+UGMkcHmMhnzQ3d1dk+yjwnYu/OlWD4gQz6rhE6+792tJ1V9HwF+l1bwnebeJ /crWeQFXdh4Uzgo+dpo/BEAyoRMJSWMvK8lhQ= MIME-Version: 1.0 Received: by 10.52.20.133 with SMTP id n5mr648200vde.302.1314654345600; Mon, 29 Aug 2011 14:45:45 -0700 (PDT) Received: by 10.52.183.201 with HTTP; Mon, 29 Aug 2011 14:45:45 -0700 (PDT) In-Reply-To: <1241030509.3022037.1314646459806.JavaMail.doodle@worker2> References: <1241030509.3022037.1314646459806.JavaMail.doodle@worker2> Date: Mon, 29 Aug 2011 16:45:45 -0500 Message-ID: From: Mary Barnes To: CLUE Content-Type: multipart/alternative; boundary=20cf307c9ff4459ddf04ababd1b4 Subject: [clue] Doodle: Link for poll "CLUE WG F2F Interim" X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 29 Aug 2011 21:44:21 -0000 --20cf307c9ff4459ddf04ababd1b4 Content-Type: text/plain; charset=ISO-8859-1 Hi folks, We are considering holding a face to face for CLUE in order to progress the framework. In speaking to some of the primary authors, October 11th and 12th (1.5 days) looks like it might work. The plan is to host the meeting in Boston (ideally at the Polycom Andover site, but we'll need to work out the logistics). However, we first need an idea of how many people could attend a f2f. http://doodle.com/h3ahqn9ht96m839k We would also have a Webex session. If you are not able to attend a f2f but would participate via Webex, please include a comment indicating such. In order to plan, we would like responses no later than Monday, Sept. 5th at 5pm Pacific. We will do a separate doodle poll for a virtual interim if we don't get enough folks able to attend a f2f. Thanks, Mary CLUE WG co-chair --20cf307c9ff4459ddf04ababd1b4 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable
--20cf307c9ff4459ddf04ababd1b4-- From Even.roni@huawei.com Tue Aug 30 14:11:07 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2B7A421F8D2F for ; Tue, 30 Aug 2011 14:11:07 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -106.338 X-Spam-Level: X-Spam-Status: No, score=-106.338 tagged_above=-999 required=5 tests=[AWL=0.260, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Wy2+sZTL+oym for ; Tue, 30 Aug 2011 14:11:05 -0700 (PDT) Received: from szxga04-in.huawei.com (szxga04-in.huawei.com [119.145.14.67]) by ietfa.amsl.com (Postfix) with ESMTP id 41C9E21F8D2B for ; Tue, 30 Aug 2011 14:11:04 -0700 (PDT) Received: from huawei.com (szxga04-in [172.24.2.12]) by szxga04-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0LQR00BDEEWWCF@szxga04-in.huawei.com> for clue@ietf.org; Wed, 31 Aug 2011 05:12:32 +0800 (CST) Received: from huawei.com ([172.24.2.119]) by szxga04-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0LQR0048CEWVFU@szxga04-in.huawei.com> for clue@ietf.org; Wed, 31 Aug 2011 05:12:32 +0800 (CST) Received: from windows8d787f9 ([109.64.200.234]) by szxml12-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTPA id <0LQR00JTBEWQBI@szxml12-in.huawei.com>; Wed, 31 Aug 2011 05:12:31 +0800 (CST) Date: Wed, 31 Aug 2011 00:10:56 +0300 From: Roni Even To: clue@ietf.org Message-id: <049c01cc6759$534f7bd0$f9ee7370$%roni@huawei.com> MIME-version: 1.0 X-Mailer: Microsoft Office Outlook 12.0 Content-type: multipart/alternative; boundary="Boundary_(ID_kdeEj6yIwPuesVPIIVn2vA)" Content-language: en-us Thread-index: AcxnWUwOxhYXHs9DTi+pQrFJrptz2g== 
Subject: [clue] preworking group last call - X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 30 Aug 2011 21:11:07 -0000 This is a multi-part message in MIME format. --Boundary_(ID_kdeEj6yIwPuesVPIIVn2vA) Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7BIT Hi, Christian did a review and sent a markup work document with comments and sent it to me as the editor of the document. I will try to send the list the major comments. 1. The use case does not provide any text about dynamic changes of the media description during the call. Need to add to the use cases. 2. Section 1 "A standard way of describing the multiple streams constituting the media flows and the fundamental aspects of their behavior, would allow telepresence systems to interwork." The question is what is meant here. Is it that by allowing a common way to describe streams that systems that have different capabilities are able to negotiate a common set of capabilities. My view is that what it means that even though there are some standards in place there is no standard way to describe the streams behavior or semantics preventing interoperability even between systems with same capabilities from different manufacturers. 3. We have the following paragraph in the introduction and I think it can be deleted " Many different scenarios need to be supported. Our strategy in this document is to describe in detail the most common and basic use cases. These will cover most of the requirements. Additional scenarios that bring new features and requirements will be added." 4. Section 2 starts with a sentence that says " This section describes the general characteristics of the use cases and what the scenarios are intended to show." 
But the bullets in the section talk more about the characteristics of a general TP systems. I do not think it is a problem statement so maybe we should revised the first paragraph to talk about TP system characteristics. 5. In section 3.6 it is not clear what "panoramic views at all the site" means. Thanks Christian for the review. There were also editorial comments which I will update in the next revision Thanks Roni Even _____ --Boundary_(ID_kdeEj6yIwPuesVPIIVn2vA) Content-type: text/html; charset=us-ascii Content-transfer-encoding: 7BIT


--Boundary_(ID_kdeEj6yIwPuesVPIIVn2vA)-- From Even.roni@huawei.com Tue Aug 30 14:42:37 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 0BE6621F8F00 for ; Tue, 30 Aug 2011 14:42:37 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -106.355 X-Spam-Level: X-Spam-Status: No, score=-106.355 tagged_above=-999 required=5 tests=[AWL=0.243, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 9nzdHvdmVJcu for ; Tue, 30 Aug 2011 14:42:36 -0700 (PDT) Received: from szxga04-in.huawei.com (szxga04-in.huawei.com [119.145.14.67]) by ietfa.amsl.com (Postfix) with ESMTP id 2C37021F8E82 for ; Tue, 30 Aug 2011 14:42:29 -0700 (PDT) Received: from huawei.com (szxga04-in [172.24.2.12]) by szxga04-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0LQR00M3QGD7G8@szxga04-in.huawei.com> for clue@ietf.org; Wed, 31 Aug 2011 05:43:55 +0800 (CST) Received: from huawei.com ([172.24.2.119]) by szxga04-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0LQR00AFSGD7ZZ@szxga04-in.huawei.com> for clue@ietf.org; Wed, 31 Aug 2011 05:43:55 +0800 (CST) Received: from windows8d787f9 ([109.64.200.234]) by szxml12-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTPA id <0LQR001KMGCYZ5@szxml12-in.huawei.com> for clue@ietf.org; Wed, 31 Aug 2011 05:43:55 +0800 (CST) Date: Wed, 31 Aug 2011 00:42:17 +0300 From: Roni Even To: clue@ietf.org Message-id: <04ad01cc675d$b5b66230$21232690$%roni@huawei.com> MIME-version: 1.0 X-Mailer: Microsoft Office Outlook 12.0 Content-type: multipart/alternative; boundary="Boundary_(ID_+7zmEjVGtUU0c4/sUklZ0w)" Content-language: en-us Thread-index: 
AcxnXbBKsIvIpTFGQ5SzdW7+mCtRRA== Subject: [clue] pre wglc review of usa cases X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 30 Aug 2011 21:42:37 -0000 This is a multi-part message in MIME format. --Boundary_(ID_+7zmEjVGtUU0c4/sUklZ0w) Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7BIT Hi, Christian did a review and sent a markup work document with comments and sent it to me as the editor of the document. I will try to send the list the major comments. 1. The use case does not provide any text about dynamic changes of the media description during the call. Need to add to the use cases. 2. Section 1 "A standard way of describing the multiple streams constituting the media flows and the fundamental aspects of their behavior, would allow telepresence systems to interwork." The question is what is meant here. Is it that by allowing a common way to describe streams that systems that have different capabilities are able to negotiate a common set of capabilities. My view is that what it means that even though there are some standards in place there is no standard way to describe the streams behavior or semantics preventing interoperability even between systems with same capabilities from different manufacturers. 3. We have the following paragraph in the introduction and I think it can be deleted " Many different scenarios need to be supported. Our strategy in this document is to describe in detail the most common and basic use cases. These will cover most of the requirements. Additional scenarios that bring new features and requirements will be added." 4. Section 2 starts with a sentence that says " This section describes the general characteristics of the use cases and what the scenarios are intended to show." 
But the bullets in the section talk more about the characteristics of a general TP systems. I do not think it is a problem statement so maybe we should revised the first paragraph to talk about TP system characteristics. 5. In section 3.6 it is not clear what "panoramic views at all the site" means. Thanks Christian for the review. There were also editorial comments which I will update in the next revision Thanks Roni Even --Boundary_(ID_+7zmEjVGtUU0c4/sUklZ0w) Content-type: text/html; charset=us-ascii Content-transfer-encoding: 7BIT


--Boundary_(ID_+7zmEjVGtUU0c4/sUklZ0w)-- From Christian.Groves@nteczone.com Tue Aug 30 18:02:14 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 490E721F8E42 for ; Tue, 30 Aug 2011 18:02:14 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -2.599 X-Spam-Level: X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[AWL=0.000, BAYES_00=-2.599] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id R3emDZhNSqd9 for ; Tue, 30 Aug 2011 18:02:13 -0700 (PDT) Received: from ipmail06.adl2.internode.on.net (ipmail06.adl2.internode.on.net [150.101.137.129]) by ietfa.amsl.com (Postfix) with ESMTP id A621F21F8E10 for ; Tue, 30 Aug 2011 18:02:12 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApIBAFuFXU520YVT/2dsb2JhbAAMNqpzQAEFGxUMBD0WGAMCAQIBWAgBARvAIIZUBKQ8 Received: from ppp118-209-133-83.lns20.mel6.internode.on.net (HELO [127.0.0.1]) ([118.209.133.83]) by ipmail06.adl2.internode.on.net with ESMTP; 31 Aug 2011 10:33:39 +0930 Message-ID: <4E5D884C.2060206@nteczone.com> Date: Wed, 31 Aug 2011 11:03:08 +1000 From: Christian Groves User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20110812 Thunderbird/6.0 MIME-Version: 1.0 To: clue@ietf.org Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 8bit Subject: [clue] Comments on draft-ietf-clue-telepresence-use-cases-01 X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 31 Aug 2011 01:02:14 -0000 Hello all, As Roni mentioned I provided some comments to the authors of draft-ietf-clue-telepresence-use-cases-01. 
Here they are: Comments: --------- 1) Section 1 2nd paragraph: [CNG] Having participated in the work I know what is meant here, but if someone new reads this draft there doesn’t seem to be any linking text between the fact that there are multiple standards in use and that by describing multiple streams, interoperability is somehow now guaranteed? I assume it is the fact that by allowing a common way to describe streams, systems that have different capabilities are able to negotiate a common set of capabilities??? 2) Section 1 4th paragraph: [CNG] Will this be reworded when the document goes for WGLC? I take it additional scenarios won’t be added after that time. 3) Section 2: [CNG] This section seems to convey a “mixed” message. The initial sentence says “This section describes the general characteristics of the use cases and what the scenarios are intended to show.“ However a good part of the section (bullet list down) doesn’t talk about the use cases; it’s a problem statement. Perhaps it should go into the introduction or be a section titled “problem statement” or similar? 4) Section 2 List item 4: "The number of audio channels" [CNG] Should this be number and type of microphones? Or is there some significance of the "channel" usage? 5) Section 3: [CNG] Will there be any attempt to link the requirements to certain use cases? For example: The RTCWEB requirements document indicates which requirements are derived from which use cases. 6) Section 3.x: [CNG] The framework draft sect 6.1 indicates that media descriptions are dynamic, i.e. capabilities may change. There’s nothing explicit in the requirements to support this. Likewise there doesn’t appear to be a use case to support that. Do we need to add one for that? 7) Section 3.2 Para 2: "...and 1 screen and 1 camera at the other site, connected by a point to point call." [CNG] What’s the relevance of this text? It’s not included in the symmetric case? 
And this aspect doesn’t seem to be further discussed in the use case. 8) Section 3.4 Para 4: "..In a multipoint meeting, the presentation streams for the currently active presentation are always distributed to all sites in the meeting, so that the presentations are viewed by all." [CNG] How does this relate to the framework? I.e. even though currently presentations are distributed to all participants, the framework allows different video captures to be sent to different remote systems. 9) Section 3.5 Para 3: "In most cases a transcoding intermediate device will be relied upon to produce a single stream, perhaps with some kind of continuous presence." [CNG] I’m not sure what “continuous presence” has to do with a transcoding device??? Maybe some clarifying text could be added? 10) Section 3.5 Para 6: "For the software conferencing participant,..." [CNG] PC-based? Software conferencing participant isn’t in the introduction paragraph in sect. 3.5. 11) Section 3.6 Para 1: "The importance of this example is that the multiple video streams are not used to create an immersive conferencing experience with panoramic views at all the site." [CNG] I’m not getting the meaning here. Is there a word missing? I.e. “panoramic views at all the sites”, “panoramic view of the sites” etc? Editorial --------- 1) List of authors mentions T.Eubanks [CNG] should it be M.Eubanks? 2) Abstract: Telepresence conferencing systems seek to create the sense of <> really being present. [CNG] I guess it’s the participants really being present rather than the telepresence system. 3) General: Various terms are used to indicate the apparent size of participants: full-size, actual-sized etc. I think we should consistently use the same term (I think "actual sized" is being favoured). Otherwise we should document the difference between these terms if some difference is meant. 4) Sect.1 Introduction, Para 2. 
Addition: ...assistance and expensive additional equipment which translates from one vendor<<’s protocols>> to another... 5) Sect.1 Para 5 suggested rewording: "Point-to-point and Multipoint telepresence conferences are considered. In some use cases, the number of displays is the same at all sites; in others, the number of displays differs at different sites. Both use cases are considered. Also included is a use case describing display of presentation material or content." [CNG] The term "use case" should be used consistently, i.e. instead of "case" or "scenario" [CNG] “similar” means not necessarily identical, i.e. different. So "same" should be used. 6) Sect.1 Para 6: "section" in lower case. 7) Sect.2 Para 2: (around 60") [CNG] What does it relate to: width, diagonal, what aspect ratio? We should probably be more specific. 8) Sect.2 Para 2 change: "The cameras used to present.." to "The cameras used to capture..." AND "There may also be other cameras, such as for document display." to "There may also be other cameras, such as for document capture." 9) Sect.2 Para 2 AND General: [CNG] The terms "screen", "monitor" and "display" are all used interchangeably in the draft. It would be good to use a single term. 10) Section 2 List item 8: "Similar or dissimilar number of primary screens at all sites" [CNG] As per above, is this really “the same or a different” number of? Or is there a reason that there is some ambiguity in this definition? Also, “Type” is described against “presentation displays”; is there a reason that the primary screens may be a different type? 11) Section 2 1st para under list [delete]: "This state of affairs is not acceptable for the continued growth of telepresence >>>- we believe<<< telepresence systems should have the same ease of interoperability as do telephones." 12) Section 2 2nd last paragraph [delete]: "...and simultaneously providing a spatial audio sound stage that is consistent with the video >>>presentation<<<. 
[CNG] Not to confuse a presentation video stream as opposed to a participant video stream. 13) Section 3.2 para 3: "maneuvers" to "manoeuvres" 14) Section 3.4 bullet list, split point 2 to create point 3: 3. An educator who is presenting a multi-screen slide show. This show requires that the placement of the images on the multiple displays at each site be consistent. 15) Section 3.5 para 3 [change]: "In most cases a transcoding intermediate device will..." to " In most cases an intermediate transcoding device will..." 16) Section 3.5 para 6 [add]: "Or, it could be multiple streams, similar to an immersive <> but with a smaller screen. 17) Section 3.5 last para [add]: "...the same way that immersive system signals actions." 18) Section 3.7 para 1: [CNG] The reference to Allardyce and Randal doesn't match the reference section "Allardyre" and "Randall"??? 19) Section 7. [CNG] ITU docs are not referenced? E.g. H.239, H.264, H.323, 20) Author's addresses: Is Marshall address correct? I get a bounce when sending to him. + Other small nits, spaces missing etc. Communicated to the editor. 
Regards, Christian From Even.roni@huawei.com Wed Aug 31 00:25:54 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id AA9EE21F8B86 for ; Wed, 31 Aug 2011 00:25:54 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -106.371 X-Spam-Level: X-Spam-Status: No, score=-106.371 tagged_above=-999 required=5 tests=[AWL=0.228, BAYES_00=-2.599, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id M0l16Q0K+QrG for ; Wed, 31 Aug 2011 00:25:53 -0700 (PDT) Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [119.145.14.64]) by ietfa.amsl.com (Postfix) with ESMTP id F23D921F8B7F for ; Wed, 31 Aug 2011 00:25:51 -0700 (PDT) Received: from huawei.com (szxga05-in [172.24.2.49]) by szxga05-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0LQS00FUK7BY3C@szxga05-in.huawei.com> for clue@ietf.org; Wed, 31 Aug 2011 15:26:22 +0800 (CST) Received: from huawei.com ([172.24.2.119]) by szxga05-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0LQS002SF7BXKE@szxga05-in.huawei.com> for clue@ietf.org; Wed, 31 Aug 2011 15:26:22 +0800 (CST) Received: from windows8d787f9 ([109.64.200.234]) by szxml12-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTPA id <0LQS00E9K7BSCL@szxml12-in.huawei.com>; Wed, 31 Aug 2011 15:26:21 +0800 (CST) Date: Wed, 31 Aug 2011 10:24:47 +0300 From: Roni Even In-reply-to: <4E5D884C.2060206@nteczone.com> To: 'Christian Groves' , clue@ietf.org Message-id: <04e101cc67af$12ce9d10$386bd730$%roni@huawei.com> MIME-version: 1.0 X-Mailer: Microsoft Office Outlook 12.0 Content-type: text/plain; charset=us-ascii Content-language: en-us Content-transfer-encoding: 7BIT Thread-index: 
Acxnee19H/BggddTQ8CN+VaYJQ8fqwAMoWSQ References: <4E5D884C.2060206@nteczone.com> Subject: Re: [clue] Comments on draft-ietf-clue-telepresence-use-cases-01 X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 31 Aug 2011 07:25:54 -0000 Hi Christian, Thanks for the review see inline Roni > -----Original Message----- > From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of > Christian Groves > Sent: Wednesday, August 31, 2011 4:03 AM > To: clue@ietf.org > Subject: [clue] Comments on draft-ietf-clue-telepresence-use-cases-01 > > Hello all, > > As Roni mentioned I provided some comments to the authors of > draft-ietf-clue-telepresence-use-cases-01. > > Here they are: > > Comments: > --------- > 1) Section 1 2nd paragraph: [CNG] Having participated in the work I > know > what is meant here, but if someone new reads this draft there doesn't > seem to be any linking text between the fact that there are multiple > standards in use and that by describing multiple streams, > interoperability is somehow now guaranteed? I assume it the fact that > by > allowing a common way to describe streams that systems that have > different capabilities are able to negotiate a common set of > capabilities??? My understanding is that even though there are some standards in place there is no standard way to describe the streams behavior or semantics preventing interoperability even between systems with same capabilities from different manufacturers. > > 2) Section 1 4th paragraph: [CNG] Will this be reworded when the > document goes for WGLC? I take additional scenarios won't be added > after > that time. Propose to delete this paragraph. > > 3) Section 2: [CNG] This section seems to convey a "mixed" message. 
The > initial sentence says " This section describes the general > characteristics of the use cases and what the scenarios are intended to > show. " However a good part of the section (bullet list down) doesn't > talk about the use cases it's a problem statement. Perhaps it should go > into the introduction or be a section titled "problem statement" or > similar? The bullets in the section talk more about the characteristics of a general TP systems. I do not think it is a problem statement so maybe we should revised the first paragraph to talk about TP system characteristics. > > 4) Section 2 List item 4: "The number of audio channels" > [CNG] Should this be number and type of microphones? Or is there some > significance of the "channel" usage? > > 5) Section 3: [CNG] Will there be any attempt to link the requirements > to certain uses cases?? For example: The RTCWEB requirements document > indicates which requirements are derived from which use cases. > > 6) Section 3.x: [CNG] The framework draft sect 6.1 indicates that media > descriptions are dynamic. I.e. capabilities may change. There's nothing > explicit in the requirements to support this. Likewise there doesn't > appear to be a use case to support that. Do we need to add one for > that? > > 7) Section 3.2 Para 2: "...and 1 screen and 1 camera at the other site, > connected by a point to > point call." > [CNG] What's the relevance of this text? It's not included in the > symmetric case? And this aspect doesn't seem to be further discussed in > the use case. I suggest to remove the sentence on point to point call. > > 8) Section 3.4 Para 4: "..In a multipoint meeting, the presentation > streams for the currently active presentation are always distributed to > all sites in the meeting, so that the presentations are viewed by all." > [CNG] How does this relate to the framework? i.e. 
even though currently > presentations are distributed to all participants the framework allows > different video captures to be sent to different remote systems. This draft is about use cases. It describes how systems from different manufacturers work so it is good to describe the presentation streams usage. > > 9) Section 3.5 Para 3: "In most cases a transcoding intermediate device > will be relied upon to produce a single stream, perhaps with some kind > of continuous presence." > [CNG] I'm not sure what "continuous presence" has to do with a > transcoding device??? Maybe some clarifying text could be added? The transcoding device may have the capability to also create a composed image (continuous presence) > > 10) Section 3.5 Para 6: "For the software conferencing participant,..." > [CNG] PC-based? Software conferencing participant isn't in the > introduction paragraph in sect. 3.5. For consistency should be PC based in section 3.5. > > 11) Section 3.6 Para 1: "The importance of this example is that the > multiple video streams are > not used to create an immersive conferencing experience with panoramic > views at all the site." > [CNG] I'm not getting the meaning here. Is there a word missing? i.e. > "panoramic views at all the sites", "panoramic view of the sites" etc? > > > Editorial > --------- > 1) List of authors mentions T.Eubanks > [CNG] should it be M.Eubanks? > > 2) Abstract: Telepresence conferencing systems seek to create the sense > of <> really being present. > [CNG]I guess it's the participants really being present rather than the > telepresence system. > > 3) General: Various terms are used to indicate the apparent size of > participants: full-size, actual-sized etc. I think we should > consistently the same term (I think "actual sized" is being favoured). > Otherwise we should documents the difference between these terms if > some > different is meant. > > 4) Sect.1 Introduction, Para 2. 
Addition: ...assistance and expensive > additional equipment which translates from one vendor<<'s protocols>> > to > another... > > 5) Sect.1 Para 5 suggested rewording: "Point-to-point and Multipoint > telepresence conferences are considered. In some use cases, the number > of displays the same at all sites, in others, the number of displays > differs at different sites. Both use cases are considered. Also > included > is a use case describing display of presentation material or content." > [CNG] The term "use case" should be used consistently i.e. instead of > "case" or "scenario" > [CNG] "similar" means not necessary identical to i.e. different. So > "same" should be used. > > 6) Sect.1 Para 6: "section" in lower case. > > 7) Sect.2 Para 2: (around 60") [CNG] What does it relate to, width, > diagonal, what aspect ratio? We should probably be more specific. > > 8) Sect.2 Para 2 change: "The cameras used to present.." to "The > cameras > used to capture..." AND "There may also be other cameras, such as for > document display." to ""There may also be other cameras, such as for > document capture." > > 9) Sect.2 Para 2 AND General: [CNG] The terms "screen", "monitor" and > "display" are all used interchangeably in the draft. It would be good > to > use a single term. > > 10) Section 2 List item 8: "Similar or dissimilar number of primary > screens at all sites" > [CNG] As per above is this really "the same or a different" number of? > Or is there a reason that there is some ambiguity in this definition? > Also "Type" is described against "presentation displays" is there a > reason that the primary screens may be a different type? > > 11) Section 2 1st para under list [delete]: "This state of affairs is > not acceptable for the continued growth of telepresence >>>- we > believe<<< telepresence systems should have the same ease of > interoperability as do telephones." 
> > 12) Section 2 2nd last paragraph [delete]: "...and simultaneously > providing a spatial audio sound stage that is consistent with the video > >>>presentation<<<. > [CNG] Not to confuse a presentation video stream as opposed to a > participant video stream. > > 13) Section 3.2 para 3: "maneuvers" to "manoeuvres" > > 14) Section 3.4 bullet list, split point 2 to create point 3: > 3. An educator who is presenting a multi-screen slide show. This show > requires that the placement of the images on the multiple displays at > each site be consistent. > > 15) Section 3.5 para 3 [change]: "In most cases a transcoding > intermediate device will..." to " In most cases an intermediate > transcoding device will..." > > 16) Section 3.5 para 6 [add]: "Or, it could be multiple streams, > similar > to an immersive <> but with a smaller screen. > > 17) Section 3.5 last para [add]: "...the same way that immersive system > signals actions." > > 18) Section 3.7 para 1: [CNG] The reference to Allardyce and Randal > doesn't match the reference section "Allardyre" and "Randall"??? Allardyce and Randall http://adsabs.harvard.edu/abs/1983ddi..rept.....A > > 19) Section 7. [CNG] ITU docs are not referenced? E.g. H.239, H.264, > H.323, > > 20) Author's addresses: Is Marshall address correct? I get a bounce > when > sending to him. > > + Other small nits, spaces missing etc. Communicated to the editor. 
> > Regards, Christian > _______________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/listinfo/clue From Even.roni@huawei.com Wed Aug 31 00:37:51 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id B1E4221F8B83 for ; Wed, 31 Aug 2011 00:37:51 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -106.384 X-Spam-Level: X-Spam-Status: No, score=-106.384 tagged_above=-999 required=5 tests=[AWL=0.214, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_MED=-4, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id wxUOgJnVns9h for ; Wed, 31 Aug 2011 00:37:50 -0700 (PDT) Received: from szxga04-in.huawei.com (szxga04-in.huawei.com [119.145.14.67]) by ietfa.amsl.com (Postfix) with ESMTP id 92B1221F8B0E for ; Wed, 31 Aug 2011 00:37:49 -0700 (PDT) Received: from huawei.com (szxga04-in [172.24.2.12]) by szxga04-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0LQS002MZ7XI0I@szxga04-in.huawei.com> for clue@ietf.org; Wed, 31 Aug 2011 15:39:18 +0800 (CST) Received: from huawei.com ([172.24.2.119]) by szxga04-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTP id <0LQS009X17XIZY@szxga04-in.huawei.com> for clue@ietf.org; Wed, 31 Aug 2011 15:39:18 +0800 (CST) Received: from windows8d787f9 (bzq-109-64-200-234.red.bezeqint.net [109.64.200.234]) by szxml11-in.huawei.com (iPlanet Messaging Server 5.2 HotFix 2.14 (built Aug 8 2006)) with ESMTPA id <0LQS00BQ07XDBC@szxml11-in.huawei.com>; Wed, 31 Aug 2011 15:39:18 +0800 (CST) Date: Wed, 31 Aug 2011 10:37:42 +0300 From: Roni Even In-reply-to: To: 'Mary Barnes' , 'CLUE' Message-id: <04ef01cc67b0$e18d9880$a4a8c980$%roni@huawei.com> MIME-version: 1.0 X-Mailer: 
Microsoft Office Outlook 12.0 Content-type: multipart/alternative; boundary="Boundary_(ID_R/bD+0cU6RBNSZ3XyGPgkA)" Content-language: en-us Thread-index: AcxmlRN80awHI7ynQ6qldhGV0NHveQBGwrOA References: <1241030509.3022037.1314646459806.JavaMail.doodle@worker2> Subject: Re: [clue] Doodle: Link for poll "CLUE WG F2F Interim" X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 31 Aug 2011 07:37:51 -0000 This is a multi-part message in MIME format. --Boundary_(ID_R/bD+0cU6RBNSZ3XyGPgkA) Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7BIT Mary, What is the plan for the framework document. Are we going to discuss the framework in the f2f based on a WG document or an individual draft. If it will be based on the individual draft how are we going to decide on changes to the current text? Roni From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of Mary Barnes Sent: Tuesday, August 30, 2011 12:46 AM To: CLUE Subject: [clue] Doodle: Link for poll "CLUE WG F2F Interim" Hi folks, We are considering holding a face to face for CLUE in order to progress the framework. In speaking to some of the primary authors, October 11th and 12th (1.5 days) looks like it might work. The plan is to host the meeting in Boston (ideally at the Polycom Andover site, but we'll need to work out the logistics). However, we first need an idea of how many people could attend a f2f. http://doodle.com/h3ahqn9ht96m839k We would also have a Webex session. If you are not able to attend a f2f but would participate via Webex, please include a comment indicating such. In order to plan, we would like responses no later than Monday, Sept. 5th at 5pm Pacific. We will do a separate doodle poll for a virtual interim if we don't get enough folks able to attend a f2f. 
Thanks, Mary CLUE WG co-chair --Boundary_(ID_R/bD+0cU6RBNSZ3XyGPgkA) Content-type: text/html; charset=us-ascii Content-transfer-encoding: quoted-printable
= --Boundary_(ID_R/bD+0cU6RBNSZ3XyGPgkA)-- From Christian.Groves@nteczone.com Wed Aug 31 03:04:15 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 3324221F8B2A for ; Wed, 31 Aug 2011 03:04:15 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -2.599 X-Spam-Level: X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[AWL=0.000, BAYES_00=-2.599] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 2QyEq03fpd7W for ; Wed, 31 Aug 2011 03:04:14 -0700 (PDT) Received: from ipmail06.adl6.internode.on.net (ipmail06.adl6.internode.on.net [150.101.137.145]) by ietfa.amsl.com (Postfix) with ESMTP id 2FB7F21F8AD6 for ; Wed, 31 Aug 2011 03:04:12 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AjoAABUFXk520YVT/2dsb2JhbAAMNphekhoHAQEBAQMBAQEkERsVBgYEAQwECxEEAQEBCRYIBwkDAgECARUfCQgGDQEFAgEBh3K3QoZVBKQ+ Received: from ppp118-209-133-83.lns20.mel6.internode.on.net (HELO [127.0.0.1]) ([118.209.133.83]) by ipmail06.adl6.internode.on.net with ESMTP; 31 Aug 2011 19:35:39 +0930 Message-ID: <4E5E0754.3040905@nteczone.com> Date: Wed, 31 Aug 2011 20:05:08 +1000 From: Christian Groves User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20110812 Thunderbird/6.0 MIME-Version: 1.0 To: Roni Even References: <4E5D884C.2060206@nteczone.com> <04e101cc67af$12ce9d10$386bd730$%roni@huawei.com> In-Reply-To: <04e101cc67af$12ce9d10$386bd730$%roni@huawei.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: clue@ietf.org Subject: Re: [clue] Comments on draft-ietf-clue-telepresence-use-cases-01 X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Wed, 31 Aug 2011 10:04:15 -0000 Hello Roni, Thanks for the responses. Please see my replies below. Regards, Christian On 31/08/2011 5:24 PM, Roni Even wrote: > Hi Christian, > Thanks for the review see inline > Roni > > >> -----Original Message----- >> From: clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] On Behalf Of >> Christian Groves >> Sent: Wednesday, August 31, 2011 4:03 AM >> To: clue@ietf.org >> Subject: [clue] Comments on draft-ietf-clue-telepresence-use-cases-01 >> >> Hello all, >> >> As Roni mentioned I provided some comments to the authors of >> draft-ietf-clue-telepresence-use-cases-01. >> >> Here they are: >> >> Comments: >> --------- >> 1) Section 1 2nd paragraph: [CNG] Having participated in the work I >> know >> what is meant here, but if someone new reads this draft there doesn't >> seem to be any linking text between the fact that there are multiple >> standards in use and that by describing multiple streams, >> interoperability is somehow now guaranteed? I assume it the fact that >> by >> allowing a common way to describe streams that systems that have >> different capabilities are able to negotiate a common set of >> capabilities??? > My understanding is that even though there are some standards in place there > is no standard way to describe the streams behavior or semantics preventing > interoperability even between systems with same capabilities from different > manufacturers. [CNG] I agree. My point was that the current text in the use case document I think is overly simplistic. Being able to describe something suddenly doesn't solve interoperability. e.g. I can use SDP to describe plenty of things it doesn't mean that it will make it interoperable with another party. There's more built around it i.e. offer / answer to be able to negotiate a common set. Its the text that describes the protocol aspects of CLUE such as the hint->, <-capabilities, ->configure. 
> > >> 2) Section 1 4th paragraph: [CNG] Will this be reworded when the >> document goes for WGLC? I take additional scenarios won't be added >> after >> that time. > Propose to delete this paragraph. [CNG] OK > >> 3) Section 2: [CNG] This section seems to convey a "mixed" message. The >> initial sentence says " This section describes the general >> characteristics of the use cases and what the scenarios are intended to >> show. " However a good part of the section (bullet list down) doesn't >> talk about the use cases it's a problem statement. Perhaps it should go >> into the introduction or be a section titled "problem statement" or >> similar? > The bullets in the section talk more about the characteristics of a general > TP systems. I do not think it is a problem statement so maybe we should > revised the first paragraph to talk about TP system characteristics. [CNG] I don't think all of the text is a problem statement just certain aspects of section 3. I think the section is a bit of mixed bag in what it is trying to achieve. The heading says "Telepresence Scenarios Overview" - there's elements of this e.g. saying what in and out of scope. The first sentence talks about "Characteristics" - there's element of this also in the paragraphs and list. Then there's text like the 3rd paragraph which to me reads like a problem statement. > >> 4) Section 2 List item 4: "The number of audio channels" >> [CNG] Should this be number and type of microphones? Or is there some >> significance of the "channel" usage? >> >> 5) Section 3: [CNG] Will there be any attempt to link the requirements >> to certain uses cases?? For example: The RTCWEB requirements document >> indicates which requirements are derived from which use cases. >> >> 6) Section 3.x: [CNG] The framework draft sect 6.1 indicates that media >> descriptions are dynamic. I.e. capabilities may change. There's nothing >> explicit in the requirements to support this. 
Likewise there doesn't >> appear to be a use case to support that. Do we need to add one for >> that? >> >> 7) Section 3.2 Para 2: "...and 1 screen and 1 camera at the other site, >> connected by a point to >> point call." >> [CNG] What's the relevance of this text? It's not included in the >> symmetric case? And this aspect doesn't seem to be further discussed in >> the use case. > I suggest to remove the sentence on point to point call. [CNG] OK > >> 8) Section 3.4 Para 4: "..In a multipoint meeting, the presentation >> streams for the currently active presentation are always distributed to >> all sites in the meeting, so that the presentations are viewed by all." >> [CNG] How does this relate to the framework? i.e. even though currently >> presentations are distributed to all participants the framework allows >> different video captures to be sent to different remote systems. > This draft is about use cases. It describes how systems from different > manufacturers work so it is good to describe the presentation streams usage. [CNG] I'm not disagreeing with leaving it in. It should be mentioned. I was just thinking ahead to see what requirement was derived from this statement. One could derive the requirement "A presentation capture SHALL be distributed to all participants" or "A presentation capture MAY be distributed to all participants or a subset". This requirement would then need to be supported in the CLUE framework. So I was really questioning what the intention was for this text? > > >> 9) Section 3.5 Para 3: "In most cases a transcoding intermediate device >> will be relied upon to produce a single stream, perhaps with some kind >> of continuous presence." >> [CNG] I'm not sure what "continuous presence" has to do with a >> transcoding device??? Maybe some clarifying text could be added? > The transcoding device may have the capability to also create a composed > image (continuous presence) [CNG] OK, I understand the intention. 
Could you include some clarifying text in an updated draft? > >> 10) Section 3.5 Para 6: "For the software conferencing participant,..." >> [CNG] PC-based? Software conferencing participant isn't in the >> introduction paragraph in sect. 3.5. > For consistency should be PC based in section 3.5. > >> 11) Section 3.6 Para 1: "The importance of this example is that the >> multiple video streams are >> not used to create an immersive conferencing experience with panoramic >> views at all the site." >> [CNG] I'm not getting the meaning here. Is there a word missing? i.e. >> "panoramic views at all the sites", "panoramic view of the sites" etc? >> >> >> Editorial >> --------- >> 1) List of authors mentions T.Eubanks >> [CNG] should it be M.Eubanks? >> >> 2) Abstract: Telepresence conferencing systems seek to create the sense >> of<> really being present. >> [CNG] I guess it's the participants really being present rather than the >> telepresence system. >> >> 3) General: Various terms are used to indicate the apparent size of >> participants: full-size, actual-sized etc. I think we should >> consistently use the same term (I think "actual sized" is being favoured). >> Otherwise we should document the difference between these terms if >> something >> different is meant. >> >> 4) Sect.1 Introduction, Para 2. Addition: ...assistance and expensive >> additional equipment which translates from one vendor<<'s protocols>> >> to >> another... >> >> 5) Sect.1 Para 5 suggested rewording: "Point-to-point and Multipoint >> telepresence conferences are considered. In some use cases, the number >> of displays is the same at all sites, in others, the number of displays >> differs at different sites. Both use cases are considered. Also >> included >> is a use case describing display of presentation material or content." >> [CNG] The term "use case" should be used consistently i.e. instead of >> "case" or "scenario" >> [CNG] "similar" means not necessarily identical to, i.e. different.
So >> "same" should be used. >> >> 6) Sect.1 Para 6: "section" in lower case. >> >> 7) Sect.2 Para 2: (around 60") [CNG] What does it relate to, width, >> diagonal, what aspect ratio? We should probably be more specific. >> >> 8) Sect.2 Para 2 change: "The cameras used to present.." to "The >> cameras >> used to capture..." AND "There may also be other cameras, such as for >> document display." to ""There may also be other cameras, such as for >> document capture." >> >> 9) Sect.2 Para 2 AND General: [CNG] The terms "screen", "monitor" and >> "display" are all used interchangeably in the draft. It would be good >> to >> use a single term. >> >> 10) Section 2 List item 8: "Similar or dissimilar number of primary >> screens at all sites" >> [CNG] As per above is this really "the same or a different" number of? >> Or is there a reason that there is some ambiguity in this definition? >> Also "Type" is described against "presentation displays" is there a >> reason that the primary screens may be a different type? >> >> 11) Section 2 1st para under list [delete]: "This state of affairs is >> not acceptable for the continued growth of telepresence>>>- we >> believe<<< telepresence systems should have the same ease of >> interoperability as do telephones." >> >> 12) Section 2 2nd last paragraph [delete]: "...and simultaneously >> providing a spatial audio sound stage that is consistent with the video >> >>>presentation<<<. >> [CNG] Not to confuse a presentation video stream as opposed to a >> participant video stream. >> >> 13) Section 3.2 para 3: "maneuvers" to "manoeuvres" >> >> 14) Section 3.4 bullet list, split point 2 to create point 3: >> 3. An educator who is presenting a multi-screen slide show. This show >> requires that the placement of the images on the multiple displays at >> each site be consistent. >> >> 15) Section 3.5 para 3 [change]: "In most cases a transcoding >> intermediate device will..." 
to " In most cases an intermediate >> transcoding device will..." >> >> 16) Section 3.5 para 6 [add]: "Or, it could be multiple streams, >> similar >> to an immersive<> but with a smaller screen. >> >> 17) Section 3.5 last para [add]: "...the same way that immersive system >> signals actions." >> >> 18) Section 3.7 para 1: [CNG] The reference to Allardyce and Randal >> doesn't match the reference section "Allardyre" and "Randall"??? > Allardyce and Randall http://adsabs.harvard.edu/abs/1983ddi..rept.....A [CNG] OK > > >> 19) Section 7. [CNG] ITU docs are not referenced? E.g. H.239, H.264, >> H.323, >> >> 20) Author's addresses: Is Marshall address correct? I get a bounce >> when >> sending to him. >> >> + Other small nits, spaces missing etc. Communicated to the editor. >> >> Regards, Christian >> _______________________________________________ >> clue mailing list >> clue@ietf.org >> https://www.ietf.org/mailman/listinfo/clue > From Christian.Groves@nteczone.com Wed Aug 31 03:24:09 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C14B821F8B33 for ; Wed, 31 Aug 2011 03:24:09 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -2.599 X-Spam-Level: X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id k0GW9Il5lPmi for ; Wed, 31 Aug 2011 03:24:09 -0700 (PDT) Received: from ipmail06.adl6.internode.on.net (ipmail06.adl6.internode.on.net [150.101.137.145]) by ietfa.amsl.com (Postfix) with ESMTP id BAAA421F8B32 for ; Wed, 31 Aug 2011 03:24:08 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ArUBAJUIXk520YVT/2dsb2JhbAAMNpkikhwbJT0WGAMCAQIBWAgBAb9QhlUEpD4 Received: from ppp118-209-133-83.lns20.mel6.internode.on.net (HELO [127.0.0.1]) 
([118.209.133.83]) by ipmail06.adl6.internode.on.net with ESMTP; 31 Aug 2011 19:55:36 +0930 Message-ID: <4E5E0C00.6050208@nteczone.com> Date: Wed, 31 Aug 2011 20:25:04 +1000 From: Christian Groves User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20110812 Thunderbird/6.0 MIME-Version: 1.0 To: clue@ietf.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: [clue] Use Case and Framework question X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 31 Aug 2011 10:24:09 -0000 Hello, Whilst reviewing the use case document, I noted that use case 4.1 (point-to-point meeting) talks about the possibility for separate monophonic audio streams. Presumably this allows for a microphone to be placed in front of each participant. Now consider a three-position (left to right), two-row telepresence system. Each position in each row has a microphone, i.e. 6 audio captures: front row AC0, AC1, AC2; back row AC3, AC4, AC5. The current framework document considers video and audio captures to be left to right, so it's easy to describe the first row AC0, AC1, AC2. How would I describe, using the current framework, the audio captures for the second-row microphones? Even if we disregard the row (depth) element, how would I then say AC0 & AC3, AC1 & AC4 and AC2 & AC5 relate to the same left / centre / right position?
Regards, Christian From stephen.botzko@gmail.com Wed Aug 31 04:46:21 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 8DF9121F8997 for ; Wed, 31 Aug 2011 04:46:21 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -2.576 X-Spam-Level: X-Spam-Status: No, score=-2.576 tagged_above=-999 required=5 tests=[AWL=-0.644, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1, SARE_HTML_USL_OBFU=1.666] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id bCImZKJxqJkt for ; Wed, 31 Aug 2011 04:46:20 -0700 (PDT) Received: from mail-vw0-f44.google.com (mail-vw0-f44.google.com [209.85.212.44]) by ietfa.amsl.com (Postfix) with ESMTP id 66F7521F8829 for ; Wed, 31 Aug 2011 04:46:20 -0700 (PDT) Received: by vws12 with SMTP id 12so601873vws.31 for ; Wed, 31 Aug 2011 04:47:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=981ceA7B8BLJ6neucCopjboJdPcYGh5IKczrTDJx6Ck=; b=GbPqMxb67pHcuAWCH36kl2nxw13uy9TKO58Yo8zZim8nGbANFXY8jOUmM5I98+K+jy vig2M/mfk+1hwdnFGBZNTNaxGAcGQqm6ggzyXhXXjDY7COYsS96nJ3fhtdQK6mX5XL9J dsw8xzIA1TVh3ExtGeFWgez7lzo2XkQb3HoyE= MIME-Version: 1.0 Received: by 10.52.76.227 with SMTP id n3mr273099vdw.108.1314791270233; Wed, 31 Aug 2011 04:47:50 -0700 (PDT) Received: by 10.52.183.100 with HTTP; Wed, 31 Aug 2011 04:47:50 -0700 (PDT) In-Reply-To: <4E5E0C00.6050208@nteczone.com> References: <4E5E0C00.6050208@nteczone.com> Date: Wed, 31 Aug 2011 07:47:50 -0400 Message-ID: From: Stephen Botzko To: Christian Groves Content-Type: multipart/alternative; boundary=bcaec5015d8d9db5f304abcbb23f Cc: clue@ietf.org Subject: Re: [clue] Use Case and Framework question X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 
Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 31 Aug 2011 11:46:21 -0000 --bcaec5015d8d9db5f304abcbb23f Content-Type: text/plain; charset=ISO-8859-1 The current framework draft allows multiple captures to signal the same linear index. So it is fine for AC0, AC3 and the other pairs to all signal the same index. This covers the case where the two rows are captured in the same video captures (that is, the left camera VC0 captures the left side of both rows). If the video capture set is VC0, VC1, VC2, it is possible to associate both AC3 and AC0 with VC0. There is a related case, which is when there are also independent cameras (AC3 has a corresponding VC3). The framework also allows multiple capture sets, so you can signal VC3, VC4, VC5, AC3, AC4, AC5 as its own independent capture set. There's been quite a bit of discussion amongst the authors of the framework on how two "separate monophonic streams" relate to a "stereo audio stream". At this point I believe the conclusion is that the key distinction between them is RTP transport. A second possible distinction is that some audio codecs allow a stereo audio stream to be jointly encoded, providing somewhat better compression. BTW, though it is convenient to think about audio captures as independent microphones, it is important to keep in mind that there are other ways to generate them. For instance, a microphone array can use beam-forming techniques to construct multiple captures from the array. This can in principle be done with video, particularly if you also have depth information. The framework does not describe how many sensors are used to create a capture.
Stephen Botzko On Wed, Aug 31, 2011 at 6:25 AM, Christian Groves < Christian.Groves@nteczone.com> wrote: > Hello, > > Whilst reviewing the use case document use case 4.1 point to point meeting > talks about the possibility for separate monophonic audio streams. Now this > presumably this allows for a microphone to be placed in front of each > participant. > > Now consider a three position (left to right) two row telepresence system. > Each of these has a microphone i.e. 6 audio captures, Front row AC0, AC1, > AC2, Back row AC3, AC4, AC5. > > The current framework document considers video and audio captures to be > left to right. > So its easy to describe the first row AC0, AC1, AC2. How would I describe > using the current framework the audio captures for the second row > microphones? > > Even if we disregard the row (depth) element, how would I then say AC0 & > AC3, AC1 & AC4 and AC3 & AC5 relate to the same left / centre / right > position? > > Regards, Christian > > ______________________________**_________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/**listinfo/clue >
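[Editor's note: Stephen's scheme above (several captures signalling the same linear index, grouped into capture sets) can be sketched as a tiny data model. This is only an illustration of the two-row example; the `Capture`/`CaptureSet` names are hypothetical and the framework draft defines no such syntax.]

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Capture:
    name: str   # e.g. "AC0" or "VC0" (illustrative labels from the thread)
    kind: str   # "audio" or "video"
    index: int  # left-to-right linear index, per the framework draft

@dataclass
class CaptureSet:
    captures: list = field(default_factory=list)

    def at_index(self, i):
        """Names of all captures sharing one left/centre/right position."""
        return [c.name for c in self.captures if c.index == i]

# Two-row example: front row AC0-AC2 and back row AC3-AC5 signal the
# same indices, so each front/back microphone pair shares a position.
room = CaptureSet([
    Capture("AC0", "audio", 1), Capture("AC1", "audio", 2), Capture("AC2", "audio", 3),
    Capture("AC3", "audio", 1), Capture("AC4", "audio", 2), Capture("AC5", "audio", 3),
])

print(room.at_index(1))  # AC0 and AC3 both occupy the left position
```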
--bcaec5015d8d9db5f304abcbb23f-- From marshall.eubanks@gmail.com Wed Aug 31 05:05:34 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 0D75E21F8B2F for ; Wed, 31 Aug 2011 05:05:34 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -100.802 X-Spam-Level: X-Spam-Status: No, score=-100.802 tagged_above=-999 required=5 tests=[AWL=1.130, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1, SARE_HTML_USL_OBFU=1.666, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zo7QC1-aFl5P for ; Wed, 31 Aug 2011 05:05:33 -0700 (PDT) Received: from mail-gw0-f44.google.com (mail-gw0-f44.google.com [74.125.83.44]) by ietfa.amsl.com (Postfix) with ESMTP id 201EB21F8B2D for ; Wed, 31 Aug 2011 05:05:32 -0700 (PDT) Received: by gwb20 with SMTP id 20so582616gwb.31 for ; Wed, 31 Aug 2011 05:07:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=D9GplWda+J9H9eSGifKazHilzBeC7Ew9VybvCi6vM30=; b=sQiXfj6T/h9mGzbP1oMkWflD7EONfuUGZ4w3CNW1hOmgzZ2H1fhAVdoP6iCCR5v+YQ e2sj1iORdZzc5uVAFyLuO+ES2ZRFfwu2GVilSCgWW8iAU2jMoWPp6a92ZaP2/YX21H97 KvA0k/sQGOxSAsRunh42KId3woX27BielJSkI= MIME-Version: 1.0 Received: by 10.150.177.16 with SMTP id z16mr256623ybe.27.1314792421370; Wed, 31 Aug 2011 05:07:01 -0700 (PDT) Received: by 10.150.185.9 with HTTP; Wed, 31 Aug 2011 05:07:01 -0700 (PDT) In-Reply-To: <4E5E0C00.6050208@nteczone.com> References: <4E5E0C00.6050208@nteczone.com> Date: Wed, 31 Aug 2011 08:07:01 -0400 Message-ID: From: Marshall Eubanks To: Christian Groves Content-Type: multipart/alternative; boundary=000e0cd6a95a3aabd904abcbf7d6 Cc: clue@ietf.org Subject: Re: [clue] Use Case and Framework question X-BeenThere: 
clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 31 Aug 2011 12:05:34 -0000 --000e0cd6a95a3aabd904abcbf7d6 Content-Type: text/plain; charset=ISO-8859-1 On Wed, Aug 31, 2011 at 6:25 AM, Christian Groves < Christian.Groves@nteczone.com> wrote: > Hello, > > Whilst reviewing the use case document use case 4.1 point to point meeting > talks about the possibility for separate monophonic audio streams. Now this > presumably this allows for a microphone to be placed in front of each > participant. > > Now consider a three position (left to right) two row telepresence system. > Each of these has a microphone i.e. 6 audio captures, Front row AC0, AC1, > AC2, Back row AC3, AC4, AC5. > > The current framework document considers video and audio captures to be > left to right. > So its easy to describe the first row AC0, AC1, AC2. How would I describe > using the current framework the audio captures for the second row > microphones? > > Even if we disregard the row (depth) element, how would I then say AC0 & > AC3, AC1 & AC4 and AC3 & AC5 relate to the same left / centre / right > position? > > My own personal opinion is that - it is dangerous and limiting to attempt to capture equipment spatial placement in a simple ordered list, especially when that equipment (or its capture zone) can move and change with time and that - we are likely to be sending kilobytes of information about codec formats and the like , so I don't see why we can't afford a few bytes to describe a mapping between equipment ids (numbering) and what and where they are capturing. If you want a specific proposal to deal with this, here is one. There are three possible rankings of spatial equipment capture, transverse, depth and vertical. Call them T, D and V. Each can be declared to be relevant, with the assumption that at least one will be. 
Ones not declared relevant are ignored. Each can be declared to be fixed or variable width. In the case of fixed width, the width is also declared. Units are degrees (T and V) and meters (D). Everything is referenced to the axis of symmetry of the unit, oriented from the point of view of an observer standing on the axis at the capture equipment facing the participants (i.e., the point of view of an outside observer). Order is right to left (T), front to back (D) and bottom to top (V). A "starting" value is given (so that we don't have to deal with negative numbers in the list) and a "unit" value (the smallest addressable chunk of degrees or meters). All spatial equipment is given an ID (it could just be numbered, it doesn't really matter). So, assuming the above AC0-AC5, this situation could be dealt with as something like

Declare T UNIT 30 STARTING -60
Declare D UNIT 2 STARTING 2

Order (AC0, 1,2), (AC1, 2,2), (AC2, 3,2), (AC3, 1,3), (AC4, 2,3), (AC5, 3,3),

where (AC0, 1,2) means unit AC0 covers T zone 1 (-60 to -30 degrees) and D zone 2 (2 to 4 meters into the depth of field), etc.

If there were a camera that covered all 120 degrees, and 3 others that covered 30 degrees each, that could be

Declare T VARIABLE UNIT 30 STARTING -60

Order (VC0, 1, 4), (VC1, 1, 1), (VC2, 2, 1), (VC3, 3, 1),

where (VC0, 1, 4) means unit VC0 covers from -60 to +60 degrees (four 30-degree slices), unit VC1 covers -60 to -30 degrees (1 slice), etc.

I don't see this as really much more complicated or byte consuming, and it really would avoid a bunch of problems I see as coming from overlaying a potentially variable 3-D spatial order onto a simple list. Regards Marshall > Regards, Christian > > ______________________________**_________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/**listinfo/clue >

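[Editor's note: Marshall's declaration scheme above can be made concrete with a short sketch. The function name and the assumption that zone indices are 1-based are mine, not part of any draft; it just turns a (UNIT, STARTING) declaration and an (id, zone, span) tuple into the interval a unit covers.]

```python
def zone_range(starting, unit, zone, span=1):
    """Interval covered by `span` consecutive zones beginning at 1-based
    index `zone`, where zone 1 starts at `starting` and each zone is
    `unit` wide (degrees for T and V, meters for D), per the proposal."""
    low = starting + (zone - 1) * unit
    return (low, low + span * unit)

# Declare T UNIT 30 STARTING -60, applied to the camera example:
print(zone_range(-60, 30, 1, span=4))  # VC0 covers -60 to +60 degrees
print(zone_range(-60, 30, 1))          # VC1 covers -60 to -30 degrees
print(zone_range(-60, 30, 2))          # VC2 covers -30 to 0 degrees
```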
--000e0cd6a95a3aabd904abcbf7d6-- From mary.ietf.barnes@gmail.com Wed Aug 31 08:05:47 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id B91BE21F8C20 for ; Wed, 31 Aug 2011 08:05:47 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -103.476 X-Spam-Level: X-Spam-Status: No, score=-103.476 tagged_above=-999 required=5 tests=[AWL=0.122, BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-1, USER_IN_WHITELIST=-100] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 5GyHOLWCybC1 for ; Wed, 31 Aug 2011 08:05:46 -0700 (PDT) Received: from mail-vw0-f44.google.com (mail-vw0-f44.google.com [209.85.212.44]) by ietfa.amsl.com (Postfix) with ESMTP id 4A05C21F8BEC for ; Wed, 31 Aug 2011 08:05:46 -0700 (PDT) Received: by vws12 with SMTP id 12so791302vws.31 for ; Wed, 31 Aug 2011 08:07:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=mH8J2nkXkjv2WOtfb3nC/3rR1un5NiIk0+zW4Ple+SU=; b=KAIu9ZZwvw1oZuHg+/A3/4GCIpWYhVSeCABPoctk+XU0WchHqpz9klgdGGHSpynpws jfd9/No0KQJ/Z+HlHyN1oc/NcC3pRdxIvYZj41hdlNp0xOOWJHG0ihu6O8pKI2n8lWtZ T/ug1iL1QUbT6iMvewMHv++9SsMvDHSzPsdlw= MIME-Version: 1.0 Received: by 10.52.66.163 with SMTP id g3mr461084vdt.90.1314803236493; Wed, 31 Aug 2011 08:07:16 -0700 (PDT) Received: by 10.52.35.2 with HTTP; Wed, 31 Aug 2011 08:07:16 -0700 (PDT) In-Reply-To: <04ef01cc67b0$e18d9880$a4a8c980$%roni@huawei.com> References: <1241030509.3022037.1314646459806.JavaMail.doodle@worker2> <04ef01cc67b0$e18d9880$a4a8c980$%roni@huawei.com> Date: Wed, 31 Aug 2011 10:07:16 -0500 Message-ID: From: Mary Barnes To: Roni Even Content-Type: multipart/alternative; boundary=20cf307f3ba6dc5a3104abce7b5c Cc: CLUE Subject: Re: [clue] 
Doodle: Link for poll "CLUE WG F2F Interim" X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 31 Aug 2011 15:05:47 -0000 --20cf307f3ba6dc5a3104abce7b5c Content-Type: text/plain; charset=ISO-8859-1 Roni, At this time, as you note, the framework document is an individual document. We could take a hum vote on the list as to whether to accept this as a WG document after the next version is submitted, since it will have material that was discussed at IETF-81 that wasn't in the original document. Certainly, with only one document on the table at this time for the framework WG milestone, it would not be unreasonable to do so. If the document is still an individual document at the time of the interim meeting, then the authors do have much more liberty in terms of the changes made to the document. However, I do not believe it behooves the authors not to make changes based on WG feedback. One thing I should clarify is that the focus of the interim is on the working group deliverable for a framework. While there is only one framework document on the table at this time, there is nothing that precludes other individuals from submitting documents for discussion and requesting agenda time. The agenda will be finalized based on the current status of discussions on the mailing list. There has been some really good discussion on the current individual framework document, so hopefully that will allow the authors to understand where there are gaps and potential issues in the current proposed framework. Regards, Mary. On Wed, Aug 31, 2011 at 2:37 AM, Roni Even wrote: > Mary, > > What is the plan for the framework document. Are we going to discuss the > framework in the f2f based on a WG document or an individual draft.
> > If it will be based on the individual draft how are we going to decide on > changes to the current text? > > Roni > > *From:* clue-bounces@ietf.org [mailto:clue-bounces@ietf.org] *On Behalf Of > *Mary Barnes > *Sent:* Tuesday, August 30, 2011 12:46 AM > *To:* CLUE > *Subject:* [clue] Doodle: Link for poll "CLUE WG F2F Interim" > > Hi folks, > > We are considering holding a face to face for CLUE in order to progress the > framework. In speaking to some of the primary authors, October 11th and 12th > (1.5 days) looks like it might work. The plan is to host the meeting in > Boston (ideally at the Polycom Andover site, but we'll need to work out the > logistics). However, we first need an idea of how many people could attend > a f2f. > > http://doodle.com/h3ahqn9ht96m839k > > We would also have a Webex session. If you are not able to attend a f2f > but would participate via Webex, please include a comment indicating such. > > In order to plan, we would like responses no later than Monday, Sept. 5th > at 5pm Pacific. > > We will do a separate doodle poll for a virtual interim if we don't get > enough folks able to attend a f2f. > > Thanks, > > Mary > > CLUE WG co-chair >

--20cf307f3ba6dc5a3104abce7b5c-- From stewe@stewe.org Wed Aug 31 10:30:24 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 474BB21F8DD2 for ; Wed, 31 Aug 2011 10:30:24 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -0.508 X-Spam-Level: X-Spam-Status: No, score=-0.508 tagged_above=-999 required=5 tests=[AWL=-0.972, BAYES_00=-2.599, HTML_MESSAGE=0.001, MIME_QP_LONG_LINE=1.396, SARE_HTML_USL_OBFU=1.666] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id g323P+o336uj for ; Wed, 31 Aug 2011 10:30:23 -0700 (PDT) Received: from stewe.org (stewe.org [85.214.122.234]) by ietfa.amsl.com (Postfix) with ESMTP id 9595C21F8DA3 for ; Wed, 31 Aug 2011 10:30:21 -0700 (PDT) Received: from [192.168.1.104] (unverified [24.5.184.151]) by stewe.org (SurgeMail 3.9e) with ESMTP id 31506-1743317 for multiple; Wed, 31 Aug 2011 19:31:47 +0200 User-Agent: Microsoft-MacOutlook/14.12.0.110505 Date: Wed, 31 Aug 2011 10:31:38 -0700 From: Stephan Wenger To: Message-ID: Thread-Topic: [clue] Use Case and Framework question In-Reply-To: Mime-version: 1.0 Content-type: multipart/alternative; boundary="B_3397631508_1351640" X-Originating-IP: 24.5.184.151 X-Authenticated-User: stewe@stewe.org X-ORBS-Stamp: Your IP (24.5.184.151) was found in the spamhaus database. http://www.spamhaus.net Subject: Re: [clue] Use Case and Framework question X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 31 Aug 2011 17:30:24 -0000
--B_3397631508_1351640 Content-type: text/plain; charset="US-ASCII" Content-transfer-encoding: 7bit Thanks, Marshall. I strongly support such an approach. In fact, we should probably add a similar mechanism to describe the rendering side. Doing so would give MCUs that can encode multiple video streams for display on a multitude of monitors much more leverage, would accommodate monitors of different sizes, and so forth. Stephan From: Marshall Eubanks Date: Wed, 31 Aug 2011 08:07:01 -0400 To: Christian Groves Cc: Subject: Re: [clue] Use Case and Framework question On Wed, Aug 31, 2011 at 6:25 AM, Christian Groves wrote: > Hello, > > Whilst reviewing the use case document, use case 4.1 (point-to-point meeting) > talks about the possibility for separate monophonic audio streams. Now presumably > this allows for a microphone to be placed in front of each > participant. > > Now consider a three-position (left to right), two-row telepresence system. > Each position has a microphone, i.e. 6 audio captures: front row AC0, AC1, AC2, > back row AC3, AC4, AC5. > > The current framework document considers video and audio captures to be ordered left > to right. > So it's easy to describe the first row AC0, AC1, AC2. How would I describe, > using the current framework, the audio captures for the second row microphones? > > Even if we disregard the row (depth) element, how would I then say AC0 & AC3, > AC1 & AC4 and AC2 & AC5 relate to the same left / centre / right position? > My own personal opinion is that - it is dangerous and limiting to attempt to capture equipment spatial placement in a simple ordered list, especially when that equipment (or its capture zone) can move and change with time, and that - we are likely to be sending kilobytes of information about codec formats and the like, so I don't see why we can't afford a few bytes to describe a mapping between equipment ids (numbering) and what and where they are capturing.
If you want a specific proposal to deal with this, here is one. There are three possible rankings of spatial equipment capture: transverse, depth and vertical. Call them T, D and V. Each can be declared to be relevant, with the assumption that at least one will be. Ones not declared relevant are ignored. Each can be declared to be fixed or variable width. In the case of fixed width, the width is also declared. Units are degrees (T and V) and meters (D). Everything is referenced to the axis of symmetry of the unit, oriented from the point of view of an observer standing on the axis at the capture equipment facing the participants (i.e., the point of view of an outside observer). Order is right to left (T), front to back (D) and bottom to top (V). A "starting" value is given (so that we don't have to deal with negative numbers in the list) and a "unit" value (the smallest addressable chunk of degrees or meters). All spatial equipment is given an ID (it could just be numbered; it doesn't really matter). So, assuming the above AC0-AC5, this situation could be dealt with as something like

Declare T UNIT 30 STARTING -60
Declare D UNIT 2 STARTING 2

Order (AC0, 1, 2), (AC1, 2, 2), (AC2, 3, 2), (AC3, 1, 3), (AC4, 2, 3), (AC5, 3, 3)

where (AC0, 1, 2) means unit AC0 covers T zone 1 (-60 to -30 degrees) and D zone 2 (4 to 6 meters into the depth of field), etc. If there was a camera that covered all 120 degrees, and 3 others that covered 30 degrees each, that could be

Declare T VARIABLE UNIT 30 STARTING -60

Order (VC0, 1, 4), (VC1, 1, 1), (VC2, 2, 1), (VC3, 3, 1)

where (VC0, 1, 4) means unit VC0 covers from -60 to +60 degrees (four 30-degree slices), unit VC1 covers -60 to -30 degrees (1 slice), etc. I don't see this as really much more complicated or byte consuming, and it really would avoid a bunch of problems I see as coming from overlaying a potentially variable 3-D spatial order onto a simple list.
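[Editor's note: Marshall's Declare/Order scheme exists only as prose; the Python sketch below is an illustration added for this archive, not part of any CLUE draft. All names are invented, and the zone arithmetic is inferred from his T example, where zone 1 spans STARTING to STARTING + UNIT.]

```python
# Hypothetical rendering of the Declare/Order proposal sketched above.
# Zone n covers [start + (n-1)*unit, start + n*unit); a variable-width
# tuple (id, zone, width) spans `width` consecutive slices.

def coverage(start, unit, zone, width=1):
    """Range covered from 1-based `zone`, given STARTING and UNIT."""
    low = start + (zone - 1) * unit
    return (low, low + width * unit)

# Declare T UNIT 30 STARTING -60  (transverse, degrees)
# Declare D UNIT 2 STARTING 2    (depth, meters)
T = {"start": -60, "unit": 30}
D = {"start": 2, "unit": 2}

# Order tuples for the fixed-width case: (id, T zone, D zone)
order = [("AC0", 1, 2), ("AC1", 2, 2), ("AC2", 3, 2),
         ("AC3", 1, 3), ("AC4", 2, 3), ("AC5", 3, 3)]

for cid, t_zone, d_zone in order:
    t = coverage(T["start"], T["unit"], t_zone)
    d = coverage(D["start"], D["unit"], d_zone)
    print(cid, "T", t, "D", d)  # e.g. AC0 covers T (-60, -30)

# Variable-width case: (VC0, 1, 4) = zone 1, four 30-degree slices.
assert coverage(-60, 30, 1, width=4) == (-60, 60)
```

Under this reading the whole mapping is a handful of integers per capture, consistent with Marshall's point that the cost is a few bytes.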
Regards Marshall > Regards, Christian > > _______________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/listinfo/clue > _______________________________________________ clue mailing list clue@ietf.org https://www.ietf.org/mailman/listinfo/clue --B_3397631508_1351640 Content-type: text/html; charset="US-ASCII" Content-transfer-encoding: quoted-printable
--B_3397631508_1351640-- From Christian.Groves@nteczone.com Wed Aug 31 21:19:01 2011 Return-Path: X-Original-To: clue@ietfa.amsl.com Delivered-To: clue@ietfa.amsl.com Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A9F2421F8B40 for ; Wed, 31 Aug 2011 21:19:01 -0700 (PDT) X-Virus-Scanned: amavisd-new at amsl.com X-Spam-Flag: NO X-Spam-Score: -2.599 X-Spam-Level: X-Spam-Status: No, score=-2.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599] Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id TdAXkf-3kQIM for ; Wed, 31 Aug 2011 21:19:01 -0700 (PDT) Received: from ipmail06.adl2.internode.on.net (ipmail06.adl2.internode.on.net [150.101.137.129]) by ietfa.amsl.com (Postfix) with ESMTP id B2EA321F8B3B for ; Wed, 31 Aug 2011 21:19:00 -0700 (PDT) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApMBAJEEX0520VLN/2dsb2JhbAAMNqsHAQEBAQIBAQEBNRsUBwoBEAsYCRYIBwkDAgECARUfEQYNAQUCAQGHbgS5HIZVBKRA Received: from ppp118-209-82-205.lns20.mel4.internode.on.net (HELO [127.0.0.1]) ([118.209.82.205]) by ipmail06.adl2.internode.on.net with ESMTP; 01 Sep 2011 13:50:29 +0930 Message-ID: <4E5F07ED.7010809@nteczone.com> Date: Thu, 01 Sep 2011 14:19:57 +1000 From: Christian Groves User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0) Gecko/20110812 Thunderbird/6.0 MIME-Version: 1.0 To: Stephen Botzko References: <4E5E0C00.6050208@nteczone.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: clue@ietf.org Subject: Re: [clue] Use Case and Framework question X-BeenThere: clue@ietf.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: CLUE - ControLling mUltiple streams for TElepresence List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Thu, 01 Sep 2011 04:19:01 -0000 Hello Stephen, Please see my responses below. Regards, Christian On 31/08/2011 9:47 PM, Stephen Botzko wrote: > The current framework draft allows multiple captures to signal the > same linear index. So it is fine for AC0, AC3 and the other pairs > to all signal the same index. This covers the case where the two rows > are captured in the same video captures (that is, the left camera VC0 > captures the left side of both rows). If the video capture set is VC0, > VC1, VC2, it is possible to associate both AC3 and AC0 with VC0. [CNG] OK, so the linear index is used to indicate the position of an audio capture rather than the left/right behaviour? > > There is a related case, which is when there are also independent > cameras (AC3 has a corresponding VC3). The framework also allows > multiple capture sets, so you can signal VC3, VC4, VC5, AC3, AC4, AC5 > as its own independent capture set. [CNG] However, in this case you would need some "depth" parameter/s? > > There's been quite a bit of discussion on how two "separate monophonic > streams" relate to a "stereo audio stream" amongst the authors of the > framework. At this point I believe the conclusion is that the key > distinction between them is RTP transport. A second possible > distinction is that some audio codecs allow a stereo audio stream to > be jointly encoded, providing somewhat better compression. [CNG] I had assumed that the monophonic audio streams were separate RTP streams. However, my question was more about the limitations of the framework model than about how audio is encoded and carried. > > BTW, though it is convenient to think about audio captures as > independent microphones, it is important to keep in mind that there > are other ways to generate them. For instance, a microphone array can > use beam-forming techniques to construct multiple captures from the > microphone array. 
This can in principle be done with video, > particularly if you also have depth information. The framework does > not describe how many sensors are used to create a capture. [CNG] I agree. My questions came as a result of the use cases, which talk about monophonic streams and microphones. In the beam-forming case, would the aspect of "depth" in addition to "linear index" also be useful? > > Stephen Botzko > > > > On Wed, Aug 31, 2011 at 6:25 AM, Christian Groves > > > wrote: > > Hello, > > Whilst reviewing the use case document, use case 4.1 (point-to-point > meeting) talks about the possibility for separate monophonic audio > streams. Now presumably this allows for a microphone to be > placed in front of each participant. > > Now consider a three-position (left to right), two-row telepresence > system. Each position has a microphone, i.e. 6 audio captures: > front row AC0, AC1, AC2, back row AC3, AC4, AC5. > > The current framework document considers video and audio captures > to be ordered left to right. > So it's easy to describe the first row AC0, AC1, AC2. How would I > describe, using the current framework, the audio captures for the > second row microphones? > > Even if we disregard the row (depth) element, how would I then say > AC0 & AC3, AC1 & AC4 and AC2 & AC5 relate to the same left / > centre / right position? > > Regards, Christian > > _______________________________________________ > clue mailing list > clue@ietf.org > https://www.ietf.org/mailman/listinfo/clue > >
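[Editor's note: Stephen's points about shared linear indices and multiple capture sets can be pictured with the toy model below. The tuple layout and field names are invented for illustration; the framework draft defines no concrete syntax for this.]

```python
# Toy model: several captures may carry the same linear index, and a
# renderer groups them by index. AC0 (front row) and AC3 (back row)
# both signal index 1 because left camera VC0 covers both rows.
from collections import defaultdict

captures = [
    ("VC0", "video", 1), ("VC1", "video", 2), ("VC2", "video", 3),
    ("AC0", "audio", 1), ("AC1", "audio", 2), ("AC2", "audio", 3),
    ("AC3", "audio", 1), ("AC4", "audio", 2), ("AC5", "audio", 3),
]

by_index = defaultdict(lambda: {"video": [], "audio": []})
for cid, kind, idx in captures:
    by_index[idx][kind].append(cid)

# Everything sharing index 1 belongs at the left position:
print(by_index[1])  # {'video': ['VC0'], 'audio': ['AC0', 'AC3']}

# Stephen's second case: independent back-row cameras would instead be
# advertised as a second capture set of their own.
capture_sets = [
    ["VC0", "VC1", "VC2", "AC0", "AC1", "AC2"],  # front row
    ["VC3", "VC4", "VC5", "AC3", "AC4", "AC5"],  # back row
]
```

Christian's follow-up question is then whether grouping by a single linear index is enough, or whether an explicit depth attribute is also needed to keep the two rows apart.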