MIME-Version: 1.0
Sender: barryleiba.mailing.lists@gmail.com
Received: by 10.58.106.73 with HTTP; Mon, 7 Jul 2014 11:53:19 -0700 (PDT)
In-Reply-To: <53BAB55A.2090008@alum.mit.edu>
References: <53BAB55A.2090008@alum.mit.edu>
Date: Mon, 7 Jul 2014 14:53:19 -0400
Delivered-To: barryleiba.mailing.lists@gmail.com
X-Google-Sender-Auth: _xOHvHgOguXMiJgaj7VaVek4u8I
Message-ID: <CAC4RtVBnjxi5Q-0WMd2J9Hm8oct+agc8h2V=koSJnYNrpzZ_Ag@mail.gmail.com>
Subject: Re: [abnf-discuss] defining "compatible" extensions
From: Barry Leiba <barryleiba@computer.org>
To: Paul Kyzivat <pkyzivat@alum.mit.edu>
Cc: "abnf-discuss@ietf.org" <abnf-discuss@ietf.org>
Content-Type: text/plain; charset=UTF-8

The main issue I have whenever this sort of thing comes up is that ABNF
is there to specify syntax, not semantics.  The ABNF in 4566 correctly
says that the syntax of an att-name is that it's a token.  The
specification itself -- the rest of it, beyond the ABNF -- is there to
tell us what values to expect there, what to do with them, and how to
define extensions.

That many specs enumerate all the tokens that are valid at the time of
their writing isn't really relevant to this, as I see it.  Personally,
I think we should stop doing that *unless* we want to define something
that intentionally has no extensibility.  To me, this makes sense:

   florb-value = "true" / "false"

...while this does not:

   florb-value = "true" / "false" / florb-ext
   florb-ext = token

The first is clearly saying that *syntactically*, there are only two
things that can appear in a florb-value.  You can safely write a
parser that looks for those and throws a syntax error if it sees
anything else.
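
As a rough sketch of such a parser (in Python, purely for illustration;
the names parse_florb_value and FlorbSyntaxError are made up here, and
the case-insensitive comparison reflects the RFC 5234 rule that quoted
ABNF strings match case-insensitively):

```python
class FlorbSyntaxError(ValueError):
    """Raised when the input is not a syntactically valid florb-value."""


def parse_florb_value(text: str) -> bool:
    # florb-value = "true" / "false"
    # ABNF quoted strings are case-insensitive, so compare in lower case.
    lowered = text.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    # The grammar is closed: anything else is a syntax error.
    raise FlorbSyntaxError(f"not a florb-value: {text!r}")
```

Because the grammar is closed, the parser can reject everything else
without consulting any other part of the specification.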

What on Earth is the second saying, syntactically?  I'd better write
my parser to parse it as a token.  I presume there's something else in
the text that tells me what to do with "true" and "false"
semantically, and that explains the extensibility.  What's the point
of having that in the *syntax*?

This sort of thing is also fine, as I see it:

   florb-value = token ; must be a registered item, as
                       ; defined in Section 3.2.1

Here we're using a comment in the syntax to point the reader to the
section that gives the semantics and explains which values are valid.
References are good.
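
One way to picture that split, as a sketch only: the syntax check and
the registry check live in separate functions.  Everything named here is
an assumption for illustration.  The token character set is borrowed
from RFC 4566's token-char, the registry contents are invented, and
whether registry lookups are case-sensitive would be up to the
(hypothetical) Section 3.2.1.

```python
import re

# Assumption: "token" means one or more RFC 4566 token-char characters.
TOKEN_RE = re.compile(
    r"[\x21\x23-\x27\x2A\x2B\x2D\x2E\x30-\x39\x41-\x5A\x5E-\x7E]+"
)

# Hypothetical registry contents; a real list would come from the
# registry the specification text points to.
REGISTERED_FLORB_VALUES = {"true", "false"}


def parse_florb_value(text: str) -> str:
    """Syntax only: florb-value = token."""
    if TOKEN_RE.fullmatch(text) is None:
        raise ValueError(f"syntax error, not a token: {text!r}")
    return text


def is_registered(token: str) -> bool:
    """Semantics only: is this token a registered item?"""
    # Exact-match lookup is an assumption, not something the ABNF says.
    return token in REGISTERED_FLORB_VALUES
```

An unregistered but well-formed value such as "maybe" passes the parser
and fails only the registry check, so what to do with it (ignore,
reject, log) is decided by the specification text rather than by the
grammar.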

Twisting syntax specification around to try to make it go beyond
syntax is not good.

Clearly, opinions differ on this... but there's mine.

Barry

On Mon, Jul 7, 2014 at 10:57 AM, Paul Kyzivat <pkyzivat@alum.mit.edu> wrote:
> I've been bothered by something for a long time. Now it has come up again in
> the context of rfc4566bis (mmusic).
>
> The question is how best to define an extensible syntax, and then later
> define corresponding extensions.
>
> I'll give a couple of examples:
>
> RFC4566 defines:
>
>    attribute-fields =    *(%x61 "=" attribute CRLF)
>    attribute =           (att-field ":" att-value) / att-field
>    att-field =           token
>    att-value =           byte-string
>
> It defines an iana registry for attributes, keyed by <att-field> tokens.
> Those implementing SDP are required to ignore any attributes they don't
> understand/support.
>
> RFC4566 also defines a number of attributes. It does not provide ABNF for
> them - it defines the syntax informally. That is something I've proposed to
> fix in 4566bis, and is driving this query.
>
> Many new attributes have been defined (in separate RFCs) since the
> publication of RFC4566, using a variety of techniques. A popular one is to
> define new ones as:
>
>    attribute /= new-att
>    new-att = "new-att-name:" new-att-value
>    new-att-value = some-abnf
>
> To be valid, the definition of <new-att-value> then needs to be "compatible"
> with the definition of <att-value>. But there is no way to indicate this
> formally in the abnf.
>
> SIP approached this differently. It has:
>
>    message-header  =  (Accept
>                    /  Accept-Encoding
>    ...
>                    /  WWW-Authenticate
>                    /  extension-header) CRLF
>
>    extension-header  =  header-name HCOLON header-value
>    header-name       =  token
>    header-value      =  *(TEXT-UTF8char / UTF8-CONT / LWS)
>
>    Accept            =  "Accept" HCOLON
>                       [ accept-range *(COMMA accept-range) ]
>
> RFC6086 gives an example of an extension header defined elsewhere:
>
>    message-header      =/ (Info-Package / Recv-Info) CRLF
>    Info-Package        =  "Info-Package" HCOLON Info-package-type
>    Recv-Info           =  "Recv-Info" HCOLON [Info-package-list]
>    Info-package-list   =  Info-package-type *( COMMA Info-package-type )
>    Info-package-type   =  Info-package-name *( SEMI Info-package-param )
>    Info-package-name   =  token
>    Info-package-param  =  generic-param
>
> Again, the definitions of <Info-Package> and <Recv-Info> must also be
> *compatible* with <extension-header>. But there is nothing in the ABNF that
> says this.
>
> (Note: I had to hunt extensively to find an extension that was defined this
> cleanly. Most of them, even more recent ones, aren't this clean. And somehow
> they slipped by me, who cares about such things, even while I was chair of
> sipcore.)
>
> I don't think that ABNF *must* solve this problem, but I can imagine
> extensions to ABNF that would *help* with this. Or else, recommendations of
> how to approach this.
>
> The key thing is that an extension definition needs to satisfy *two* ABNF
> rules. For instance, above we need that both of the following be true:
>
>    new-att = "new-att-name:" new-att-value
>    new-att = (att-field ":" att-value) / att-field
>
> I have brought this issue up previously wrt 4566, and have seen conflicting
> opinions on the best way to deal with this. But this problem isn't really
> unique to RFC4566 or mmusic, or RAI. It is a general issue.
>
> I welcome your thoughts.
>
>         Thanks,
>         Paul

