Message-Id: <199506010615.CAA13491@wilma.cs.utk.edu>
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Keith Moore <moore@cs.utk.edu>
To: Chris Newman <chrisn+@cmu.edu>
cc: ietf-822@dimacs.rutgers.edu, ietf-types@uninett.no, moore@cs.utk.edu
Subject: Re: Media Type registration form 
In-reply-to: Your message of "Wed, 31 May 1995 17:28:15 EDT."
             <801955695.6776.0@nifty.andrew.cmu.edu> 
Date: Thu, 01 Jun 1995 02:15:52 -0400

> I would like to see a few additions to the MIME media type
> registration form.
> 
> My first proposal is to have a new section on "Interchange
> considerations:" for discussing versioning problems, byte order
> problems, gateway problems for things like application/mac-binhex40,
> etc. 

I like the idea.  The problem is in getting people to understand what
it is.  People have enough trouble understanding security
considerations.  That doesn't mean I'm opposed to having that section,
just that it makes it more difficult on someone proposing a new type.
(maybe this section could be optional?)

> Magic Number: {length, bytes}
> File extension(s) in common use: {in order by preference}
> Macintosh File Type code: {4 octets}

I like these too. 

Keith


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa13234;
          1 Jun 95 17:25 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa13229;
          1 Jun 95 17:25 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa17281;
          1 Jun 95 17:25 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id QAA20511 for ietf-822-list; Thu, 1 Jun 1995 16:15:10 -0400
Received: from po6.andrew.cmu.edu (PO6.ANDREW.CMU.EDU [128.2.10.106]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id QAA20506 for <ietf-822@dimacs.rutgers.edu>; Thu, 1 Jun 1995 16:14:57 -0400
Received: (from postman@localhost) by po6.andrew.cmu.edu (8.6.12/8.6.12) id QAA20360 for ietf-822@dimacs.rutgers.edu; Thu, 1 Jun 1995 16:14:34 -0400
Received: via switchmail; Thu,  1 Jun 1995 16:14:32 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/service/mailqs/testq0/QF.gjnVxi200WBwQ0W25K>;
          Thu,  1 Jun 1995 16:13:34 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/usr7/jm36/.Outgoing/QF.UjnVxhC00WBwQ9xxgt>;
          Thu,  1 Jun 1995 16:13:33 -0400 (EDT)
Received: from BatMail.robin.v2.14.CUILIB.3.45.SNAP.NOT.LINKED.hogtown.andrew.cmu.edu.sun4c.411
          via MS.5.6.hogtown.andrew.cmu.edu.sun4c_411;
          Thu,  1 Jun 1995 16:13:30 -0400 (EDT)
Message-ID: <kjnVxeu00WBwE9xxUw@andrew.cmu.edu>
Date: Thu,  1 Jun 1995 16:13:30 -0400 (EDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: John Gardiner Myers <jgm+@cmu.edu>
To: ietf-822@dimacs.rutgers.edu
Subject: Canonical Encoding Model wording suggestion
Beak: Is

Here's some wording I suggest be added to the Canonical Encoding Model
section of the conformance document:

NOTE: There has been confusion caused by systems which represent
messages in a format which uses local newline conventions which differ
from the RFC822 CRLF convention.  It is important to note that these
formats are not canonical RFC822/MIME.  These formats are instead
*encodings* of RFC822, where CRLF sequences in the canonical
representation of the message are encoded as the local newline
convention.  Note that formats which encode CRLF sequences as, for
example, LF are not capable of representing MIME messages containing
binary data which contains LF octets not part of CRLF line separation
sequences.

-- 
_.John G. Myers		Internet: jgm+@CMU.EDU
			LoseNet:  ...!seismo!ihnp4!wiscvm.wisc.edu!give!up


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa20034;
          1 Jun 95 23:21 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa20030;
          1 Jun 95 23:21 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa23119;
          1 Jun 95 23:21 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id WAA28623 for ietf-822-list; Thu, 1 Jun 1995 22:14:16 -0400
Received: from THOR.INNOSOFT.COM (THOR.INNOSOFT.COM [192.160.253.66]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id WAA28620 for <ietf-822@dimacs.rutgers.edu>; Thu, 1 Jun 1995 22:14:14 -0400
Received: from INNOSOFT.COM by INNOSOFT.COM (PMDF V5.0-3 #2001)
 id <01HR79W0BQ5S8ZE5P5@INNOSOFT.COM>; Thu, 01 Jun 1995 19:13:34 -0700 (PDT)
Date: Thu, 01 Jun 1995 19:09:28 -0700 (PDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Ned Freed <NED@innosoft.com>
Subject: Re: Canonical Encoding Model wording suggestion
In-reply-to: "Your message dated Thu, 01 Jun 1995 16:13:30 -0400 (EDT)"
 <kjnVxeu00WBwE9xxUw@andrew.cmu.edu>
To: John Gardiner Myers <jgm+@cmu.edu>
Cc: ietf-822@dimacs.rutgers.edu
Message-id: <01HR7B1RYW388ZE5P5@INNOSOFT.COM>
MIME-version: 1.0
Content-type: TEXT/PLAIN; CHARSET=US-ASCII
Content-transfer-encoding: 7BIT

> Here's some wording I suggest be added to the Canonical Encoding Model
> section of the conformance document:

> NOTE: There has been confusion caused by systems which represent
> messages in a format which uses local newline conventions which differ
> from the RFC822 CRLF convention.  It is important to note that these
> formats are not canonical RFC822/MIME.  These formats are instead
> *encodings* of RFC822, where CRLF sequences in the canonical
> representation of the message are encoded as the local newline
> convention.  Note that formats which encode CRLF sequences as, for
> example, LF are not capable of representing MIME messages containing
> binary data which contains LF octets not part of CRLF line separation
> sequences.

Good idea. I'll add it. 

At first I wasn't comfortable with the use of the term "encodings", but after
further consideration I think its proper and that using it is a very good idea.

				Ned


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa24953;
          2 Jun 95 1:20 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa24949;
          2 Jun 95 1:20 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa25025;
          2 Jun 95 1:20 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id AAA00741 for ietf-822-list; Fri, 2 Jun 1995 00:18:16 -0400
Received: from apple.com (apple.com [130.43.2.2]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id AAA00738 for <ietf-822@dimacs.rutgers.edu>; Fri, 2 Jun 1995 00:18:14 -0400
Received: by apple.com with SMTP (5.61/8-Oct-1993-eef)
	id AA22233; Thu, 1 Jun 95 21:17:49 -0700
	for ietf-822@dimacs.rutgers.edu
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: "Erik E. Fair" (Internet Architect) <fair@apple.com>
Subject: Re: Canonical Encoding Model wording suggestion 
In-Reply-To: <01HR7B1RYW388ZE5P5@INNOSOFT.COM> 
References: "Your message dated Thu, 01 Jun 1995 16:13:30 -0400 (EDT)" <kjnVxeu00WBwE9xxUw@andrew.cmu.edu> 
To: Ned Freed <NED@innosoft.com>
X-Return-Path: owner-ietf-822@dimacs.rutgers.edu 
Cc: John Gardiner Myers <jgm+@cmu.edu>, ietf-822@dimacs.rutgers.edu
Date: Thu, 01 Jun 95 21:17:48 -0700
Message-Id: <22231.802066668@apple.com>
X-Orig-Sender: fair@apple.com

This stuff is especially important to make clear for messages that are
going to be put though MD5 for authentication, and/or encryption...

	Erik E. Fair	apple!fair	fair@apple.com


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa06085;
          2 Jun 95 14:23 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa06079;
          2 Jun 95 14:23 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa09457;
          2 Jun 95 14:23 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id NAA14333 for ietf-822-list; Fri, 2 Jun 1995 13:31:56 -0400
Received: from po6.andrew.cmu.edu (PO6.ANDREW.CMU.EDU [128.2.10.106]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id NAA14330 for <ietf-822@dimacs.rutgers.edu>; Fri, 2 Jun 1995 13:31:54 -0400
Received: (from postman@localhost) by po6.andrew.cmu.edu (8.6.12/8.6.12) id NAA16718 for ietf-822@dimacs.rutgers.edu; Fri, 2 Jun 1995 13:31:48 -0400
Received: via switchmail; Fri,  2 Jun 1995 13:31:46 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/service/mailqs/testq0/QF.EjnofG200WBw80W2ss>;
          Fri,  2 Jun 1995 13:30:58 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/usr7/jm36/.Outgoing/QF.EjnofEm00WBw89xqch>;
          Fri,  2 Jun 1995 13:30:56 -0400 (EDT)
Received: from BatMail.robin.v2.14.CUILIB.3.45.SNAP.NOT.LINKED.hogtown.andrew.cmu.edu.sun4c.411
          via MS.5.6.hogtown.andrew.cmu.edu.sun4c_411;
          Fri,  2 Jun 1995 13:30:55 -0400 (EDT)
Message-ID: <4jnofD_00WBwA9xqQt@andrew.cmu.edu>
Date: Fri,  2 Jun 1995 13:30:55 -0400 (EDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: John Gardiner Myers <jgm+@cmu.edu>
To: ietf-822@dimacs.rutgers.edu
Subject: Re: Canonical Encoding Model wording suggestion
In-Reply-To: <01HR86HLQ76U8ZE5P5@INNOSOFT.COM>
References: <kjnVxeu00WBwE9xxUw@andrew.cmu.edu>
 <01HR7B1RYW388ZE5P5@INNOSOFT.COM>
	<01HR86HLQ76U8ZE5P5@INNOSOFT.COM>
Beak: Is

Ned Freed <NED@INNOSOFT.COM> writes:
> Agreed. That's why saying it as often as possible is probably a good idea

It might be a good idea to put that paragraph in *both* the Message
Bodies and Conformance documents.  I'm not sure where exactly it should
be put in Message Bodies to make it likely to be noticed.

-- 
_.John G. Myers		Internet: jgm+@CMU.EDU
			LoseNet:  ...!seismo!ihnp4!wiscvm.wisc.edu!give!up


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa07043;
          2 Jun 95 15:25 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa07039;
          2 Jun 95 15:25 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa10800;
          2 Jun 95 15:25 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id OAA15410 for ietf-822-list; Fri, 2 Jun 1995 14:55:06 -0400
Received: from gw2.att.com (gw1.att.com [192.20.239.133]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id OAA15406 for <ietf-822@dimacs.rutgers.edu>; Fri, 2 Jun 1995 14:55:05 -0400
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: hansen@pegasus.att.com
Received: from pegasus.UUCP by ig1.att.att.com id AA25123; Fri, 2 Jun 95 14:54:58 EDT
Message-Id: <9506021854.AA25123@ig1.att.att.com>
To: John Gardiner Myers <jgm+@cmu.edu>, ietf-822@dimacs.rutgers.edu
Date: Fri, 2 Jun 1995 14:50 EDT
Subject: Re: Canonical Encoding Model wording suggestion
Content-Type: text
Organization: AT&T Bell Laboratories, AT&T EasyLink Services

< Here's some wording I suggest be added to the Canonical Encoding Model
< section of the conformance document:
<
< NOTE: There has been confusion caused by systems which represent messages
< in a format which uses local newline conventions which differ from the
< RFC822 CRLF convention.  It is important to note that these formats are
< not canonical RFC822/MIME.  These formats are instead *encodings* of
< RFC822, where CRLF sequences in the canonical representation of the
< message are encoded as the local newline convention.  Note that formats
< which encode CRLF sequences as, for example, LF are not capable of
< representing MIME messages containing binary data which contains LF octets
< not part of CRLF line separation sequences.

I agree with all of this except the last sentence. The MIME systems I deal
with that use NL conventions all do it within just the text body parts.
Hence, your statement is not true for such systems.

					Tony Hansen
			  hansen@pegasus.att.com, tony@attmail.com


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa07195;
          2 Jun 95 15:37 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa07191;
          2 Jun 95 15:37 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa11022;
          2 Jun 95 15:37 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id NAA13948 for ietf-822-list; Fri, 2 Jun 1995 13:15:55 -0400
Received: from po6.andrew.cmu.edu (PO6.ANDREW.CMU.EDU [128.2.10.106]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id NAA13945 for <ietf-822@dimacs.rutgers.edu>; Fri, 2 Jun 1995 13:15:52 -0400
Received: (from postman@localhost) by po6.andrew.cmu.edu (8.6.12/8.6.12) id NAA16090 for ietf-822@dimacs.rutgers.edu; Fri, 2 Jun 1995 13:15:21 -0400
Received: via switchmail; Fri,  2 Jun 1995 13:15:18 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/service/mailqs/testq0/QF.YjnoPHy00WBwE0W2k=>;
          Fri,  2 Jun 1995 13:13:57 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/usr7/jm36/.Outgoing/QF.kjnoPFy00WBw09xpFb>;
          Fri,  2 Jun 1995 13:13:54 -0400 (EDT)
Received: from BatMail.robin.v2.14.CUILIB.3.45.SNAP.NOT.LINKED.hogtown.andrew.cmu.edu.sun4c.411
          via MS.5.6.hogtown.andrew.cmu.edu.sun4c_411;
          Fri,  2 Jun 1995 13:13:51 -0400 (EDT)
Message-ID: <IjnoPDG00WBwA9xp5A@andrew.cmu.edu>
Date: Fri,  2 Jun 1995 13:13:51 -0400 (EDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: John Gardiner Myers <jgm+@cmu.edu>
To: ietf-822@dimacs.rutgers.edu
Subject: Re: comments on latest MIME drafts
In-Reply-To: <01HR4FMWQFDS90MVQC@SIGURD.INNOSOFT.COM>
References: <01HQQG3M6YAI8WVYOU@INNOSOFT.COM> <01HQQG3M6YAI8WVYOU@INNOSOFT.COM>
	<01HR4FMWQFDS90MVQC@SIGURD.INNOSOFT.COM>
Beak: is Not

Ned Freed <NED@SIGURD.INNOSOFT.COM> writes:
> Then suggest alternative prose. We're years past the point where such
> generalities are acceptable input.

We appear to have different paradigms, so we have to discuss
generalities in order to discover prose that is commonly acceptable.

> I sort of agree with you about non-content headers actually being allowed in
> body parts. They are allowed syntactically but they have no sematics 
> associated with them.

There exists at least one MIME UA which puts "X-" headers in the
body-parts of multiparts in order to communicate with other instances
of itself.

> You tend to see things almost exclusively in terms of syntax. I do not. I see
> the various aspects of MIME in terms of the abstractions they are intended to
> represent and how their semantics tie in with those abstractions. Syntax then
> follows as secondary (or tertiary) concern.

It was clear that we had different paradigms and thus had problems
communicating.  Until I read this, i did not know what your paridgm
was.

Let me clarify my paradigm and suggest why it might be better than
yours.

In order for a data format to exist in the real world, it has to have
a syntax.  Data format specifications, such as MIME, usually specify
abstractions and give their semantics.  The semantics are then tied to
identifiable objects in the syntax.

Having the semantics be associated with identifiable syntactic objects
simplifies the task of generating and reading the data format.
Composers generate the syntactic constructs corresponding to the
semantics they want to convey.  Readers discover semantics by first
doing a parse to discover the syntax, then applying the association of
semantics to particular syntactic objects.

When semantics are not associated with syntactic objects, or when the
syntactic objects associated with given semanitics are not clearly
identifiable, then having a reader discover those semantics is
problematic.

> And there is quite clearly a HUGE
> difference in the abstract between a body part and a message.

In the abstract, they each have some features in common and some
features which differ.

The semantics they have in common (header/body syntax, content-
headers) I am trying to associate with the syntactic object known as
an "entity".

The semantics specific to a message (MIME-Version, destination and
other 822-defined fields) I am trying to associate with the syntactic
object known as a "message", which is a subset of an "entity".

The semantics specific to a body part (contained in a multipart, does
not require MIME-Version, may not contain enclosing multipart's
delimiter) I am trying to associate with the syntactic object known as
a "body-part", which is a disjoint subset of an "entity".

> The fact that they have nearly identical syntax does not mean that
> they are in fact the same -- they aren't. As such, trying to tie
> these things down using syntax as a distinguishing factor clouds the
> issues rather than clarifying anything.

They have some syntax (and associated semantics) in common, and they
have some syntax (and associated semantics) by themselves.

This is similar to comparing the value of a Sender: field with the
value of a Message-ID: field.  They both have some common semantics
associated with their included common syntactic object of an
addr-spec.  They are not, however, the same, but tying down their
common semantics using syntax does clarify things.

> Part of the problem surrounding the term "body part" is that this
> term has a well understood meaning semantically. It always has.

Actually, I don't think the meaning of the term is well understood.
It appears to be used for at least two different concepts.

> The term "body part" refers to headers and contents of either a
> message or one of the parts in the body of a multipart entity. Any
> sort of header may be present but only the content headers actually
> have any meaning in the context of a body part. A body part has a
> header and a body, so it makes sense to speak about the body of a
> body part.
> 
> Is this acceptable?

I think it's a bit confusing.  It also defines a term that is
a different concept than the body-part syntactic object, which has
semantics which do not apply to messages.

> Again, syntax and semantics are getting confused. Body parts can have
> non-content headers, but such headers do not tie into the body in any way.

"X-" headers in a body-part nonterminal are specifically allowed to
have private meanings.

> > Message bodies, Section 8.4, paragraph 1: this use of "body part" is
> > specific to multiparts.
> 
> Incorrect. CTE headers appear in contexts other than multiparts.

The previous sentence covers CTE headers appearing in messages.  It is
probably better to have a single sentence "If a
Content-Transfer-Encoding header field appears as part of an entity's
header, it applies to the entire body of that entity."

> > > The problem is that we define rules for body parts that have to
> > > apply to both the multipart and single part cases but don't apply to
> > > all entities.
> 
> > Could you give me an example of such a rule?
> 
> Any rule that defines something specific to MIME represents such an example.
> Entities can be messages and messages don't have to be MIME messages, hence
> MIME rules don't necessarily apply to all entities.

Here's another difference in paradigms.  In my paradigm, MIME rules
apply to all messages.  None of my MIME parsers ever look for
MIME-Version.

The MIME standard certainly permits this.  The MIME rules are
constructed such that if there are no Content-* headers, the MIME
rules are identical to the RFC 822 rules.  RFC 1049 Content-Type:
headers are not syntactically legal MIME Content-Type: headers, so a
MIME reader has the freedom to treat RFC 1049 Content-Type: headers as
it likes.

Even with the definition of "body part" in
draft-ietf-822ext-mime-imb-03.txt, messages which "aren't MIME
messages" have associated body parts.  Take the (presumably zero)
content headers of the message, along with the body and there's your
"body part".

If a message doesn't have a MIME-Version, then a receiving UA has the
option, given in section 6 of the message bodies document, of ignoring
all rules in MIME applying to the message body, including any rules
imposed by the fact that the message is an entity.

> >       entity = *field [ CRLF *OCTET ]
> 
> > It then follows that "message" and "body-part" are subsets of
> > "entity".
> 
> This is OK syntactically, but loses seriously on the semantics front. I
> don't see any way of making this change without making things even more
> confusing.

How does it lose?  You apply all the rules that you want for both
messages and body-parts, including the various Content-* headers, to
entities.  It appears to me to be a semantic win.

> > bodies are no longer restricted to 7bit data, so "*text" isn't
> > appropriate.
> 
> Neither is the original definition then. I'll change it to read:

The original definition of message from 822 was specific to 7bit data.
MIME expanded this in prose, but not in BNF, to be *OCTET.

By the way, I neglected to formally define OCTET.

> I actually like your original suggestion that we refer to the
> encoding of the body rather than that of the body part or entity
> better, so that's what I'll change it to.

Excellent.  This fits much better with the Canonical Encoding Model.
It is the body that the transfer-encoding is applied to, this
application is simply labeled in the headers of the body's enclosing entity.

-- 
_.John G. Myers		Internet: jgm+@CMU.EDU
			LoseNet:  ...!seismo!ihnp4!wiscvm.wisc.edu!give!up


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa08056;
          2 Jun 95 16:26 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa08052;
          2 Jun 95 16:26 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa12077;
          2 Jun 95 16:25 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id PAA16521 for ietf-822-list; Fri, 2 Jun 1995 15:57:39 -0400
Received: from po6.andrew.cmu.edu (PO6.ANDREW.CMU.EDU [128.2.10.106]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id PAA16518 for <ietf-822@dimacs.rutgers.edu>; Fri, 2 Jun 1995 15:57:37 -0400
Received: (from postman@localhost) by po6.andrew.cmu.edu (8.6.12/8.6.12) id PAA23702 for ietf-822@dimacs.rutgers.edu; Fri, 2 Jun 1995 15:57:25 -0400
Received: via switchmail; Fri,  2 Jun 1995 15:57:23 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/service/mailqs/testq0/QF.wjnqnJC00WBw00W386>;
          Fri,  2 Jun 1995 15:56:05 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/usr7/jm36/.Outgoing/QF.MjnqnDm00WBw49xtxg>;
          Fri,  2 Jun 1995 15:56:04 -0400 (EDT)
Received: from BatMail.robin.v2.14.CUILIB.3.45.SNAP.NOT.LINKED.hogtown.andrew.cmu.edu.sun4c.411
          via MS.5.6.hogtown.andrew.cmu.edu.sun4c_411;
          Fri,  2 Jun 1995 15:55:58 -0400 (EDT)
Message-ID: <4jnqnC200WBwE9xtkc@andrew.cmu.edu>
Date: Fri,  2 Jun 1995 15:55:58 -0400 (EDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: John Gardiner Myers <jgm+@cmu.edu>
To: ietf-822@dimacs.rutgers.edu
Subject: Re: Canonical Encoding Model wording suggestion
In-Reply-To: <9506021854.AA25123@ig1.att.att.com>
References: <9506021854.AA25123@ig1.att.att.com>
Beak: Is

hansen@pegasus.att.com writes:
> I agree with all of this except the last sentence. The MIME systems I deal
> with that use NL conventions all do it within just the text body parts.
> Hence, your statement is not true for such systems.

You'll have to describe in more detail what it is you're doing.  What
exactly is a "text body part"?

The systems I've had described to me which do something like this are
incompatible with MIME.  That is because what is or is not "text" is
not deterministic and a reader cannot always tell whether or not a 0a
octet maps to a 0a or to a 0d0a in the canonical MIME form.

-- 
_.John G. Myers		Internet: jgm+@CMU.EDU
			LoseNet:  ...!seismo!ihnp4!wiscvm.wisc.edu!give!up


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa09200;
          2 Jun 95 18:28 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa09196;
          2 Jun 95 18:28 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa14607;
          2 Jun 95 18:28 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id RAA18103 for ietf-822-list; Fri, 2 Jun 1995 17:56:10 -0400
Received: from THOR.INNOSOFT.COM (THOR.INNOSOFT.COM [192.160.253.66]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id RAA18100 for <ietf-822@dimacs.rutgers.edu>; Fri, 2 Jun 1995 17:56:07 -0400
Received: from INNOSOFT.COM by INNOSOFT.COM (PMDF V5.0-3 #2001)
 id <01HR79W0BQ5S8ZE5P5@INNOSOFT.COM>; Fri, 02 Jun 1995 10:13:34 -0700 (PDT)
Date: Fri, 02 Jun 1995 10:12:24 -0700 (PDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Ned Freed <NED@innosoft.com>
Subject: Re: Canonical Encoding Model wording suggestion
In-reply-to: "Your message dated Thu, 01 Jun 1995 21:17:48 -0700"
 <22231.802066668@apple.com>
To: "Erik E. Fair" <fair@apple.com>
Cc: Ned Freed <NED@innosoft.com>, John Gardiner Myers <jgm+@cmu.edu>, 
    ietf-822@dimacs.rutgers.edu
Message-id: <01HR86HLQ76U8ZE5P5@INNOSOFT.COM>
MIME-version: 1.0
Content-type: TEXT/PLAIN; CHARSET=US-ASCII
Content-transfer-encoding: 7BIT
References: <kjnVxeu00WBwE9xxUw@andrew.cmu.edu>
 <01HR7B1RYW388ZE5P5@INNOSOFT.COM>

> This stuff is especially important to make clear for messages that are
> going to be put though MD5 for authentication, and/or encryption...

Agreed. That's why saying it as often as possible is probably a good idea --
MIME should say it, security multiparts should, content-md5 should, etc.

				Ned


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa09593;
          2 Jun 95 19:28 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa09589;
          2 Jun 95 19:28 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa15498;
          2 Jun 95 19:28 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id RAA18003 for ietf-822-list; Fri, 2 Jun 1995 17:41:19 -0400
Received: from gw2.att.com (gw1.att.com [192.20.239.133]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id RAA18000 for <ietf-822@dimacs.rutgers.edu>; Fri, 2 Jun 1995 17:41:18 -0400
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: hansen@pegasus.att.com
Received: from pegasus.UUCP by ig1.att.att.com id AA08641; Fri, 2 Jun 95 17:41:10 EDT
Message-Id: <9506022141.AA08641@ig1.att.att.com>
To: John Gardiner Myers <jgm+@cmu.edu>, ietf-822@dimacs.rutgers.edu
Date: Fri, 2 Jun 1995 17:36 EDT
Subject: Re: Canonical Encoding Model wording suggestion
Content-Type: text
Organization: AT&T Bell Laboratories, AT&T EasyLink Services

< hansen@pegasus.att.com writes:
<< I agree with all of this except the last sentence. The MIME systems I
<< deal with that use NL conventions all do it within just the text body
<< parts. Hence, your statement is not true for such systems.

< You'll have to describe in more detail what it is you're doing.  What
< exactly is a "text body part"?

Any body part whose content type says "text":

    Content-Type: text/*

< The systems I've had described to me which do something like this are
< incompatible with MIME.  That is because what is or is not "text" is not
< deterministic and a reader cannot always tell whether or not a 0a octet
< maps to a 0a or to a 0d0a in the canonical MIME form.

If the content type says it's text, then you can make assumptions about it.
If it doesn't say it's text, then you can't, so you have to leave it alone.
(If there is no content type, you can assume it's text and treat it
accordingly.)

					Tony Hansen
			  hansen@pegasus.att.com, tony@attmail.com


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa10519;
          2 Jun 95 21:20 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa10515;
          2 Jun 95 21:20 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa16976;
          2 Jun 95 21:20 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id UAA19142 for ietf-822-list; Fri, 2 Jun 1995 20:12:06 -0400
Received: from sigurd.innosoft.com (SIGURD.INNOSOFT.COM [192.160.253.70]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id UAA19139 for <ietf-822@dimacs.rutgers.edu>; Fri, 2 Jun 1995 20:12:04 -0400
Received: from SIGURD.INNOSOFT.COM by SIGURD.INNOSOFT.COM (PMDF V5.1-0 #8790)
 id <01HR8C3H9M409I44UF@SIGURD.INNOSOFT.COM>; Fri,
 02 Jun 1995 17:11:30 -0700 (PDT)
Date: Fri, 02 Jun 1995 16:54:05 -0700 (PDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Ned Freed <NED@sigurd.innosoft.com>
Subject: Re: comments on latest MIME drafts
In-reply-to: "Your message dated Fri, 02 Jun 1995 13:13:51 -0400 (EDT)"
 <IjnoPDG00WBwA9xp5A@andrew.cmu.edu>
To: John Gardiner Myers <jgm+@cmu.edu>
Cc: ietf-822@dimacs.rutgers.edu
Message-id: <01HR8L2RPGWM9I44UF@SIGURD.INNOSOFT.COM>
X-Envelope-to: ietf-822@dimacs.rutgers.edu
MIME-version: 1.0
Content-type: TEXT/PLAIN; CHARSET=US-ASCII
Content-transfer-encoding: 7BIT
References: <01HQQG3M6YAI8WVYOU@INNOSOFT.COM> <01HQQG3M6YAI8WVYOU@INNOSOFT.COM>
 <01HR4FMWQFDS90MVQC@SIGURD.INNOSOFT.COM>
 <01HR4FMWQFDS90MVQC@SIGURD.INNOSOFT.COM>

> Ned Freed <NED@SIGURD.INNOSOFT.COM> writes:
> > Then suggest alternative prose. We're years past the point where such
> > generalities are acceptable input.

> We appear to have different paradigms, so we have to discuss
> generalities in order to discover prose that is commonly acceptable.

> > I sort of agree with you about non-content headers actually being allowed in
> > body parts. They are allowed syntactically but they have no sematics
> > associated with them.

> There exists at least one MIME UA which puts "X-" headers in the
> body-parts of multiparts in order to communicate with other instances
> of itself.

So what? The only semantics that matter here are those of MIME. It is of course 
permissible for people to add their own headers with private semantics.

> Let me clarify my paradigm and suggest why it might be better than
> yours.

> In order for a data format to exist in the real world, it has to have
> a syntax.  Data format specifications, such as MIME, usually specify
> abstractions and give their semantics.  The semantics are then tied to
> identifiable objects in the syntax.

> Having the semantics be associated with identifiable syntactic objects
> simplifies the task of generating and reading the data format.
> Composers generate the syntactic constructs corresponding to the
> semantics they want to convey.  Readers discover semantics by first
> doing a parse to discover the syntax, then applying the association of
> semantics to particular syntactic objects.

I disagree 100% with all of this. People do not discover sematics by
implementing parsers. They discover them by reading specifications. We have a
serious problem if people have to implement parsers before they can understand
MIME. The set of people who need to understand MIME semantics is far larger
than the set who worry about specifics of syntax, let alone go so far as to
implement parsers.

> When semantics are not associated with syntactic objects, or when the
> syntactic objects associated with given semanitics are not clearly
> identifiable, then having a reader discover those semantics is
> problematic.

Well, sort of. It certainly helps for semantics to be bound to various syntax
elements, as they are in MIME. But it certainly isn't necessary for semantics
to exist, nor is it necessary for different semantic constructs to bind unique
syntactic elements. In fact it can be quite the opposite -- dates appear in all
sorts of places in header fields, but I don't hear anyone suggesting that the
semantics of date need to be represented differently in all of these fields or
that dates are not important entities semantically.

> > And there is quite clearly a HUGE
> > difference in the abstract between a body part and a message.

> In the abstract, they each have some features in common and some
> features which differ.

Right. They are different.

> The semantics they have in common (header/body syntax, content-
> headers) I am trying to associate with the syntactic object known as
> an "entity".

And I think this is a very bad idea. Entities are more general than this.

> The semantics specific to a message (MIME-Version, destination and
> other 822-defined fields) I am trying to associate with the syntactic
> object known as a "message", which is a subset of an "entity".

Fine with me.

> The semantics specific to a body part (contained in a multipart, does
> not require MIME-Version, may not contain enclosing multipart's
> delimiter) I am trying to associate with the syntactic object known as
> a "body-part", which is a disjoint subset of an "entity".

And this flies in the face of common usage, common understanding, and common
sense. It makes MIME much harder to understand, and I am not willing to do it.
This is an absolute showstopper for me.

> > The fact that they have nearly identical syntax does not mean that
> > they are in fact the same -- they aren't. As such, trying to tie
> > these things down using syntax as a distinguishing factor clouds the
> > issues rather than clarifying anything.

> They have some syntax (and associated semantics) in common, and they
> have some syntax (and associated semantics) by themselves.

But they also each have their own semantics as well as their own syntax.

> This is similar to comparing the value of a Sender: field with the
> value of a Message-ID: field.  They both have some common semantics
> associated with their included common syntactic object of an
> addr-spec.  They are not, however, the same, but tying down their
> common semantics using syntax does clarify things.

Its all a question of where you draw the lines.

> > Part of the problem surrounding the term "body part" is that this
> > term has a well understood meaning semantically. It always has.

> Actually, I don't think the meaning of the term is well understood.
> It appears to be used for at least two different concepts.

Well, if you mean that there's a well understood common sense meaning that
is what most people mean when they say "body part", versus the old,
nonsensical definition that managed to slip into MIME, then I certainly
agree.

> > The term "body part" refers to headers and contents of either a
> > message or one of the parts in the body of a multipart entity. Any
> > sort of header may be present but only the content headers actually
> > have any meaning in the context of a body part. A body part has a
> > header and a body, so it makes sense to speak about the body of a
> > body part.
> >
> > Is this acceptable?

> I think it's a bit confusing.  It also defines a term that is
> a different concept than the body-part syntactic object, which has
> semantics which do not apply to messages.

What semantics does it have that don't apply to MIME messages?

> > Again, syntax and semantics are getting confused. Body parts can have
> > non-content headers, but such headers do not tie into the body in any way.

> "X-" headers in a body-part nonterminal are specifically allowed to
> have private meanings.

Of course. But they don't have any meaning that's defined in MIME. And that's
all that matters here.

> > > > The problem is that we define rules for body parts that have to
> > > > apply to both the multipart and single part cases but don't apply to
> > > > all entities.
> >
> > > Could you give me an example of such a rule?
> >
> > Any rule that defines something specific to MIME represents such an example.
> > Entities can be messages and messages don't have to be MIME messages, hence
> > MIME rules don't necessarily apply to all entities.

> Here's another difference in paradigms.  In my paradigm, MIME rules
> apply to all messages.  None of my MIME parsers ever look for
> MIME-Version.

The Working Group rejected exactly this paradigm some time ago, preferring
instead to go with the approach of MIME messages being a proper subset of
RFC822 messages.

> The MIME standard certainly permits this.  The MIME rules are
> constructed such that if there are no Content-* headers, the MIME
> rules are identical to the RFC 822 rules.

It does indeed permit this, because this is the way we originally planned
to do it.

> RFC 1049 Content-Type:
> headers are not syntactically legal MIME Content-Type: headers, so a
> MIME reader has the freedom to treat RFC 1049 Content-Type: headers as
> it likes.

Not if it treats all messages as MIME messages.

> Even with the definition of "body part" in
> draft-ietf-822ext-mime-imb-03.txt, messages which "aren't MIME
> messages" have associated body parts.  Take the (presumably zero)
> content headers of the message, along with the body and there's your
> "body part".

Sure. This can happen in MIME messages as well.

> If a message doesn't have a MIME-Version, then a receiving UA has the
> option, given in section 6 of the message bodies document, of ignoring
> all rules in MIME applying to the message body, including any rules
> imposed by the fact that the message is an entity.

Sure, but so what?

> > >       entity = *field [ CRLF *OCTET ]
> >
> > > It then follows that "message" and "body-part" are subsets of
> > > "entity".
> >
> > This is OK syntactically, but loses seriously on the semantics front. I
> > don't see any way of making this change without making things even more
> > confusing.

> How does it lose?  You apply all the rules that you want for both
> messages and body-parts, including the various Content-* headers, to
> entities.  It appears to me to be a semantic win.

Because it blurs the distinction between body parts and messages.
The early MIME work presented us with substantial evidence that losing
this distinction is very bad.

> > > bodies are no longer restricted to 7bit data, so "*text" isn't
> > > appropriate.
> >
> > Neither is the original definition then. I'll change it to read:

> The original definition of message from 822 was specific to 7bit data.
> MIME expanded this in prose, but not in BNF, to be *OCTET.

> By the way, I neglected to formally define OCTET.

I already corrected this omission.

				Ned


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa10865;
          2 Jun 95 22:18 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa10861;
          2 Jun 95 22:18 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa17699;
          2 Jun 95 22:17 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id VAA24057 for ietf-822-list; Fri, 2 Jun 1995 21:19:30 -0400
Received: from po8.andrew.cmu.edu (PO8.ANDREW.CMU.EDU [128.2.10.108]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id VAA24054 for <ietf-822@dimacs.rutgers.edu>; Fri, 2 Jun 1995 21:19:27 -0400
Received: (from postman@localhost) by po8.andrew.cmu.edu (8.6.12/8.6.12) id VAA29156 for ietf-822@dimacs.rutgers.edu; Fri, 2 Jun 1995 21:19:22 -0400
Received: via switchmail; Fri,  2 Jun 1995 21:19:19 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/service/mailqs/testq0/QF.QjnvUXi00WBw40W3M8>;
          Fri,  2 Jun 1995 21:17:24 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/usr7/jm36/.Outgoing/QF.UjnvUVi00WBwA9xmwO>;
          Fri,  2 Jun 1995 21:17:21 -0400 (EDT)
Received: from BatMail.robin.v2.14.CUILIB.3.45.SNAP.NOT.LINKED.hogtown.andrew.cmu.edu.sun4c.411
          via MS.5.6.hogtown.andrew.cmu.edu.sun4c_411;
          Fri,  2 Jun 1995 21:17:20 -0400 (EDT)
Message-ID: <AjnvUU600WBwQ9xmlx@andrew.cmu.edu>
Date: Fri,  2 Jun 1995 21:17:20 -0400 (EDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: John Gardiner Myers <jgm+@cmu.edu>
To: ietf-822@dimacs.rutgers.edu
Subject: Re: Canonical Encoding Model wording suggestion
In-Reply-To: <9506022141.AA08641@ig1.att.att.com>
References: <9506022141.AA08641@ig1.att.att.com>
Beak: is Not

hansen@pegasus.att.com writes:
> < You'll have to describe in more detail what it is you're doing.  What
> < exactly is a "text body part"?
> 
> Any body part whose content type says "text":
> 
>     Content-Type: text/*

So, the headers for body parts with other top-level types have lines
separated by CRLF?

Did you instead mean "any body with the text top-level content type"?

What about bodies with top-level type "message" and "multipart",
especially ones which contain bodies with top-level type "text" inside
them?

-- 
_.John G. Myers		Internet: jgm+@CMU.EDU
			LoseNet:  ...!seismo!ihnp4!wiscvm.wisc.edu!give!up


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa20118;
          3 Jun 95 3:14 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa20114;
          3 Jun 95 3:14 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa21441;
          3 Jun 95 3:13 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id CAA26185 for ietf-822-list; Sat, 3 Jun 1995 02:19:16 -0400
Received: from wilma.cs.utk.edu (WILMA.CS.UTK.EDU [128.169.94.141]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id CAA26182 for <ietf-822@dimacs.rutgers.edu>; Sat, 3 Jun 1995 02:19:15 -0400
Received: from LOCALHOST by wilma.cs.utk.edu with SMTP (cf v2.11c-UTK)
          id CAA01489; Sat, 3 Jun 1995 02:18:59 -0400
Message-Id: <199506030618.CAA01489@wilma.cs.utk.edu>
X-URI: http://www.cs.utk.edu/~moore/
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Keith Moore <moore@cs.utk.edu>
To: hansen@pegasus.att.com
cc: John Gardiner Myers <jgm+@cmu.edu>, ietf-822@dimacs.rutgers.edu, 
    moore@cs.utk.edu
Subject: Re: Canonical Encoding Model wording suggestion 
In-reply-to: Your message of "Fri, 02 Jun 1995 14:50:00 EDT."
             <9506021854.AA25123@ig1.att.att.com> 
Date: Sat, 03 Jun 1995 02:18:53 -0400
X-Orig-Sender: moore@cs.utk.edu

> I agree with all of this except the last sentence. The MIME systems I deal
> with that use NL conventions all do it within just the text body parts.
> Hence, your statement is not true for such systems.

What about the ends-of-line that surround MIME boundary markers?  Is
the EOL before a boundary marker in a non-text part represented as
CRLF or LF?  How about the one after the boundary marker?

If you're not *very* careful, you'll end up with something which
"looks" exactly like MIME, but is incompatible with it.  There's been
enough problems with various systems (including attmail) that look a
lot like 822 but aren't compatible with it.  Seems like we would want
to avoid that problem the second time around.

My take on MIME in environments where EOF != CRLF is that you use the
content-transfer-encoding to decide whether you are in transparent
mode or local-newline-mode.  So if the c-t-e is binary then EOF is
CRLF within that body part; otherwise EOF is the local newline.

This needs to be nailed down before people start shipping product
that makes one assumption or another.

Keith


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa14049;
          5 Jun 95 20:16 EDT
Received: from [132.151.1.1] by IETF.CNRI.Reston.VA.US id aa14045;
          5 Jun 95 20:15 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa03464;
          5 Jun 95 20:15 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id SAA17232 for ietf-822-list; Mon, 5 Jun 1995 18:12:27 -0400
Received: from domen.uninett.no (domen.uninett.no [129.241.131.10]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id SAA17227 for <ietf-822@dimacs.rutgers.edu>; Mon, 5 Jun 1995 18:12:23 -0400
Received: from trhm4.or.uninett.no by domen.uninett.no with SMTP (PP) 
          id <25651-0@domen.uninett.no>; Tue, 6 Jun 1995 00:11:43 +0200
Received: from dale.uninett.no (localhost [127.0.0.1]) 
          by dale.uninett.no (8.6.9/8.6.9) with ESMTP id BAA00681;
          Mon, 5 Jun 1995 01:08:08 +0200
Message-Id: <199506042308.BAA00681@dale.uninett.no>
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Harald.T.Alvestrand@uninett.no
To: John Gardiner Myers <jgm+@cmu.edu>
cc: ietf-822@dimacs.rutgers.edu
Subject: Re: transfer-encodings on subtypes of "message"
In-reply-to: Your message of "Tue, 30 May 1995 16:25:18 EDT." <Ijmrwi_00WBwI0wXUn@andrew.cmu.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <678.802307287.1@dale.uninett.no>
Date: Mon, 05 Jun 1995 01:08:07 +0200
X-Orig-Sender: hta@dale.uninett.no

I am convinced by JGM's argument that if subtypes of "message"
can choose to allow or to disallow "base64" or "quoted-printable"
encoding in the subtype definition, deploying the 8BITMIME SMTP
extension will be impossible, because of the difficulty in deciding
what to do when encountering an unknown message/* subtype with
8bit data.

I am also close to the conclusion that ANY message/* type that
contains non-7bit data and is not a Message/RFC822 with a
MIME-Version: header will make it impossible to communicate
successfully through an 8-to-7 bit gateway, but I'm willing to
be convinced otherwise on that point. (Just tell me how!)

If the latter is true, should we go whole hog and say that ALL
subtypes of MESSAGE, except for message/rfc822 with MIME headers,
MUST have c-t-e 7bit???????

             Harald A


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa19090;
          5 Jun 95 23:17 EDT
Received: from [132.151.1.1] by IETF.CNRI.Reston.VA.US id aa19086;
          5 Jun 95 23:17 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa05994;
          5 Jun 95 23:17 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id WAA24430 for ietf-822-list; Mon, 5 Jun 1995 22:55:00 -0400
Received: from gw2.att.com (gw1.att.com [192.20.239.133]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id WAA24427 for <ietf-822@dimacs.rutgers.edu>; Mon, 5 Jun 1995 22:54:59 -0400
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: hansen@pegasus.att.com
Received: from pegasus.UUCP by ig1.att.att.com id AA07867; Mon, 5 Jun 95 22:54:49 EDT
Message-Id: <9506060254.AA07867@ig1.att.att.com>
To: Harald.T.Alvestrand@uninett.no, John Gardiner Myers <jgm+@cmu.edu>
Cc: ietf-822@dimacs.rutgers.edu
Date: Mon, 5 Jun 1995 22:27 EDT
Subject: Re: transfer-encodings on subtypes of "message"
Content-Type: text
Organization: AT&T Bell Laboratories, AT&T EasyLink Services

< I am convinced by JGM's argument that if subtypes of "message" can choose
< to allow or to disallow "base64" or "quoted-printable" encoding in the
< subtype definition, deploying the 8BITMIME SMTP extension will be
< impossible, because of the difficulty in deciding what to do when
< encountering an unknown message/* subtype with 8bit data.
<
< I am also close to the conclusion that ANY message/* type that contains
< non-7bit data and is not a Message/RFC822 with a MIME-Version: header will
< make it impossible to communicate successfully through an 8-to-7 bit
< gateway, but I'm willing to be convinced otherwise on that point. (Just
< tell me how!)

Yes, gatewaying message/non-rfc822 is a serious problem. For example, if you
receive an 8-bit mime message which has been encapsulated and split up using
message/partial, and need to send it across a 7-bit SMTP link, there's no
way to downgrade it to a 7-bit message unless you either:

    o	collect the entire message, downgrade to 7-bit mime, and split it up
	again

or

    o	go illegal and ignore the restrictions on using base64 or quoted-printable
	content-transfer-encoding with message/*

< If the latter is true, should we go whole hog and say that ALL subtypes of
< MESSAGE, except for message/rfc822 with MIME headers, MUST have c-t-e
< 7bit???????

If the message had been downgraded to 7-bit before being encapsulated and
split up, this problem wouldn't be there. But of course, the gateway can't
force that. Or if we were allowed to introduce content-transfer-encodings of
base64 or quoted-printable, this problem wouldn't be there.

I'm tempted to recommend that this restriction be lifted before MIME becomes
standard. (This would be a pure extension since it's something which is
currently disallowed.) I know there were problems envisioned with allowing
base64 message/*, but the problems with not allowing it may be more serious.

					Tony Hansen
			  hansen@pegasus.att.com, tony@attmail.com


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa24964;
          6 Jun 95 3:23 EDT
Received: from [132.151.1.1] by IETF.CNRI.Reston.VA.US id aa24960;
          6 Jun 95 3:23 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa09421;
          6 Jun 95 3:23 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id CAA25804 for ietf-822-list; Tue, 6 Jun 1995 02:08:46 -0400
Received: from domen.uninett.no (domen.uninett.no [129.241.131.10]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id CAA25801 for <ietf-822@dimacs.rutgers.edu>; Tue, 6 Jun 1995 02:08:45 -0400
Received: from dale.uninett.no by domen.uninett.no with SMTP (PP) 
          id <04284-0@domen.uninett.no>; Tue, 6 Jun 1995 08:08:31 +0200
Received: from dale.uninett.no (localhost [127.0.0.1]) 
          by dale.uninett.no (8.6.9/8.6.9) with ESMTP id IAA02599;
          Tue, 6 Jun 1995 08:08:26 +0200
Message-Id: <199506060608.IAA02599@dale.uninett.no>
X-Mailer: exmh version 1.5.3 12/28/94
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Harald.T.Alvestrand@uninett.no
To: hansen@pegasus.att.com
cc: John Gardiner Myers <jgm+@cmu.edu>, ietf-822@dimacs.rutgers.edu
Subject: Re: transfer-encodings on subtypes of "message"
In-reply-to: Your message of "Mon, 05 Jun 1995 22:27:00 EDT." <9506060254.AA07867@ig1.att.att.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Tue, 06 Jun 1995 08:08:23 +0200
X-Orig-Sender: hta@dale.uninett.no

For message/partial, the problem was solved by requiring the use of only
7bit encoding - see section 7.2.2.2 of draft-ietf-822ext-mime-imt-01.txt;
I believe similar prose existed in RFC 1521.

The reason for the restriction was to avoid "nested encodings", which
would require multiple passes over a message in order to decide whether
it was possible to handle the message or not; this was considered a Bad
Thing by the original MIME group.

At the moment, I'm almost tempted to take one of Stef's ideas and define
an Application/MIME type which can encapsulate anything (includig an 8bit
message/*) and apply a transfer encoding, in order to get out of this bind.
But I don't like this "solution".

8-to-7-bit downgrading needs to be studied!

           Harald A


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa07384;
          6 Jun 95 13:37 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa07380;
          6 Jun 95 13:37 EDT
Received: from [128.6.75.16] by CNRI.Reston.VA.US id aa12550; 6 Jun 95 13:37 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id MAA03006 for ietf-822-list; Tue, 6 Jun 1995 12:19:11 -0400
Received: from THOR.INNOSOFT.COM (THOR.INNOSOFT.COM [192.160.253.66]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id MAA02997 for <ietf-822@dimacs.rutgers.edu>; Tue, 6 Jun 1995 12:19:09 -0400
Received: from INNOSOFT.COM by INNOSOFT.COM (PMDF V5.0-3 #2001)
 id <01HRDN8ZLP1S8ZE5P5@INNOSOFT.COM>; Tue, 06 Jun 1995 09:16:34 -0700 (PDT)
Date: Tue, 06 Jun 1995 08:19:42 -0700 (PDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Ned Freed <NED@innosoft.com>
Subject: Re: transfer-encodings on subtypes of "message"
In-reply-to: "Your message dated Mon, 05 Jun 1995 01:08:07 +0200"
 <199506042308.BAA00681@dale.uninett.no>
To: Harald.T.Alvestrand@uninett.no
Cc: John Gardiner Myers <jgm+@cmu.edu>, ietf-822@dimacs.rutgers.edu
Message-id: <01HRDPNBBQXA8ZE5P5@INNOSOFT.COM>
MIME-version: 1.0
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7BIT
References: <Ijmrwi_00WBwI0wXUn@andrew.cmu.edu>

> I am convinced by JGM's argument that if subtypes of "message"
> can choose to allow or to disallow "base64" or "quoted-printable"
> encoding in the subtype definition, deploying the 8BITMIME SMTP
> extension will be impossible, because of the difficulty in deciding
> what to do when encountering an unknown message/* subtype with
> 8bit data.

I don't find this line of reasoning to be very convincing.

First of all, it is simply incorrect to say that quoted-printable and/or base64
encodings on subtypes of message cause problems for 8-to-7 gateways. The reason
for this is pretty obvious: By definition something with a quoted-printable or
base64 encoding cannot contain any 8bit data!!! 8-to-7 gateways don't have
anything to do in such cases as a result.

Use of such encodings may cause major trouble when you go to read such a
message (the so-called nested encoding problem), but this is a separate issue
that would only be a problem with subtypes that are similar to message/rfc822.

The real problem for 8-to-7 gateways is caused by use of either 8bit or binary
encoding. Nothing more and nothing less. And since its inception MIME has
allowed the use of both the 8bit and binary encodings on message/rfc822. And
there has never been anything in MIME that would preclude using 8bit or binary
encodings on a new subtype of message.

The problem (obviously) arises when an 8bit or binary encoding is used on an
unknown subtype of message. Since message is a composite type, gateways have to
know how to get inside of each different sort of subtype in order to encode it
properly and avoid the nested encoding rule. And doing this requires gateway
reconfiguration at a minimum and possibly even additional code in every
gateway.

The change in the new MIME draft actually simplifies the addition of new
message subtypes somewhat rather than complicating it. Specifically,  where it
makes sense to allow quoted-printable or base64 encoding, new subtypes of
message can now avail themselves of existing encoding facilities. This was not
possible before. But you still have to know how to handle each subtype. This
hasn't changed and I don't see a way of changing it short of an outright ban on
additional subtypes that require recursive processing. (This is essentially
what I'm going to propose below.)

> I am also close to the conclusion that ANY message/* type that
> contains non-7bit data and is not a Message/RFC822 with a
> MIME-Version: header will make it impossible to communicate
> successfully through an 8-to-7 bit gateway, but I'm willing to
> be convinced otherwise on that point. (Just tell me how!)

How about this approach: Place an outright ban on the use of 8bit or binary
encoding on  all subtypes of message other than rfc822. This lets 8-to-7
gateways use the following two step procedure:

(1) Is it message/rfc822? If it is, check to see if its encoded
    using quoted-printable or base64 and if so, flag it as an error.
    Otherwise handle message/rfc822 recursively.

(2) Is it encoded as 7bit, quoted-printable, or base64? Stop if it
    is, and if it isn't flag it as an error.

I don't see any problems with this approach other than the fact that it
eliminates the possibility of registering a new subtype that behaves like
message/rfc822. On the other hand, standards can always be revised, and perhaps
having to revise the standard to add such a thing is a good idea.

> If the latter is true, should we go whole hog and say that ALL
> subtypes of MESSAGE, except for message/rfc822 with MIME headers,
> MUST have c-t-e 7bit???????

I can live with this, but it really isn't necessary. The only encodings
that must be avoided on new subtypes of message are 8bit and binary. 7bit,
quoted-printable, and base64 are all fine, at least as far as 8-to-7
(and, for that matter, binary-to-8 and binary-to-7) are concerned.

				Ned


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa00549;
          7 Jun 95 4:18 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa00545;
          7 Jun 95 4:18 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa01166;
          7 Jun 95 4:18 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id DAA21246 for ietf-822-list; Wed, 7 Jun 1995 03:35:30 -0400
Received: from domen.uninett.no (domen.uninett.no [129.241.131.10]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id DAA21243 for <ietf-822@dimacs.rutgers.edu>; Wed, 7 Jun 1995 03:35:23 -0400
Received: from dale.uninett.no by domen.uninett.no with SMTP (PP) 
          id <27905-0@domen.uninett.no>; Wed, 7 Jun 1995 09:35:18 +0200
Received: from dale.uninett.no (localhost [127.0.0.1]) 
          by dale.uninett.no (8.6.9/8.6.9) with ESMTP id JAA06319;
          Wed, 7 Jun 1995 09:35:14 +0200
Message-Id: <199506070735.JAA06319@dale.uninett.no>
X-Mailer: exmh version 1.5.3 12/28/94
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Harald.T.Alvestrand@uninett.no
To: Ned Freed <NED@innosoft.com>
cc: John Gardiner Myers <jgm+@cmu.edu>, ietf-822@dimacs.rutgers.edu
Subject: Re: transfer-encodings on subtypes of "message"
In-reply-to: Your message of "Tue, 06 Jun 1995 08:19:42 PDT." <01HRDPNBBQXA8ZE5P5@INNOSOFT.COM>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Wed, 07 Jun 1995 09:35:12 +0200
X-Orig-Sender: hta@dale.uninett.no

Ned,
after digging through a level or two of confusion, I find (as usual) that we 
agree pretty much.

The problem as I see it seems to be the same as you see: The case where a 
message/foo with 8bit or Binary encoding arrives at an 8-to-7 gateway; this 
was not clear enough in my earlier prose.

I like your solution - it specifies a nice, clean algorithm to implement in 
gateways, and leaves the burden of pssibly recursively encoded message/foo 
types where it belongs - in UAs that want to deal with newly defined types.

                      Harald A


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa07727;
          7 Jun 95 13:23 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa07723;
          7 Jun 95 13:23 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa10723;
          7 Jun 95 13:23 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id MAA03258 for ietf-822-list; Wed, 7 Jun 1995 12:41:05 -0400
Received: from VM1.ucc.okstate.edu (vm1.ucc.okstate.edu [139.78.100.4]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id MAA03251 for <ietf-822@dimacs.rutgers.edu>; Wed, 7 Jun 1995 12:40:58 -0400
Message-Id: <199506071640.MAA03251@dimacs.rutgers.edu>
Received: from OSUVM1.BITNET by VM1.ucc.okstate.edu (IBM VM SMTP V2R2)
   with BSMTP id 3229; Wed, 07 Jun 95 11:38:48 CST
Received: from OSUVM1.BITNET by OSUVM1.BITNET (Mailer R2.08 R208004) with BSMTP
 id 8690; Wed, 07 Jun 95 10:32:54 CST
Date:         Wed, 07 Jun 95 10:06:04 CST
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Brent Stilley <UCCXBRS@vm1.ucc.okstate.edu>
Organization: Oklahoma State University
Subject:      Re: transfer-encodings on subtypes of "message"
To: Harald.T.Alvestrand@uninett.no
cc: ietf-822@dimacs.rutgers.edu
In-Reply-To:  Message of Wed, 07 Jun 1995 09:35:12 +0200 from
 <Harald.T.Alvestrand@uninett.no>

On Wed, 07 Jun 1995 09:35:12 +0200 you said:
>
>I like your solution - it specifies a nice, clean algorithm to implement in
>gateways, and leaves the burden of possibly recursively encoded message/foo
>types where it belongs - in UAs that want to deal with newly defined types.

I'd be interested in a little more talk about this "burden" since I'm not
familiar with the "nested encoding rule" Ned referenced - if it applies at
all to message/rfc822.

How can a gateway (e.g. to a LAN mail system) which doesn't have a "user" to
provide advise, systematically deal with nested encodings?

For example, we currently attempt to parse nested message/rfc822 so that we
can give the cc:Mail user back his/her original attachments when a message
bounces (obviously parsing error reports is risky business, so we have a
fallback).

If we send out a message with 8bit or binary C-T-E body parts (I use the term
with great trepidation ;-) which at some later point traverses a 8-to-7 bit
gateway, and at a later point bounces back to the sending system, is it
possible/likely that we would be faced with the need to do an outer level
decode of the body of the error report before we could even see the
message/rfc822?

Also, in Ned's earlier two-step proposal:

 (1) Is it message/rfc822? If it is, check to see if its encoded
     using quoted-printable or base64 and if so, flag it as an error.
     Otherwise handle message/rfc822 recursively.

 (2) Is it encoded as 7bit, quoted-printable, or base64? Stop if it
     is, and if it isn't flag it as an error.

rule #2 doesn't make sense to me. Apparently it applies to all *other*
type/subtypes? Otherwise I don't understand the dual QP and base64 references
in the two rules.

Brent Stilley,  Oklahoma State University, 113 Math Sciences, Stillwater, 74078


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa16258;
          7 Jun 95 22:22 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa16253;
          7 Jun 95 22:21 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa28302;
          7 Jun 95 22:21 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id VAA28416 for ietf-822-list; Wed, 7 Jun 1995 21:25:47 -0400
Received: from THOR.INNOSOFT.COM (THOR.INNOSOFT.COM [192.160.253.66]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id VAA28413 for <ietf-822@dimacs.rutgers.edu>; Wed, 7 Jun 1995 21:25:43 -0400
Received: from INNOSOFT.COM by INNOSOFT.COM (PMDF V5.0-3 #2001)
 id <01HRFMC0UX288ZDVD1@INNOSOFT.COM>; Wed, 07 Jun 1995 18:24:54 -0700 (PDT)
Date: Wed, 07 Jun 1995 18:08:11 -0700 (PDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Ned Freed <NED@innosoft.com>
Subject: Re: transfer-encodings on subtypes of "message"
In-reply-to: "Your message dated Wed, 07 Jun 1995 10:06:04 -0600 (CST)"
 <199506071640.MAA03251@dimacs.rutgers.edu>
To: Brent Stilley <UCCXBRS@vm1.ucc.okstate.edu>
Cc: Harald.T.Alvestrand@uninett.no, ietf-822@dimacs.rutgers.edu
Message-id: <01HRFN3ICR968ZDVD1@INNOSOFT.COM>
MIME-version: 1.0
Content-type: TEXT/PLAIN; CHARSET=US-ASCII
Content-transfer-encoding: 7BIT
References: <Harald.T.Alvestrand@uninett.no>

> > I like your solution - it specifies a nice, clean algorithm to implement in
> > gateways, and leaves the burden of possibly recursively encoded message/foo
> > types where it belongs - in UAs that want to deal with newly defined types.

> I'd be interested in a little more talk about this "burden" since I'm not
> familiar with the "nested encoding rule" Ned referenced - if it applies at
> all to message/rfc822.

The nested encoding rule is part of the core MIME specification. Simply put,
composite objects that MIME parsers are required to handle recursively and
hence can contain objects that are encoded with base64 or quoted-printable
cannot themselves be encoded using either base64 or quoted-printable.

> How can a gateway (e.g. to a LAN mail system) which doesn't have a "user" to
> provide advise, systematically deal with nested encodings?

There is no problem dealing with them systematically, but that's not the point.
The point is complexity. If nested encodings are allowed it is possible for a
part to encoded using quoted-printable or base64 multiple times. Not only is
this wasteful, it either complicates implementations considerably or else calls
for a relatively inefficient multi-pass decoding strategy.

> For example, we currently attempt to parse nested message/rfc822 so that we
> can give the cc:Mail user back his/her original attachments when a message
> bounces (obviously parsing error reports is risky business, so we have a
> fallback).

> If we send out a message with 8bit or binary C-T-E body parts (I use the term
> with great trepidation ;-) which at some later point traverses a 8-to-7 bit
> gateway, and at a later point bounces back to the sending system, is it
> possible/likely that we would be faced with the need to do an outer level
> decode of the body of the error report before we could even see the
> message/rfc822?

No. This is precisely what the nested encoding rule prohibits.

> Also, in Ned's earlier two-step proposal:

>  (1) Is it message/rfc822? If it is, check to see if its encoded
>      using quoted-printable or base64 and if so, flag it as an error.
>      Otherwise handle message/rfc822 recursively.

>  (2) Is it encoded as 7bit, quoted-printable, or base64? Stop if it
>      is, and if it isn't flag it as an error.

> rule #2 doesn't make sense to me. Apparently it applies to all *other*
> type/subtypes?

It only applies if rule #1 doesn't. But that's doesn't make it apply to
all other type/subtypes. This discussion is about the message type only. It
doesn't generalize.

The underlying issue here is what subtypes of message are actually used for. If
they are used to define new objects that MIME parsers must handle directly and
recursively, you have to teach parsers (including those in 8-to-7 gateways)
about them. There is no way to avoid doing so. But only the rfc822 subtype of
message is directly recursive -- the partial and external-body subtypes both
involve indirect recursion. They cannot be downgraded by an 8-to-7 gateway even
if you wanted to, so the definitions themselves restrict the set of permissible
encodings to make it unnecessary for 8-to-7 gateways to do anything at all.

My proposal is simply to restrict the encodings you can have on new subtypes of
message to the largest set that will not cause any problems with 8-to-7. The
previous MIME document restricted things in a way that was neither
necessary nor sufficient to obtain this effect.

				Ned


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa00562;
          8 Jun 95 4:27 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa00558;
          8 Jun 95 4:27 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa01245;
          8 Jun 95 4:27 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id DAA01433 for ietf-822-list; Thu, 8 Jun 1995 03:37:25 -0400
Received: from domen.uninett.no (domen.uninett.no [129.241.131.10]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id DAA01430 for <ietf-822@dimacs.rutgers.edu>; Thu, 8 Jun 1995 03:36:34 -0400
Received: from dale.uninett.no by domen.uninett.no with SMTP (PP) 
          id <14169-0@domen.uninett.no>; Thu, 8 Jun 1995 09:35:35 +0200
Received: from dale.uninett.no (localhost [127.0.0.1]) 
          by dale.uninett.no (8.6.9/8.6.9) with ESMTP id JAA09164 
          for <ietf-822@dimacs.rutgers.edu>; Thu, 8 Jun 1995 09:35:30 +0200
Message-Id: <199506080735.JAA09164@dale.uninett.no>
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Harald.T.Alvestrand@uninett.no
To: ietf-822@dimacs.rutgers.edu
Subject: Prohibition of EBCDIC in text/plain
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----- =_aaaaaaaaaa0"
Content-ID: <9157.802596645.0@dale.uninett.no>
Date: Thu, 08 Jun 1995 09:35:28 +0200
X-Orig-Sender: hta@dale.uninett.no

------- =_aaaaaaaaaa0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <9157.802596645.1@dale.uninett.no>

Hi,
while fighting another battle, I came across this issue again.
Ned's latest draft says:

The canonical form of any MIME text type MUST represent a line
break as a CRLF sequence.  Similarly, any occurrence of CRLF
in text MUST represent a line break.  Use of CR and LF outside
of line break sequences is also forbidden.

This forbids, among others, ISO 10646 UCS-2 and EBCDIC as text/plain
character sets.

In the transfer form, it is easy to tell why.
However, why should the message I have written here be outlawed?
Was this message legal under RFC 1521 rules?

The best reason I could think of was to keep sanity when crossing gateways
that routinely remove content-transfer-encodings, but that does not strike
me as the most compelling thing in the world.

Also, it is "cleaner" to have the number of possible charsets be limited,
but is this better done by recommendation or by fiat?

I'm not arguing for removing the restriction, only to seek a better
understanding of why it is reasonable to impose it.
Comments?

    Harald A


------- =_aaaaaaaaaa0
Content-Type: text/plain; charset="ebcdic-int"
Content-ID: <9157.802596645.2@dale.uninett.no>
Content-Transfer-Encoding: quoted-printable
Content-description: EBCDIC message

=E8=85=A2k@=A3=88=89=A2@=89=A2@=C5=C2=C3=C4=C9=C3%=

------- =_aaaaaaaaaa0--


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa08894;
          8 Jun 95 15:21 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa08890;
          8 Jun 95 15:21 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa13855;
          8 Jun 95 15:21 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id OAA21927 for ietf-822-list; Thu, 8 Jun 1995 14:36:52 -0400
Received: from mocha.bunyip.com (mocha.Bunyip.Com [192.197.208.1]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id OAA21923 for <ietf-822@dimacs.rutgers.edu>; Thu, 8 Jun 1995 14:36:45 -0400
Received: from zap.Bunyip.Com by mocha.bunyip.com with SMTP (5.65a/IDA-1.4.2b/CC-Guru-2b)
        id AA16009  (mail destined for ietf-822@dimacs.rutgers.edu); Thu, 8 Jun 95 14:33:32 -0400
X-Sender: m-3329@mailbox.swip.net
Message-Id: <v0211011eabfcf0df1641@[192.197.208.4]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Thu, 8 Jun 1995 14:34:10 -0400
To: Harald.T.Alvestrand@uninett.no, ietf-822@dimacs.rutgers.edu
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Patrik Faltstrom <paf@bunyip.com>
Subject: Re: Prohibition of EBCDIC in text/plain

At 09.35 95-06-08, Harald.T.Alvestrand@uninett.no wrote:
>This forbids, among others, ISO 10646 UCS-2 and EBCDIC as text/plain
>character sets.

I assumed that what is called UCS-2 is the same thing as what
in "The Unicode Standard, Version 1.1, Appendix F", is called
FSS-UTF, i.e. Filesystem Safe UCS Transformation Format.

If this is true, I read the encoding rules in a way that all
characters 0x00 to 0x7F is encoded as themselves (as one-byte
characters) and that all other characters is encoded in
two, three, four, five or six byte characters. All the bytes
in the multibyte characters have their 8:th bit set.

By using this encoding, this is to me actually an encoding which
can be sent as a text/plain message, because a 'CR', 'LF'
and NULL are encoded as themselves and those bit-patterns does
not exist in the multibyte encoding of the other characters.

If I am wrong, please let me know.

   Patrik


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id ab11288;
          8 Jun 95 17:19 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa11284;
          8 Jun 95 17:19 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa16408;
          8 Jun 95 17:19 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id QAA25834 for ietf-822-list; Thu, 8 Jun 1995 16:44:49 -0400
Received: from black-ice.cc.vt.edu (black-ice.cc.vt.edu [128.173.14.71]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id QAA25831 for <ietf-822@dimacs.rutgers.edu>; Thu, 8 Jun 1995 16:44:47 -0400
Received: from localhost (LOCALHOST [127.0.0.1]) by black-ice.cc.vt.edu (8.7.Beta.1/8.7.Beta.1) with ESMTP id QAA18200; Thu, 8 Jun 1995 16:43:47 -0400
Message-Id: <199506082043.QAA18200@black-ice.cc.vt.edu>
To: Patrik Faltstrom <paf@bunyip.com>
cc: ietf-822@dimacs.rutgers.edu
Subject: Re: Prohibition of EBCDIC in text/plain 
In-reply-to: Your message of "Thu, 08 Jun 1995 14:34:10 EDT."
             <v0211011eabfcf0df1641@[192.197.208.4]> 
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Valdis.Kletnieks@vt.edu
Date: Thu, 08 Jun 1995 16:43:47 -0400

On Thu, 08 Jun 1995 14:34:10 EDT, you said:
> By using this encoding, this is to me actually an encoding which
> can be sent as a text/plain message, because a 'CR', 'LF'
> and NULL are encoded as themselves and those bit-patterns does
> not exist in the multibyte encoding of the other characters.

The problem is that an ASCII CR-LF is 0x0d 0x0a, but an EBCDIC CR-LF is
0x0d 0x25, which is interpreted as a CR-% pair by an ASCII parser.

				Valdis Kletnieks
				Computer Systems Engineer
				Virginia Tech


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa12546;
          8 Jun 95 18:20 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa12542;
          8 Jun 95 18:20 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa17741;
          8 Jun 95 18:20 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id RAA26416 for ietf-822-list; Thu, 8 Jun 1995 17:35:50 -0400
Received: from staff.nada.kth.se (staff.nada.kth.se [130.237.225.70]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id RAA26413 for <ietf-822@dimacs.rutgers.edu>; Thu, 8 Jun 1995 17:35:41 -0400
Received: (from psv@localhost)
	by staff.nada.kth.se (8.6.10/8.6.9)
	id XAA28522;
	Thu, 8 Jun 1995 23:34:07 +0200
Date: Thu, 8 Jun 1995 23:34:05 +0200 (MET DST)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Peter Svanberg <psv@nada.kth.se>
To: Patrik Faltstrom <paf@bunyip.com>
cc: Harald.T.Alvestrand@uninett.no, ietf-822@dimacs.rutgers.edu
Subject: Re: Prohibition of EBCDIC in text/plain
In-Reply-To: <v0211011eabfcf0df1641@[192.197.208.4]>
Message-ID: <Pine.SUN.3.91N2.950608223330.23404C-100000@staff.nada.kth.se>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Thu, 8 Jun 1995, Patrik Faltstrom wrote:

> At 09.35 95-06-08, Harald.T.Alvestrand@uninett.no wrote:
> >This forbids, among others, ISO 10646 UCS-2 and EBCDIC as text/plain
> >character sets.
> 
> I assumed that what is called UCS-2 is the same thing as what
> in "The Unicode Standard, Version 1.1, Appendix F", is called
> FSS-UTF, i.e. Filesystem Safe UCS Transformation Format.

No, UCS-2 is the 2-byte form of ISO/IEC 10646
    UCS-4 is the 4-byte form of ISO/IEC 10646

    UTF-1 is the first 16-to-(n*8) bits method for UCS; is in the
          standard but is proposed to be removed
    UTF-8 corresponds to FSS-UTF (exactly?) ((32 or 16)-to-(n*8) bits)
    UTF-16 is the proposed "enlargement" of BMP (~20-to-(2*16) bits)

I'm not updated on the formal state of the latter two
proposals.

And then there is the Internet-experimental UTF-7
(MIME charset UNICODE-1-1-UTF-7, RFC 1642), 16-to-(n*7) bits.
---
Peter Svanberg, 			    Email: psv@nada.kth.se
Dept of Num An & CS,
Royal Inst of Tech			    Phone: +46 8 790 71 40
S-100 44  Stockholm, SWEDEN		    Fax:   +46 8 790 09 30


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa12599;
          8 Jun 95 18:28 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa12595;
          8 Jun 95 18:28 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa17868;
          8 Jun 95 18:28 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id QAA25792 for ietf-822-list; Thu, 8 Jun 1995 16:42:41 -0400
Received: from mailserv.taligent.com (mailserv.taligent.com [134.149.9.10]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id QAA25788 for <ietf-822@dimacs.rutgers.edu>; Thu, 8 Jun 1995 16:42:38 -0400
Received: from david-goldsmith.taligent.com by mailserv.taligent.com (AIX 3.2/UCB 5.64/4.03)
          id AA48084; Thu, 8 Jun 1995 13:41:30 -0700
X-Sender: dgold@mailserv.taligent.com
Message-Id: <v01520d02abfd10522aa7@[134.149.23.16]>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Thu, 8 Jun 1995 13:41:41 -0700
To: Patrik Faltstrom <paf@bunyip.com>, Harald.T.Alvestrand@uninett.no, 
    ietf-822@dimacs.rutgers.edu
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: David Goldsmith <david_goldsmith@taligent.com>
Subject: Re: Prohibition of EBCDIC in text/plain

At 2:34 PM 6/8/95, Patrik Faltstrom wrote:
>At 09.35 95-06-08, Harald.T.Alvestrand@uninett.no wrote:
>>This forbids, among others, ISO 10646 UCS-2 and EBCDIC as text/plain
>>character sets.
>
>I assumed that what is called UCS-2 is the same thing as what
>in "The Unicode Standard, Version 1.1, Appendix F", is called
>FSS-UTF, i.e. Filesystem Safe UCS Transformation Format.
>
>If this is true, I read the encoding rules in a way that all
>characters 0x00 to 0x7F is encoded as themselves (as one-byte
>characters) and that all other characters is encoded in
>two, three, four, five or six byte characters. All the bytes
>in the multibyte characters have their 8:th bit set.
>
>By using this encoding, this is to me actually an encoding which
>can be sent as a text/plain message, because a 'CR', 'LF'
>and NULL are encoded as themselves and those bit-patterns does
>not exist in the multibyte encoding of the other characters.
>
>If I am wrong, please let me know.
>

No, UCS-2 is the 16 bit form of Unicode. FSS-UTF is now called UTF-8, and
it's an official annex to ISO 10646. You are correct, UTF-8 is compatible
with the MIME text/plain content type. So is UTF-7 (see RFC 1642). However,
straight Unicode (UCS-2), EBCDIC, and other character sets which do not
contain US-ASCII as a subset are not compatible, unfortunately.

----------------------------
David Goldsmith
david_goldsmith@taligent.com
Senior Scientist
Taligent, Inc.
10201 N. DeAnza Blvd.
Cupertino, CA  95014-2233


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa13083;
          8 Jun 95 19:16 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa13079;
          8 Jun 95 19:16 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa18681;
          8 Jun 95 19:16 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id SAA27652 for ietf-822-list; Thu, 8 Jun 1995 18:35:09 -0400
Received: from UA1VM.UA.EDU (ua1vm.ua.edu [130.160.4.100]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id SAA27649 for <ietf-822@DIMACS.RUTGERS.EDU>; Thu, 8 Jun 1995 18:35:07 -0400
Message-Id: <199506082235.SAA27649@dimacs.rutgers.edu>
Received: from UA1VM.UA.EDU by UA1VM.UA.EDU (IBM VM SMTP V2R2)
   with BSMTP id 7856; Thu, 08 Jun 95 17:34:51 CDT
Received: from UA1VM.UA.EDU (NJE origin TROTH@UA1VM) by UA1VM.UA.EDU (LMail
 V1.2a/1.8a) with BSMTP id 6303; Thu, 8 Jun 1995 17:34:52 -0500
MIME-Version: 1.0
Content-Type: text/plain
X-Mail-User-Agent: MAILBOOK/90.01.01
Date:         Thu, 08 Jun 95 17:29:59 CDT
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Rick Troth <TROTH@ua1vm.ua.edu>
Subject:      Re: Prohibition of EBCDIC in text/plain
To: Patrik Faltstrom <paf@bunyip.com>, Harald.T.Alvestrand@uninett.no, 
    ietf-822@dimacs.rutgers.edu
In-Reply-To:  Message of Thu, 8 Jun 1995 14:34:10 -0400 from <paf@bunyip.com>

>By using this encoding, this is to me actually an encoding which
>can be sent as a text/plain message, because a 'CR', 'LF'
>and NULL are encoded as themselves and those bit-patterns does
>not exist in the multibyte encoding of the other characters.
>
>If I am wrong, please let me know.

        In EBCDIC,  while CR is still 0x0D,  LF is 0x25.
Worse,  while UNIX uses LF as NL,  EBCDIC systems actually use the
separate character for NL at code point 0x15  (which maps to an ASCII
or ISO character in the 0x80 .. 0xAF range,  which code point I've
forgotten,  but it's also called NL).

        More confusing still is the fact that most EBCDIC-based systems
use record oriented filesystems instead of stream oriented filesystems.
So there's no CR nor LF nor even an NL between those lines.   So if you
toss out the record structure then you get a bunch of text all run
together as one very_long_line.

>   Patrik
>
>

--
Rick Troth <troth@ua1vm.ua.edu>, Houston, Texas, USA
http://ua1vm.ua.edu/~troth/


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa13691;
          8 Jun 95 20:16 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa13685;
          8 Jun 95 20:16 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa19615;
          8 Jun 95 20:16 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id SAA27563 for ietf-822-list; Thu, 8 Jun 1995 18:31:08 -0400
Received: from THOR.INNOSOFT.COM (THOR.INNOSOFT.COM [192.160.253.66]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id SAA27560 for <ietf-822@dimacs.rutgers.edu>; Thu, 8 Jun 1995 18:31:06 -0400
Received: from INNOSOFT.COM by INNOSOFT.COM (PMDF V5.0-3 #2001)
 id <01HRFMC0UX288ZDVD1@INNOSOFT.COM>; Thu, 08 Jun 1995 15:30:14 -0700 (PDT)
Date: Thu, 08 Jun 1995 14:50:08 -0700 (PDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Ned Freed <NED@innosoft.com>
Subject: Re: Prohibition of EBCDIC in text/plain
In-reply-to: "Your message dated Thu, 08 Jun 1995 09:35:28 +0200"
 <199506080735.JAA09164@dale.uninett.no>
To: Harald.T.Alvestrand@uninett.no
Cc: ietf-822@dimacs.rutgers.edu
Message-id: <01HRGVAA9AEM8ZDVD1@INNOSOFT.COM>
MIME-version: 1.0
Content-type: TEXT/PLAIN; CHARSET=US-ASCII
Content-transfer-encoding: 7BIT

> while fighting another battle, I came across this issue again.
> Ned's latest draft says:

> The canonical form of any MIME text type MUST represent a line
> break as a CRLF sequence.  Similarly, any occurrence of CRLF
> in text MUST represent a line break.  Use of CR and LF outside
> of line break sequences is also forbidden.

> This forbids, among others, ISO 10646 UCS-2 and EBCDIC as text/plain
> character sets.

Correct.

> In the transfer form, it is easy to tell why.
> However, why should the message I have written here be outlawed?

Mostly because of conversions to and from local canonical form. Many existing
mail systems simply convert text material to local canonical form, which in
turn can change line termination sequences from CRLF to CR, LF, or something
out-of-band. These conversions need to be transparent, so stray CR and LF that
aren't part of a line termination sequence are disallowed.

Transfer encodings do not necessary protect you from such conversions. See
below.

> Was this message legal under RFC 1521 rules?

Yes, but it did not interoperate across platforms.

> The best reason I could think of was to keep sanity when crossing gateways
> that routinely remove content-transfer-encodings, but that does not strike
> me as the most compelling thing in the world.

It seems pretty compelling to me, especially if we ever intend to upgrade the
SMTP transport infrastructure.

However, there are also cases where mail agents (not necessary gateways)
absolutely have to "routinely remove" transfer encodings. No other course of
action is possible, since people on the non-MIME side of things tend to object
pretty strongly to getting a bunch of "base64 shit" (words I've heard more than
once, I'm afraid) in their mail.

The choice, then, is simple: Either you ban the use of stray CR and LF in text
or else you require agents to maintain a comprehensive list of all the
character sets and whether or not conversion to canonical form is possible
and/or necessary. (Steve Dorner in fact proposed adding a new parameter
available for all content types to indicate whether or not canonicalization
should be done.)

> Also, it is "cleaner" to have the number of possible charsets be limited,
> but is this better done by recommendation or by fiat?

This was not imposed by fiat. This issue has been discussed endlessly -- I have
many hundreds of messages from a bunch of different lists in my archives on
this topic. The current documents are the result of input from dozens of
people, including John Myers, Chris Newman, Steve Dorner, Keith Moore, John
Klensin, myself, and lots of others as well. The current solution was the best
we could come up with.

				Ned


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa13752;
          8 Jun 95 20:24 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa13748;
          8 Jun 95 20:24 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa19707;
          8 Jun 95 20:24 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id TAA28977 for ietf-822-list; Thu, 8 Jun 1995 19:05:26 -0400
Received: from UA1VM.UA.EDU (ua1vm.ua.edu [130.160.4.100]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id TAA28974 for <ietf-822@DIMACS.RUTGERS.EDU>; Thu, 8 Jun 1995 19:05:23 -0400
Message-Id: <199506082305.TAA28974@dimacs.rutgers.edu>
Received: from UA1VM.UA.EDU by UA1VM.UA.EDU (IBM VM SMTP V2R2)
   with BSMTP id 7995; Thu, 08 Jun 95 18:05:12 CDT
Received: from UA1VM.UA.EDU (NJE origin TROTH@UA1VM) by UA1VM.UA.EDU (LMail
 V1.2a/1.8a) with BSMTP id 7430; Thu, 8 Jun 1995 18:05:12 -0500
MIME-Version: 1.0
Content-Type: text/plain
X-Mail-User-Agent: MAILBOOK/90.01.01
Date:         Thu, 08 Jun 95 17:35:29 CDT
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Rick Troth <TROTH@ua1vm.ua.edu>
Subject:      Re: Prohibition of EBCDIC in text/plain
To: Harald.T.Alvestrand@uninett.no, ietf-822@dimacs.rutgers.edu, 
    hta@dale.uninett.no
In-Reply-To:  Message of Thu, 08 Jun 1995 09:35:28 +0200 from
 <Harald.T.Alvestrand@uninett.no>

>while fighting another battle, I came across this issue again.

        Oh, boy.   ;-)

>Ned's latest draft says:
>
>The canonical form of any MIME text type MUST represent a line
>break as a CRLF sequence.  Similarly, any occurrence of CRLF
>in text MUST represent a line break.  Use of CR and LF outside
>of line break sequences is also forbidden.

        Good.   Good for  on-the-wire.

        Let me restate this:  we need to look at MIME as an off-the-wire
concept.   MIME is *great*, and the spec is *great* as long as we stay
on-the-wire,  and that's fine.   But people are using MIME off-the-wire
and looking at the same on-the-wire spec.   This needs to be,  at least,
clarified,  better,  formally addressed.

        I think Ned convinced me that as an IETF specification,
it is properly focused at on-the-wire operation.   Is there a way
we can,  without ruffling too many feathers,  put in some wording
that will make the MIME spec a better fit for these off-the-wire
square pegs?

>This forbids, among others, ISO 10646 UCS-2 and EBCDIC as text/plain
>character sets.

        I tried using a higher level name on CHARSET= once ... ONCE.
It didn't go over too well.   :-(    Latin-1 would apply equally well
to both  ISO-8859-1  and to  IBM CECP 1047,  which are the canonical
pair for ASCII/EBCDIC translation.   Any,  my test didn't work.

>However, why should the message I have written here be outlawed?

        It shouldn't.   BUT,  it also shouldn't be  "really EBCDIC".

>The best reason I could think of was to keep sanity when crossing gateways
>that routinely remove content-transfer-encodings, but that does not strike
>me as the most compelling thing in the world.

        More fundamental and basic than that:  let plain text
be plain text.   Let plain text on the EBCDIC systems be converted
into ASCII when it goes out into SMTP.

>Also, it is "cleaner" to have the number of possible charsets be limited,
>but is this better done by recommendation or by fiat?
>
>I'm not arguing for removing the restriction, only to seek a better
>understanding of why it is reasonable to impose it.
>Comments?

        Here is a simplistic and incomplete overview:

                UNIX        ASCII       NL (actually LF)
                VM          EBCDIC      record structure in filesystem
                MS-DOS      ASCII       CR/LF
                OS/2        ASCII       CR/LF
                MVS         EBCDIC      record structure in filesystem
                VMS         ASCII       various,  it supports several
                NT          ASCII       CR/LF ???

        This is why I've been focusing on three basic canonicalizations,
text,  binary,  and record oriented.   And for the sake of "mail"
converting record-oriented plain text into ASCII with CR/LF.

>    Harald A

--
Rick Troth <troth@ua1vm.ua.edu>, Houston, Texas, USA
http://ua1vm.ua.edu/~troth/


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa14537;
          8 Jun 95 22:18 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa14533;
          8 Jun 95 22:18 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa21424;
          8 Jun 95 22:18 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id VAA06372 for ietf-822-list; Thu, 8 Jun 1995 21:17:31 -0400
Received: from THOR.INNOSOFT.COM (THOR.INNOSOFT.COM [192.160.253.66]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id VAA06368 for <ietf-822@dimacs.rutgers.edu>; Thu, 8 Jun 1995 21:17:29 -0400
Received: from INNOSOFT.COM by INNOSOFT.COM (PMDF V5.0-3 #2001)
 id <01HRFMC0UX288ZDVD1@INNOSOFT.COM>; Thu, 08 Jun 1995 18:16:42 -0700 (PDT)
Date: Thu, 08 Jun 1995 18:08:31 -0700 (PDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Ned Freed <NED@innosoft.com>
Subject: Re: Prohibition of EBCDIC in text/plain
In-reply-to: "Your message dated Thu, 08 Jun 1995 17:35:29 -0500 (CDT)"
 <199506082305.TAA28974@dimacs.rutgers.edu>
To: Rick Troth <TROTH@ua1vm.ua.edu>
Cc: Harald.T.Alvestrand@uninett.no, ietf-822@dimacs.rutgers.edu, 
    hta@dale.uninett.no
Message-id: <01HRH13P1SZQ8ZDVD1@INNOSOFT.COM>
MIME-version: 1.0
Content-type: text/plain; CHARSET=US-ASCII
Content-transfer-encoding: 7BIT
References: <Harald.T.Alvestrand@uninett.no>

> > The canonical form of any MIME text type MUST represent a line
> > break as a CRLF sequence.  Similarly, any occurrence of CRLF
> > in text MUST represent a line break.  Use of CR and LF outside
> > of line break sequences is also forbidden.

>         Good.   Good for  on-the-wire.

>         Let me restate this:  we need to look at MIME as an off-the-wire
> concept.   MIME is *great*, and the spec is *great* as long as we stay
> on-the-wire,  and that's fine.   But people are using MIME off-the-wire
> and looking at the same on-the-wire spec.   This needs to be,  at least,
> clarified,  better,  formally addressed.

All Internet standards ever address is what happens on-the-wire. Not only are
off-wire matters not addressed, in the past work to specify off-wire aspects of
various protocols has been rejected as being out of scope. (Standardization of
POP3 and IMAP4 nearly didn't happen because of this, believe it or not. We had
to argue that mailbox access operations needed to be standards across the
Internet before the groups were approved.)

>         I think Ned convinced me that as an IETF specification,
> it is properly focused at on-the-wire operation.   Is there a way
> we can,  without ruffling too many feathers,  put in some wording
> that will make the MIME spec a better fit for these off-the-wire
> square pegs?

Sure. Write a document that goes along with the existing MIME specification
that addresses off-wire issues. Then, once you have a document in hand,
approach the IESG about it and see if you cannot get it onto the standards
track. (The composition of the IESG has changed significantly if not completely
in the past few years so things that were impossible before may be possible
now.) If you can, great, and if not, you can always publish it as an
informational RFC.

I don't think it makes sense to talk about such things without a concrete
example in hand. 

> > The best reason I could think of was to keep sanity when crossing gateways
> > that routinely remove content-transfer-encodings, but that does not strike
> > me as the most compelling thing in the world.

>         More fundamental and basic than that:  let plain text
> be plain text.   Let plain text on the EBCDIC systems be converted
> into ASCII when it goes out into SMTP.

Quite true -- use of ASCII-based material is preferable on the wire but says
nothing about how you store things locally.

					Ned


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa18810;
          8 Jun 95 23:28 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa18806;
          8 Jun 95 23:27 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa22341;
          8 Jun 95 23:28 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id WAA07222 for ietf-822-list; Thu, 8 Jun 1995 22:28:02 -0400
Received: from wilma.cs.utk.edu (WILMA.CS.UTK.EDU [128.169.94.141]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id WAA07219 for <ietf-822@dimacs.rutgers.edu>; Thu, 8 Jun 1995 22:28:00 -0400
Received: from LOCALHOST by wilma.cs.utk.edu with SMTP (cf v2.11c-UTK)
          id WAA29721; Thu, 8 Jun 1995 22:27:44 -0400
Message-Id: <199506090227.WAA29721@wilma.cs.utk.edu>
X-URI: http://www.cs.utk.edu/~moore/
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Keith Moore <moore@cs.utk.edu>
To: Rick Troth <TROTH@ua1vm.ua.edu>
cc: Harald.T.Alvestrand@uninett.no, ietf-822@dimacs.rutgers.edu, 
    hta@dale.uninett.no, moore@cs.utk.edu
Subject: Re: Prohibition of EBCDIC in text/plain 
In-reply-to: Your message of "Thu, 08 Jun 1995 17:35:29 CDT."
             <199506082305.TAA28974@dimacs.rutgers.edu> 
Date: Thu, 08 Jun 1995 22:27:34 -0400
X-Orig-Sender: moore@cs.utk.edu


It's possible that I got a bit off-track in my last message.
I apologize for the tirade.  Let me start over.

>         Let me restate this: we need to look at MIME as an
> off-the-wire concept.  MIME is *great*, and the spec is *great* as
> long as we stay on-the-wire, and that's fine.  But people are using
> MIME off-the-wire and looking at the same on-the-wire spec.  This
> needs to be, at least, clarified, better, formally addressed.

I agree with at least part of this.  I think it would be a good idea
to have a separate, informational, document that described how to deal
with what a MIME message looks like when it's been munged from the
on-the-wire format into your local off-the-wire format.  

Doing so might even make the standard MIME documents simpler, because
then they could really stick to talking about "on-the-wire" form.

There's also a somewhat related issue of how to encode MIME messages
that contain binary body parts, in environments where the traditional
text end-of-line character is newline.  It's not quite "on-the-wire"
versus "off-the-wire", it's more like different wires have different
"on-the-wire" representations.  Anyway, it wouldn't hurt to address
this either.

Keith


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa23339;
          9 Jun 95 0:20 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa23335;
          9 Jun 95 0:20 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa23091;
          9 Jun 95 0:20 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id XAA07643 for ietf-822-list; Thu, 8 Jun 1995 23:42:16 -0400
Received: from alpha.xerox.com (alpha.Xerox.COM [13.1.64.93]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id XAA07640 for <ietf-822@dimacs.rutgers.edu>; Thu, 8 Jun 1995 23:42:15 -0400
Received: from golden.parc.xerox.com ([13.1.100.139]) by alpha.xerox.com with SMTP id <14462(1)>; Thu, 8 Jun 1995 20:41:35 PDT
Received: by golden.parc.xerox.com id <2761>; Thu, 8 Jun 1995 20:41:30 -0700
To: NED@innosoft.com
CC: Harald.T.Alvestrand@uninett.no, ietf-822@dimacs.rutgers.edu
In-reply-to: Ned Freed's message of Thu, 8 Jun 1995 14:50:08 -0700 <01HRGVAA9AEM8ZDVD1@INNOSOFT.COM>
Subject: Re: Prohibition of EBCDIC in text/plain
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Larry Masinter <masinter@parc.xerox.com>
X-Orig-Sender: Larry Masinter <masinter@parc.xerox.com>
Fake-Sender: masinter@parc.xerox.com
Message-Id: <95Jun8.204130pdt.2761@golden.parc.xerox.com>
Date: Thu, 8 Jun 1995 20:41:18 PDT

In defense of:

> The canonical form of any MIME text type MUST represent a line
> break as a CRLF sequence.  Similarly, any occurrence of CRLF
> in text MUST represent a line break.  Use of CR and LF outside
> of line break sequences is also forbidden.

Ned Freed said:

> The choice, then, is simple: Either you ban the use of stray CR and LF in text
> or else you require agents to maintain a comprehensive list of all the
> character sets and whether or not conversion to canonical form is possible
> and/or necessary.

There's something interesting about this. We've been unable to control
the proliferation of zillions of charset registrations, so the idea of
'a comprehensive list of all the character sets' sounds daunting.
However, it isn't necessary that the agent know all of the charset
registrations if it were possible to determine SOLELY FROM THE CHARSET
NAME whether such the charset used something other than CR and LF as a
line break sequence, even with a stupid lexical trick like
charset=*unicode-1-1-ucs2.

I apologize; I hate to rehash something that you've discussed
endlessly without having also suffered through the hundreds and
hundreds of messages, but I think disallowing the simplest binary
representation of 16-bit charsets on-the-wire seems like a serious
restriction and worthy of just a little more mooting.


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa23680;
          9 Jun 95 0:39 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa23676;
          9 Jun 95 0:39 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa23415;
          9 Jun 95 0:39 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id WAA07160 for ietf-822-list; Thu, 8 Jun 1995 22:17:28 -0400
Received: from wilma.cs.utk.edu (WILMA.CS.UTK.EDU [128.169.94.141]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id WAA07157 for <ietf-822@dimacs.rutgers.edu>; Thu, 8 Jun 1995 22:17:26 -0400
Received: from LOCALHOST by wilma.cs.utk.edu with SMTP (cf v2.11c-UTK)
          id WAA29623; Thu, 8 Jun 1995 22:17:09 -0400
Message-Id: <199506090217.WAA29623@wilma.cs.utk.edu>
X-URI: http://www.cs.utk.edu/~moore/
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Keith Moore <moore@cs.utk.edu>
To: Rick Troth <TROTH@ua1vm.ua.edu>
cc: Harald.T.Alvestrand@uninett.no, ietf-822@dimacs.rutgers.edu, 
    hta@dale.uninett.no, moore@cs.utk.edu
Subject: Re: Prohibition of EBCDIC in text/plain 
In-reply-to: Your message of "Thu, 08 Jun 1995 17:35:29 CDT."
             <199506082305.TAA28974@dimacs.rutgers.edu> 
Date: Thu, 08 Jun 1995 22:17:02 -0400
X-Orig-Sender: moore@cs.utk.edu

>         Good.   Good for  on-the-wire.
> 
>         Let me restate this:  we need to look at MIME as an off-the-wire
> concept.   MIME is *great*, and the spec is *great* as long as we stay
> on-the-wire,  and that's fine.   But people are using MIME off-the-wire
> and looking at the same on-the-wire spec.   This needs to be,  at least,
> clarified,  better,  formally addressed.

The problem is that most environments already have well-entrenched
mechanisms for translating between "on-the-wire" and "off-the-wire"
(local) representations of email messages. These translation
mechanisms existed long before MIME, and generally remain ignorant of
MIME even today.  Of course, this leads to wierd-looking things such
as ASCII messages that were translated into EBCDIC on receipt by the
local MTA, but which are still labelled as US-ASCII.

Fortunately, as long as *all* incoming MIME messages go through this
translation, you can deal with the change.  If you're on an EBCDIC
machine, and if you see "text/plain; charset=US-ASCII", you know that
the message is really ASCII translated to EBCDIC (and so should your
user agent).

(This isn't just a problem with EBCDIC machines; UNIX machines have a
similar problem with the translation of line endings.  A number of
UNIX UAs don't properly handle a text body part encoded in base64,
because they expect all text to be translated automatically into local
format by the translation layer.  Nevertheless, the MIME rules are
clear and unambiguous -- all body parts are converted to canonical
form before encoding.)

If you want to add any sanity to the "off-the-wire" version of MIME,
you have to change that translation layer to be MIME-aware.  But once
you do this, you have to make sure that *all* messages go through the
new MIME-aware translation when they cross the boundary between
"local" and "on-the-wire". You also have to change *all* of your local
user agents (at least those that were already dealing with
"on-the-wire" MIME translated via the old mechanism), to know about
the new format instead.  

At worst, you need a flag day where you have to change *everything* at
once. At best, your user agents now have to cope with two versions of
MIME.  (You will, of course, want to add some sort of indicator to the
"off-the-wire" format so that user agents can reliably distinguish it
from the "on-the-wire translated to local by the old mechanism"
format.)  You've just made your user agents considerably more complex
to avoid something which was wierd but worked just fine.

If you're going to go to all this trouble, it's very tempting to
declare that the new "off-the-wire" format is *identical* to the
"on-the-wire" format, except for that format indicator that lets your
UAs know the difference.  That way, your translator is very simple,
and all of the knowledge about content-types is in the user agent --
where it should be.

Well, the on-the-wire MIME format isn't the best for local storage.
You might want to use a format for which all of the MIME body parts
are stored in canonical form.  But the one thing you DO NOT want to do
is to translate certain kinds of body parts (like text) into the local
format, because then your (new) user agents will fight with your
translator about who should deal with which parts.

>         I think Ned convinced me that as an IETF specification,
> it is properly focused at on-the-wire operation.   Is there a way
> we can,  without ruffling too many feathers,  put in some wording
> that will make the MIME spec a better fit for these off-the-wire
> square pegs?

MIME already goes to considerable effort to make sure that
"on-the-wire translated to local format via pre-MIME mechanisms" is
workable.

Quoted-printable was carefully defined in such a way that it works for
either ASCII or EBCDIC.  (The q-p sequence "ABCDE=46" translates into
"0x41 0x42 0x43 0x44 0x45 0x46" in canoncal form regardless of whether
the local charset is EBCDIC or ASCII.)

Base64 was also designed so that the encoded form could be translated
to and from EBCDIC without damaging the canonical form.  Trailing
SPACE characters were ignored in quoted-printable, base64, and
multipart boundary markers because of SPACE padding in fixed-length
record systems.  Line lengths in quoted-printable, base64, and header
fields with encoded-words were kept short so that they could fit into
the line-length limitations of the BITNET mail transport.

All of this was done so that you could use MIME with existing 822 UAs,
and without having to change that layer that translates between
on-the-wire and local format.

>         I tried using a higher level name on CHARSET= once ... ONCE.
> It didn't go over too well.   :-(    Latin-1 would apply equally well
> to both  ISO-8859-1  and to  IBM CECP 1047,  which are the canonical
> pair for ASCII/EBCDIC translation.   Any,  my test didn't work.

"Latin-1" is not a valid "character set" (as MIME defines the term)
because it doesn't define a unique mapping of octets to characters.
If you have a body part in canonical form with charset=Latin1, you
know what kind of characters might appear in the body part, but you
don't know what algorithm to use to translate those octets into
characters.

>         More fundamental and basic than that:  let plain text
> be plain text.   Let plain text on the EBCDIC systems be converted
> into ASCII when it goes out into SMTP.

This is what happens now, no?  The only wierd thing is that the text
you compose on your local machine must be labelled as US-ASCII or some
such (even though it's really EBCDIC) if you're using your old
translator that isn't MIME-aware.  But -- as long as your local user
agents generate messages that look like ASCII messages that arrived
from elsewhere and were translated locally to EBCDIC -- what results
is not ambiguous. It's just wierd.

				  --

If you still want the "off-the-wire" format, I suggest you implement
it with new content-transfer-encodings.  Let the existing c-t-e's
retain their present meanings: treat them as if the characters in
these encodings were ASCII translated to your local character set.
Define new content-transfer-encodings for use only in "off-the-wire"
mail in your particular environment.  For instance, you could define a
"ibm-off-the-wire-binary" c-t-e that could contain arbitrary octet
sequences without encoding them as characters, and would also work
efficiently on your file system.  And you could define an
"ibm-off-the-wire-plain-text" encoding for text that was already in
the right format to be blatted to a 3270.  Do make sure that the new
encodings *never* leave your environment without being translated back
into a standard MIME "on-the-wire" c-t-e.

Once you have this, you could (if you wish) translate an incoming
message containing:

content-type: text/plain; charset=us-ascii
content-transfer-encoding: quoted-printable

into:

content-type: text/plain; charset=ebcdic
content-transfer-encoding: ibm-off-the-wall-plain-text

because doing so would not create any ambiguity.


Keith


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa01416;
          9 Jun 95 7:22 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa01412;
          9 Jun 95 7:22 EDT
Received: from [128.6.75.16] by CNRI.Reston.VA.US id aa03467; 9 Jun 95 7:22 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id GAA10257 for ietf-822-list; Fri, 9 Jun 1995 06:18:54 -0400
Received: from domen.uninett.no (domen.uninett.no [129.241.131.10]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id GAA10254 for <ietf-822@dimacs.rutgers.edu>; Fri, 9 Jun 1995 06:18:50 -0400
Received: from dale.uninett.no by domen.uninett.no with SMTP (PP) 
          id <14836-0@domen.uninett.no>; Fri, 9 Jun 1995 12:16:38 +0200
Received: from dale.uninett.no (localhost [127.0.0.1]) 
          by dale.uninett.no (8.6.9/8.6.9) with ESMTP id KAA02882;
          Fri, 9 Jun 1995 10:03:10 +0200
Message-Id: <199506090803.KAA02882@dale.uninett.no>
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Harald.T.Alvestrand@uninett.no
To: Ned Freed <NED@innosoft.com>
cc: ietf-822@dimacs.rutgers.edu
Subject: Re: Prohibition of EBCDIC in text/plain
In-reply-to: Your message of "Thu, 08 Jun 1995 14:50:08 PDT." <01HRGVAA9AEM8ZDVD1@INNOSOFT.COM>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <2879.802684988.1@dale.uninett.no>
Date: Fri, 09 Jun 1995 10:03:09 +0200
X-Orig-Sender: hta@dale.uninett.no

Thanks - I'm convinced.
I would suggest adding another NOTE: to section 6.1.1, just so that I
won't forget again :-):

NOTE: The reason for this restriction is that when mail gateways or agents
handle text in the canonical form, they should have the freedom to convert
into and out of their local newline convention without loss of information.

(funny - didn't we put this in somewhere else, too? C-T-E 7bit?
Same argument, same text, different context)

              Harald A


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa09125;
          9 Jun 95 13:32 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa09121;
          9 Jun 95 13:32 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa10762;
          9 Jun 95 13:32 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id MAA18167 for ietf-822-list; Fri, 9 Jun 1995 12:25:19 -0400
Received: from THOR.INNOSOFT.COM (THOR.INNOSOFT.COM [192.160.253.66]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id MAA18164 for <ietf-822@dimacs.rutgers.edu>; Fri, 9 Jun 1995 12:25:18 -0400
Received: from INNOSOFT.COM by INNOSOFT.COM (PMDF V5.0-3 #2001)
 id <01HRFMC0UX288ZDVD1@INNOSOFT.COM>; Fri, 09 Jun 1995 09:24:04 -0700 (PDT)
Date: Fri, 09 Jun 1995 09:13:36 -0700 (PDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Ned Freed <NED@innosoft.com>
Subject: Re: Prohibition of EBCDIC in text/plain
In-reply-to: "Your message dated Thu, 08 Jun 1995 20:41:18 -0700 (PDT)"
 <95Jun8.204130pdt.2761@golden.parc.xerox.com>
To: Larry Masinter <masinter@parc.xerox.com>
Cc: NED@innosoft.com, Harald.T.Alvestrand@uninett.no, 
    ietf-822@dimacs.rutgers.edu
Message-id: <01HRHWSNEKFU8ZDVD1@INNOSOFT.COM>
MIME-version: 1.0
Content-type: TEXT/PLAIN; CHARSET=US-ASCII
Content-transfer-encoding: 7BIT
References: <01HRGVAA9AEM8ZDVD1@INNOSOFT.COM>

> > The choice, then, is simple: Either you ban the use of stray CR and LF in text
> > or else you require agents to maintain a comprehensive list of all the
> > character sets and whether or not conversion to canonical form is possible
> > and/or necessary.

> There's something interesting about this. We've been unable to control
> the proliferation of zillions of charset registrations, so the idea of
> 'a comprehensive list of all the character sets' sounds daunting.
> However, it isn't necessary that the agent know all of the charset
> registrations if it were possible to determine SOLELY FROM THE CHARSET
> NAME whether such the charset used something other than CR and LF as a
> line break sequence, even with a stupid lexical trick like
> charset=*unicode-1-1-ucs2.

This is just a way of hiding canonicalization information in the charset, as
opposed to having its own new field. As such, it inherits all of the same
problems that the new field approach has, not the least of which is that it is
a fundamental change that would both require resetting the standard to proposed
as well as modifications to all existing MIME agents.

If we were to choose this path I'd prefer to have an explicit field that would
work for types other than text. As long as you're going to break everything you
might as well do it right... But we've already decided against all this -- the
loss of the last four years would effectively kill MIME.

> I apologize; I hate to rehash something that you've discussed
> endlessly without having also suffered through the hundreds and
> hundreds of messages, but I think disallowing the simplest binary
> representation of 16-bit charsets on-the-wire seems like a serious
> restriction and worthy of just a little more mooting.

Well, I would first take issue with it being uncategorically simpler. In fact
it depends on how you define "simple" -- my definition of "simple" would
involve backwards compatibility which would translate into the raw 16 bit form
being substantially more complex than UTF-8 or UTF-8.

And second, there is nothing about this that disallows the use of 16-bit
Unicode. There is in fact no problem with using in MIME -- the only problem
is using it in subtypes of text. All that's needed is a definition of either
a subtype of application or the definition of a new top-level content-type
(e.g. widetext). This does not seem like an undue burden given that the
viewing application is guaranteed to be substantially different for information
of this sort.

				Ned


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa11427;
          9 Jun 95 15:26 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa11423;
          9 Jun 95 15:26 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa13119;
          9 Jun 95 15:26 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id OAA26069 for ietf-822-list; Fri, 9 Jun 1995 14:11:48 -0400
Received: from po8.andrew.cmu.edu (PO8.ANDREW.CMU.EDU [128.2.10.108]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id OAA26066 for <ietf-822@dimacs.rutgers.edu>; Fri, 9 Jun 1995 14:11:44 -0400
Received: (from postman@localhost) by po8.andrew.cmu.edu (8.6.12/8.6.12) id OAA11434 for ietf-822@dimacs.rutgers.edu; Fri, 9 Jun 1995 14:11:23 -0400
Received: via switchmail; Fri,  9 Jun 1995 14:11:21 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/service/mailqs/testq0/QF.wjq8uDi00WBwQ0W5V:>;
          Fri,  9 Jun 1995 14:10:24 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/usr7/jm36/.Outgoing/QF.Mjq8uAu00WBwI9xmtA>;
          Fri,  9 Jun 1995 14:10:21 -0400 (EDT)
Received: from BatMail.robin.v2.14.CUILIB.3.45.SNAP.NOT.LINKED.hogtown.andrew.cmu.edu.sun4c.411
          via MS.5.6.hogtown.andrew.cmu.edu.sun4c_411;
          Fri,  9 Jun 1995 14:10:18 -0400 (EDT)
Message-ID: <kjq8u_a00WBwA9xmhf@andrew.cmu.edu>
Date: Fri,  9 Jun 1995 14:10:18 -0400 (EDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: John Gardiner Myers <jgm+@cmu.edu>
To: ietf-822@dimacs.rutgers.edu
Subject: Re: Prohibition of EBCDIC in text/plain
In-Reply-To: <199506090217.WAA29623@wilma.cs.utk.edu>
References: <199506090217.WAA29623@wilma.cs.utk.edu>
Beak: is Not

Keith Moore <moore@cs.utk.edu> writes:
> Fortunately, as long as *all* incoming MIME messages go through this
> translation, you can deal with the change.  If you're on an EBCDIC
> machine, and if you see "text/plain; charset=US-ASCII", you know that
> the message is really ASCII translated to EBCDIC (and so should your
> user agent).

The wording I recently suggested adding to the "Canoncial Encoding
Model" is appropriate here.

What you have here is a MIME message where the text/plain part really
is in US-ASCII.  It's just that the entire MIME message has gone
through an encoding where each octet has been replaced by the
corresponding octet from the ASCII-->EBCDIC translation table.

To read the message, you have to first remove the local-form encoding
by running each octet through the EBCDIC-->ASCII translation table.
Then, you can interpret the MIME structure and extract the US-ASCII
text/plain part.  If you want, you can then convert the US-ASCII
text/plain part back into EBCDIC when saving it to a file.

This Canonical Encoding Model is *very important*, especially in the
face of such things as binary objects, MD5 hashes, and digital
signatures.  

If you try to define a format which is MIME, but where all occurences
of "US-ASCII" are replaced with "EBCDIC" gives you a format which is
*incompatible* with MIME.

-- 
_.John G. Myers		Internet: jgm+@CMU.EDU
			LoseNet:  ...!seismo!ihnp4!wiscvm.wisc.edu!give!up


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa14090;
          9 Jun 95 17:19 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa14086;
          9 Jun 95 17:19 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa15554;
          9 Jun 95 17:19 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id QAA00780 for ietf-822-list; Fri, 9 Jun 1995 16:59:19 -0400
Received: from po8.andrew.cmu.edu (PO8.ANDREW.CMU.EDU [128.2.10.108]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id QAA00777 for <ietf-822@dimacs.rutgers.edu>; Fri, 9 Jun 1995 16:59:17 -0400
Received: (from postman@localhost) by po8.andrew.cmu.edu (8.6.12/8.6.12) id QAA15472 for ietf-822@dimacs.rutgers.edu; Fri, 9 Jun 1995 16:59:12 -0400
Received: via switchmail; Fri,  9 Jun 1995 16:59:10 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/service/mailqs/testq0/QF.8jq=MJC00WBwQ0W5cJ>;
          Fri,  9 Jun 1995 16:59:01 -0400 (EDT)
Received: from hogtown.andrew.cmu.edu via qmail
          ID </afs/andrew.cmu.edu/usr7/jm36/.Outgoing/QF.kjq=MHC00WBwM9xzM3>;
          Fri,  9 Jun 1995 16:58:59 -0400 (EDT)
Received: from BatMail.robin.v2.14.CUILIB.3.45.SNAP.NOT.LINKED.hogtown.andrew.cmu.edu.sun4c.411
          via MS.5.6.hogtown.andrew.cmu.edu.sun4c_411;
          Fri,  9 Jun 1995 16:58:56 -0400 (EDT)
Message-ID: <kjq=MEK00WBw49xzAC@andrew.cmu.edu>
Date: Fri,  9 Jun 1995 16:58:56 -0400 (EDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: John Gardiner Myers <jgm+@cmu.edu>
To: ietf-822@dimacs.rutgers.edu
Subject: Re: comments on latest MIME drafts
In-Reply-To: <01HR8L2RPGWM9I44UF@SIGURD.INNOSOFT.COM>
References: <01HQQG3M6YAI8WVYOU@INNOSOFT.COM> <01HQQG3M6YAI8WVYOU@INNOSOFT.COM>
 <01HR4FMWQFDS90MVQC@SIGURD.INNOSOFT.COM>
 <01HR4FMWQFDS90MVQC@SIGURD.INNOSOFT.COM>
	<01HR8L2RPGWM9I44UF@SIGURD.INNOSOFT.COM>
Beak: Is

Ned Freed <NED@SIGURD.INNOSOFT.COM> writes:
> > Having the semantics be associated with identifiable syntactic objects
> > simplifies the task of generating and reading the data format.
> > Composers generate the syntactic constructs corresponding to the
> > semantics they want to convey.  Readers discover semantics by first
> > doing a parse to discover the syntax, then applying the association of
> > semantics to particular syntactic objects.
> 
> I disagree 100% with all of this. People do not discover sematics by
> implementing parsers. They discover them by reading specifications.

The specification of a format has to give sufficient information about
how to write programs to both generate and read objects in the format.
Anyone implementing anything which reads MIME objects is implementing
a parser.

> But it certainly isn't necessary for semantics to exist, nor is it
> necessary for different semantic constructs to bind unique syntactic
> elements. In fact it can be quite the opposite -- dates appear in
> all sorts of places in header fields, but I don't hear anyone
> suggesting that the semantics of date need to be represented
> differently in all of these fields or that dates are not important
> entities semantically.

I think you're misunderstanding what I'm saying.

The date-time nonterminal in RFC 822 has certain semantics associated
with it, these semantics are common to all occurences of date-time in
other nonterminals.  The date-time nonterminal occurs in quite a
number of other nonterminals, those other nonterminals have their own
additional semantics which are relevant to the date-time.

The semantics that identify a particular point in time are associated
with the date-time nonterminal and the nonterminals underneath
date-time.  The semantic that identifies that point as being the time
of message creation is associated with the orig-date nonterminal.  The
semantic that identifies that point as being the time of message
transport is associated with the received nonterminal.

> > The semantics they have in common (header/body syntax, content-
> > headers) I am trying to associate with the syntactic object known as
> > an "entity".
> 
> And I think this is a very bad idea. Entities are more general than this.

I think what you call an "entity" is different from what I call an
"entity".

The definition of "body part" you suggested in your message of May 30
is actually very close to what I would call an "entity".  Perhaps a
BNF of:

       entity = *(field / content-field) [ CRLF *OCTET ]

       content-field = content / encoding / id /
                       description / mime-extension-field

would make the semantic associations clearer?

> > The semantics specific to a body part (contained in a multipart, does
> > not require MIME-Version, may not contain enclosing multipart's
> > delimiter) I am trying to associate with the syntactic object known as
> > a "body-part", which is a disjoint subset of an "entity".
> 
> And this flies in the face of common usage, common understanding,
> and common sense. It makes MIME much harder to understand, and I am
> not willing to do it.  This is an absolute showstopper for me.

I have absolutely no idea where your statement is coming from.  These
semantics have existed since 1341 and everything that deals with a
multipart has to know about them.  Most of them have been specifically
attached to the definition of the body-part nonterminal.

It is the idea of having the term "body part" refer to something other
than the "body-part" BNF nonterminal that defies common sense and
makes MIME much harder to understand.

> > They have some syntax (and associated semantics) in common, and they
> > have some syntax (and associated semantics) by themselves.
> 
> But they also each have their own semantics as well as their own syntax.

So you stick the common syntax and semantics on the common
nonterminals/terms and stick their own syntax semantics on their own
nonterminals/terms.

> > Actually, I don't think the meaning of the term is well understood.
> > It appears to be used for at least two different concepts.
> 
> Well, if you mean that there's a well understood common sense meaning that
> is what most people mean when they say "body part", versus the old,
> nonsensical definition that managed to slip into MIME, then I certainly
> agree.

> > I think it's a bit confusing.  It also defines a term that is
> > a different concept than the body-part syntactic object, which has
> > semantics which do not apply to messages.
> 
> What semantics does it have that don't apply to MIME messages?

The semantics associated with a body-part syntactic object, which do
not apply to messages are:

* Does not require a MIME-Version header field
* May not include the boundary delimiter line of the enclosing
  multipart

> The Working Group rejected exactly this paradigm some time ago, preferring
> instead to go with the approach of MIME messages being a proper subset of
> RFC822 messages.

At some point we have to learn from our mistakes.

> > RFC 1049 Content-Type:
> > headers are not syntactically legal MIME Content-Type: headers, so a
> > MIME reader has the freedom to treat RFC 1049 Content-Type: headers as
> > it likes.
> 
> Not if it treats all messages as MIME messages.

MIME does not specify how one must treat syntactically illegal
Content-Type: headers.  Nothing in MIME prevents a reader from
interpreting syntactically illegal Content-Type: headers as it likes.
This is true whether or not there is a MIME-Version: header.

Put another way, RFC 1049 can be considered an extension to MIME,
albeit one which MIME agents are not permitted to generate.

> > Even with the definition of "body part" in
> > draft-ietf-822ext-mime-imb-03.txt, messages which "aren't MIME
> > messages" have associated body parts.  Take the (presumably zero)
> > content headers of the message, along with the body and there's your
> > "body part".
> 
> Sure. This can happen in MIME messages as well.

So therefore rules which MIME defines for body parts apply to messages
which "aren't MIME", since those messages also have body parts.  This
is no different than applying those rules to entities.

> > If a message doesn't have a MIME-Version, then a receiving UA has the
> > option, given in section 6 of the message bodies document, of ignoring
> > all rules in MIME applying to the message body, including any rules
> > imposed by the fact that the message is an entity.
> 
> Sure, but so what?

This is the escape hatch.  The MIME rules apply to all messages,
including those without MIME-Version.  However, in the absence of
MIME-Version, those rules which would apply to the body can be
ignored.

This escape hatch works equally well depending on whether the rules
are applied to "body parts" or "entities".

> > How does it lose?  You apply all the rules that you want for both
> > messages and body-parts, including the various Content-* headers, to
> > entities.  It appears to me to be a semantic win.
> 
> Because it blurs the distinction between body parts and messages.
> The early MIME work presented us with substantial evidence that losing
> this distinction is very bad.

It actually clarifies the distinction between a body-part and a
message.  Your approach blurs the distinction between a "body part"
and a body-part.

-- 
_.John G. Myers		Internet: jgm+@CMU.EDU
			LoseNet:  ...!seismo!ihnp4!wiscvm.wisc.edu!give!up


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa04997;
          13 Jun 95 14:24 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa04993;
          13 Jun 95 14:24 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa12864;
          13 Jun 95 14:23 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id NAA25761 for ietf-822-list; Tue, 13 Jun 1995 13:31:00 -0400
Received: from callandor.cybercash.com (callandor.cybercash.com [204.178.186.70]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id NAA25758 for <ietf-822@dimacs.rutgers.edu>; Tue, 13 Jun 1995 13:30:59 -0400
Received: by callandor.cybercash.com; id OAA12358; Fri, 26 May 1995 14:33:09 -0400
Received: from cybercash.com(204.254.34.52) by callandor.cybercash.com via smap (V1.3)
	id sma012351; Fri May 26 14:32:56 1995
Received: by cybercash.com.cybercash.com (4.1/SMI-4.1)
	id AA02357; Tue, 13 Jun 95 13:27:22 EDT
Date: Tue, 13 Jun 1995 13:27:22 -0400 (EDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: "Donald E. Eastlake 3rd" <dee@cybercash.com>
To: ietf-822@dimacs.rutgers.edu
Cc: "Donald E. Eastlake" <dee@cybercash.com>
Subject: 3xx reply
Message-Id: <Pine.SUN.3.91.950613132353.1637B-100000@cybercash.com>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

This is really an RFC 821 question, but can anyone tell me what
typical SMTP clients do if they get a 3xx reply after a MAIL or RCPT
command?

RFC 821 says this is an "error" as opposed to a "failure" but do they
(1) abort the whole conversation, (2) give up on that piece of mail,
or (3) (for RCPT) just treat it like a failure and go on to the next
recipient, or (4) something else I haven't though of?

I'd appreaciate any info on this.

Donald

PS:  (please cc me on your response as I'm not sure I'm on this list
right now...)

=====================================================================
Donald E. Eastlake 3rd     +1 508-287-4877(tel)     dee@cybercash.com
   318 Acton Street        +1 508-371-7148(fax)     dee@world.std.com
Carlisle, MA 01741 USA     +1 703-620-4200(main office, Reston, VA)


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa02709;
          14 Jun 95 9:32 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa02705;
          14 Jun 95 9:32 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa06144;
          14 Jun 95 9:32 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id JAA12914 for ietf-822-list; Wed, 14 Jun 1995 09:04:09 -0400
Received: from lotus.com (lotus.com [192.233.136.1]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id JAA12911 for <ietf-822@dimacs.rutgers.edu>; Wed, 14 Jun 1995 09:04:08 -0400
Received: from internet1.lotus.com (crd.lotus.com) by lotus.com (4.1/SMI-4.10801.1994)
	id AA13803; Tue, 13 Jun 95 18:34:14 EDT
Received: by internet1.lotus.com (4.1/SMI-4.1.8.5.94)
	id AA19905; Wed, 14 Jun 95 08:05:11 EST
Message-Id: <9506141305.AA19905@internet1.lotus.com>
Received: from Lotus with "Lotus Notes Mail Gateway for SMTP" id
  C93D690AB26BF236852561DA007CF43A; Wed, 14 Jun 95 13:05:10 
To: ietf-822 <ietf-822@dimacs.rutgers.edu>
Cc: "Donald E. Eastlake 3rd" <dee@cybercash.com>
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Michael Harer <Michael_Harer.NOTES@crd.lotus.com>
Date: 14 Jun 95  7:28:44 EDT
Subject: Re: 3xx reply
Mime-Version: 1.0
Content-Type: Text/Plain

>dee @ cybercash.com ("Donald E. Eastlake") writes:
>This is really an RFC 821 question, but can anyone tell me what
>typical SMTP clients do if they get a 3xx reply after a MAIL or RCPT
>command?

>RFC 821 says this is an "error" as opposed to a "failure" but do they
>(1) abort the whole conversation, (2) give up on that piece of mail,
>or (3) (for RCPT) just treat it like a failure and go on to the next
>recipient, or (4) something else I haven't though of?

>I'd appreaciate any info on this.
>
>Donald

The following excerpt from rfc1123 briefly mentions "interoperability problems 
have arisen" but I didn't see anything that describes a required "action" for 
this occurrence. I also didn't see where (in rfc821) a 3xx reply was 
categorized as an "error" as opposed to a "failure" but only general guidelines 
that receivers must adhere to the listed reply codes. This is an interesting 
question that I am also looking for additional clarification.  
  

 5.2.10  SMTP Replies:  RFC-821 Section 4.2

         A receiver-SMTP SHOULD send only the reply codes listed in
         section 4.2.2 of RFC-821 or in this document.  A receiver-SMTP
         SHOULD use the text shown in examples in RFC-821 whenever
         appropriate.

         A sender-SMTP MUST determine its actions only by the reply
         code, not by the text (except for 251 and 551 replies); any
         text, including no text at all, must be acceptable.  The space
         (blank) following the reply code is considered part of the
         text.  Whenever possible, a sender-SMTP SHOULD test only the
         first digit of the reply code, as specified in Appendix E of
         RFC-821.

         DISCUSSION:
              Interoperability problems have arisen with SMTP systems
              using reply codes that are not listed explicitly in RFC-
              821 Section 4.3 but are legal according to the theory of
              reply codes explained in Appendix E.


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa07174;
          14 Jun 95 15:22 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa07170;
          14 Jun 95 15:22 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa15407;
          14 Jun 95 15:22 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id OAA24280 for ietf-822-list; Wed, 14 Jun 1995 14:13:36 -0400
Received: from callandor.cybercash.com (callandor.cybercash.com [204.178.186.70]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id OAA24274 for <ietf-822@dimacs.rutgers.edu>; Wed, 14 Jun 1995 14:13:28 -0400
Received: by callandor.cybercash.com; id TAA16921; Fri, 26 May 1995 19:03:08 -0400
Received: from cybercash.com(204.254.34.52) by callandor.cybercash.com via smap (V1.3)
	id sma016908; Fri May 26 19:03:00 1995
Received: by cybercash.com.cybercash.com (4.1/SMI-4.1)
	id AA06823; Wed, 14 Jun 95 12:18:42 EDT
Date: Wed, 14 Jun 1995 12:18:41 -0400 (EDT)
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: "Donald E. Eastlake 3rd" <dee@cybercash.com>
To: Michael Harer <Michael_Harer.NOTES@crd.lotus.com>
Cc: ietf-822 <ietf-822@dimacs.rutgers.edu>, 
    "Donald E. Eastlake" <dee@cybercash.com>
Subject: Re: 3xx reply
In-Reply-To: <9506141305.AA19909@internet1.lotus.com>
Message-Id: <Pine.SUN.3.91.950614121423.5701D-100000@cybercash.com>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

The following is cut and pasted from RFC821:


      For each command there are three possible outcomes:  "success"
      (S), "failure" (F), and "error" (E). In the state diagrams below
      we use the symbol B for "begin", and the symbol W for "wait for
      reply".

      First, the diagram that represents most of the SMTP commands:

         
                                  1,3    +---+
                             ----------->| E |
                            |            +---+
                            |
         +---+    cmd    +---+    2      +---+
         | B |---------->| W |---------->| S |
         +---+           +---+           +---+
                            |
                            |     4,5    +---+
                             ----------->| F |
                                         +---+
         

         This diagram models the commands:

            HELO, MAIL, RCPT, RSET, SEND, SOML, SAML, VRFY, EXPN, HELP,
            NOOP, QUIT, TURN.


I hope that error is really processed like failure but I still don't know...

Anyone out there know?

Donald


On 14 Jun 1995, Michael Harer wrote:
> >dee @ cybercash.com ("Donald E. Eastlake") writes:
> >This is really an RFC 821 question, but can anyone tell me what
> >typical SMTP clients do if they get a 3xx reply after a MAIL or RCPT
> >command?
> 
> >RFC 821 says this is an "error" as opposed to a "failure" but do they
> >(1) abort the whole conversation, (2) give up on that piece of mail,
> >or (3) (for RCPT) just treat it like a failure and go on to the next
> >recipient, or (4) something else I haven't though of?
> 
> >I'd appreaciate any info on this.
> >
> >Donald
> 
> The following excerpt from rfc1123 briefly mentions "interoperability problems 
> have arisen" but I didn't see anything that describes a required "action" for 
> this occurrence. I also didn't see where (in rfc821) a 3xx reply was 
> categorized as an "error" as opposed to a "failure" but only general guidelines 
> that receivers must adhere to the listed reply codes. This is an interesting 
> question that I am also looking for additional clarification.  
>   
> 
>  5.2.10  SMTP Replies:  RFC-821 Section 4.2
> 
>          A receiver-SMTP SHOULD send only the reply codes listed in
>          section 4.2.2 of RFC-821 or in this document.  A receiver-SMTP
>          SHOULD use the text shown in examples in RFC-821 whenever
>          appropriate.
> 
>          A sender-SMTP MUST determine its actions only by the reply
>          code, not by the text (except for 251 and 551 replies); any
>          text, including no text at all, must be acceptable.  The space
>          (blank) following the reply code is considered part of the
>          text.  Whenever possible, a sender-SMTP SHOULD test only the
>          first digit of the reply code, as specified in Appendix E of
>          RFC-821.
> 
>          DISCUSSION:
>               Interoperability problems have arisen with SMTP systems
>               using reply codes that are not listed explicitly in RFC-
>               821 Section 4.3 but are legal according to the theory of
>               reply codes explained in Appendix E.
> 

=====================================================================
Donald E. Eastlake 3rd     +1 508-287-4877(tel)     dee@cybercash.com
   318 Acton Street        +1 508-371-7148(fax)     dee@world.std.com
Carlisle, MA 01741 USA     +1 703-620-4200(main office, Reston, VA)


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa05486;
          15 Jun 95 12:28 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa05482;
          15 Jun 95 12:28 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa10023;
          15 Jun 95 12:28 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id MAA13489 for ietf-822-list; Thu, 15 Jun 1995 12:04:47 -0400
Received: from ids.net (ids.net [155.212.1.2]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id MAA13486 for <ietf-822@dimacs.rutgers.edu>; Thu, 15 Jun 1995 12:04:45 -0400
Received: from conan.ids.net by ids.net with SMTP;
          Thu, 15 Jun 1995 12:02:42 -0400 (EDT)
Received: by conan.ids.net (4.1/SMI-4.1)
	id AA09121; Thu, 15 Jun 95 12:01:26 EDT
Date: Thu, 15 Jun 95 12:01:26 EDT
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Mike Braca <mbraca@conan.ids.net>
Message-Id: <9506151601.AA09121@conan.ids.net>
To: elevinson@accurate.com
Subject: Multipart/related comments
Cc: ietf-822@dimacs.rutgers.edu

Regarding ietf-draft-mimesgml-related-01.txt:

1 These statements in 3.2 (The Type Parameter) seem contradictory:

  "The type parameter must be specified if the start parameter
   is present. It permits a MIME user agent to determine the
   content-type without reference to the enclosed body part."

  "Where the content-type of the object root and the one indicated
   by the type parameter disagree, the object root is authoritative."

  It seems to me that if we are explicitly allowing them to disagree,
  the MIME UA ought to take action based on the authoritative one;
  thus would have to reference the body part.

2 Couple of problems in the examples:
  in 4.1:
  - need a ; after boundary=tiger-lilly
  - start="...1133...." should be start="...1132...."
  in 4.2
  - need a ; at end of start= line.
  - one of the cid URLs has a : instead of a . embedded.
  - the cid: URLs should not have angle brackets in them  :)

 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
 Mike Braca                                       Wasabe Software, Inc
 Software Artisan                                 mbraca@conan.ids.net
 

Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa07225;
          16 Jun 95 14:25 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa07220;
          16 Jun 95 14:24 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa13053;
          16 Jun 95 14:24 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id NAA15413 for ietf-822-list; Fri, 16 Jun 1995 13:56:26 -0400
Received: from Princeton.EDU (root@Princeton.EDU [128.112.128.1]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id NAA15407 for <ietf-822@dimacs.rutgers.edu>; Fri, 16 Jun 1995 13:56:23 -0400
Received: from acupain.UUCP by Princeton.EDU (5.65b/2.122/princeton)
	id AA09788; Fri, 16 Jun 95 13:52:34 -0400
Received: by Accurate.COM (4.1/SMI-4.0)
	id AA25160; Fri, 16 Jun 95 12:52:44 EDT
Message-Id: <9506161652.AA25160@Accurate.COM>
To: Mike Braca <mbraca@conan.ids.net>
Cc: elevinson@accurate.com, ietf-822@dimacs.rutgers.edu, 
    elevinso@accurate.com
Subject: Re: Multipart/related comments 
In-Reply-To: Your message of "Thu, 15 Jun 1995 12:01:26 EDT."
             <9506151601.AA09121@conan.ids.net> 
Organization: Accurate Information Systems, Inc.
X-Org-Addr: 2 Industrial Way
X-Org-Addr: Eatontown, NJ  07724
X-Org-Misc: 1.908.389.5550 (phone) 1.908.389.5556 (fax)
X-Mailer: MH 6.8
Date: Fri, 16 Jun 1995 12:52:41 -0400
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Ed Levinson <elevinso@accurate.com>

Mike,

Thanks for pointing out the typos.

As to the apparent contradictions, if you are making only one pass you
have to decide what you're going to do when you encounter
Multipart/Related.  After further processing you reach the root entity.
If you discover the type parameter and the content-type disagree you
know you have the wrong one and a user agent should probably report
the error.  The spec just says which one to believe.

A user agent that makes several passes may wish to handle the
situation differently.

What to do when the two disagree is not part of the spec.  User agents
get to decide for themselves.

Best.../Ed

On Thu, 15 Jun 1995 12:01:26 EDT Mike Braca wrote:
> Regarding ietf-draft-mimesgml-related-01.txt:
> 
> 1 These statements in 3.2 (The Type Parameter) seem contradictory:
> 
>   "The type parameter must be specified if the start parameter
>    is present. It permits a MIME user agent to determine the
>    content-type without reference to the enclosed body part."
> 
>   "Where the content-type of the object root and the one indicated
>    by the type parameter disagree, the object root is authoritative."
> 
>   It seems to me that if we are explicitly allowing them to disagree,
>   the MIME UA ought to take action based on the authoritative one;
>   thus would have to reference the body part.
> 
> 2 Couple of problems in the examples:
>   in 4.1:
>   - need a ; after boundary=tiger-lilly
>   - start="...1133...." should be start="...1132...."
>   in 4.2
>   - need a ; at end of start= line.
>   - one of the cid URLs has a : instead of a . embedded.
>   - the cid: URLs should not have angle brackets in them  :)
> 
>  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
>  Mike Braca                                       Wasabe Software, Inc
>  Software Artisan                                 mbraca@conan.ids.net
>  


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa08238;
          21 Jun 95 15:20 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa08234;
          21 Jun 95 15:20 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa13896;
          21 Jun 95 15:20 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id OAA00258 for ietf-822-list; Wed, 21 Jun 1995 14:25:28 -0400
Received: from ohio.bbn.com (OHIO.BBN.COM [128.89.3.251]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id OAA00255 for <ietf-822@dimacs.rutgers.edu>; Wed, 21 Jun 1995 14:25:27 -0400
Received: (nyang@localhost) by ohio.bbn.com (8.6.10/8.6.5) id OAA20028; Wed, 21 Jun 1995 14:26:11 -0400
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Nancy Yang <nyang@bbn.com>
Message-Id: <199506211826.OAA20028@ohio.bbn.com>
Subject: Freeware 821 code
To:  ietf-822 <ietf-822@dimacs.rutgers.edu>
Date: Wed, 21 Jun 1995 14:26:11 -0400 (EDT)
Cc: Nancy Yang <nyang@ohio.bbn.com>
X-Mailer: ELM [version 2.4 PL23]
Content-Type: text
Content-Length: 71        

Could you tell me where I could get the freeware for 821 code?

-Nancy


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa10036;
          28 Jun 95 19:31 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa10032;
          28 Jun 95 19:31 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa17178;
          28 Jun 95 19:31 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id TAA13200 for ietf-822-list; Wed, 28 Jun 1995 19:02:51 -0400
Received: from welch.ncd.com (root@welch.ncd.com [192.43.160.250]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with ESMTP id TAA13197 for <ietf-822@dimacs.rutgers.edu>; Wed, 28 Jun 1995 19:02:49 -0400
Received: from zex (zex.z-code.com [192.82.56.53]) by welch.ncd.com (8.6.9/8.6.6) with ESMTP id PAA22786 for <@internetgate.ncd.com:ietf-822@dimacs.rutgers.edu>; Wed, 28 Jun 1995 15:59:34 -0700
Received: by zex (950215.SGI.8.6.10/940406.SGI.AUTO)
	for ietf-822@dimacs.rutgers.edu id QAA07612; Wed, 28 Jun 1995 16:00:51 -0700
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Ben Liblit <liblit@zex.z-code.com>
Message-Id: <9506281600.ZM7610@zex.z-code.com>
Date: Wed, 28 Jun 1995 16:00:40 -0700
Reply-To: Ben Liblit <liblit@z-code.z-code.com>
X-Mailer: Z-Mail-SGI (3.2S.1 24mar95 MediaMail)
To: ietf-822@dimacs.rutgers.edu
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii

unsubscribe liblit@z-code.com


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa00854;
          29 Jun 95 5:26 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa00850;
          29 Jun 95 5:26 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa01954;
          29 Jun 95 5:26 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id EAA21422 for ietf-822-list; Thu, 29 Jun 1995 04:57:08 -0400
Received: from muswell.demon.co.uk (muswell.demon.co.uk [158.152.10.120]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id EAA21419 for <ietf-822@dimacs.rutgers.edu>; Thu, 29 Jun 1995 04:57:03 -0400
Date: Thu, 29 Jun 95 09:50:46 GMT
Message-Id: <3942@muswell.demon.co.uk>
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Ruth Moulton <ruth@muswell.demon.co.uk>
Reply-To: ruth@muswell.demon.co.uk
To: info-mime@cs.utk.edu, ietf-822@dimacs.rutgers.edu
Subject: Equivalence Tables (fwd)
Lines: 45
X-Mailer: PCElm 3.1 (1.6 DIS)

Forwarded message follows:

> From owner-info-mime@cs.utk.edu Mon Jun 26 07:34:38 1995
> Received: from punt3.demon.co.uk by muswell.demon.co.uk with SMTP
> 	id AA3826 ; Mon, 26 Jun 95 07:34:35 GMT
> Received: from punt3.demon.co.uk via puntmail for ruth@muswell.demon.co.uk;
>           Fri, 23 Jun 95 15:32:41 GMT
> Received: from cs.utk.edu by punt3.demon.co.uk id aa21921; 23 Jun 95 16:31 +0100
> Received: from localhost by CS.UTK.EDU with SMTP (cf v2.9s-UTK)
> 	id KAA11030; Fri, 23 Jun 1995 10:19:12 -0400
> X-Resent-To: info-mime@CS.UTK.EDU ; Fri, 23 Jun 1995 10:19:11 EDT
> Errors-to: owner-info-mime@CS.UTK.EDU
> Received: from ics.uci.edu by CS.UTK.EDU with SMTP (cf v2.9s-UTK)
> 	id KAA11023; Fri, 23 Jun 1995 10:19:07 -0400
> Received: from USENET by q2.ics.uci.edu id aa29136; 23 Jun 95 7:18 PDT
> From: Ruth Moulton <ruth@muswell.demon.co.uk>
> Subject: Equivalence Tables
> Message-ID: <3825@muswell.demon.co.uk>
> X-Mailer: PCElm 3.1 (1.6 DIS)
> Newsgroups: comp.mail.mime
> Date: 23 Jun 95 14:18:37 GMT
> To: info-mime@cs.utk.edu
> Status: R

Hi,

RFC1494 and RFC1495 mention that IANA will maintain an

MHS/MIME Equivalence Table

for mapping of MIME body/media types to X.400

Could some one give me a URL to the latest table, if it exists.

RFC1700 - Assigned Numbers Oct '94 - has a table which has no more information
in it than RFC1494.

Thanks for  your help
Ruth
--
Ruth Moulton                 ruth@muswell.demon.co.uk
Consultant,
65 Tetherdown, London N10 1NH, UK.   Tel:  +44 181 883 5823


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa05139;
          29 Jun 95 12:32 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa05135;
          29 Jun 95 12:32 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa13557;
          29 Jun 95 12:32 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id MAA27338 for ietf-822-list; Thu, 29 Jun 1995 12:03:36 -0400
Received: from gateway1.DHL.COM (gateway1.DHL.COM [137.98.208.11]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id MAA27335 for <ietf-822@dimacs.rutgers.edu>; Thu, 29 Jun 1995 12:03:35 -0400
Received: from dhlsys.systems.DHL.COM by gateway1.DHL.COM id aa23913;
          29 Jun 95 9:03 PDT
Received: from maverick.systems.DHL.COM by dhlsys.systems.DHL.COM with SMTP
	(DHLGMS 4.07-DSI) id AA154291725; Thu, 29 Jun 1995 09:02:05 -0700
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Paul Rarey <Paul.Rarey@systems.dhl.com>
Message-Id: <9506290902.ZM20319@maverick.systems.DHL.COM>
Date: Thu, 29 Jun 1995 09:02:50 -0700
In-Reply-To: Ruth Moulton <ruth@muswell.demon.co.uk>
        "Equivalence Tables (fwd)" (Jun 29,  2:50)
References: <3942@muswell.demon.co.uk>
Reply-To: Paul Rarey <Paul.Rarey@systems.dhl.com>
X-Mailer: ZM-Win (3.2.1 09Sep94)
To: ruth@muswell.demon.co.uk, info-mime@cs.utk.edu, 
    ietf-822@dimacs.rutgers.edu
Subject: Re: Equivalence Tables (fwd)
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii

On Jun 29,  2:50, Ruth Moulton wrote:

>for mapping of MIME body/media types to X.400
>
>Could some one give me a URL to the latest table, if it exists.

   URL:ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/media-types

For more detail

  URL:ftp://ftp.isi.edu/in-notes/iana/assignments/media-types

-- 


Cheers!

[ psr ]


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa00996;
          30 Jun 95 5:42 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa00992;
          30 Jun 95 5:42 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa02151;
          30 Jun 95 5:42 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id EAA16999 for ietf-822-list; Fri, 30 Jun 1995 04:05:52 -0400
Received: from muswell.demon.co.uk (muswell.demon.co.uk [158.152.10.120]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id EAA16995 for <ietf-822@dimacs.rutgers.edu>; Fri, 30 Jun 1995 04:05:47 -0400
Date: Fri, 30 Jun 95 09:02:45 GMT
Message-Id: <3970@muswell.demon.co.uk>
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Ruth Moulton <ruth@muswell.demon.co.uk>
Reply-To: ruth@muswell.demon.co.uk
To: Paul.Rarey@systems.dhl.com, info-mime@cs.utk.edu, 
    ietf-822@dimacs.rutgers.edu
Subject: Re: Equivalence Tables (fwd)
Lines: 31
X-Mailer: PCElm 3.1 (1.6 DIS)

Paul,
thanks, but I'd looked there - I could find no mention of X.400/MIME
equivalence in the info in these directories - the information here is
just about the MIME media type, how it is defined, etc.

Ruth
In message <9506290902.ZM20319@maverick.systems.DHL.COM> Paul.Rarey@systems.dhl.com writes:
> On Jun 29,  2:50, Ruth Moulton wrote:
> 
> >for mapping of MIME body/media types to X.400
> >
> >Could some one give me a URL to the latest table, if it exists.
> 
>    URL:ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/media-types
> 
> For more detail
> 
>   URL:ftp://ftp.isi.edu/in-notes/iana/assignments/media-types
> 
> -- 
> 
> 
> Cheers!
> 
> [ psr ]
> 

-- 
Ruth Moulton                 ruth@muswell.demon.co.uk
Consultant,
65 Tetherdown, London N10 1NH, UK.   Tel:  +44 181 883 5823


Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa01211;
          30 Jun 95 6:35 EDT
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa01207;
          30 Jun 95 6:35 EDT
Received: from dimacs.rutgers.edu by CNRI.Reston.VA.US id aa02810;
          30 Jun 95 6:34 EDT
Received: (from daemon@localhost) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) id FAA17263 for ietf-822-list; Fri, 30 Jun 1995 05:34:49 -0400
Received: from domen.uninett.no (domen.uninett.no [129.241.131.10]) by dimacs.rutgers.edu (8.6.12+bestmx+oldruq+newsunq+grosshack/8.6.12) with SMTP id FAA17260 for <ietf-822@dimacs.rutgers.edu>; Fri, 30 Jun 1995 05:34:35 -0400
Received: from dale.uninett.no by domen.uninett.no with SMTP (PP) 
          id <01989-0@domen.uninett.no>; Fri, 30 Jun 1995 11:23:58 +0200
Received: from dale.uninett.no (localhost [127.0.0.1]) 
          by dale.uninett.no (8.6.9/8.6.9) with ESMTP id LAA02200;
          Fri, 30 Jun 1995 11:23:55 +0200
Message-Id: <199506300923.LAA02200@dale.uninett.no>
X-Mailer: exmh version 1.5.3 12/28/94
Sender:ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Harald.T.Alvestrand@uninett.no
To: ruth@muswell.demon.co.uk
cc: info-mime@cs.utk.edu, ietf-822@dimacs.rutgers.edu
Subject: Re: Equivalence Tables (fwd)
In-reply-to: Your message of "Thu, 29 Jun 1995 09:50:46 GMT." <3942@muswell.demon.co.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Fri, 30 Jun 1995 11:23:53 +0200
X-Orig-Sender: hta@dale.uninett.no

I don't think anyone has registered any new ones.

          Harald A