Microsoft KB Archive/836555

= Frequently asked questions about MIME and content conversion in Exchange 2000 Server and in Exchange Server 2003 =

Article ID: 836555

Article Last Modified on 10/25/2007


APPLIES TO


 * Microsoft Exchange Server 2003 Enterprise Edition
 * Microsoft Exchange Server 2003 Standard Edition
 * Microsoft Exchange 2000 Server Standard Edition




SUMMARY
This article includes answers to frequently asked questions about the MIME standard and the encoding methods that make it possible for different mailing systems to work together. This article briefly explains how you can troubleshoot garbled or unreadable mail messages in a Microsoft Exchange environment.



MORE INFORMATION
Q1: What is the format of an e-mail message?

A1: Internet e-mail messages follow the format standards that are defined in RFC 2822. A message is made up of header fields and a body. The header fields are collectively named the "header" of the message. The body of the message is optional. A message can be sent without a body, but not without a header.

The header contains a sequence of lines of characters that have a special syntax, as defined in the RFC 2822 format standard. The body contains a sequence of characters that follow the header and that are separated from the header by an empty line (that is, a line that has nothing before the Carriage Return and Line Feed [CRLF]).

Header fields are lines that are composed of a field name followed by a colon, followed by a field body, and ended by a CRLF. A field name must be composed of printable US-ASCII characters (that is, characters that have values between 33 and 126, inclusive), except the colon. The colon is used as a separation character.

A field body may be composed of any US-ASCII characters, except for the CRLF. However, a field body may contain a CRLF when used in header folding and unfolding. Folding is when, for convenience, a single line appears on multiple lines. Unfolding is the reverse of this. All field bodies must follow the syntax described in sections 3 and 4 of the RFC 2822 format standard.

The body of the message may include one or more sections. Each body section is separated by a boundary. The boundary parameter is a text string that begins with two hyphens (--).
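The header and body layout described in this answer can be sketched in Python. This is a minimal illustration and not a full RFC 2822 parser; the function name `split_message` and the sample message are assumptions made for the example.

```python
def split_message(raw: str):
    """Split a raw message into unfolded header fields and a body."""
    # The first empty line (CRLF CRLF) separates the header from the body.
    head, _, body = raw.partition("\r\n\r\n")
    fields = []
    for line in head.split("\r\n"):
        if not line:
            continue
        if line[0] in (" ", "\t") and fields:
            # A line that starts with white space continues the previous
            # field (folding). Unfolding removes only the CRLF.
            name, value = fields[-1]
            fields[-1] = (name, value + line)
        else:
            # The colon separates the field name from the field body.
            name, _, value = line.partition(":")
            fields.append((name, value.strip()))
    return fields, body


raw = (
    "From: user1@example1.com\r\n"
    "Subject: A test message\r\n"
    " that is folded\r\n"
    "\r\n"
    "Hello.\r\n"
)
fields, body = split_message(raw)
print(fields)
print(repr(body))
```

The folded Subject field comes back as a single logical line, and everything after the first empty line is returned as the body.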

Q2: What is MIME?

A2: MIME (Multipurpose Internet Mail Extensions) is a standard that can be used to include content of various types in a single message. MIME extends the format of the mail messages that are carried over the Simple Mail Transfer Protocol (SMTP) so that a message can include multiple parts, both textual and non-textual. Parts of the message may be images, audio, or text in different character sets. The MIME standard is defined in RFCs 2045 through 2049 and builds on the transport and message format that are described in RFCs 2821 and 2822.

Q3: How does Exchange handle MIME?

A3: There are three main components in Exchange that perform content conversion:
 * The IMAIL component is the core component that converts Internet messages to MAPI / Exchange format and vice versa. It also checks the integrity of the message.
 * The EXMIME component is responsible for translating Internet messages in Exchange's internal object representations and for generating MIME-formatted messages.
 * The RFHTML component converts Rich Text Format (RTF) messages to HTML and vice versa, according to the settings on the Microsoft Outlook client and the settings on the Exchange Server computer.

Q4: What are the minimum requirements of a MIME-formatted e-mail message?

A4: A MIME-formatted e-mail message will typically have a header and a body. At a minimum, a MIME message must have a header. The body is optional. The header section includes a MIME-Version: 1.0 header (one per message, as defined in RFC 2045).

The body section includes:
 * Content-Type header (This is optional. The default is text/plain, as defined in RFC 2045, section 5.2.)
 * Content-Transfer-Encoding header (This is optional. The default is 7bit, as defined in RFC 2045, section 6.1.)
 * Content-Disposition header (This is optional.)
 * A Content-ID (This is optional.)
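The defaults listed above can be observed with the Python standard library `email` package. This is a small sketch that uses a made-up message with no Content-* headers; it shows how a generic MIME parser applies the defaults and says nothing about how Exchange implements them.

```python
from email import message_from_string

# A message with a MIME-Version header but no Content-* headers.
raw = "MIME-Version: 1.0\r\nSubject: defaults\r\n\r\nplain text body\r\n"
msg = message_from_string(raw)

# The optional headers are absent, so the documented defaults apply.
content_type = msg.get_content_type()                    # default type
encoding = msg.get("Content-Transfer-Encoding", "7bit")  # default encoding
disposition = msg.get_content_disposition()              # no header present
print(content_type, encoding, disposition)
```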

Q5: What is a MIME version header?

A5: The MIME-Version header field denotes a MIME-formatted message. Messages that are sent from earlier software that does not support MIME do not have this field. Mail clients use the absence of this field to identify non-MIME messages.

Q6: What is a Content-type header?

A6: The Content-Type field is used to specify the type and sub-type of the data in a message body or body part.

Some common content types are multipart/mixed, multipart/alternative, text/plain, text/html, application/applefile, application/ms-tnef, and application/octet-stream.

Q7: What is Content-Transfer-Encoding?

A7: Different mail systems handle data differently, and some earlier mail systems cannot handle multimedia data.

To work around systems that cannot handle multimedia data, an encoding scheme is used to convert the data to a uniform 7-bit format. When the recipient receives the message, the data is restored to its original format. Some examples of Content-Transfer-Encoding formats are:
 * Quoted-printable
 * Base64
 * 8-bit
 * 7-bit
 * Binary
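As an illustration of the first two encodings in the list, the following sketch uses the Python standard library modules `quopri` and `base64` on a short sample string. The sample text is an assumption chosen only because it contains one 8-bit character.

```python
import base64
import quopri

# "café" in ISO-8859-1 contains one 8-bit byte (0xE9).
data = "café".encode("iso-8859-1")

qp = quopri.encodestring(data)   # 8-bit bytes become =XX escapes
b64 = base64.b64encode(data)     # every 3 bytes become 4 ASCII characters
print(qp, b64)

# The recipient restores the original data by decoding.
assert quopri.decodestring(qp) == data
assert base64.b64decode(b64) == data
```

Both encoded forms contain only 7-bit characters, which is what lets them pass through mail systems that cannot handle 8-bit data.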

Q8: How is Transfer-Encoding implemented in Exchange 5.5?

A8: Exchange 5.5 encodes in quoted-printable format or in 7-bit format, as required. Additionally, Exchange 5.5 encodes messages in base64 format if 25% of the message is made up of 8-bit characters (that is, characters that are outside the US-ASCII range). This applies only to the message body; attachments are always base64-encoded.
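The selection rule described above might be sketched as follows. This is an illustration of the stated rule only, not Exchange 5.5 code; the function name `choose_body_encoding` is hypothetical, and treating the threshold as "25% or more" is an assumption.

```python
def choose_body_encoding(body: bytes) -> str:
    """Pick a transfer encoding for a message body (illustrative only)."""
    # Count the bytes that are outside the US-ASCII range.
    eight_bit = sum(1 for byte in body if byte > 127)
    if eight_bit == 0:
        return "7bit"
    # Assumption: the threshold is read as "25% or more".
    if eight_bit / len(body) >= 0.25:
        return "base64"
    return "quoted-printable"


print(choose_body_encoding(b"plain ASCII text"))
print(choose_body_encoding("a caf\u00e9 and much more text".encode("iso-8859-1")))
print(choose_body_encoding("\u65e5\u672c\u8a9e".encode("utf-8")))
```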

Q9: How is Transfer-Encoding implemented in Exchange 2000 Server?

A9: Routing group boundaries and SMTP target destinations determine how Exchange 2000 Server encodes mail. Exchange 2000 will encode as quoted-printable or 7-bit or Transport-Neutral Encapsulation Format (TNEF) when sending between two servers/recipients in different routing groups, and to the Internet.

Exchange 2000 Server will encode in Binary or Summary TNEF when sending to a recipient/server in the same routing group.

Q10: How is Transfer-Encoding implemented in Microsoft Exchange Server 2003?

A10: As in Exchange 2000 Server, routing group boundaries and SMTP target destinations also determine how Exchange Server 2003 encodes mail.

In mixed mode, Exchange Server 2003 encodes as quoted-printable or 7-bit (TNEF format) when sending between two servers/recipients in different routing groups and to the Internet.

In native mode, Exchange Server 2003 encodes in Binary (Summary TNEF) when sending to a recipient/server in the same or other routing groups.

Q11: What is a Content-Disposition header?

A11: This header identifies whether a section will appear as an attachment or appear in the message body.

Q12: How does Exchange handle attachments?

A12: Exchange Server 2003 saves Internet messages in their native format. This means that if the messages are read from an Internet-format-aware client like Outlook Express, they will be rendered in their original format.

However, when messages are read from a MAPI client, IMAIL maps the appropriate elements in the Internet format to MAPI properties. In fact, before an Internet message is even delivered to a mailbox, a minimum set of MAPI properties must be promoted from the Internet message; these include PR_SENT_REPRESENTING, PR_SUBJECT, and recipient table.

Other MAPI properties like PR_BODY, PR_HTML, and PR_ATTACH_DATA_BIN are computed from the Internet format on demand. Conversion occurs when a MAPI client requests the message for the first time. Exchange 2000 Server then promotes those native MIME properties to MAPI format.

During content-conversion in Exchange Server 2003, when rendering an inbound MIME message to a client, IMAIL does the following:
 * 1) IMAIL checks for a complete file name ( . ) in the MIME header. If a name is found, the file name is used.
 * 2) IMAIL checks for a partial file name. If one is found, IMAIL looks at the  subkey to map a file name extension to the content type. If a matching content type is found, that extension is added to the attachment. If no matching content type is found, the partial file name is used.
 * 3) If no file names are specified in the Content-type or Content-Disposition headers, IMAIL searches for a matching content type. If one is found, the attachment will have the format ATT ., where   is the extension that is associated with this content type. If no matching content type is found, an extension of ".att" is used. Exchange 2000 Server requires a file name in the Content-type header.
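The three steps above can be sketched as a fallback chain. This is not IMAIL code: the function name `attachment_name` and the `EXTENSIONS` table are hypothetical stand-ins for the content-type lookup, and the generated name uses a bare "ATT" prefix because the exact form of the generated name is not fully shown in this article.

```python
import os

# Hypothetical content-type-to-extension table, for illustration only.
EXTENSIONS = {
    "image/jpeg": ".jpg",
    "text/plain": ".txt",
    "application/pdf": ".pdf",
}


def attachment_name(filename, content_type):
    """Choose a display name for an attachment (illustrative only)."""
    if filename:
        _, extension = os.path.splitext(filename)
        if extension:
            # Step 1: a complete file name is used as it is.
            return filename
        # Step 2: a partial file name gets the extension that matches the
        # content type, or stays unchanged if there is no match.
        return filename + EXTENSIONS.get(content_type, "")
    # Step 3: no file name at all; fall back to ATT plus an extension,
    # or ".att" if the content type is unknown.
    return "ATT" + EXTENSIONS.get(content_type, ".att")


print(attachment_name("Haloweenpictures.jpg", "image/jpeg"))
print(attachment_name("report", "application/pdf"))
print(attachment_name(None, "application/x-unknown"))
```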

Q13: What is the structure of the full message?

A13: The following is an example of a full message, with all the headers shown. Recipients will generally see only the body parts. Comments have been added in brackets.

Received: from SMTP server (server1.example1.com IP address)

by Receiver SMTP version 1.0 (server2.example2.com IP address);

Mon, 28 Oct 2002 08:42:42 -0500 (EST)

Message-ID: <006@example1.com>

From: "User1" (RFC 2822 sender)

To: "User2" <user2@example2.com> (RFC 2822 recipient)

Subject: A test message to see if you can see me now!

Date: Mon, 8 Nov 2002 13:46:54 +0100

MIME-Version: 1.0

Content-Type: multipart/mixed; boundary="=_NextPart_000_005A_01C27E88.79B98A90"

X-Priority: 3

Return-Path: user1@domain1.com

X-OriginalArrivalTime: 28 Oct 2002 13:45:00.0541 (UTC) FILETIME=[35F0A2D0:01C27E88]

This is a multi-part message in MIME format (If you see this in the message body, there is a problem. Notice there is no space or CRLF in the headers in the previous text. There must be no space until the message body itself)

This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible. (If you see this in the message body, there is a problem)

--=_NextPart_000_005A_01C27E88.79B98A90 (Boundary)

Content-Type: text/plain;

charset="iso-8859-1"

Content-Transfer-Encoding: 7bit

Can you see me now??! (This is the text of the message. You see this in your mail client.)

--=_NextPart_000_005A_01C27E88.79B98A90 (Boundary. See Multipart/Mixed definition earlier in this article.)

Content-Type: image/jpeg;

name="Haloweenpictures.jpg"

Content-Transfer-Encoding: base64 (All attachments in Exchange 2000 Server are encoded by using base64, which increases their size by about 33%.)

Content-Disposition: attachment;

filename="Haloweenpictures.jpg"

/9j/4AAQSkZJRgABAQEBLAEsAAD/2wB/9j/4AAQSkZJRgABAQEBLAEsAAD/2wB (Base64 data in raw format. This is the picture encoded as text; it is decoded back into a picture when the message is read.)

/9j/4AAQSkZJRgABAQEBLAEsAAD/2wB/9j/4AAQSkZJRgABAQEBLAEsAAD/2wB

/9j/4AAQSkZJRgABAQEBLAEsAAD/2wB/9j/4AAQSkZJRgABAQEBLAEsAAD/2wB

--=_NextPart_000_005A_01C27E88.79B98A90--
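A message with the same structure as the example above can be taken apart with the Python standard library `email` package. The sketch below uses a shortened, made-up message (the boundary string, the `data.bin` attachment, and its six bytes of content are assumptions) to show how a MIME-aware reader finds the preamble, the text part, and the attachment.

```python
from email import message_from_string

raw = "\r\n".join([
    "From: user1@example1.com",
    "To: user2@example2.com",
    "Subject: A test message",
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="BOUNDARY"',
    "",
    "This is a multi-part message in MIME format.",
    "",
    "--BOUNDARY",
    'Content-Type: text/plain; charset="iso-8859-1"',
    "Content-Transfer-Encoding: 7bit",
    "",
    "Can you see me now?",
    "--BOUNDARY",
    'Content-Type: application/octet-stream; name="data.bin"',
    "Content-Transfer-Encoding: base64",
    'Content-Disposition: attachment; filename="data.bin"',
    "",
    "AAECAwQF",
    "--BOUNDARY--",
    "",
])

msg = message_from_string(raw)

parts = []
for part in msg.walk():
    if part.is_multipart():
        continue  # the multipart container itself carries no content
    parts.append((
        part.get_content_type(),
        part.get_filename(),            # None for the inline text part
        part.get_payload(decode=True),  # transfer encoding removed
    ))

for content_type, filename, payload in parts:
    print(content_type, filename, payload)

# The text before the first boundary is kept apart from the body parts.
print(repr(msg.preamble))
```

The text before the first boundary is exposed as the preamble and is not one of the body parts, which is why a working client does not display it.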

Q14: Why do mail addresses show up in the body of my message?

A14: The first “CRLF CRLF” in the MIME stream denotes the beginning of the message body. Therefore, if there is a “CRLF CRLF” sequence between two headers, every header after that sequence is treated as body text, and the client (Microsoft Outlook or Microsoft Outlook Express) will show a garbled message.

The first “CRLF CRLF” separates the Internet message headers from the message body. These headers are known as P1 headers. P1 headers are envelope headers and are not part of the Internet message that is processed by IMAIL.

Exchange may be forwarding to a smart host, content scanning device, virus wall or firewall. Third-party firewall software may sometimes distort messages.

You may also be receiving e-mail messages through a UNIX system, through a firewall, or through a virus wall. Also, a third-party content-scanning device may be on the network. If you have similar applications or devices on your network, verify recipient limits. Avoid using such devices or applications in diagnostic tests.
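When you examine a saved copy of a problem message, a quick check for this symptom is to see whether the body starts with lines that look like header fields. The following is a rough heuristic sketch and not a diagnostic tool from Microsoft; the function name `leaked_header_lines` is hypothetical, and a body that legitimately begins with text such as "Note: ..." would be reported as a false positive.

```python
import re

# A field name is printable US-ASCII (33 to 126) without the colon,
# followed here by a colon and white space.
HEADER_LINE = re.compile(r"^[!-9;-~]+:[ \t]")


def leaked_header_lines(raw: str):
    """Return lines at the top of the body that look like header fields."""
    _, _, body = raw.partition("\r\n\r\n")
    leaked = []
    for line in body.split("\r\n"):
        if HEADER_LINE.match(line):
            leaked.append(line)
        elif line[:1] in (" ", "\t") and leaked:
            leaked.append(line)  # folded continuation of a leaked field
        else:
            break
    return leaked


damaged = (
    "From: user1@example1.com\r\n"
    "Subject: test\r\n"
    "\r\n"                        # stray empty line inside the header
    "To: user2@example2.com\r\n"
    "MIME-Version: 1.0\r\n"
    "\r\n"
    "Hello.\r\n"
)
print(leaked_header_lines(damaged))
```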

Q15: Why do question marks appear in my message?

A15: If question marks appear in the message, it means that the system does not know how to translate some ANSI or Unicode characters that are in the message. You must make sure that the client where the message is viewed has code pages installed that match the inbound character set. For example, make sure that the Windows workstation has the Japanese locale installed to view messages that are written in Japanese.
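The effect can be reproduced in a few lines of Python. This sketch only illustrates the general mechanism (characters that cannot be represented in the target character set are replaced with question marks); the sample text and the choice of ISO-8859-1 as the mismatched character set are assumptions.

```python
text = "日本語 test"

# Converting to a character set that lacks these characters substitutes
# a question mark for each one.
shown = text.encode("iso-8859-1", errors="replace").decode("iso-8859-1")
print(shown)

# With a matching character set on both sides, the text survives intact.
wire = text.encode("shift_jis")
restored = wire.decode("shift_jis")
print(restored)
```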

Q16: Why do I have difficulty opening HTML messages and I see events 12002 and 12003 in my application log?

A16: These application log events are from the source Exchange Information Store and are content conversion errors. Some of these messages can be ignored and do not have any effect on the received messages. However, if you see many of these messages in the application logs, check whether clients have problems opening HTML messages.

If this is the case, gather more diagnostic information, such as service pack versions, copies of problem messages, and copies of the application and system logs, and then troubleshoot.

Q17: Why do I sometimes see the message body as an attachment?

A17: You must establish information about the source of the message. You must establish all details of the source server. You must investigate whether the sender is on a UNIX network. For more information about this problem, click the following article number to view the article in the Microsoft Knowledge Base:

323482 Exchange displays message that uses "inline" MIME Content-Disposition header as attachment

Q18: Why do I see “This is a multi-part message in MIME format.” or “This message is in MIME format.” in the body of the e-mail?

A18: You may see one of the following in the text body:
 * “This is a multi-part message in MIME format.”
 * “This message is in MIME format. Because your mail reader does not understand this format, some or all of this message may not be legible.”

This text is inserted before the first boundary, is typically present in every multipart message, and is not visible to the client unless there is a problem with the e-mail format. For example, a hard line break might have been inserted in the message in the wrong position.

To troubleshoot this issue, turn on message archival on the Microsoft Exchange Internet Mail Service. To turn on message archival, use the Exchange Server 2003 Archive Sink utility, save the message as .eml or .pst, and then do more analysis. For more information about the Archive sink feature, click the following article number to view the article in the Microsoft Knowledge Base:

307798 The Archive Sink utility is available in Service Pack 2

Q19: Why is the size of my mail about 33% larger than I expect, and why is it base64-encoded even when it does not contain attachments?

A19: This behavior occurs if you use the following character sets:
 * Japanese Shift-JIS
 * KOR
 * Japanese EUC
 * Korean ISO-2022-KR
 * Taiwan ISO-2022-TW
 * Chinese ISO-2022-CN
 * Chinese HZ_GB, Big5, GB18030
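The size increase follows from how base64 works: every 3 bytes of input become 4 ASCII characters of output. The following sketch measures this on a made-up Shift-JIS body; the sample text is an assumption, and line breaks and headers that a mail system adds are not counted.

```python
import base64

# A sample body in one of the listed character sets (Shift-JIS).
body = "こんにちは、世界。" * 100
raw = body.encode("shift_jis")

encoded = base64.b64encode(raw)
ratio = len(encoded) / len(raw)

# Every 3 input bytes produce 4 output characters.
print(len(raw), len(encoded), round(ratio, 3))
```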

Q20: What are other common and known issues?

A20: Messages that are relayed from one Exchange Server 2003 server to another may appear garbled, and the recipients of the message may show up in the body of the message.

For example, this may occur in a message that is routed from the Internet to an inbound Internet Exchange Server 2003 computer, and then to another Exchange Server 2003 computer.

The message might appear correctly on the inbound Internet Exchange Server 2003 computer but garbled on the back-end Exchange Server computer. These symptoms may also change if the number of recipients on the message is reduced. This issue can appear for several reasons, but the two most common reasons are the following.

RFC 2822 message headers. The headers have more than 1000 or 1024 characters in a line, instead of at most 998 characters in a line. For more information, see Request for Comments (RFC) 2822. To do this, visit the following Internet Engineering Task Force (IETF) Web site:

http://www.ietf.org/rfc/rfc2822.txt?number=2822

You may experience this issue if messages are relayed to an Exchange Server 2003 computer by using binary data or by chunking. Chunking is an extension to the SMTP format that supports data sent in chunks.

When Exchange Server 2003 receives a message that has more than 998 characters in a line, the SMTP service parses the header and discovers that the line is longer than 1000 characters. The SMTP service then assumes this line is not a header part and includes it in the body.
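To find out whether a saved message contains such a line, you can scan it for lines that exceed the limit. This is a minimal sketch; the function name `overlong_lines` and the sample recipient list are assumptions, and the 998-character limit is the figure used in this article (line length without the CRLF).

```python
MAX_LINE = 998  # characters per line, not counting the CRLF


def overlong_lines(raw: str, limit: int = MAX_LINE):
    """Return (line_number, length) for each line longer than the limit."""
    return [
        (number, len(line))
        for number, line in enumerate(raw.split("\r\n"), start=1)
        if len(line) > limit
    ]


# A To header with many recipients that was never folded.
recipients = ", ".join(f"user{i}@example2.com" for i in range(100))
raw = (
    "From: user1@example1.com\r\n"
    f"To: {recipients}\r\n"
    "Subject: test\r\n"
    "\r\n"
    "body\r\n"
)
print(overlong_lines(raw))
```

A message that is flagged here would need its long header folded across several lines before it is relayed.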

The SMTP service on the Exchange Server computer will then rewrite its own headers, including a Message-ID header and a Date header, followed by a blank line (a CRLF).

Line length limits. There are many implementations that, in accordance with the transport requirements of RFC 2821, do not accept messages that have more than 1000 characters per line, including the CRLF. Therefore, mail applications must not create such messages.

To work around this issue, turn off the ESMTP verb (or chunking) capability on the Exchange Server computer and force the Exchange Server computers to format messages in typical SMTP format when the message is being relayed. For more information, click the following article numbers to view the articles in the Microsoft Knowledge Base:

257569 How to turn off ESMTP verbs in Exchange 2000 Server and in Exchange Server 2003

821733 Incoming message is garbled if the To line exceeds 1,022 characters



