
LLE10070 : What is the default encoding for an XML document?

Symptom:

Your XML document has no encoding declaration, and you need to know which encoding a parser will assume for it.

Cause:

This is by design

Solution:

If no encoding declaration is present in the XML document (and no external encoding information, such as the charset parameter of an HTTP Content-Type header, is available), the assumed encoding of the document depends on the presence of the Byte-Order-Mark (BOM).

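The Byte-Order-Mark (BOM) is a special marker added at the very beginning of a Unicode file encoded in UTF-8, UTF-16 or UTF-32. In UTF-16 and UTF-32 it indicates whether the file uses big-endian or little-endian byte order. UTF-8 has no byte-order variants, so there the BOM serves only as an encoding signature. For XML, the BOM is required for UTF-16, is needed in practice to detect UTF-32 when no encoding declaration is present, and is optional for UTF-8.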
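To make the marker concrete, the following minimal sketch encodes a small document in several ways and prints the first four bytes of each result. The sample text and variable names are made up for illustration; the codecs ("utf-8-sig", "utf-16-be", "utf-16-le") and the BOM constants come from Python's standard library.

```python
import codecs

# Hypothetical sample document; any XML text would do.
xml_text = "<greeting>hello</greeting>"

# "utf-8" writes no BOM; "utf-8-sig" prepends the 3-byte UTF-8 BOM.
utf8_plain = xml_text.encode("utf-8")
utf8_with_bom = xml_text.encode("utf-8-sig")

# The explicit-endian UTF-16 codecs write no BOM, so it is added by hand.
utf16_be = codecs.BOM_UTF16_BE + xml_text.encode("utf-16-be")
utf16_le = codecs.BOM_UTF16_LE + xml_text.encode("utf-16-le")

samples = [
    ("UTF-8, no BOM", utf8_plain),
    ("UTF-8, BOM", utf8_with_bom),
    ("UTF-16 BE", utf16_be),
    ("UTF-16 LE", utf16_le),
]

for label, data in samples:
    # Show the leading bytes as space-separated hex pairs.
    print(f"{label:14} {data[:4].hex(' ').upper()}")
```

Apart from the leading three bytes, the two UTF-8 results are identical, which is why the BOM can be treated as optional for that encoding.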

When no encoding declaration is present, the encoding is inferred from the first bytes of the file as shown below. The four-byte patterns must be tested before the two-byte ones, because FF FE 00 00 also begins with FF FE.
First bytes          Encoding assumed
EF BB BF             UTF-8
FE FF                UTF-16 (big-endian)
FF FE                UTF-16 (little-endian)
00 00 FE FF          UTF-32 (big-endian)
FF FE 00 00          UTF-32 (little-endian)
None of the above    UTF-8
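
The table can be turned into a small lookup routine. This is only a sketch under stated assumptions: the function name assumed_encoding and the BOM_TABLE list are hypothetical, and the routine covers just the case described here, where the document carries no encoding declaration and no external encoding information is available. The UTF-32 marks are listed first so that FF FE 00 00 is not mistaken for the UTF-16 little-endian mark.

```python
# Hypothetical lookup table built from the list above.
# Longer marks come first so a UTF-32 BOM is not read as a UTF-16 BOM.
BOM_TABLE = [
    (b"\x00\x00\xfe\xff", "UTF-32 (big-endian)"),
    (b"\xff\xfe\x00\x00", "UTF-32 (little-endian)"),
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\xfe\xff", "UTF-16 (big-endian)"),
    (b"\xff\xfe", "UTF-16 (little-endian)"),
]


def assumed_encoding(data: bytes) -> str:
    """Return the encoding assumed for an XML document with no declaration."""
    for bom, name in BOM_TABLE:
        if data.startswith(bom):
            return name
    # No BOM at all: the document is assumed to be UTF-8.
    return "UTF-8"


if __name__ == "__main__":
    print(assumed_encoding(b"<root/>"))
    print(assumed_encoding(b"\xff\xfe<\x00r\x00"))
```

In practice the first few bytes could be read with open(path, "rb").read(4) and passed to the function.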

Note that the encoding of an XML document is never ISO-8859-1 by default.

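One of the most common mistakes when editing an XML document is to add non-ASCII characters (such as accented letters) and forget to set the encoding declaration at the top of the document. If the editor saves the file in ISO-8859-1 or a similar encoding, a parser that assumes UTF-8 will typically reject those bytes as malformed.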
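
The mistake can be reproduced with Python's standard xml.etree.ElementTree parser. This is an illustrative sketch: the sample element, the variable names and the parses helper are made up, and other parsers may report the problem differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical document body containing one accented letter (e with acute).
body = "<name>caf\u00e9</name>"

# Saved as ISO-8859-1 without a declaration: the accented letter becomes
# the single byte E9, which is not valid UTF-8 in this position.
without_declaration = body.encode("iso-8859-1")

# Same bytes, but the encoding declaration tells the parser how to read them.
with_declaration = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body.encode("iso-8859-1")
)

# Saved as UTF-8: no declaration is needed, since UTF-8 is the default.
saved_as_utf8 = body.encode("utf-8")


def parses(data: bytes) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


for label, data in [
    ("ISO-8859-1, no declaration", without_declaration),
    ("ISO-8859-1, declared", with_declaration),
    ("UTF-8, no declaration", saved_as_utf8),
]:
    print(f"{label:28} parses: {parses(data)}")
```

Either fix works: declare the encoding the file is actually saved in, or save the file as UTF-8.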

Disclaimer:

The information provided in this document is intended for your information only. Lubby makes no claims as to the validity of this information. Use of this information is at your own risk!
Copyright © 2004-2011 Lubby (V3.0.10 Aug 2011)