XML binary data and CDATA
So how can this potential headache be avoided? Binary data should be encoded as hex or base64. Encoding data has both pros and cons, but in any event it is the only way to ensure the data will not break an XML document.
The upside to encoding is that the data is now more portable. The downside is that encoding introduces additional overhead: the document is larger than if the raw binary data were used directly, and the data must be decoded before it can be used, which adds processing.
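To make the overhead concrete, here is a minimal Python sketch that encodes the same bytes as hex and as base64 and compares the resulting lengths. The 3000-byte random payload is only an illustrative stand-in for real binary data.

```python
import base64
import binascii
import os

# Hypothetical binary payload; any bytes object would do.
raw = os.urandom(3000)

# Hex uses two characters per input byte.
hex_text = binascii.hexlify(raw).decode("ascii")

# Base64 uses four characters per three input bytes (plus padding).
b64_text = base64.b64encode(raw).decode("ascii")

print("raw bytes:   ", len(raw))
print("hex chars:   ", len(hex_text))
print("base64 chars:", len(b64_text))

# Decoding is the extra processing step the reader pays for.
assert binascii.unhexlify(hex_text) == raw
assert base64.b64decode(b64_text) == raw
```

Both encodings produce plain ASCII text that is safe to place in an XML document; they differ only in how much larger the result is.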
In terms of size, base64 encoding is your best bet: it produces about four characters for every three bytes of input, whereas hex produces two characters for every byte. Another thing to be aware of is future compatibility. Suppose an application starts out as a small internal service, so you decide on a custom homebrewed solution for these technologies. After some time the company decides that the service is extremely useful and wants to expose it to its clients, but would like it to be compatible with the W3C specifications for these technologies.
According to the specs, base64 is the only required encoding algorithm. If you are using hex encoding, it is quite possible that your application will not interoperate with other applications that are also based on these specifications.
Once the data is encoded, there are no longer any special circumstances requiring it to live within a CDATA section, and it can happily live as element content.
As a text-based standard, XML is well-suited for exchanging data between client and server systems. Much data is already text-based (file paths, descriptions, addresses, names, and so on), and things like integers, floating-point numbers, and dates can be easily converted to and from string representations.
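As a sketch of how encoded binary data can sit alongside ordinary text values as plain element content, the following Python example builds a small document and reads it back with the standard library's ElementTree. The element names (file, name, size, data) and the helper functions are hypothetical, chosen only for illustration.

```python
import base64
import xml.etree.ElementTree as ET


def build_document(name: str, payload: bytes) -> str:
    """Serialize a name, a size, and a base64-encoded payload as XML."""
    root = ET.Element("file")
    ET.SubElement(root, "name").text = name
    # Integers convert to and from strings trivially.
    ET.SubElement(root, "size").text = str(len(payload))
    # Base64 text contains no markup characters, so no CDATA is needed.
    ET.SubElement(root, "data").text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def read_payload(xml_text: str) -> bytes:
    """Parse the document and decode the payload back to raw bytes."""
    root = ET.fromstring(xml_text)
    return base64.b64decode(root.findtext("data"))


payload = bytes(range(256))  # every possible byte value
doc = build_document("sample.bin", payload)
print(doc[:60] + "...")
assert read_payload(doc) == payload
```

The payload covers every byte value, including ones that could never appear raw in an XML document, and still survives the round trip.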
Escaping markup characters as entities expands the data and makes it extremely hard for humans to read, not to mention the annoyance of translating markup if you write the XML manually in a text editor. A better solution might be to put the data directly into your XML document.
CDATA (character data) sections are treated as a block of data by the parser, allowing you to include any character in the data stream.
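A minimal Python sketch of the difference, using the standard library's ElementTree: the same paragraph markup is carried once with entity escapes and once inside a CDATA section, and the parser hands back identical text in both cases. The sample element name and the paragraph content are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Markup shown as text, escaped with entities: hard to read and to write.
escaped = (
    "<sample>&lt;p&gt;Some &lt;em&gt;emphasized&lt;/em&gt; text.&lt;/p&gt;</sample>"
)

# The same content inside a CDATA section: readable as-is.
cdata = "<sample><![CDATA[<p>Some <em>emphasized</em> text.</p>]]></sample>"

escaped_text = ET.fromstring(escaped).text
cdata_text = ET.fromstring(cdata).text

print(cdata_text)

# Both forms deliver the same character data to the application.
assert escaped_text == cdata_text
```

The CDATA form is only a convenience for whoever writes or reads the source; the application sees no difference.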
Listing 1 shows a simple paragraph sample with some emphasized text. It becomes a bit of a nightmare when you want to show the markup (see Listing 2). A CDATA section starts with <![CDATA[ and ends with ]]>; anything between those bits of markup will pass through the XML parser untouched. You'll often see something like Listing 4.
This isn't generally going to cause trouble unless you're very, very unlucky, but it can certainly cause parser errors that lead to confusing and hard-to-debug rendering errors. Also, the double-dash (--) sequence can be seen as the unexpected start or end of an XHTML comment block. Clearly the CDATA section is useful, but like all good things, it has a couple of limitations for you to keep in mind.
You'll either lose their contents (the CDATA section has vanished from the normal DOM) or have the contents rendered as text with some stray markup characters showing up. To see this effect, look at a page that shows the sample paragraph, the sample paragraph with the markup visible (using entities), and an attempt to show the sample paragraph with the markup visible using CDATA.
WebKit-based browsers such as Safari and Chrome render it with spurious markup characters (see Figure 2). If they didn't, the browser's XML parser would be considered "non-conforming" and people would mock it mercilessly before marking it as horrifically broken for Ajax.
If the XML parser reads the ]]> sequence anywhere inside your data, it's the end of your CDATA section, and you might end up getting a parser error when it hits the real section end. Luckily, this situation doesn't come up very often. Even though the contents of the CDATA section pass through your parser untouched, they still need to be valid XML data characters, as specified by the document's character encoding.
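One common workaround, sketched here in Python, is to split the data wherever the terminator appears so that no single CDATA section ever contains it. The wrap_in_cdata and parses helpers and the script element are hypothetical names used only for this illustration.

```python
import xml.etree.ElementTree as ET

CDATA_END = "]]>"


def wrap_in_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting it wherever the terminator occurs."""
    # Close the section after "]]" and reopen a new one before ">".
    safe = text.replace(CDATA_END, "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"


def parses(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Data that happens to contain the terminator sequence.
tricky = "if (a[b[0]]>1) { run(); }"

naive = "<script><![CDATA[" + tricky + "]]></script>"
split = "<script>" + wrap_in_cdata(tricky) + "</script>"

print("naive parses:", parses(naive))
print("split parses:", parses(split))

# Adjacent CDATA sections are delivered as one run of text.
assert ET.fromstring(split).text == tricky
```

The split version costs a few extra characters per occurrence, and the parser reassembles the pieces so the application still receives the original text.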
Using something like UTF-8 lets you use a huge range of characters for the data, but it's not 8-bit clean. Most of the so-called control characters (those with a hex value below 0x20, the space character) can cause your parser to stop with an invalid token error; only tab, line feed, and carriage return are allowed.
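The following Python sketch shows the problem and the usual fix. It checks a string for the forbidden control characters discussed above, confirms that the standard-library parser rejects them even inside a CDATA section, and then shows the same bytes passing through as base64. The helper names are hypothetical, and the check covers only the range below 0x20, not every character XML excludes.

```python
import base64
import xml.etree.ElementTree as ET

# The only characters below 0x20 that XML 1.0 permits.
ALLOWED_CONTROLS = {"\t", "\n", "\r"}


def has_forbidden_control(text: str) -> bool:
    """Return True if text holds a control character XML 1.0 disallows."""
    return any(ch < "\x20" and ch not in ALLOWED_CONTROLS for ch in text)


def parses(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw = b"\x01\x02binary\x1f"       # illustrative binary data
as_text = raw.decode("latin-1")   # one character per byte

in_cdata = "<d><![CDATA[" + as_text + "]]></d>"
in_base64 = "<d>" + base64.b64encode(raw).decode("ascii") + "</d>"

print("forbidden controls present:", has_forbidden_control(as_text))
print("CDATA version parses:      ", parses(in_cdata))
print("base64 version parses:     ", parses(in_base64))

# The encoded form survives the round trip intact.
assert base64.b64decode(ET.fromstring(in_base64).text) == raw
```

A check like has_forbidden_control can serve as a cheap guard before deciding whether data can go into a CDATA section or needs to be encoded first.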