Friday, April 18, 2008

PHP5 SimpleXML and CDATA

I am in the process of porting a rather large PHP4 application to PHP5 (just in time for PHP6, yes, yes, I know) for one of my customers. Most of the application is pure imperative programming so the switch has been rather painless.

Unfortunately (for me), they have made rather judicious use of the PHP4 domxml extension, which no longer exists in PHP5 (where it has been replaced by the dom extension).

Moving from domxml to dom was straightforward (looping over tags, attributes and the text inside tags is addressed differently), so I chose to simply rewrite the code to use the new dom extension instead of opting for a translation library like the one provided by Alexandre Alapetite.
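As a rough illustration of how tags, attributes and text are addressed with the PHP5 dom extension, here is a minimal sketch. The sample document and its element and attribute names are made up for the example and are not taken from the customer's application.

```php
<?php
// Hypothetical sample document; "items", "item" and "id" are illustrative names.
$text = '<items><item id="1">first</item><item id="2">second</item></items>';

$doc = new DOMDocument();
$doc->loadXML($text);

$found = array();
foreach ($doc->getElementsByTagName('item') as $item) {
    // Attributes are read with getAttribute(); the text inside a tag via nodeValue.
    $found[$item->getAttribute('id')] = $item->nodeValue;
}

print_r($found);
```

The loop collects each item's text keyed by its id attribute, which is the kind of traversal that had to be rewritten by hand.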

One exception to this was the use of <![CDATA[...]]> blocks, which SimpleXML simply seemed to discard when creating a new object.

A quick look around Google (here, here and here) and I found what needed to be done to address SimpleXML's ignorant behaviour.

The SimpleXMLElement constructor accepts an optional second argument of libxml2 options, which let you get further functionality out of the library. The one I was interested in was, of course:

LIBXML_NOCDATA (integer)
Merge CDATA as text nodes

So, simply changing my constructor from:
$xml = new SimpleXMLElement($text);
to:
$xml = new SimpleXMLElement($text, LIBXML_NOCDATA);
was all that was required for me to gain access to those <![CDATA[...]]> structures.
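
To see the effect of the flag, the two constructors can be compared side by side. This is only a sketch: the sample XML and its element names are invented for illustration, and the exact dump output may vary between PHP versions.

```php
<?php
// Hypothetical sample input; "post", "title" and "body" are illustrative names.
$text = '<post><title>Hello</title><body><![CDATA[<b>bold</b> text]]></body></post>';

// Default options: CDATA sections are kept as separate CDATA nodes.
$plain = new SimpleXMLElement($text);

// LIBXML_NOCDATA: CDATA sections are merged in as ordinary text nodes.
$merged = new SimpleXMLElement($text, LIBXML_NOCDATA);

// Dump both objects to compare how the body element is presented.
print_r($plain);
print_r($merged);

// With the flag set, the CDATA content reads like any other text.
$body = (string) $merged->body;
echo $body, "\n";
```

It may also be worth trying an explicit (string) cast on the element from the unflagged object, since the difference tends to be most visible in dumps of the whole structure.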

