Solving XPath Case Sensitivity with PHP
Case sensitivity can be a problem when running XPath queries in PHP. This is because PHP implements XPath 1.0 rather than 2.0, so useful functions such as lower-case() and upper-case() are not available. To work around this limitation, developers have often resorted to a hack with the translate function, like the one below.
$doc = new \DOMDocument();
$doc->loadHTML('html markup');
$xpath = new \DOMXPath($doc);
$query = "//meta[contains(translate(@name, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'description')]";
$nodeList = $xpath->query($query);
The above XPath query looks for meta elements whose name attribute contains “description” in any letter case. It does this by converting all uppercase letters listed in the translate call to lowercase before performing the comparison. The problem is that the query is not very pretty, and if more than one case-insensitive search is needed in a single query, it quickly becomes difficult to read and maintain.
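If you need to stay with plain XPath 1.0, one way to keep the translate approach readable is to generate the expression with a small helper. The sketch below assumes a hypothetical function named xpathLower (not part of PHP or of the original example) and uses made-up sample markup for illustration.

```php
<?php
// Hypothetical helper: wraps an XPath expression in translate() so that
// the ASCII letters A-Z are converted to lowercase.
function xpathLower(string $expr): string
{
    return "translate($expr, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz')";
}

// Sample markup, assumed for illustration only.
$html = '<html><head>'
    . '<meta name="DESCRIPTION" content="A">'
    . '<meta name="Keywords" content="B">'
    . '</head><body></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Build the query from the helper instead of repeating the alphabet strings.
$query = '//meta[contains(' . xpathLower('@name') . ", 'description')]";
$nodeList = $xpath->query($query);
```

The alphabet strings now live in one place, so a query with several case-insensitive comparisons can call xpathLower once per comparison.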
Instead, PHP provides a very clever and convenient extension to XPath which enables developers to use PHP functions within XPath queries. Consider the code below which accomplishes the same thing as the previous example did.
$doc = new \DOMDocument();
$doc->loadHTML('html markup');
$xpath = new \DOMXPath($doc);
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPhpFunctions(); // Allow all PHP functions
$query = "//meta[contains(php:functionString('strtolower', @name), 'description')]";
$nodeList = $xpath->query($query);
To be able to use PHP functions within XPath queries, we first have to register the PHP namespace. Secondly, we have to register the PHP functions themselves. In the above example, no parameter is passed to the registerPhpFunctions method, but a string or array of function names can be passed. It is also possible to use custom functions, which you can see an example of in the documentation.
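As a rough sketch of the restricted form mentioned above, the example below passes an array with a single function name to registerPhpFunctions, so that strtolower is the only PHP function registered for use in queries. The sample markup is an assumption made for illustration.

```php
<?php
// Sample markup, assumed for illustration only.
$html = '<html><head><meta name="Description" content="demo"></head><body></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('php', 'http://php.net/xpath');

// Register only the function the query needs instead of all PHP functions.
$xpath->registerPhpFunctions(['strtolower']);

$query = "//meta[contains(php:functionString('strtolower', @name), 'description')]";
$nodeList = $xpath->query($query);
```

Limiting the list this way is a reasonable default when any part of the query string could come from outside your own code.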
We can now call PHP functions like this: php:functionString('functionName', [parameters]). The code example above converts the value of the name attribute to lowercase and then checks whether it contains the “description” string, which makes the match case insensitive. As a result, the following values for a meta element’s name attribute would all match: “description”, “DESCRIPTION”, “dEsCriPtIoN”, etc.
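To see this matching behaviour in action, here is a minimal, self-contained sketch. The markup and the content values are invented for the demonstration; only the query itself comes from the example above.

```php
<?php
// Sample markup, assumed for illustration: three spellings of "description"
// plus one meta element that should not match.
$html = '<html><head>'
    . '<meta name="description" content="a">'
    . '<meta name="DESCRIPTION" content="b">'
    . '<meta name="dEsCriPtIoN" content="c">'
    . '<meta name="keywords" content="d">'
    . '</head><body></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPhpFunctions('strtolower');

$query = "//meta[contains(php:functionString('strtolower', @name), 'description')]";

// Collect the content attribute of every matching meta element.
$matched = [];
foreach ($xpath->query($query) as $node) {
    $matched[] = $node->getAttribute('content');
}
```

Note that because the query uses contains, a name such as “og:description” would match as well; compare with = instead if you need an exact match.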
That is really all there is to it. Now you can use PHP functions within XPath queries whenever a task cannot be solved with XPath alone, or is easier to solve with a PHP function. This approach is of course not limited to performing case-insensitive searches in XPath, but can be used in many other scenarios as well.
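As one example of such a scenario, the sketch below uses preg_match to select meta elements whose property attribute starts with “og:”, regardless of case. The markup, the regular expression and the choice of preg_match are assumptions made for illustration. Since preg_match returns a number, the predicate compares the result with 0 explicitly; a bare number in an XPath predicate would be treated as a position test.

```php
<?php
// Sample markup, assumed for illustration only.
$html = '<html><head>'
    . '<meta property="og:title" content="T">'
    . '<meta property="OG:Image" content="I">'
    . '<meta name="description" content="D">'
    . '</head><body></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPhpFunctions('preg_match');

// preg_match returns 1 on a match and 0 otherwise, so compare with 0.
$query = "//meta[php:functionString('preg_match', '/^og:/i', @property) > 0]";

// Collect the content attribute of every matching meta element.
$contents = [];
foreach ($xpath->query($query) as $node) {
    $contents[] = $node->getAttribute('content');
}
```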