Manipulating DOM Documents with phpQuery

Published on July 3, 2013 by Bo Andersen

If you have ever needed to manipulate a DOM document (e.g. an HTML document) in PHP, you have probably noticed that the DOMDocument class offers very limited functionality for this and is not particularly convenient to work with. After doing some research, I found a nice overview of extensions and libraries that lists various options for parsing and manipulating DOM documents. The phpQuery library looked particularly interesting, so I decided to look further into it.

phpQuery is, as the name suggests, a PHP port of the popular JavaScript library jQuery. The core principles of jQuery remain in phpQuery, such as method chaining and the fact that it is driven by CSS3 selectors. This makes it easy both to extract data from documents and to manipulate the Document Object Model (DOM). I will present a few examples of how phpQuery can be used below, but I urge you to visit the official project page for further information about the API and installation instructions. If you are using Composer for your project, I found that phpQuery can be installed by adding "duvanmonsa/php-query": "dev-master" to the require section of your composer.json and then running php composer.phar update in the Terminal or Command Prompt.

Loading Documents

There are a number of functions for loading documents that phpQuery can then operate on. After loading a document with one of the static methods, you can query it with the pq function, which by default operates on the last selected document. If you have more than one document, you can pass a document ID or document object to the function as a second parameter. All of this is demonstrated in the code snippet below.

$document = phpQuery::newDocumentHTML('HTML markup');
$container = pq('#container'); // Operates on the document loaded above

// Explicitly specifies which document to query
$secondDocument = phpQuery::newDocumentXHTML('HTML markup');
$container = pq('#container', $secondDocument);

// Queries the document with a specific ID
$thirdDocument = phpQuery::newDocumentXHTML('HTML markup');
$container = pq('#container', $thirdDocument->getDocumentID());

// Invoking the find method on a document corresponds to the pq method
$container = $thirdDocument->find('#container');
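
To make the default-document behaviour concrete, here is a minimal sketch with real markup instead of placeholders. It assumes phpQuery is available at the path in the require line (that path and the two tiny documents are made up for illustration; adjust the path to your installation, or load Composer's autoloader instead) and that the library runs on your PHP version.

```php
<?php
// Assumed location of the library; adjust to your installation.
require_once 'phpQuery/phpQuery.php';

// Two small, made-up documents that both contain an element with the same ID.
$first  = phpQuery::newDocumentHTML('<div id="container">First document</div>');
$second = phpQuery::newDocumentHTML('<div id="container">Second document</div>');

// Without a second argument, pq() queries the last selected document,
// which here is the one that was loaded most recently.
$fromDefault = pq('#container')->text();

// Passing a document object or a document ID selects the document explicitly.
$fromFirst  = pq('#container', $first)->text();
$fromSecond = pq('#container', $second->getDocumentID())->text();

echo $fromDefault, PHP_EOL;
echo $fromFirst, PHP_EOL;
echo $fromSecond, PHP_EOL;
```

Passing the document explicitly is the safer habit once more than one document is loaded, since the result no longer depends on the order in which the documents were created.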

Selectors

The selectors and filters available in phpQuery closely correspond to those of jQuery, as most CSS3 selectors have been implemented. Below are a few examples. For a more comprehensive overview, please see the documentation of jQuery selectors.

Please note that I am using the find method in the subsequent examples, but could just as well have used the pq method as in the previous code snippet.

$document = phpQuery::newDocumentHTML('HTML markup');

// Selects all elements with a given class
$matches = $document->find('.some-class');

// Selects the element with a given ID
$match = $document->find('#some-id');

// Selects all input elements
$matches = $document->find(':input');

// Selects all text input elements
$matches = $document->find(':input[type=text]');

/* Selects input elements that have a "data-city" attribute with a value of "New York" */
$matches = $document->find('input[data-city="New York"]');

// Selects every table row that is the first child of its parent
// (in CSS, the first cell of each row would be 'tr > td:first-child')
$matches = $document->find('tr:first-child');
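
The snippets above only build result sets, so the following sketch shows one way to inspect them. The form markup, field names, and include path are invented for illustration, and the same assumptions about the installed library apply. Iterating over a phpQuery result yields plain DOM nodes, which is why each node is wrapped with pq again before calling attr.

```php
<?php
// Assumed location of the library; adjust to your installation.
require_once 'phpQuery/phpQuery.php';

// Made-up markup for illustration.
$markup = '
<form id="search">
  <input type="text" name="query" class="field" data-city="New York">
  <input type="text" name="street" class="field">
  <input type="checkbox" name="exact">
</form>';

$document = phpQuery::newDocumentHTML($markup);

// Class, attribute-value, and attribute-type selectors.
$fields   = $document->find('.field');
$newYork  = $document->find('input[data-city="New York"]');
$textOnly = $document->find('input[type=text]');

// size() reports how many elements were matched.
echo $fields->size(), ' element(s) with class "field"', PHP_EOL;
echo $newYork->attr('name'), PHP_EOL;

// foreach yields plain DOM nodes, so wrap each one with pq() to get
// the phpQuery methods back.
$names = array();
foreach ($textOnly as $node) {
    $names[] = pq($node)->attr('name');
}

echo implode(', ', $names), PHP_EOL;
```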

Document Manipulation

The code snippet below lists some common operations to manipulate a DOM document. As with jQuery, methods can conveniently be chained to provide a fluent interface that reduces the amount of code.

$document = phpQuery::newDocumentHTML('HTML markup');

// Gets the element's value for the "class" attribute
$class = $document->find('#container')->attr('class');

// Sets the element's "class" attribute
$document->find('#container')->attr('class', 'my-container');

// Removes the "class" attribute
$document->find('#container')->removeAttr('class');

// Adds "my-container" to the element's "class" attribute
$document->find('#container')->addClass('my-container');

// Removes "my-container" from the element's "class" attribute
$document->find('#container')->removeClass('my-container');

// Gets the HTML content of the element
$html = $document->find('#container')->html();

// Sets the HTML content of the element
$document->find('#container')->html('HTML markup');

// Gets the text content of the element
// This function is similar to PHP's strip_tags function
$text = $document->find('#container')->text();

// Sets the element's text
$document->find('#container')->text('some text');

// Appends content to the end of the element
$document->find('#container')->append('some content');

// Prepends content to the beginning of the element
$document->find('#container')->prepend('some content');

// Gets the value of the "value" attribute
$username = $document->find('#username-textbox')->val();

// Sets the value of the "value" attribute
$document->find('#username-textbox')->val('some value');

// Removes all child nodes
$document->find('#container')->empty();
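
The operations above are shown one at a time, so here is a small sketch of the chaining mentioned earlier. The markup, class name, title, and include path are all invented for illustration, and the same assumptions about the installed library apply.

```php
<?php
// Assumed location of the library; adjust to your installation.
require_once 'phpQuery/phpQuery.php';

$document = phpQuery::newDocumentHTML('<div id="container" class="box">Hello</div>');

// Each of these methods returns the phpQuery object again,
// so several manipulations can be written as one chain.
$document->find('#container')
    ->addClass('highlight')
    ->attr('title', 'Greeting')
    ->append('<p>Appended paragraph</p>');

// Query the document again to inspect the result.
$container = $document->find('#container');

echo $container->attr('title'), PHP_EOL;
echo $container->find('p')->text(), PHP_EOL;
echo $container->htmlOuter(), PHP_EOL;
```

The chain replaces three separate find calls on the same element, which is where most of the brevity compared to DOMDocument comes from.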

Conclusion

There are many options for parsing and manipulating DOM documents in PHP. This article focused on the phpQuery project for two reasons. First, it lets developers reuse their experience with jQuery, so they do not have to learn an entirely new library. Second, the syntax is simple and intuitive, which makes it possible to express fairly complex logic in one or a few lines of code.

It should be noted, however, that I have not yet tried most of the other libraries and extensions that are available, so if you have other recommendations, you are more than welcome to leave a comment. It also seems that the phpQuery project is no longer maintained, as the last update was a few years ago. Whether this is simply because it is stable and functional, I do not know. It has worked well for my use cases, but the lack of maintenance is something to weigh against your own requirements.

There is more to phpQuery than was discussed in this article. For a complete overview of supported functionality, please refer to the official project page and the jQuery API.

Bo Andersen

About the Author

I am a back-end web developer with a passion for open source technologies. I have been a PHP developer for many years, and also have experience with Java and Spring Framework. I currently work full time as a lead developer. Apart from that, I also spend time on making online courses, so be sure to check those out!

2 comments on »Manipulating DOM Documents with phpQuery«

  1. Ricardo Schroeder

    The composer package is offline. :(

  2. Thank you very much! You saved my day.
