🧬 PHP XML Processing

🧾 PHP DOM Parser Example – Read, Modify & Traverse HTML/XML

The PHP DOM parser (the DOM extension, bundled with PHP and enabled by default in most builds) represents an HTML or XML document as a tree of nodes. Through that tree you can read, edit, insert, and delete elements, which makes it a practical tool for parsing structured content such as HTML or XML.


📘 Introduction to DOM in PHP

The DOMDocument class in PHP provides methods to navigate and manipulate a DOM tree structure. It’s commonly used for parsing:

  • HTML Documents
  • XML Feeds (like RSS)
  • Structured web scraping

Key Class: DOMDocument

$dom = new DOMDocument();

🔧 Loading HTML with DOMDocument

To begin using the DOM parser, you must load an HTML string or file.

$html = '<html><body><h1>Hello World</h1></body></html>';
$dom = new DOMDocument();
@$dom->loadHTML($html);  // Suppress warnings with @

✅ loadHTML() parses HTML content
✅ @ suppresses the warnings raised for malformed HTML, but it also hides any other error from that call
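
As an alternative to @, libxml can buffer its parse errors so that you can inspect or discard them explicitly. The following is a minimal sketch; the helper name loadHtmlQuietly() is hypothetical and not part of PHP, and the messages it collects depend on the installed libxml2 version.

```php
<?php
// Hypothetical helper (not a PHP built-in): parse HTML while collecting
// libxml's messages instead of letting them surface as PHP warnings.
function loadHtmlQuietly(string $html): array
{
    // Buffer libxml errors internally and remember the previous setting.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Turn each buffered LibXMLError into a readable string.
    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
    }

    // Empty the buffer and restore the caller's error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$dom, $messages];
}

[$dom, $messages] = loadHtmlQuietly('<html><body><h1>Hello World</h1><p>Unclosed paragraph</body></html>');

echo $dom->getElementsByTagName('h1')->item(0)->nodeValue, PHP_EOL;
echo count($messages), ' parser message(s) collected', PHP_EOL;
```

Compared with @, this keeps unrelated errors visible and lets you log the parser's complaints when you need them.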


📂 Accessing Elements by Tag Name

You can retrieve specific elements using their tag names like div, h1, a, etc.

$elements = $dom->getElementsByTagName('h1');
foreach ($elements as $element) {
    echo $element->nodeValue;  // Output: Hello World
}
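
The same loop can read attributes as well as text. As a small sketch with made-up sample markup, this collects the href and link text of every anchor that has an href attribute:

```php
<?php
// Sample markup invented for this illustration.
$html = '<html><body>
  <a href="/docs">Docs</a>
  <a href="/blog" title="News">Blog</a>
  <a>No link</a>
</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$links = [];
foreach ($dom->getElementsByTagName('a') as $a) {
    // Skip anchors without an href attribute.
    if ($a->hasAttribute('href')) {
        $links[$a->getAttribute('href')] = trim($a->textContent);
    }
}

print_r($links);
```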

Accessing Elements by ID or Class

DOMDocument provides getElementById() for ID lookups but has no getElementsByClassName() method, so use DOMXPath to select elements by class.

$xpath = new DOMXPath($dom);
$elements = $xpath->query("//*[@class='highlight']");
foreach ($elements as $el) {
    echo $el->nodeValue;
}

🧠 XPath selects nodes with expressions such as //*[@id='foo'] or //div[@class='bar']. Note that @class='bar' matches only when the attribute value is exactly "bar".
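
Elements often carry several classes, in which case an exact @class comparison misses them. The sketch below, using invented sample markup, contrasts the exact match with a common XPath 1.0 idiom that matches a single class token, and also shows an ID lookup:

```php
<?php
// Sample markup invented for this illustration.
$html = '<html><body>
  <div id="foo">Intro</div>
  <p class="note highlight">First</p>
  <p class="highlight">Second</p>
  <p class="highlighted">Third</p>
</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Exact attribute match: only class="highlight" qualifies.
$exact = $xpath->query("//*[@class='highlight']");

// Token match: pad the class list with spaces, then look for " highlight ".
$token = $xpath->query(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' highlight ')]"
);

// ID lookup through XPath.
$byId = $xpath->query("//*[@id='foo']")->item(0);

echo $exact->length, PHP_EOL;      // 1 (Second)
echo $token->length, PHP_EOL;      // 2 (First and Second, not Third)
echo $byId->textContent, PHP_EOL;  // Intro
```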


✏️ Modifying DOM Elements

Update the content or attributes of a node easily:

$h1 = $dom->getElementsByTagName('h1')->item(0);
$h1->nodeValue = "Updated Heading";

$h1->setAttribute("style", "color:red;");

🧱 Creating and Appending Elements

You can dynamically create and add elements to your HTML structure.

$body = $dom->getElementsByTagName('body')->item(0);
$newDiv = $dom->createElement("div", "This is a new div");
$newDiv->setAttribute("class", "custom");

$body->appendChild($newDiv);
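
One caveat: the second argument of createElement() is not escaped, so text containing characters such as & can trigger a warning or be cut short. A cautious sketch, assuming the text may come from an untrusted or arbitrary source, is to create the element empty and attach the content as a text node:

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body></body></html>');
$body = $dom->getElementsByTagName('body')->item(0);

// Text with characters that have special meaning in markup.
$text = 'Fish & Chips <fresh>';

// Create the element without a value, then add the text as a text node.
// The text node stores the string literally; escaping happens on output.
$p = $dom->createElement('p');
$p->appendChild($dom->createTextNode($text));
$p->setAttribute('class', 'custom');
$body->appendChild($p);

// Serialize just the new element.
$out = $dom->saveHTML($p);
echo $out, PHP_EOL;
```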

🗑️ Removing Elements

To remove a node:

$element = $dom->getElementsByTagName('h1')->item(0);
$element->parentNode->removeChild($element);
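
Removing several nodes needs a little care: the list returned by getElementsByTagName() is live, so deleting nodes while iterating over it directly can skip elements. A minimal sketch that copies the list into an array first:

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>One</p><p>Two</p><p>Three</p><h1>Keep me</h1></body></html>');

// Snapshot the live node list so removals do not disturb the iteration.
$paragraphs = iterator_to_array($dom->getElementsByTagName('p'));

foreach ($paragraphs as $p) {
    $p->parentNode->removeChild($p);
}

echo $dom->getElementsByTagName('p')->length, PHP_EOL;  // 0
echo $dom->getElementsByTagName('h1')->length, PHP_EOL; // 1
```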

🧪 Full Working Example

<?php
$html = '
<html>
  <body>
    <h1 id="main-title">Hello World</h1>
    <p class="desc">This is a paragraph.</p>
  </body>
</html>';

$dom = new DOMDocument();
@$dom->loadHTML($html);

// Modify the H1
$h1 = $dom->getElementById('main-title');
$h1->nodeValue = "Welcome to PHP DOM Parser";

// Add a new paragraph
$body = $dom->getElementsByTagName('body')->item(0);
$newP = $dom->createElement('p', 'Added with DOM!');
$body->appendChild($newP);

// Remove existing paragraph
$xpath = new DOMXPath($dom);
$para = $xpath->query("//p[@class='desc']")->item(0);
$para->parentNode->removeChild($para);

// Output modified HTML
echo $dom->saveHTML();
?>

Output (whitespace tidied for readability; by default saveHTML() also emits a DOCTYPE declaration first):

<html>
  <body>
    <h1 id="main-title">Welcome to PHP DOM Parser</h1>
    <p>Added with DOM!</p>
  </body>
</html>

✅ Summary

  • The DOMDocument class is a robust tool for parsing and manipulating HTML/XML.
  • You can read, modify, and traverse HTML elements with native PHP.
  • Use XPath to select nodes by class, ID, or custom attributes.
  • Ideal for tasks like web scraping, HTML transformation, and data extraction.

โ“ FAQs

Q1. Can PHP DOM parse broken HTML?
Yes, DOMDocument::loadHTML() is tolerant of malformed HTML and tries to correct it.

Q2. What’s the difference between loadHTML and loadXML?
loadHTML() is used for HTML content, while loadXML() is stricter and used for well-formed XML.

Q3. Can DOMDocument edit an HTML file directly?
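Not in place, but the round trip is short: load the file with loadHTMLFile(), modify the tree, then write it back with saveHTMLFile() (or save() for XML documents). saveHTML() only returns the markup as a string.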

Q4. Is it better than regex for parsing HTML?
In most cases, yes. DOM works on the parsed document structure, so nesting, attribute order, and quoting do not matter, whereas regex patterns tend to break on such variations.
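
To illustrate the loadXML() path mentioned in Q2, here is a small sketch that reads item titles from an RSS-style feed. The feed content is invented for the example, and a real feed would normally be fetched from a URL or file first.

```php
<?php
// Invented sample feed; it must be well-formed or loadXML() returns false.
$xml = <<<XML
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title><link>https://example.com/2</link></item>
  </channel>
</rss>
XML;

$dom = new DOMDocument();
if (!$dom->loadXML($xml)) {
    throw new RuntimeException('Feed is not well-formed XML');
}

$titles = [];
foreach ($dom->getElementsByTagName('item') as $item) {
    // Read the first <title> child of each <item>.
    $titles[] = $item->getElementsByTagName('title')->item(0)->textContent;
}

print_r($titles);
```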

