This tutorial covers the Python XML Parser: the standard xml package, which can parse XML files in Python and write data to them.
XML stands for Extensible Markup Language and, like HTML, it is a markup language. Unlike HTML, however, XML has no predefined tags; we define our own custom tags based on the data we are storing in the XML file.
An XML file is often used to share, store, and structure data because it can easily be transferred between servers and systems.
Python is a popular language for processing and parsing data. Conveniently, it comes with a standard xml package that can parse XML files and also write data to an XML file. In this tutorial, we refer to it as the Python XML Parser.
In this Python tutorial, we will walk through the Python XML minidom and ElementTree modules and learn how to parse an XML file in Python.
 
Python XML minidom and ElementTree Modules
 
Among the modules in Python's standard xml package, this tutorial uses two to parse an XML file: minidom and ElementTree. The minidom (Minimal DOM) module provides a DOM (Document Object Model) like structure for the parsed XML file, which is similar to the DOM structure used in JavaScript.
Although we can parse an XML document using minidom, ElementTree provides a more Pythonic way to parse an XML file in Python.
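To make the comparison concrete, here is a minimal sketch that reads the same value with both modules. The inline XML_TEXT string is an assumption for illustration (a shortened version of the tutorial's data), so the snippet runs without any file on disk.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET

# Hypothetical inline sample, shortened from the tutorial's demo data
XML_TEXT = "<item><record><name>Jameson</name></record></item>"

# minidom: find the element node, step to its text node, then read .data
dom = minidom.parseString(XML_TEXT)
dom_name = dom.getElementsByTagName("name")[0].firstChild.data

# ElementTree: locate the element by path and read its .text directly
root = ET.fromstring(XML_TEXT)
et_name = root.find("record/name").text

print(dom_name, et_name)
```

Both approaches return the same string; the ElementTree version simply needs fewer steps to reach the text.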
XML File
For all the examples in this tutorial, we will be using the demo.xml file, which contains the following XML data:
<item>
    <record>
        <name>Jameson</name>
        <phone>(080) 78168241</phone>   
        <email>cursus.in.hendrerit@ipsumdolor.edu</email>
        <country>South Africa</country>
    </record>
    <record>
        <name>Colton</name>
        <phone>(026) 53458662</phone>
        <email>non@idmagna.ca</email>
        <country>Libya</country>
    </record>
    <record>
        <name>Dillon</name>
        <phone>(051) 96790901</phone>
        <email>Aliquam.ornare@Etiamlaoreetlibero.ca</email>
        <country>Madagascar</country>
    </record>
  
    <record>
        <name>Channing</name>
        <phone>(014) 98829753</phone>
        <email>faucibus.Morbi.vehicula@aliquamarcu.co.uk</email>
        <country>Korea, South</country>
    </record>
</item>
In the above example, you can see that the data is nested under custom <tags>. The root tag is <item>, which has <record> as a nested tag, which further has 4 more nested tags:
- <name>,
- <phone>,
- <email>, and
- <country>.
Parse/Read XML Document in Python using minidom
 
minidom is part of Python's standard xml package (xml.dom.minidom), which means you do not have to pip install anything to use it. The minidom module parses the XML document into a Document Object Model (DOM), whose data can then be extracted using the getElementsByTagName() method.
 
Syntax: To parse the XML document in Python using minidom
from xml.dom import minidom
minidom.parse("filename")
Example: Let's grab all the names and phone data from our demo.xml file.
from xml.dom import minidom
#parse xml file
file = minidom.parse('demo.xml')
#grab all <record> tags
records = file.getElementsByTagName("record")
print("Name------>Phone")
for record in records:
    #access <name> and <phone> node of every record
    name = record.getElementsByTagName("name")
    phone = record.getElementsByTagName("phone")
    
    #access data of name and phone
    print(name[0].firstChild.data, end="----->")
    print(phone[0].firstChild.data)
Output
Name------>Phone
Jameson----->(080) 78168241
Colton----->(026) 53458662
Dillon----->(051) 96790901
Channing----->(014) 98829753
In the above example, we first imported the minidom module using the from xml.dom import minidom statement. Then we parsed our demo.xml file with the file = minidom.parse('demo.xml') statement. The parse() function parses the XML document into a Document object whose root element is <item>.
Note: Our Python script and the demo.xml file are located in the same directory, and we run the script from that directory, which is why we only specify the file name demo.xml in the minidom.parse() function. If your Python script and XML file are in different locations, you have to specify the absolute or relative path of the file.
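When the XML file lives in another folder, one option is to build the path explicitly. The sketch below is only an illustration: parse_xml_in is a hypothetical helper name, and the temporary directory stands in for wherever your file actually is.

```python
import tempfile
from pathlib import Path
from xml.dom import minidom

def parse_xml_in(directory, filename="demo.xml"):
    """Hypothetical helper: parse an XML file located in the given directory."""
    xml_path = Path(directory) / filename
    # minidom.parse() accepts a file name (or an open file object)
    return minidom.parse(str(xml_path))

# Demonstration with a throwaway directory so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "demo.xml").write_text("<item><record/></item>", encoding="utf-8")
    document = parse_xml_in(tmp)
    root_tag = document.documentElement.tagName

print(root_tag)
```

Inside a real script, passing Path(__file__).resolve().parent as the directory makes the lookup relative to the script's own folder instead of the current working directory.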
After parsing the XML file in our Python program, we accessed all the <record> nodes using the records = file.getElementsByTagName("record") statement. getElementsByTagName() is a method of minidom Document and Element objects; it returns a list (a NodeList) of all the nodes with the specified tag.
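As a small sketch of what getElementsByTagName() gives back, the snippet below counts the matches and shows that a tag with no matches yields an empty list rather than an error. The inline XML_TEXT sample and the "address" tag are assumptions for illustration.

```python
from xml.dom import minidom

# Hypothetical inline sample with two records
XML_TEXT = (
    "<item>"
    "<record><name>Jameson</name></record>"
    "<record><name>Colton</name></record>"
    "</item>"
)

document = minidom.parseString(XML_TEXT)

# The result behaves like a list: it has a length and supports indexing
records = document.getElementsByTagName("record")
record_count = len(records)
first_name = records[0].getElementsByTagName("name")[0].firstChild.data

# A tag that does not occur in the document gives an empty result
missing = document.getElementsByTagName("address")

print(record_count, first_name, len(missing))
```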
Once we had all the record nodes, we looped through them and, again using the getElementsByTagName() method, accessed the nested <name> and <phone> nodes of each record.
Next, after accessing the individual name and phone nodes, we printed their data using the name[0].firstChild.data and phone[0].firstChild.data expressions. Here, firstChild is the first child of the element, which in our file is the text node inside the tag, and the data attribute of that text node holds its text.
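One caveat with firstChild.data is that an empty or missing tag has no text node to read. The sketch below shows one possible way to guard against that; element_text is a hypothetical helper, and the sample record with an empty <phone> tag is an assumption, not part of demo.xml.

```python
from xml.dom import minidom

# Hypothetical record: <phone> is empty and there is no <fax> tag at all
XML_TEXT = "<record><name>Jameson</name><phone></phone></record>"

def element_text(parent, tag, default=""):
    """Return the text of the first <tag> under parent, or default if unavailable."""
    nodes = parent.getElementsByTagName(tag)
    if not nodes:
        return default  # the tag does not exist
    first = nodes[0].firstChild
    if first is None or first.nodeType != first.TEXT_NODE:
        return default  # the tag is empty or does not start with text
    return first.data

record = minidom.parseString(XML_TEXT).documentElement
name = element_text(record, "name")
phone = element_text(record, "phone", default="n/a")
fax = element_text(record, "fax", default="n/a")

print(name, phone, fax)
```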
Parse/Read XML Document in Python Using ElementTree
The ElementTree module provides a simple and straightforward way to parse and read XML files in Python. Just as minidom is a submodule of xml.dom, ElementTree is a submodule of xml.etree. The ElementTree module parses the XML file into a tree-like structure where the root branch is the first <tag> of the XML file (<item> in our case).
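The tree-like structure can be inspected directly, since each element exposes its tag and can be iterated to reach its children. This is a minimal sketch on a hypothetical inline sample (XML_TEXT) rather than the demo.xml file.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline sample with a single record
XML_TEXT = (
    "<item><record>"
    "<name>Jameson</name><phone>(080) 78168241</phone>"
    "</record></item>"
)

root = ET.fromstring(XML_TEXT)

root_tag = root.tag                             # tag of the root branch
child_tags = [child.tag for child in root]      # direct children of the root
field_tags = [field.tag for field in root[0]]   # children of the first record

print(root_tag, child_tags, field_tags)
```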
Syntax: To parse the XML document in Python using ElementTree
import xml.etree.ElementTree as ET
ET.parse('file_name.xml')
Example
Using minidom, we grabbed the name and phone data. Now let's access the email and country data using XML ElementTree.
import xml.etree.ElementTree as ET
tree = ET.parse('demo.xml')
#get root branch <item>
item = tree.getroot()
#loop through all <record> of <item>
for record in item.findall("record"):
    email = record.find("email").text
    country = record.find("country").text
    print(f"Email: {email},-------->Country:{country}")
Output
Email: cursus.in.hendrerit@ipsumdolor.edu,-------->Country:South Africa
Email: non@idmagna.ca,-------->Country:Libya
Email: Aliquam.ornare@Etiamlaoreetlibero.ca,-------->Country:Madagascar
Email: faucibus.Morbi.vehicula@aliquamarcu.co.uk,-------->Country:Korea, South
From the above example, you can see that ElementTree provides a more elegant and Pythonic way to read or parse an XML file in Python.
In our first statement, import xml.etree.ElementTree as ET, we imported ElementTree under the alias ET. Then, using the tree = ET.parse('demo.xml') statement, we parsed the demo.xml file.
With the help of the item = tree.getroot() statement, we access the root branch of our XML file, which is <item>. Then we loop through every <record> branch with the item.findall("record") statement and grab the email and country data with the record.find("email").text and record.find("country").text statements.
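Note that find() returns None when a tag is absent, so .text would fail on such a record. A possible alternative is findtext(), which accepts a default value. The sketch below uses a hypothetical inline sample in which the second record has no <country> tag; the "unknown" placeholder is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second record is missing its <country> tag
XML_TEXT = """<item>
    <record><email>non@idmagna.ca</email><country>Libya</country></record>
    <record><email>a@example.com</email></record>
</item>"""

root = ET.fromstring(XML_TEXT)

rows = []
for record in root.findall("record"):
    # findtext() returns the element's text, or the default if the tag is absent
    email = record.findtext("email", default="unknown")
    country = record.findtext("country", default="unknown")
    rows.append((email, country))

print(rows)
```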
Check out the Official documentation of the XML ElementTree module to know more about ElementTree and its functions.
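This tutorial focuses on reading, but the introduction mentions that the standard library can also write XML. As a minimal sketch (the record values are taken from the demo data; everything else is illustrative), ElementTree can build elements in memory and serialize them:

```python
import xml.etree.ElementTree as ET

# Build a small tree in memory: <item> -> <record> -> <name>, <country>
root = ET.Element("item")
record = ET.SubElement(root, "record")
ET.SubElement(record, "name").text = "Jameson"
ET.SubElement(record, "country").text = "South Africa"

# Serialize the tree to a string; encoding="unicode" returns str, not bytes
xml_text = ET.tostring(root, encoding="unicode")

print(xml_text)
```

To save the result to disk instead, ET.ElementTree(root).write("output.xml") writes the same tree to a file, where output.xml is a hypothetical file name.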
Conclusion
That sums up this tutorial on Python XML Parser. As you can see, Python provides a built-in standard xml package to read and parse XML files. In this tutorial, we used two of its modules that can parse an XML file:
- minidom and
- ElementTree.
The minidom module follows the Document Object Model approach to parse an XML file. On the other hand, the ElementTree module follows a tree-like structure to parse the XML file.