{"id":129698,"date":"2022-11-12T14:05:43","date_gmt":"2022-11-12T14:05:43","guid":{"rendered":"https:\/\/blog.finxter.com\/?p=883225"},"modified":"2022-11-12T14:05:43","modified_gmt":"2022-11-12T14:05:43","slug":"parsing-xml-files-in-python-4-simple-ways","status":"publish","type":"post","link":"https:\/\/sickgaming.net\/blog\/2022\/11\/12\/parsing-xml-files-in-python-4-simple-ways\/","title":{"rendered":"Parsing XML Files in Python \u2013 4 Simple Ways"},"content":{"rendered":"\n<h2 class=\"wp-embed-aspect-16-9 wp-has-aspect-ratio\">Problem Formulation and Solution Overview<\/h2>\n<p class=\"wp-embed-aspect-16-9 wp-has-aspect-ratio\">This article will show you various ways to work with an <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> file.<\/p>\n<p class=\"has-base-background-color has-background\"><img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/2139.png\" alt=\"\u2139\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> is an acronym 
for E<strong>x<\/strong>tensible <strong>M<\/strong>arkup <strong>L<\/strong>anguage. This file type is similar to HTML. However, <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> does not have pre-defined tags like HTML. Instead, a coder can define their own tags to meet specific requirements. <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> is a great way to transmit and share data, either locally or via the internet. 
If structured correctly, this file can be parsed according to the <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> standard.<\/p>\n<p>To make it more interesting, we have the following running scenario:<\/p>\n<p class=\"wp-embed-aspect-16-9 wp-has-aspect-ratio\">Jan, a Bookstore Owner, wants to know the top three (3) selling Books in her store. This data is currently saved in <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> format. 
<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity wp-embed-aspect-16-9 wp-has-aspect-ratio\"\/>\n<p class=\"wp-embed-aspect-16-9 wp-has-aspect-ratio has-global-color-8-background-color has-background\"><img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f4ac.png\" alt=\"\ud83d\udcac\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <strong>Question<\/strong>: How would we write code to read in and extract data from an <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\">XML<\/a> file into a Python script<em>?<\/em><\/p>\n<p class=\"wp-embed-aspect-16-9 wp-has-aspect-ratio\">We can accomplish this by using one of the following methods:<\/p>\n<ul>\n<li><strong>Method 1<\/strong>: Use <a rel=\"noreferrer noopener\" href=\"https:\/\/pypi.org\/project\/xmltodict\/\" data-type=\"URL\" data-id=\"https:\/\/pypi.org\/project\/xmltodict\/\" target=\"_blank\"><code>xmltodict.parse()<\/code><\/a> <\/li>\n<li><strong>Method 2<\/strong>: Use <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.python.org\/3\/library\/xml.dom.minidom.html\" target=\"_blank\"><code>minidom.parse()<\/code><\/a><\/li>\n<li><strong>Method 3<\/strong>: Use <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.python.org\/3\/library\/xml.etree.elementtree.html\" data-type=\"URL\" data-id=\"https:\/\/docs.python.org\/3\/library\/xml.etree.elementtree.html\" target=\"_blank\"><code>etree<\/code><\/a><\/li>\n<li><strong>Method 4:<\/strong> Use <a rel=\"noreferrer noopener\" href=\"https:\/\/pypi.org\/project\/untangle\/\" data-type=\"URL\" data-id=\"https:\/\/pypi.org\/project\/untangle\/\" target=\"_blank\"><code>untangle.parse()<\/code><\/a><\/li>\n<\/ul>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<h2>Method 1: Use xmltodict.parse()<\/h2>\n<p 
class=\"has-global-color-8-background-color has-background\">This method uses the <a rel=\"noreferrer noopener\" href=\"https:\/\/pypi.org\/project\/xmltodict\/\" data-type=\"URL\" data-id=\"https:\/\/pypi.org\/project\/xmltodict\/\" target=\"_blank\"><code>xmltodict.parse()<\/code><\/a> function to read an <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> file, convert it to a <a rel=\"noreferrer noopener\" href=\"https:\/\/blog.finxter.com\/python-dictionary\/\" data-type=\"URL\" data-id=\"https:\/\/blog.finxter.com\/python-dictionary\/\" target=\"_blank\"><code>Dictionary<\/code><\/a>, and extract the data.<\/p>\n<p>In the current working directory, create an <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> file called <code>books.xml<\/code>. Copy and paste the code snippet below into this file and save it.<\/p>\n<pre class=\"wp-block-preformatted\"><code>&lt;bookstore&gt;\n    &lt;book&gt;\n        &lt;title&gt;Surrender&lt;\/title&gt;\n        &lt;author&gt;Bono&lt;\/author&gt;\n        &lt;sales&gt;21987&lt;\/sales&gt;\n    &lt;\/book&gt;\n    &lt;book&gt;\n        &lt;title&gt;Going Rogue&lt;\/title&gt;\n        &lt;author&gt;Janet Evanovich&lt;\/author&gt;\n        &lt;sales&gt;15986&lt;\/sales&gt;\n    &lt;\/book&gt;\n    &lt;book&gt;\n        &lt;title&gt;Triple Cross&lt;\/title&gt;\n        &lt;author&gt;James Patterson&lt;\/author&gt;\n        &lt;sales&gt;11311&lt;\/sales&gt;\n    &lt;\/book&gt;\n&lt;\/bookstore&gt;<\/code><\/pre>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<p>In the current working directory, create a Python file called <code>books.py<\/code>. Copy and paste the code snippet below into this file and save it. 
This code reads in and parses the above <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> file. If necessary, <a rel=\"noreferrer noopener\" href=\"https:\/\/blog.finxter.com\/how-to-install-xmltodict-in-python\/\" data-type=\"URL\" data-id=\"https:\/\/blog.finxter.com\/how-to-install-xmltodict-in-python\/\" target=\"_blank\">install<\/a> the <a rel=\"noreferrer noopener\" href=\"https:\/\/pypi.org\/project\/xmltodict\/\" data-type=\"URL\" data-id=\"https:\/\/pypi.org\/project\/xmltodict\/\" target=\"_blank\"><code>xmltodict<\/code><\/a> library.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"3-5, 7-10\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import xmltodict\n\nwith open('books.xml', 'r') as fp:\n    # The with statement closes the file automatically when the block ends\n    books_dict = xmltodict.parse(fp.read())\n\nfor i in books_dict:\n    for j in books_dict[i]:\n        for k in books_dict[i][j]:\n            print(f'Title: {k[\"title\"]} \\t Sales: {k[\"sales\"]}')<\/pre>\n<p>The first line in the above code snippet imports the <a rel=\"noreferrer noopener\" href=\"https:\/\/pypi.org\/project\/xmltodict\/\" data-type=\"URL\" data-id=\"https:\/\/pypi.org\/project\/xmltodict\/\" target=\"_blank\"><code>xmltodict<\/code><\/a> library. This library is needed to access and parse the <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> file.<\/p>\n<p>The following highlighted section opens <code>books.xml<\/code> in read mode (<code>r<\/code>) and saves it as a File Object, <code>fp<\/code>. 
If <code>fp<\/code> were output to the terminal, an object similar to the one below would display.<\/p>\n<pre class=\"wp-block-preformatted\"><code>&lt;_io.TextIOWrapper name='books.xml' mode='r' encoding='cp1252'&gt;<\/code><\/pre>\n<p>Next, the <a href=\"https:\/\/pypi.org\/project\/xmltodict\/\" data-type=\"URL\" data-id=\"https:\/\/pypi.org\/project\/xmltodict\/\"><code>xmltodict.parse()<\/code><\/a> function is called and passed one (1) argument, <a rel=\"noreferrer noopener\" href=\"https:\/\/blog.finxter.com\/5-ways-to-read-a-text-file-from-a-url\/\" data-type=\"URL\" data-id=\"https:\/\/blog.finxter.com\/5-ways-to-read-a-text-file-from-a-url\/\" target=\"_blank\"><code>fp.read()<\/code><\/a>, which reads in the contents of the <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> file as a string for the function to parse. The result is saved to <code>books_dict<\/code> as a <a rel=\"noreferrer noopener\" href=\"https:\/\/blog.finxter.com\/python-dictionary\/\" data-type=\"URL\" data-id=\"https:\/\/blog.finxter.com\/python-dictionary\/\" target=\"_blank\"><code>Dictionary<\/code><\/a>, and the file is closed. 
The contents of <code>books_dict<\/code> are shown below.<\/p>\n<pre class=\"wp-block-preformatted\"><code>{'bookstore': {'book': [{'title': 'Surrender', 'author': 'Bono', 'sales': '21987'}, {'title': 'Going Rogue', 'author': 'Janet Evanovich', 'sales': '15986'}, {'title': 'Triple Cross', 'author': 'James Patterson', 'sales': '11311'}]}}<\/code><\/pre>\n<p>The final highlighted section loops through the above <a rel=\"noreferrer noopener\" href=\"https:\/\/blog.finxter.com\/python-dictionary\/\" data-type=\"URL\" data-id=\"https:\/\/blog.finxter.com\/python-dictionary\/\" target=\"_blank\"><code>Dictionary<\/code><\/a> and extracts each book&#8217;s <code>Title<\/code> and <code>Sales<\/code>.<\/p>\n<pre class=\"wp-block-preformatted\"><code>Title: Surrender Sales: 21987\nTitle: Going Rogue Sales: 15986\nTitle: Triple Cross Sales: 11311<\/code><\/pre>\n<p class=\"has-global-color-8-background-color has-background\"><img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f4a1.png\" alt=\"\ud83d\udca1\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <strong>Note<\/strong>: The <code>\\t<\/code> escape sequence represents a tab character, as produced by the &lt;Tab&gt; key on the keyboard.<\/p>\n<figure class=\"wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube\"><a href=\"https:\/\/blog.finxter.com\/parsing-xml-files-in-python-a-simple-guide\/\"><img decoding=\"async\" src=\"https:\/\/blog.finxter.com\/wp-content\/plugins\/wp-youtube-lyte\/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FqX0qqEVpP5s%2Fhqdefault.jpg\" alt=\"YouTube Video\"><\/a><figcaption><\/figcaption><\/figure>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<h2>Method 2: Use minidom.parse()<\/h2>\n<p class=\"has-global-color-8-background-color has-background\">This method uses the <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.python.org\/3\/library\/xml.dom.minidom.html\" data-type=\"URL\" 
data-id=\"https:\/\/docs.python.org\/3\/library\/xml.dom.minidom.html\" target=\"_blank\"><code>minidom.parse()<\/code><\/a> function to read and parse an <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\">XML<\/a> file. This example extracts the ID, Title and Sales for each book.<\/p>\n<p>This example differs from Method 1 as this <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> file contains an additional line at the top (<code>&lt;?xml version=\"1.0\"?&gt;<\/code>) of the file and each <code>&lt;book&gt;<\/code> tag now has an <code>id<\/code> (attribute) assigned to it. <\/p>\n<p>In the current working directory, create an <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> file called <code>books2.xml<\/code>. 
Copy and paste the code snippet below into this file and save it.<\/p>\n<pre class=\"wp-block-preformatted\"><code>&lt;?xml version=\"1.0\"?&gt;\n&lt;bookstore&gt;\n    &lt;storename&gt;Jan's Best Sellers List&lt;\/storename&gt;\n    &lt;book id=\"21237\"&gt;\n        &lt;title&gt;Surrender&lt;\/title&gt;\n        &lt;author&gt;Bono&lt;\/author&gt;\n        &lt;sales&gt;21987&lt;\/sales&gt;\n    &lt;\/book&gt;\n    &lt;book id=\"21946\"&gt;\n        &lt;title&gt;Going Rogue&lt;\/title&gt;\n        &lt;author&gt;Janet Evanovich&lt;\/author&gt;\n        &lt;sales&gt;15986&lt;\/sales&gt;\n    &lt;\/book&gt;\n    &lt;book id=\"18241\"&gt;\n        &lt;title&gt;Triple Cross&lt;\/title&gt;\n        &lt;author&gt;James Patterson&lt;\/author&gt;\n        &lt;sales&gt;11311&lt;\/sales&gt;\n    &lt;\/book&gt;\n&lt;\/bookstore&gt;<\/code><\/pre>\n<p>In the current working directory, create a Python file called <code>books2.py<\/code>. Copy and paste the code snippet below into this file and save it. <\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"3-5, 7-13\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">from xml.dom import minidom\n\ndoc = minidom.parse('books2.xml')\nname = doc.getElementsByTagName('storename')[0]\nbooks = doc.getElementsByTagName('book')\n\nfor b in books:\n    bid = b.getAttribute('id')\n    title = b.getElementsByTagName('title')[0]\n    sales = b.getElementsByTagName('sales')[0]\n    print(f'{bid} {title.firstChild.data} {sales.firstChild.data}')<\/pre>\n<p>The first line in the above code snippet imports the <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.python.org\/3\/library\/xml.dom.minidom.html\" data-type=\"URL\" data-id=\"https:\/\/docs.python.org\/3\/library\/xml.dom.minidom.html\" target=\"_blank\"><code>minidom<\/code><\/a> library. 
This allows access to various functions to parse the <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> file and retrieve tags and attributes.<\/p>\n<p>The first section of highlighted lines performs the following:<\/p>\n<ul>\n<li>Reads and parses the <code>books2.xml<\/code> file and saves the results to <code>doc<\/code>. This action creates the Object shown as (1) below.<\/li>\n<li>Retrieves the <code>&lt;storename&gt;<\/code> tag and saves the results to <code>name<\/code>. This action creates an Object shown as (2) below.<\/li>\n<li>Retrieves the <code>&lt;book&gt;<\/code> tag for each <code>book<\/code> and saves the results to <code>books<\/code>. This action creates a List of three (3) Objects: one for each book shown as (3) below.<\/li>\n<\/ul>\n<pre class=\"wp-block-preformatted\"><code>(1) &lt;xml.dom.minidom.Document object at 0x0000022D764AFEE0&gt;\n(2) &lt;DOM Element: storename at 0x22d764f0ee0&gt;\n(3) [&lt;DOM Element: book at 0x22d764f3a30&gt;, &lt;DOM Element: book at 0x22d764f3c70&gt;, &lt;DOM Element: book at 0x22d764f3eb0&gt;]<\/code><\/pre>\n<p>The last section of highlighted lines loops through the <code>books<\/code> Object and outputs the results to the terminal.<\/p>\n<pre class=\"wp-block-preformatted\"><code>21237 Surrender 21987\n21946 Going Rogue 15986\n18241 Triple Cross 11311<\/code><\/pre>\n<figure class=\"wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube\"><a href=\"https:\/\/blog.finxter.com\/parsing-xml-files-in-python-a-simple-guide\/\"><img decoding=\"async\" src=\"https:\/\/blog.finxter.com\/wp-content\/plugins\/wp-youtube-lyte\/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2F5MXDZI3jRio%2Fhqdefault.jpg\" alt=\"YouTube Video\"><\/a><figcaption><\/figcaption><\/figure>\n<hr class=\"wp-block-separator 
has-alpha-channel-opacity\"\/>\n<h2>Method 3: Use etree<\/h2>\n<p class=\"has-global-color-8-background-color has-background\">This method uses <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.python.org\/3\/library\/xml.etree.elementtree.html\" data-type=\"URL\" data-id=\"https:\/\/docs.python.org\/3\/library\/xml.etree.elementtree.html\" target=\"_blank\"><code>etree<\/code><\/a> to read in and parse an <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\">XML<\/a> file. This example extracts the Title, Author and Sales data for each book.<\/p>\n<p class=\"has-base-background-color has-background\"><img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/2139.png\" alt=\"\u2139\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> The <code>etree<\/code> package treats the <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\">XML<\/a> file as a tree structure. Each element represents a node of that tree. 
Accessing elements is done on an element level.<\/p>\n<p>This example reads in and parses the <code>books2.xml<\/code> file created earlier.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"3,4, 6-10\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import xml.etree.ElementTree as ET\n\nxml_data = ET.parse('books2.xml')\nroot = xml_data.getroot()\n\nfor books in root.findall('book'):\n    title = books.find('title').text\n    author = books.find('author').text\n    sales = books.find('sales').text\n    print(title, author, sales)<\/pre>\n<p>The first line in the above code snippet imports the <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.python.org\/3\/library\/xml.etree.elementtree.html\" data-type=\"URL\" data-id=\"https:\/\/docs.python.org\/3\/library\/xml.etree.elementtree.html\" target=\"_blank\"><code>etree<\/code><\/a> library. This allows access to all nodes of the <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\">XML<\/a> <code>&lt;tag&gt;<\/code> structure.<\/p>\n<p>The following two highlighted lines read in and parse <code>books2.xml<\/code>, save the results as an <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\">XML<\/a> Object to <code>xml_data<\/code>, and retrieve its root element, <code>root<\/code>. 
If <code>root<\/code> is output to the terminal, an Object similar to the one below displays.<\/p>\n<pre class=\"wp-block-preformatted\"><code>&lt;Element 'bookstore' at 0x000001E45E9442C0&gt;<\/code><\/pre>\n<p>The following highlighted section uses a <a rel=\"noreferrer noopener\" href=\"https:\/\/blog.finxter.com\/python-loops\/\" data-type=\"URL\" data-id=\"https:\/\/blog.finxter.com\/python-loops\/\" target=\"_blank\"><code>for<\/code><\/a> loop to iterate through each <code>&lt;book&gt;<\/code> tag, extracting the text of the <code>&lt;title&gt;<\/code>, <code>&lt;author&gt;<\/code> and <code>&lt;sales&gt;<\/code> tags for each book and outputting it to the terminal.<\/p>\n<pre class=\"wp-block-preformatted\"><code>Surrender Bono 21987\nGoing Rogue Janet Evanovich 15986\nTriple Cross James Patterson 11311<\/code><\/pre>\n<p>To retrieve the attributes of each <code>&lt;book&gt;<\/code> tag, iterate over <code>root.iter('book')<\/code> and output the <code>attrib<\/code> dictionary of each element.<\/p>\n<p>This extracts the <code>id<\/code> attribute from each <code>&lt;book&gt;<\/code> tag and outputs it to the terminal as a Dictionary.<\/p>\n<pre class=\"wp-block-preformatted\"><code>{'id': '21237'}\n{'id': '21946'}\n{'id': '18241'}<\/code><\/pre>\n<p>To extract only the values, run the following code.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">for book in root.iter('book'):\n    vals = book.attrib.values()\n    for v in vals:\n        print(v)<\/pre>\n<pre class=\"wp-block-preformatted\"><code>21237\n21946\n18241<\/code><\/pre>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<h2>Method 4: Use untangle.parse()<\/h2>\n<p class=\"has-global-color-8-background-color has-background\">This method uses <a rel=\"noreferrer noopener\" href=\"https:\/\/pypi.org\/project\/untangle\/\" data-type=\"URL\" data-id=\"https:\/\/pypi.org\/project\/untangle\/\" target=\"_blank\"><code>untangle.parse()<\/code><\/a> to 
parse an <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\">XML<\/a> file.<\/p>\n<p>This example reads in and parses the <code>books3.xml<\/code> file shown below. If necessary, install the <a rel=\"noreferrer noopener\" href=\"https:\/\/pypi.org\/project\/untangle\/\" data-type=\"URL\" data-id=\"https:\/\/pypi.org\/project\/untangle\/\" target=\"_blank\"><code>untangle<\/code><\/a> library.<\/p>\n<p class=\"has-base-background-color has-background\"><img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/2139.png\" alt=\"\u2139\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> The <a rel=\"noreferrer noopener\" href=\"https:\/\/pypi.org\/project\/untangle\/\" data-type=\"URL\" data-id=\"https:\/\/pypi.org\/project\/untangle\/\" target=\"_blank\"><code>untangle<\/code><\/a> library converts an <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\">XML<\/a> file to a Python object. This is a good option when you have a group of items, such as book names.<\/p>\n<p>In the current working directory, create an <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> file called <code>books3.xml<\/code>. Copy and paste the code snippet below into this file and save it. 
<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">&lt;?xml version=\"1.0\"?>\n&lt;root>\n    &lt;book name=\"Surrender\"\/>\n    &lt;book name=\"Going Rogue\"\/>\n    &lt;book name=\"Triple Cross\"\/>\n&lt;\/root><\/pre>\n<p>In the current working directory, create a Python file called <code>books3.py<\/code>. Copy and paste the code snippet below into this file and save it. <\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"3-4,6-7\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import untangle\n\nbook_obj = untangle.parse('books3.xml')\nbooks = ','.join([book['name'] for book in book_obj.root.book])\n\nfor b in books.split(','):\n    print(b)<\/pre>\n<p>The first line in the above code snippet imports the <a rel=\"noreferrer noopener\" href=\"https:\/\/pypi.org\/project\/untangle\/\" data-type=\"URL\" data-id=\"https:\/\/pypi.org\/project\/untangle\/\" target=\"_blank\"><code>untangle<\/code><\/a> library, allowing access to the <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" target=\"_blank\"><code>XML<\/code><\/a> file structure.<\/p>\n<p>The following line reads in and parses the <code>books3.xml<\/code> file. 
The result is saved to <code>book_obj<\/code>. <\/p>\n<p>The next line calls the <a rel=\"noreferrer noopener\" href=\"https:\/\/blog.finxter.com\/python-string-join\/\" data-type=\"URL\" data-id=\"https:\/\/blog.finxter.com\/python-string-join\/\" target=\"_blank\"><code>join()<\/code><\/a> function and passes it one (1) argument: a <a rel=\"noreferrer noopener\" href=\"https:\/\/blog.finxter.com\/list-comprehension\/\" data-type=\"URL\" data-id=\"https:\/\/blog.finxter.com\/list-comprehension\/\" target=\"_blank\">List Comprehension<\/a>. This code retrieves the <code>name<\/code> attribute of each book, joins the names into a single comma-separated string, and saves the result to <code>books<\/code>. If output to the terminal, the following displays:<\/p>\n<pre class=\"wp-block-preformatted\"><code>Surrender,Going Rogue,Triple Cross<\/code><\/pre>\n<p>The final highlighted section uses a <a rel=\"noreferrer noopener\" href=\"https:\/\/blog.finxter.com\/python-loops\/\" data-type=\"URL\" data-id=\"https:\/\/blog.finxter.com\/python-loops\/\" target=\"_blank\"><code>for<\/code><\/a> loop to split this string on the commas, iterate through each book name, and send it to the terminal.<\/p>\n<pre class=\"wp-block-preformatted\"><code>Surrender\nGoing Rogue\nTriple Cross<\/code><\/pre>\n<figure class=\"wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube\"><a href=\"https:\/\/blog.finxter.com\/parsing-xml-files-in-python-a-simple-guide\/\"><img decoding=\"async\" src=\"https:\/\/blog.finxter.com\/wp-content\/plugins\/wp-youtube-lyte\/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FaBC0VhpXkOQ%2Fhqdefault.jpg\" alt=\"YouTube Video\"><\/a><figcaption><\/figcaption><\/figure>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<h2>Summary<\/h2>\n<p>This article has shown four (4) ways to work with <a rel=\"noreferrer noopener\" href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" data-type=\"URL\" data-id=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/XML\/XML_introduction\" 
target=\"_blank\"><code>XML<\/code><\/a> files so that you can select the best fit for your coding requirements.<\/p>\n<p>Good Luck &amp; Happy Coding!<\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<h2>Programmer Humor &#8211; Blockchain<\/h2>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"280\" height=\"394\" src=\"https:\/\/blog.finxter.com\/wp-content\/uploads\/2022\/07\/image-31.png\" alt=\"\" class=\"wp-image-457795\" srcset=\"https:\/\/blog.finxter.com\/wp-content\/uploads\/2022\/07\/image-31.png 280w, https:\/\/blog.finxter.com\/wp-content\/uploads\/2022\/07\/image-31-213x300.png 213w\" sizes=\"auto, (max-width: 280px) 100vw, 280px\" \/><figcaption><em>&#8220;Blockchains are like grappling hooks, in that it&#8217;s extremely cool when you encounter a problem for which they&#8217;re the right solution, but it happens way too rarely in real life.&#8221;<\/em> <strong>source <\/strong> &#8211; <a href=\"https:\/\/imgs.xkcd.com\/comics\/blockchain.png\" data-type=\"URL\" data-id=\"https:\/\/imgs.xkcd.com\/comics\/blockchain.png\" target=\"_blank\" rel=\"noreferrer noopener\">xkcd<\/a><\/figcaption><\/figure>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Problem Formulation and Solution Overview This article will show you various ways to work with an XML file. XML is an acronym for Extensible Markup Language. This file type is similar to HTML. However, XML does not have pre-defined tags like HTML. 
Instead, a coder can define their own tags to [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[857],"tags":[73,468,528],"class_list":["post-129698","post","type-post","status-publish","format-standard","hentry","category-python-tut","tag-programming","tag-python","tag-tutorial"],"_links":{"self":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/129698","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/comments?post=129698"}],"version-history":[{"count":0,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/129698\/revisions"}],"wp:attachment":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/media?parent=129698"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/categories?post=129698"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/tags?post=129698"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}