11-13-2022, 10:49 AM
Parsing XML Files in Python – 4 Simple Ways
<div>
<h2 class="wp-embed-aspect-16-9 wp-has-aspect-ratio">Problem Formulation and Solution Overview</h2>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">This article will show you various ways to work with an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> stands for E<strong>x</strong>tensible <strong>M</strong>arkup <strong>L</strong>anguage. Its syntax resembles HTML. However, <code>XML</code> does not have pre-defined tags like HTML. Instead, a coder can define their own tags to meet specific requirements. <code>XML</code> is a common way to store and share data, either locally or via the internet. Any standard <code>XML</code> parser can read such a file as long as it is well-formed, for example, every opening tag has a matching closing tag.</p>
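<p>To illustrate what "structured correctly" means in practice, here is a minimal sketch that uses the standard library's <code>xml.etree.ElementTree</code> module. The helper name <code>is_well_formed</code> and the two sample strings are assumptions made for this illustration only.</p>

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Every opening tag has a matching closing tag.
good = '<bookstore><book><title>Surrender</title></book></bookstore>'
# The title tag is never closed, so the parser rejects the document.
bad = '<bookstore><book><title>Surrender</book></bookstore>'

print(is_well_formed(good))
print(is_well_formed(bad))
```

<p>A check like this can be a useful first step before handing a file to any of the methods below.</p>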
<p>To make it more interesting, we have the following running scenario:</p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">Jan, a Bookstore Owner, wants to know the top three (3) selling Books in her store. This data is currently saved in an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> format. </p>
<hr class="wp-block-separator has-alpha-channel-opacity wp-embed-aspect-16-9 wp-has-aspect-ratio"/>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4ac.png" alt="💬" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Question</strong>: How would we write code to read in and extract data from an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file into a Python script<em>?</em></p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">We can accomplish this by using one of the following methods:</p>
<ul>
<li><strong>Method 1</strong>: Use <a rel="noreferrer noopener" href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/" target="_blank"><code>xmltodict.parse()</code></a> </li>
<li><strong>Method 2</strong>: Use <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.dom.minidom.html" target="_blank"><code>minidom.parse()</code></a></li>
<li><strong>Method 3</strong>: Use <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.etree.elementtree.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.etree.elementtree.html" target="_blank"><code>etree</code></a></li>
<li><strong>Method 4:</strong> Use <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle.parse()</code></a></li>
</ul>
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Method 1: Use xmltodict.parse()</h2>
<p class="has-global-color-8-background-color has-background">This method uses the <a rel="noreferrer noopener" href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/" target="_blank"><code>xmltodict.parse()</code></a> function to read an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file, convert it to a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank"><code>Dictionary</code></a> and extract the data.</p>
<p>In the current working directory, create an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file called <code>books.xml</code>. Copy and paste the code snippet below into this file and save it.</p>
<pre class="wp-block-preformatted"><code>&lt;bookstore&gt;
  &lt;book&gt;
    &lt;title&gt;Surrender&lt;/title&gt;
    &lt;author&gt;Bono&lt;/author&gt;
    &lt;sales&gt;21987&lt;/sales&gt;
  &lt;/book&gt;
  &lt;book&gt;
    &lt;title&gt;Going Rogue&lt;/title&gt;
    &lt;author&gt;Janet Evanovich&lt;/author&gt;
    &lt;sales&gt;15986&lt;/sales&gt;
  &lt;/book&gt;
  &lt;book&gt;
    &lt;title&gt;Triple Cross&lt;/title&gt;
    &lt;author&gt;James Patterson&lt;/author&gt;
    &lt;sales&gt;11311&lt;/sales&gt;
  &lt;/book&gt;
&lt;/bookstore&gt;</code></pre>
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<p>In the current working directory, create a Python file called <code>books.py</code>. Copy and paste the code snippet below into this file and save it. This code reads in and parses the above <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file. If necessary, <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-install-xmltodict-in-python/" data-type="URL" data-id="https://blog.finxter.com/how-to-install-xmltodict-in-python/" target="_blank">install</a> the <a rel="noreferrer noopener" href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/" target="_blank"><code>xmltodict</code></a> library.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3-5, 7-10" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import xmltodict

# The with block closes the file automatically, so no fp.close() is needed.
with open('books.xml', 'r') as fp:
    books_dict = xmltodict.parse(fp.read())

for i in books_dict:
    for j in books_dict[i]:
        for k in books_dict[i][j]:
            print(f'Title: {k["title"]} \t Sales: {k["sales"]}')</pre>
<p>The first line in the above code snippet imports the <a rel="noreferrer noopener" href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/" target="_blank"><code>xmltodict</code></a> library. This library is needed to access and parse the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file.</p>
<p>The first highlighted section opens <code>books.xml</code> in read mode (<code>r</code>) and saves it as a File Object, <code>fp</code>. If <code>fp</code> were output to the terminal, an object similar to the one below would display. The encoding shown depends on your platform.</p>
<pre class="wp-block-preformatted"><code>&lt;_io.TextIOWrapper name='books.xml' mode='r' encoding='cp1252'&gt;</code></pre>
<p>Next, the <a href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/"><code>xmltodict.parse()</code></a> function is called and passed one (1) argument, <a rel="noreferrer noopener" href="https://blog.finxter.com/5-ways-to-read-a-text-file-from-a-url/" data-type="URL" data-id="https://blog.finxter.com/5-ways-to-read-a-text-file-from-a-url/" target="_blank"><code>fp.read()</code></a>, which returns the contents of the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file as a string. <code>xmltodict.parse()</code> parses that string, and the results save to <code>books_dict</code> as a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank"><code>Dictionary</code></a>. The <code>with</code> statement closes the file automatically when its block ends. The contents of <code>books_dict</code> are shown below.</p>
<pre class="wp-block-preformatted"><code>{'bookstore': {'book': [{'title': 'Surrender', 'author': 'Bono', 'sales': '21987'}, {'title': 'Going Rogue', 'author': 'Janet Evanovich', 'sales': '15986'}, {'title': 'Triple Cross', 'author': 'James Patterson', 'sales': '11311'}]}}</code></pre>
<p>The final highlighted section loops through the above <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank"><code>Dictionary</code></a> and extracts each book’s <code>Title</code> and <code>Sales</code>.</p>
<pre class="wp-block-preformatted"><code>Title: Surrender Sales: 21987
Title: Going Rogue Sales: 15986
Title: Triple Cross Sales: 11311</code></pre>
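<p>Since Jan wants the top three sellers, it may help to sort the parsed data rather than rely on the order of the file. The sketch below works on a literal copy of the Dictionary printed earlier, so it runs without the <code>XML</code> file. The helper name <code>top_sellers</code> is hypothetical and not part of <code>xmltodict</code>.</p>

```python
# Literal copy of the structure that xmltodict.parse() produced for books.xml.
books_dict = {'bookstore': {'book': [
    {'title': 'Surrender', 'author': 'Bono', 'sales': '21987'},
    {'title': 'Going Rogue', 'author': 'Janet Evanovich', 'sales': '15986'},
    {'title': 'Triple Cross', 'author': 'James Patterson', 'sales': '11311'},
]}}


def top_sellers(parsed, count=3):
    """Return the `count` best-selling books, highest sales first."""
    # Index straight into the known keys instead of looping over them.
    books = parsed['bookstore']['book']
    # The sales values are strings, so convert them before comparing.
    return sorted(books, key=lambda b: int(b['sales']), reverse=True)[:count]


for book in top_sellers(books_dict):
    print(f"{book['title']}: {int(book['sales']):,}")
```

<p>Converting <code>sales</code> with <code>int()</code> matters here: comparing the raw strings would order them alphabetically rather than numerically.</p>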
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Note</strong>: The <code>\t</code> escape sequence inserts a tab character, the same character the Tab key on the keyboard produces.</p>
<figure class="wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube"><a href="https://blog.finxter.com/parsing-xml-files-in-python-a-simple-guide/"><img src="https://blog.finxter.com/wp-content/plugins/wp-youtube-lyte/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FqX0qqEVpP5s%2Fhqdefault.jpg" alt="YouTube Video"></a><figcaption></figcaption></figure>
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Method 2: Use minidom.parse()</h2>
<p class="has-global-color-8-background-color has-background">This method uses the <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.dom.minidom.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.dom.minidom.html" target="_blank"><code>minidom.parse()</code></a> function to read and parse an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file. This example extracts the ID, Title and Sales for each book.</p>
<p>This example differs from Method 1: this <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file starts with an XML declaration (<code>&lt;?xml version="1.0"?&gt;</code>), contains a <code>&lt;storename&gt;</code> tag, and each <code>&lt;book&gt;</code> tag now has an <code>id</code> attribute assigned to it. </p>
<p>In the current working directory, create an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file called <code>books2.xml</code>. Copy and paste the code snippet below into this file and save it.</p>
<pre class="wp-block-preformatted"><code>&lt;?xml version="1.0"?&gt;
&lt;bookstore&gt;
  &lt;storename&gt;Jan's Best Sellers List&lt;/storename&gt;
  &lt;book id="21237"&gt;
    &lt;title&gt;Surrender&lt;/title&gt;
    &lt;author&gt;Bono&lt;/author&gt;
    &lt;sales&gt;21987&lt;/sales&gt;
  &lt;/book&gt;
  &lt;book id="21946"&gt;
    &lt;title&gt;Going Rogue&lt;/title&gt;
    &lt;author&gt;Janet Evanovich&lt;/author&gt;
    &lt;sales&gt;15986&lt;/sales&gt;
  &lt;/book&gt;
  &lt;book id="18241"&gt;
    &lt;title&gt;Triple Cross&lt;/title&gt;
    &lt;author&gt;James Patterson&lt;/author&gt;
    &lt;sales&gt;11311&lt;/sales&gt;
  &lt;/book&gt;
&lt;/bookstore&gt;</code></pre>
<p>In the current working directory, create a Python file called <code>books2.py</code>. Copy and paste the code snippet below into this file and save it. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3-5, 7-13" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from xml.dom import minidom

doc = minidom.parse('books2.xml')
name = doc.getElementsByTagName('storename')[0]
books = doc.getElementsByTagName('book')

for b in books:
    # id is an attribute on the book element itself.
    bid = b.getAttribute('id')
    # title and sales are child elements; firstChild is their text node.
    title = b.getElementsByTagName('title')[0]
    sales = b.getElementsByTagName('sales')[0]
    print(f'{bid} {title.firstChild.data} {sales.firstChild.data}')</pre>
<p>The first line in the above code snippet imports the <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.dom.minidom.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.dom.minidom.html" target="_blank"><code>minidom</code></a> library. This allows access to various functions to parse the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file and retrieve tags and attributes.</p>
<p>The first section of highlighted lines performs the following:</p>
<ul>
<li>Reads and parses the <code>books2.xml</code> file and saves the results to <code>doc</code>. This action creates the Object shown as (1) below.</li>
<li>Retrieves the first <code>&lt;storename&gt;</code> tag and saves the results to <code>name</code>. This action creates the Object shown as (2) below.</li>
<li>Retrieves every <code>&lt;book&gt;</code> tag and saves the results to <code>books</code>. This action creates a List of three (3) Objects, one for each book, shown as (3) below.</li>
</ul>
<pre class="wp-block-preformatted"><code>(1) &lt;xml.dom.minidom.Document object at 0x0000022D764AFEE0&gt;
(2) &lt;DOM Element: storename at 0x22d764f0ee0&gt;
(3) [&lt;DOM Element: book at 0x22d764f3a30&gt;, &lt;DOM Element: book at 0x22d764f3c70&gt;, &lt;DOM Element: book at 0x22d764f3eb0&gt;]</code></pre>
<p>The last section of highlighted lines loops through the <code>books</code> Object and outputs the results to the terminal.</p>
<pre class="wp-block-preformatted"><code>21237 Surrender 21987
21946 Going Rogue 15986
18241 Triple Cross 11311</code></pre>
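<p>The script above retrieves <code>name</code> but never reads its text. As a rough sketch, the same <code>firstChild.data</code> pattern can pull out the store name and collect each book into a small dictionary. To keep the example self-contained it parses an inline, shortened copy of <code>books2.xml</code> with <code>minidom.parseString()</code>; the names <code>XML_TEXT</code>, <code>store_name</code> and <code>records</code> are assumptions for this illustration.</p>

```python
from xml.dom import minidom

# Inline, shortened copy of books2.xml so the sketch runs on its own.
XML_TEXT = """<?xml version="1.0"?>
<bookstore>
  <storename>Jan's Best Sellers List</storename>
  <book id="21237"><title>Surrender</title><sales>21987</sales></book>
  <book id="21946"><title>Going Rogue</title><sales>15986</sales></book>
</bookstore>"""

doc = minidom.parseString(XML_TEXT)

# firstChild is the text node inside the storename element; data is its string.
name = doc.getElementsByTagName('storename')[0]
store_name = name.firstChild.data

# Build one dictionary per book; attribute and text values arrive as strings,
# so sales is converted to an integer here.
records = [
    {
        'id': b.getAttribute('id'),
        'title': b.getElementsByTagName('title')[0].firstChild.data,
        'sales': int(b.getElementsByTagName('sales')[0].firstChild.data),
    }
    for b in doc.getElementsByTagName('book')
]

print(store_name)
print(records)
```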
<figure class="wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube"><a href="https://blog.finxter.com/parsing-xml-files-in-python-a-simple-guide/"><img src="https://blog.finxter.com/wp-content/plugins/wp-youtube-lyte/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2F5MXDZI3jRio%2Fhqdefault.jpg" alt="YouTube Video"></a><figcaption></figcaption></figure>
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Method 3: Use etree</h2>
<p class="has-global-color-8-background-color has-background">This method uses <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.etree.elementtree.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.etree.elementtree.html" target="_blank"><code>etree</code></a> to read in and parse an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file. This example extracts the Title, Author and Sales data for each book.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <code>etree</code> treats the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file as a tree structure. Each element is a node of that tree, and you reach the data by navigating from one element to its children.</p>
<p>This example reads in and parses the <code>books2.xml</code> file created earlier.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3,4, 6-10" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import xml.etree.ElementTree as ET

xml_data = ET.parse('books2.xml')
root = xml_data.getroot()

for books in root.findall('book'):
    title = books.find('title').text
    author = books.find('author').text
    sales = books.find('sales').text
    print(title, author, sales)</pre>
<p>The first line in the above code snippet imports the <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.etree.elementtree.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.etree.elementtree.html" target="_blank"><code>etree</code></a> library under the alias <code>ET</code>. This allows access to all nodes of the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> <code>&lt;tag&gt;</code> structure.</p>
<p>The first highlighted section reads in and parses <code>books2.xml</code> and saves the resulting <code>ElementTree</code> Object to <code>xml_data</code>. The call to <code>getroot()</code> then saves the top-level element of the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file to <code>root</code>. If <code>root</code> is output to the terminal, an Object similar to the one below displays.</p>
<pre class="wp-block-preformatted"><code>&lt;Element 'bookstore' at 0x000001E45E9442C0&gt;</code></pre>
<p>The following highlighted section uses a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>for</code></a> loop to iterate through each <code>&lt;book&gt;</code> tag, extracting the text of the <code>&lt;title&gt;</code>, <code>&lt;author&gt;</code> and <code>&lt;sales&gt;</code> tags for each book and outputting it to the terminal.</p>
<pre class="wp-block-preformatted"><code>Surrender Bono 21987
Going Rogue Janet Evanovich 15986
Triple Cross James Patterson 11311</code></pre>
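<p>One thing to be aware of: <code>find()</code> returns <code>None</code> when a child tag is absent, so <code>books.find('sales').text</code> raises an <code>AttributeError</code> for an incomplete book. A possible alternative is <code>findtext()</code> with a default value. The sketch below uses inline data in place of <code>books2.xml</code>; the second book, with its id and title, is invented purely to show the missing-tag case.</p>

```python
import xml.etree.ElementTree as ET

# Inline stand-in for books2.xml. The second book is hypothetical and
# deliberately has no sales element.
XML_TEXT = """<bookstore>
  <book id="21237"><title>Surrender</title><author>Bono</author><sales>21987</sales></book>
  <book id="99999"><title>Untitled Draft</title><author>Unknown</author></book>
</bookstore>"""

root = ET.fromstring(XML_TEXT)

rows = []
for book in root.findall('book'):
    # findtext() returns the default instead of failing when the child is absent.
    title = book.findtext('title', default='(no title)')
    sales = int(book.findtext('sales', default='0'))
    rows.append((title, sales))

print(rows)
```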
<p>To retrieve the attributes of a <code>&lt;book&gt;</code> tag, read the element's <code>attrib</code> property, which returns them as a Dictionary.</p>
<p>Printing <code>attrib</code> for each <code>&lt;book&gt;</code> tag in <code>root</code> outputs its <code>id</code> attribute to the terminal.</p>
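<p>The snippet for this step does not appear in the text, so the following is only a reconstruction that would produce the output shown next. It assumes the same <code>root.iter('book')</code> loop used in the later snippet and parses an inline copy of the <code>&lt;book&gt;</code> tags from <code>books2.xml</code> so that it runs on its own.</p>

```python
import xml.etree.ElementTree as ET

# Inline copy of the book elements from books2.xml (child tags shortened).
XML_TEXT = """<bookstore>
  <book id="21237"><title>Surrender</title></book>
  <book id="21946"><title>Going Rogue</title></book>
  <book id="18241"><title>Triple Cross</title></book>
</bookstore>"""

root = ET.fromstring(XML_TEXT)

# attrib is a dictionary holding every attribute of the element.
attribs = [book.attrib for book in root.iter('book')]
for a in attribs:
    print(a)
```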
<pre class="wp-block-preformatted"><code>{'id': '21237'}
{'id': '21946'}
{'id': '18241'}</code></pre>
<p>To extract the values, run the following code.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Uses root from the snippet above.
for book in root.iter('book'):
    vals = book.attrib.values()
    for v in vals:
        print(v)</pre>
<pre class="wp-block-preformatted"><code>21237
21946
18241</code></pre>
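<p>When only one attribute is needed, the element's <code>get()</code> method may be a shorter route than looping over <code>attrib.values()</code>. This is a small sketch on inline data; the third <code>&lt;book&gt;</code> tag without an <code>id</code> and the <code>'unknown'</code> fallback are assumptions added to show the default value.</p>

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<bookstore>'
    '<book id="21237"/><book id="21946"/><book/>'
    '</bookstore>'
)

# get() returns one attribute by name, or the default when it is missing.
ids = [book.get('id', 'unknown') for book in root.iter('book')]
print(ids)
```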
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Method 4: Use untangle.parse()</h2>
<p class="has-global-color-8-background-color has-background">This method uses <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle.parse()</code></a> to parse an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file.</p>
<p>This example reads in and parses the <code>books3.xml</code> file shown below. If necessary, install the <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle</code></a> library.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> The <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle</code></a> library converts an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file to a Python object. This is a good option when you have a group of items, such as book names.</p>
<p>In the current working directory, create an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file called <code>books3.xml</code>. Copy and paste the code snippet below into this file and save it.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">&lt;?xml version="1.0"?&gt;
&lt;root&gt;
  &lt;book name="Surrender"/&gt;
  &lt;book name="Going Rogue"/&gt;
  &lt;book name="Triple Cross"/&gt;
&lt;/root&gt;</pre>
<p>In the current working directory, create a Python file called <code>books3.py</code>. Copy and paste the code snippet below into this file and save it. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3-4,6-7" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import untangle

book_obj = untangle.parse('books3.xml')
books = ','.join([book['name'] for book in book_obj.root.book])

for b in books.split(','):
    print(b)</pre>
<p>The first line in the above code snippet imports the <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle</code></a> library allowing access to the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file structure.</p>
<p>The following line reads in and parses the <code>books3.xml</code> file. The results save to <code>book_obj</code>. </p>
<p>The next line uses a <a rel="noreferrer noopener" href="https://blog.finxter.com/list-comprehension/" data-type="URL" data-id="https://blog.finxter.com/list-comprehension/" target="_blank">List Comprehension</a> to retrieve the <code>name</code> attribute of each book and passes the resulting List to the <a rel="noreferrer noopener" href="https://blog.finxter.com/python-string-join/" data-type="URL" data-id="https://blog.finxter.com/python-string-join/" target="_blank"><code>join()</code></a> function as its one (1) argument. The names are combined into a single comma-separated string and saved to <code>books</code>. If output to the terminal, the following displays:</p>
<pre class="wp-block-preformatted"><code>Surrender,Going Rogue,Triple Cross</code></pre>
<p>The last section splits that string on the commas, uses a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>for</code></a> loop to iterate through each book name, and sends it to the terminal.</p>
<pre class="wp-block-preformatted"><code>Surrender
Going Rogue
Triple Cross</code></pre>
<figure class="wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube"><a href="https://blog.finxter.com/parsing-xml-files-in-python-a-simple-guide/"><img src="https://blog.finxter.com/wp-content/plugins/wp-youtube-lyte/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FaBC0VhpXkOQ%2Fhqdefault.jpg" alt="YouTube Video"></a><figcaption></figcaption></figure>
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Summary</h2>
<p>This article has shown four (4) ways to parse <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> files in Python so that you can select the best fit for your coding requirements.</p>
<p>Good Luck &amp; Happy Coding!</p>
</div>
https://www.sickgaming.net/blog/2022/11/...mple-ways/
<div>
<h2 class="wp-embed-aspect-16-9 wp-has-aspect-ratio">Problem Formulation and Solution Overview</h2>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">This article will show you various ways to work with an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> stands for E<strong>x</strong>tensible <strong>M</strong>arkup <strong>L</strong>anguage. This file type is similar to HTML. However, <code>XML</code> does not have pre-defined tags like HTML. Instead, a coder can define their own tags to meet specific requirements. <code>XML</code> is a great way to transmit and share data, either locally or via the internet. Any standard <code>XML</code> parser can read the file, provided it is well-formed.</p>
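<p>As a minimal sketch of what "well-formed" means in practice, the snippet below uses the standard library's <code>xml.etree.ElementTree</code> to check two small strings. The helper name <code>is_well_formed</code> and both sample strings are assumptions made for illustration.</p>

```python
import xml.etree.ElementTree as ET

# Well-formed: every opening tag has a matching closing tag.
good_xml = "<bookstore><book><title>Surrender</title></book></bookstore>"

# Malformed: the <title> tag is never closed.
bad_xml = "<bookstore><book><title>Surrender</book></bookstore>"


def is_well_formed(xml_text):
    """Return True if xml_text parses without error, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(good_xml))
print(is_well_formed(bad_xml))
```

<p>A parser rejects the second string because the closing tags do not match the opening tags.</p>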
<p>To make it more interesting, we have the following running scenario:</p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">Jan, a Bookstore Owner, wants to know the top three (3) selling Books in her store. This data is currently saved in an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> format. </p>
<hr class="wp-block-separator has-alpha-channel-opacity wp-embed-aspect-16-9 wp-has-aspect-ratio"/>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4ac.png" alt="💬" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Question</strong>: How would we write code to read in and extract data from an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file into a Python script<em>?</em></p>
<p class="wp-embed-aspect-16-9 wp-has-aspect-ratio">We can accomplish this task using one of the following methods:</p>
<ul>
<li><strong>Method 1</strong>: Use <a rel="noreferrer noopener" href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/" target="_blank"><code>xmltodict()</code></a> </li>
<li><strong>Method 2</strong>: Use <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.dom.minidom.html" target="_blank"><code>minidom.parse()</code></a></li>
<li><strong>Method 3</strong>: Use <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.etree.elementtree.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.etree.elementtree.html" target="_blank"><code>etree</code></a></li>
<li><strong>Method 4:</strong> Use <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle.parse()</code></a></li>
</ul>
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Method 1: Use xmltodict()</h2>
<p class="has-global-color-8-background-color has-background">This method uses the <a rel="noreferrer noopener" href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/" target="_blank"><code>xmltodict</code></a> library to read an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file, convert it to a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank"><code>Dictionary</code></a> and extract the data.</p>
<p>In the current working directory, create an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file called <code>books.xml</code>. Copy and paste the code snippet below into this file and save it.</p>
<pre class="wp-block-preformatted"><code><bookstore>
  <book>
    <title>Surrender</title>
    <author>Bono</author>
    <sales>21987</sales>
  </book>
  <book>
    <title>Going Rogue</title>
    <author>Janet Evanovich</author>
    <sales>15986</sales>
  </book>
  <book>
    <title>Triple Cross</title>
    <author>James Patterson</author>
    <sales>11311</sales>
  </book>
</bookstore></code></pre>
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<p>In the current working directory, create a Python file called <code>books.py</code>. Copy and paste the code snippet below into this file and save it. This code reads in and parses the above <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file. If necessary, <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-install-xmltodict-in-python/" data-type="URL" data-id="https://blog.finxter.com/how-to-install-xmltodict-in-python/" target="_blank">install</a> the <a rel="noreferrer noopener" href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/" target="_blank"><code>xmltodict</code></a> library.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3-5, 7-10" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import xmltodict

with open('books.xml', 'r') as fp:
    books_dict = xmltodict.parse(fp.read())
    fp.close()  # optional: the with block also closes the file on exit

for i in books_dict:                  # top-level key: 'bookstore'
    for j in books_dict[i]:           # nested key: 'book'
        for k in books_dict[i][j]:    # each book dictionary in the list
            print(f'Title: {k["title"]} \t Sales: {k["sales"]}')</pre>
<p>The first line in the above code snippet imports the <a rel="noreferrer noopener" href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/" target="_blank"><code>xmltodict</code></a> library. This library is needed to access and parse the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file.</p>
<p>The following highlighted section opens <code>books.xml</code> in read mode (<code>r</code>) and saves it as a File Object, <code>fp</code>. If <code>fp</code> were output to the terminal, an object similar to the one below would display.</p>
<pre class="wp-block-preformatted"><code><_io.TextIOWrapper name='books.xml' mode='r' encoding='cp1252'></code></pre>
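<p>The <code>cp1252</code> encoding in the output above is most likely the platform default of the machine that produced it, so other systems may report a different value. A minimal sketch of passing the encoding explicitly follows. The temporary file is a stand-in created only so the block runs on its own, and it assumes the XML is stored as UTF-8.</p>

```python
import os
import tempfile

# Stand-in content, trimmed to one book for brevity.
xml_text = "<bookstore><book><title>Surrender</title></book></bookstore>"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "books.xml")

    # Create the stand-in file.
    with open(path, "w", encoding="utf-8") as fp:
        fp.write(xml_text)

    # Passing encoding explicitly avoids relying on the platform default.
    with open(path, "r", encoding="utf-8") as fp:
        file_encoding = fp.encoding
        contents = fp.read()

# The with block has already closed the file at this point.
print(file_encoding, fp.closed)
```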
<p>Next, the <a href="https://pypi.org/project/xmltodict/" data-type="URL" data-id="https://pypi.org/project/xmltodict/"><code>xmltodict.parse()</code></a> function is called and passed one (1) argument, <a rel="noreferrer noopener" href="https://blog.finxter.com/5-ways-to-read-a-text-file-from-a-url/" data-type="URL" data-id="https://blog.finxter.com/5-ways-to-read-a-text-file-from-a-url/" target="_blank"><code>fp.read()</code></a>, which reads in and parses the contents of the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file. The results save to <code>books_dict</code> as a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank"><code>Dictionary</code></a>, and the file is closed. The contents of <code>books_dict</code> are shown below.</p>
<pre class="wp-block-preformatted"><code>{'bookstore': {'book': [{'title': 'Surrender', 'author': 'Bono', 'sales': '21987'}, {'title': 'Going Rogue', 'author': 'Janet Evanovich', 'sales': '15986'}, {'title': 'Triple Cross', 'author': 'James Patterson', 'sales': '11311'}]}}</code></pre>
<p>The final highlighted section loops through the above <a rel="noreferrer noopener" href="https://blog.finxter.com/python-dictionary/" data-type="URL" data-id="https://blog.finxter.com/python-dictionary/" target="_blank"><code>Dictionary</code></a> and extracts each book’s <code>Title</code> and <code>Sales</code>.</p>
<pre class="wp-block-preformatted"><code>Title: Surrender Sales: 21987
Title: Going Rogue Sales: 15986
Title: Triple Cross Sales: 11311</code></pre>
<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Note</strong>: The <code>\t</code> escape sequence inserts a tab character into the output.</p>
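<p>Because the structure of the <code>Dictionary</code> is known, the list of books can also be reached directly by key and sorted to answer Jan's question about the top three sellers. This is a minimal sketch rather than part of the original script. It uses the parsed result shown earlier as a literal, so it runs without <code>books.xml</code> or <code>xmltodict</code>, and the names <code>books</code> and <code>top_three</code> are assumptions.</p>

```python
# Same structure as the books_dict produced by xmltodict.parse().
books_dict = {'bookstore': {'book': [
    {'title': 'Surrender', 'author': 'Bono', 'sales': '21987'},
    {'title': 'Going Rogue', 'author': 'Janet Evanovich', 'sales': '15986'},
    {'title': 'Triple Cross', 'author': 'James Patterson', 'sales': '11311'},
]}}

# Reach the list of books directly instead of looping over every level.
books = books_dict['bookstore']['book']

# Sales values are strings, so convert them to int for a numeric sort.
top_three = sorted(books, key=lambda b: int(b['sales']), reverse=True)[:3]

for book in top_three:
    print(f"Title: {book['title']} \t Sales: {book['sales']}")
```

<p>Converting with <code>int()</code> matters because sorting the raw strings would compare them character by character rather than by value.</p>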
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Method 2: Use minidom.parse()</h2>
<p class="has-global-color-8-background-color has-background">This method uses the <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.dom.minidom.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.dom.minidom.html" target="_blank"><code>minidom.parse()</code></a> function to read and parse an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file. This example extracts the ID, Title and Sales for each book.</p>
<p>This example differs from Method 1 as this <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file contains an additional line at the top (<code><?xml version="1.0"?></code>) of the file and each <code><book></code> tag now has an <code>id</code> (attribute) assigned to it. </p>
<p>In the current working directory, create an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file called <code>books2.xml</code>. Copy and paste the code snippet below into this file and save it.</p>
<pre class="wp-block-preformatted"><code><?xml version="1.0"?>
<bookstore>
  <storename>Jan's Best Sellers List</storename>
  <book id="21237">
    <title>Surrender</title>
    <author>Bono</author>
    <sales>21987</sales>
  </book>
  <book id="21946">
    <title>Going Rogue</title>
    <author>Janet Evanovich</author>
    <sales>15986</sales>
  </book>
  <book id="18241">
    <title>Triple Cross</title>
    <author>James Patterson</author>
    <sales>11311</sales>
  </book>
</bookstore></code></pre>
<p>In the current working directory, create a Python file called <code>books2.py</code>. Copy and paste the code snippet below into this file and save it. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3-5, 7-13" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from xml.dom import minidom

doc = minidom.parse('books2.xml')
name = doc.getElementsByTagName('storename')[0]
books = doc.getElementsByTagName('book')

for b in books:
    bid = b.getAttribute('id')
    title = b.getElementsByTagName('title')[0]
    sales = b.getElementsByTagName('sales')[0]
    print(f'{bid} {title.firstChild.data} {sales.firstChild.data}')</pre>
<p>The first line in the above code snippet imports the <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.dom.minidom.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.dom.minidom.html" target="_blank"><code>minidom</code></a> library. This allows access to various functions to parse the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file and retrieve tags and attributes.</p>
<p>The first section of highlighted lines performs the following:</p>
<ul>
<li>Reads and parses the <code>books2.xml</code> file and saves the results to <code>doc</code>. This action creates the Object shown as (1) below.</li>
<li>Retrieves the <code><storename></code> tag and saves the results to <code>name</code>. This action creates an Object shown as (2) below.</li>
<li>Retrieves the <code><book></code> tag for each <code>book</code> and saves the results to <code>books</code>. This action creates a List of three (3) Objects: one for each book shown as (3) below.</li>
</ul>
<pre class="wp-block-preformatted"><code>(1) <xml.dom.minidom.Document object at 0x0000022D764AFEE0>
(2) <DOM Element: storename at 0x22d764f0ee0>
(3) [<DOM Element: book at 0x22d764f3a30>, <DOM Element: book at 0x22d764f3c70>, <DOM Element: book at 0x22d764f3eb0>]</code></pre>
<p>The last section of highlighted lines loops through <code>books</code> and outputs the ID, Title and Sales of each book to the terminal.</p>
<pre class="wp-block-preformatted"><code>21237 Surrender 21987
21946 Going Rogue 15986
18241 Triple Cross 11311</code></pre>
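<p>The script stores the <code><storename></code> element in <code>name</code> but never prints it. A minimal sketch of reading its text follows. It uses <code>minidom.parseString()</code> on an inline string, trimmed to one book, as a stand-in for <code>books2.xml</code> so the block runs on its own. The names <code>store_name</code> and <code>book_count</code> are assumptions.</p>

```python
from xml.dom import minidom

# Inline stand-in for books2.xml (trimmed to one book).
xml_text = """<?xml version="1.0"?>
<bookstore>
  <storename>Jan's Best Sellers List</storename>
  <book id="21237">
    <title>Surrender</title>
    <author>Bono</author>
    <sales>21987</sales>
  </book>
</bookstore>"""

doc = minidom.parseString(xml_text)

# firstChild is the text node inside <storename>; .data holds its string.
name = doc.getElementsByTagName('storename')[0]
store_name = name.firstChild.data

# getElementsByTagName returns a list-like NodeList, so len() works on it.
book_count = len(doc.getElementsByTagName('book'))

print(store_name, book_count)
```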
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Method 3: Use etree</h2>
<p class="has-global-color-8-background-color has-background">This method uses <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.etree.elementtree.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.etree.elementtree.html" target="_blank"><code>etree</code></a> to read in and parse an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file. This example extracts the Title, Author and Sales data for each book.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> The <code>etree</code> considers the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file as a tree structure. Each element represents a node of said tree. Accessing elements is done on an element level.</p>
<p>This example reads in and parses the <code>books2.xml</code> file created earlier.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3,4, 6-10" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import xml.etree.ElementTree as ET

xml_data = ET.parse('books2.xml')
root = xml_data.getroot()

for books in root.findall('book'):
    title = books.find('title').text
    author = books.find('author').text
    sales = books.find('sales').text
    print(title, author, sales)</pre>
<p>The first line in the above code snippet imports the <a rel="noreferrer noopener" href="https://docs.python.org/3/library/xml.etree.elementtree.html" data-type="URL" data-id="https://docs.python.org/3/library/xml.etree.elementtree.html" target="_blank"><code>ElementTree</code></a> module from <code>xml.etree</code> under the alias <code>ET</code>. This allows access to all nodes of the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> <code><tag></code> structure.</p>
<p>The following two lines read in and parse <code>books2.xml</code> and retrieve its root element. <code>ET.parse()</code> saves the parsed tree to <code>xml_data</code>, and <code>getroot()</code> saves the top-level <code><bookstore></code> element to <code>root</code>. If <code>root</code> is output to the terminal, an Object similar to the one below displays.</p>
<pre class="wp-block-preformatted"><code><Element 'bookstore' at 0x000001E45E9442C0></code></pre>
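<p>To make the tree structure concrete, the sketch below inspects the root element and its direct children. An inline string, trimmed to two books, stands in for <code>books2.xml</code> so the block runs on its own. The names <code>root_tag</code> and <code>child_tags</code> are assumptions.</p>

```python
import xml.etree.ElementTree as ET

# Inline stand-in for books2.xml (trimmed to two books).
xml_text = """<bookstore>
  <storename>Jan's Best Sellers List</storename>
  <book id="21237"><title>Surrender</title></book>
  <book id="21946"><title>Going Rogue</title></book>
</bookstore>"""

root = ET.fromstring(xml_text)

# The root node carries the name of the outermost tag.
root_tag = root.tag

# Iterating over an element yields its direct child elements in order.
child_tags = [child.tag for child in root]

print(root_tag)
print(child_tags)
```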
<p>The following highlighted section uses a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>for</code></a> loop to iterate through each <code><book></code> tag, extracting the <code><title></code>, <code><author></code> and <code><sales></code> tags for each book and outputting them to the terminal.</p>
<pre class="wp-block-preformatted"><code>Surrender Bono 21987
Going Rogue Janet Evanovich 15986
Triple Cross James Patterson 11311</code></pre>
<p>To retrieve the attribute of the <code><book></code> tag, run the following code.</p>
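<p>The snippet for this step does not appear here. A minimal sketch that should print dictionaries like the ones listed for this step is given below. It is an assumption about what the missing code did. An inline string stands in for <code>books2.xml</code> so the block runs on its own.</p>

```python
import xml.etree.ElementTree as ET

# Inline stand-in for books2.xml (child tags trimmed for brevity).
xml_text = """<bookstore>
  <book id="21237"><title>Surrender</title></book>
  <book id="21946"><title>Going Rogue</title></book>
  <book id="18241"><title>Triple Cross</title></book>
</bookstore>"""
root = ET.fromstring(xml_text)

# Each element's attrib is a dictionary of its attribute names and values.
attributes = [book.attrib for book in root.iter('book')]

for attrib in attributes:
    print(attrib)
```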
<p>This code extracts the <code>id</code> attribute from each <code><book></code> tag and outputs it to the terminal.</p>
<pre class="wp-block-preformatted"><code>{'id': '21237'}
{'id': '21946'}
{'id': '18241'}</code></pre>
<p>To extract the values, run the following code.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">for book in root.iter('book'):
    vals = book.attrib.values()
    for v in vals:
        print(v)  # print each attribute value, not the whole collection</pre>
<pre class="wp-block-preformatted"><code>21237
21946
18241</code></pre>
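<p>When only one attribute is needed, the <code>get()</code> method of an element reads it by name and accepts a default for attributes that are absent. This is a minimal sketch on an inline stand-in for <code>books2.xml</code>. The <code>isbn</code> attribute is hypothetical and is used only to show the default.</p>

```python
import xml.etree.ElementTree as ET

# Inline stand-in for books2.xml (child tags trimmed for brevity).
xml_text = """<bookstore>
  <book id="21237"><title>Surrender</title></book>
  <book id="21946"><title>Going Rogue</title></book>
  <book id="18241"><title>Triple Cross</title></book>
</bookstore>"""
root = ET.fromstring(xml_text)

# get() returns the value of the named attribute as a string.
ids = [book.get('id') for book in root.iter('book')]

# 'isbn' is a hypothetical attribute that is absent, so the default is used.
missing = root.find('book').get('isbn', 'n/a')

print(ids)
print(missing)
```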
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Method 4: Use untangle.parse()</h2>
<p class="has-global-color-8-background-color has-background">This method uses <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle.parse()</code></a> to parse an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file.</p>
<p>This example reads in and parses the <code>books3.xml</code> file shown below. If necessary, install the <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle</code></a> library.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> The <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle</code></a> library converts an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank">XML</a> file to a Python object. This is a good option when you have a group of items, such as book names.</p>
<p>In the current working directory, create an <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file called <code>books3.xml</code>. Copy and paste the code snippet below into this file and save it.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""><?xml version="1.0"?>
<root>
  <book name="Surrender"/>
  <book name="Going Rogue"/>
  <book name="Triple Cross"/>
</root></pre>
<p>In the current working directory, create a Python file called <code>books3.py</code>. Copy and paste the code snippet below into this file and save it. </p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="3-4,6-7" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import untangle

book_obj = untangle.parse('books3.xml')
books = ','.join([book['name'] for book in book_obj.root.book])

for b in books.split(','):
    print(b)</pre>
<p>The first line in the above code snippet imports the <a rel="noreferrer noopener" href="https://pypi.org/project/untangle/" data-type="URL" data-id="https://pypi.org/project/untangle/" target="_blank"><code>untangle</code></a> library allowing access to the <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> file structure.</p>
<p>The following line reads in and parses the <code>books3.xml</code> file. The results save to <code>book_obj</code>. </p>
<p>The next line calls the <a rel="noreferrer noopener" href="https://blog.finxter.com/python-string-join/" data-type="URL" data-id="https://blog.finxter.com/python-string-join/" target="_blank"><code>join()</code></a> function and passes it one (1) argument: <a rel="noreferrer noopener" href="https://blog.finxter.com/list-comprehension/" data-type="URL" data-id="https://blog.finxter.com/list-comprehension/" target="_blank">List Comprehension</a>. This code iterates through and retrieves the name of each book and saves the results to <code>books</code>. If output to the terminal, the following displays:</p>
<pre class="wp-block-preformatted"><code>Surrender,Going Rogue,Triple Cross</code></pre>
<p>The final two lines use a <a rel="noreferrer noopener" href="https://blog.finxter.com/python-loops/" data-type="URL" data-id="https://blog.finxter.com/python-loops/" target="_blank"><code>for</code></a> loop to split <code>books</code> on each comma, iterate through the resulting book names, and send each one to the terminal.</p>
<pre class="wp-block-preformatted"><code>Surrender
Going Rogue
Triple Cross</code></pre>
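<p>Joining the names into one comma-separated string and splitting it again returns the original names only while no name contains a comma. The sketch below illustrates this with a plain list standing in for the names read from <code>book_obj</code>. The fourth title is a hypothetical addition that is not in <code>books3.xml</code>.</p>

```python
# Plain list standing in for the names read from book_obj.
names = ['Surrender', 'Going Rogue', 'Triple Cross']

# Join-then-split returns the same names when none contains a comma.
round_trip = ','.join(names).split(',')

# Hypothetical title containing commas, added only for illustration.
names_with_comma = names + ['Eat, Pray, Love']
broken = ','.join(names_with_comma).split(',')

print(round_trip == names)
print(len(names_with_comma), len(broken))

# Iterating over the list directly avoids the problem.
for name in names_with_comma:
    print(name)
```

<p>Keeping the names in a list and looping over it directly is therefore the safer choice when titles may contain commas.</p>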
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2>Summary</h2>
<p>This article has shown four (4) ways to work with <a rel="noreferrer noopener" href="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" data-type="URL" data-id="https://developer.mozilla.org/en-US/docs/Web/XML/XML_introduction" target="_blank"><code>XML</code></a> files so that you can select the best fit for your coding requirements.</p>
<p>Good Luck & Happy Coding!</p>
<hr class="wp-block-separator has-alpha-channel-opacity"/>
</div>
https://www.sickgaming.net/blog/2022/11/...mple-ways/