[Tut] How to Convert Tab-Delimited File to CSV in Python? - Printable Version +- Sick Gaming (https://www.sickgaming.net) +-- Forum: Programming (https://www.sickgaming.net/forum-76.html) +--- Forum: Python (https://www.sickgaming.net/forum-83.html) +--- Thread: [Tut] How to Convert Tab-Delimited File to CSV in Python? (/thread-99803.html) |
[Tut] How to Convert Tab-Delimited File to CSV in Python? - xSicKxBot - 08-14-2022 How to Convert Tab-Delimited File to CSV in Python? <div> <div class="kk-star-ratings kksr-auto kksr-align-left kksr-valign-top" data-payload="{"align":"left","id":"563635","slug":"default","valign":"top","ignore":"","reference":"auto","class":"","count":"1","readonly":"","score":"5","best":"5","gap":"5","greet":"Rate this post","legend":"5\/5 - (1 vote)","size":"24","width":"142.5","_legend":"{score}\/{best} - ({count} {votes})","font_factor":"1.25"}"> <div class="kksr-stars"> <div class="kksr-stars-inactive"> <div class="kksr-star" data-star="1" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" data-star="2" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" data-star="3" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" data-star="4" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" data-star="5" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> </p></div> <div class="kksr-stars-active" style="width: 142.5px;"> <div class="kksr-star" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> <div class="kksr-star" style="padding-right: 5px"> <div class="kksr-icon" style="width: 24px; height: 24px;"></div> </p></div> </p></div> </div> <div class="kksr-legend" style="font-size: 19.2px;"> 5/5 – (1 vote) </div> </div> <p><strong>The easiest way to convert a tab-delimited values (TSV) file to a comma-separated values (CSV) file is to use the following three lines of code:</strong></p> <ol class="has-global-color-8-background-color has-background"> <li><strong><code>import pandas as pd</code></strong></li> <li><strong><code>df = pd.read_csv('my_file.txt', sep='\t', header=None)</code></strong></li> <li><strong><code>df.to_csv('my_file.csv', header=None)</code></strong></li> </ol> <p>We’ll explain this and other approaches in more detail next—scroll down to <strong>Method 3</strong> for this exact method.</p> <h2>Problem Formulation</h2> <p>Given a tab-delimited file with one tab character <code>'\t'</code> between two values in a given column.</p> <p><strong>Input:</strong> <code>'my_file.tsv'</code></p> <div class="wp-block-image"> <figure class="aligncenter size-full"><img loading="lazy" width="572" height="435" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-12.png" alt="" class="wp-image-563647" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-12.png 572w, https://blog.finxter.com/wp-content/uploads/2022/08/image-12-300x228.png 300w" sizes="(max-width: 572px) 100vw, 572px" /><figcaption><strong>Figure</strong>: File <code>'my_file.tsv'</code> with tab <code>'\t'</code> separated values.</figcaption></figure> </div> <pre class="wp-block-preformatted">Alice DataScience $100000 Bob Programmer $90000 Carl Manager $122000 Dave Freelancer $144000</pre> <p>How to convert the tab-delimited values (TSV) to a comma-separated values (CSV) file?</p> <p><strong>Output:</strong> <code>'my_file.csv'</code></p> <pre class="wp-block-preformatted"><code>0,Alice,DataScience,$100000 1,Bob,Programmer,$90000 2,Carl,Manager,$122000 3,Dave,Freelancer,$144000</code></pre> <p>We’ll also look at slight variations of this problem. Let’s go!</p> <h2>Method 1: String Replace Single Tab</h2> <p class="has-global-color-8-background-color has-background">The most straightforward way to convert a tab-delimited (TSV) to a comma-separated (CSV) file in Python is to replace each tabular character <code>'\t'</code> with a comma <code>','</code> character using the <code><a rel="noreferrer noopener" href="https://blog.finxter.com/python-string-replace-2/" data-type="post" data-id="26083" target="_blank">string.replace()</a></code> method. This works if two values are separated by exactly one tabular character.</p> <p>Here’s an example input file <code>'my_file.tsv'</code>:</p> <div class="wp-block-image"> <figure class="aligncenter size-full"><img loading="lazy" width="580" height="438" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-13.png" alt="" class="wp-image-563655" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-13.png 580w, https://blog.finxter.com/wp-content/uploads/2022/08/image-13-300x227.png 300w" sizes="(max-width: 580px) 100vw, 580px" /></figure> </div> <p>Here’s an example of some code to convert the tab-delimited file to the CSV file:</p> <pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">with open('my_file.tsv') as f: # Read space-delimited file and replace all empty spaces by commas data = f.read().replace('\t', ',') # Write the CSV data in the output file print(data, file=open('my_file.csv', 'w'))</pre> <p>Output file <code>'my_file.csv'</code>:</p> <div class="wp-block-image"> <figure class="aligncenter size-full"><img loading="lazy" width="573" height="428" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-14.png" alt="" class="wp-image-563660" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-14.png 573w, https://blog.finxter.com/wp-content/uploads/2022/08/image-14-300x224.png 300w" sizes="(max-width: 573px) 100vw, 573px" /></figure> </div> <p>If you have any doubts, feel free to dive into our related tutorials:</p> <ul class="has-base-3-background-color has-background"> <li><a href="https://blog.finxter.com/python-string-replace/" data-type="URL" data-id="https://blog.finxter.com/python-string-replace/" target="_blank" rel="noreferrer noopener">Python String Replace Method</a></li> <li><a href="https://blog.finxter.com/python-open-function/" data-type="post" data-id="24793" target="_blank" rel="noreferrer noopener">Python <code>open()</code> Function</a></li> <li><a href="https://blog.finxter.com/python-one-liner-write-string-to-file/" data-type="post" data-id="10913" target="_blank" rel="noreferrer noopener">Python <code>print()</code> to File</a></li> <li><a href="https://blog.finxter.com/python-print/" data-type="post" data-id="20731" target="_blank" rel="noreferrer noopener">Python <code>print()</code> Function</a></li> <li><a href="https://blog.finxter.com/python-__enter__-magic-method/" data-type="post" data-id="140907">Python Context Mana</a><a href="https://blog.finxter.com/python-__enter__-magic-method/" data-type="post" data-id="140907" target="_blank" rel="noreferrer noopener">g</a><a href="https://blog.finxter.com/python-__enter__-magic-method/" data-type="post" data-id="140907">ers</a></li> </ul> <h2>Method 2: Regex Replace Arbitrary Tabs</h2> <p class="has-global-color-8-background-color has-background">To replace one <code>'\t'</code> or more tabs <code>'\t\t\t'</code> between two column values with a comma <code>','</code> and obtain a CSV, use the regular expressions operation <code>re.sub('[\t]+', ',', data)</code> on the space-separated <code>data</code>.</p> <p>If you have any doubts, feel free to dive into our related tutorials:</p> <ul> <li><a href="https://blog.finxter.com/python-regex-sub/" data-type="post" data-id="5861" target="_blank" rel="noreferrer noopener">Python Regex <code>re.sub()</code></a></li> <li><a href="https://blog.finxter.com/python-character-set-regex-tutorial/" data-type="post" data-id="6208" target="_blank" rel="noreferrer noopener">Python Regex <code>[]</code> Character Class</a></li> <li><a href="https://blog.finxter.com/python-regex-quantifiers-question-mark-vs-plus-vs-asterisk-differences/" data-type="post" data-id="6915" target="_blank" rel="noreferrer noopener">Python Regex <code>+</code> Quantifier</a></li> <li><a href="https://blog.finxter.com/python-regex/" data-type="post" data-id="6210" target="_blank" rel="noreferrer noopener">Python Regex Superpower</a></li> </ul> <p>Here’s an example input file <code>'my_file.tsv'</code>, notice the additional tabular characters that may separate two column values:</p> <div class="wp-block-image"> <figure class="aligncenter size-full"><img loading="lazy" width="567" height="435" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-16.png" alt="" class="wp-image-563673" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-16.png 567w, https://blog.finxter.com/wp-content/uploads/2022/08/image-16-300x230.png 300w" sizes="(max-width: 567px) 100vw, 567px" /></figure> </div> <p>Here’s an example of some code to convert the TSV to the CSV file:</p> <pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="1,6" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import re with open('my_file.txt') as infile: # Read space-delimited file and replace all empty spaces by commas data = re.sub('[ ]+', ',', infile.read()) # Write the CSV data in the output file print(data, file=open('my_file.csv', 'w'))</pre> <p>Output file <code>'my_file.csv'</code>:</p> <div class="wp-block-image"> <figure class="aligncenter size-full"><img loading="lazy" width="578" height="436" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-17.png" alt="" class="wp-image-563676" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-17.png 578w, https://blog.finxter.com/wp-content/uploads/2022/08/image-17-300x226.png 300w" sizes="(max-width: 578px) 100vw, 578px" /></figure> </div> <h2>Method 3: Pandas read_csv() and to_csv()</h2> <p class="has-global-color-8-background-color has-background">To convert a tab-delimited file to a CSV, first read the file into a Pandas DataFrame using <code>pd.read_csv(filename, sep='\t+', header=None)</code> and then write the DataFrame to a file using <code>df.to_csv(outfilename, header=None)</code>.</p> <p>Here’s an example input file <code>'my_file.tsv'</code>:</p> <div class="wp-block-image"> <figure class="aligncenter size-full"><img loading="lazy" width="567" height="435" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-16.png" alt="" class="wp-image-563673" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-16.png 567w, https://blog.finxter.com/wp-content/uploads/2022/08/image-16-300x230.png 300w" sizes="(max-width: 567px) 100vw, 567px" /></figure> </div> <p>Here’s an example of some code to convert the tab-delimited file to the CSV file:</p> <pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="4,7" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd # Read space-delimited file df = pd.read_csv('my_file.tsv', sep='\t+', header=None) # Write DataFrame to file df.to_csv('my_file.csv', header=None) </pre> <p>Output file <code>'my_file.csv'</code>:</p> <div class="wp-block-image"> <figure class="aligncenter size-full"><img loading="lazy" width="573" height="438" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-18.png" alt="" class="wp-image-563680" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-18.png 573w, https://blog.finxter.com/wp-content/uploads/2022/08/image-18-300x229.png 300w" sizes="(max-width: 573px) 100vw, 573px" /></figure> </div> <p>You can also use the simpler <code>sep='\t'</code> if you are sure that only a single tabular character separates two column values.</p> <p>If you have any doubts, feel free to dive into our related tutorials:</p> <ul> <li><a href="https://blog.finxter.com/read-a-csv-file-to-a-pandas-dataframe/" data-type="post" data-id="440655" target="_blank" rel="noreferrer noopener">Pandas <code>read_csv()</code></a></li> <li><a href="https://blog.finxter.com/pandas-to_csv/" data-type="post" data-id="8027" target="_blank" rel="noreferrer noopener">Pandas <code>to_csv()</code></a></li> </ul> <h2>Summary</h2> <p>We examined three great ways to convert a space-delimited to a comma-separated CSV file:</p> <ul> <li><a href="https://blog.finxter.com/how-to-convert-tab-delimited-file-to-csv-in-python/#Method_1_String_Replace_Single_Tab">Method 1: String Replace Single Tab</a></li> <li><a href="https://blog.finxter.com/how-to-convert-tab-delimited-file-to-csv-in-python/#Method_2_Regex_Replace_Arbitrary_Tabs">Method 2: Regex Replace Arbitrary Tabs</a></li> <li><a href="https://blog.finxter.com/how-to-convert-tab-delimited-file-to-csv-in-python/#Method_3_Pandas_read_csv_and_to_csv">Method 3: Pandas <code>read_csv()</code> and <code>to_csv()</code></a></li> </ul> <p>Thanks for taking the time to read this article, my friend! <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f40d.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f49b.png" alt="?" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p> <hr class="wp-block-separator has-alpha-channel-opacity"/> <h2>Regex Humor</h2> <div class="wp-block-image"> <figure class="aligncenter size-full is-resized"><img loading="lazy" src="https://blog.finxter.com/wp-content/uploads/2022/06/image-133.png" alt="" class="wp-image-428862" width="700" height="629" srcset="https://blog.finxter.com/wp-content/uploads/2022/06/image-133.png 785w, https://blog.finxter.com/wp-content/uploads/2022/06/image-133-300x270.png 300w, https://blog.finxter.com/wp-content/uploads/2022/06/image-133-768x691.png 768w" sizes="(max-width: 700px) 100vw, 700px" /><figcaption><em>Wait, forgot to escape a space. Wheeeeee[taptaptap]eeeeee.</em> (<a href="https://imgs.xkcd.com/comics/regular_expressions.png" data-type="URL" data-id="https://imgs.xkcd.com/comics/regular_expressions.png" target="_blank" rel="noreferrer noopener">source</a>)</figcaption></figure> </div> </div> https://www.sickgaming.net/blog/2022/08/10/how-to-convert-tab-delimited-file-to-csv-in-python/ |