{"id":127674,"date":"2022-08-30T12:40:30","date_gmt":"2022-08-30T12:40:30","guid":{"rendered":"https:\/\/blog.finxter.com\/?p=628454"},"modified":"2022-08-30T12:40:30","modified_gmt":"2022-08-30T12:40:30","slug":"python-convert-parquet-to-csv","status":"publish","type":"post","link":"https:\/\/sickgaming.net\/blog\/2022\/08\/30\/python-convert-parquet-to-csv\/","title":{"rendered":"Python Convert Parquet to CSV"},"content":{"rendered":"\n<h2>Problem<\/h2>\n<p class=\"has-global-color-8-background-color has-background\"><img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f4ac.png\" alt=\"\ud83d\udcac\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <strong>Challenge<\/strong>: How to convert a Parquet file <code>'my_file.parquet'<\/code> to a CSV file <code>'my_file.csv'<\/code> in Python?<\/p>\n<p>In case you don&#8217;t know what a Parquet file is, here&#8217;s the definition:<\/p>\n<p class=\"has-base-background-color has-background\"><img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f4a1.png\" alt=\"\ud83d\udca1\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <strong>Info<\/strong>: <a rel=\"noreferrer noopener\" href=\"https:\/\/parquet.apache.org\/\" target=\"_blank\">Apache Parquet<\/a> is an open-source, column-oriented data file format designed for efficient data storage and retrieval using data compression and encoding 
schemes to handle complex data in bulk. Parquet is available in multiple languages including Java, C++, and Python.<\/p>\n<p>Here\u2019s an illustration of the Parquet file format:<\/p>\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter\"><img loading=\"lazy\" decoding=\"async\" width=\"952\" height=\"486\" src=\"https:\/\/blog.finxter.com\/wp-content\/uploads\/2022\/06\/image-135.png\" alt=\"\" class=\"wp-image-430271\" srcset=\"https:\/\/blog.finxter.com\/wp-content\/uploads\/2022\/06\/image-135.png 952w, https:\/\/blog.finxter.com\/wp-content\/uploads\/2022\/06\/image-135-300x153.png 300w, https:\/\/blog.finxter.com\/wp-content\/uploads\/2022\/06\/image-135-768x392.png 768w\" sizes=\"auto, (max-width: 952px) 100vw, 952px\" \/><figcaption><a href=\"https:\/\/parquet.apache.org\/docs\/file-format\/\" target=\"_blank\" rel=\"noreferrer noopener\">source<\/a><\/figcaption><\/figure>\n<\/div>\n<h2>Solution<\/h2>\n<p class=\"has-global-color-8-background-color has-background\">The simplest way to convert a Parquet file to a CSV file in Python is to import the pandas library, call the <code>pandas.read_parquet()<\/code> function with the <code>'my_file.parquet'<\/code> filename to load the file content into a DataFrame, and write the DataFrame to a CSV file using the DataFrame <code><a rel=\"noreferrer noopener\" href=\"https:\/\/blog.finxter.com\/pandas-dataframe-to_csv-method\/\" data-type=\"post\" data-id=\"344277\" target=\"_blank\">to_csv()<\/a><\/code> method.<\/p>\n<ul>\n<li><code><strong>import pandas as pd<\/strong><\/code><\/li>\n<li><code><strong>df = pd.read_parquet('my_file.parquet')<\/strong><\/code><\/li>\n<li><code><strong>df.to_csv('my_file.csv')<\/strong><\/code><\/li>\n<\/ul>\n<p>Here&#8217;s a minimal example:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" 
data-enlighter-group=\"\">import pandas as pd\n# Load the Parquet file into a DataFrame (pandas needs a Parquet engine such as pyarrow)\ndf = pd.read_parquet('my_file.parquet')\n# Write the DataFrame to CSV (the index is written as the first column by default)\ndf.to_csv('my_file.csv')<\/pre>\n<p>For this to work, you may have to <a rel=\"noreferrer noopener\" href=\"https:\/\/blog.finxter.com\/how-to-install-pandas-in-python\/\" data-type=\"post\" data-id=\"35926\" target=\"_blank\">install pandas<\/a> and <a rel=\"noreferrer noopener\" href=\"https:\/\/blog.finxter.com\/how-to-install-pyarrow-in-python\/\" data-type=\"post\" data-id=\"35940\" target=\"_blank\">pyarrow<\/a>, because <code>read_parquet()<\/code> relies on a Parquet engine such as PyArrow. It&#8217;s worth simply running the code first, since chances are both are already installed.<\/p>\n<h2>Related<\/h2>\n<p class=\"has-base-background-color has-background\"><img decoding=\"async\" src=\"https:\/\/s.w.org\/images\/core\/emoji\/14.0.0\/72x72\/1f30d.png\" alt=\"\ud83c\udf0d\" class=\"wp-smiley\" style=\"height: 1em; max-height: 1em;\" \/> <strong>Related Tutorial<\/strong>: <a href=\"https:\/\/blog.finxter.com\/python-convert-csv-to-parquet\/\" data-type=\"post\" data-id=\"430254\">Python Convert CSV to Parquet<\/a><\/p>\n<p>I also found this video from a great YouTube channel that covers this exact problem of converting a Parquet file to a CSV file:<\/p>\n<figure class=\"wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube\"><a href=\"https:\/\/blog.finxter.com\/python-convert-parquet-to-csv\/\"><img decoding=\"async\" src=\"https:\/\/blog.finxter.com\/wp-content\/plugins\/wp-youtube-lyte\/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FkYghFTfDXnU%2Fhqdefault.jpg\" alt=\"YouTube Video\"><\/a><figcaption><\/figcaption><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>Problem Challenge: How to convert a Parquet file &#8216;my_file.parquet&#8217; to a CSV file &#8216;my_file.csv&#8217; in Python? 
In case you don&#8217;t know what a Parquet file is, here&#8217;s the definition: Info: Apache Parquet is an open-source, column-oriented data file format designed for efficient data storage and retrieval using data compression and encoding [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[857],"tags":[73,468,528],"class_list":["post-127674","post","type-post","status-publish","format-standard","hentry","category-python-tut","tag-programming","tag-python","tag-tutorial"],"_links":{"self":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/127674","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/comments?post=127674"}],"version-history":[{"count":0,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/posts\/127674\/revisions"}],"wp:attachment":[{"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/media?parent=127674"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/categories?post=127674"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sickgaming.net\/blog\/wp-json\/wp\/v2\/tags?post=127674"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}