[Tut] Data Science Tells This Story About the Global Wine Markets 🍷 - xSicKxBot - 02-20-2023
<div>
<h2><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4d6.png" alt="📖" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Background</h2>
<p>Many people like to relax or party with a glass of wine. That makes wine an important industry in many countries. Understanding this market is important to the livelihood of many people.</p>
<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="696" height="480" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-287.png" alt="" class="wp-image-1147173" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-287.png 696w, https://blog.finxter.com/wp-content/uploads/2023/02/image-287-300x207.png 300w" sizes="(max-width: 696px) 100vw, 696px" /></figure>
<p>For fun, consider the following fictional scenario:</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f377.png" alt="🍷" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Story</strong>: <em>You work at a multinational consumer goods organization that is considering entering the wine production industry. Managers at your company would like to understand the market better before making a decision.</em></p>
<h2><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4be.png" alt="💾"
class="wp-smiley" style="height: 1em; max-height: 1em;" /> The Data</h2>
<p>This dataset is a subset of the University of Adelaide’s <a href="https://universityofadelaide.app.box.com/s/eqpjqyq8o3mfy7139cr4pn76iolr1o1w" target="_blank" rel="noreferrer noopener">Annual Database of Global Wine Markets</a>.</p>
<div class="wp-block-image"> <figure class="aligncenter size-full"><img decoding="async" loading="lazy" width="697" height="521" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-286.png" alt="" class="wp-image-1147172" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-286.png 697w, https://blog.finxter.com/wp-content/uploads/2023/02/image-286-300x224.png 300w" sizes="(max-width: 697px) 100vw, 697px" /></figure> </div>
<p>The dataset consists of a single CSV file, <code>data/wine.csv</code>.</p>
<p>Each row in the dataset represents the wine market in one country. There are 34 metrics for the wine industry covering both the production and consumption sides of the market.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd

# Adjust the path if the file lives elsewhere (e.g. data/wine.csv)
wine = pd.read_csv("wine.csv")
print(wine)</pre>
<div class="wp-block-image"> <figure class="aligncenter size-large"><img decoding="async" loading="lazy" width="1024" height="375" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-273-1024x375.png" alt="" class="wp-image-1147146" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-273-1024x375.png 1024w, https://blog.finxter.com/wp-content/uploads/2023/02/image-273-300x110.png 300w, https://blog.finxter.com/wp-content/uploads/2023/02/image-273-768x281.png 768w, https://blog.finxter.com/wp-content/uploads/2023/02/image-273.png 1455w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure> </div>
<p class="has-base-background-color has-background"><img
src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Info</strong>: <code><a href="https://blog.finxter.com/read-a-csv-file-to-a-pandas-dataframe/" data-type="post" data-id="440655" target="_blank" rel="noreferrer noopener">pandas.read_csv()</a></code> is a function in the pandas library that reads data from a CSV file and creates a DataFrame object. It has various parameters for customization and can handle missing data, date parsing, and different data formats. It’s a useful tool for importing CSV data into Python.</p>
<h2><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4aa.png" alt="💪" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Challenge</h2>
<p>Explore the dataset to understand the global wine market. </p>
<div class="wp-block-image"> <figure class="aligncenter size-full"><img decoding="async" loading="lazy" width="697" height="463" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-289.png" alt="" class="wp-image-1147181" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-289.png 697w, https://blog.finxter.com/wp-content/uploads/2023/02/image-289-300x199.png 300w" sizes="(max-width: 697px) 100vw, 697px" /></figure> </div>
<p><em>The analysis should satisfy four criteria: Technical approach (20%), Visualizations (20%), Storytelling (30%), and Insights and recommendations (30%).</em></p>
<p><em>The Technical approach will focus on the soundness of the approach and the quality of the code. Visualizations will assess whether the visualizations are appropriate and capable of providing clear insights. The Storytelling component will evaluate whether the data supports the narrative and if the narrative is detailed yet concise.
The Insights and recommendations component will check for clarity, relevance to the domain, and recognition of analysis limitations.</em></p>
<hr class="wp-block-separator has-alpha-channel-opacity"/>
<h2><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f377.png" alt="🍷" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Wine Market Analysis in Four Steps</h2>
<div class="wp-block-image"> <figure class="aligncenter size-full"><img decoding="async" loading="lazy" width="698" height="464" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-288.png" alt="" class="wp-image-1147180" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-288.png 698w, https://blog.finxter.com/wp-content/uploads/2023/02/image-288-300x199.png 300w" sizes="(max-width: 698px) 100vw, 698px" /></figure> </div>
<h3>Step 1: Data Preparation</h3>
<p>First, I import the necessary libraries and the dataset. Then I clean the data where necessary and check which features are available for analysis.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

wine = pd.read_csv("wine.csv")

# Check the DataFrame; info() prints its summary directly and returns None
wine.info()</pre>
<p>I use the <code>df.info()</code> method to print some information about the DataFrame: the column names, the data types, and the number of non-null values in each column.</p>
<div class="wp-block-image"> <figure class="aligncenter size-full"><img decoding="async" loading="lazy" width="865" height="914" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-274.png" alt="" class="wp-image-1147147" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-274.png 865w, https://blog.finxter.com/wp-content/uploads/2023/02/image-274-284x300.png 284w, https://blog.finxter.com/wp-content/uploads/2023/02/image-274-768x812.png 768w"
sizes="(max-width: 865px) 100vw, 865px" /></figure> </div>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Info</strong>: The Pandas <code><a rel="noreferrer noopener" href="https://blog.finxter.com/data-cleaning-in-python/" data-type="post" data-id="375538" target="_blank">DataFrame.info()</a></code> method provides a concise summary of a DataFrame’s content and structure, including data types, column names, memory usage, and the presence of null values. It’s useful for data inspection, optimization, and error-checking.</p>
<p><strong>Check for “NaN” values:</strong></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Show every row that contains at least one NaN
wine[wine.isna().any(axis=1)]</pre>
<p>There are some “NaN” values that we have to handle. It is logical that a country with no “Vine Area” cannot produce wine.
So where there is 0 area, we change production to 0.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Countries with zero vine area and their recorded production
print(wine.loc[wine["Vine Area ('000 ha)"] == 0, ['Country', 'Wine produced (ML)']])</pre>
<pre class="wp-block-preformatted"><code>        Country  Wine produced (ML)
6       Denmark                 NaN
7       Finland                 NaN
10      Ireland                 NaN
12       Sweden                 NaN
42    Hong Kong                 NaN
46     Malaysia                 NaN
47  Philippines                 NaN
48    Singapore                 NaN</code> </pre>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># No vine area means no production, so replace the NaN with 0
wine.loc[wine["Vine Area ('000 ha)"] == 0, "Wine produced (ML)"] = 0</pre>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Info</strong>: <code><a href="https://blog.finxter.com/pandas-loc-and-iloc-a-simple-guide-with-video/" data-type="post" data-id="36716" target="_blank" rel="noreferrer noopener">DataFrame.loc</a></code> is a powerful Pandas indexer used for selecting or modifying data based on labels or boolean conditions.
It allows for versatile data manipulations, including filtering, slicing, and value assignment.</p>
<p>You can watch an explainer video on it here:</p>
<figure class="wp-block-embed-youtube wp-block-embed is-type-video is-provider-youtube"><a href="https://blog.finxter.com/global-wine-markets-analysis/"><img src="https://blog.finxter.com/wp-content/plugins/wp-youtube-lyte/lyteCache.php?origThumbUrl=https%3A%2F%2Fi.ytimg.com%2Fvi%2FJQBOpbhxQrM%2Fhqdefault.jpg" alt="YouTube Video"></a><figcaption></figcaption></figure>
<h3>Step 2: Gain Data Overview</h3>
<div class="wp-block-image"> <figure class="aligncenter size-full"><img decoding="async" loading="lazy" width="701" height="680" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-291.png" alt="" class="wp-image-1147207" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-291.png 701w, https://blog.finxter.com/wp-content/uploads/2023/02/image-291-300x291.png 300w" sizes="(max-width: 701px) 100vw, 701px" /></figure> </div>
<p>To find the biggest importers and exporters and to get a more comprehensive picture of the market, I have created some queries.</p>
<p><code>DataFrame.nlargest(n, columns)</code> is the easiest way to perform the search, where “<code>n</code>” is the number of rows to return and “<code>columns</code>” is the name of the column to rank by.
<code>nlargest()</code> returns the matching rows sorted in descending order of that column.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">best_10_importers_by_value = wine.nlargest(10, 'Value of wine imports (US$ mill)')
print(best_10_importers_by_value)</pre>
<div class="wp-block-image"> <figure class="aligncenter size-large"><img decoding="async" loading="lazy" width="1024" height="307" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-275-1024x307.png" alt="" class="wp-image-1147148" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-275-1024x307.png 1024w, https://blog.finxter.com/wp-content/uploads/2023/02/image-275-300x90.png 300w, https://blog.finxter.com/wp-content/uploads/2023/02/image-275-768x230.png 768w, https://blog.finxter.com/wp-content/uploads/2023/02/image-275-1536x461.png 1536w, https://blog.finxter.com/wp-content/uploads/2023/02/image-275.png 1874w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure> </div>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">best_10_importers_by_liter = wine.nlargest(10, 'Wine import vol. (ML)')
print(best_10_importers_by_liter)</pre>
<div class="wp-block-image"> <figure class="aligncenter size-large"><img decoding="async" loading="lazy" width="1024" height="306" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-276-1024x306.png" alt="" class="wp-image-1147149" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-276-1024x306.png 1024w, https://blog.finxter.com/wp-content/uploads/2023/02/image-276-300x90.png 300w, https://blog.finxter.com/wp-content/uploads/2023/02/image-276-768x229.png 768w, https://blog.finxter.com/wp-content/uploads/2023/02/image-276-1536x459.png 1536w, https://blog.finxter.com/wp-content/uploads/2023/02/image-276.png 1851w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure> </div>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">best_10_exporters_by_value = wine.nlargest(10, 'Value of wine exports (US$ mill)')
print(best_10_exporters_by_value)</pre>
<div class="wp-block-image"> <figure class="aligncenter size-large"><img decoding="async" loading="lazy" width="1024" height="326" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-277-1024x326.png" alt="" class="wp-image-1147150" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-277-1024x326.png 1024w, https://blog.finxter.com/wp-content/uploads/2023/02/image-277-300x96.png 300w, https://blog.finxter.com/wp-content/uploads/2023/02/image-277-768x245.png 768w, https://blog.finxter.com/wp-content/uploads/2023/02/image-277-1536x489.png 1536w, https://blog.finxter.com/wp-content/uploads/2023/02/image-277.png 1821w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure> </div>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset=""
data-enlighter-title="" data-enlighter-group="">best_10_exporters_by_liter = wine.nlargest(10, 'Wine export vol. (ML)')
print(best_10_exporters_by_liter)</pre>
<div class="wp-block-image"> <figure class="aligncenter size-large"><img decoding="async" loading="lazy" width="1024" height="307" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-278-1024x307.png" alt="" class="wp-image-1147151" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-278-1024x307.png 1024w, https://blog.finxter.com/wp-content/uploads/2023/02/image-278-300x90.png 300w, https://blog.finxter.com/wp-content/uploads/2023/02/image-278-768x230.png 768w, https://blog.finxter.com/wp-content/uploads/2023/02/image-278-1536x460.png 1536w, https://blog.finxter.com/wp-content/uploads/2023/02/image-278.png 1829w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure> </div>
<h3>Step 3: Create Diagrams</h3>
<p>It is time to create diagrams.</p>
<p>Let’s look at imports/exports by country. I have put the import/export columns on the y-axis of a barplot for easy comparison. A barplot displays the relationship between a numeric (export/import) and a categorical (Countries) variable.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Info</strong>: <code><a href="https://blog.finxter.com/pandas-plotting-part-1/" data-type="post" data-id="171808" target="_blank" rel="noreferrer noopener">pandas.DataFrame.plot()</a></code> is a method in the Pandas library that generates various visualizations from DataFrame objects. It’s easy to use and allows for customization of plot appearance and behavior. <code>plot()</code> is a useful tool for data exploration, communication, and hypothesis testing.</p>
<p>I used the pandas built-in <code>plot</code> function to create the chart.
The <code>plot</code> function here takes the x and y values, the kind of graph, and the title as arguments.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">best_10_importers_by_liter.plot(
    x = 'Country',
    y = ['Wine import vol. (ML)', 'Wine export vol. (ML)'],
    kind = 'bar',
    title = 'Import / Export by Country')</pre>
<div class="wp-block-image"> <figure class="aligncenter size-full"><img decoding="async" loading="lazy" width="560" height="552" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-279.png" alt="" class="wp-image-1147156" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-279.png 560w, https://blog.finxter.com/wp-content/uploads/2023/02/image-279-300x296.png 300w" sizes="(max-width: 560px) 100vw, 560px" /></figure> </div>
<p>My first insight is a slightly puzzling one: France has the largest exports, yet it still ranks 4th (and 3rd) in imports. The French seem to like foreign wines.</p>
<p>Next, let’s see which countries do not produce enough wine to cover their own consumption. To do this, I created a new column that subtracts the wine remaining in the country (production minus exports) from consumption.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Create a new column to calculate wine demand:
# consumption minus the wine that stays in the country (production - exports)
wine['wine_demand'] = wine['Wine consumed (ML)'] - (wine['Wine produced (ML)'] - wine['Wine export vol. (ML)'])
top_10_wine_demand = wine.nlargest(10, 'wine_demand')
print(top_10_wine_demand)</pre>
<div class="wp-block-image"> <figure class="aligncenter size-large"><img decoding="async" loading="lazy" width="1024" height="299" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-280-1024x299.png" alt="" class="wp-image-1147157" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-280-1024x299.png 1024w, https://blog.finxter.com/wp-content/uploads/2023/02/image-280-300x88.png 300w, https://blog.finxter.com/wp-content/uploads/2023/02/image-280-768x224.png 768w, https://blog.finxter.com/wp-content/uploads/2023/02/image-280-1536x448.png 1536w, https://blog.finxter.com/wp-content/uploads/2023/02/image-280.png 1858w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure> </div>
<p>Or, visualized:</p>
<div class="wp-block-image"> <figure class="aligncenter size-full"><img decoding="async" loading="lazy" width="408" height="501" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-284.png" alt="" class="wp-image-1147164" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-284.png 408w, https://blog.finxter.com/wp-content/uploads/2023/02/image-284-244x300.png 244w" sizes="(max-width: 408px) 100vw, 408px" /></figure> </div>
<p><strong><em>Is there enough GDP per capita for consumption?</em></strong></p>
<p>I think that people who live in countries with a high GDP per capita can afford more wine, and more expensive wine. </p>
<p>I have created a seaborn relational plot, where the hue represents GDP per capita and the y-axis represents wine demand.</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f4a1.png" alt="💡"
class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Info</strong>: <a href="https://blog.finxter.com/heatmaps-with-seaborn/" data-type="post" data-id="19568" target="_blank" rel="noreferrer noopener">Seaborn</a> is a Python data visualization library that offers a high-level interface for creating informative and attractive statistical graphics. It’s built on top of the <a href="https://blog.finxter.com/matplotlib-full-guide/" data-type="post" data-id="20151" target="_blank" rel="noreferrer noopener">Matplotlib</a> library and includes several color palettes and themes, making it easy to create complex visualizations with minimal code. Seaborn is often used for data exploration, visualization in scientific research, and communication of data insights.</p>
<p>I set the plot style to <code>'darkgrid'</code> for a better look. Please note that this setting stays in effect until you change it, so it also applies to the graphs that follow.</p>
<p>Seaborn’s <code>relplot</code> returns a <code>FacetGrid</code> object, which has a <code>set_xticklabels</code> method to customize the <code>x</code> labels.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sns.set_style('darkgrid')
chart = sns.relplot(data = top_10_wine_demand, x = 'Country', y = 'wine_demand', hue = "GDP per capita ('000 US$)")
# Rotate the country names so they do not overlap
chart.set_xticklabels(rotation = 65, horizontalalignment = 'right')</pre>
<div class="wp-block-image"> <figure class="aligncenter size-full"><img decoding="async" loading="lazy" width="660" height="582" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-281.png" alt="" class="wp-image-1147159" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-281.png 660w, https://blog.finxter.com/wp-content/uploads/2023/02/image-281-300x265.png 300w" sizes="(max-width: 660px) 100vw, 660px"
/></figure> </div>
<p>My main conclusion from this is that if you have a winery in Europe, the best places to sell your wine are the UK and Germany; otherwise, the US.</p>
<h3>Step 4: Competitor Analysis</h3>
<figure class="wp-block-image size-full"><img decoding="async" loading="lazy" width="697" height="520" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-292.png" alt="" class="wp-image-1147209" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-292.png 697w, https://blog.finxter.com/wp-content/uploads/2023/02/image-292-300x224.png 300w" sizes="(max-width: 697px) 100vw, 697px" /></figure>
<p>And now, let’s look at the competitors:</p>
<p class="has-global-color-8-background-color has-background"><strong><em>Where is the cheapest wine from, and which country exports a lot of cheap wine?</em></strong></p>
<p>Since we have no direct data on this, I did a little feature engineering to find out which countries export wine at the lowest price per litre. Feature engineering is when we create a feature (a new column) that adds useful information, derived from existing data, to the dataset.</p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># US$ mill divided by ML gives the export price in US$ per litre
wine['export_per_liter'] = wine['Value of wine exports (US$ mill)'] / wine['Wine export vol. (ML)']
top_10_cheapest = wine.nsmallest(10, 'export_per_liter')
print(top_10_cheapest)</pre>
<div class="wp-block-image"> <figure class="aligncenter size-large"><img decoding="async" loading="lazy" width="1024" height="300" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-283-1024x300.png" alt="" class="wp-image-1147163" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-283-1024x300.png 1024w, https://blog.finxter.com/wp-content/uploads/2023/02/image-283-300x88.png 300w, https://blog.finxter.com/wp-content/uploads/2023/02/image-283-768x225.png 768w, https://blog.finxter.com/wp-content/uploads/2023/02/image-283-1536x449.png 1536w, https://blog.finxter.com/wp-content/uploads/2023/02/image-283.png 1812w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure> </div>
<p><em>Plot the findings:</em></p>
<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">top_10_cheapest.plot(
    x = 'Country',
    y = ['Value of wine exports (US$ mill)', 'Wine export vol. (ML)'],
    kind = 'bar',
    figsize = (8, 6))
plt.legend(loc = 'upper left', title = 'Cheapest wine exporters')</pre>
<div class="wp-block-image"> <figure class="aligncenter size-full"><img decoding="async" loading="lazy" width="684" height="595" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-282.png" alt="" class="wp-image-1147160" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-282.png 684w, https://blog.finxter.com/wp-content/uploads/2023/02/image-282-300x261.png 300w" sizes="(max-width: 684px) 100vw, 684px" /></figure> </div>
<p>It is clear that Spain is by far the biggest exporter of cheap wine, followed by South Africa, but in much smaller quantities.</p>
<h2>Conclusion</h2>
<div class="wp-block-image"> <figure class="aligncenter size-full"><img decoding="async" loading="lazy" width="692" height="472" src="https://blog.finxter.com/wp-content/uploads/2023/02/image-290.png" alt="" class="wp-image-1147185" srcset="https://blog.finxter.com/wp-content/uploads/2023/02/image-290.png 692w, https://blog.finxter.com/wp-content/uploads/2023/02/image-290-300x205.png 300w" sizes="(max-width: 692px) 100vw, 692px" /></figure> </div>
<p>If you want to gain insight into large data sets, visualization is king, and you don’t need fancy, complicated graphs to see the relationships behind the data clearly. </p>
<p>Understanding the tools is vital. Without DataFrames, we wouldn’t have been able to pull off this analysis quickly and efficiently:</p>
<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Recommended Tutorial</strong>: <a href="https://blog.finxter.com/pandas-quickstart/" data-type="post" data-id="16511" target="_blank" rel="noreferrer noopener">Pandas in 10 Minutes</a></p>
</div>
https://www.sickgaming.net/blog/2023/02/19/data-science-tells-this-story-about-the-global-wine-markets-%f0%9f%8d%b7/ |
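<p>As a minimal sketch of the price-per-litre feature from Step 4, the snippet below runs the same calculation on a tiny made-up table. The country labels and figures are invented for illustration and are not taken from <code>wine.csv</code>; the name <code>toy</code> is hypothetical. Mapping a zero export volume to <code>NaN</code> before dividing is an added assumption, not part of the original analysis.</p>

```python
import numpy as np
import pandas as pd

# Made-up figures for illustration only; they are NOT taken from wine.csv.
toy = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Value of wine exports (US$ mill)": [100.0, 30.0, 0.0, 45.0],
    "Wine export vol. (ML)": [50.0, 30.0, 0.0, 10.0],
})

# US$ mill / ML works out to US$ per litre (both are scaled by one million).
# Assumption: a zero volume is mapped to NaN first, so the division
# cannot produce inf or an undefined 0/0 result.
volume = toy["Wine export vol. (ML)"].replace(0, np.nan)
toy["export_per_liter"] = toy["Value of wine exports (US$ mill)"] / volume

# Rank the exporters by price per litre, cheapest first.
cheapest = toy.nsmallest(2, "export_per_liter")
print(cheapest[["Country", "export_per_liter"]])
```

<p>With real data, it may be worth checking how many rows end up as <code>NaN</code> in the new column before ranking, since those countries do not appear among the cheapest exporters.</p>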