Programming Samples


Python Matplotlib BarChart Plots

BarChart Plots in Jupyter Notebooks

This article describes how to create bar charts from a CSV dataset file using Matplotlib. The Python code is written and executed in a Jupyter Notebook. The CSV dataset file (penguins_size.csv) was downloaded from Kaggle: https://www.kaggle.com/datasets/parulpandey/palmer-archipelago-antarctica-penguin-data?resource=download

The bar charts display the Antarctic penguin species and compare their average body mass.

 

Load the dataset into a DataFrame

Use pandas to read the CSV data into a DataFrame, then write a statement to view the data. Here, a DataFrame GroupBy statement displays a count for each penguin species; df_penguin.head() could also have been used to take a look at the data.

Python Data output
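The original code for this step was shown as a screenshot. A minimal sketch of what it might look like follows, assuming the column names species, sex, and body_mass_g. A few made-up rows held in a string stand in for penguins_size.csv so the snippet runs on its own; in the notebook, the file name would be passed to pd.read_csv instead.

```python
import io
import pandas as pd

# Made-up sample rows standing in for penguins_size.csv (column names are assumed).
csv_text = """species,sex,body_mass_g
Adelie,MALE,4000
Adelie,FEMALE,3400
Chinstrap,MALE,3900
Chinstrap,FEMALE,3500
Gentoo,MALE,5500
Gentoo,FEMALE,4700
"""

# In the notebook this would be: pd.read_csv("penguins_size.csv")
df_penguin = pd.read_csv(io.StringIO(csv_text))

# Count the number of rows recorded for each species.
species_counts = df_penguin.groupby("species").size()
print(species_counts)
```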

Another DataFrame statement gives a further look at the data by grouping on species and displaying the average (mean) body mass.

Python dataframe GroupBy statement to calculate Mean over each Penguin Species
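As a rough sketch of this GroupBy statement, using a small made-up DataFrame in place of the loaded CSV (the column names species and body_mass_g are assumptions):

```python
import pandas as pd

# Made-up rows standing in for the loaded penguin data (assumed column names).
df_penguin = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Chinstrap", "Chinstrap", "Gentoo", "Gentoo"],
    "body_mass_g": [4000.0, 3400.0, 3900.0, 3500.0, 5500.0, 4700.0],
})

# Group by species and take the mean body mass of each group.
mean_mass = df_penguin.groupby("species")["body_mass_g"].mean()
print(mean_mass)
```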

DataFrame & Lists for the BarChart

Create a New DataFrame - Create a new DataFrame to capture the grouped average body mass by penguin species, and reset the index of the new DataFrame.

Create the Grouped Dataframe
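One possible way to write this step is sketched here. The name df_penguins comes from the article; the sample rows and column names are made up for illustration.

```python
import pandas as pd

# Made-up rows standing in for the loaded penguin data (assumed column names).
df_penguin = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Chinstrap", "Chinstrap", "Gentoo", "Gentoo"],
    "body_mass_g": [4000.0, 3400.0, 3900.0, 3500.0, 5500.0, 4700.0],
})

# Average body mass per species; reset_index() moves the species labels
# out of the index and back into an ordinary column.
df_penguins = df_penguin.groupby("species")["body_mass_g"].mean().reset_index()
print(df_penguins)
```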

Create lists for the X and Y axes of the bar chart. Each statement creates a separate list, which the Matplotlib code then uses to produce the chart.

Convert dataframe to Lists
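A short sketch of the conversion, assuming the grouped DataFrame has the columns species and body_mass_g. The list names species_list and mass_list are hypothetical, and the values are placeholders.

```python
import pandas as pd

# Placeholder version of the grouped DataFrame (index already reset).
df_penguins = pd.DataFrame({
    "species": ["Adelie", "Chinstrap", "Gentoo"],
    "body_mass_g": [3700.0, 3700.0, 5100.0],
})

# One list per axis: species names for X, average body mass for Y.
species_list = df_penguins["species"].tolist()
mass_list = df_penguins["body_mass_g"].tolist()

print(species_list)
print(mass_list)
```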

Note: If the index is not reset when creating the df_penguins DataFrame, the species names stay in the index rather than in a column, so selecting the species as a separate list fails with a KeyError.

Index Not Reset in DataFrame
KeyError Failure when Index is Not reset
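The failure can be reproduced with a small made-up DataFrame. This is only a sketch: the variable name no_reset and the sample values are hypothetical.

```python
import pandas as pd

# Made-up rows standing in for the loaded penguin data (assumed column names).
df_penguin = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Chinstrap", "Gentoo"],
    "body_mass_g": [4000.0, 3400.0, 3900.0, 5500.0],
})

# Grouped averages WITHOUT reset_index(): species becomes the index.
no_reset = df_penguin.groupby("species")[["body_mass_g"]].mean()

key_error_raised = False
try:
    no_reset["species"]  # there is no column called "species" any more
except KeyError as err:
    key_error_raised = True
    print("KeyError:", err)

# The species names are still available, but through the index.
print(no_reset.index.tolist())
```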

Matplotlib BarChart - Write the code to output the chart, setting the figure size, title, Y-axis label, and the data for plt.bar.

The plt.bar arguments are the X-axis values (penguin species names) followed by the Y-axis values (average body mass).

Simple matplotlib bar chart code
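A minimal sketch of such a chart follows. The figure size, title text, axis label, and values are assumptions chosen for illustration, not the settings from the original screenshot.

```python
import matplotlib.pyplot as plt

# Placeholder lists; in the notebook these come from the df_penguins DataFrame.
species_list = ["Adelie", "Chinstrap", "Gentoo"]
mass_list = [3700.0, 3700.0, 5100.0]

plt.figure(figsize=(8, 5))                        # chart size in inches
plt.title("Average Body Mass by Penguin Species")
plt.ylabel("Body mass (g)")                       # unit assumed from the column name
bars = plt.bar(species_list, mass_list)           # X values first, then Y values
```

In a Jupyter Notebook the figure is normally rendered when the cell finishes; outside a notebook, plt.show() can be called at the end.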

The bar chart should appear after running the code. Displaying the value for each bar would make the chart more helpful.

Simple matplotlib bar chart image

Updated Version of Matplotlib BarChart - a bar chart with additional information.

Write code to extract the average body mass by species and sex for comparison.

Penguin Body Mass by Species & Sex
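Grouping on two columns might look like the sketch below. The sample rows, the column names, and the MALE/FEMALE labels are assumptions.

```python
import pandas as pd

# Made-up rows standing in for the loaded penguin data (assumed column names).
df_penguin = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Adelie", "Chinstrap", "Chinstrap", "Gentoo", "Gentoo"],
    "sex": ["MALE", "MALE", "FEMALE", "MALE", "FEMALE", "MALE", "FEMALE"],
    "body_mass_g": [4000.0, 4200.0, 3400.0, 3900.0, 3500.0, 5500.0, 4700.0],
})

# Passing a list of columns groups by each species/sex combination.
mass_by_sex = df_penguin.groupby(["species", "sex"])["body_mass_g"].mean()
print(mass_by_sex)
```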

Separate the male and female penguins into 2 data frames (df_female & df_male). Create 2 new data frames for the Average Body Mass of the male and female penguin species (df_female_penguins & df_male_penguins).

2 Data Frames to split males & females + 2 new data frames with Average Body Mass
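The split and the two averaged data frames could be written roughly as follows. The data frame names are the ones used in the article; the sample rows and the MALE/FEMALE labels in the sex column are assumptions.

```python
import pandas as pd

# Made-up rows standing in for the loaded penguin data (assumed column names).
df_penguin = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Adelie", "Chinstrap", "Chinstrap", "Gentoo", "Gentoo"],
    "sex": ["MALE", "MALE", "FEMALE", "MALE", "FEMALE", "MALE", "FEMALE"],
    "body_mass_g": [4000.0, 4200.0, 3400.0, 3900.0, 3500.0, 5500.0, 4700.0],
})

# Boolean masks keep only the rows for one sex.
df_female = df_penguin[df_penguin["sex"] == "FEMALE"]
df_male = df_penguin[df_penguin["sex"] == "MALE"]

# Average body mass per species within each sex, with the index reset.
df_female_penguins = df_female.groupby("species")["body_mass_g"].mean().reset_index()
df_male_penguins = df_male.groupby("species")["body_mass_g"].mean().reset_index()

print(df_female_penguins)
print(df_male_penguins)
```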

Create 2 Lists from the data frames: df_female_penguins & df_male_penguins.

Create Lists of Average Body Mass for both Data Frames

Generate the penguin species list for the X axis from the df_penguins data frame (also shown above).

Create List of Penguin Species

Update both lists (m_list for males and f_list for females) to round the values to 2 digits after the decimal point, shortening the original 12 decimal digits.

Rounding Body Mass to 2 digits
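A sketch of building the two lists and rounding them, using made-up averages with long decimal tails. The names m_list and f_list are from the article; the values and the column name are placeholders.

```python
import pandas as pd

# Placeholder averaged data frames with deliberately long decimals.
df_male_penguins = pd.DataFrame({
    "species": ["Adelie", "Chinstrap", "Gentoo"],
    "body_mass_g": [4000.123456789012, 3900.987654321098, 5500.555555555555],
})
df_female_penguins = pd.DataFrame({
    "species": ["Adelie", "Chinstrap", "Gentoo"],
    "body_mass_g": [3400.111111111111, 3500.222222222222, 4700.999999999999],
})

# Convert the average body mass column of each data frame to a list.
m_list = df_male_penguins["body_mass_g"].tolist()
f_list = df_female_penguins["body_mass_g"].tolist()

# Round every value to 2 digits after the decimal point.
m_list = [round(value, 2) for value in m_list]
f_list = [round(value, 2) for value in f_list]

print(m_list)
print(f_list)
```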

Write code to produce the bar chart with 2 bars for each species: one blue and one pink.

Code for 2 bars per species
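One common way to draw two bars per category is to shift each series left or right of a shared numeric position. The sketch below assumes blue for the male averages and pink for the female averages; the article does not state which colour belongs to which group, and the values are placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder lists; in the notebook these come from the earlier steps.
species_list = ["Adelie", "Chinstrap", "Gentoo"]
m_list = [4000.12, 3900.99, 5500.56]
f_list = [3400.11, 3500.22, 4701.0]

x = np.arange(len(species_list))  # one numeric position per species
width = 0.4                       # width of a single bar

plt.figure(figsize=(8, 5))
# Shift each series by half a bar width so the pair sits side by side.
male_bars = plt.bar(x - width / 2, m_list, width, color="blue")
female_bars = plt.bar(x + width / 2, f_list, width, color="pink")
plt.xticks(x, species_list)       # label each pair with its species name
plt.title("Average Body Mass by Species and Sex")
plt.ylabel("Body mass (g)")
```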

Run the code to produce the first draft of the Bar Chart with body mass values for male and female penguins.

Output of 2 Bars plotted for each Species

Additional updates to the code make the bar chart visualization a bit cleaner: removing some of the border lines (left, right, top), adding values on top of each column, and adding a legend for each group.

Updated Code for 2 Bars plotted for each Species
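These clean-up steps might be sketched as follows. The colour assignment, legend labels, and values are assumptions, and the value labels are drawn with plain text calls rather than whatever the original screenshot used.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder lists; in the notebook these come from the earlier steps.
species_list = ["Adelie", "Chinstrap", "Gentoo"]
m_list = [4000.12, 3900.99, 5500.56]
f_list = [3400.11, 3500.22, 4701.0]

x = np.arange(len(species_list))
width = 0.4

plt.figure(figsize=(8, 5))
ax = plt.gca()
male_bars = ax.bar(x - width / 2, m_list, width, color="blue", label="Male")
female_bars = ax.bar(x + width / 2, f_list, width, color="pink", label="Female")
ax.set_xticks(x)
ax.set_xticklabels(species_list)
ax.set_title("Average Body Mass by Species and Sex")

# Hide the left, right, and top border lines; keep the bottom axis line.
for side in ("left", "right", "top"):
    ax.spines[side].set_visible(False)

# Write each bar's value just above the bar, centred horizontally.
for bar in list(male_bars) + list(female_bars):
    ax.text(bar.get_x() + bar.get_width() / 2, bar.get_height(),
            f"{bar.get_height():.2f}", ha="center", va="bottom")

ax.legend()  # uses the label= values given to ax.bar
```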

Final Bar Chart Version

Updated Bar Chart Data Visualization of Average Body Mass for Male and Female Penguins by Species