Histogram

OVERVIEW

Histogram was first introduced by Karl Pearson as a visualization to present the distribution of a set of data. It shows the estimation of the probability distribution for a continuous variable like a range of value, which is referred to be “bin”. Histogram is easily recognized as a bar chart with no space between each bar because the bins are consecutive and non-overlap intervals of data. The bins and the bars have to be adjacent and equal of size.

1

  • NOTE: The above histogram is an example from book “Excel Dashboard and Report”. Each bin is a range of unit sold with the bar showing the number of sale representatives in each bin. It includes a line of cumulative percentage in blue. The distribution was dimmed in the background with a light gray color to differentiate with the line.

CREATE ON EXCEL

  • Concept: First, we need to create the bin with equal intervals of data. Then we count a number of Sale representatives in each bin with cumulative % for each bin. Then we create a combo chart with a line and a bar chart.
  • Data: Open Excel file -> Tab Data
  • Raw Data includes only information of sale for each sale representative.

2.JPG

  • Using Frequency Formula to create range and count data for each bin

3.JPG

{=FREQUENCY(B3:B246,D3:D13)}

NOTE: The frequency formula is an array formula (with the bracket {}) pointing to the data (B3: B246) and the bins (D3: D13). Type in the formula in the first cell, then select the entire range up to the last bin, click on the formula bar to edit (or use F2) then use combo Ctrl + Shift + Enter. The formula will automatically apply to all the cells of the range and become an array formula.

  • Create final bin with range and count of sale representatives with cumulative %

4.JPG

NOTE: Unit Sold is created by the two consecutive bins above. The count of sale representative is the count of the end of the bin. For example, range 0-5 will have 8 representatives, which is equal to the number of bin 5. The cumulative % is calculated by taking the ratio of the sum from the first range up to the range it is counting over the total sum of representatives.

=SUM($H$3:H3)/SUM($H$3:$H$12)
  • Create a combo chart with bar graph and line graph

5.JPG

  • Erase the space between each bar by setting 0% Series Overlap and 0% Gap Width

6.JPG

  • Format the graph with dimmed Axis and bar color in light gray. The line is highlighted with blue color and showing the label.

TIPS AND OPINIONS

  • Color: Use proper color for the bar graphhere I use color light gray not to interrupt the line but still give a great visualization to see the distribution.
  • Bins: With the data set, different bin structures will tell a different story. The more intervals of data, the more detail of the distribution will be visualized. Proper use of bin range will make it easier for the reader to understand the story behind.

REFERENCES:

  • Alexander, M., & Walkenbach, J. (2016). Excel dashboards and reports. Hoboken, NJ: John Wiley & Sons.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s