Creating a Box Plot

1. Overview

This walkthrough shows you how to set up a box plot chart, also known as a box-and-whisker diagram. Box plots summarize the distribution of data in a compact format.

2. Background

The following diagram shows the elements of a box plot chart.

Elements of a box plot

There are typically five measure values associated with a box plot data point:

  1. Upper Whisker (Highest Sample): The highest value in the data. Sometimes used to represent the 95th percentile.
  2. Lower Whisker (Lowest Sample): The lowest value in the data. Sometimes used to represent the 5th percentile.
  3. Upper Box (Upper Quartile): About 25 percent of the values in the data are greater than this value.
  4. Lower Box (Lower Quartile): About 25 percent of the values in the data are less than this value.
  5. Solid Band (Median): Represents the median (middle) value in the data. 
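
To make the five measures concrete outside the product, here is a minimal Python sketch that computes them from a list of raw values. The helper names (`percentile`, `box_plot_measures`) and the sample data are made up for this illustration, and the linear-interpolation percentile is one common convention, not necessarily the one DBI uses.

```python
from statistics import median


def percentile(values, p):
    """Linear-interpolation percentile (p between 0 and 100) of a non-empty list."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def box_plot_measures(values, whisker_low=0, whisker_high=100):
    """Return the five box plot measures.

    By default the whiskers are the lowest and highest samples. Pass
    whisker_low=5 and whisker_high=95 to use percentiles instead.
    """
    return {
        "upper_whisker": percentile(values, whisker_high),
        "upper_box": percentile(values, 75),
        "median": median(values),
        "lower_box": percentile(values, 25),
        "lower_whisker": percentile(values, whisker_low),
    }


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up values
measures = box_plot_measures(sample)
```

With percentile whiskers, any value beyond a whisker is no longer covered by the box-and-whisker shape and would have to be drawn separately.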

In the Data Visualization toolbar, there is a Box Plot chart type that you can add to your canvas. The caveat is that you must calculate the five measure values above yourself.

However, in DBI Version 2.5 or later, a new toolbar option, Summarize With Box Plot, handles all of the setup and calculations for you. This walkthrough focuses on that option.

3. Walkthrough

3.1. Summarize With Box Plot

Create a new dashboard using the Blank template.

Go to the toolbar, click Data Visualization, and then select Summarize With Box Plot. This adds a blank data visualization to the canvas.

Click Summarize With Box Plot

3.2. Set up a Strip Plot

Go to Explore and drag the Reseller Sales Amount measure onto the data visualization. A single dot or data point is displayed.

Add Reseller Sales Amount as a measure

Next, add Product as a row hierarchy. The result is a Strip Plot which displays a vertical arrangement of dots or data points.

A strip plot such as this can still give you an idea about the distribution of data. For example, the dots are drawn semi-transparently so that darker areas indicate there are multiple data points close together.

Add a Product hierarchy to create a strip plot
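
The darkening effect can be reasoned about with standard alpha compositing: if each dot has the same opacity, stacking more dots on one spot pushes the combined opacity toward 1. The sketch below assumes plain "over" blending of identical dots. The function name and the 0.5 opacity are hypothetical, and the product's actual rendering may differ.

```python
def stacked_opacity(dot_opacity, overlapping_dots):
    """Combined opacity of identical semi-transparent dots drawn on one spot."""
    if not 0 <= dot_opacity <= 1:
        raise ValueError("dot_opacity must be between 0 and 1")
    if overlapping_dots < 0:
        raise ValueError("overlapping_dots must not be negative")
    # Each extra dot covers a share of whatever background is still visible.
    return 1 - (1 - dot_opacity) ** overlapping_dots


# Hypothetical opacity of 0.5 per dot: more overlap means a darker spot.
opacities = [stacked_opacity(0.5, n) for n in range(4)]
```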

3.3. Add Box Plot

With the chart selected on the canvas, go to the toolbar and click Add Box Plot.

Click Add Box Plot

The chart now displays the box-and-whisker diagram as a second series, which is actually a formula metric set. Use the dropdown at the top of the Data Binding Panel to switch to this newly added metric set. You'll see five formula measures listed, which represent the necessary calculations for the box plot.

Box plot is added as a second metric set

Above the box-and-whisker diagram portion of the chart, you'll see a set of dots. These data points are outliers which belong to the first series.

Go to the Properties for the chart to see two series listed. The first series is displayed as a Point chart while the second series is displayed as a Box Plot chart.

Properties showing two series

In the Data Binding Panel, click the Edit button of the first formula measure. In the Configure Metric Set Element dialog, scroll down and click Formula. In the formula bar, you can see that the formula calculates the 95th percentile, which you can modify as needed (for example, some box plots use the 90th percentile for the highest sample).

Formula for calculating the highest sample
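
To see how the choice of percentile affects the chart, the following Python sketch compares a 95th and a 90th percentile upper whisker and lists the values that end up above each one. This is not the DBI formula language. The function names and sample data are made up, and `statistics.quantiles` with the inclusive method is only one of several percentile conventions.

```python
from statistics import quantiles


def upper_whisker(values, percentile=95):
    """Return the given percentile (1 to 99) using inclusive interpolation."""
    cut_points = quantiles(values, n=100, method="inclusive")
    return cut_points[percentile - 1]


def outliers_above(values, percentile=95):
    """Values that lie above the upper whisker."""
    whisker = upper_whisker(values, percentile)
    return [v for v in values if v > whisker]


sales = list(range(1, 22))  # made-up sample: 1, 2, ..., 21

whisker_95 = upper_whisker(sales, 95)
whisker_90 = upper_whisker(sales, 90)
outliers_95 = outliers_above(sales, 95)
outliers_90 = outliers_above(sales, 90)
```

Lowering the percentile shortens the whisker, so more values fall outside it.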

3.4. Multiple Box Plot data points

In the Data Binding Panel, with the formula metric set still displayed, click the Edit button at the top. In the Configure Metric Set Binding dialog, scroll all the way down and click Remove this metric set from the control. This will remove the box plot.

Remove the formula metric set

Go to Explore and add Date.Calendar as a second row hierarchy in the Data Binding Panel. You'll see a strip plot displayed for each date value.

Add Date.Calendar as a second row hierarchy

With the chart still selected, go to the toolbar and click Add Box Plot. A box-and-whisker diagram is displayed for each date (year).

Show a box plot for each year
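
Conceptually, one box plot per year means grouping the product-level values by year and summarizing each group on its own. The Python sketch below illustrates that idea with made-up rows. The row layout, the function name, and the use of the lowest and highest samples for the whiskers are assumptions for this example, not a description of how DBI computes the result.

```python
from collections import defaultdict
from statistics import median, quantiles

# Hypothetical (year, product, amount) rows.
rows = [
    (2006, "A", 10), (2006, "B", 20), (2006, "C", 30), (2006, "D", 40), (2006, "E", 50),
    (2007, "A", 5), (2007, "B", 15), (2007, "C", 25), (2007, "D", 35), (2007, "E", 45),
]


def summarize_by_year(rows):
    """Return one set of box plot measures per year."""
    amounts_by_year = defaultdict(list)
    for year, _product, amount in rows:
        amounts_by_year[year].append(amount)

    summary = {}
    for year, amounts in sorted(amounts_by_year.items()):
        lower_box, _middle, upper_box = quantiles(amounts, n=4, method="inclusive")
        summary[year] = {
            "upper_whisker": max(amounts),
            "upper_box": upper_box,
            "median": median(amounts),
            "lower_box": lower_box,
            "lower_whisker": min(amounts),
        }
    return summary


summary = summarize_by_year(rows)
```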

3.5. Cluster of data points

As a variation of the above example, set up a strip plot chart by also adding Gender as a COLUMNS hierarchy to the metric set.

This gives you a cluster of strip plots for each year, where each cluster has one male and one female strip plot.

Add Gender as a COLUMNS hierarchy

Go to the toolbar and click Add Box Plot to get the following result.

Cluster of box plots for each year
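
The clustered layout corresponds to a two-level grouping: first by year, then by gender within each year. As a rough sketch of that structure, the Python below builds the nested groups from made-up rows and reduces each one to its median only, to keep the example short. The row layout and function name are hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (year, gender, amount) rows.
rows = [
    (2006, "Male", 10), (2006, "Male", 30), (2006, "Male", 50),
    (2006, "Female", 20), (2006, "Female", 40), (2006, "Female", 60),
    (2007, "Male", 15), (2007, "Male", 25), (2007, "Male", 35),
    (2007, "Female", 5), (2007, "Female", 45), (2007, "Female", 85),
]


def cluster_medians(rows):
    """Group amounts by year, then by gender, and return each group's median."""
    clusters = defaultdict(lambda: defaultdict(list))
    for year, gender, amount in rows:
        clusters[year][gender].append(amount)
    return {
        year: {gender: median(amounts) for gender, amounts in genders.items()}
        for year, genders in clusters.items()
    }


medians = cluster_medians(rows)
```

Each inner dictionary plays the role of one cluster, with one entry per gender.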

Dundas Data Visualization, Inc.
500-250 Ferrand Drive
Toronto, ON, Canada
M3C 3G8

North America: 1.800.463.1492
International: 1.416.467.5100

Dundas Support Hours: 7am-6pm, ET, Mon-Fri