# Creating a Box Plot

## 1. Overview

This walkthrough shows you how to set up a box plot chart, also known as a box-and-whisker diagram. A box plot shows, in a single visual, several key indicators of how the values in a dataset are distributed.

## 2. Background

The following diagram shows the elements of a box plot chart.

There are typically five measure values associated with a box plot data point:

- Upper Whisker (95th Percentile): About 5 percent of the values in the data are greater than this value. (Some box plots use the highest value instead, when outliers are not displayed separately as dots.)
- Lower Whisker (5th Percentile): About 5 percent of the values in the data are less than this value. (Some box plots use the smallest value instead, when outliers are not displayed separately as dots.)
- Upper Box (Upper Quartile): About 25 percent of the values in the data are greater than this value.
- Lower Box (Lower Quartile): About 25 percent of the values in the data are less than this value.
- Solid Band (Median): The median (middle) value in the data.
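As a rough illustration of the five values listed above, the following Python sketch computes them for a plain list of numbers. The function names `percentile` and `box_plot_summary`, the `whisker_pct` parameter, and the linear-interpolation method are assumptions made for this example; they are not the formulas that the product generates, and other percentile methods can give slightly different results.

```python
def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in 0..100) of a pre-sorted list.

    Hypothetical helper for illustration only.
    """
    if not sorted_values:
        raise ValueError("need at least one value")
    # Fractional position of the requested percentile within the list.
    rank = (len(sorted_values) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    low_value = sorted_values[lower]
    high_value = sorted_values[upper]
    return low_value + (high_value - low_value) * fraction


def box_plot_summary(values, whisker_pct=5):
    """Return the five measure values of one box plot data point."""
    ordered = sorted(values)
    return {
        "lower_whisker": percentile(ordered, whisker_pct),
        "lower_box": percentile(ordered, 25),
        "median": percentile(ordered, 50),
        "upper_box": percentile(ordered, 75),
        "upper_whisker": percentile(ordered, 100 - whisker_pct),
    }


if __name__ == "__main__":
    sample = list(range(1, 102))  # the whole numbers 1 through 101
    print(box_plot_summary(sample))
```

Passing `whisker_pct=10` to this sketch places the whiskers at the 10th and 90th percentiles instead, which is another common convention.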

In the toolbar, the **Calculate Box Plot** option (the one with an *fx* in its icon) handles the setup and calculations for you. It creates a formula metric set visualized as box plots, with some "outlier" points visible beyond the ends of each box plot.

The toolbar also has a **Box Plot** chart type, which draws the box plots but does not calculate or summarize the values of a dataset. Use it if you have already calculated the statistical values for a dataset; otherwise, follow the walkthrough below, which uses the Calculate Box Plot option.

## 3. Walkthrough

### 3.1. Summarize With Box Plot

Create a new dashboard using the *Blank* template.

Go to the toolbar, click **Data Visualization**, and then select **Calculate Box Plot**. This adds a blank data visualization to the canvas.

### 3.2. Set up a Strip Plot

Go to **Explore** and drag the *Reseller Sales Amount* measure onto the data visualization. A single dot or data point is displayed.

Next, add *Product* as a row hierarchy. The result is a *Strip Plot* which displays a vertical arrangement of dots or data points.

A strip plot such as this can still give you an idea about the distribution of data. For example, the dots are drawn semi-transparently so that darker areas indicate there are multiple data points close together.

### 3.3. Add Box Plot

With the chart selected on the canvas, go to the **toolbar** and click **Add Box Plot**.

The chart now displays the box-and-whisker diagram as a second series, which is actually a formula metric set. Use the dropdown at the top of the Data Analysis Panel to switch to this newly added metric set. You'll see five formula measures listed, which represent the calculations needed for the box plot.

Above the box-and-whisker diagram portion of the chart, you'll see a set of dots. These data points are outliers which belong to the first series.
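To make the idea of outliers concrete, here is a minimal Python sketch that separates values lying beyond the whiskers from those inside them. It assumes whiskers at the 5th and 95th percentiles, as described in the Background section. The function name `split_outliers` is hypothetical, and the sketch uses `statistics.quantiles` from the Python standard library (Python 3.8 or later), whose interpolation may not match the product's own percentile calculation.

```python
from statistics import quantiles


def split_outliers(values, whisker_pct=5):
    """Split values into those within the whiskers and those beyond them.

    whisker_pct must be a whole number from 1 to 49 (assumption of this sketch).
    """
    # quantiles(..., n=100) returns 99 cut points; index i-1 is the i-th percentile.
    cuts = quantiles(values, n=100, method="inclusive")
    low = cuts[whisker_pct - 1]
    high = cuts[100 - whisker_pct - 1]
    inside = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return low, high, inside, outliers


if __name__ == "__main__":
    low, high, inside, outliers = split_outliers(list(range(1, 102)))
    print("whiskers:", low, high)
    print("outliers:", outliers)
```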

Go to the **Properties** for the chart to see two series listed. The first series is displayed as a Point chart while the second series is displayed as a Box Plot chart.

In the Data Analysis Panel, click the **Edit** button of the first formula measure. In the *Configure Metric Set Element* dialog, scroll down and click **Formula**. In the formula bar, you can see that it calculates the 95th percentile, which you can modify as needed (e.g., some box plots use the 90th percentile for the upper whisker).

### 3.4. Multiple Box Plot data points

In the Data Analysis Panel, with the formula metric set still displayed, click the **Edit** button at the top. In the *Configure Metric Set* dialog, scroll all the way down and click **Remove this metric set from the control**. This will remove the box plot.

Go to **Explore** and add *Date.Calendar* as a second row hierarchy in the Data Analysis Panel. You'll see a strip plot displayed for each date value.

With the chart still selected, go to the **toolbar** and click **Add Box Plot**. A box-and-whisker diagram is displayed for each date (year).
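Conceptually, one box plot per year means that the five measure values are calculated separately for each year's group of values. The Python sketch below illustrates that grouping step. The row layout `(year, product, amount)`, the sample figures, and the function name `summaries_by_year` are all invented for this example, and `statistics.quantiles` (Python 3.8 or later) stands in for whatever percentile calculation the product actually uses.

```python
from collections import defaultdict
from statistics import quantiles


def summaries_by_year(rows, whisker_pct=5):
    """Compute the five box plot values for each year.

    rows: iterable of (year, product, amount) tuples (hypothetical layout).
    Each year needs at least two amounts for quantiles() to work.
    """
    amounts_by_year = defaultdict(list)
    for year, _product, amount in rows:
        amounts_by_year[year].append(amount)

    result = {}
    for year, amounts in sorted(amounts_by_year.items()):
        # 99 cut points; index i-1 holds the i-th percentile.
        cuts = quantiles(amounts, n=100, method="inclusive")
        result[year] = {
            "lower_whisker": cuts[whisker_pct - 1],
            "lower_box": cuts[24],
            "median": cuts[49],
            "upper_box": cuts[74],
            "upper_whisker": cuts[99 - whisker_pct],
        }
    return result


if __name__ == "__main__":
    # Made-up sample rows, not real sales data.
    sample_rows = [
        (2011, "Bike A", 1200.0), (2011, "Bike B", 950.0), (2011, "Helmet", 80.0),
        (2012, "Bike A", 1500.0), (2012, "Bike B", 1100.0), (2012, "Helmet", 95.0),
    ]
    for year, summary in summaries_by_year(sample_rows).items():
        print(year, summary)
```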

### 3.5. Cluster of data points

As a variation of the above example, set up a strip plot chart by also adding *Gender* as a **COLUMNS** hierarchy to the metric set.

This gives you a cluster of strip plots for each year, where each cluster has one strip plot for male and one for female.

Go to the **toolbar** and click **Add Box Plot** to get the following result.

## 4. See also