Statistics Resources

This guide contains all of the ASC's statistics resources. If you do not see a topic, suggest it through the suggestion box on the Statistics home page.

Data Display Basics

When choosing a visual representation for a dataset, several factors matter. The most important is the purpose of the graph: what do you want it to show? The answer will serve as a starting point for choosing an appropriate visual.

Another important factor is the level of measurement of the variable(s) being displayed. With categorical data (nominal or ordinal levels of measurement), we often want to report the frequency, or number of people, in each category. With numerical data (interval or ratio levels of measurement), we want to show how the data is distributed. Some graphs also combine different variable types. Continue reading to learn about common methods of displaying data.

Categorical Data

Bar Graph - A bar graph (or bar chart) is used to display the frequency of each group of a categorical variable, such as gender, level of education, or race/ethnicity. For example, the following bar graph depicts the number of individuals who identified as male or female in a dataset.
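The frequencies that a bar graph displays can be tallied before any plotting. The following is a minimal sketch using only Python's standard library; the list of responses is invented for illustration and is not the dataset pictured in this guide.

```python
from collections import Counter

# Hypothetical responses (invented values, not the guide's dataset).
responses = ["female", "male", "female", "female", "male", "female"]

# Count how many individuals fall into each category.
frequencies = Counter(responses)

# Print one text bar per category; the bar length equals the frequency.
for category, count in sorted(frequencies.items()):
    print(f"{category:<8}{'#' * count} ({count})")
```

Each printed bar corresponds to one category, which is why the bars of a bar graph stay separate from one another.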

Bar graphs can also be used to show a measure of central tendency (e.g., the mean) of a numerical variable for each group. The following bar chart shows the average dependent score for each treatment group after treatment.
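A bar chart of group means needs one mean per group. As a rough sketch, assuming the scores are already organized by group, the bar heights could be computed as follows; the group names and scores are hypothetical.

```python
from statistics import mean

# Hypothetical post-treatment scores by group (invented values).
scores_by_group = {
    "control": [10, 12, 14],
    "treatment A": [15, 17, 19],
    "treatment B": [20, 22, 24],
}

# One mean per group; each mean would become the height of one bar.
group_means = {group: mean(scores) for group, scores in scores_by_group.items()}

for group, value in group_means.items():
    print(f"{group}: {value}")
```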

Pie Chart - A pie chart is used to show the proportion or percentage of responses belonging to the different levels of a categorical variable. The following pie chart shows the proportion of participants who were assigned to each of the treatment groups.
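The slices of a pie chart are percentages of the whole, so they should sum to 100. A small sketch of that calculation, using invented group assignments, might look like this:

```python
from collections import Counter

# Hypothetical treatment group assignments (invented values).
assignments = ["A", "B", "A", "C", "B", "A", "C", "B"]

counts = Counter(assignments)
total = sum(counts.values())

# Convert each count to a percentage of all participants.
percentages = {group: 100 * n / total for group, n in counts.items()}

for group in sorted(percentages):
    print(f"Group {group}: {percentages[group]:.1f}%")
```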

Numerical Data

Histogram - A histogram is used to show the shape of the frequency distribution of a numerical variable (interval or ratio level of measurement). A visual assessment of a histogram can help you determine whether the data is approximately normally distributed. The following histogram shows the distribution of dependent scores that were obtained before treatment.

While a histogram may look similar to a bar graph, there is a key difference. The bars in a bar graph do not touch because each bar represents a distinct group of individuals. The bars in a histogram do touch because each bar represents an interval (a subset of the range) of continuous data, and adjacent intervals share a boundary.
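To make the idea of bars as intervals concrete, the sketch below sorts values into equal-width intervals, which is the counting step behind a histogram. The function name `bin_counts`, the interval width, and the scores are all assumptions chosen for illustration; statistical software usually picks the intervals for you.

```python
def bin_counts(values, start, width, n_bins):
    """Count values in n_bins equal-width intervals beginning at start.

    Interval i covers start + i*width up to (but not including)
    start + (i+1)*width. Values outside all intervals are ignored.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

# Hypothetical scores (invented values).
scores = [52, 55, 61, 63, 64, 68, 70, 71, 73, 75, 77, 82, 85, 91]
start, width, n_bins = 50, 10, 5
counts = bin_counts(scores, start, width, n_bins)

# Print one text bar per interval.
for i, count in enumerate(counts):
    low = start + i * width
    print(f"{low}-{low + width - 1}: {'#' * count}")
```

Because the intervals sit side by side on a continuous scale, the bars of the resulting histogram are drawn touching.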

Stem-and-Leaf Plot - Another option for visualizing the distribution of numerical data is a stem-and-leaf plot. This type of graph splits each value into a stem (the leading digit or digits) and a leaf (the final digit). Where the split falls depends on the data that you're trying to chart. Consider the following example:

In the chart, values are shown in thousands. The stem represents the ten-thousands place, while the leaves represent the thousands place. So, 1|0 = 10,000. Each leaf is a separate entry in the dataset.
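A stem-and-leaf plot is simple enough to build as plain text. The sketch below follows the convention described above (values in thousands, so the stem is the tens digit and the leaf is the ones digit); the function name `stem_and_leaf` and the prices are hypothetical and are not the data from the chart in this guide.

```python
from collections import defaultdict

def stem_and_leaf(values_in_thousands):
    """Group whole-number values by stem (tens digit), listing leaves (ones digits) in order."""
    stems = defaultdict(list)
    for v in sorted(values_in_thousands):
        stem, leaf = divmod(v, 10)
        stems[stem].append(leaf)
    return dict(stems)

# Hypothetical prices in thousands (invented values); 10 means 10,000.
prices = [10, 12, 12, 15, 21, 24, 28, 33, 37]
plot = stem_and_leaf(prices)

for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

Note that this sketch omits stems that have no leaves; a full plot would normally still list them so that gaps in the data stay visible.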

Box and Whisker Plot - Like histograms and stem-and-leaf plots, box and whisker plots (boxplots) can be used to show how numerical data is distributed. Unlike those graphs, boxplots can display multiple distributions in the same visual. Additionally, boxplots clearly identify outliers in the data. The following boxplot depicts the distribution of vehicle prices based on vehicle type.
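The guide does not say how a boxplot decides which points are outliers. One common convention, assumed here, flags any value more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile. The sketch below applies that rule to two hypothetical groups; the data, the function name `find_outliers`, and the choice of quartile method are all assumptions, and statistical packages may compute quartiles slightly differently.

```python
from statistics import quantiles

def find_outliers(values):
    """Return values beyond 1.5 * IQR from the quartiles (an assumed convention)."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical vehicle prices in thousands, by vehicle type (invented values).
prices_by_type = {
    "car": [12, 14, 15, 16, 18, 19, 21, 22, 45],
    "truck": [20, 22, 23, 25, 26, 27, 28, 30, 31],
}

# One distribution per vehicle type, as in a grouped boxplot.
outliers_by_type = {t: find_outliers(p) for t, p in prices_by_type.items()}

for vehicle_type, outliers in outliers_by_type.items():
    print(f"{vehicle_type}: outliers = {outliers}")
```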

Line Graph - Line graphs depict changes in numerical data over time. Financial data, such as monthly or quarterly sales, is a good candidate for a line graph. The line graph below depicts the average sales for each quarter in 2022.

Scatterplot - A scatterplot shows the relationship between two numerical variables, with each point representing one case. The following scatterplot depicts the relationship between the original sale price and the 4-year resale value of the vehicles in the dataset.