LibGuides: Statistics Resources: Displaying Data

Data Display Basics

When choosing a visual representation for a dataset, several factors matter. The most important is the purpose of the graph: what do you want it to show? The answer is the starting point for choosing an appropriate visual.

Another important factor is the level of measurement of the variable(s) being displayed. With categorical data (nominal or ordinal levels of measurement), we often want to report the frequency, or number of people, in each category. With numerical data (interval or ratio levels of measurement), we want to show how the data is distributed. Some graphs combine variables of different types. Continue reading to learn about common methods of displaying data.
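As a rough illustration, the link between level of measurement and choice of display can be written as a small lookup. This is only a sketch: the table and the function names (DISPLAYS, data_type, suggest_displays) are hypothetical, and the suggestions simply restate the displays covered in this guide rather than a complete rule.

```python
# Levels of measurement named in this guide, grouped by data type.
CATEGORICAL = {"nominal", "ordinal"}
NUMERICAL = {"interval", "ratio"}

# Hypothetical lookup table summarizing the displays this guide covers.
DISPLAYS = {
    ("categorical",): ["bar graph", "pie chart"],
    ("numerical",): ["histogram", "stem-and-leaf plot", "boxplot"],
    ("categorical", "numerical"): ["bar graph of group means", "grouped boxplot"],
    ("numerical", "numerical"): ["scatterplot", "line graph (when one variable is time)"],
}


def data_type(level):
    """Classify a level of measurement as categorical or numerical."""
    if level in CATEGORICAL:
        return "categorical"
    if level in NUMERICAL:
        return "numerical"
    raise ValueError(f"unknown level of measurement: {level!r}")


def suggest_displays(*levels):
    """Return candidate displays for one or two variables (empty list if none listed)."""
    key = tuple(sorted(data_type(level) for level in levels))
    return DISPLAYS.get(key, [])


print(suggest_displays("nominal"))
print(suggest_displays("ratio", "nominal"))
```

The purpose of the graph still comes first; a lookup like this only narrows the options.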

Categorical Data

Bar Graph – A bar graph (or bar chart) displays the frequency of each group of a categorical variable, such as gender, level of education, or race/ethnicity. For example, the following bar graph depicts the number of individuals who identified as male or female in a dataset.

[Figure: bar graph of counts by gender]
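The numbers behind a frequency bar graph are just counts per category. The sketch below tallies them and prints a rough text version of the bars. The responses are made up for illustration and are not the guide's dataset; text_bar_graph is a hypothetical helper.

```python
from collections import Counter

# Hypothetical responses; the guide's actual dataset is not reproduced here.
responses = ["female", "male", "female", "female", "male", "female"]

# Frequency of each category: this is the height of each bar.
counts = Counter(responses)


def text_bar_graph(freqs):
    """Return one line per category, drawing one '#' per observation."""
    width = max(len(label) for label in freqs)
    return [
        f"{label:<{width}} | {'#' * n} ({n})"
        for label, n in sorted(freqs.items())
    ]


for line in text_bar_graph(counts):
    print(line)
```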

Bar graphs can also be used to show a measure of central tendency (e.g., the mean) of a numerical variable for each group. The following bar chart shows the mean dependent-variable score for each treatment group after treatment.

[Figure: bar chart of mean score by treatment group]
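For a bar chart of group means, each bar's height is the mean of the numerical variable within one group. A minimal sketch follows, assuming the data arrives as (group, score) pairs. The group labels, the scores, and the helper group_means are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (group, post-treatment score) pairs.
records = [
    ("control", 10), ("control", 12), ("control", 14),
    ("treatment A", 15), ("treatment A", 17),
    ("treatment B", 20), ("treatment B", 22), ("treatment B", 24),
]


def group_means(rows):
    """Map each group label to the mean of its scores (one bar per group)."""
    scores = defaultdict(list)
    for group, score in rows:
        scores[group].append(score)
    return {group: mean(values) for group, values in scores.items()}


means = group_means(records)
for group, value in means.items():
    print(f"{group}: {value}")
```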

Pie Chart - A pie chart is used to show the proportion or percentage of responses belonging to the different levels of a categorical variable. The following pie chart shows the proportion of participants who were assigned to each of the treatment groups.

[Figure: pie chart of treatment group proportions]
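Each slice of a pie chart corresponds to a category's share of the total, and the slice angle is that share of 360 degrees. The sketch below computes both, using made-up group assignments and a hypothetical helper named proportions.

```python
from collections import Counter

# Hypothetical treatment assignments for 20 participants.
assignments = ["control"] * 10 + ["treatment A"] * 5 + ["treatment B"] * 5


def proportions(labels):
    """Return each category's share of all observations (shares sum to 1)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


shares = proportions(assignments)
for label, share in shares.items():
    # A slice's angle is its proportion of the full 360-degree circle.
    print(f"{label}: {share:.0%} of responses, {share * 360:.0f} degrees")
```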

Numerical Data

Histogram – A histogram is used to show the shape of the frequency distribution of a numerical variable (interval or ratio level of measurement). A visual assessment of a histogram can help you determine whether the data is approximately normally distributed. The following histogram shows the distribution of dependent scores that were obtained before treatment.

[Figure: histogram of pre-treatment scores]

While a histogram may look similar to a bar graph, the two differ in an important way. The bars in a bar graph do not touch because each bar represents a distinct group of individuals. In a histogram, each bar represents an interval (a subset of the range) of the continuous data, so adjacent bars touch.
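To make the idea of bars as intervals concrete, the sketch below sorts scores into equal-width intervals and counts how many fall in each. It assumes integer scores and a range that divides evenly by the bin width; the scores and the helper histogram_counts are hypothetical.

```python
def histogram_counts(values, low, high, bin_width):
    """Count values in consecutive intervals of equal width.

    Each interval includes its lower edge and excludes its upper edge,
    except the last interval, which also includes `high`.
    """
    n_bins = int((high - low) // bin_width)
    counts = [0] * n_bins
    for v in values:
        if not low <= v <= high:
            continue  # ignore values outside the plotted range
        index = min(int((v - low) // bin_width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical pre-treatment scores.
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 82, 85, 91, 100]
counts = histogram_counts(scores, low=50, high=100, bin_width=10)

for i, n in enumerate(counts):
    start = 50 + i * 10
    print(f"{start}-{start + 10}: {'#' * n}")
```

Because the intervals share edges and cover the whole range without gaps, the bars sit directly against one another.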

Stem-and-Leaf Plot - Another option for visualizing the distribution of numerical data is a stem-and-leaf plot. This type of graph splits each value into a stem (the leading digit or digits) and a leaf (the final digit). Where to make the split depends on the data that you're trying to chart. Consider the following example:

In the chart, values are shown in thousands. The stem represents the ten-thousands place, while the leaves represent the thousands place. So, 1|0 = 10,000. Each leaf is a separate entry in the dataset.
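The stem-and-leaf split can be sketched in a few lines. The example assumes non-negative whole numbers already expressed in thousands, so that integer division by 10 gives the stem and the remainder gives the leaf. The prices and the helper stem_and_leaf are invented and are not the values from the chart above.

```python
def stem_and_leaf(values_in_thousands):
    """Return one 'stem|leaves' line per stem, including empty stems."""
    ordered = sorted(values_in_thousands)
    first_stem, last_stem = ordered[0] // 10, ordered[-1] // 10
    stems = {stem: [] for stem in range(first_stem, last_stem + 1)}
    for v in ordered:
        stems[v // 10].append(v % 10)  # stem = tens digit, leaf = ones digit
    return [
        f"{stem}|{''.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in stems.items()
    ]


# Hypothetical prices in thousands (10 means 10,000).
prices = [10, 23, 12, 38, 15, 21, 23, 30]
for line in stem_and_leaf(prices):
    print(line)
```

Reading the first line back, stem 1 with leaves 0, 2, and 5 stands for 10,000, 12,000, and 15,000.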

Box and Whisker Plot - Like histograms and stem-and-leaf plots, box and whisker plots (boxplots) can be used to show how numerical data is distributed. Unlike those graphs, boxplots can display multiple distributions in the same visual. Additionally, boxplots clearly identify outliers in the data. The following boxplot depicts the distribution of vehicle prices based on vehicle type.
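A boxplot is drawn from a handful of summary values. The sketch below computes the quartiles for each group and flags outliers using the common 1.5 x IQR rule. That rule and the "inclusive" quartile method are assumptions here; statistical packages differ in how they compute quartiles, so results may not match your software exactly. The prices and the helper box_summary are hypothetical.

```python
from statistics import quantiles


def box_summary(values):
    """Return the values a boxplot draws: quartiles, whisker ends, outliers."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "whisker_low": inside[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Hypothetical vehicle prices in thousands, grouped by vehicle type.
prices_by_type = {
    "car": [14, 16, 18, 20, 22, 24, 26, 28, 60],
    "truck": [20, 22, 25, 27, 30, 33, 35],
}

summaries = {kind: box_summary(vals) for kind, vals in prices_by_type.items()}
for kind, summary in summaries.items():
    print(kind, summary)
```

Computing one summary per group is what lets a single boxplot figure compare several distributions side by side.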

Line Graph - Line graphs depict changes in numerical data over time. Financial data, such as monthly or quarterly sales, is a good fit for a line graph. The line graph below depicts the average sales for each quarter in 2022.
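The points on a quarterly line graph can be derived from monthly figures by averaging each block of three months. The sketch below does this and also reports the change between consecutive quarters, which is what the slope of each line segment conveys. The monthly sales values and the helper quarterly_means are made up.

```python
from statistics import mean

# Hypothetical monthly sales for one year, January through December.
monthly_sales = [100, 110, 120, 130, 125, 135, 150, 160, 155, 170, 180, 190]


def quarterly_means(monthly):
    """Average each consecutive block of three months into one quarter."""
    if len(monthly) != 12:
        raise ValueError("expected 12 monthly values")
    return [mean(monthly[i:i + 3]) for i in range(0, 12, 3)]


averages = quarterly_means(monthly_sales)
points = list(zip(["Q1", "Q2", "Q3", "Q4"], averages))

# Change from one quarter to the next: the rise of each line segment.
changes = [later - earlier for earlier, later in zip(averages, averages[1:])]

print(points)
print(changes)
```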

Scatterplot – A scatterplot shows the relationship between two numerical variables, with each point representing one observation. The following scatterplot depicts the relationship between the original sale price and the 4-year resale value of the vehicles in the dataset.

[Figure: scatterplot of original sale price vs. 4-year resale value]
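To show how a scatterplot places one point per observation, the sketch below scales (x, y) pairs onto a small character grid. It assumes at least two distinct x values and two distinct y values. The vehicle figures and the helper text_scatter are hypothetical.

```python
def text_scatter(points, width=20, height=8):
    """Return a list of text rows with one '*' per (x, y) point."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        # Scale each coordinate to a column/row index within the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return ["".join(cells) for cells in grid]


# Hypothetical (original price, 4-year resale value) pairs, in thousands.
vehicles = [(15, 8), (20, 11), (25, 13), (30, 17), (40, 22), (55, 30)]
rows = text_scatter(vehicles)
for line in rows:
    print("|" + line)
```

In this made-up data the points climb from lower left to upper right, the pattern you would expect when higher-priced vehicles also resell for more.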
