Census Data with R

Using the R Package RankingProject to Make Simple Visualizations for Comparing Populations

Developed and presented by Jerzy Wieczorek.


Skill level: Advanced

Duration: 1-2 hours

This course introduces the "RankingProject" package in R, which accompanies "A Primer on Visualizations for Comparing Populations, Including the Issue of Overlapping Confidence Intervals" (Wright, Klein, and Wieczorek, 2018). In comparing a collection of K populations, it is common practice to display K confidence intervals (CIs) for the corresponding population parameters on a single graph. For a pair of CIs that do (or do not) overlap, many viewers find it natural to declare that there is not (or there is) a statistically significant difference between the two corresponding parameters, even though it is well known that this interpretation is not strictly correct.
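A small numerical sketch can make this pitfall concrete. The estimates and standard errors below are made up for illustration, and the two estimates are assumed to be independent and approximately normal. It uses base R only, not the RankingProject package.

```r
# Two hypothetical estimates with equal standard errors (made-up numbers).
est <- c(A = 10, B = 13)
se  <- c(A = 1,  B = 1)

# Individual 95% confidence intervals.
z95   <- qnorm(0.975)
lower <- est - z95 * se
upper <- est + z95 * se

# The intervals overlap if B's lower bound falls below A's upper bound.
overlap <- unname(lower["B"] < upper["A"])

# Two-sided z-test for the difference, assuming independent estimates.
se_diff <- sqrt(sum(se^2))
z_stat  <- unname((est["B"] - est["A"]) / se_diff)
p_value <- 2 * pnorm(-abs(z_stat))
significant <- p_value < 0.05

print(cbind(est, lower, upper))
cat("CIs overlap:", overlap, "\n")
cat("Difference significant at the 5% level:", significant, "\n")
```

In this sketch the two 95% CIs overlap, yet the test on the difference still rejects at the 5% level, because the standard error of the difference is smaller than the sum of the two individual standard errors.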

We will discuss several alternative visualizations designed to help data users avoid this common misinterpretation. CIs for differences from a baseline make the reference population explicit. "Comparison intervals" show a CI for the reference as well as CIs for its difference with other populations. "Shaded columns plots" show the statistical significance of differences directly. Goldstein-Healy adjusted CIs show a confidence level chosen such that overlap (non-overlap) of CIs does indeed imply non-significance (significance) of differences at an "average significance level" across all possible pairwise comparisons. Two-tiered error bars allow us to show several types of CIs at once.
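To give a feel for the Goldstein-Healy idea, here is a minimal sketch of the simplest special case only: independent estimates that all share the same standard error. The general method averages over pairs with unequal standard errors, which this sketch does not attempt, and it is not the RankingProject implementation. All variable names are illustrative.

```r
# Simplified special case: independent estimates sharing one standard error.
alpha  <- 0.05
z_pair <- qnorm(1 - alpha / 2)   # critical value for one pairwise z-test

# Two intervals est +/- z_adj * se just touch when the estimates differ by
# 2 * z_adj * se. The pairwise test is borderline when they differ by
# z_pair * sqrt(2) * se. Equating the two gives the adjusted multiplier.
z_adj    <- z_pair / sqrt(2)
conf_adj <- 2 * pnorm(z_adj) - 1  # confidence level of the adjusted CIs

# Check: at the gap where adjusted CIs just touch, the z-statistic for
# the difference equals the pairwise critical value.
se     <- 1
gap    <- 2 * z_adj * se
z_stat <- gap / sqrt(2 * se^2)

cat("Adjusted confidence level:", round(conf_adj, 3), "\n")
cat("z-statistic at the touching point:", round(z_stat, 3), "\n")
```

Under these assumptions the adjusted intervals are narrower than ordinary 95% CIs, with a confidence level of roughly 83%, so that non-overlap lines up with a 5% pairwise test.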

We will justify and recommend use-cases for each of these plots. Finally, we will demonstrate how to produce them in R with the RankingProject package, illustrating its usage on several U.S. Census Bureau datasets with a variety of population types and demographic variables.

Who Should Take this Course?

Data analysts, data scientists, and developers who want to learn how to use Census data with R to create visualizations.

Jerzy Wieczorek is an Assistant Professor of Statistics at Colby College. His research focuses on model selection and assessment, from cross-validation in high-dimensional settings to multiple comparisons-corrected visualization of estimates with uncertainty.

Course Materials

Module 1: Motivations

In this module you will learn about:

  • Motivations
  • Reviewing ranking tables, statistical significance and confidence intervals
  • How to best visualize and analyze ranking tables


Module 2: Visualization

In this module you will learn about:

  • Plotting ranking tables and statistical significance
  • Plotting Comparison Intervals
  • The Goldstein-Healy Concept
  • Two-Tiered Confidence Intervals (CIs)


Module 3: The RankingProject R Package

In this module you will learn about:

  • Dataset Structure and Formatting
  • Setting up a Table for Plots of CIs for Differences
  • Cleaning up and Modifying Plots
  • Where to Access the RankingProject R Package
