D3.js in Action, Third Edition

6 Visualizing distributions

 

This chapter covers

  • Grouping data points into bins
  • Drawing a histogram
  • Comparing two distributions side-by-side with a pyramid chart
  • Calculating the quartiles of a dataset and generating box plots
  • Using violin plots to compare distributions of multiple categories

Visualizing distributions is a common request in data visualization. A data distribution shows how often values occur within a specific bracket, or how likely a data point is to fall within a given range.
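To make the idea of "how often values occur within a bracket" concrete, here is a minimal sketch in plain JavaScript. The salary values, the `binValues` helper, and the bracket width of $25,000 are all assumptions made up for illustration; they are not taken from the survey data or from the chapter's code. D3 offers its own bin generator, `d3.bin()`, for this task; the sketch avoids it only so that it runs without dependencies.

```javascript
// Hypothetical salaries in US dollars, invented for illustration only
// (not taken from the DVS survey).
const salaries = [42000, 58000, 61000, 75000, 79000, 83000, 95000, 110000, 128000];

// Hypothetical helper: count how many values fall in each bracket [x0, x1)
// of a fixed width. Brackets that contain no values are omitted.
function binValues(values, binWidth) {
  const counts = new Map();
  for (const value of values) {
    // Lower bound of the bracket this value belongs to
    const x0 = Math.floor(value / binWidth) * binWidth;
    counts.set(x0, (counts.get(x0) ?? 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([x0, count]) => ({
      x0,                            // inclusive lower bound
      x1: x0 + binWidth,             // exclusive upper bound
      count,                         // how often values occur in the bracket
      share: count / values.length,  // proportion of all values in the bracket
    }));
}

const bins = binValues(salaries, 25000);
console.log(bins);
```

Each object describes one bracket: `count` answers "how often", while `share` approximates the probability that a value from this sample lands in that range.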

In this chapter, we will study the distribution of salaries for data visualization practitioners based in the United States. The data behind the report we will build comes from the 2021 State of the Industry Survey hosted by the Data Visualization Society (DVS) (www.datavisualizationsociety.org). You can see this report in figure 6.1 or online at https://d3js-in-action-third-edition.github.io/dvs-salary-distribution.

For this report, we will start by building the most common representation of a data distribution, a histogram, to visualize the salaries of the survey’s 788 US-based, salaried respondents. We’ll then compare the wages of respondents identifying as women and men using two types of visualizations: a pyramid chart and box plots. The former is handy for comparing two categories side by side. The latter offer an extra layer of information compared to histograms, revealing the quartiles and median of a dataset.
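As a rough illustration of what quartiles and the median mean, the sketch below computes them in plain JavaScript using one common definition: linear interpolation between the closest ranks. The sample values and the `quantile` helper are assumptions invented for this example, and this is not necessarily how the chapter computes them (section 6.4.1 relies on D3's quantile scale).

```javascript
// Hypothetical salaries in US dollars, invented for illustration only.
const sampleSalaries = [95000, 42000, 79000, 61000, 128000, 58000, 83000, 110000, 75000];

// Hypothetical helper: p-quantile of an ascending-sorted array,
// interpolating linearly between the two closest ranks.
function quantile(sortedValues, p) {
  const position = (sortedValues.length - 1) * p;
  const lower = Math.floor(position);
  const upper = Math.ceil(position);
  const weight = position - lower;
  return sortedValues[lower] * (1 - weight) + sortedValues[upper] * weight;
}

// Sort a copy so the original array is left untouched
const sorted = [...sampleSalaries].sort((a, b) => a - b);

const q1 = quantile(sorted, 0.25);     // first quartile: 25% of values lie below
const median = quantile(sorted, 0.5);  // second quartile: the middle value
const q3 = quantile(sorted, 0.75);     // third quartile: 75% of values lie below

console.log({ q1, median, q3 });
```

A box plot draws its box from `q1` to `q3` and marks `median` inside it, which is the extra layer of information a histogram does not show directly.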

6.1 Binning data

6.2 Drawing a histogram

6.3 Creating a pyramid chart

6.4 Generating box plots

6.4.1 Calculating quartiles with the quantile scale

6.4.2 Positioning multiple box plots on a chart

6.4.3 Drawing a box plot

6.5 Comparing distributions with violin plots

6.6 Summary
