Data Viz for analysis and discovery
Through my work helping scientists at Google, Berkeley, and Yale analyze their data, as well as my own past experience as a data analyst at Google, I’ve identified some core concepts of visualization that apply across many projects. In 2018, I synthesized these and presented them at the SciPy conference, as a Distinguished Speaker in the University of Washington’s eSciences speaker series, and at the Moss Landing Research Station. The goal was to help scientists create better charts and graphs for their own use, and to show builders of analysis tools, like those at SciPy, which features are worth providing by default.
goals, context, and constraints
Data visualization has lots of “rules”, like “pie charts are bad” and “don’t use rainbow color schemes”. However, these rules assume a certain set of constraints.
For example, the rule against rainbow color schemes assumes that (1) somebody who is colorblind might look at the chart, (2) the chart might be printed in black & white, and (3) a perceptually even color scheme is more important than having the most possible perceptual variation between colors. Yet there are plenty of instances in scientific analysis in which a chart (1) will only be viewed by one person who is not colorblind, (2) will only be viewed on a digital screen, and (3) needs as much perceptual differentiation as possible rather than a perceptually even scheme, because the key question is “is there a difference” rather than “how large is the difference.”
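To make assumptions (2) and (3) concrete, here is a minimal Python sketch that compares how perceived lightness changes along two color schemes. It is an illustration under stated assumptions, not a recommendation for any particular tool: the `rainbow` and `gray_ramp` functions are hypothetical stand-ins (a plain HSV hue sweep and a simple gray gradient), not colormaps taken from a specific library, and lightness is approximated with the standard sRGB-to-CIE L* conversion.

```python
import colorsys


def srgb_to_linear(c):
    """Undo the sRGB gamma curve for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def lightness(rgb):
    """Approximate CIE L* (0 = black, 100 = white) of an sRGB color."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b  # relative luminance
    return 116 * y ** (1 / 3) - 16 if y > 0.008856 else 903.3 * y


def rainbow(t):
    """Hypothetical rainbow scheme: sweep hue from blue (t=0) to red (t=1)."""
    return colorsys.hsv_to_rgb((1 - t) * 2 / 3, 1.0, 1.0)


def gray_ramp(t):
    """Hypothetical sequential scheme: dark gray (t=0) to light gray (t=1)."""
    v = 0.1 + 0.8 * t
    return (v, v, v)


def lightness_profile(scheme, n=9):
    """Sample a scheme at n evenly spaced data values and return each L*."""
    return [lightness(scheme(i / (n - 1))) for i in range(n)]


def is_monotonic(values):
    """True if the values never reverse direction."""
    increasing = all(a <= b for a, b in zip(values, values[1:]))
    decreasing = all(a >= b for a, b in zip(values, values[1:]))
    return increasing or decreasing


rainbow_L = lightness_profile(rainbow)
gray_L = lightness_profile(gray_ramp)

print("rainbow:", [round(v) for v in rainbow_L], "monotonic:", is_monotonic(rainbow_L))
print("gray:   ", [round(v) for v in gray_L], "monotonic:", is_monotonic(gray_L))
```

The rainbow's lightness rises and falls as the data value increases, so a black & white printout can show two different values as the same gray, and equal steps in the data do not look like equal steps in color. Those same swings in hue and lightness are what make neighboring values easy to tell apart on a color screen, which is the trade-off described above.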
As chart creators, we should identify goals/context/constraints for our charts, be aware of the assumptions behind any data viz “rule”, and learn the advantages/disadvantages of various types of charts and design decisions.