In March 2017, Case & Deaton's paper Mortality and Morbidity in the 21st Century was published by the Brookings Institution. I got into a conversation on Slack with some data viz friends about one of the images in the appendix: I was frustrated that the paper's authors had chosen a dual y-axis chart in which the two axes had different scales and different bases, which I found misleading. In response, I found comparable data from the CDC and created a set of charts using the exact same data but different chart parameters, which I wrote about in Part I.
Part II, "Why a Dual Y-Axis Chart is Not a Normalized Delta Chart", is my response to a friend's follow-up question.
In content, this set of essays is about dual y-axis charts. However, it also shows the value of both having a critical, iterative conversation and creating charts with real data as part of that conversation. I gained a much deeper understanding of the nuance of these issues by iterating on the charts and in the back-and-forth conversation. Cheers to both creating and questioning!
In spring 2016, there was a bit of a debate sparked about whether "scrollytelling" or "steppers" is best. Like the “which visualization is best” and the “are pie charts really evil” debates, the question of "is scrollytelling or steppers best" doesn't really make sense to me. It’s like asking “which is the best tool: a hammer or a wrench?” There is no way to answer that question unless you know what the person is trying to do.
I wrote the article Why Choose? Scrollytelling & Steppers to explore why scrollytelling seems to work well, when steppers work well, and also to showcase a number of examples that take advantage of both techniques in some way.
In spring 2016, I collaborated with the team at Stamen on "Atlas of Emotions." One of the key challenges in this project was creating charts that conveyed both an immediate sense of an emotion and the intensity of that emotion. In The Shapes of Emotions, I share some of the visual effects that helped give this intuitive sense of an emotion, the ways I addressed various challenges, and what we learned in the process.
"As a child, I dreamed of being a National Geographic photographer. What could be better than going exploring to find just the right perspective to help everyone appreciate and better understand this amazing world we call home. I never expected that I would partially realize this dream in a completely different way. Instead of a camera’s lens, my tools included code, design, maps, and data. My first project with Stamen was creating an interactive page where users would compare and contrast maps showing various types of human impact across the Amazon Basin...."
- Read more in Exploring the Amazon with Code and Data
Joint Statistical Meetings Aug 2016, Chicago - Invited Speaker on the Recent Advances in Information Visualization panel, sharing research previously published in Proceedings of IEEE InfoVis
OpenVis Conf Apr 2016, Boston - Everything is Seasonal - video
Abstract: People, and our data, are heavily influenced by our regular hourly, daily, weekly, seasonal and annual patterns, as well as by typical (holidays, weather variation) and one-off (natural disasters, electrical outages, war, etc) aberrations to these patterns.
Time series analysis must take seasonality and seasonal variation into consideration, but so should any analysis comparing data at two different points in time. For example, frustrated with how much worse the traffic has gotten from July to November? Worried it's just going to keep getting worse? Consider that most commuters take a week's vacation during the 8 weeks of summer, but almost nobody takes vacation in early November. Based on that alone, you'd expect there to be around 14% more cars on the road in early November than in July.
This talk uses examples to illustrate 6 key recommendations for thinking about the role of seasonality in your data, both to avoid (sometimes surprising) pitfalls and to reveal insights that are too often aggregated away.
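The 14% figure in the abstract follows from simple arithmetic. A minimal sketch, assuming each commuter's vacation week is spread evenly across the 8 summer weeks and nobody is away in early November (the function name and parameters are illustrative, not taken from the talk):

```python
def expected_increase(vacation_weeks: float, season_weeks: float) -> float:
    """Relative increase in cars on the road once nobody is on vacation,
    compared with a season in which each commuter is away for
    `vacation_weeks` out of `season_weeks` (assumed evenly spread)."""
    share_on_road = 1 - vacation_weeks / season_weeks  # 7/8 of commuters in summer
    return 1 / share_on_road - 1                       # 8/7 - 1 = 1/7


increase = expected_increase(vacation_weeks=1, season_weeks=8)
print(f"{increase:.1%}")  # prints 14.3%
```

The comparison is relative to the July baseline (7/8 of commuters), which is why the increase is 1/7, about 14%, rather than 1/8.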
IEEE InfoVis Nov 2014, Paris - Visualizing Statistical Mix Effects and Simpson's Paradox - paper
Abstract: We discuss how “mix effects” can surprise users of visualizations and potentially lead them to incorrect conclusions. This statistical issue (also known as “omitted variable bias” or, in extreme cases, as “Simpson's paradox”) is widespread and can affect any visualization in which the quantity of interest is an aggregated value such as a weighted sum or average. Our first contribution is to document how mix effects can be a serious issue for visualizations, and we analyze how mix effects can cause problems in a variety of popular visualization techniques, from bar charts to treemaps. Our second contribution is a new technique, the “comet chart,” that is meant to ameliorate some of these issues.
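A mix effect of the kind the abstract describes can be reproduced with a few lines of arithmetic. The sketch below uses made-up numbers and hypothetical segment names (none of them come from the paper): the average rises within every segment, yet the overall weighted average falls because the mix of segments shifts.

```python
def weighted_average(segments: dict) -> float:
    """Overall average from {segment: (weight, segment_average)}."""
    total_weight = sum(weight for weight, _ in segments.values())
    return sum(weight * avg for weight, avg in segments.values()) / total_weight


# Hypothetical data: (number of users, average value per user)
before = {"A": (900, 10.0), "B": (100, 2.0)}
after = {"A": (100, 11.0), "B": (900, 3.0)}

# Every segment's average goes up...
for name in before:
    print(name, before[name][1], "->", after[name][1])

# ...but the aggregate goes down, because most users are now in segment B.
print(weighted_average(before), "->", weighted_average(after))
```

With these numbers the overall average drops from about 9.2 to about 3.8 even though both segments improved. A chart showing only the aggregate would hide this, which is the situation the abstract warns about.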
teaching & workshops
Elijah Meeks' Complex Data Visualization with D3 weekend workshop - taught a 3-hour Drawing with Data: Creating Custom Visualizations workshop session
Lick-Wilmerding High School in San Francisco - 10th grade class - guest teacher introducing data visualization
BB&N grade school in Boston - 2nd grade class - guest teacher introducing data visualization
Google's internal training program - taught several half-day data visualization courses for Googlers
speaker series & bay area meetups
Metis San Francisco Data Science Meetup, April 2016 - Panel Discussion: Creating Custom, But Generalizable, Visualizations
Bay Area D3 Meetup, July 2015 - Map Matching - video
SF Data Mining Meetup, June 2015 - Your Data Doesn't Mean What You Think It Does - slide deck
SF D3 Meetup, Feb 2015 - Math to D3 - video
UC Davis SIAM Mathematics in Industry Speaker Series, May 2014 - Visualizing Data for Analysis and Data-Driven Questions
Abstract: In my 5 years analyzing data on Google's "revenue team," I discovered that data visualization could, and should, be a critical part of effective analysis. In this talk I'll demonstrate why it's important to look beyond top-line metrics and how great visualizations help make sense of tens, hundreds, thousands, millions, or billions of data points at a time. I will also share the two most important techniques I've learned for creating effective data visualizations and tell a bit of the story of how I got from abstract math research as an undergrad to teaching high schoolers to Google analyst to freelance data visualization.
USF's Digital Literacy Course for undergraduates - shared my experiences as a data visualization professional
TechChange's Technology for Data Visualization online course - interviewed as a guest expert, with a focus on sharing different views on common data