6.4. Making meaningful visualizations

“Data is food for AI.” – Andrew Ng

One of the basic skills of a data scientist is the ability to create a meaningful data visualization that helps answer questions you might have. Prior to visualizing your data, you need to ensure that it has been cleaned and prepared, as you did in prior sections. After that, you can start deciding how best to present the data.

In this section, you will review:

  1. How to choose the right chart type

  2. How to avoid deceptive charting

  3. How to work with color

  4. How to style your charts for readability

  5. How to build animated or 3D charting solutions

  6. How to build a creative visualization

6.4.1. Choose the right chart type

In previous sections, you experimented with building all kinds of interesting data visualizations using Matplotlib and Seaborn for charting. In general, you can select the right kind of chart for the question you are asking using this table:

| You need to: | You should use: |
| --- | --- |
| Show data trends over time | Line |
| Compare categories | Bar, Pie |
| Compare totals | Pie, Stacked Bar |
| Show relationships | Scatter, Line, Facet, Dual Line |
| Show distributions | Scatter, Histogram, Box |
| Show proportions | Pie, Donut, Waffle |

Depending on the makeup of your data, you might need to convert it from text to numeric to get a given chart to support it.
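A minimal sketch of such a conversion, assuming pandas is available. The `item` and `price` columns and their values are made up for illustration:

```python
import pandas as pd

# Hypothetical data: prices were read in as text, with one unparseable entry.
df = pd.DataFrame({"item": ["a", "b", "c"], "price": ["4.50", "12", "n/a"]})

# errors="coerce" turns values that cannot be parsed into NaN instead of raising.
df["price"] = pd.to_numeric(df["price"], errors="coerce")

print(df.dtypes)
```

Once the column is numeric, chart types that need numbers, such as histograms, can consume it. Rows that became NaN still need a decision: drop them or fill them in.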

6.4.2. Avoid deception

Even if a data scientist is careful to choose the right chart for the right data, data can still be displayed in ways that prove a point at the cost of undermining the data itself. There are many examples of deceptive charts and infographics!

For a conference talk about deceptive charts, see “How Charts Lie” by Alberto Cairo.

This chart reverses the X axis to show the opposite of the truth, based on date:

bad chart 1

This chart is even more deceptive, as the eye is drawn to the right to conclude that, over time, COVID cases have declined in the various counties. In fact, if you look closely at the dates, you find that they have been rearranged to give that deceptive downward trend.

bad chart 2

This notorious example uses color AND a flipped Y axis to deceive: gun deaths actually spiked after the passage of gun-friendly legislation, but the eye is fooled into thinking the opposite is true:

bad chart 3
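To see how much a flipped axis changes the impression a chart gives, the sketch below draws the same series twice with Matplotlib. The numbers are invented for illustration and are not the data behind the chart above:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up values for illustration only.
years = [2000, 2002, 2004, 2006, 2008, 2010]
counts = [500, 520, 480, 800, 750, 700]

fig, (honest, flipped) = plt.subplots(1, 2, figsize=(10, 4))

honest.plot(years, counts, color="tab:red")
honest.set_title("Conventional Y axis")

flipped.plot(years, counts, color="tab:red")
flipped.invert_yaxis()  # larger values are now drawn lower, so a rise looks like a drop
flipped.set_title("Inverted Y axis")

for ax in (honest, flipped):
    ax.set_xlabel("Year")
    ax.set_ylabel("Count")

fig.tight_layout()
```

Call `fig.savefig(...)` to write the comparison to a file. The two panels hold identical data. Only the axis direction differs.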

This strange chart shows how proportion can be manipulated, to hilarious effect:

bad chart 4

Comparing the incomparable is yet another shady trick. There is a wonderful website all about ‘spurious correlations’ displaying ‘facts’ correlating things like the divorce rate in Maine and the consumption of margarine. A Reddit group also collects the ugly uses of data.

It’s important to understand how easily the eye can be fooled by deceptive charts. Even if the data scientist’s intention is good, the choice of a bad type of chart, such as a pie chart showing too many categories, can be deceptive.

6.4.3. Color

You saw in the ‘Florida gun violence’ chart above how color can provide an additional layer of meaning to charts, especially ones not designed using libraries such as Matplotlib and Seaborn, which come with various vetted color libraries and palettes. If you are making a chart by hand, do a little study of color theory.

Be aware, when designing charts, that accessibility is an important aspect of visualization. Some of your users might be color blind - does your chart display well for users with visual impairments?
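One possible starting point, sketched below, is Matplotlib's built-in `tableau-colorblind10` style sheet. The plotted series are made up, and varying the line style as well is a design choice of this sketch so that color is not the only cue:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Apply the style only inside this block, leaving global settings untouched.
with plt.style.context("tableau-colorblind10"):
    palette = plt.rcParams["axes.prop_cycle"].by_key()["color"]
    fig, ax = plt.subplots()
    # Made-up series; each one gets its own line style as a second visual cue.
    for i, style in enumerate(["-", "--", ":"]):
        ax.plot([0, 1, 2], [i, i + 1, i + 2], linestyle=style, label=f"series {i}")
    ax.legend()

print(palette)
```

A palette alone does not guarantee accessibility, so it is still worth checking the finished chart with a color-blindness simulator.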

Be careful when choosing colors for your chart, as color can convey meaning you might not intend. The ‘pink ladies’ in the ‘height’ chart above convey a distinctly ‘feminine’ ascribed meaning that adds to the bizarreness of the chart itself.

Color meanings can differ in different parts of the world and tend to change with shade. Generally speaking, color meanings include:

| Color | Meaning |
| --- | --- |
| red | power |
| blue | trust, loyalty |
| yellow | happiness, caution |
| green | ecology, luck, envy |
| purple | happiness |
| orange | vibrance |

If you are tasked with building a chart with custom colors, ensure that your charts are accessible and that the colors you choose match the meaning you are trying to convey.

6.4.4. Styling your charts for readability

Charts are not meaningful if they are not readable! Take a moment to consider styling the width and height of your chart to scale well with your data. If one variable with many values (such as all 50 states) needs to be displayed, show the values vertically on the Y axis if possible, so as to avoid a horizontally-scrolling chart.

Label your axes, provide a legend if necessary, and offer tooltips for better comprehension of data.
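The advice on sizing, axis labels, and legends could be sketched in Matplotlib as follows. The category names and values are hypothetical stand-ins for something like 50 states:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical category names and values.
names = [f"Region {i:02d}" for i in range(1, 51)]
values = [(i * 37) % 100 + 5 for i in range(1, 51)]

# A tall, narrow figure gives each of the 50 bars its own readable row.
fig, ax = plt.subplots(figsize=(6, 12))
ax.barh(names, values, label="Made-up metric")  # horizontal bars: categories on Y
ax.set_xlabel("Value")
ax.set_ylabel("Region")
ax.set_title("Many categories on the Y axis")
ax.legend(loc="lower right")
fig.tight_layout()
```

Static Matplotlib figures have no built-in tooltips. For hover text, an interactive charting library is usually a better fit.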

If your data is textual and verbose on the X axis, you can angle the text for better readability.

Matplotlib offers 3D plotting, if your data supports it. Sophisticated data visualizations can be produced using mpl_toolkits.mplot3d.

3d plots
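A small sketch of both ideas, angled tick labels and a 3D plot, using made-up values. The explicit `Axes3D` import is only needed on older Matplotlib versions, and it is kept here as an assumption about the reader's setup:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401  (registers the "3d" projection)

fig = plt.figure(figsize=(10, 4))

# Left: long, made-up category labels angled so they do not overlap.
ax2d = fig.add_subplot(1, 2, 1)
labels = ["Very long category A", "Very long category B", "Very long category C"]
ax2d.bar(labels, [3, 7, 5])
ax2d.tick_params(axis="x", labelrotation=45)

# Right: a 3D scatter plot of made-up points.
ax3d = fig.add_subplot(1, 2, 2, projection="3d")
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 7, 11]
zs = [1, 4, 9, 16, 25]
ax3d.scatter(xs, ys, zs)
ax3d.set_xlabel("x")
ax3d.set_ylabel("y")
ax3d.set_zlabel("z")
```

A third dimension can make values harder to read off precisely, so it is worth checking whether a 2D chart would answer the question just as well.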

6.4.5. Animation and 3D chart display

Some of the best data visualizations today are animated. Shirley Wu has amazing ones done with D3, such as ‘film flowers’, where each flower is a visualization of a movie. Another example for the Guardian is ‘bussed out’, an interactive experience combining visualizations with Greensock and D3 plus a scrollytelling article format to show how NYC handles its homeless problem by bussing people out of the city.

busing

Bussed out: How America moves thousands of homeless people around the country. (n.d.). The Guardian. Retrieved 5 November 2022, from http://www.theguardian.com/us-news/ng-interactive/2017/dec/20/bussed-out-america-moves-homeless-people-country-study

While this section cannot go into enough depth to teach these powerful visualization libraries, try your hand at D3 in a Vue.js app using a library to display a visualization of the book “Dangerous Liaisons” as an animated social network.

“Les Liaisons Dangereuses” is an epistolary novel, or a novel presented as a series of letters. Written in 1782 by Choderlos de Laclos, it tells the story of the vicious, morally-bankrupt social maneuvers of two dueling protagonists of the French aristocracy in the late 18th century, the Vicomte de Valmont and the Marquise de Merteuil. Both meet their demise in the end but not without inflicting a great deal of social damage. The novel unfolds as a series of letters written to various people in their circles, plotting for revenge or simply to make trouble. Create a visualization of these letters to discover the major kingpins of the narrative, visually.

You will complete a web app that will display an animated view of this social network. It uses a library that was built to create a visual of a network using Vue.js and D3. When the app is running, you can pull the nodes around on the screen to shuffle the data around.

liaisons

6.4.6. Project: Build a chart to show a network using D3.js

This section includes a solution folder where you can find the completed project, for your reference.

  1. Follow the instructions in the README.md file in the starter folder’s root. Make sure you have NPM and Node.js running on your machine before installing your project’s dependencies.

  2. Open the starter/src folder. You’ll discover an assets folder where you can find a .json file with all the letters from the novel, numbered, with a ‘to’ and ‘from’ annotation.

  3. Complete the code in components/Nodes.vue to enable the visualization. Look for the method called createLinks() and add the following nested loop.

Loop through the .json object to capture the ‘to’ and ‘from’ data for the letters and build up the links object so that the visualization library can consume it:

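    // Loop through the letters and record who wrote to whom.
    // f and t hold the indices (into characters) of the sender and recipient.
    let f = 0;
    let t = 0;
    for (let i = 0; i < letters.length; i++) {
      for (let j = 0; j < characters.length; j++) {
        if (characters[j] == letters[i].from) {
          f = j;
        }
        if (characters[j] == letters[i].to) {
          t = j;
        }
      }
      // Each link connects a source node id (sid) to a target node id (tid).
      this.links.push({ sid: f, tid: t });
    }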

Run your app from the terminal (npm run serve) and enjoy the visualization!
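If you want to sanity-check the link-building logic outside the browser, the same mapping can be sketched in Python. The function name `create_links` and the two sample letters are hypothetical. The only assumption carried over from the project is that each letter has ‘from’ and ‘to’ fields:

```python
def create_links(letters, characters):
    """Return one {"sid": ..., "tid": ...} link per letter, using character indices."""
    # Map each character name to its position in the characters list.
    index_of = {name: i for i, name in enumerate(characters)}
    links = []
    for letter in letters:
        links.append({"sid": index_of[letter["from"]], "tid": index_of[letter["to"]]})
    return links


# Hypothetical sample input, not the project's real data file.
characters = ["Valmont", "Merteuil"]
letters = [
    {"from": "Valmont", "to": "Merteuil"},
    {"from": "Merteuil", "to": "Valmont"},
]
print(create_links(letters, characters))
```

Unlike the nested loop in the web app, this version raises a `KeyError` when a letter names a character that is missing from the list, which makes data problems easier to spot.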

6.4.7. Self study

Here are some articles to read about deceptive data visualization:

Take a look at these interesting visualizations for historical assets and artifacts:

Look through this article on how animation can enhance your visualizations:

6.4.8. Your turn! 🚀

Take a tour of the internet to discover deceptive visualizations. How does the author fool the user, and is it intentional? Try correcting the visualizations to show how they should look.

Assignment -

6.4.9. Acknowledgments

Thanks to Microsoft for creating the open-source course Data Science for Beginners. It inspires the majority of the content in this chapter.