Selecting the right chart type when telling your story

One of the resources from Tableau that we use in the seminar on telling your data story is this whitepaper titled Visual Analysis Best Practices.  The first section shows different chart types and, helpfully, matches each one to the story you are trying to tell.  One issue it points out is that although pie charts are common, they are generally not a good choice.

Another chart that gets used (or misused) a lot is the bar chart.  Bar charts are excellent for comparing counts, but too often they are used to show averages, which hides the interesting story about the distribution behind each average.  An excellent read on this topic is this blog post by Martin Fowler titled Don’t Compare Averages.  Although Martin Fowler is best known for his work in software development, particularly agile development, this article provides an excellent example of why not to use a bar chart for visualizing averages.
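To make the point concrete, here is a minimal Python sketch.  The two groups below are made-up numbers (not data from Fowler's post or the Tableau whitepaper), chosen so that both have the same average while their spreads differ widely.

```python
import statistics

# Hypothetical data: two groups with the same average but different spreads.
group_a = [48, 49, 50, 50, 51, 52]
group_b = [10, 20, 50, 50, 80, 90]


def summarize(values):
    """Return the mean, sample standard deviation, minimum, and maximum."""
    return {
        "mean": statistics.mean(values),
        "stdev": round(statistics.stdev(values), 2),
        "min": min(values),
        "max": max(values),
    }


summary_a = summarize(group_a)
summary_b = summarize(group_b)

# A bar chart of the means would draw two bars of equal height,
# even though group_b is far more spread out than group_a.
print("A:", summary_a)
print("B:", summary_b)
```

Both groups average 50, so a bar chart of the averages would show two identical bars.  The standard deviation, minimum, and maximum are what reveal that the groups are actually very different.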

Fowler’s article recommends showing the distribution of the data, and Tableau’s whitepaper covers two chart types for doing that: histograms and box-and-whisker plots.  Storytelling with Data has two blog posts that are helpful in understanding how these are used: Differences Between Histograms and Bar Charts and What is a Boxplot?.
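If you are curious about the numbers behind these two charts, the following is a rough Python sketch using only the standard library.  The sample values, the function names, and the bin width of 10 are all assumptions made for illustration.  Charting tools differ in how they compute quartiles and draw whiskers, so treat this as one reasonable convention rather than what Tableau does internally.

```python
import statistics
from collections import Counter

# Hypothetical sample values, made up for illustration.
values = [12, 15, 17, 21, 22, 22, 25, 28, 31, 34, 38, 55]


def five_number_summary(data):
    """Minimum, first quartile, median, third quartile, maximum.

    These are the values a box plot is built from. The "inclusive"
    quartile method is one of several conventions in use.
    """
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


def histogram_counts(data, bin_width):
    """Count how many values fall into each bin, keyed by the bin's start."""
    counts = Counter((v // bin_width) * bin_width for v in data)
    return dict(sorted(counts.items()))


summary = five_number_summary(values)
bins = histogram_counts(values, 10)

print("min, Q1, median, Q3, max:", summary)

# A crude text histogram: one '#' per value in each bin.
for start, count in bins.items():
    print(f"{start:>2}-{start + 9:<2} {'#' * count}")
```

The histogram groups a single numeric variable into ranges and counts the values in each, which is what separates it from a bar chart of categories.  The five-number summary compresses the same distribution into the handful of values a box plot draws.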

Looking for a New Challenge? Consider Tableau’s Student Iron Viz Competition!

The Fall 2021 semester is winding down, and possibly you are looking for a new challenge.  If so, consider creating a visualization to enter in the Tableau Iron Viz (Student Edition):

https://www.tableau.com/academic/student-Iron-Viz

The deadline for entry is December 31st, but if you took the Data Science for All seminar on visualizing your data story using Tableau, you have already completed the first two steps to competing!

The prizes are mainly swag, bragging rights (which look good on a resume or LinkedIn), some training, and a chance to network.  Even if you don’t win, just competing helps build your portfolio of projects and shows recruiters you are interested in the field beyond the classroom.

This competition is only open to students, and the above linked competition website lists the following four steps:

  • Join Tableau for Students
  • Create a Tableau Public Profile
  • Get the Data
  • Submit your Viz

If you participated in the seminar, you have already completed steps 1 and 2!

The competition website even has a 1-hour video from a webinar with last year’s winner and runner up.

Are you into Formula 1? Take a look at their data

Auto racing has embraced Big Data and data science.  This web page at AWS discusses their partnership with Formula 1, in which each race car generates over 1.1 million data points per second that are transmitted from the car to the pits.  If you want to play around with some of the data, there is a Python module named Fast F1 that provides access to the data and has examples of some of the analysis you can do with it.  Another example using this data is covered in this Medium post.

A different set of historical data on Formula 1 is also available through Kaggle.

 

What is Data Science?

If you Google for a definition, you will no doubt find Venn diagrams (those figures you have seen with overlapping circles showing the relationships between things).  If you search for “data science Venn diagram”, you will find some where folks went wild and drew a dozen overlapping circles, but most have three (maybe because people who write definitions, such as academics, always seem to want to describe things as three-legged stools).

Recently Datanami had an interview with Jeffrey Ullman from Stanford, who is a big name in computer science (particularly databases) and won the 2020 Turing Award (think of it as the Nobel Prize of computer science).  The article is short but interesting, and he points out that everyone has their own diagram that emphasizes their own domain!

You can find the article at this link.

Google Cloud Next ’21 is coming up next week, virtual, and free

Google has a cloud computing event on October 12-14th with sessions on AI & ML (which may be of interest if you took the machine learning seminar), data analytics (which may be of interest if you enjoyed any of the Data Science for All seminars), and other topics ranging from Databases to Diversity, Equity & Inclusion.  That last topic may be of particular interest if you are participating in the DEI Summit that the MISA group here at SJSU is hosting today!

I noticed the conference because there is going to be a Kaggle session titled “The State of Data Science and Machine Learning 2021” on October 14th at 3:30 PST that will include Julia Elliott (Kaggle’s interim CEO), Kaggle grandmasters, and others.  If you participated in the free Kaggle machine learning course we posted about this past summer, this may be of interest to you.

To register for the event, go to: https://cloud.withgoogle.com/next