Introduction to Tableau and Power BI for Data Scientists

Eric Gustavo Romano
3 min read · Aug 29, 2021

Throughout my journey as a data scientist, I learned quickly that the role is not only about running the latest trending algorithm but also about understanding how to apply these techniques to the business you work in. One of the best ways to communicate models and insights is through powerful visualizations. This blog post introduces two useful tools every data scientist should explore and add to their analytical toolbox. It covers the differences between Tableau and Power BI, followed by a use case in Tableau.

What is Tableau?

Tableau is a powerful data visualization tool that is extremely popular in the business intelligence industry. With the growth and adoption of data science applications, many businesses consider it vital for data science work. Its compatibility with multiple data sources, combined with its ease of use, makes it an excellent choice for data scientists.

What is Power BI?

Power BI is also a powerful business intelligence and visualization tool, offered by Microsoft. It can connect to various data sources to create interactive dashboards and BI reports, and it comes with multiple software connectors and services.

Here are the major differences between Tableau and Power BI

In the rest of this blog post, I will apply a machine learning technique known as sentiment analysis using Tableau.

First things first: we need to connect Tableau with Python so we can harness its libraries for statistical analysis, predictive modeling, or machine learning. TabPy was developed for exactly this task; see this article for help getting the server running.

In this blog post I will apply sentiment analysis to news headlines found on coinmarketcap.com. I created a closing price prediction model that uses the sentiment of news headlines as a feature. You can find the project and the dataset used in this blog post at the following link.

After importing the dataset and connecting to the server, we can finally start using TabPy. To run our Python code, we open a standard Tableau calculated field and add some syntax to connect the two.

Here is the code that I used:

There are 4 different script functions, one for each data type:

  • SCRIPT_INT — for integers / whole numbers
  • SCRIPT_REAL — for real numbers / decimal numbers
  • SCRIPT_STR — for strings / text
  • SCRIPT_BOOL — Boolean / true or false

These are specific to the data type you expect your function to return. Since our sentiment score ranges from -1.0 to 1.0, we will use SCRIPT_REAL to return a decimal number.
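To make the return type concrete, here is a minimal sketch of a scoring function with the shape SCRIPT_REAL expects: a list of headlines in, one float between -1.0 and 1.0 per headline out. This is not the model used in the project; the word lists and the function name `headline_sentiment` are made up for illustration, and a real setup would more likely call a sentiment library.

```python
import re

# Hypothetical mini-lexicon, for illustration only (not from the project).
POSITIVE = {"surge", "gain", "rally", "bullish", "record"}
NEGATIVE = {"crash", "loss", "drop", "bearish", "hack"}


def headline_sentiment(headlines):
    """Return one score in [-1.0, 1.0] per headline.

    The score is (positive hits - negative hits) / total hits,
    or 0.0 when no lexicon word appears in the headline.
    """
    scores = []
    for text in headlines:
        words = re.findall(r"[a-z]+", text.lower())
        pos = sum(w in POSITIVE for w in words)
        neg = sum(w in NEGATIVE for w in words)
        total = pos + neg
        scores.append((pos - neg) / total if total else 0.0)
    return scores


sample = [
    "Bitcoin prices surge to record high",
    "Exchange hack triggers crash",
    "Markets open",
]
print(headline_sentiment(sample))
```

Because every value is a decimal number, and there is one value per input row, the result fits SCRIPT_REAL rather than SCRIPT_INT or SCRIPT_STR.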

Two further changes are required for our Python code to work in Tableau:

  1. The key to connecting Python to Tableau is using arguments and Tableau fields. In Python these are specified as _arg1, _arg2, _arg3, and so on. Arguments are numbered in the order the Tableau fields are listed after the script.
  2. We need to specify a “return” statement to get the data we want back from the function.
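The two points above can be tried out locally without Tableau. The sketch below is a simplified stand-in for what happens to a script body, not TabPy's actual implementation: the helper `run_like_tabpy` is hypothetical, and it only assumes that each field arrives as a list bound to `_arg1`, `_arg2`, and so on, and that the body ends with `return`.

```python
import textwrap


def run_like_tabpy(script_body, *fields):
    """Hypothetical helper: run a script body the way a SCRIPT_* call would.

    Each field is passed as a list and bound to _arg1, _arg2, ... in the
    order given. The body is wrapped in a function so `return` is valid.
    """
    arg_names = [f"_arg{i}" for i in range(1, len(fields) + 1)]
    body = textwrap.indent(textwrap.dedent(script_body), "    ")
    source = "def _user_script({}):\n{}".format(", ".join(arg_names), body)
    namespace = {}
    exec(source, namespace)
    return namespace["_user_script"](*fields)


# Script body as it might be pasted into a calculated field.
script = """
lengths = [len(headline) for headline in _arg1]
return [n * weight for n, weight in zip(lengths, _arg2)]
"""

# First field becomes _arg1, second field becomes _arg2.
result = run_like_tabpy(script, ["Bitcoin up", "ETH dips"], [1.0, 0.5])
print(result)
```

Swapping the two lists in the call changes which one is bound to `_arg1`, which is why the order of the fields in the calculated field matters. Without the final `return` line the call yields `None`, so nothing would come back to Tableau.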

You can now start using this code when creating visuals. Here is a visualization created for your reference.

Eric Gustavo Romano

Hi! My name is Eric Gustavo Romano. I am a data science enthusiast and practitioner located in Jersey.