Saturday 18 September 2021

Using Python with Tableau Prep Builder

Connecting to TabPy from Prep Builder

ü  Open Tableau Prep Builder

ü  Open the Superstore Sample flow

ü  Click the plus icon to the right of the "Clean 2" step

Select Add Script


ü  In the Settings panel at the left edge of the Prep builder UI, Select "Tableau Python (TabPy) Server" as the Connection type

    Click "Connect to Tableau Python (TabPy) Server"



ü  Enter "localhost" as the server and port 9004

Write the Python script

ü  Open Visual Studio Code

Exercise scenario: Sales management would like to gain insight on the revenue effect if discounting. To support this, we will use Python to add a field to our flow called "non-discounted sales".

In this context, our Python script must be written with a function that can be called from Prep Builder. We choose the name of this function. It will accept a structure called a dataframe, and will return a dataframe.

ü  Create a new file called nds.ps (nds = non-discounted sales).

First we'll simply print the dataframe to get a look at what our function will receive from Prep Builder.

ü  Add the following Python code to nds.py:

def nds(df):

  print(df)

  return(df)

ü  Save the file to your Documents directory. (The location does not matter as long as you can find it later.)

ü  In the Prep Builder configuration panel for the script added above, browse to the nds.py file to connect this script step to your new Python script.

ü  Look at the command prompt running TabPy. You should see the results of the print statement.



ü  Let's add our calculation. Replace the print(df) statement with the following:

  df['Non Discounted Sales'] = df['Sales'] / (1-df['Discount'])

This line creates a new column in the dataframe called "Non Discounted Sales".

ü  Be sure the two leading spaces are included. The Python specification requires the contents of a function to be indented.

If the schema of the returned dataframe does not exactly match that of the calling dataframe, we need to include an additional function called "get_output_schema".

In this case we will return our new "Non Discounted Sales" column along with the "Row ID" column so that we can join this column back into the flow.

ü  Add the following to nds.ps:

def get_output_schema():

return pd.DataFrame({

  'Row ID': prep_int(),

  'Non Discounted Sales': prep_decimal()

})

ü  Save the file.

ü  In Prep Builder, in the flow, click off of the script step icon then back on it. This will cause Prep Builder to re-execute the Python script.

You should see the result of the script in Prep Builder.



Join the result of the script to the flow before the "Create Superstore Sales.tde" output step by dragging the “Clean 2” step icon over the “Non Discounted Sales” script step icon, and dropping it on the “Join” area that appears


Thanks Guru's,
Prazwal Ks

Wednesday 15 September 2021

Running Python Scripts from Tableau Desktop

 

Tableau can pass code to TabPy through one of four different functions: SCRIPT_INT, SCRIPT_REAL, SCRIPT_STR and SCRIPT_BOOL. Each of these accommodates a different return type.


Exercise:-

In Tableau Desktop, Connect to the SuperStore sample dataset


ü  Create a calculated field by selecting the Analysis menu / Create a Calculated Field…

ü  Name this new calculation as  "City State" and with below Formula


SCRIPT_STR("

print(_arg1)

print(_arg2)

 

lst=[]

for i in range(0,len(_arg1)):

  lst.append(_arg1[i]+', '+_arg2[i])

return lst

",

ATTR([City]),ATTR([State])

)



ü  Create a New worksheet and Drag:

 year of Order Date to columns.

Region to Rows,

Sales to rows.


Drag the calculated field "City State" to color.




Check the result of the "City State" calculation in the tooltip.



Thanks Guru's,
Prazwal Ks

Friday 10 September 2021

Python Integration with Tableau using TabPY

 

Installation and TabPy Server Startup

Pre-requisite:-

Python Installation

Install Python 3/higher version  from https://anaconda.com/distribution

Anaconda is a convenient way to install Python plus Jupyter notebooks and all the tools needed.

Installing TabPy

ü  Start an Anaconda command prompt.



ü  At the Anaconda command prompt, type: “pip install tabpy”

 

ü  At the Anaconda command prompt, type “pip install tabpy_client” 

ü  At the command prompt, start TabPy by typing: “tabpy”


Be sure to note on which port tabpy is running. This will be the last output line from startup.cmd.9004 is the default.

Tableau Desktop:-

ü  Open Tableau Desktop

Connecting Tableau Desktop to TabPy

ü  To connect Tableau Desktop to your TabPy server: Select the Help menu, then select Manage External Service Connection 



ü  Enter the server name or ip address of the machine on which TabPy is running. In this case, "localhost" since it's running on the workstation that you are using. Enter the port number that you observed when you started TabPy (9004).

ü  We have not configured any security on our TabPy server, so leave the "sign in" and "Require SSL" boxes unchecked.



Thanks Guru's,
Prazwal Ks


Wednesday 8 September 2021

Explain Data in Tableau

 

Explain Data:-

Explain Data automatically provides AI-driven explanations for the value of a data point with a single click. 

Based on advanced statistical models, explanations are relevant and integrated, saving analysts’ time and enabling them to uncover insights they may not have found otherwise.

 Explain Data accelerates your exploration. 

 With a single click, Explain Data uses powerful Bayesian methods to surface   statistically significant explanations behind data points.

 Explain Data provides focused, interactive explanations for the specific data   points you are interested in. 

 Because every explanation is a fully-featured Tableau viz, Explain Data   encourages you to stay in your analytical flow as you engage more deeply   with your data.

 Reduce the risk of error.

 Explain Data helps reduce the risk of error from dirty data and human bias by   searching for explanations in   the entire data source, beyond what is shown   in the viz. 

 It's important to understand that explanations are based on models of the   data, and are not causal   explanations.

 While Explain Data can help uncover and describe relationships in your data,   it can't tell you what is   causing  those relationships.


How To Use Explain Data:-

This is a brief overview of how to use Explain Data. 

Build a visualization.

Make sure it uses a measure that is aggregated with SUM, AVG, COUNT, or COUNTD.

 

Select a mark.

Select a mark of interest, and then click on the light bulb icon in the tooltip, or right-click on the mark and select Explain Data. 

 


Explore the explanations.

If multiple explanations are available, click each explanation tab to see the related details.

 


 

If there are multiple measures in the view, click each measure tab for more explanations.

 


Open the explanation in a new sheet. 

Click the open icon in the top right corner of an explanation viz to open the visualization as a new worksheet and explore the data further.



 Thanks Guru's,

 Prazwal


 

Monday 30 August 2021

Customer Retention Analysis

Challenge:- 
Whether it's customers or other items you track over time, it is useful to understand how the  purchasing patterns/behaviours changes year over year.
Calculate following four key metrics about our Superstore customers.

1) Retained: Bought in the current year and in the previous year. (Bought in 2016 and 2015 would count as "retained" in 2016)

2) Churn:  Didn't buy in the current year, bought in the previous year. (Bought in 2014 but not in 2015, would count as "Churned" in 2015)

3) Resurrected: Bought in the current year, bought in a past year but not the previous year. (Bought in 2015, not in 2014, but in 2012. Would count as "Resurrected" in 2015)

4) Acquisition: Bought in the current year, haven't bought in any previous years. (Bought for the first time in 2014, counts as "Acquired" in 2014

Data Set: 
Superstore



DAX Time Intelligence Functions

   Thanks Guru's, Prazval.Ks