TABLEAU FOR BUSINESS USERS: A hands-on approach

Data-Driven decision making is no longer a "nice to have" in today's context but an absolute must. Unfort


English Pages 134 Year 2021


Table of contents :
Title Page
Table of Contents
1 Introduction
1.1 Why visualize data ?
1.2 Who is this book for ?
1.3 How is this book different ?
1.4 How to contact us
1.5 Acknowledgements
2 Installation & Setup
2.1 Installation of Tableau
2.2 Data sources required for the exercises in the book
3 Fundamentals of Data
3.1 Data types
3.2 Data sources
3.3 Data preparation
3.4 Converting "Business questions" to the language of data
4 The Crux of Tableau
4.1 The 4 building pillars
4.1.1 Dimensions, Measures & Aggregations
4.1.2 Viz Pane - columns & rows shelf
4.1.3 Marks card
4.1.3.1 Color block
4.1.3.2 Size block
4.1.3.3 Label block
4.1.3.4 Detail block
4.1.3.5 Tooltip block
4.1.4 Filters
4.2 Putting it all together
4.3 Show Me
4.4 Sheets & Dashboards
5 Calculations
5.1 Grouping values
5.2 Calculated Fields
5.3 Row level, Aggregation & Dis-aggregation
5.4 Bringing in more data
5.4.1 From Excel/ CSV
5.4.2 From MySQL
5.5 Importance of cardinality - A practical example
5.6 Data Modeling
6 Tables & Table calculations
6.1 Show me or start from scratch ?
6.2 Table totals
6.3 Table calculations
6.3.1 Table & Pane - Down & Across
6.3.2 Down then Across & Across then Down
6.3.3 Shortcut to reading Table calculations in English
6.3.4 Formulation of Table calculations
6.3.5 Comparisons - YoY, WoW, MoM
6.4 Sorting
6.4.1 Nested Sort
6.4.2 Rank Sort
6.4.3 Sorting in Blended data
7 Advanced Tips
7.1 Dynamic Inputs - Parameters
7.2 Top 10/20/50 filters
7.3 Dual Axis
7.4 Shapes & Icons
7.5 Level of Detail (LOD) calculations
7.5.1 Fixed LOD
7.5.2 Include LOD
7.5.3 Exclude LOD
7.6 Reference Lines & Forecasts
7.6.1 Reference Lines using Parameters
7.6.2 Reference Lines using secondary data
7.6.3 Forecast & Trend lines
7.7 Order of operations
8 Dashboards
8.1 Less is more
8.2 Dashboards: A view from 10000ft
8.3 Fit & Layout
8.4 Filters & Interactions
8.4.1 Customizing filters
8.4.2 Discrete vs continuous filters
8.4.3 Filter domain
9 Useful links

Tableau Public & Tableau Desktop

Most of the examples illustrated here can be followed along with Tableau Public. Cases requiring Tableau Desktop are highlighted.

1.1 Why visualize data ?

Over the past few decades, Excel has become the de facto data analytics tool for most business users. When you need to sum two values, nothing could be simpler: click on the first value, type a "+" sign, and click on the next value. Voilà, you’ve got yourself the total of two values. Drag the formula down by its corner and you’ve got yourself the sum of 2 columns. Unfortunately, this flexibility comes at a cost. The user gradually gets trapped in the world of quick fixes and patched formulae that Excel has to offer. Lotus 1-2-3, the predecessor of Excel, was initially conceived primarily as a data entry tool, and indeed Excel "excels" at this task. But in this new era of big data, data visualization and data analytics deserve their own toolkit.

The human brain does a very poor job of deciphering meaningful trends from a table of raw numbers, but it excels at comparing, extrapolating and spotting trends in visual shapes and colors. It turns out that the brain is able to take in a picture and process it in one stroke, whereas it processes text in a linear fashion. Imagine a bar chart which condenses 100 rows of data into a few columns, against reading the rows one by one. You start to get the picture. Having said that, it’s the responsibility of the analyst to distill and convey the meaning hidden behind the numbers through effective and meaningful visualizations.

If you take a close look at the following table of raw numbers, you’ll be able to make a few observations:

  • There are four datasets
  • Each dataset has an x and a y column
  • Numbers seem to range from 4 to 13
  • There are at most 2 decimals

Figure 1: Anscombe’s dataset

And we start squinting to squeeze more information out of this table. The more astute among you might have copy-pasted these numbers into a good ol’ Excel sheet and grabbed your "I’m a data scientist" coffee mug. You start making a list:

  • Average of x: 9
  • Sample variance of x: 11
  • Average of y: 7.5
  • Sample variance of y: 4.125
  • Correlation between x and y: 0.816
  • A nice linear regression line: y = 3 + 0.5x
  • R² of the linear regression line: 0.67

That’s a lot of numbers, and the strange thing you notice is that this list is the same across all 4 datasets. Based on these summary statistics, it is tempting to conclude that the distributions of x and y are similar in every dataset. A quick visualization instantly reveals the hidden gems in the distributions. This example also underlines the importance of exploratory data analysis before drawing any inferences and conclusions.
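The summary statistics listed above can be checked outside Excel as well. The following is a minimal Python sketch, not taken from the book: it assumes the standard published values of Anscombe's quartet (the book's own figure is not reproduced here), and the helper name `summarize` is purely illustrative.

```python
from math import sqrt

# Anscombe's quartet, standard published values (assumed to match Figure 1).
# Datasets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return the summary statistics from the list above for one dataset."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-deviations around the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx  # least-squares slope of y on x
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),  # sample variance
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "corr": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
        "r_squared": sxy ** 2 / (sxx * syy),
    }


summaries = {name: summarize(x, y) for name, (x, y) in quartet.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four printed rows agree to roughly two decimals, which is exactly why the numbers alone cannot tell the datasets apart and a chart is needed.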

Figure 2: Anscombe’s dataset in Tableau

Having said that, I’m definitely not discouraging the use of Excel in any way. It’s a powerful tool in the repertoire of any competent professional. The main point I want to drive home is that Excel often needs to be complemented by a data visualization tool to help you communicate and share your findings effectively. Microsoft Power BI is just as good a tool as Tableau, but this book being about Tableau, I’ll confine myself to illustrations with Tableau.

1.2 Who is this book for ?

The primary intended audience of this book is Business Analysts, Data Analysts and Financial Analysts or, more broadly, anyone who is hitting the limits of Excel with their data analytics needs. If your day-to-day revolves around staring at numbers all day long, then you’re definitely part of the target audience. There are no prerequisites for following the concepts in this book. We will work our way gradually from the very fundamentals of data all the way up to building fancy dashboards & visualizations on gigabytes of data.

1.3 How is this book different ?

There are many books on the market which are excellent Tableau user guides and reference manuals. They do an excellent job of presenting every menu tab, button, pane and shelf in Tableau. If you’re the kind of person who needs to know every single button and functionality tucked into Tableau, then this might not be the right book for you. When you start to learn a new language and want to go about it in a systematic and methodical way, you start with the grammar. Understanding the foundational underpinnings of the language helps you get the basics right, and then it’s a matter of stringing words together to make sentences. Lining up words within the rules defined by the grammar (or not) in infinite possible ways, to write Shakespearean poetry or tabloid articles or have conversations, is a logical next step. This book intends to approach the subject of mastering Tableau in a similar fashion. We’ll try to distill the very core essence of Tableau into a few concepts, and then it’s just a matter of combining them in infinite possible ways to build the required data visualizations.

Icons used in this book

  • Tips & shortcuts worth keeping in mind.
  • Traps to watch out for in Tableau, which could help you avoid potential headaches down the road.
  • Technical details which are not necessary to follow along with the chapters in this book and can be comfortably skipped or glossed over.

1.4 How to contact us

Please address comments and questions concerning this book to: [email protected]. Please feel free to reach out if you need any help on your data analytics projects at: [email protected]. You can also find me on LinkedIn.

1.5 Acknowledgements

As with all dedications, I would like to thank my parents, who enabled me to write this book in the first place, and my wife, who supports me in all my endeavors. I would also like to make a special dedication to my kids Nikie and Brooklyn, who show me the joy of life every day.

2.1 Installation of Tableau

We’ll quickly gloss over the installation of Tableau Public, which you will need to follow along with this book. Unfortunately for the corner cases, business users working on Macs and Linux, I’ll have to redirect your questions and concerns to the almighty Google. Head over to Tableau Public’s site. Hand over your email, as usual, in return for the Tableau Public installation file (.exe).

Figure 3: Tableau public download page

Double click on the .exe and follow the instructions to complete the installation of Tableau Public.

Figure 4: Tableau public download page

It’s time to take the cover off and take Tableau for a quick spin. As soon as you open up Tableau, click on Microsoft Excel under Connect, as shown in the figure below.

Figure 5: Tableau - Connect to Excel

In the next screen, drag and drop the Excel tab from which you’d like to import the data, as shown in the figure below. Make sure that the data looks right in the preview pane before clicking on Sheet 1.

Figure 6: Tableau - Import Data from Tab

2.2 Data sources required for the exercises in the book

The various Tableau workbooks, Excel and CSV files required for you to follow along are available at this book’s GitHub source page. You can also get the workbooks directly from the Tableau Public server, where they are available under my profile.

3.1 Data types

Now that we’re done with the formalities of setting up Tableau, let’s step into the shallow end of the pool with some basic data types. Essentially, we can broadly categorize any piece of data into the following 3 types.

Absolutely nothing mind-blowing here so far. To give you a few business examples (in keeping with the book’s audience):

  • String / textual data types could include product names, categories, country names etc. (identified by "Abc" next to column names in Tableau)
  • Numeric data types could include profit, sales, sale price etc. (identified by the icon "#")
  • Date types could include invoice date, shipping date, return time etc. (identified by the calendar icon)

You have the flexibility to change the data type if Tableau didn’t interpret it correctly. Say, for example, that the column Row ID, which is interpreted as a number, needs to be changed into a string: click on the "123" number icon next to the column name and switch it to the String data type.

Figure 7: Variable types in Tableau

You should also have noticed that the numeric columns are arranged at the bottom of the screen under the "Measures" pane, and the textual and date-time columns are arranged at the top under the "Dimensions" pane. This is based on how Tableau treats these data types, as either continuous or discrete quantities. In addition to classifying data types by their intrinsic value, we can classify them by whether or not there is continuity in the values. For example, product names are discrete values. In the superstore dataset, you’ll notice that there are 3 values in the category column (Furniture, Office Supplies and Technology). There is no sense of continuity between the three distinct values, and hence we call them discrete values.

Dimensions & Measures

You can quickly filter the columns by type using the following shortcuts:

  • D: Displays just the dimensions.
  • M: Displays just the measures.
  • C: Displays just the calculated fields that you have created in Tableau (covered in the section about calculations).

On the other hand, take the example of the order date column, which contains dates. When you want to understand the sales over the past 3 years of order dates, we consider the date column as a continuous range, as the days are in successive order and we want to see the evolution across time.

Figure 8: Discrete vs Continuous variables

Discrete vs Continuous quantities

Discrete columns are shown as blue pills and continuous quantities as green pills. This subtle difference often leads to "unexpected outcomes" in visualizations. You can switch the type of a variable from discrete to continuous or vice versa, and there are many ways of doing it. You could right click on the YEAR(Order Date) green pill, as shown in figure 8, and switch it to discrete. This will ensure that this value is treated as a discrete value in just this analysis.

3.2 Data sources

Just as you need fuel to get a car started, you need to import some data to start building your visualizations with Tableau. In keeping with the primary audience of this book, let’s walk through the steps with Excel and CSV files to begin with. In section 5.4, we will see how to bring in more than one data source, as you might have data coming in from multiple sources. For example, you might have to consolidate budget data from the finance teams, sales data from Salesforce and web performance from Google Analytics in your reports. Let’s build our way slowly to that, starting with a single data source in this section.

In the Global superstore.xlsx file (available here), there are 3 tabs: Orders, Returns and People. If you’ve been following along, we imported the Orders tab in section 2.1. I just want to draw your attention to a couple of points on this step. By default, when Tableau imports data, it will provide you with a preview of the first few rows in a tabular format, as you can see in figure 9. Sometimes you’d prefer to see the list of columns, make sure that they are imported properly and, more importantly, check that their data types are inferred properly. You can switch views by clicking on the "Manage metadata" icon in the box highlighted as 1. This will provide you with the metadata as shown in figure 10. You now have the possibility to give the column names an alias. Let’s say, for some reason, you want to change the column name from "Row ID" to "Sno Row"; you would then change the value under the column "Field Name". These aliases will replace your column names everywhere in your Tableau reports.

Figure 9: Excel data import

Figure 10: Excel metadata

The second thing which could be interesting for you at this step is the Data Source filter. Let’s say you’re responsible for French sales and you would like to import just the data where the column "country" is equal to France. You could add a global "Data source" filter, which will filter the data upstream before it gets imported into Tableau. By clicking on "Add" next to Filters, as highlighted in block 2 of figure 9, you’ll be able to go through the steps illustrated in figure 11.
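To make "filtering upstream" concrete, here is a small Python sketch of the idea only. It is not how Tableau implements data source filters; the rows and the helper `apply_source_filter` are hypothetical and exist purely for illustration.

```python
# Hypothetical order rows; the real Orders tab has many more columns.
orders = [
    {"Country": "France", "Sales": 120.0},
    {"Country": "Germany", "Sales": 80.0},
    {"Country": "France", "Sales": 45.5},
    {"Country": "Qatar", "Sales": 60.0},
]


def apply_source_filter(rows, column, allowed_values):
    """Keep only the rows whose `column` value is in `allowed_values`."""
    return [row for row in rows if row[column] in allowed_values]


# The filter runs once, before any analysis: every later calculation
# only ever sees the French rows.
french_orders = apply_source_filter(orders, "Country", {"France"})
total_french_sales = sum(row["Sales"] for row in french_orders)
print(len(french_orders), total_french_sales)
```

The point of the sketch is the ordering: because the rows are dropped before anything else happens, no sheet or dashboard built on top can accidentally include another country.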

Figure 11: Data source filters

Extract vs Live Data ?

When you work with SQL-like data sources, you’ve got 2 ways of working with data. You could work in "Live" mode, in which the data is queried in real time by Tableau, i.e., every time you build a visualization for your report. On the other hand, if you select the "Extract" mode, Tableau will pull down the data and store it locally on your computer in ".hyper" format. As a result, subsequent querying and processing will be much faster than in "Live" mode. If you’re working with voluminous data, it’s definitely recommended to put the data source in "Extract" mode to help keep things fluid while you slice and dice the data. We will get into more detail about this in section 7.7.

3.3 Data preparation

You have probably heard analysts who work with a lot of data gripe that 80% of their time is spent cleaning and preparing data. Unfortunately, it is no exaggeration; I would even push it up to 90-95%. There is a vast suite of tools which can help you get data into the right format for your analysis, such as Alteryx, Tableau Prep etc. Excel VBA can be very helpful at times when you want to perform certain manual actions on your Excel files. Data preparation is a topic that merits its own standalone book. But one thing that you need to keep in mind is that data analytics solutions work best on tabular data in which each row presents a unique combination of values.

Country    Segment      Profit
Algeria    Consumer     $9,000
Australia  Home Office  $5,000
Hungary    Corporate    $7,000
Sweden     Home Office  $9,000
Canada     Corporate    $10,000
Australia  Consumer     $5,000
Hungary    Consumer     $3,000
Canada     Home Office  $9,000
Sweden     Consumer     $5,000
Total                   $62,000

Table 1: Tabular Data

Table 1 above provides you with a classical tabular dataset of profits by country and segment. You will notice that Australia, for example, appears twice (one row where the segment = Home Office and one where the segment = Consumer). You will also observe that the sum of profits across all countries and segments is $62,000.

Table 2: Pivoted Data

The same values can be represented in a pivoted fashion, as illustrated in Table 2. The pivoted dataset provides the same values in a more condensed format, but unfortunately it does not work as well as the tabular dataset with any data analytics tool. You need to transpose the 3 segments which are in the column headers into a new column called segment to get back your initial tabular dataset. This operation goes by many names: Transpose, Stacking, Pivoting etc.
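The transpose operation described above can be sketched in a few lines of Python. This is an illustration rather than what Tableau does internally: the wide layout below is an assumed reconstruction of the pivoted table built from the Table 1 values, and the function name `unpivot` is hypothetical.

```python
SEGMENTS = ["Consumer", "Corporate", "Home Office"]

# Pivoted (wide) layout: one row per country, one column per segment.
# None marks a country/segment combination that has no row in Table 1.
wide = [
    {"Country": "Algeria", "Consumer": 9000, "Corporate": None, "Home Office": None},
    {"Country": "Australia", "Consumer": 5000, "Corporate": None, "Home Office": 5000},
    {"Country": "Hungary", "Consumer": 3000, "Corporate": 7000, "Home Office": None},
    {"Country": "Sweden", "Consumer": 5000, "Corporate": None, "Home Office": 9000},
    {"Country": "Canada", "Consumer": None, "Corporate": 10000, "Home Office": 9000},
]


def unpivot(rows, id_column, value_columns, name_column, value_name):
    """Turn column headers into values of a new column (wide to tabular)."""
    tabular = []
    for row in rows:
        for column in value_columns:
            if row[column] is not None:  # skip empty cells
                tabular.append({
                    id_column: row[id_column],
                    name_column: column,
                    value_name: row[column],
                })
    return tabular


tabular = unpivot(wide, "Country", SEGMENTS, "Segment", "Profit")
for record in tabular:
    print(record)
print("Total profit:", sum(record["Profit"] for record in tabular))
```

Each output record carries exactly one country, one segment and one profit value, which is the "unique combination of values per row" shape that analytics tools prefer.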

Table 3: Monthly sales data

Let’s say you’ve got a table of sales across the last 12 months for 5 countries in a pivoted data format. We can easily transform this data into a tabular version in Tableau. Import the Excel file (Ch 3 - Transpose tab in the Global Superstore dataset available on GitHub) into Tableau as usual, and then select the 12 months of data while keeping the Shift key pressed. Right click on the selected columns and click on "Pivot", as highlighted in figure 12. This will create your tabular dataset, on which we can easily start making MoM (Month over Month) and QoQ (Quarter over Quarter) comparisons, which we will get into in the following chapters.

Figure 12: Pivot data in Tableau

3.4 Converting "Business questions" to the language of data

"The wise man doesn’t give the right answers, he poses the right questions." (Claude Levi-Strauss)

When the questions are asked in the right format, the answers eventually reveal themselves. So far, we have managed to import data into Tableau, and the logical next step is to start analyzing this data. But before we dive in, let’s pause for a second and make sure that we have the right framework to ask the "right" business questions. The "right" applies to the validity of the syntax of the question, not the legitimacy of your business question. Let’s say that, in our dataset, you would like to visualize the profit generated by Qatar in the last 3 months for every category of product sold. A recommended way to gently rephrase the question would be as follows: Total Profits by Category when Sale Date

My Tableau Repository Folder.

4.1.3.2 Size block

No surprises here. This block allows you to change the size of the visualizations. Let’s build a quick visualization of a stacked bar chart of total profits generated by ship mode. We add the profit to the rows shelf, as we would like to sum over it, and drop the ship mode in the color block as we saw in the previous section. This provides a single multicolor bar, as illustrated in the visualization on the left of figure 23. We can quickly see that the total profits are the highest in the standard class of shipping and decrease progressively as we move up to same day shipping. But let us say we want to accentuate this and clearly make the point that total profits are very low on same day shipping. We would CMD / CTRL click on the SUM(Profit) pill on the rows shelf and drop it into the size block as well. This will ensure that the width of the bars is determined by the total profits. (Homework: show that the profitability of the orders is not influenced by the ship mode. It might be tempting to conclude that the same day shipping mode is not profitable.)

Figure 23: Sizing up the graphs

At the beginning of this section, I mentioned briefly that there is a dropdown tucked in sneakily above the size, color & label blocks. This dropdown allows us to change the shapes of the visualizations. Now is as good a time as any to take a detour and check this out. Let’s make a small change to the visualization in figure 24. Let’s empty the rows shelf and simply switch the shape from a bar to a square in the dropdown. We’ve got ourselves a beloved Treemap. Fancy a packed bubble chart ? Switch the square to a circle !

Figure 24: Change the shapes

4.1.3.3 Label block

We’re off to a great start with just 3 blocks on the marks card. You’ll be able to craft more than 90% of the typical graph types that you would require for your dashboard by playing with these combinations. Let’s run with the Treemap to showcase the label functionality. As highlighted in figure 25, drag the ship mode on to the label card. This allows you to annotate the visual elements in a chart. Clicking on the label card itself opens a popup which allows endless customization possibilities.

Figure 25: Annotating with labels

4.1.3.4 Detail block

Let’s go back to the basic bar chart to help us wrap our heads around the detail block. By now, you should be capable of building a bar chart with your eyes closed (almost!). In the previous section about labels, section 4.1.3.3, we saw that adding a dimension to the label annotates the graph. In figure 26, let’s start off with a simple total sales by category and add segment to the color card. Now you have 3 segments highlighted in each vertical bar. Let’s then add the ship mode to the label card. As you can notice, ship mode is not part of the initial configuration of the visualization. So visually nothing will look different apart from the labels that were just added, which might initially seem randomly strewn about. As you start hovering your mouse over the bars, you will notice that the bars are broken down into granular blocks. Each of the sub-blocks represents the ship mode corresponding to the segment and category combination, and now the labels start to make more sense.

Figure 26: Adding more details

This is just the same as adding ship mode to the detail block instead. In essence, the detail block allows you to break down a visual element into its granular details. As you can see, there are 3 pills in the Marks section: the first pill, segment, dictates the color, while the second and third pills, which are both ship mode, provide the granular details within each segment.

Note: There is a hierarchy in the order of the color and the detail blocks. If you flip the order of the segment and ship mode pills, you will find yourself with the adjacent visualization.

Figure 27: Hierarchy in details

4.1.3.5 Tooltip block

We can continue to add more details into a visualization, but there comes a point when the visualization is no longer meaningful, no longer decipherable, or both. Tooltips fortunately provide you with an elegant way to load more information without visually cramming your visualization. Used appropriately, tooltips can help you tell the right story effectively by reducing "Chartjunk".

"The interior decoration of graphics generates a lot of ink that does not tell the viewer anything new. The purpose of decoration varies— to make the graphic appear more scientific and precise, to enliven the display, to give the designer an opportunity to exercise artistic skills. Regardless of its cause, it is all non-data-ink or redundant data-ink, and it is often chartjunk." (Edward Tufte)

Figure 28: Hierarchy in details

As you can observe in figure 28, Furniture ranks second in sales across the 3 categories, but in terms of percentage contribution toward profit it ranks last. The triangle icon next to sales and profit marks the table calculation we used to compute the percentage contribution. We will get to it in section 6, which is dedicated to Table calculations.

4.1.4 Filters

We saw earlier, in section 3.2 (figure 9), how data source filters can help you filter the raw data upstream and get you exactly the dataset which you require for your analysis.

Figure 29: Filter data

Sometimes you’d like to filter the data just for one analysis and not for the entire dataset. This is where the filter pane comes into action. As shown in figure 29, when you drag and drop country into the filter pane, you will be greeted with a popup box which provides a host of customization options. Let’s keep it simple for now and start with the assumption that you’re interested in Albania and Algeria for your analysis. You would simply tick both countries and click the OK button to confirm your filter selection.

Figure 30: Show the filter

For this analysis, you’re now looking at data where the country column contains either Albania or Algeria. You can confirm this by right clicking on the country pill in the filter pane, as highlighted in figure 30, and ticking Show Filter. This option works well when you want to pre-select a country by default for your filter and still be able to change the filter values on the fly, either during your exploratory analysis or to give your end-users the flexibility to change the filter values in a dynamic dashboard (which we will get to in section 8). If you take a close look at the filter popup window in figure 29, you have three options:

  • Select from list
  • Custom value list
  • Use all

Select from list vs Use all ?

There is a very subtle difference between the first and third options. We just used the first option in figure 30, which allows us to pre-select certain values in a list. Let’s imagine that you had just 4 countries in your dataset and you had selected them all individually using the "Select from list" option in your filter. Let’s say there is a new value, "Tuvalu", in your country column when you refresh the dataset. The filter will be pre-ticked with just the 4 initial values you had chosen and will not select "Tuvalu" by default. If your initial intention was to pre-tick all the possible values, rather than to restrict the filter to those 4 on purpose, you might want to go with the "Use all" option instead. This way your filter is pre-selected with all the possible values in the column and not restricted to a few initial values. It all depends on what you are trying to achieve. You can go to bed and cover yourself or cover yourself and go to bed, so goes an adage in Tamil.

Figure 31: Custom Value list

The custom value list, as you can see in figure 31, on the other hand lets you define values which might or might not be in the initial list of values. Taking the example of Tuvalu, let’s say you know that the country column does not currently contain this value but will in the future. You can pre-define it and add it to your custom value list. This way the filter will be applied appropriately on the day Tuvalu shows up in your country column.
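The difference between the three filter options can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not Tableau's implementation; the country lists and the function names are hypothetical.

```python
def select_from_list(column_values, ticked):
    """'Select from list': only the values ticked at setup time pass."""
    return [value for value in column_values if value in ticked]


def use_all(column_values):
    """'Use all': whatever is in the column passes, including new values."""
    return list(column_values)


def custom_value_list(column_values, predefined):
    """'Custom value list': predefined values pass once they appear."""
    return [value for value in column_values if value in predefined]


# Hypothetical dataset with 4 countries, all ticked individually.
initial = ["Albania", "Algeria", "France", "Qatar"]
ticked = set(initial)

# After a data refresh, a new country shows up.
refreshed = initial + ["Tuvalu"]

print(select_from_list(refreshed, ticked))       # Tuvalu is left out
print(use_all(refreshed))                        # Tuvalu is included
print(custom_value_list(initial, {"Tuvalu"}))    # nothing to show yet
print(custom_value_list(refreshed, {"Tuvalu"}))  # applies once Tuvalu exists
```

Seen this way, "Select from list" stores a fixed set of values, "Use all" stores no set at all, and the custom value list stores a set that may run ahead of the data.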

Figure 32: On the fly filtering

Filtering on the fly

Most often, as you conduct your exploratory analysis, you would like to filter and deep dive into certain outliers or interesting values. In Tableau, it is as simple as right clicking on the value that you’d like to zoom into and then selecting Keep only. You could do this anywhere, from a simple bar (figure 32(a)) or a stacked bar filtered on multiple values (figure 32(b)) to individual values on a scatter plot (figure 32(c)). On a scatter plot, you could just click & drag a square over the points of interest before right clicking.

4.2 Putting it all together

Let’s bring all the different cards together and build the "sneaky" pie chart shown below. Sneaky because building a humble pie chart is notoriously complicated in Tableau. The visualization we would like to build will help us answer the following question: "Percentage contribution of profits by market in each of the categories of products being sold". Start by dropping the category on the columns shelf and switching the visualization type to "Pie" in the Marks section. This will create 3 circles of uniform color, one under each of the categories. The Marks section then gets populated with an extra card, "Angle", which will determine the size of the individual slices of the pie. The next step is to drag the market field on to the color card. This will create 6 uniformly sized slices in each of the pie charts. We’re starting to get closer to our end goal. Next, let’s bring the profits into the Angle card and the Size card. This will ensure that the size of the slices and the size of the pie itself are a function of the total profits.

Figure 33: The tricky pie
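The arithmetic behind the pie chart, the share of each market within a category's total profit, can be sketched in Python as follows. The rows are hypothetical sample values, not figures from the Global Superstore dataset, and only two markets are used to keep the example short.

```python
from collections import defaultdict

# Hypothetical (category, market, profit) rows.
rows = [
    ("Furniture", "APAC", 300.0),
    ("Furniture", "EU", 100.0),
    ("Furniture", "APAC", 100.0),
    ("Technology", "APAC", 250.0),
    ("Technology", "EU", 750.0),
]

# Total profit per (category, market) slice and per category (whole pie).
slice_totals = defaultdict(float)
pie_totals = defaultdict(float)
for category, market, profit in rows:
    slice_totals[(category, market)] += profit
    pie_totals[category] += profit

# Angle of a slice is proportional to its share of the category total;
# the size of each pie is proportional to the category total itself.
shares = {
    (category, market): total / pie_totals[category]
    for (category, market), total in slice_totals.items()
}

for (category, market), share in sorted(shares.items()):
    print(f"{category:<11} {market:<5} {share:.0%}")
```

In Tableau terms, the slice totals correspond to putting profit on the Angle card and the category totals to putting it on the Size card.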

4.3 Show Me

Figure 34: The "Show me" magic

With the 4 pillars of Tableau we have covered so far, you should be able to conjure up any visualization that you fancy. But sometimes, when you’re lacking inspiration, you can always turn to the "Show Me" ribbon. It can also help speed up the construction of your visualization. For example, building a simple, menial table with two measures in Tableau can be time-consuming, and that’s where the Show Me button can come to the rescue. As you can see in figure 34, clicking on Show Me on the top right unveils a popup window with a host of visualizations (some active and some greyed out, based on the values in the rows and columns shelves). Hovering over a visualization shows you the minimum list of dimensions and measures it requires.

4.4 Sheets & Dashboards

So far, for every visualization built, we created a new sheet. When you want to bring all these analyses together and start your "data storytelling", you have two options. The first option, Dashboard, allows you to bring together the various sheets of analysis in a single sheet. You can then apply global filters across all the sheets in your dashboard, or even use values in certain visualizations to cross-filter other visualizations in your dashboard. Building a dashboard is an art in itself. As we start cramming too many analyses into our dashboards, over time the dashboards tend to become illegible. We will learn more about dashboards and best practices in section 8. The second option, Stories, a relatively new addition in Tableau, allows you to build a strong narrative by stitching together various sheets. Stories allow you to add more context and highlight certain insights and salient points in your analysis.

Figure 35: Sheets, dashboards & stories

5.1 Grouping values

Most often, the data we have on our hands requires more preparation and manipulation. We might have to calculate new KPIs, look up values from another table, scale certain columns of values up or down, calculate percentages etc. Tableau provides you the required tools to automate these time-consuming steps. You can prepare the data either upstream in a tool such as Tableau Prep or Alteryx, or directly within Tableau. In this book, we will be covering the possibilities purely within Tableau. Let’s start with one of the simplest tasks which business users are often faced with: grouping values.

Task: Group a list of countries into various regions

Country    Regional Mapping
Mexico     North America
US         North America
Canada     North America
France     Western Europe
Italy      Southern Europe

Table 4: Lookup value to create grouping

We could easily resolve this task in Excel with a VLOOKUP. We would create a lookup table as shown in Table 4 and use the VLOOKUP formula in a new column to look up the appropriate regions for our list of countries. We could achieve the same result in Tableau in 3 different ways:

1. Using the Aliases or Group functionality to do the grouping
2. Data blending with relationships (section 5.4)
3. Joins (section 5.4)

Let’s start with the easiest approach, using Aliases. As illustrated in step 1 of figure 36, right click on the column (country) that you would like to group and duplicate it. In step 2, right click on the duplicated column to rename it as you see fit (Market Revised). The final step is to right click on this new column and click on Aliases, which opens up a popup window where you can define the new values. Alternatively, you could click on Create > Group, where you can group the values manually. Every time the data is refreshed, Tableau will automatically create this new column with this mapping. The caveat with this method is that it works well only for a handful of values (less than 10). As your mapping grows, it’s much easier to handle this with either Relationships or Joins, which we will see in section 5.4. These last 2 methods help you industrialize the mapping for bigger data sets.
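For readers who think more easily in code, the lookup from Table 4 can be sketched in a few lines of Python. This is only an illustration of the idea of a lookup table, not of how Tableau stores aliases or groups; the "Brazil" value and the "Other" default are assumptions added for the example.

```python
# Lookup table from Table 4: country -> region.
REGION_BY_COUNTRY = {
    "Mexico": "North America",
    "US": "North America",
    "Canada": "North America",
    "France": "Western Europe",
    "Italy": "Southern Europe",
}


def map_region(country, default="Other"):
    """Return the region of a country, or a default when it is not mapped."""
    return REGION_BY_COUNTRY.get(country, default)


# "Brazil" is a made-up value that is absent from the lookup table.
countries = ["US", "France", "Italy", "Brazil"]
regions = [map_region(country) for country in countries]
print(regions)
```

The default value plays the role of the "#N/A" that a VLOOKUP would return for a country missing from the lookup table.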

Figure 36: Grouping values with Aliases

You might want to do the same grouping but with numeric quantities. In our superstore dataset, we have a column quantity which contains the values 1-14. Let’s say you were conducting a study on how quantities affect your bottom-line profitability and you notice that when you ship fewer than 5 items, your shipping costs are low and, as a consequence, you generate higher profit margins. On the other hand, the profit margins drop slightly with every incremental item but remain fairly consistent up to 10 items before the next steep drop occurs. To represent this visually, you would like to group the quantities into 3 buckets based on their magnitude (1-5, 6-10 and > 10).
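The three buckets amount to a simple rule on the quantity. Here is a minimal Python sketch of that rule; the function name is hypothetical and the bucket boundaries are the ones stated in the text.

```python
def quantity_bucket(quantity):
    """Assign a quantity to one of the three buckets: 1-5, 6-10 or > 10."""
    if quantity <= 5:
        return "1-5"
    if quantity <= 10:
        return "6-10"
    return "> 10"


# The quantity column holds the values 1 to 14.
buckets = {quantity: quantity_bucket(quantity) for quantity in range(1, 15)}
for quantity, bucket in buckets.items():
    print(quantity, bucket)
```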

Figure 37: Grouping numerical values

As illustrated in figure 37, right click on the column quantity and, under the Create sub-menu, click on Group. This opens up the window on the right where you can select multiple values and click on Group to create your desired grouping.

Histograms to understand distribution of numeric columns ?

Let’s say during your exploratory data analysis, you would like to construct a histogram to understand the distribution of numeric quantities. If you choose Bins instead of Group as above, Tableau will automatically create equal-sized bins. In the case of the "quantity" column, Tableau suggests a bin size of 1.77 (probably using the Freedman–Diaconis rule). It’s then as easy as dragging quantity (bin) onto the columns shelf and a count of the quantity onto the rows shelf, which provides a frequency distribution of the values in the quantity column.
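To make the idea of equal-sized bins concrete, here is a small Python sketch that counts values per bin. It assumes that each bin covers the half-open interval from index * size to (index + 1) * size; the sample quantities are made up, and only the bin size of 1.77 comes from the text.

```python
from collections import Counter

BIN_SIZE = 1.77  # the bin size suggested by Tableau in the text


def bin_index(value, bin_size):
    """Return the index of the equal-sized bin that a value falls into."""
    return int(value // bin_size)


# Made-up sample of quantities, for illustration only.
quantities = [1, 2, 2, 3, 5, 5, 8]
histogram = Counter(bin_index(q, BIN_SIZE) for q in quantities)

for index in sorted(histogram):
    lower = index * BIN_SIZE
    print(f"[{lower:.2f}, {lower + BIN_SIZE:.2f}): {histogram[index]}")
```

Each printed line corresponds to one bar of the histogram: the bin on the columns shelf and the count on the rows shelf.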

Figure 38: Histograms in Tableau

One other frequent data preparation step is to combine two columns. Imagine you have two columns, first name and last name, which you need to concatenate into one column. In Excel, the "&" operator comes to the rescue in a formula (=A2 & ", " & B2). In figure 39, you see how to combine the segment & ship mode columns. Start by clicking the segment and ship mode dimensions while keeping CTRL / CMD pressed down and then select Create > Combined Field. This will create a new combined column, which you can then right-click, selecting "Edit combined column" to specify the delimiter between the values of the two columns.

Figure 39: Concatenating columns

5.2 Calculated Fields

Calculated fields allow us to apply formulae to modify existing columns of data or combine multiple fields. In the superstore data set, we have a column called shipping cost and another column with the quantity. Let’s say we would like to compute the unit shipping cost by dividing the shipping cost column by the quantity column. Start by right clicking shipping cost and selecting "Create > Calculated Field". This should open up a popup window as you see in figure 40. Now, using basic mathematical operators, you can string together formulae to your heart’s content.

Figure 40: Calculated columns

Null Traps in calculated fields ?

Watch out for null values, especially in your numeric columns. Here’s a fabricated example: let’s replace the profits by Null for just the "Consumer" segment with the following formula in a new column called Null’d profit:

IF [Segment]="Consumer" THEN NULL ELSE [Profit] END

As you can see in figure 41, we need to be careful while adding columns with null values. The function ZN can come in handy in such cases to replace the nulls with zeros.
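The null trap can be mimicked in Python, with None standing in for Null. This is a sketch under the assumption that a row-level addition involving a null yields a null; the helper names and the sample rows are made up for the illustration.

```python
def zn(value):
    """Mimic ZN: return the value if it is not null (None), otherwise zero."""
    return 0 if value is None else value


def null_safe_add(a, b):
    """Row-level addition in which a null on either side yields a null."""
    if a is None or b is None:
        return None
    return a + b


# Made-up rows, for illustration only.
rows = [
    {"segment": "Consumer", "sales": 200.0, "profit": 40.0},
    {"segment": "Corporate", "sales": 150.0, "profit": 30.0},
]

for row in rows:
    # Same logic as the Null'd profit formula in the text.
    row["nulld_profit"] = None if row["segment"] == "Consumer" else row["profit"]
    # Without ZN the null propagates; with ZN it is treated as zero.
    row["naive_total"] = null_safe_add(row["sales"], row["nulld_profit"])
    row["zn_total"] = row["sales"] + zn(row["nulld_profit"])
    print(row["segment"], row["naive_total"], row["zn_total"])
```

For the Consumer row the naive total is null, while the ZN version still returns the sales figure.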

Figure 41: Null values need to be replaced with zeros before calculations

Tableau’s documentation provides an extensive list of Tableau functions at your disposal. But here’s a handy cheatsheet of functions.

Commonly used Tableau functions

ABS(number): Returns the absolute value of a number. e.g., ABS(-42) = 42
AVG(column): Returns the average of all the numeric values (Note: Null values are ignored in the calculation).
CASE expression WHEN value1 THEN return1 WHEN value2 THEN return2 ... ELSE default return END: Can be used to create a custom mapping. e.g., CASE [Region] WHEN 'West' THEN 1 WHEN 'East' THEN 2 ELSE 3 END
CEILING(number): Rounds a number to the nearest integer of equal or greater value.
CONTAINS(string, substring): Checks if the string contains the specified substring. e.g., CONTAINS("Sub-Category","Cat") = True
COUNT(expression): Returns the number of items in a group. Null values are not counted.
COUNTD(expression): Returns the number of distinct items in a group. Null values are not counted. Let’s say you have a column with the products "Car", "Bike" and "Space ships", each repeated 10 times. The COUNTD function will return a value of 3.
DATE(expression): Returns a date given a number, string, or date expression. e.g., DATE("April 15, 2004") = April 15, 2004; DATE("4/15/2004"); DATE(#2006-06-15 14:52#) = 2006-06-15
DATEADD(date_part, interval, date): Useful when you need to offset dates. e.g., DATEADD('month', 3, #2004-04-15#) = 2004-07-15 12:00:00 AM; DATEADD('day', 42, [Order Date]) adds 42 days to all values in the order date column.
DATEDIFF(date_part, date1, date2, [start_of_week]): Useful to calculate the difference between two dates (date2 minus date1). e.g., DATEDIFF('month', #2013-09-22#, #2013-12-24#) = 3; DATEDIFF('year', #2013-11-11#, #2011-11-11#) = -2
DATEPART(date_part, date, [start_of_week]): Useful to recover just the part of the date that you’re interested in. e.g., DATEPART('year', #2004-04-15#) = 2004; DATEPART('month', #2004-04-15#) = 4
DATETRUNC(date_part, date, [start_of_week]): Useful to truncate dates. e.g., DATETRUNC('quarter', #2004-08-15#) = 2004-07-01 12:00:00 AM; DATETRUNC('month', #2004-04-15#) = 2004-04-01 12:00:00 AM
DAY(date): Returns the day of the given date as an integer.
FIRST( ): Returns the number of rows from the current row to the first row in the partition. We will see more about this in section 6.
IF test1 THEN value1 ELSEIF test2 THEN value2 ELSEIF test3 THEN value3 ELSE value4 END: A typical If-Else, which can be used like CASE-WHEN statements to create custom mappings and groupings. e.g., IF [Age] < 21 THEN 'Under 21' ELSEIF [Age] [...] [Budget Cost], 'Over Budget', 'Under Budget')
INDEX(): Returns the index of the current row in the partition, without any sorting with regard to value. The first row index starts at 1. We will see more about this in section 6.
LEFT(string, number) || RIGHT(string, number): Truncate a string to a given number of characters starting from the left or right respectively. e.g., LEFT("Shankar", 4) = "Shan"; RIGHT("Shankar", 3) = "kar"
LOOKUP(expression, [offset]): Returns the value of the expression in a target row, specified as a relative offset from the current row. We will see more about this in section 6.
LOWER(string) || UPPER(string): Convert uniformly to lowercase or uppercase respectively.
MAX(a, b) || MIN(a, b): Return the maximum / minimum of a and b respectively.
NOW( ): Returns the current date and time.
ROUND(number, [decimals]): Rounds numbers to a specified number of digits.
RUNNING_AVG(expression) || RUNNING_SUM(expression): Return the running average / sum of the given expression, from the first row in the partition to the current row. We will see more about this in section 6.
ZN(expression): Returns the expression if it is not null, otherwise returns zero. Use this function to use zero values instead of null values. e.g., ZN([Profit]) = [Profit] when the profit is not null, and 0 otherwise.
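The date functions are the ones that most often surprise newcomers, so here is a rough Python sketch of two of them, checked against the examples in the cheatsheet. It assumes that a month difference counts the month boundaries crossed between the two dates; the function names are hypothetical and this is not Tableau’s own implementation.

```python
from datetime import date


def datediff_month(start, end):
    """Count the month boundaries crossed going from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def datetrunc_month(d):
    """Truncate a date to the first day of its month."""
    return d.replace(day=1)


def datetrunc_quarter(d):
    """Truncate a date to the first day of its quarter."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)


print(datediff_month(date(2013, 9, 22), date(2013, 12, 24)))
print(datetrunc_quarter(date(2004, 8, 15)))
print(datetrunc_month(date(2004, 4, 15)))
```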

5.3 Row level, Aggregation & Dis-aggregation

When you author formulae in Tableau by referencing the column names directly, the calculations are done at the level of each row. e.g., [Profit] / [Sales] would calculate the value of this column for each row by dividing the respective values of profit and sales in that row. Hence these are called row level operations. On the other hand, if you were to modify the formula slightly to SUM([Profit]) / SUM([Sales]), the values will be aggregated to the level of detail visible in your visualization before the division occurs. e.g., if your rows shelf contains the column segment as illustrated in figure 42, then the ratios would be calculated for each of the three unique values (consumer, corporate and home office) by aggregating profits and sales first before dividing.
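The difference between the two formulae is easy to verify with a couple of made-up rows. In this Python sketch the first calculation mirrors averaging the row level [Profit] / [Sales], while the second mirrors SUM([Profit]) / SUM([Sales]); the numbers are invented for the illustration.

```python
# Two made-up order rows in the same segment.
rows = [
    {"segment": "Consumer", "profit": 10.0, "sales": 100.0},
    {"segment": "Consumer", "profit": 90.0, "sales": 200.0},
]

# Row level: divide first, then average the ratios.
row_level_ratios = [row["profit"] / row["sales"] for row in rows]
average_of_ratios = sum(row_level_ratios) / len(row_level_ratios)

# Aggregated: sum profits and sales first, then divide.
ratio_of_sums = sum(row["profit"] for row in rows) / sum(row["sales"] for row in rows)

print(f"average of ratios: {average_of_ratios:.3f}")
print(f"ratio of sums:     {ratio_of_sums:.3f}")
```

The two results differ because the second order carries twice the sales of the first one and therefore weighs more in the consolidated ratio.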

Figure 42: Average of Averages is not the same as consolidated averages

By default, Tableau will aggregate the measures for you: the option Analysis > Aggregate Measures in the menu bar should be ticked by default. In case you would like Tableau not to aggregate the measures and to show the row level data instead, unticking this option will get you the result you intend. (Note: you can achieve the same result by converting the measure into a dimension as well.) This option is especially useful while you’re building scatter plots, where you want to compare two measures against each other at row level rather than aggregated.

5.4 Bringing in more data

For fast and reactive Tableau dashboards & reports, it’s usually a good idea to have a data source where you have already consolidated your data at the right level of aggregation. But most often this is a dream in a utopian world and you’re confronted with various data sources (databases, stray Excel and CSV files etc). Luckily, Tableau provides various options to consolidate your disparate data sources. The logic is the same after you import the data; it’s just the import steps which vary between the various data sources. The first two in the list below are essentially the same and hence we will look at the process of importing another tab in the sheet.

1. Another tab in the Excel sheet (exactly the same as importing another Excel file)
2. Import a CSV file
3. MySQL Database - SQL

5.4.1 From Excel/ CSV

In the examples we covered so far, we used the orders sheet from the Global superstore.xlsx file. Unfortunately, this data set does not contain the sales representatives who are responsible for the various geographic regions. Let’s say we would like to compare the performance of the various sales representatives relative to each other and eventually evaluate their individual performance over the past few years. In the world of Excel, now would be the time for a VLOOKUP to shine in all its glory. To do the same in Tableau, let’s import data from the people tab, as it contains the sales representatives who are responsible for the different geographic regions, and blend it with the orders data. You can either click on the New data source button as highlighted in figure 43, or select Data > New Data Source in the menu ribbon, or navigate to the Data Source tab at the bottom and click on the "Add" button next to the connections. All roads lead to Rome!

Figure 43: Bring in some data from another tab

Once you have imported the data from the people tab, you will notice that there are two data sources listed at the top of the data pane, and you can confirm that by navigating to the Data menu. Clicking on Edit Blend Relationships will open up the popup you see in figure 44. Had you peeked at the people tab while you imported the data, you would have noticed that it contains 2 columns (Person & Region). Tableau recognises that there is a region column in the orders tab as well and automatically defines a relationship. (Note: For this automatic detection to work, the column names need to be exactly the same in both data sources.)

Figure 44: Data blending

In case Tableau defined the relationships incorrectly, or if you need to add or edit a relationship, the Add/Edit buttons will open up a popup similar to figure 45, which allows you to select the right column on either side (primary & secondary data).

Figure 45: Adding/Editing the Blending relationship

Now that we have added the two data sources and defined the relationships between the two tables, we are all set for some "Data blending" action. From the orders table, let’s bring the region field into the rows and then switch to the people table. You’ll notice a small link in red, as highlighted in step 3a (figure 46). This link, when highlighted, indicates that the relationship is active, and now when you bring in the person field, you will get the name of the sales representative assigned to each region. If you deactivate the relationship as highlighted in step 3b, you will notice that Tableau does not know how to bring in the values from the person field in the current view and shows an asterisk instead.

Figure 46: Data Blending

ATTR

ATTR returns the value of a field when there is a single value for the rows in question, and an asterisk when there are multiple values while Tableau is expecting a single one. In the above example, we brought in the field person and it so happened that there was exactly one sales representative per region; hence Tableau had no issue displaying it. On the other hand, if we had multiple sales representatives assigned to each region, Tableau would display a "*" next to the region, indicating that there are multiple values. One possible workaround is to try to make the relationship 1:1 by adding more granular details to the people tab to ensure that the relationship between the two tabs is unique, or even better, to try leveraging Relationships (section 5.6).

Whenever you blend data in Tableau, you will be able to identify the primary and secondary data sources in your blend by the blue and red tick marks next to the data sources respectively (highlighted in figure 47). Double clicking on the Region pill will show you that Tableau applies ATTR by default behind the scenes to ensure that a single value is brought to the front. A subtle point to keep in mind on your blending extravaganza: by default, the unique values in your primary data source constitute the starting point, which is then completed with relevant information from your secondary data sources. We have 13 unique regions and 13 unique sales representatives in our dataset. As a result, you could start with the orders data as the primary and people as the secondary, or vice-versa, and you’re sure to have 13 lines in either visualization. But let’s say you have just 6 sales representatives in the people tab: you’ll notice that the order of the primary and secondary data sources matters. I will let you convince yourself of why that happens.
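The behaviour of ATTR described above can be sketched in Python as follows: return the value when the minimum and the maximum of the group are the same, and an asterisk otherwise. The region and person names are made up, and the function is an illustration of the rule rather than Tableau’s own code.

```python
def attr(values):
    """Return the single value of a group, or "*" when there are several."""
    values = list(values)
    if not values:
        return None
    return values[0] if min(values) == max(values) else "*"


# Made-up people table: one representative in Central, two in North.
people = [
    ("Central", "Anna"),
    ("North", "Ben"),
    ("North", "Cleo"),
]

persons_by_region = {}
for region, person in people:
    persons_by_region.setdefault(region, []).append(person)

display = {region: attr(persons) for region, persons in persons_by_region.items()}
print(display)
```

With one representative per region the name is shown; as soon as a region has two, the asterisk appears.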

Figure 47: Data blending

This process of dynamically defining relationships after the data has been imported is called "Data Blending". We could achieve the same result through Joins as well, which happen slightly upstream, in the process of importing. As you import the data, you directly specify the relationship (Left, Right, Outer or Inner join) between the various data sources and then import the data into Tableau. For a more detailed explanation of Joins and how they differ from relationships, please refer to Tableau’s help pages.

Do keep in mind that with blending, the relationships between the data sources are defined dynamically and, as a result, the data sources are not physically joined and materialized as a table. With joins, on the other hand, the relationship is concretely manifested with the creation of a flat (denormalized) table containing the columns from the various data sources. Tableau recommends that you avoid the use of joins as much as possible and make use of either Relationships or Blending capabilities. Since version 2020.2, Tableau has introduced the concept of Data Modeling and Relationships, which we will cover in more detail in section 5.6.

In the Data Source tab, you first need to double click on the Orders brick or on the Open option before joining the secondary data source. Now when you drop in your secondary data source, Tableau will show you a Venn diagram which allows you to specify the join type, as highlighted in figure 48. Essentially there are 4 types of joins: LEFT, RIGHT, INNER and OUTER. YouTube contains a plethora of informational videos for those who wish to dig deeper into the topic of joins.
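If the four join types are new to you, the following Python sketch shows which rows each of them keeps. It is a simplified illustration that assumes the join column is unique on both sides; the regions, profits and names are made up.

```python
def join(left, right, how):
    """Join two dicts keyed on the join column; missing sides become None."""
    if how == "inner":
        keys = left.keys() & right.keys()
    elif how == "left":
        keys = left.keys()
    elif how == "right":
        keys = right.keys()
    elif how == "outer":
        keys = left.keys() | right.keys()
    else:
        raise ValueError(f"unknown join type: {how}")
    return [(key, left.get(key), right.get(key)) for key in sorted(keys)]


# Made-up data: "South" has no representative, "West" has no orders.
profit_by_region = {"Central": 100, "North": 50, "South": 70}
person_by_region = {"Central": "Anna", "North": "Ben", "West": "Cleo"}

for how in ("inner", "left", "right", "outer"):
    print(how, join(profit_by_region, person_by_region, how))
```

The inner join keeps only the regions present on both sides, the left and right joins keep everything from one side, and the outer join keeps everything from both.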

Figure 48: Data joins

5.4.2 From MySQL

Figure 49: Custom SQL

So far, we worked with data imported from Excel and CSV. Let’s take a look at the process of importing data from a database such as MySQL. The process is exactly the same for all similar databases. As highlighted in step 1 of figure 49, you will need to fill in the credentials of the database you’re trying to connect to. In this example, I have connected to a MySQL database running locally on my computer at the address 127.0.0.1 (localhost) and the default port 3306. You will need to provide the name of the database along with the username and password to complete the connection. Once connected, Tableau will show you the various tables available in your database. You can either drag and drop the tables directly or use New Custom SQL to pull in precisely the columns you need at the granularity that you require. Here’s a small SQL query to give you a flavor of how this works.

SELECT a.market, b.new_market AS revised_market, a.order_date, SUM(a.profit)
FROM orders AS a
LEFT JOIN market_groupings AS b ON a.market = b.market
GROUP BY a.market, b.new_market, a.order_date

The query pulls the market, order date and total profits from the table orders and merges them with another table called market_groupings to get the revised market groupings. The GROUP BY clause is what makes SUM(a.profit) a total per market and order date. The end result of this SQL query is a table with 4 columns that you can use as a data source in your visualizations.

Whenever you combine two or more data sources, the level of granularity (cardinality) is extremely important. To illustrate this, let’s take a look at the typical budgeting exercise that business analysts and financial controllers undertake on a quarterly, monthly or even daily basis.
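If you do not have a MySQL server at hand, you can try out a query of this kind with Python’s built-in sqlite3 module. The sketch below uses an in-memory SQLite database as a stand-in for MySQL, with a few made-up rows; the table and column names follow the query in the text, and a GROUP BY and an ORDER BY are included so that the result is well defined.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (market TEXT, order_date TEXT, profit REAL);
    CREATE TABLE market_groupings (market TEXT, new_market TEXT);
    -- Made-up rows, for illustration only.
    INSERT INTO orders VALUES
        ('US', '2014-01-05', 100.0),
        ('US', '2014-01-05', 50.0),
        ('Canada', '2014-01-06', 20.0);
    INSERT INTO market_groupings VALUES ('US', 'North America');
    """
)

query = """
    SELECT a.market, b.new_market AS revised_market, a.order_date, SUM(a.profit)
    FROM orders AS a
    LEFT JOIN market_groupings AS b ON a.market = b.market
    GROUP BY a.market, b.new_market, a.order_date
    ORDER BY a.market
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

Canada has no entry in market_groupings, so the LEFT JOIN keeps its row and leaves the revised market empty, while the two US orders of the same day are summed into one row.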

5.5 Importance of cardinality - A practical example

Imagine the financial team has defined forecast targets for profits of the 7 different markets by month. Now when we want to compare the actual performance of the profits in the orders table against the forecast targets, we cannot do it directly as the level of granularity is not the same. The orders table is much more granular, as it contains individual orders by market, country, product and order id, while the targets are defined at the level of market and month. In order to consolidate the two tables, we need to ensure that they have the same level of granularity on either side. Let’s say the finance team also has the targets defined for the 7 markets by day; this case is a little more straightforward as the dates are at the same level of granularity on either side.

Let us start by defining the blending relationships between the orders table and the two targets tables (daily and monthly) respectively. Once you import the daily and monthly targets tabs from the Excel file, follow the steps highlighted in section 5.4.1 to start defining the blending relationship between the orders and targets tables. For the daily targets table, choose market and MDY(date) on both tables (MDY stands for Month Day Year in Tableau). On the other hand, since the targets are defined at a monthly level in the Monthly Targets data, make sure to choose market and MY (Month Year) of the date on either side, as highlighted in figure 50.

Figure 50: Blending relationships at different levels of granularity

Let’s do a quick sanity check on the targets in the two data sources. We notice that the sum of target profits in either data source adds up to 3.5M USD ($3 514 968). Let’s start building a comparison of targets against actual profit. We could take two approaches depending on the (primary) data source that we start with. In the first approach, let’s use the monthly profit target as the starting point (primary data source). Drag the market and target profits from the monthly profit target data source onto the viz pane. In the second step, bring in the actual profits from the orders table. You’ll notice the orders table gets a small red tick indicating that it’s a secondary data source. Let’s activate the blending relationship on market and order date, and you’ll notice that you get the same total target profits of 3.5M USD as expected.

Figure 51: Blending at market and order date

Now for our second approach, let’s start off with the orders as the primary data source (blue tick) and complete it with the monthly targets data. This time, let’s activate the blending relationship on the market and the order month on the target data source. You’ll notice that, surprisingly, the total target profits don’t match the total of $3 514 968; instead we get a total target profit of $3 513 837. We’re indeed missing 1 131 USD.

Figure 52: Blending at market level

This head-scratcher is primarily related to the activation of blending relationships. Let me give you the solution before explaining the why and how behind the fix. Keep the market relationship activated while deactivating the order month, and you miraculously get back the true total target profits on either side, as highlighted in figure 52.

Let’s now try to explain this mysterious phenomenon. If you take a look at the target table, you will notice that it is exhaustive in terms of targets defined by market and by month, i.e., every single month from January 2011 to December 2014 has a target assigned for each market. On the contrary, we have not made a sale every single month in each of the markets. As a result, the second approach (figure 51) brought in just the months in which a sale was made, as the orders table was the primary data source. We blended this against the target profits at the market and monthly level. Hence Tableau did not bring in the targets for those "market & month combinations" where no sale was made. By removing the blend on the month and keeping it on just the market, we tell Tableau to bring in all sales at the level of the market and compare them against the total target profits irrespective of the month. This fixes the issue, as we notice in figure 52. Digging a little deeper (figure 53), we notice that Canada did not generate any profits in January 2012 while it had a target of 1 131 USD, which was the missing amount. I will let you conduct the same operation on the daily targets table to identify the markets and days on which no sale occurred.

Figure 53: The discrepancy comes from Canada not making a sale.

Blends are equivalent to left joins

The one key takeaway from this lengthy exercise on cardinality is to keep in mind that blends are equivalent to left joins. You start with the rows of your primary data source and, based on how you activate the blending relationship, you might bring in or exclude rows from the secondary data source.
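The missing target can be reproduced with a toy example. The Python sketch below treats a blend as a left join from the primary onto the secondary data source, which is the simplification stated above. Apart from the Canada target of 1 131 for January 2012, all the figures are made up, and the helper names are hypothetical.

```python
def left_blend(primary, secondary):
    """Keep every key of the primary and look it up in the secondary."""
    return {key: (value, secondary.get(key)) for key, value in primary.items()}


def total_by_market(table):
    """Aggregate a (market, month) table to the level of the market."""
    totals = {}
    for (market, _month), value in table.items():
        totals[market] = totals.get(market, 0) + value
    return totals


# Targets exist for every market and month; orders do not.
targets = {
    ("Canada", "2012-01"): 1131,
    ("Canada", "2012-02"): 900,
    ("US", "2012-01"): 5000,
}
orders = {
    ("Canada", "2012-02"): 400,
    ("US", "2012-01"): 4500,
}

true_total = sum(targets.values())

# Orders as primary, blended on market and month: Canada 2012-01 is lost.
by_market_and_month = left_blend(orders, targets)
blended_total = sum(t for _p, t in by_market_and_month.values() if t is not None)

# Orders as primary, blended on market only: all targets are recovered.
by_market = left_blend(total_by_market(orders), total_by_market(targets))
market_total = sum(t for _p, t in by_market.values() if t is not None)

print(true_total, blended_total, market_total)
```

The difference between the true total and the blended total is exactly the target of the month in which Canada made no sale.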

5.6 Data Modeling

Now that you’ve got a firm grasp of data blending in Tableau and the nitty-gritty details of the order of joins, you might be wondering "Couldn’t this be easier in Tableau ?" You are definitely not alone, and Tableau seems to have taken notice. Since version 2020.2, Tableau has introduced the concept of Data Modeling and relationships, similar to what you might have seen in competing tools such as Power BI. The Data Model enables you to define relationships between various data sources, and Tableau does the heavy lifting behind the scenes to pull in the right data at the right level of cardinality. Let’s walk through the previous example of the target vs actuals comparison that we achieved through blending, but this time with the aid of Data Modeling and Relationships. Let’s go to the Data Source tab and drag in both the orders and daily Profit target tables. As soon as you drag in the targets table (highlighted in 1), you’ll notice the orange line (informally called the noodle) and the popup window which shows you the relationship (highlighted in 2). In our case, let’s select market and order date/order daily on either side respectively and close the relationship panel.

Figure 54: Data Modeling - Step 1 Noodle

Now when we navigate back to the sheets, we notice that Tableau shows all the data sources for which we have defined a relationship. With blends, we had to pay attention to where we started; otherwise our numbers did not exactly match our expectations.

Figure 55: Data Modeling - Step 2

With relationships, all roads lead to Rome and, as you can notice, the totals are the same irrespective of your starting point, i.e., markets from the orders table or from the daily target table. The astute observers among you might have noticed that Tableau has added an extra row in the second table, where we start with the markets from the orders table. Tableau shows us that there are markets for which we have defined daily targets but which have not made a corresponding sale.

Logical vs Physical layer ?

Keep in mind that whenever you see a noodle (link), you’re in the logical layer (the results are not manifested as flat tables). Whenever you see Venn diagrams, you’re in the physical layer, where joins are made and flat denormalized tables are created.

A handy recap of Relationships vs Joins vs Blends

Figure 56: Relationships vs Joins vs Blend

6.1 Show me or start from scratch ?

Tableau unfortunately sometimes has the tendency to make the easiest of things unbelievably hard. Let’s say you build a simple bar chart to show the profits by market, as highlighted in figure 57, and would simply like to flip it to a table instead.

Figure 57: Show me faster

Armed with what you know so far, you would get 5 extra points if you tried to switch the chart type from Bar to Text. But unfortunately, that would not get you the result you had in mind, and this is one of the most frequent points of frustration for beginners. The only way to do this manually is to fastidiously rearrange the pills (market on rows and SUM(Profit) on the text card in the marks section). This is where the Show Me pane can come in handy. In a click, you would be able to flip to a Table or even the infamous pie chart.

Adding Measures to Tables

Now let’s say you have managed to put together a simple table with your profits by market and you would like to add the total sales as a column as well. You need to be careful to drag the sales pill on top of the existing profit column, as highlighted on the right. If you drop it on the text shelf (as highlighted on the left), Tableau will create two rows, which is not what we had in mind. You could still click on "Table" in the Show Me ribbon to force Tableau to create two columns.

Figure 58: Table tips

The Show me "Table" can only take you so far. Let’s get our hands dirty and build a table from scratch to understand the nitty-gritty details. Let’s answer a simple question: how much profit and sales are we making by region, broken down by the quantity of items we’re selling ? A couple of subsidiary questions we could answer from the table include "when we sell a lot of quantities, is there any correlation between sales and profits ?", "are there any outliers in terms of regions ?" etc. I will admit that we could build better visualizations to answer these questions but, for pedagogical reasons, we will stick to a table.

Figure 59: Table from scratch

We can effectively build this table in 4 steps as illustrated in figure 60.

Figure 60: Table from scratch in 4 steps

Step 1: Let’s drag the market and the quantity (groups) onto the rows and columns shelves respectively.
Step 2: Now we need to have profits & sales as two rows next to each market. Here comes the tricky part. We need to drag Measure Names next to the market. You will see the rows populated with No Measure Value.
Step 3: Drag Measure Values onto the Text shelf and this will create as many rows as you have measures.
Step 4: Tableau handily adds Measure Names to the filter pane, which lets us filter out the unwanted measures and keep just the profit and sales that we require.

If we need to add a new dynamic row which gives the ratio of total profit to total sales for each market and quantity group intersection, you can just double click in the Measure Values pane and type in the formula SUM([Profit]) / SUM([Sales]) to add a new row.

6.2 Table totals

Once you’ve got yourself a pretty table, the next logical step would be to add Grand Totals along the rows or columns as needed. You can switch to the "Analytics" tab as highlighted below and, as you drag Totals onto the viz pane, choose the Total type. Alternatively, you could achieve the same result from the Analysis drop-down in the menu bar. The Totals option there provides more customization, such as displaying the column grand totals at the top or bottom of the table, adding sub-totals etc.

Figure 61: Table Totals

6.3 Table calculations

Table calculations will help you narrow down on answers to advanced analytic questions. Say, for example, you would like to visualize the evolution of the percentage split of sales across the 3 segments in all the markets over the last 4 years. It’s a lot easier to start off with a table, to ensure that the numbers make sense, before flipping to a pretty chart, even though this might seem a little counter-intuitive at first.

Figure 62: Table calculations lay the foundation

Before we get to building this visualization, let’s start off nice and easy with a few simple warm up Tableau calculations. For the following exercises, let’s stick to Running Totals as the logic remains exactly the same across the various table calculation functions.

6.3.1 Table & Pane - Down & Across

Figure 63: Table Down

Let’s say you’re doing a deep dive into the year 2011 and want to see how the profits have been growing cumulatively over the months of the year. The illustration in figure 63 will help you answer the question. For the first 2 examples in this section 6.3.1, which concern a simple table, the Quick table calculations menu will get you through. Once you have added the Running total via the quick table calculation, you can right click again and choose Edit calculation to see what Tableau is doing behind the scenes. As you can see (highlighted in step 2), Tableau is running a simple Table Down calculation. If instead of having the months along the rows, we had them along the columns, the table calculation turns into a Table Across instead of a Table Down, as highlighted in figure 64.

Figure 64: Table Across

The Table is the container which holds the rows and columns of data. As you add more dimensions to a table, the table gets further broken down into panes.

Figure 65: Pane Down & Pane Across

When you add a table calculation, Tableau will by default apply a Table Down or Table Across calculation through your entire table. Let’s say you would like to analyze the cumulative performance by month not just for one year but across the years. Let’s remove the filter and add the year dimension ahead of the month dimension. In our example here, your table gets broken down into 4 panes as there are 4 years of data. The Edit Table calculation window will now reveal an extra option, Pane Down. The running total resets itself at the end of each pane, giving us the right cumulative total for each year. Switching the year and month from the row shelf to the column shelf makes it a Pane Across calculation, as highlighted in figure 65.

Where do calculations reset in Table Calculations ?

When you click on Edit table calculation, Tableau will highlight in yellow the scope of your Table or Pane calculation. The calculation resets to zero at the end of the highlighted zone and restarts in the new pane.
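The difference between a Table Down and a Pane Down running total can be sketched outside Tableau. The Python snippet below is only an illustration under assumed data: the rows and the function names table_down and pane_down are hypothetical, and each year plays the role of a pane.

```python
from itertools import accumulate, groupby

# Hypothetical (year, month, profit) rows, already in display order.
rows = [
    (2011, "Jan", 10), (2011, "Feb", 20), (2011, "Mar", 30),
    (2012, "Jan", 5), (2012, "Feb", 15), (2012, "Mar", 25),
]

def table_down(rows):
    """Running total over the whole table (never resets)."""
    return list(accumulate(profit for _, _, profit in rows))

def pane_down(rows):
    """Running total that restarts at the top of each pane (here: each year)."""
    totals = []
    for _, pane in groupby(rows, key=lambda r: r[0]):
        totals.extend(accumulate(profit for _, _, profit in pane))
    return totals

print(table_down(rows))  # [10, 30, 60, 65, 80, 105]
print(pane_down(rows))   # [10, 30, 60, 5, 20, 45]
```

The only difference between the two functions is where the accumulation restarts, which is what the yellow highlight in the Edit table calculation window shows.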

6.3.2 Down then Across & Across then Down

Let’s kick it up a notch, add an extra dimension (segment) along the columns and focus on quarterly profits instead of monthly profits to simplify the table. Now that we have panes along the rows (years) and segments along the columns, Tableau presents a few extra options, Down then Across and Across then Down, for both Table and Pane.

Figure 66: Down & Across

In figure 66, we illustrate Down then Across for both Table and Pane. As you can see in the Table Down then Across option, Tableau goes down the entire list of rows with the running total for the first segment and then climbs back up to the next segment to continue the running total. If you pay close attention to the Edit table calculation window, under Specific Dimensions you will notice that all three dimensions are selected. In Pane Down then Across, the yellow zone stops at the end of year 2011, indicating that the running total resets itself at the end of every year. Under Specific Dimensions, you will notice that the year is unticked while the segment and quarter are both ticked. In Across then Down, the order of operations is transposed: the running total goes horizontally to the end of the row before continuing from the beginning of the next row.

6.3.3 Shortcut to reading Table calculations in English

Across then Down and Down then Across can help you through small tables with a max of 3 dimensions, in which you can clearly visualize the flow of the calculations. But as you start adding more dimensions or need more granular control over the calculations, it’s better to master the Specific Dimensions section in the Edit table calculation window. At first glance, it’s not the most intuitive menu and most analysts don’t even know that you can reorder the rows in the Specific Dimensions list by holding down on the values.

Figure 67: Down & Across

Instead of relying on Down and Across calculations, let’s switch to Specific Dimensions directly as highlighted in figure 67. There is a method to the madness in reading the Edit table calculation window while using Specific Dimensions, and it will help you crush table calculations once and for all. Start with the unticked dimensions ("For each segment"), follow it up with the calculation that needs to be done ("Calculate the running total") and wrap it up with the ticked dimensions in reverse order ("By quarter and year"). Put together: For each segment - Calculate the running total - By quarter and year. Full credit for this formulation goes to Andy Kriebel and the blog post that I stumbled upon here.

Figure 68: Reading Table calculations in English

Let’s take a look at another example. Let’s say this time we want to compute the cumulative profits for each of the categories across the quarters. Using the same methodology, we leave the category unticked and the rest ticked (years can be ticked or unticked and it doesn’t make a difference, do you see why ?). But now, when the calculation reaches the last quarter for a segment and restarts at Q1 of the next segment, we need to reset the totals, as carrying them over doesn’t make sense. The Restarting every option conveniently allows us to specify that we need to reset the totals for each segment.

Figure 69: Reading Table calculations in English - example 2

6.3.4 Formulation of Table calculations

In section 3.4 Converting "Business questions" to the language of data, we saw how to rephrase questions analytically. In case you forgot, let me jog your memory with this image:

We start by listing the measure that we’re trying to aggregate, follow it up with the dimensions to break it down by and then apply our filtering conditions. As you might recall, we can stack the questions on top of each other as they grow in complexity. Taking the example question which drove the visualization in figure 62, "Evolution of the percentage split of sales across the 3 segments in all the markets over the last 4 years", we will start slicing by the dimensions literally in reverse order. So in our case, we could visualize our question as such:

We total our sales by market and year to begin with. We then add the segment to the list and switch the measure to a percentage instead of absolute numbers. Also make sure to switch the calculation to Pane (Down) instead of the Table (Down) which is chosen by default. Once you have the table in place, it’s easy to flip it into an Area chart, as shown at the beginning of this section, using the "Show Me" wizard. Whenever you see "Percentage of", "Running Totals", "Differences", "Moving Average", "Year on Year" etc, always think table calculations.

6.3.5 Comparisons - YoY, WoW, MoM

Let’s put to use the table calculations that we just covered. Year over year, month over month and all variants of these comparisons lend themselves perfectly well to table calculations.

Figure 70: YoY calculations - pitfalls and workarounds

As illustrated in step 1, let’s start off with a simple use case by dropping in the year(order date); using table calculations, we can then easily calculate the percentage difference between the years. For some diabolical reason, let’s say we decide to exclude 2012 from the analysis. As you can see in step 2, we have filtered out year 2012, but now Tableau unfortunately calculates the percentage difference of 2013 over the year 2011 instead of 2012, as 2011 is the value in the previous row. A quick fix for this is to use "Show missing values" if the dimension used for the percentage difference calculation is continuous (in our case, it’s year, which is a continuous date variable).

Let’s take it a step further and try to do year over year comparisons by country as illustrated in figure 71. As you can see below, Armenia and Nepal unfortunately did not have any sale in 2012, 2014 and 2011 respectively. Argentina and Australia managed to make a sale in all the 4 years. If we use the percentage difference from the table calculations, Tableau by default uses Table Down, which is incorrect in this case as the rows don’t line up (each country needs 4 rows, one per year). We can quickly fix this by switching it to Pane Down as shown in step 2. Tableau smartly adds the missing years for each of the countries to correctly calculate the percentage difference. The same result can be achieved using Specific Dimensions (step 3), which we can read out using our mnemonic as such: "For every country, calculate the percentage difference year over year".
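The missing-year pitfall can be illustrated with a small Python sketch. It is not Tableau code: the sales figures and both function names are hypothetical, chosen only to contrast "previous row" with "previous year".

```python
def pct_diff_previous_row(sales_by_year):
    """Percent difference against whatever row comes before (ignores gaps)."""
    years = sorted(sales_by_year)
    out = {years[0]: None}
    for prev, cur in zip(years, years[1:]):
        out[cur] = (sales_by_year[cur] - sales_by_year[prev]) / sales_by_year[prev]
    return out

def pct_diff_previous_year(sales_by_year):
    """Percent difference against year - 1 only; None when that year is absent."""
    out = {}
    for year, value in sorted(sales_by_year.items()):
        prev = sales_by_year.get(year - 1)
        out[year] = None if prev is None else (value - prev) / prev
    return out

# Hypothetical sales with 2012 filtered out.
sales = {2011: 100.0, 2013: 150.0, 2014: 180.0}

print(pct_diff_previous_row(sales))   # 2013 is compared with 2011
print(pct_diff_previous_year(sales))  # 2013 has no valid comparison
```

The second function behaves like a table with the missing years shown: a year without a predecessor simply gets no percentage difference instead of a misleading one.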

Figure 71: YoY calculations - By pane & specific dimensions

One of the most common business KPIs that analysts calculate for their business review sessions is the infamous WoW (week over week comparison). This mundane calculation can get extremely complicated given that weeks split across the years. Let’s take the example of the end of year 2014, containing the weeks 52 and 53. We want to compute the week over week performance only for those weeks containing the full 7 days, otherwise we would not be comparing apples to apples. There is an almost infinite number of ways of overcoming this, but let’s take a look at 2 possible approaches based on what we know so far (table calculations and calculated fields).

Figure 72: WoW calculations - excluding incomplete weeks

Let’s start by creating a calculated field to determine incomplete weeks. The formula MIN(DATEPART("weekday",[Order Date])), when dropped after the week of order date as shown in figure 72, gives you the starting weekday for each row (normally 1, as it corresponds to Sunday, the default week start in Tableau, which you can modify if needed). Knowing this, we can put together a formula that combines MIN and MAX to detect complete weeks and flag them appropriately. Week 53 in our example starts on a Sunday but ends on a Wednesday and is hence flagged as an incomplete week.

For the first approach, let’s use the LOOKUP formula, which brings in the value from a preceding or following row based on the offset provided (LOOKUP(SUM([Sales]),-1) is the formula in our case, to look up one row above). It’s then as simple as calculating the percentage difference between the current value of sales and the looked-up value and showing it only for complete weeks.

In the second approach, we will use table calculations, which is a lot simpler. We just need to choose "Percentage Difference from" and "Table Down" to effectively calculate the week over week change. You can then choose to hide the incomplete weeks. (Please refer to the "Missing Week" tab in the Tableau workbook corresponding to Chapter 6 for an example of this implementation.)
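The first approach (flag incomplete weeks, then look one row up) can be mimicked in Python. This is a rough sketch under assumptions: the daily sales are made up, weeks start on Sunday and are cut at the year boundary, and the helper names tableau_weekday and weekly_wow are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

def tableau_weekday(d):
    """1 = Sunday ... 7 = Saturday, assuming a Sunday week start."""
    return (d.weekday() + 1) % 7 + 1

def weekly_wow(daily_sales):
    """Return [(week_key, total, is_complete, wow)] in chronological order."""
    weeks = defaultdict(list)
    for day, amount in daily_sales.items():
        start = day - timedelta(days=tableau_weekday(day) - 1)
        # Weeks are cut at the year boundary, as in the week 52/53 example.
        weeks[(day.year, max(start, date(day.year, 1, 1)))].append((day, amount))
    result, previous_total = [], None
    for key in sorted(weeks):
        days = weeks[key]
        total = sum(amount for _, amount in days)
        weekdays = [tableau_weekday(d) for d, _ in days]
        complete = min(weekdays) == 1 and max(weekdays) == 7
        wow = None
        if complete and previous_total:
            # Analogue of comparing against LOOKUP(SUM([Sales]), -1).
            wow = (total - previous_total) / previous_total
        result.append((key, total, complete, wow))
        previous_total = total
    return result

# Hypothetical daily sales from Sunday 14 Dec 2014 to Wednesday 31 Dec 2014.
amounts = [10] * 7 + [12] * 7 + [10] * 4
daily = {date(2014, 12, 14) + timedelta(days=i): a for i, a in enumerate(amounts)}

for row in weekly_wow(daily):
    print(row)
```

Like the MIN/MAX weekday flag in the text, the completeness check assumes there is at least one order on the first and last day of every full week.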

6.4 Sorting

Sometimes the proverbial sort yields unexpected results unless you have a good understanding of the underpinnings of Tableau. When you have just a dimension and a measure in your pane, sorting is straightforward: use the two sort icons on the top ribbon, or right click on the dimension and specify the sort order. Let’s take a common example of nested dimensions as illustrated in figure 73. We have the profits broken down by market and further broken down by category within each market. By default, Tableau will display the markets in the Data source order, i.e., the order in which they happened to show up in your data. Let’s fix that by right clicking on market and selecting Sort. We now have 5 possibilities in the Sort by dropdown.

(i) Data source order: Sorts by the order in which the values happened to be present in your raw data.
(ii) Alphabetic: No surprises here.
(iii) Field: Sorts by the chosen measure, e.g., let’s say you have profits and sales. You could specify to sort by profits using the Field option.
(iv) Manual: You can specify a custom sort order by manually rearranging the values.
(v) Nested: When you have more than 1 dimension and you need to specify the sort order for the nested inner dimensions (check section 6.4.1).

For our first dimension of market, we can select Field and specify that we need to sort in descending order of the sum of profits, and Tableau obediently does so. Now you might notice that the categories within each market seem to be out of order and still need to be sorted, which we will cover in the next section.

Figure 73: Sorting possibilities

6.4.1 Nested Sort

Wanting to sort the categories now, you might try to repeat the steps from above and sort using the "Field" option. Unfortunately, the results that you see will not be what you had in mind. The sort order will be wrong for markets like EU and Canada. What Tableau is doing here is first calculating the total profits of the 3 categories across all markets and then determining the sort order for category irrespective of the other dimensions in the visualization (markets in our case). Technology, office supplies and furniture form the sort order for the category dimension using profits. Tableau then arranges the values of category within each market in that exact same order. In the EU and Canadian markets, office supplies generated more profits than technology, which breaks this sort order. We need to tell Tableau that the field category needs to be sorted within each market irrespective of the global category sort order. This is where the Nested option comes to the rescue. Tableau will now sort the values of category independently within each market, getting us the result we had in mind.
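To see why a Field sort and a Nested sort can disagree, here is a small Python sketch. The profit numbers, the two markets and the function names field_sort and nested_sort are all hypothetical; the point is only the difference between one global order and one order per market.

```python
from collections import defaultdict

# Hypothetical profits by (market, category).
profits = {
    ("APAC", "Technology"): 50, ("APAC", "Office Supplies"): 30, ("APAC", "Furniture"): 20,
    ("EU", "Technology"): 25, ("EU", "Office Supplies"): 40, ("EU", "Furniture"): 10,
}

def field_sort(profits):
    """One global category order (by total profit), reused inside every market."""
    totals = defaultdict(int)
    for (_, category), value in profits.items():
        totals[category] += value
    order = sorted(totals, key=totals.get, reverse=True)
    markets = sorted({m for m, _ in profits})
    return {m: [c for c in order if (m, c) in profits] for m in markets}

def nested_sort(profits):
    """Categories ranked independently within each market."""
    markets = sorted({m for m, _ in profits})
    return {
        m: sorted((c for mm, c in profits if mm == m),
                  key=lambda c: profits[(m, c)], reverse=True)
        for m in markets
    }

print(field_sort(profits)["EU"])   # ['Technology', 'Office Supplies', 'Furniture']
print(nested_sort(profits)["EU"])  # ['Office Supplies', 'Technology', 'Furniture']
```

In this made-up data, Technology leads overall, so the Field sort puts it first in the EU even though Office Supplies earned more there.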

Figure 74: Nested Sort

6.4.2 Rank Sort

This chapter being about Tables and Table Calculations, I would be remiss if I did not show you another possibility with the Rank function in Tableau. Drag the Sum(Profits) on to the rows between market and category and set it to Discrete instead of Continuous. Now add a table calculation as illustrated in figure 75, specifying Rank as the calculation type and Pane Down (or Specific Dimensions with Category ticked) as the scope.

Figure 75: Rank Sort

6.4.3 Sorting in Blended data

When all your measures and dimensions come from the same data source, Tableau gracefully manages the sorting. But unfortunately, when you have blended data from different data sources, the sort complexity grows beyond the grasp of Tableau. In these cases, you would either not see the sort option when you right click on the dimension or, even worse, you see it but infuriatingly don’t see the measure that you want to sort by in the field name. For these corner cases, drop the profit measure on to the rows and set it to Discrete. You can then click on the sort icon to force the visualization to use this column for sorting (Note: make sure to click first on the pill before clicking on the sort icon). You can then right click on the pill and untick "Show header" to hide this column from the visualization, as its only purpose is to determine the sort order in our example here.

Figure 76: Sorting in blended data

7 Advanced Tips

In this section, we will dive into the deep end of the pool to learn a few more aspects that will help you fine-tune your analysis and visualizations.

7.1 Dynamic Inputs - Parameters

In all the analyses and visualizations we built so far, there were no interactive components, i.e., you could not dynamically vary the input values and see the results. We will walk through two common use cases: the first lets us dynamically scale the measures up or down and the second lets us dynamically switch the measure shown in the visualization.

Let’s imagine you want to present the user with the actual profits by segment but at the same time provide him/her the possibility to dynamically scale the profits up or down. In order to achieve this, let’s start by right clicking anywhere on the left pane (Note: make sure to right click in the white space at the bottom of the list of dimensions and measures, not on top of any dimension or measure) and select Create Parameter. This will open up a popup window as in step 1 of figure 77. Let’s call it Scaling Factor and let it take a value between 1 and 50, with a Data type of Integer and Allowable Values of Range. You can right click on the parameter that we just created and select Show Parameter. Moving the scaling factor is not going to have any effect on your visualizations yet, as we need to wire it up first. In step 2, let’s create a calculated field by right clicking on top of profits. Set the formula equal to [Profit]*[Scaling Factor] and name the field scaled profits. Now let’s add this measure to our table and you will notice the profits getting scaled up or down as you move the scaling factor parameter.

Figure 77: Scaling Profits

For our second use case, let’s turn up the heat and try to make dynamic measures. In the dashboards that you build, you might want to present the user with an option to select the KPIs that interest him/her. This allows you to create lighter dashboards without cramming in too much information while giving flexibility to the end user.

Figure 78: Dynamic Measures

Let’s make a table in which the user can dynamically select either profits or sales. In step 1 of figure 78, let’s make the Allowable Values a List and, in the list of values, make sure that 1 and 2 correspond to sales and profit as illustrated. When you make the parameter visible now, you’ll see that it’s a drop-down with sales and profit, which internally correspond to the values 1 and 2 as defined above. Now let’s wrap it up by creating a calculated field in which we specify that if the value of the parameter is 1, we pull in the sales, otherwise we pull in the profits.

IF [Parameters].[Sales or Profit] = 1 THEN [Sales] ELSE [Profit] END

So when the user selects profit in the parameter, it internally corresponds to a value of 2 and the formula will appropriately fetch the profit measure. Now let’s drop it into the table and enjoy the goodness of the dynamic measures.

7.2 Top 10/20/50 filters

Figure 79: Top 5 filter

Pareto’s 80-20 law says that 80% of the consequences can be attributed to 20% of the causes. With that in mind, we would like to create quick filters that narrow the view down to the Top N products or items according to our criteria. Let’s start off as usual by simply creating a Top 5 filter before adding more layers of complexity on top. In figure 79, we have a list of products with their associated profits. Drag the product name on to the filter pane and, in the dialog window, you’ll notice that the 4th tab aptly reads "Top". Select By field, choose Top 5 as the criterion and pick profits and sum in the drop-downs. This will filter down your list to the top 5 products in terms of profits.

Now let’s say you want the top 5 products for the country that you filter on. Drop the country field into the filter pane and, for demonstration purposes, let’s filter into Afghanistan. You’ll suddenly notice that the product names disappeared, which is probably not the result you expected. But rest assured, Tableau is doing the right thing and we will try to decipher the logic behind it. In this case, we have two filters applying to the same dataset: the top 5 filter on the products and the country filter which is set to Afghanistan. Tableau has a specific order in which it likes to apply the filters, which we will cover at the end of this section. All that you need to know at this point is that the Top 5 filter gets applied before the country filter. As a result, Tableau picks out the top 5 products and then tries to filter into Afghanistan; as it turns out, none of the top 5 products ever made a sale in Afghanistan and hence the table disappears. The fix for this is to right click on the country filter and click on Add to context. This ensures that the data is first passed through the country filter at the top of the funnel and the top 5 filter gets applied afterwards, giving us the expected result that we see in step 2 of figure 80. The filters that are added to context are highlighted in a brownish grey color.
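The effect of the filter order can be reproduced with a few lines of Python. This is only a sketch with invented rows; top_n_then_country and country_in_context are hypothetical names for the two orders of operation described above.

```python
# Hypothetical (country, product, profit) rows.
rows = [
    ("USA", "Phone", 900), ("USA", "Desk", 700), ("USA", "Chair", 500),
    ("Afghanistan", "Binder", 40), ("Afghanistan", "Pen", 30),
    ("Afghanistan", "Chair", 20),
]

def top_products(rows, n):
    """Names of the n products with the highest total profit."""
    totals = {}
    for _, product, profit in rows:
        totals[product] = totals.get(product, 0) + profit
    return sorted(totals, key=totals.get, reverse=True)[:n]

def top_n_then_country(rows, n, country):
    """Default order: Top N is computed on all data, then the country filter applies."""
    keep = set(top_products(rows, n))
    return sorted({p for c, p, _ in rows if c == country and p in keep})

def country_in_context(rows, n, country):
    """Context filter: restrict to the country first, then compute Top N."""
    return top_products([r for r in rows if r[0] == country], n)

print(top_n_then_country(rows, 2, "Afghanistan"))  # []
print(country_in_context(rows, 2, "Afghanistan"))  # ['Binder', 'Pen']
```

With the default order the global top 2 (Phone and Desk) never sold in Afghanistan, so nothing is left, which is the disappearing table from the text.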

Figure 80: Top 5 - Add to context

As is the tradition, let’s increase the complexity by adding the dimension of segment to the analysis. We would like to get the top 5 products sold in each of the segments for the countries chosen. As you can see in step 1 of figure 81, the number of products still remains 5 and they are not listed for each of the segments. The fix for this is a little circuitous. We need to add an INDEX() formula to the rows and set the table calculation to increment over the products within each segment (Note: Which dimension do we untick under Specific Dimensions in the table calculation ? Segment ?). Now we can drop the index formula into the filter pane and select the numbers from 1 to 5, which will keep the top 5 products for each of the segments. The country remains in the context, hence we know that the data getting into the visualization has already been filtered.
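The index-per-segment trick amounts to numbering the products inside each segment and keeping the first N numbers. A minimal Python sketch follows, assuming made-up rows and a hypothetical function name top_n_per_segment.

```python
from itertools import groupby, islice

def top_n_per_segment(rows, n):
    """rows: (segment, product, profit). Keep the first n products of each segment."""
    ordered = sorted(rows, key=lambda r: (r[0], -r[2]))
    kept = []
    for _, group in groupby(ordered, key=lambda r: r[0]):
        # The counter restarts at 1 for every segment, like an index computed per segment.
        for index, row in enumerate(islice(group, n), start=1):
            kept.append((index, *row))
    return kept

# Hypothetical data, already restricted to the countries in the context filter.
rows = [
    ("Consumer", "A", 50), ("Consumer", "B", 70), ("Consumer", "C", 10),
    ("Corporate", "A", 5), ("Corporate", "D", 90),
]

for line in top_n_per_segment(rows, 2):
    print(line)
```

Each segment gets its own ranking, so a product can be in the top N of one segment and absent from another.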

Figure 81: Top 5 - By Segment

7.3 Dual Axis

Let’s say you want to create a combo graph which includes a bar and a line chart. We can achieve this with the help of dual axis graphs.

Figure 82: Dual Axis - Bar & Line Combo graphs

As highlighted in step 1 of figure 82, right click on the second measure pill (Note: it doesn’t work on the first measure pill) and select Dual Axis. In step 2, you will notice that there are 3 possible ways to change the visualization of the measures. If you choose All and set the visualization to Bar, it will set both the measures to Bar. Since we’re trying to set just the discounts on a line, let’s click on the SUM(Discount) ribbon and then set it to Line. We will instantly notice that we offered a lot of discounts on the Office Supplies category but unfortunately they did not have as much of an effect on the profit numbers compared to the other categories (assuming that we have already established a causal link between discounts and profits). As you can see, Tableau has created two axes, one for each of the measures. You can force Tableau to use the same axis by right clicking on any one of the axes and selecting Synchronize Axis. In this case, Tableau will pick the largest axis and force it on both. In our example here, if we synchronize, the profit axis will be applied to both and as a result we will barely be able to see the discount line.

7.4 Shapes & Icons

At the end of the day, Tableau is a data visualization tool and we need to make our analysis as visual as possible to communicate our ideas faster. Tableau lets us associate a palette of built-in icons with the data, or even our own custom icons. One common use case for Business or Financial analysts is to indicate the trend with a colored icon: an upward pointing green arrow if the trend is positive, or a downward facing red arrow to indicate a decline in numbers. In figure 83, we’re showing year on year percentage growth. If the percentage growth is less than 25%, we have used an orange flat arrow, and if the percentage growth is more than 25%, a green upward pointing arrow. With this visualization, we’re able to see immediately that the technology sector is booming in 2014 compared to 2013. Let’s start off with figure 84, in which we build a simpler version by associating a set of icons with the 3 different categories.

Figure 83: Custom Shapes & Icons - 1

Start off as usual with category on the rows and select Shape from the visualization drop-down. You’ll now see the Shape square appear next to the Tooltip pane. Clicking on Shape opens up a window which allows you to associate any of the icons with each of the values in the category.

Figure 84: Custom Shapes & Icons

In order to recreate figure 83, all you need to do is calculate the YoY values for the 3 categories. Map these numeric YoY values to categorical values (for example "below 25%" and "above 25%") in a calculated field, then drop this field into the Shape pane and associate the right icon with each of the values. As always, please refer to the Tableau companion workbook available on GitHub for this implementation (Shapes & Icons 1 tab).

Custom icons in Tableau

Tableau by default is shipped with a palette of icons. You can make your own icons appear in this list by creating your own folder on your computer under Documents > My Tableau Repository > Shapes and adding the icons in there as highlighted below.

Figure 85: Custom Shapes Folder

In case you don’t see the folder in your Tableau shapes, make sure to click on "Reload Shapes". On a Mac, you’ll need to create it here: /Documents/My Tableau Repository/Shapes

7.5 Level of Detail (LOD) calculations

Level of detail is one of those concepts that takes a while to wrap your head around, but once you do, it helps unlock a whole slew of advanced analytic capabilities. The good news is that there are just 3 types of LODs to master, i.e., Fixed, Include and Exclude. Think of LODs as mini tables of data within your data source. This helps you to mash together data with varying levels of granularity. For most of your advanced data mashing needs, Fixed LODs will get the job done. The general syntax of LODs starts with the keyword, which is then followed by the dimensions that need to be considered. The syntax ends with the measure that needs to be aggregated, preceded by a colon. Here’s a schematic representation of the formula:

Figure 86: Custom Shapes Folder

7.5.1 Fixed LOD

The syntax for Fixed LODs goes as such:

Syntax: { FIXED DIMENSION 1,..N : AGG(MEASURE) }
Example: { FIXED [Segment] : SUM([Sales]) }

Let’s create a new calculated field, call it sales by segment Fixed LOD and use the formula { FIXED [Segment] : SUM([Sales]) }. You can imagine the result of this function being a mini table as shown below. So far in Tableau, when you write a calculated field, you get a single value as output, but with LODs it’s helpful to think of mini tables and you will see why in figure 87.

Segment        Sales
Consumer       6 507 949
Corporate      3 824 698
Home Office    2 309 855

Table 5: Fixed LOD: Result

Figure 87: Fixed LOD Illustration

For our first illustration, drop the segment along the rows and the newly created calculated field on to the view as a measure, along with the regular sales. You will notice that the two measures are exactly the same and you might be asking yourself why you had to jump through all those hoops to get the same value. The real power comes when you want to squish the total sales of the segments on to another dimension, i.e., another level of granularity. In section 1 of figure 87, we drop in the categories and the year of order date in 2 separate analyses (1a and 1b respectively). You will notice that the total sales of the 3 segments gets repeated for each line of both category and year. Behind the scenes, Tableau checks, for each of the categories/years, which segments are present and totals the sales of those segments to give you a single figure of 12 642 502 (image on the right in section 1). You’ll notice that the same total value of sales for the segments gets distributed irrespective of category or year.

In section 2, let’s drop in the country and filter into Afghanistan, Albania, Armenia, Bahrain and Chad to clearly see this in action. For Afghanistan and Albania, no surprises, we get the same total of 12 642 502 as all three segments made a sale in those countries. In the case of Armenia, for example, only the total sales of the Consumer segment gets pulled in (6 507 949) and in the case of Chad, just the Home Office segment sales, which is 2 309 855.

Let’s take a look at the common use case of Fixed LODs in filtering, where you would like the filter not necessarily to apply to certain dimensions. In figure 88, you see the percentage contribution of sales using both the Fixed Sales LOD formula and the regular sales column (of course using table calculations!). No surprises, the percentage split is the same for the 2 columns when all the years are chosen in the filter. In section 2, let’s filter in just the years 2013 and 2014. You’ll notice that the right column, using the regular sales, is now different from the first column, using the Fixed LOD, which remains unaltered and shows the same split as in section 1. The regular sales updates itself to consider only 2013 and 2014 and gives you the percentage split across these 2 years, while the Fixed LOD continues to give you the split across the 4 years irrespective of the filter applied. In section 3, we illustrate exactly the same with a filter on category.
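The "mini table" idea and its behaviour under filters can be sketched in Python. Everything here is an assumption for illustration: the rows are invented, the function names are hypothetical, and the per-country aggregation simply follows the behaviour described in the text (sum the fixed value of each segment present in the country).

```python
from collections import defaultdict

# Hypothetical (segment, country, year, sales) rows.
rows = [
    ("Consumer", "Armenia", 2013, 100), ("Consumer", "Albania", 2012, 200),
    ("Corporate", "Albania", 2013, 300), ("Home Office", "Chad", 2014, 50),
    ("Home Office", "Albania", 2011, 150),
]

def fixed_by_segment(all_rows):
    """Mini table for a FIXED-by-segment sum, always built from ALL rows."""
    table = defaultdict(int)
    for segment, _, _, sales in all_rows:
        table[segment] += sales
    return dict(table)

def regular_by_segment(all_rows, years):
    """Regular SUM of sales per segment after a year filter."""
    out = defaultdict(int)
    for segment, _, year, sales in all_rows:
        if year in years:
            out[segment] += sales
    return dict(out)

def fixed_total_by_country(all_rows):
    """Per country: sum of the fixed values of the segments present there."""
    mini = fixed_by_segment(all_rows)
    present = defaultdict(set)
    for segment, country, _, _ in all_rows:
        present[country].add(segment)
    return {c: sum(mini[s] for s in segs) for c, segs in present.items()}

print(fixed_by_segment(rows))            # unaffected by any year filter
print(regular_by_segment(rows, {2013}))  # shrinks with the filter
print(fixed_total_by_country(rows))
```

A country with a single segment, such as Armenia in this toy data, receives only that segment's fixed total, mirroring the Armenia and Chad cases above.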

Figure 88: Fixed LOD Illustration 1

Fixed LOD & context filter interaction

In the section about Top N filters (figure 80), we briefly took a look at context filters, which filter the data upstream before it gets to the visualization pane. Let’s take another look at them through a different lens and see how they affect LODs. In figure 89, you will notice that by adding the filter to the context, we constrain the data that gets into the LODs; hence the Fixed LOD no longer ignores the filter and mimics the regular sales column in this example. We will take a deeper look at the hierarchy of operations in section 7.7.

Figure 89: Fixed LOD context filter interaction

7.5.2 Include LOD

Syntax: { Include DIMENSION 1,..N : AGG(MEASURE) }
Example: { Include [Sub Category] : SUM([Sales]) }

The Include LOD adds the extra level of detail that you specify in your formula on top of the dimensions present in the visualization. To clearly illustrate this, let’s take a tiny dataset of 9 rows.

Category    Sub Category    Sales
Cat 1       Sub Cat 1.1     10
Cat 1       Sub Cat 1.1     10
Cat 1       Sub Cat 1.2     20
Cat 2       Sub Cat 2.1     10
Cat 2       Sub Cat 2.2     20
Cat 2       Sub Cat 2.3     30
Cat 3       Sub Cat 3.1     10
Cat 3       Sub Cat 3.2     20
Cat 3       Sub Cat 3.2     20

Table 6: Include LOD Dataset

When we drop in the category along the rows and calculate the average sales, Tableau simply calculates the average of each category group. Since our dataset includes 3 categories with 3 rows each, Tableau simply averages the values of the 3 rows for each of the categories. Let’s write our Include LOD formula as { Include [Sub Category] : SUM([Sales]) }. Now let’s add this to our visualization and set the aggregation to average. Keep in mind that the visualization just contains category along the rows and that in our LOD we explicitly said to include sub category as well in the sum before averaging.

In the case of category 1, there are 2 distinct sub categories (sub cat 1.1, repeated twice with a value of 10, and sub cat 1.2 with a value of 20). Since the LOD includes sub category now, Tableau first sums the sales for each of the sub categories and then divides by the number of distinct sub categories. As a result, it’s 20 for sub cat 1.1 and 20 for sub cat 1.2, which when averaged gives 20 for the first line. In the case of category 2, there are 3 distinct sub categories, hence the average is the sum of the 3 lines divided by 3 (20). In the last case of category 3, we again have 2 distinct sub categories, with sums of 10 and 40, which gives us an average of 25. (Note: In the Include LOD, try using an average instead of a sum and follow through the same logic to see if you really understand the aggregations.)
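The arithmetic of this example can be checked with a short Python sketch that uses the nine rows of Table 6. It is not Tableau code; plain_average and include_average are hypothetical names for the regular average and for "sum per sub category, then average per category".

```python
from collections import defaultdict

# The nine rows of Table 6: (category, sub category, sales).
rows = [
    ("Cat 1", "Sub Cat 1.1", 10), ("Cat 1", "Sub Cat 1.1", 10), ("Cat 1", "Sub Cat 1.2", 20),
    ("Cat 2", "Sub Cat 2.1", 10), ("Cat 2", "Sub Cat 2.2", 20), ("Cat 2", "Sub Cat 2.3", 30),
    ("Cat 3", "Sub Cat 3.1", 10), ("Cat 3", "Sub Cat 3.2", 20), ("Cat 3", "Sub Cat 3.2", 20),
]

def plain_average(rows):
    """Average of the raw rows per category."""
    by_cat = defaultdict(list)
    for cat, _, sales in rows:
        by_cat[cat].append(sales)
    return {cat: sum(v) / len(v) for cat, v in by_cat.items()}

def include_average(rows):
    """Sum per (category, sub category) first, then average those sums per category."""
    sums = defaultdict(int)
    for cat, sub, sales in rows:
        sums[(cat, sub)] += sales
    by_cat = defaultdict(list)
    for (cat, _), total in sums.items():
        by_cat[cat].append(total)
    return {cat: sum(v) / len(v) for cat, v in by_cat.items()}

print(plain_average(rows))
print(include_average(rows))  # {'Cat 1': 20.0, 'Cat 2': 20.0, 'Cat 3': 25.0}
```

Only category 2 gives the same number both ways, because it is the only category where every sub category appears exactly once.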

Figure 90: Include LOD

7.5.3 Exclude LOD

Syntax: { Exclude DIMENSION 1,..N : AGG(MEASURE) }
Example: { Exclude [Item] : AVG([Sales]) }

Exclude does the exact opposite of Include (thankfully!) in Tableau. It explicitly ignores the dimensions specified in the formula even if they are present in the visualization. To see this in action, let’s make the following sales data set. Let’s imagine you opened up a blockbuster e-commerce website and managed to sell 3 orders. In this data set, for example, the person who made the first order bought herself a handbag, a pair of shoes and a cap.

Order ID    Item       Sales
Order 1     Handbag    10
Order 1     Shoes      10
Order 1     Cap        20
Order 2     Handbag    10
Order 2     Cap        20
Order 2     Cap        30
Order 3     Shoes      10
Order 3     Shoes      20
Order 3     Cap        20

Table 7: Exclude LOD Dataset

Now let’s say we want to do some in-depth analysis to really understand the purchase patterns in this data set of 9 rows (sarcastic wink!). We start off by dropping the order IDs and the items on to the rows. Now when we ask Tableau to compute the average, it will calculate it at the level of item, as that is more granular than the order ID. What we really want to understand is the average sales for each of the orders and not at the item level. It’s time for the Exclude LOD to shine: as expected, it ignores the item when calculating the averages. It calculates the average at the level of the order and repeats the same value for all the items in that order. This way we are able to show the average item value and the average order value in the same table, each at the right level of detail.
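The same nine rows of Table 7 can be run through a small Python sketch to compare the two levels of detail. The function name averages is hypothetical, and the second number in each output row is the order-level average that an Exclude on the item would repeat.

```python
from collections import defaultdict

# The nine rows of Table 7: (order, item, sales).
rows = [
    ("Order 1", "Handbag", 10), ("Order 1", "Shoes", 10), ("Order 1", "Cap", 20),
    ("Order 2", "Handbag", 10), ("Order 2", "Cap", 20), ("Order 2", "Cap", 30),
    ("Order 3", "Shoes", 10), ("Order 3", "Shoes", 20), ("Order 3", "Cap", 20),
]

def averages(rows):
    """One line per (order, item): item-level average and order-level average."""
    by_item, by_order = defaultdict(list), defaultdict(list)
    for order, item, sales in rows:
        by_item[(order, item)].append(sales)
        by_order[order].append(sales)
    return [
        (order, item, sum(v) / len(v), sum(by_order[order]) / len(by_order[order]))
        for (order, item), v in by_item.items()
    ]

for line in averages(rows):
    print(line)
```

For Order 2, for instance, the cap averages 25 at the item level while the order-level average of 20 is repeated on both of its lines.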

Figure 91: Exclude LOD
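To check the numbers by hand, here is a small Python sketch that mimics the two averages on the nine rows of Table 7. It only models the arithmetic of { Exclude [Item] : AVG([Sales]) }; the helper name average_by is made up for this illustration.

```python
from collections import defaultdict

# The nine rows of Table 7: (order, item, sales).
ROWS = [
    ("Order 1", "Handbag", 10),
    ("Order 1", "Shoes", 10),
    ("Order 1", "Cap", 20),
    ("Order 2", "Handbag", 10),
    ("Order 2", "Cap", 20),
    ("Order 2", "Cap", 30),
    ("Order 3", "Shoes", 10),
    ("Order 3", "Shoes", 20),
    ("Order 3", "Cap", 20),
]


def average_by(rows, key):
    """Average the sales of all rows that share the same key."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {k: sum(v) / len(v) for k, v in groups.items()}


# AVG(Sales) at the level of the view (order + item).
item_avg = average_by(ROWS, lambda row: (row[0], row[1]))
# The Exclude version: the item is ignored, so the level is the order alone.
order_avg = average_by(ROWS, lambda row: row[0])

# One line per (order, item); the order average repeats within each order.
for (order, item), avg in item_avg.items():
    print(f"{order:8} {item:8} {avg:6.2f} {order_avg[order]:6.2f}")
```

Notice how the last column repeats for every item of the same order, which is exactly the behavior described above.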

When to use LODs ?

When you are trying to mash together varying levels of granularity on the same visualization, think LODs. Below are a few pointers to help you get the LOD sense tingling at the right moment.
(i) When you don’t want filters to apply on your calculations, think FIXED LOD
(ii) When you want to calculate averages of distinct groups which are a combination of values across two columns, think INCLUDE LOD
(iii) When you want to show the details but want to calculate averages at a higher, less granular level, think EXCLUDE LOD

7.6 Reference Lines & Forecasts

You hardly ever look at a single visualization or analysis in isolation. You are either explicitly comparing it to some expected value or threshold, or implicitly comparing it with a mental image of an expectation that you have in mind. Tableau’s Reference lines come in handy when you want to add a layer of detail (the expectation) to your visualization so that you can easily see how the numbers stack up against each other or against the reference.

7.6.1 Reference Lines using Parameters

So far in all our analysis, we have remained on the Data tab in the left pane. Now let’s flip over to the Analytics tab and explore the options that are essential for us. In the illustration below, we have the profits broken down by year and category. Imagine that our expected profit for any of the categories by year could vary anywhere between $50 000 and $200 000, and that you would like to dynamically compare the actual profits against this target and highlight the bars in red if they are below it and in green if they are above.

Figure 92: Reference line using Parameter

We first need to create a parameter, as highlighted on the right of figure 92, which is a list of the various possible scenarios that you would like to compare against, and name it Target profit parameter. (Note: if you need a brush up on parameters, please refer to section 7.1.) We then drag the Reference line on to the visualization, and Tableau will present us a floating pane with 3 possible options on which we can drop the selection.
Table: Set the reference line across the entire visualization.
Pane: Set the reference line across each pane - for every year in our example here.
Cell: Set the reference line across every individual value / cell.

In our case, let’s drop it on the Table option, which then opens up another floating window. Under the value drop-down, select the parameter (Target profit parameter) that we just created and close the window. Now as you slide through the values in the Target profit parameter, you will notice the line go up and down, but the bars don’t yet change color depending on whether they are above or below it. In order to do that, let’s create a calculated field with the following formula:

IF { FIXED YEAR([Order Date]),[Category]: SUM([Profit])} - [Target Profit Parameter] > 0 THEN "GREEN" ELSE "RED" END

Armed with the knowledge of LODs, we can now understand that we’re calculating the total profit for each combination of year and category and comparing it against the dynamic target profit parameter. When it’s greater than the parameter, the color green is returned, else red. The last step is to drop this newly calculated field on the Color block so that the bars change color as you slide the Target profit up or down. (Note: Instead of LODs, you could also just use SUM(Profit) - [Target Profit Parameter], but can you see why we cannot simply use Profit - [Target Profit Parameter] ?)
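If you want to reason about the formula outside Tableau, here is a minimal Python sketch of the same comparison. The order lines and the function names are assumptions made up for this illustration; only the logic (sum the profit per year and category, subtract the target, return GREEN or RED) comes from the formula above.

```python
from collections import defaultdict

# Hypothetical order lines (year, category, profit); not the book's data.
ORDER_LINES = [
    (2011, "Furniture", 40_000),
    (2011, "Furniture", 30_000),
    (2011, "Technology", 90_000),
    (2011, "Technology", 60_000),
    (2012, "Furniture", 120_000),
]


def bar_colors(order_lines, target_profit):
    """Color per (year, category), based on the summed profit of the group."""
    totals = defaultdict(int)
    for year, category, profit in order_lines:
        totals[(year, category)] += profit
    return {
        group: "GREEN" if total - target_profit > 0 else "RED"
        for group, total in totals.items()
    }


def row_colors(order_lines, target_profit):
    """The same test applied to each individual order line instead."""
    return [
        "GREEN" if profit - target_profit > 0 else "RED"
        for _year, _category, profit in order_lines
    ]


colors = bar_colors(ORDER_LINES, target_profit=100_000)
per_row = row_colors(ORDER_LINES, target_profit=100_000)
print(colors)
print(per_row)
```

In this made-up data, Technology in 2011 is green as a group (150 000 in total), yet each of its two order lines taken alone falls below the target. That is a hint for the question in the note above.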

7.6.2 Reference Lines using secondary data

Figure 93: Reference line with secondary targets

Now let’s say you have target profits defined by market and would like to compare the performance of each of the markets against the defined targets. In the Reference line window, Tableau will by default show you all the measures available in the current data source along with the parameters. So when you drop the reference line, you will not be able to choose the target profits in the value dropdown as it comes from a different data source. The trick here is to drop the target profits on the detail pane first and then drop the reference line from the analytics pane on to the visualization as highlighted in figure 93.

7.6.3 Forecast & Trend lines

When you need to create quick extrapolations and see them either in your visualizations or tables, you can make use of the Forecast and Trend line options in the Analytics tab. Do keep in mind that the forecasting option in Tableau is pretty limited and is meant only to give directional indicators. The forecasting window provides you a couple of options such as exclusion of periods, seasonality detection etc.

Figure 94: Forecasts

7.7 Order of operations

Tableau has an order in which the filters and calculations trickle down into the visualization. It’s crucial to understand this order of operations to make sure that the numbers that you’re seeing in your visualization are indeed correct. In the illustration below, you will notice that the flow of data is from the top to the bottom. The first filters to get applied are the Extract and Data source filters, which we saw in the very beginning in section 3.2.

Let’s focus primarily on the 4 highlighted boxes as those are the most useful ones, starting with the context filters, which are at the top of the diagram. We saw in section 7.2 how we could make sure that the Top N filters work in conjunction with the context filters. Just to recap, context filters, being upstream of the Top N filters, ensure that the top N numbers are calculated on the filtered data set and not on the global data set.

In section 7.5.1, we covered Fixed LODs and saw how they help us work around the filters in the visualization by fixing the aggregation at a different level. In the same section, we also saw that context filters, when applied in tandem with Fixed LODs, get a higher priority; hence the Fixed LODs receive the filtered data and might not work as you expect. Now you can see schematically from the order of operations flowchart why that happens.

Figure 95: Order of Operations

Keep in mind that the Include and Exclude LODs will be executed after the filters on the dimension have been applied. As a result, if you have filtered your dimension, then the Include and Exclude LODs will get just the filtered data. As you can see, Fixed LODs get a higher priority/preferential treatment compared to the other 2 LODs.
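The ordering described above can be sketched as a tiny pipeline in Python. This is only a model of the order (context filter, then Fixed LOD, then dimension filter, then the aggregation in the view); the rows, the function names and the choice of a per-region sum are assumptions made up for this illustration.

```python
from collections import defaultdict

# Hypothetical rows (region, product, sales); not taken from the book's data.
ROWS = [
    ("East", "Chairs", 100),
    ("East", "Tables", 50),
    ("West", "Chairs", 70),
    ("West", "Tables", 30),
]


def sum_by_region(rows):
    totals = defaultdict(int)
    for region, _product, sales in rows:
        totals[region] += sales
    return dict(totals)


def run_view(rows, context_filter=None, dimension_filter=None):
    """Return {region: (SUM(Sales) in the view, Fixed total for the region)}."""
    # Step 1: context filters run first, so everything below sees their result.
    if context_filter is not None:
        rows = [row for row in rows if context_filter(row)]
    # Step 2: the Fixed LOD, modelled on { FIXED [Region] : SUM([Sales]) }.
    fixed = sum_by_region(rows)
    # Step 3: dimension filters run after the Fixed LOD and cannot change it.
    if dimension_filter is not None:
        rows = [row for row in rows if dimension_filter(row)]
    # Step 4: the view aggregates whatever rows are left.
    visible = sum_by_region(rows)
    return {region: (visible[region], fixed[region]) for region in visible}


def is_chairs(row):
    return row[1] == "Chairs"


as_dimension_filter = run_view(ROWS, dimension_filter=is_chairs)
as_context_filter = run_view(ROWS, context_filter=is_chairs)
print(as_dimension_filter)  # Fixed totals still include Tables
print(as_context_filter)    # Fixed totals only see Chairs
```

The same filter gives different Fixed totals depending on where it sits in the order, which is the behavior the flowchart explains.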

Figure 96: Table calcs run last and Table filters happen even later

The table calculations are executed at the very end, and hence they come further down the list, after all the filtering has already happened. So keep in mind that Table calculations are essentially just manipulating the numbers that are already visible in your visualization. Table calc filters are the last of the filters to get applied on the visualization. You can take advantage of this fact and selectively hide and show elements in your visualization without affecting the calculations. Think of Table calc filters as cosmetic filters, such as hiding or showing rows and columns in Excel. They have no impact on the calculations. Figure 96 shows how, using Table calc filters, we have hidden the odd rows, and as a result the grand totals on both the tables on the left and the right are exactly the same.
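To see why the position of a filter matters for table calculations, here is a short Python sketch that uses a running total as the table calculation. The monthly figures and the function names are assumptions made up for this illustration; hiding the odd rows mirrors the example in figure 96.

```python
from itertools import accumulate

# Hypothetical monthly sales; not taken from the book's data.
MONTHLY_SALES = [("Jan", 10), ("Feb", 20), ("Mar", 30), ("Apr", 40)]


def running_total(rows):
    """A stand-in for a table calculation: cumulative sum down the table."""
    labels = [label for label, _value in rows]
    totals = accumulate(value for _label, value in rows)
    return list(zip(labels, totals))


def hide_odd_rows(rows):
    """Keep only the 2nd, 4th, ... rows."""
    return rows[1::2]


# Table calc filter: the calculation runs first, rows are hidden afterwards.
hidden_late = hide_odd_rows(running_total(MONTHLY_SALES))
# Ordinary filter: rows are removed first, then the calculation runs.
filtered_early = running_total(hide_odd_rows(MONTHLY_SALES))

print(hidden_late)
print(filtered_early)
```

When the rows are hidden after the calculation, the remaining values are untouched, just like hiding rows in Excel. When the rows are removed beforehand, the running total itself changes.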

8.1 Less is more

Over the last few chapters, you have gained a good mastery of the various analytical capabilities of Tableau and of how effectively you can slice, dice and visualize your data. Now comes the last part of the challenge: how to effectively communicate these analyses and results to your colleagues and the rest of the world. To reassure you, this is the easy part of the challenge. Let’s see how to communicate your visualizations in easily digestible dashboards. If you’re part of the new wave of "minimalists", then you will easily understand the concept of less is more. The objective of our dashboards is to communicate the maximum amount of information with the least amount of content. Keeping this in mind, let’s get started on dashboarding.

8.2 Dashboards: A view from 10000ft

The last but one icon in the bottom pane lets you create a new dashboard (as you can see in figure 97). You are not alone if you feel that the empty dashboard, and the endless possibilities that it provides, are more daunting than understanding the order of operations in Tableau.

Block 1 lists all the various sheets and analyses that you have built so far. It’s just a matter of tiling them up and applying a brush of makeup to make them look good. Block 2 provides the various placeholders in which you will fit your content. Think of these objects as the framework or the scaffolding that is going to hold your dashboard together. Blocks 1 & 2 are present in the dashboard pane. If you flip over to the layout pane, you have block 3, which gives you the dials to fine-tune the size, color, margins and padding of the placeholders in block 2. With the help of just these 3 blocks, we can start putting together a lot of decent dashboards.

Figure 97: Dashboard: Breakdown

Before starting to flesh out our dashboard, it’s always super helpful to sketch out a wire-frame of the dashboard and what it should eventually look like. Just an outline that helps us visually grasp the various elements that we will be placing in the dashboard would suffice. In figure 98, we see that there is a master dashboard holder highlighted in salmon color (1).

Figure 98: Dashboard: Wire-frame

There are also 2 blocks in blue: the one on the left will hold our logo (2) and the one on the right will house the dashboard title (3). The green block (4) will hold our analysis and visualizations. Now that we know the outline of our dashboard, we can start sliding in the objects from the bottom to build our placeholders. Start off with a vertical block, which will house the entire dashboard, and then drag in another vertical block, but this time move toward the top of the dashboard while keeping the block pressed down. As you hover over the top of the block, Tableau will visually indicate that it is going to split the block into two (one on the top and one on the bottom). Similarly, now take a horizontal block and hover over the far right of the top block until Tableau indicates that it is going to split the top block into two, but horizontally this time. You can then resize the blocks to make sure that they line up the way you want them to.

If you look closely, at the bottom of the dashboard in block 5, we have the "Tiled" option highlighted. That is essentially what we’re trying to achieve, i.e., tiling the blocks to get the wire-frame of our dashboard. If you instead select the floating option before sliding in the objects, Tableau will allow you to position elements exactly where you want them to be located. It’s convenient when you need to fine-tune the position of certain elements but becomes unmanageable if your entire dashboard is composed of floating objects. The recommendation is to opt for tiled objects over floating objects, but the choice is definitely yours and no one can deprive you of that.

Figure 99: Dashboard: Float certain elements

In figure 99, on the right in the colored lines visualization, you will notice that the legend block occupies a lot of space, and we would gain a lot more room by positioning it horizontally. You could do that in 2 ways. You could click on the caret icon on the block and select floating. This will allow you to position the legend block exactly where you need it and stretch it out horizontally. Or even better, you could just hold down on the top of the block and move it around. As usual, Tableau will allow you to snap it into place in any of the blocks, either horizontally or vertically.

8.3 Fit & Layout

Now that we have the scaffolding in place, we just need to see how to fit the various objects in the placeholders. Take the example of block 1 in figure 100: by default, when you drop a Text object in the container, it will not take up the entire container. You need to click on the caret symbol and explicitly specify Distribute contents evenly. This will force the text element to occupy the entire space and align it centrally.

Figure 100: Dashboard: Fit elements

In the case of block 2, we see that the table occupies just the left part of the container. Clicking on the caret symbol and hovering over the Fit option reveals that by default Tableau uses the Standard fit option. If we want the table to fill the entire space, we could choose Entire View. In block 3, we have a rather long table, and if we force it to fit the entire view, you will notice that the table becomes illegible as all the rows get squished together. A better option would be to specify "Fit Width", which gives a vertically scrollable table.

8.4 Filters & Interactions

The real power of dashboards is exposed when you allow your end users to filter the data in and out. In section 4.1.4, we saw how we can conditionally filter data in our visualization. Now if we want to expose these filters in our dashboards as well, start by clicking on the caret and hovering over the filter option. Tableau lists the various dimensions and measures available in this visualization. In our case, let’s expose the segment, quarter and year of order date as filters. Tableau will drop the filters into your dashboard, as you can see in block 2. Keep in mind that Tableau has a tendency to display the filters within containers that are close to the visualization. As a result, sometimes you might have to go on a treasure hunt to find your filters if they get squished into some small container. Even if they are randomly dropped, we know how to move these filters and align them properly using the vertical and horizontal blocks. A side note: clicking on Go to sheet, or on the icon highlighted by arrow 1, will take you directly to the corresponding sheet.

Figure 101: Dashboard: Adding filters

Filtering in Dashboard
If you need to expose certain dimensions as filters even if they are not present in your visualization, they will first need to be added to the sheet as a filter; only then will you see them on the dashboard when you hover over the filter option on the visualization.

8.4.1 Customizing filters

By default, Tableau will display dimension filters as multiple values (custom list) and they occupy a lot of space. We can rectify this by clicking on our favorite caret icon, but this time directly on the filter. As you can see in figure 102, block 1 provides us a variety of display options. The most appropriate in the case of dimension filters is the multiple values (dropdown) option, but the choice is yours. We can also choose to show only relevant values instead of all values, as in block 2. This comes in handy when you have multiple filters. Let’s say you have filtered on the year 2011 and it turns out that you made sales only in the consumer segment and not in the others. The segment filter will then display just the consumer segment if you have chosen "Show relevant values".

Figure 102: Dashboard: Customizing filters
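The difference between showing all values and showing only relevant values can be sketched in a few lines of Python. The sales rows and the function names below are assumptions made up for this illustration; they simply mirror the 2011 example in the text.

```python
# Hypothetical sales rows (year, segment); not taken from the book's data.
SALES = [
    (2011, "Consumer"),
    (2012, "Consumer"),
    (2012, "Corporate"),
    (2012, "Home Office"),
]


def all_values(rows):
    """Segments offered by the filter when all values are shown."""
    return sorted({segment for _year, segment in rows})


def relevant_values(rows, year):
    """Segments offered once the year filter has been applied first."""
    return sorted({segment for row_year, segment in rows if row_year == year})


print(all_values(SALES))
print(relevant_values(SALES, 2011))
```

With the year set to 2011, the segment list shrinks to the one segment that actually has sales in that year.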

8.4.2 Discrete vs continuous filters

In section 3.1, we saw how Tableau distinguishes discrete and continuous values. As a quick reminder, Tableau uses blue to indicate discrete quantities and green to indicate continuous values. Let’s say you would like to let your users filter on the order date in two possible ways, i.e., by selecting an individual date or by selecting a range of dates with a start and an end date. In this case, we need to drop in the order date twice in the filter, as highlighted in figure 103.

Figure 103: Dashboard: Discrete vs continuous filters

Block 1 provides the option to keep the dates as continuous quantities and block 2 provides the possibility to treat the dates as a discrete quantity. When the dates are considered as a discrete quantity, they will show up as dropdown values instead of date ranges.

Changing filter type from discrete to continuous
If you dragged in your filter as a discrete quantity in your sheet, your dashboard will not allow you to display them as continuous quantities. Keep this in mind as you set up your filters as they will have an impact on how they can be displayed in your dashboard.
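As a rough mental model, the two kinds of date filter behave like the two Python functions below. The order dates and the function names are assumptions made up for this illustration.

```python
from datetime import date

# Hypothetical order dates; not taken from the book's data.
ORDER_DATES = [
    date(2011, 1, 5),
    date(2011, 3, 17),
    date(2011, 3, 17),
    date(2012, 7, 1),
]


def dropdown_choices(dates):
    """A discrete filter offers each distinct date as a pickable value."""
    return sorted(set(dates))


def discrete_filter(dates, selected):
    """Keep the rows whose date is one of the picked values."""
    return [d for d in dates if d in selected]


def continuous_filter(dates, start, end):
    """Keep the rows whose date falls within the start and end dates."""
    return [d for d in dates if start <= d <= end]


print(dropdown_choices(ORDER_DATES))
print(discrete_filter(ORDER_DATES, {date(2011, 3, 17)}))
print(continuous_filter(ORDER_DATES, date(2011, 1, 1), date(2011, 12, 31)))
```

The discrete version works from a list of individual values, while the continuous version only needs a start and an end.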

8.4.3 Filter domain

By default, filters apply only to the worksheets on which you have added them. In figure 104, we have assembled a dashboard with two visualizations. The one on the left (block 1) uses Data blending, where we have combined the Orders with the People dataset, and on the right (block 2), we have a simple table with profits by Region. We click on block 1 and make the People filter available in the dashboard. When you try to filter on specific persons, you will notice that block 1 gets filtered but block 2 continues to show all regions irrespective of the filter applied.

Figure 104: Dashboard: Filter Domain

In order to rectify this, we need to click on the filter and select "All using this Data Source". This will ensure that the filter gets propagated to the entire data source and as a result block 2 will get filtered as well when you apply the filters.

Figure 105: Dashboard: Filtering options

It’s definitely much easier to manage all the filter settings on each of the sheets with the analysis and then just display them directly on the dashboard, without tinkering too much with the options there; this avoids confusion, as things can easily get out of hand. To summarize, when you right click on a filter in your analysis and choose Apply to worksheets, you have 4 options:
1. All using related Data Sources: If you’re using multiple data sources, then this option allows the effect of the filters to propagate to linked data sources.
2. All using this Data Source: Restrictive and applies to just the data source concerned.
3. Selected Worksheets: You can decide to apply your filters on multiple sheets. This filter is highlighted by the icon of superposed sheets of bar graphs. If you decide to use this option, it can easily slip out of control and lead to unexpected filtering results if you are not meticulous in keeping track of which filters are applied to which sheets.
4. Only this worksheet: The simplest filter, with no associated icon; its effects are visible just in the sheet.

Sometimes instead of exposing every possible dimension as a filter, you might be able to provide a more interactive experience to your users by rendering the tables and visualizations themselves as a filter. In figure 106, we see how we can turn a sheet in a dashboard into a filter. By clicking on the "Use as filter" option, all the values in the dashboard get filtered when you click on any of the values in the table. You will be able to identify these filters on all the sheets where they get applied: they appear in the filter section as "Action (filter name): filter value".

Figure 106: Dashboard: Interactive filtering

In less than a hundred pages, I have tried to distill the essential elements of Tableau and provide you the right fishing techniques. As I mentioned at the very beginning of this book, the possibilities of combining the various functionalities in Tableau are endless, but as long as you understand the base elements that we have covered so far and their functional logic, you can mix and match them and draw conclusions from your data with confidence. I wish you all the best and Godspeed on your quest to master Tableau !

Apart from a plethora of resources available on YouTube, here are a couple of useful links:
Free resources: Video Trainings. Tableau Desktop Starter kit. Guided e-Learning modules. Tableau Public.
Paid resources: Accelerated Desktop. Desktop Fundamentals.
Dashboarding Color Palettes: Palette Generator. Color Palette - Themes. Chart selector. Dashboard Inspirations.