Joining Data in Tableau: A Road to Tableau Desktop Specialist Certification
Last Updated on April 4, 2022 by Editorial Team
Author(s): Daksh Trehan
Originally published on Towards AI the World’s Leading AI and Technology News and Media Company. If you are building an AI-related product or service, we invite you to consider becoming an AI sponsor. At Towards AI, we help scale AI and technology startups. Let us help you unleash your technology to the masses.
Chapter 5: A comprehensive guide to Relationships, Joins, Unions & Blending inΒ Tableau
Welcome to the fifth chapter, In this piece, we are going to learn about joining data in Tableau and which one should you goΒ for.
If you want to navigate through other chapters, visit: Tableau: What it is? Why it is the best?; A road to Tableau Desktop Specialist Certification.
If you want to directly go on Tableau Desktop Specialist notes, access them here β https://dakshtrehan.notion.site/Tableau-Notes-c13fceda97b94bda940edbf6751cf30
Use the link to get access to free Tableau certification dumps (Valid till 13 AprΒ 2022):
When we connect our data, Tableau lets us join data in different ways, each way is unique for specific data types and use-cases.
Every time we add a data source to Tableau, we create a Data Model. As the name suggests, it specifies how Tableau should query different tables to combine the data. A Data Model can comprise a single table or multiple tables that use different combinations of Relationships, Joins,Β Union.
Table ofΒ Content
- Logical Layer & PhysicalΒ Layer
- Relationships
- Joins
- Unions
- Data Blending
- Which one toΒ choose?
- Sample Exam Questions from thisΒ Topic
Logical Layer & PhysicalΒ Layer
Tableau Data Model comprises of twoΒ layers:
Logical Layer β The default data canvas view in our Data Source Page is our Logical Layer. It enables to join the data using Relationships(or Noodles). While combining data from multiple tables, we donβt specify joins in Logical Layer but rather perform Relationship where Tableau automatically chooses adequate join type based on data and analysis. Tables formed in Logical Layers are known as LogicalΒ Tables.
In earlier Tableau versions, there was only a Physical Layer, but from v2020.2 onwards, Tableau introduced Relationships & Logical Layer. The tables in logical layers arenβt merged, instead, they remain distinct.
The logical Layer consists of merged Physical Tables and other Logical Tables. Logical Tables may contain single or multiple physical tables associated with unions orΒ joins.
Physical Layer β This can be regarded as a sub-logical layer, to access it, double click a logical table. It enables a combination of data using Joins/Union. Tables joined in Physical Layer are known as Physical Tables and are merged into a single flat table after transformations.
Relationships
Relationships are the just launched USP of Tableau. It makes data joining easy for all, and doesnβt specifically requires the knowledge of joins/union. These are denoted by a Noodle-like structure and are readily available on the Data CanvasΒ page.
The relationship can only be made in the Logical Layer. These are like a contract between two tables based on some matching field, following that contact Tableau automatically performs appropriate joins and combines theΒ data.
The matching field must have the same data type and it doesnβt matter if matching columns are from the same data source or different. Relationships donβt support Geographical data types as matchingΒ fields.
Relationships require at least 1 dataΒ source.
Relationships retain the original table structure.
Relationships canβt be defined on published dataΒ sources.
Relationships are computedΒ locally.
A Relationship canβt be edited after publishing a dataΒ source.
When we join multiple tables using Relationships, Tableau automatically detects the field with the same name and creates a relationship. If data sources have different field names, Tableau would ask the user to define matching columns. A relationship can only be made in a Logical Layer. These are like a contract between two tables based on some matching field, following that contact Tableau automatically performs appropriate joins and combines theΒ data.
The matching field must have the same data type and it doesnβt matter if matching columns are from the same data source or different. Relationships donβt support Geographical data types as matchingΒ fields.
Relationships require at least 1 dataΒ source.
Relationships retain the original table structure.
Relationships canβt be defined on published dataΒ sources.
Relationships are computedΒ locally.
A Relationship canβt be edited after publishing a dataΒ source.
When we join multiple tables using Relationships, Tableau automatically detects the field with the same name and creates a relationship. If data sources have different field names, Tableau would ask the user to define matchingΒ columns.
To improve performance, we can modify the cardinality and look into referential integrity. But, most of the time, it is better to leave it toΒ Tableau.
Joins
Joins are a traditional method to combine our data based on certain conditions. In Tableau, Joins can only be done in the physical layer. To create a join, double-click on a logicalΒ table.
Joins will always create a newΒ table.
A Join combines the data and then aggregates.
Joins areΒ Static.
It requires at least 1 dataΒ source.
It can be applied to multiple data connections.
Joining data with different aggregations or levels of detail can cause data duplication.
Tableau supports four types of joins: Inner, Outer, Left,Β Right.
For illustrations, letβs take two tables: Names andΒ Marks.
Inner Join β It represents only those values that are common in both tables. If a value isnβt entirely matched in either table, it will be dropped immediately.
In our examples, we just had marked for StudentID 1,3, 5 and therefore, we got Student Name for those IDs respectively.
Outer Join β It represents all the values from both the tables, if a value from either table doesnβt match, there will be a null value rightΒ there.
In our example, since we only had 3 matching rows in both tables, the marks for the other ones were assigned asΒ null.
Right Join β It represents all the values from the right table, if a value from left-right table doesnβt match, it will be assignedΒ null.
If we perform right join on our tables Name & Marks, the result would be somewhat like inner join(it is not necessary that every time result of inner join and right join will be theΒ same).
But, if we perform the right join between Marks & Name, the result would be totally different i.e. like OuterΒ Join.
Left Join β It represents all the values from the left table, if a value from right table doesnβt match, it will be assignedΒ null.
If we perform left join on our tables Name & Marks, the result would be somewhat like outer join(it is not necessary that every time result of outer join and left join will be theΒ same).
But, if we perform left join between Marks & Name, the result would be totally different i.e. like InnerΒ Join.
Cross-Database Joins
Cross-database joins requires setting up data sources from multiple connections
To create a Cross-Database join:
Step1: Click onΒ βAddβ.
Step2: Add another connection.
Step3: Once we go to the worksheet, we can find all the data in a single source. We can toggle betweenΒ them.
Union
Union is a way to append two data connections with the same number of columns and data types. This happens in the Physical layer, which can be accessed by double-clicking on a LogicalΒ Table.
Union will always create a newΒ table.
Data from different data sources canβt beΒ unioned.
In a union, Tableau adds reference fields to help identify the source of theΒ data.
A unioned table can be used to createΒ joins.
The union can be done in two ways: Manually, WildcardΒ Search
Manual Union
To create a manual union, enter the physical layer and click on βNewΒ Unionβ.
Wildcard Union
This method is automatically creates a union between tables. The wildcard character will help to find a specific pattern for the tableΒ name.
It will create a union with all files having βOrderβ in itsΒ name.
Matching Field Names or FieldΒ Ordering
If two fields have different names or no names, we can ask Tableau to perform a union on basis of the ordering of data inΒ tables.
Merging mismatched values
When values in two tables are not matched, they are replaced by null values in the union table. We can merge all the non-matching fields together, to get rid of nullΒ values.
Data Blending
Data Blending is another method to join data from multiple connections in Tableau. This is best for connecting data with different levels of details or in simple words, connecting data with different fileΒ formats.
Blending requires two connections namely primary and secondary. The first chosen connection will be deemed as primary and later would be secondary. The primary connection will be represented by blue color, while the secondary connection will be represented by orangeΒ color.
A blend aggregates and then combines.
Blends retain the original table structure.
Data Blend is a left-outer join between primary and secondary tables, there might exist some missing data from the secondary table.
Data blends are worksheet specific i.e. different for different worksheets.
There must be a common dimension between the data sources to create a dataΒ blend.
Data is never truly combined in Blend but each data source is queried independently and the results are aggregated and presented together.
The blended data canβt be published to Tableau Online or TableauΒ Server.
There are some data blending limitations around non-additive aggregates, such as COUNTD, MEDIAN, and RAWSQLAGG.
Which one toΒ choose?
Before answering this question, let's have a recap of all joiningΒ types:
Relationships & Joins requires a single data source whereas Blends requires at least two dataΒ sources.
Relationship, Join, Blend is the way to bind data on certain conditions but Union just appends the data. Now, the competition and confusion come down to three contenders: Relationship, Joins &Β Blend.
Summing up:
βUse RELATIONSHIPS if you can, JOIN if you must, BLEND if you absolutely must.β ~ DonabelΒ Santos
Sample Exam Questions from thisΒ Topic
Relationships can be joined in Geographical fields/Calculated fields?
- True
- False
Solution: False
____ and ____ retains original table structure.
- Joins
- Relationship
- Blends
- Union
Solution: Blend and Relationship
____ and ____ will always create a newΒ table.
- Join
- Union
- Relationship
- Blend
Solution: Join andΒ Union
Use of which joining method leads to the addition of a reference field?
- Join
- Union
- Blend
- Relationship
Solution: Union
All secondary data in the blend must be aggregated?
- True
- False
Solution: True
Use the link to get access to free Tableau certification dumps (Valid till 13 AprΒ 2022):
References:
[1] Tableau Help | TableauΒ Software
[2] PersonalΒ Notes
[3]Tableau Desktop Specialist Exam (New Patternβββ2021)βββApisero
[4] sqlbelle
Thanks forΒ Reading!
Feel free to give claps so I know how helpful this post was for you, and share it on your social networks, this would be very helpful forΒ me.
If you like this article and want to learn more about Machine Learning, Data Science, Python, BI. Please consider subscribing to my newsletter:
Find me on the Web: www.dakshtrehan.com
Connect with me at LinkedIn: www.linkedin.com/in/dakshtrehan
Read my Tech blogs: www.dakshtrehan.medium.com
Connect with me at Instagram: www.instagram.com/_daksh_trehan_
Want to learnΒ more?
How is YouTube using AI to recommend videos?
Detecting COVID-19 Using Deep Learning
The Inescapable AI Algorithm: TikTok
GPT-3 Explained to a 5-year old.
Tinder+AI: A perfect Matchmaking?
An insiderβs guide to Cartoonization using Machine Learning
How Google made βHum to Search?β
One-line Magical code to perform EDA!
Give me 5-minutes, Iβll give you a DeepFake!
Cheers
Joining Data in Tableau: A Road to Tableau Desktop Specialist Certification was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.
Join thousands of data leaders on the AI newsletter. Itβs free, we donβt spam, and we never share your email address. Keep up to date with the latest work in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming aΒ sponsor.
Published via Towards AI