LeetCode 175 - Combine Two Tables
Database Language: SQL Server
Difficulty: Easy
Problem Description
Input
Table: Person
| Column Name | Type |
| ----------- | ------- |
| personId | int |
| lastName | varchar |
| firstName | varchar |
`personId` is the primary key (column with unique values) for this table. This table contains information about the ID of some persons and their first and last names.
Table: Address
| Column Name | Type |
| ----------- | ------- |
| addressId | int |
| personId | int |
| city | varchar |
| state | varchar |
`addressId` is the primary key (column with unique values) for this table. Each row of this table contains information about the city and state of one person with ID = PersonId.
Requirement
Write a solution to report the first name, last name, city, and state of each person in the Person table. If the address of a personId is not present in the Address table, report null instead.
Return the result table in any order.
The result format is in the following example.
Examples
Example 1
Input
Person table:
| personId | lastName | firstName |
| -------- | -------- | --------- |
| 1 | Wang | Allen |
| 2 | Alice | Bob |
Address table:
| addressId | personId | city | state |
| --------- | -------- | ------------- | ---------- |
| 1 | 2 | New York City | New York |
| 2 | 3 | Leetcode | California |
Output
| firstName | lastName | city | state |
| --------- | -------- | ------------- | -------- |
| Allen | Wang | NULL | NULL |
| Bob | Alice | New York City | New York |
Explanation
There is no address in the address table for the personId = 1 so we return null in their city and state. addressId = 1 contains information about the address of personId = 2.
SQL Schema
CREATE TABLE [dbo].[Person] (personId int PRIMARY KEY, firstName varchar(255), lastName varchar(255));
CREATE TABLE [dbo].[Address] (addressId int PRIMARY KEY, personId int, city varchar(255), state varchar(255));
TRUNCATE TABLE [dbo].[Person];
INSERT INTO [dbo].[Person] (personId, lastName, firstName) VALUES ('1', 'Wang', 'Allen');
INSERT INTO [dbo].[Person] (personId, lastName, firstName) VALUES ('2', 'Alice', 'Bob');
TRUNCATE TABLE Address;
INSERT INTO [dbo].[Address] (addressId, personId, city, state) VALUES ('1', '2', 'New York City', 'New York');
INSERT INTO [dbo].[Address] (addressId, personId, city, state) VALUES ('2', '3', 'Leetcode', 'California');
Solution
Given that the requirement is to return all rows from the `Person` table even if there are no corresponding row in the `Address` table, the `LEFT OUTER JOIN` (or `LEFT JOIN` for short) will be used to join the `Person` table with the `Address` table with the `Person` table on the left side of the join clause. The 2 tables will be joined on the `personId` column and here's how the join clause will look like:
FROM Person LEFT OUTER JOIN Address
ON Person.personId = Address.personId
The requirement wants the person's `firstName` and `lastName` as well as the person's `city` and `state` if it exists in the `Address` table and return a NULL value to these 2 columns if they don't exist. Since we are using a `LEFT OUTER JOIN`, the `city` and `state` will automatically return NULL values if an address record does not exist for a person. Here's how the `SELECT` statement will look like:
SELECT Person.firstName, Person.lastName, Address.city, Address.state
It is a good practice to include the table name as part of the column name to avoid any ambiguity if you will be joining tables that has a column that exists in both tables. Putting these 2 together yields the following query:
# Final Query
SELECT Person.firstName, Person.lastName, Address.city, Address.state
FROM Person LEFT OUTER JOIN Address
ON Person.personId = Address.personId
The query plan generated by SQL Server for this query is as follows:
|--Nested Loops(Left Outer Join, WHERE:([leetcode].[dbo].[Person].[personId]=[leetcode].[dbo].[Address].[personId]))
|--Clustered Index Scan(OBJECT:([leetcode].[dbo].[Person].[PK_Person]))
|--Clustered Index Scan(OBJECT:([leetcode].[dbo].[Address].[PK_Address]))
And here's the fastest runtime for the query above:
Related Articles: