|
One of the common tasks in database management is the maintenance of lookup tables. Sample lookup tables
are Currency Codes , Country Codes, U.S. State Codes, Product Types and so on. Maintenance of these
look-up tables are usually done using stored procedures where the normal task is to add the new record if it does not
exist or update the existing record if it already exists based on the identified primary key of the table. This
involves 2 statements, issuing either an INSERT statement or an UPDATE statement depending on the existence of the record.
SQL Server 2008 introduces a new statement called MERGE statement which combines these functionalities into just a single
statement. The MERGE statement performs an INSERT, UPDATE or DELETE operation on a target table based on the results of
a join with source table.
|
To illustrate how the MERGE statement is used, let's look at the process of maintaining a table using a stored
procedure that performs either an INSERT statement if the record does not exist or an UPDATE statement if the record
already exists in the target table. Prior to SQL Server 2008, your stored procedure may look like the following:
CREATE TABLE [dbo].[Employee] (
[EmployeeNumber] VARCHAR(10),
[FirstName] VARCHAR(50),
[LastName] VARCHAR(50),
[Position] VARCHAR(50)
)
GO
CREATE PROCEDURE [dbo].[usp_ProcessEmployee]
@EmployeeNumber VARCHAR(10),
@FirstName VARCHAR(50),
@LastName VARCHAR(50),
@Position VARCHAR(50)
AS
IF NOT EXISTS (SELECT 'X' FROM [dbo].[Employee]
WHERE [EmployeeNumber] = @EmployeeNumber)
INSERT INTO [dbo].[Employee] ( [EmployeeNumber], [FirstName], [LastName], [Position] )
VALUES ( @EmployeeNumber, @FirstName, @LastName, @Position )
ELSE
UPDATE [dbo].[Employee]
SET [FirstName] = @FirstName,
[LastName] = @LastName,
[Position] = @Position
WHERE [EmployeeNumber] = @EmployeeNumber
GO
EXECUTE [dbo].[usp_ProcessEmployee] 'ABC123', 'John', 'Smith', 'Vice President'
SELECT * FROM [dbo].[Employee]
GO
EmployeeNumber FirstName LastName Position
--------------- ----------- ---------- ------------
ABC123 John Smith Vice President
The MERGE Statement Way
With the new MERGE statement, the stored procedure above will now look as follows:
CREATE PROCEDURE [dbo].[usp_MergeEmployee]
@EmployeeNumber VARCHAR(10),
@FirstName VARCHAR(50),
@LastName VARCHAR(50),
@Position VARCHAR(50)
AS
MERGE [dbo].[Employee] AS [Target]
USING (SELECT @EmployeeNumber, @FirstName, @LastName, @Position)
AS [Source] ( [EmployeeNumber], [FirstName], [LastName], [Position] )
ON [Target].[EmployeeNumber] = [Source].[EmployeeNumber]
WHEN MATCHED THEN
UPDATE SET [FirstName] = [Source][FirstName],
[LastName] = [Source].[LastName],
[Position] = [Source].[Position]
WHEN NOT MATCHED THEN
INSERT ( [EmployeeNumber], [FirstName], [LastName], [Position] )
VALUES ( [Source].[EmployeeNumber], [Source].[FirstName],
[Source].[LastName], [Source].[Position] );
GO
Here are the results when calling the stored procedure containing the MERGE statement:
DELETE FROM [dbo].[Employee]
EXECUTE [dbo].[usp_MergeEmployee] 'ABC123', 'John', 'Smith', 'Vice President'
SELECT * FROM [dbo].[Employee]
GO
EmployeeNumber FirstName LastName Position
--------------- ----------- ---------- ------------
ABC123 John Smith Vice President
EXECUTE [dbo].[usp_MergeEmployee] 'ABC123', 'John', 'Smith', 'President'
SELECT * FROM [dbo].[Employee]
GO
EmployeeNumber FirstName LastName Position
--------------- ----------- ---------- ------------
ABC123 John Smith President
Dissecting the MERGE Statement
Now let's look at the syntax of the MERGE statement. The basic syntax of the first part of the MERGE
statement is as follows:
MERGE INTO <target_table> AS <alias>
USING <table_source>
ON <merge_search_condition>
This syntax looks totally different from the SELECT, UPDATE, INSERT or DELETE statement that you may have been
accustomed to. The first line specifies the target table where the succeeding INSERT, UPDATE or DELETE statement will be
performed. The "AS <alias>" part of the syntax is optional. The INTO clause also is optional.
The source table to which to get the records to merge with the target table is specified after the USING clause. The source table
can be a table or view, a function that returns a rowset, a derived table, joined tables or even the OPENXML statement.
The join condition "<merge_search_condition>" that joins the target table with the source table is specified after the ON clause. The
syntax of the MERGE statement can be compared to a SELECT statement with an INNER JOIN clause where the <target_table> is inner joined with
the <table_source> using the <merge_search_condition> specified.
Extracting from the stored procedure above, the first part of the MERGE statement used is as follows:
MERGE [dbo].[Employee] AS [Target]
USING (SELECT @EmployeeNumber, @FirstName, @LastName, @Position)
AS [Source] ( [EmployeeNumber], [FirstName], [LastName], [Position] )
ON [Target].[EmployeeNumber] = [Source].[EmployeeNumber]
In this case, the [dbo].[Employee] is the target table while [Target] is the alias associated with this table. As for the source
table, instead of using a table or view, a derived table is used using the parameters passed to the stored procedure. The alias [Source]
was assigned to this derived table followed by the column alias list corresponding to each column specified in the SELECT statement. Make
sure that the number of columns specified in the SELECT statement matches the number of columns in the column alias list. If the number of
columns in the SELECT statement is more than the number of columns in the columns list, you will get the following error message:
Msg 8158, Level 16, State 1, Line 6
'Source' has more columns than were specified in the column list.
Similarly, if the number of columns in the SELECT statement is less than the number of columns in the columns list, you will get the
following error message:
Msg 8159, Level 16, State 1, Line 5
'Source' has fewer columns than were specified in the column list.
The second part of the MERGE statement tells what action will be performed for those records that match between the source table and the target
table. It also tells what action will be performed for those records that are in the source but are not in the target. In our scenario,
what we want to happen is that for those records that exist in both the target table and source table, we would want to update the data in the
target table with the values in the source table. On the other hand, for those records in the source table but are not in the target table, we
would want to insert these records to the target table. The basic syntax for the second part of the MERGE statement is as follows:
WHEN MATCHED THEN <do_either_update_or_delete>
WHEN NOT MATCHED <do_insert>
Extracting this part from the stored procedure above:
WHEN MATCHED THEN
UPDATE SET [FirstName] = [Source][FirstName],
[LastName] = [Source].[LastName],
[Position] = [Source].[Position]
WHEN NOT MATCHED THEN
INSERT ( [EmployeeNumber], [FirstName], [LastName], [Position] )
VALUES ( [Source].[EmployeeNumber], [Source].[FirstName],
[Source].[LastName], [Source].[Position] );
From this script, you can see that for those records that matched (WHEN MATCHED THEN), the [FirstName], [LastName] and [Position] columns of
the target table are updated with the values from the source table. The name of the target table is not specifed after the UPDATE statement
because only the target table can be updated by the UPDATE statement. On the other hand, for those records that are in the source table
but are not in the target table (WHEN NOT MATCHED THEN), an INSERT statement is executed inserting the records from the source table into the
target table for those that don't exist.
Using Row Constructors as Source
In the article Row Constructors (or Table-Valued Constructors) as
Derived Table, it discusses the use of row constructors as derived tables and it shows an example of how to use it with the MERGE
statement. The stored procedure above can be modified to use row constructors as the source for the MERGE statement. Here's
how the stored procedure will look like, with the row constructor highlighted in red.
CREATE PROCEDURE [dbo].[usp_MergeEmployee]
@EmployeeNumber VARCHAR(10),
@FirstName VARCHAR(50),
@LastName VARCHAR(50),
@Position VARCHAR(50)
AS
MERGE [dbo].[Employee] AS [Target]
USING (VALUES ( @EmployeeNumber, @FirstName, @LastName, @Position ))
AS [Source] ( [EmployeeNumber], [FirstName], [LastName], [Position] )
ON [Target].[EmployeeNumber] = [Source].[EmployeeNumber]
WHEN MATCHED THEN
UPDATE SET [FirstName] = [Source].[FirstName],
[LastName] = [Source].[LastName],
[Position] = [Source].[Position]
WHEN NOT MATCHED THEN
INSERT ( [EmployeeNumber], [FirstName], [LastName], [Position] )
VALUES ( [Source].[EmployeeNumber], [Source].[FirstName],
[Source].[LastName], [Source].[Position] );
GO
|