Tip of the Day : LeetCode 175 - Combine Two Tables
Determine Missing Identity Values

Determine Missing Identity Values

There will come a time wherein a table that contains an identity column will have some IDs deleted.  These deleted IDs will create gaps between the values of the identity column.  To create a sequential identity column with no gaps in between, newly created records of the table can make use of unused or deleted IDs.  But before it can make use of these IDs, you first have to identify these missing IDs.  This article will discuss 3 ways of identifying the missing identity values in a table containing an identity column.

To simulate the process of determining missing identity values in a table, let's assume we have the following table:

-- Step #1: Create Table and Populate with Values
CREATE TABLE #CarType (
    [ID] INT IDENTITY, 
    [Name] VARCHAR(20) )

INSERT INTO #CarType ( [Name] ) VALUES ( 'Bentley' )
INSERT INTO #CarType ( [Name] ) VALUES ( 'BMW' )
INSERT INTO #CarType ( [Name] ) VALUES ( 'Ferrari' )
INSERT INTO #CarType ( [Name] ) VALUES ( 'Lamborghini' )
INSERT INTO #CarType ( [Name] ) VALUES ( 'Hummer' )
INSERT INTO #CarType ( [Name] ) VALUES ( 'Jaguar' )
INSERT INTO #CarType ( [Name] ) VALUES ( 'Lexus' )
INSERT INTO #CarType ( [Name] ) VALUES ( 'Mercedes Benz' )
INSERT INTO #CarType ( [Name] ) VALUES ( 'Porsche' )
INSERT INTO #CarType ( [Name] ) VALUES ( 'Volvo' )

SELECT * FROM #CarType

The output of the SELECT statement will be as follows:

ID     Name
------ --------------
1      Bentley
2      BMW
3      Ferrari
4      Lamborghini
5      Hummer
6      Jaguar
7      Lexus
8      Mercedes Benz
9      Porsche
10     Volvo

Let's say certain records have been deleted from the table, as shown in the following script:

-- Step #2: Delete IDs
DELETE FROM #CarType WHERE [ID] IN (3, 4, 9)

SELECT * FROM #CarType

The table now has the following records:

ID     Name
------ --------------
1      Bentley
2      BMW
5      Hummer
6      Jaguar
7      Lexus
8      Mercedes Benz
10     Volvo

One way to identify the missing/deleted identity values is to loop through the table starting from 1 up to the maximum identity value as shown in the following script:

-- Step #3 (Option #1): Identify Missing IDENTITY Values
DECLARE @ID INT
DECLARE @MaxID INT
DECLARE @MissingCarTypeIDs TABLE ( [ID] INT )

SELECT @MaxID = [ID] FROM #CarType

SET @ID = 1
WHILE @ID <= @MaxID
BEGIN
    IF NOT EXISTS (SELECT 'X' FROM #CarType
                   WHERE [ID] = @ID)
        INSERT INTO @MissingCarTypeIDs ( [ID] )
        VALUES ( @ID )

    SET @ID = @ID + 1
END

SELECT * FROM @MissingCarTypeIDs

The missing identity values are temporarily stored in a table variable for later processing.  The output of the SELECT statement above is as follows, which are the deleted identity values:

ID
------
3
4
9

Another way of determining the missing identity values is the use of a temporary table that contains just one column which will hold all possible values of an identity column from a value of 1 to the maximum identity value of the table being searched.

-- Step #3 (Option #2): Identify Missing IDENTITY Values
DECLARE @IntegerTable TABLE ( [ID] INT )
DECLARE @ID INT
DECLARE @MaxID INT

SELECT @MaxID = [ID] FROM #CarType

SET @ID = 1
WHILE @ID <= @MaxID
BEGIN
    INSERT INTO @IntegerTable ( [ID] )
    VALUES ( @ID )

    SET @ID = @ID + 1
END

SELECT A.*
FROM @IntegerTable A LEFT OUTER JOIN #CarType B
  ON A.[ID] = B.[ID]
WHERE B.[ID] IS NULL

The first part of the script is the population of a table variable (@IntegerTable) that contains one column (ID), which will hold all possible values from 1 to the maximum identity value of the table (@MaxID).

The second part of the script is the joining of this table variable with the table being determined the deleted identity values.  Since the table variable contains all possible identity values, a LEFT OUTER JOIN is used and the deleted identity values are identifed by checking for NULL IDs from the table (WHERE B.[ID] IS NULL).

Instead of a table variable, a user-defined table-valued function can be created that will return a table containing the same data as the table variable used above.  A separate article discusses such function, Integer Range Table Function).  Using the same function discussed on the mentioned article, [dbo].[ufn_GenerateIntegers], the script can be simplified as follows:

-- Step #3 (Option #3): Identify Missing IDENTITY Values
DECLARE @MaxID INT

SELECT @MaxID = [ID] FROM #CarType

SELECT A.*
FROM [dbo].[ufn_GenerateIntegers] ( @MaxID ) A LEFT OUTER JOIN #CarType B
  ON A.[IntValue] = B.[ID]
WHERE B.[ID] IS NULL