Tip of the Day : SQL Server Database Design - Twitter Profile and Followers
Get Tree Node Level Function

Get Tree Node Level Function

Let's say you have a table named [dbo].[Hierarchy] with the following data:

ID          ParentID    Name
----------- ----------- ----------------------------
1           NULL        Asia
2           NULL        Africa
3           NULL        Antarctica
4           NULL        Australia
5           NULL        Europe
6           NULL        North America
7           NULL        South America
8           6           Canada
9           6           United States of America
10          9           Alaska
11          9           Alabama
12          9           Arkansas
13          9           Arizona
14          9           California
15          9           Colorado
16          9           Connecticut
17          9           District of Columbia
18          9           Delaware
19          9           Florida
20          9           Georgia
21          14          Los Angeles
22          19          Miami
23          20          Atlanta

For each row in the table, you want to be able to determine the level of the row within the tree.  The desired ouput is as follows:

ID          ParentID    Name                                          Level       
----------- ----------- --------------------------------------------- ----------- 
1           0           Asia                                          1
2           0           Africa                                        1
3           0           Antarctica                                    1
4           0           Australia                                     1
5           0           Europe                                        1
6           0           North America                                 1
7           0           South America                                 1
8           6           Canada                                        2
9           6           United States of America                      2
10          9           Alaska                                        3
11          9           Alabama                                       3
12          9           Arkansas                                      3
13          9           Arizona                                       3
14          9           California                                    3
15          9           Colorado                                      3
16          9           Connecticut                                   3
17          9           District of Columbia                          3
18          9           Delaware                                      3
19          9           Florida                                       3
20          9           Georgia                                       3
21          14          Los Angeles                                   4
22          19          Miami                                         4
23          20          Atlanta                                       4

One option that can be done to provide the [Level] of each row is to add a field in the table that will hold this value.  The following script will add the [Level] column to the table and then populate it with the level of each within the tree.

-- Step #1: Add [Level] Column to the Table
ALTER TABLE [dbo].[Hierarchy] ADD [Level] INT NULL
GO

-- Step #2: Set Level = 1 For Rows With No Parents
UPDATE [dbo].[Hierarchy]
SET [Level] = 1
WHERE [ParentID] IS NULL
GO

-- Step #3: Set the Level of all Remaining Rows
WHILE (SELECT COUNT(*) FROM [dbo].[Hierarchy] WHERE [Level] IS NULL) > 0
BEGIN
    UPDATE A
    SET [Level] = B.[Level] + 1
    FROM [dbo].[Hierarchy] A INNER JOIN [dbo].[Hierarchy] B
                                     ON A.[ParentID] = B.[ID]
    WHERE A.[Level] IS NULL AND
          B.[Level] IS NOT NULL
END
GO

The first step of the script adds the [Level] column to the table, which will hold the level of the row within the table/tree.  The next step is to assign level 1 to those rows with no parents (WHERE [ParentID] IS NULL).

The last step of the script updates the remaining rows with their corresponding level.  This is accomplished by looping through the table and checking if there are still rows that needs to be updated (WHILE (SELECT COUNT(*) FROM [dbo].[Hierarchy] WHERE [Level] IS NULL) > 0).  If there are still rows that need to be updated, it now updates all rows where the level has not been set yet (A.[Level] IS NULL).  Also, it will only update those rows whose parent already has a level (B.[Level] IS NOT NULL).  To get the rows level, it simply adds 1 to the parents level.

This script accomplishes the required task.  It was able to assign the level of each row within the tree structure.  What if a new level was inserted between other levels?  Let's say that Counties are now included in the [dbo].[Hierarchy] table.  The lowest level in the [dbo].[Hierarchy] table, which is the City, will now go down the hierarchy.  With the [Level] column as part of the [dbo].[Hierarchy] table, it is mandatory that this column be updated, both for the newly inserted level as well for all rows that were moved down the hierarchy, in this case the City.  For the script that is provided above to work, the [Level] column needs to be changed to NULL first because it populates the column one level at a time.  In a production environment, this can't be done easily.

Another solution to the problem is to determine the Level of the row on the fly.  The user-defined function below accomplishes this task.

CREATE FUNCTION [dbo].[ufn_GetTreeNodeLevel] ( @pCurrentNode    INT )
RETURNS INT
AS
BEGIN

    DECLARE @vParentID            INT

    IF @pCurrentNode = 0 OR @pCurrentNode IS NULL
        RETURN 0

    SELECT @vParentID = [ParentID]
    FROM [dbo].[Hierarchy]
    WHERE [ID] = @pCurrentNode

    RETURN [dbo].[ufn_GetTreeNodeLevel] ( @vParentID ) + 1

END
GO

The logic behind this user-defined function is in the following code:

RETURN [dbo].[ufn_GetTreeNodeLevel] ( @vParentID ) + 1

As can be seen, the function calls itself.  This type of function is a recursive function.  A recursive function is simply a function that calls itself.  In this user-defined function, to get the level of the current node, it simply adds 1 to the level of its parent.  The function stops calling itself when the input node (@pCurrentNode) is already 0 or null and simply return 0.

This user-defined function can only process until 31 levels within the tree.  This limitation is not a limitation of the function itself but a limitation of SQL Server because in SQL Server, the maximum number of nesting levels is 32.  Nesting occurs when one function/ stored procedure calls another function/stored procedure, which is the case with this recursive function.

To validate the output of this function, simply run the following SELECT statement:

SELECT *, [dbo].[ufn_GetTreeNodeLevel] ( [ID] ) AS [Level]
FROM [dbo].[Hierarchy]
GO
ID          ParentID    Name                                          Level   Level   
----------- ----------- --------------------------------------------- ------- --------
1           NULL        Asia                                          1       1
2           NULL        Africa                                        1       1
3           NULL        Antarctica                                    1       1
4           NULL        Australia                                     1       1
5           NULL        Europe                                        1       1
6           NULL        North America                                 1       1
7           NULL        South America                                 1       1
8           6           Canada                                        2       2
9           6           United States of America                      2       2
10          9           Alaska                                        3       3
11          9           Alabama                                       3       3
12          9           Arkansas                                      3       3
13          9           Arizona                                       3       3
14          9           California                                    3       3
15          9           Colorado                                      3       3
16          9           Connecticut                                   3       3
17          9           District of Columbia                          3       3
18          9           Delaware                                      3       3
19          9           Florida                                       3       3
20          9           Georgia                                       3       3
21          14          Los Angeles                                   4       4
22          19          Miami                                         4       4
23          20          Atlanta                                       4       4

We can now get rid of the [Level] column and use this function to derive the level of each row.  This addresses the issue of newly inserted rows as well as newly created level within the hierarchy.