Web developers wiki ASP.NET Sitecore Sharepoint Kentico by Evident Interactive

HierarchyID - Design considerations

Modified: 2008/09/22 15:17 by vanthoog - Categorized as: SQL Server
In a previous article we have presented a brief introduction to the data type HierarchyID in SQL Server 2008. In this follow-up article a number of pitfalls and design considerations are described.

Edit

Pitfalls

When working with the data type HierarchyID it is easy to run into problems and encounter various pitfalls. This is especially true when using HierarchyID as the data type of one or more columns in a table. We will now discuss a number of these problems and pitfalls.

The most important thing to remember when using HierarchyID is that the various methods of HierarchyID do not know anything about the context in which HierarchyID is used. Therefore it is up to the designer and programmer to take care of the context themselves.

When, for example, you are using GetReparentedValue to determine the new path of an item, the method GetReparentedValue will not check whether a record already exists with the path it has just recalculated. So, it is very important that after using GetReparentedValue to determine a new path, you or your code checks whether a record already exists with that path.

This also applies to breaking the hierarchy. For example, let’s have a look at the following hierarchy:

/
/1/
/1/1/
/1/1/1/
/2/

You might expect that an error is produced when you try to delete the record with path /1/1/, because this record has children. But this is not the case; you can just delete the record with path /1/1/ and the record with path /1/1/1/ will remain in the database as an orphan (i.e. a record without a parent). And in this same example, if you use GetReparentedValue to change the path of the item /1/1/ into /2/1/, the children of /1/1/, in this case the item /1/1/1/, will not be changed automatically as well. So again, it is up to you and your code to also change the path of all children of the item whose path you are changing.

When inserting or updating records it is possible to use hardcoded paths (for example by specifying ‘/1/2/3/’). Although hierarchyid::Parse() will translate your hardcoded path into a valid HierarchyID, it will not check whether the supplied path is valid in the context of your table. As a result using incorrect hardcoded paths may result in two records with the same path or orphaned children (i.e. children with missing parents). In general it is therefore advisable not to use hardcoded paths.

Some of these issues might be resolved by using constraints. The following two constraints will prevent many inconsistency problems:
- A unique constraint on the HierarchyID column.
- A foreign key constraint between a computed column which defines the parent of the HierarchyID column, and the HierarchyID column itself.

Although this is a secure approach to ensuring the integrity and consistency of your hierarchy, it will be bad for performance.

Here is an example of a table implementing this approach:

CREATE TABLE Company
(
   Node HIERARCHYID NOT NULL PRIMARY KEY,
   ParentNode AS Node.GetAncestor(1) PERSISTED REFERENCES Company(Node),
   Title VARCHAR(28) NULL
) ;

It is possible to use a column with data type HierarchyID as the primary key of a table. This is generally not a good idea, because changing the path of an item will be cumbersome and might turn into a nightmare. Take for example the following example:
Example: company with vacancies

Example: company with vacancies


You could implement this like this:

CREATE TABLE Company
(
   Node HIERARCHYID NOT NULL PRIMARY KEY,
   Level AS Node.GetLevel() ,
   Title VARCHAR(28) NULL
) ;
GO

CREATE TABLE Vacancy
(
   ID INT NOT NULL PRIMARY KEY,
   VacancyTitle VARCHAR(28) NULL,
   CompanyNode HIERARCHYID NOT NULL REFERENCES Company
) ;
GO

Now if you change the path of a company, you will not only have to change the path of all its children, but also have to update the column CompanyNode in the table Vacancy. A better approach would be to always use ID columns as primary key and in foreign keys referencing this table. So the following would be a better approach:

CREATE TABLE Company
(
   ID INT NOT NULL PRIMARY KEY,
   Node HIERARCHYID NOT NULL UNIQUE,
   Level AS Node.GetLevel() ,
   Title VARCHAR(28) NULL
) ;
GO

CREATE TABLE Vacancy
(
   ID INT NOT NULL PRIMARY KEY,
   VacancyTitle VARCHAR(28) NULL,
   CompanyID INT NOT NULL REFERENCES Company
) ;
GO

Edit

Design considerations

Here are some design considerations and additional tips:
- Do not use a column of type HierarchyID as the primary key of your table. Instead, use a ID-like column of type UniqueIdentifier (or Int) as the primary key.
- Put a unique constraint on all columns of type HierarchyID. It is also acceptable to put the unique constraint on the combination of a column of type HierarchyID and another column (e.g. an ID column referencing another table).
- Add a computed column defining the parent of the HierarchyID column and put a foreign key constraint between this computed column and the HierarchyID column.
- Within the same table do not use two columns of type HierarchyID. Although this is technically possible, it will add an extra level of complexity to your insert, update and delete queries.
- Consider adding a computed column (using GetLevel()) containing the level of the hierarchy.
- Consider whether to make the primary key (i.e. your ID column) or the unique constraint (i.e. your HierarchyID column) clustered.
- Consider using stored procedures for inserting, updating and deleting records and add code in your stored procedures for ensuring the integrity of your hierarchy.
- Always use HierarchyID::GetRoot() (i.e. ‘/’) as the root of your hierarchy.
- Do not use hardcoded paths.

 © Evident Interactive BV