When clearing database tables, one knows that certain tables must be deleted before others, but others can be deleted at any point. Hence we want to establish a partial order of the delete operations in our cleanup script. A
Topological Sort algorithm satisfies this requirement. As the name suggests it evaluates the interconnected nodes [in our case tables] to determine which come first.
|
So if we were to look at the database structure on the right, we would want the UserTeam table cleared before the Team and User tables. The Document table would also need to be cleared before we could clear out the user table. Hence any of the following clearance orders would satisfy our needs:
- UserTeam, Document, User, Team
- Document, UserTeam, Team, User
- Document, UserTeam, User , Team
- UserTeam, Team, Document, User
I think the list above makes the concept of a partial order pretty clearly since no one order is better than the other. Obviously as one introduces more dependencies so the order will change and the number of combinations grows significantly. The list however is actually the inverse of a topological sort since it starts of with the tables that depend on other tables and ends with tables that have no dependencies.
|
|
In order to sort the relationships correctly we will have to traverse down the relationship tree until we find a table that has no relationships and then add it to our sorted list. We then pick another table and do the same until there are no unprocessed tables. The flowchart below illustrates the logic that I will use to implement the sort.

Doing this in TSQL is actually quite easy and I'll highlight the code I used to implement the flowchart logic:
- Compile a list of tables
CREATE TABLE #Tables (
TableName varchar(100)
)
INSERT INTO #Tables (
TableName
)
SELECT
Table_Name
FROM
INFORMATION_SCHEMA.TABLES
WHERE
TABLE_TYPE = 'BASE TABLE'
- Get a list of foreign key relationships
CREATE TABLE #Dependencies (
PKTable varchar(100),
FKRefTable varchar(100)
)
INSERT INTO #Dependencies (
PKTable,
FKRefTable
)
SELECT
PKTABLE.TABLE_NAME as PKTable,
FKTABLE.TABLE_NAME as FKRefTable
FROM
INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS REFCON
INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS FKTABLE ON
FKTABLE.CONSTRAINT_NAME = REFCON.CONSTRAINT_NAME
INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS PKTABLE ON
PKTABLE.CONSTRAINT_NAME = REFCON.UNIQUE_CONSTRAINT_NAME
ORDER BY
FKRefTable
- While there are unprocessed tables, traverse the relationship tree of each table until I reach a table (or node) that has no dependencies and add it to the sorted list
DECLARE @CurrentTable varchar(100)
DECLARE @DependantTable varchar(100)
DECLARE @PreviousRelatedTable varchar(100)
SET @PreviousRelatedTable = ''
SET @CurrentTable = (SELECT TOP 1 TableName FROM #Tables)
WHILE @CurrentTable != ''
BEGIN
SET @DependantTable = ''
SELECT TOP 1
@DependantTable = PKTable
FROM
#Dependencies
WHERE
FKRefTable = @CurrentTable AND
PKTable != @CurrentTable
/* ----------------------------------------------------
If this table does not have any dependencies on other tables
then add it to the sorted tables table and then remove it from
the tables list.
----------------------------------------------------- */
IF @DependantTable = ''
BEGIN
INSERT INTO #SortedTables (
TableName
) VALUES (
@CurrentTable
)
--Remove all references to the removed table to allow
--object higher up in the hierarchy to be added later
DELETE FROM #Dependencies
WHERE
PKTable = @CurrentTable
--Remove the current table from the tables list
DELETE FROM #Tables
WHERE
TableName = @CurrentTable
--Move on to the next table
SET @CurrentTable = ''
SET @CurrentTable = (SELECT TOP 1 TableName FROM #Tables)
END
/* --------------------------------------------------------
If a dependant table was found then move on to that
table for the next loop
-------------------------------------------------------- */
ELSE
BEGIN
SET @PreviousRelatedTable = @CurrentTable
SET @CurrentTable = @DependantTable
END
END
The end result in this case is the following script:
DELETE FROM [UserTeam]
DELETE FROM [Team]
DELETE FROM [Document]
DELETE FROM [User]
The complete script is available here
Clear DB - Topological Sort Example.zip.
Obviously the same approach without the list reversal can be used to determine the order in which to script out tables and data.