Background
Recently, I have been organizing my blog data and needed to perform an operation to merge article categories. On my blog, articles and categories have a many-to-many relationship. That is, one article can belong to multiple categories, and one category can contain multiple articles. This is a very typical many-to-many relationship, and I use a many-to-many table with a composite primary key to associate this data.
What I need to do is move articles from the "DotNetBeginner" category to the "CSharpAndDotNet" category. However, since some of the articles originally in the "DotNetBeginner" category also belong to "CSharpAndDotNet", directly updating the association table would result in duplicate composite keys, which would cause issues.
To put it more intuitively, write an SQL statement to query the data from the original category (DotNetBeginner) and the target category (CSharpAndDotNet):
DECLARE @SourceCatId AS UNIQUEIDENTIFIER,
@TargetCatId AS UNIQUEIDENTIFIER
SET @SourceCatId = '678A4011-40E0-4F75-BC23-1FFC25B72D4A'
SET @TargetCatId = 'D58043FF-B3CB-43DA-9067-522D76D21BE3'
SELECT p.Title,
c.Name
FROM PostCategory AS pc
INNER JOIN Post AS p
ON p.Id = pc.PostId
INNER JOIN Category AS c
ON c.Id = pc.CategoryId
WHERE pc.CategoryId IN (@SourceCatId, @TargetCatId)
The result is shown in the following figure:
The article "C#字符和ASCII码互转" will encounter issues if directly updated to the CSharpAndDotNet category because there is already an existing record in the PostCategory table. However, an article like "上海轨道交通……" can be directly updated without any issues.
Solution
The approach to solve this problem is divided into two steps:
- Delete records that will break the composite primary key, such as those in "C#字符和ASCII码互转".
- Update the related table, changing the ID of the old category to the new category.
First, we need to identify which records meet the criteria for deletion. Find the IDs of these articles and use a GROUP BY HAVING
clause to query them.
SELECT pc.PostId
FROM PostCategory AS pc
WHERE pc.CategoryId IN (@SourceCatId, @TargetCatId)
GROUP BY
pc.PostId
HAVING COUNT(*) >= 2
The results are 10 items:
These 10 articles simultaneously belong to the DotNetBeginner, CSharpAndDotNet, and other categories.
To display intuitively and facilitate subsequent deletion operations, define a table variable called temp, and then verify which articles they are.
DECLARE @Temp TABLE (PostId UNIQUEIDENTIFIER)
INSERT INTO @Temp
(
PostId
)(
SELECT pc.PostId
FROM PostCategory AS pc
WHERE pc.CategoryId IN (@SourceCatId, @TargetCatId)
GROUP BY
pc.PostId
HAVING COUNT(*) >= 2
)
------------------------------------------------------------------------------
SELECT --pc.PostId
p.Title,
--pc.CategoryId,
c.Name
FROM PostCategory AS pc
INNER JOIN Post AS p
ON p.Id = pc.PostId
INNER JOIN Category AS c
ON c.Id = pc.CategoryId
WHERE pc.PostId IN (SELECT t.PostId
FROM @Temp t)
The result was found to be correct:
Then, you can delete all records where the PostId
in the @Temp
table matches from the PostCategory
association table, and the CategoryId
corresponds to DotNetBeginner. After that, use an update statement to complete the merging of article categories.
-- Step 1. Delete records that will fuck up the primary key
DELETE
FROM PostCategory
WHERE CategoryId = @SourceCatId
AND PostId IN (SELECT t.PostId
FROM @Temp t)
-- Step 2. Update old key to new key
UPDATE PostCategory
SET CategoryId = @TargetCatId
WHERE CategoryId = @SourceCatId
Finally, let's verify that the data has been successfully merged.
Comments