July 2009
Mon Tue Wed Thu Fri Sat Sun
« Jun   Aug »
 12345
6789101112
13141516171819
20212223242526
2728293031  

Month July 2009

Counting Children with CTEs

Have you ever wanted to get a running total of all of the descendants of each tree node? This sort of thing is useful, especially if you don’t want to pull back an entire object graph just to compute the count of a child collection.

IF OBJECT_ID('tempdb..#categories') IS NOT NULL
  DROP TABLE #categories;

IF OBJECT_ID('tempdb..#product_categories') IS NOT NULL
  DROP TABLE #product_categories;

CREATE TABLE #categories (CategoryId INT, ParentCategoryId INT);
CREATE TABLE #product_categories (CategoryId INT, ProductId INT);

INSERT INTO #categories VALUES (1, NULL);
INSERT INTO #categories VALUES (2, 1);
INSERT INTO #categories VALUES (3, 1);
INSERT INTO #categories VALUES (4, 2);
INSERT INTO #categories VALUES (5, 3);
INSERT INTO #categories VALUES (6, 2);

INSERT INTO #product_categories VALUES (1, 1);
INSERT INTO #product_categories VALUES (2, 15);
INSERT INTO #product_categories VALUES (2, 18);
INSERT INTO #product_categories VALUES (3, 12);
INSERT INTO #product_categories VALUES (6, 34);
INSERT INTO #product_categories VALUES (4, 35);
INSERT INTO #product_categories VALUES (5, 99);
INSERT INTO #product_categories VALUES (3, 43);
INSERT INTO #product_categories VALUES (6, 54);
INSERT INTO #product_categories VALUES (3, 92);
INSERT INTO #product_categories VALUES (2, 77);
INSERT INTO #product_categories VALUES (5, 62);
INSERT INTO #product_categories VALUES (4, 42);
INSERT INTO #product_categories VALUES (1, 11);

;
WITH  cte(CategoryId, ParentCategoryId, c)
        AS (SELECT  c1.CategoryId,
                    c1.ParentCategoryId,
                    (SELECT COUNT(*)
                     FROM   (SELECT DISTINCT
                                    ProductId
                             FROM   #product_categories AS pc
                             WHERE  pc.CategoryId = c1.CategoryId
                            ) AS t1
                    ) AS c
            FROM    #categories AS c1
            UNION ALL
            SELECT  c2.CategoryId,
                    c2.ParentCategoryID,
                    d.c
            FROM    #categories c2
                    INNER JOIN cte d ON c2.CategoryId = d.ParentCategoryId
           )
  SELECT  cte.CategoryId,
          cte.ParentCategoryId,
          SUM(c) AS ProductCount
  FROM    cte
  GROUP BY cte.CategoryId,
          cte.ParentCategoryId;

So, how does mess work? Carefully.

The innermost select, t1, builds a virtual table that is nothing more than the product count under a single Category. There is a DISTINCT on here just to make sure that there are no duplicates from the #product_categories table. Theoretically this should be a group by, but this was a super fast hotfix that needed to go into production ASAP. Mistakes happen.

Moving outwards we encounter a SELECT COUNT on t1. This is where we get the actual count of products under the current category. Pretty standard.

What’s interesting, to me at least, is that in the recursive portion of the CTE, we’re working our way UP the hierarchy, from the bottom. This makes sure that when we do a sum of c, grouped by category and parent category, we’ll have the proper count in our CTE results.

What’s the advantage of this approach? You don’t have nasty deeply nested sub-selects. You can use this to populate a ChildCount column in your table as a result of a trigger. This is also, usually, fast enough to run live or else to use as a materialized view.

Happy Contribupendence Day

Last year, Jeff Blankeburg came up with the idea of Contribupendence Day. To save you from visiting Jeff’s site and reading another blog post, Contribupendence day is a day when we acknowledge the people who help out our community.

Update: It’s completely remiss of me not to tag the person who tagged me. Carey Payette singled me out this year. I can’t say enough great things about her community involvement and the time she puts in to running the Central Ohio dot Net Developers Group.

Mladen Prajdić – Mladen is slowly becoming better known in the SQL Server community, in large part because of SSMS Tools Pack. What many people don’t realize is that Mladen is a phenomenal SQL developer in his own right and is more than happy to share that knowledge via twitter and email. He even braved the time zone difference (he lives in Slovenia) and presented for the PASS Application Development virtual chapter. He is easily able to explain complex concepts and has always been more than happy to share his knowledge with others.

Thomas LaRock – Tom helps out PASS as a member of the Board of Directors. He helps out the community as an active blogger and twitterer. Outside of blogging and tweeting, Tom is always a phenomenal voice of reason and is able to see both sides of a situation – both as a developer/DBA and from the perspective of business users.

Michelle Ufford – Michelle is a developer DBA with GoDaddy.com and an active member of the SQL Server community. In addition to being a blogger, she is heavily involved in the I380 Corridor PASS chapter, organizing the East Iowa SQL Saturday, and maintaining her index maintenance scripts at SQL Server Pedia. Michelle is active on twitter as sqlfool and is always willing to help out with SQL questions. Michelle has also been working heavily to start up the Performance virtual chapter for PASS.

Brent Ozar – Brent’s involvement in the SQL Server community never ceases to amaze. He blogs prolifically on his personal website, is the Editor-in-Chief of SQL Server Pedia, is building up the Virtualization virtual chapter of PASS, records podcasts, presents for user groups both in person and remotely, and is very active on twitter as BrentO. Brent is incredibly knowledgable about SQL Server, SAN configuration, T-SQL, and server consolidation and virtualization. He has always taken the time to help me understand SQL Server concepts and has guided my knoweldge as I’ve been learning more about performance tuning SQL Server and the underlying OS and disk structure.

All of these people happily contribute their knowledge and free time to make the SQL Server community a better place.

Clearing out old backup history

Let’s say that you want to clear out your old backup history. Why would you want to do that? Let’s also assume that your maintenance plans (yes, maintenance plans) have not been cleaning up after themselves.

If this has happened over a small period of time, you can just do this:

EXEC sp_delete_backuphistory DATEADD(m, -1, GETDATE());

But what if you have a lot of backup history? Trying to delete two years of backup history will slow things down incredibly on your server and take a very very long time. To the point where your boss will come over and frantically say ‘Hey, why is the server refusing connections?’

Instead you’ll want to use something like this:

DECLARE @oldest_date AS DATETIME;

IF OBJECT_ID('tempdb..#dates') IS NOT NULL
  DROP TABLE #dates;

SELECT DISTINCT DATEADD(d, 0, DATEDIFF(d, 0, backup_start_date)) AS d
INTO #dates
FROM dbo.backupset

SELECT @oldest_date = DATEADD(d, 2, MIN(d))
FROM #dates
WHERE d < DATEADD(mm, -1, GETDATE())

EXEC sp_delete_backuphistory @oldest_date;

Why isn’t this in a loop? Because I have a sinus infection and I didn’t feel like being that clever while I’m working from home. That’s why. You probably should be clever and put this in a loop. And put a nice little wait at the end of it that waits for a minute or two before repeating the whole thing. Me? I just sit here and zone out for a few minutes. The point is, this works.

If you’ll excuse me, I’m going to get back to watching day time TV and pressing F5 every 3 minutes.

Columbus Give Camp is Just Two Weeks Away

The Columbus Give Camp is only two weeks away but it’s still not too late for both charities and volunteers to sign up.

GiveCamp is a weekend-long event where software developers, designers, and database administrators donate their time to create custom software for non-profit organizations. This custom software could be a new website for the nonprofit organization, a small data-collection application to keep track of members, or a application for the Red Cross that automatically emails a blood donor three months after they’ve donated blood to remind them that they are now eligible to donate again. The only limitation is that the project should be scoped to be able to be completed in a weekend.

During GiveCamp, developers are welcome to go home in the evenings or camp out all weekend long. There are usually food and drink provided at the event. There are sometimes even game systems set up for when you and your need a little break! Overall, it’s a great opportunity for people to work together, developing new friendships, and doing something important for their community.

At GiveCamp, there is an expectation of “What Happens at GiveCamp, Stays at GiveCamp”. Therefore, all source code must be turned over to the charities at the end of the weekend (developers cannot ask for payment) and the charities are responsible for maintaining the code moving forward (charities cannot expect the developers to maintain the codebase).

The deadline is July 8th, so get your charity proposals and volunteer submissions in!

This site is protected with Urban Giraffe's plugin 'HTML Purified' and Edward Z. Yang's Powered by HTML Purifier. 401 items have been purified.