Tag sqljackass

When Less Reading is a Good Thing

Less reading is a good thing when you’re reading from a physical disk.

During some recent performance tuning, I stumbled across a little gem of a stored procedure that was producing between 4,000 and 11,000 physical reads every time it was called. The name? GetStateList.

It returns a list of state abbreviations. I’m not proud to say that I wrote it, but I have to come clean: I wrote it.

So, how did I fix this travesty of my youthful ignorance?

Well, it’s just a list of states, right? And it’s just all of the states from a couple of different tables. So, the easiest thing to do was to create an indexed, or materialized, view. Basically an indexed view is a view with a clustered index. Since a clustered index defines the physical order of data on the disk, the view is persisted to disk. When the underlying data changes, the data that is persisted to disk will change. Plus, you can supply other indexes on the view, just like it’s a table, which can make your queries run even faster.

Imagine, if you will, that you have a Location table full of data, as opposed to the empty one that we’re about to create.

CREATE TABLE Locations (
  LocationID INT NOT NULL IDENTITY(1, 1) PRIMARY KEY,
  ClientID INT NOT NULL,
  Name VARCHAR(34) NOT NULL,
  State VARCHAR(2) NOT NULL
);

So, we have a lot of locations. Each location is belongs to a client. Each location has a state. I want a list of allowed states for each client.

SELECT l.ClientID,
       l.[State]
FROM Locations AS l
GROUP BY l.ClientId, l.[State]

Well, that’s great, but that’s also the query that results in WAY too many reads. So, how do we go about fixing this situation?

The indexed view that I mentioned earlier, that’s how!

CREATE VIEW dbo.LocationStates
WITH SCHEMABINDING -- you need this, can't very well persist changes if the table changes
AS
SELECT l.ClientID,
       l.[State],
       COUNT_BIG(*) AS the_count
FROM dbo.Locations AS l
GROUP BY l.ClientId, l.[State]
GO

CREATE UNIQUE CLUSTERED INDEX CIX_Locations_ClientID_State ON LocationStates (ClientID, State);
GO

What’s up with that COUNT_BIG(*) in there? Well, any time you use an aggregate function, or aggregation, in an indexed view, you also need to make use of COUNT_BIG(*) (see the MSDN Link above for more info).

Once I created the indexed view, I changed my queries to use the indexed view:

SELECT State
   FROM dbo.LocationStates AS x WITH(NOEXPAND)
  WHERE ClientId = @client_id;

The WITH(NOEXPAND) hint forces SQL Server to use the indexed view. As I understand it, this is not necessary on Enterprise Edition. Since I’m on Standard Edition I cannot verify this, but I can verify that SQL Server would not use the indexed view until I supplied the hint.

What was the outcome of all of this? Well, instead of several thousand reads, this ended up taking 2 reads. Considering that this stored procedure gets called thousands and thousands of times a day, that’s a HUGE savings.

The downside to using an indexed view is that if the underlying data changes, the persisted view data has to be updated, so it’s best to avoid using indexed views on queries that change constantly.

Building a new SQL 2008 box

I have a lucky, and rare, chance coming my way in about a week: I will be given the opportunity to design and install the production SQL servers at work.

Here’s the fun part – I’ve never done this before, at least not on a production machine.

What best practices, secret tricks, or hare-brained schemes do you, my fellow DBAs, follow when you put together a new SQL Server installation?

This could be anything from drive configuration, something about SANs, trace flags, whatever. You name it, I’ll evaluate it and confer with other smartypants types and once the install is done, I’ll let you know how everything goes.

Oh, the hardware is going to be a dual CPU quad Xeon machine from Dell with 16GB of RAM. And we do have a SAN of some nature, but that’s a mystery box.

Edit: So, what will these machines actually do? They will serve reports off of an OLTP schema that is updated every 15 minutes through replication from our master database server. There will be two instances on each machine – one that is live and one that just sits there, doing nothing. They mirror each other, in addition to the geographic fail over that somebody else is configuring. We do some heavy batch processing in the early morning hours, but during the day the machine is largely doing reads and whatever writes that the replication brings about.

The database that holds the majority of the reporting data is about 45GB and there is approximately 25GB of metadata and summarized reports (the batch processing, naturally) in a second database.

Rated G for Nerdy Content

My fellow DBAs and SQL Server Developers, have you ever worried about being associated with other, perverted, developers who might use profanity or nudity in their presentations? Are you concerned that people might not know what kind of content to expect in your presentations?

Worry no more!

Even C++ developers can understand this!

Even C++ developers can understand this!

Now you can declare to the world that your presentation is free of filth and only contains worthwhile knowledge and information. No more will people confuse you with some kind of filth monger. No more will people cringe in fear when you mention your up time. Rejoice my fellow developers!

This site is protected with Urban Giraffe's plugin 'HTML Purified' and Edward Z. Yang's Powered by HTML Purifier. 401 items have been purified.