When Less Reading is a Good Thing

Less reading is a good thing when you’re reading from a physical disk.

During some recent performance tuning, I stumbled across a little gem of a stored procedure that was producing between 4,000 and 11,000 physical reads every time it was called. The name? GetStateList.

It returns a list of state abbreviations. I’m not proud to say that I wrote it, but I have to come clean: I wrote it.

So, how did I fix this travesty of my youthful ignorance?

Well, it’s just a list of states, right? And it’s just all of the states from a couple of different tables. So, the easiest thing to do was to create an indexed, or materialized, view. Basically an indexed view is a view with a clustered index. Since a clustered index defines the physical order of data on the disk, the view is persisted to disk. When the underlying data changes, the data that is persisted to disk will change. Plus, you can supply other indexes on the view, just like it’s a table, which can make your queries run even faster.

Imagine, if you will, that you have a Location table full of data, as opposed to the empty one that we’re about to create.

CREATE TABLE Locations (

So, we have a lot of locations. Each location is belongs to a client. Each location has a state. I want a list of allowed states for each client.

SELECT l.ClientID,
FROM Locations AS l
GROUP BY l.ClientId, l.[State]

Well, that’s great, but that’s also the query that results in WAY too many reads. So, how do we go about fixing this situation?

The indexed view that I mentioned earlier, that’s how!

CREATE VIEW dbo.LocationStates
WITH SCHEMABINDING -- you need this, can't very well persist changes if the table changes
SELECT l.ClientID,
       COUNT_BIG(*) AS the_count
FROM dbo.Locations AS l
GROUP BY l.ClientId, l.[State]

CREATE UNIQUE CLUSTERED INDEX CIX_Locations_ClientID_State ON LocationStates (ClientID, State);

What’s up with that COUNT_BIG(*) in there? Well, any time you use an aggregate function, or aggregation, in an indexed view, you also need to make use of COUNT_BIG(*) (see the MSDN Link above for more info).

Once I created the indexed view, I changed my queries to use the indexed view:

   FROM dbo.LocationStates AS x WITH(NOEXPAND)
  WHERE ClientId = @client_id;

The WITH(NOEXPAND) hint forces SQL Server to use the indexed view. As I understand it, this is not necessary on Enterprise Edition. Since I’m on Standard Edition I cannot verify this, but I can verify that SQL Server would not use the indexed view until I supplied the hint.

What was the outcome of all of this? Well, instead of several thousand reads, this ended up taking 2 reads. Considering that this stored procedure gets called thousands and thousands of times a day, that’s a HUGE savings.

The downside to using an indexed view is that if the underlying data changes, the persisted view data has to be updated, so it’s best to avoid using indexed views on queries that change constantly.


7 Comments so far. Comments are closed.
  1. Aaron Bertrand,

    As an FYI, I have observed at least one case where I had better performance when using NOEXPAND on Enterprise Edition. I’ll try to dig up the details but IIRC the index on the view was not being chosen.

  2. Hey, you forgot to mention the gazillion SET options that have to be supplied to create the view :-)
    Also you need dbo.Locations in the view definition or you can’t create it

    just curious, when you run this

    SELECT l.ClientID,
    FROM Locations AS l
    GROUP BY l.ClientId, l.[STATE]

    |–Clustered Index Scan(OBJECT:([tempdb].[dbo].[LocationStates].[CIX_Locations_ClientID_State]))

    it uses the view index for me

    also running this without a hint

    FROM dbo.LocationStates AS x
    WHERE ClientId = 1;

    gets me this plan

    |–Stream Aggregate(GROUP BY:([LocationStates].[STATE]))
    |–Clustered Index Seek(OBJECT:([tempdb].[dbo].[LocationStates].[CIX_Locations_ClientID_State]),
    SEEK:([LocationStates].[ClientID]=Convert([@1])) ORDERED FORWARD)

  3. Denis, you have a great point! My view, technically, is bunk without the schema. I’ll change that.

    When I ran this on my production instance, I saw the opposite of what you saw and ended up with my code never using the indexed view unless I specified WITH(NOEXPAND). I suspect, though, this may be due to the prevalence of that execution plan in the plan cache.

  4. Andy Irving,

    Just to confirm what you said, from http://technet.microsoft.com/en-us/library/ms181151.aspx
    In SQL Server Enterprise Edition, the query optimizer automatically considers the indexed view. To use an indexed view in all other editions, the NOEXPAND table hint must be used.

    Dennis, i think what you’re seeing is THAT EE will act like it does with filtered indexes in 2008, and consider the indexes for use in the plans (which only happens if you don’t specify noexpand).

  5. Regarding this statement…

    “Considering that this stored procedure gets called thousands and thousands of times a day, that’s a HUGE savings.”

    States don’t change very frequently. Any thought to caching the state list in the application layer and save those 2 reads per 1k+ calls? :)

    • I keep making that argument, but nobody listens to me :)

      Also, it is entirely possible, given the way this application is built, for a specific user to have access to 50 states and have that change to 45 states pretty quickly due to changes in security restrictions and also the constant ETL process from the mainframe.

      The other issue is that we’d need to cache the state list for each user, because of the security in the app, so it might create some problems with memory use on the web server.


One Trackback Trackbacks are closed.

This site is protected with Urban Giraffe's plugin 'HTML Purified' and Edward Z. Yang's Powered by HTML Purifier. 531 items have been purified.