Less reading is a good thing when you’re reading from a physical disk.
During some recent performance tuning, I stumbled across a little gem of a stored procedure that was producing between 4,000 and 11,000 physical reads every time it was called. The name? GetStateList.
It returns a list of state abbreviations. I’m not proud to say that I wrote it, but I have to come clean: I wrote it.
So, how did I fix this travesty of my youthful ignorance?
Well, it’s just a list of states, right? And it’s just all of the states from a couple of different tables. So, the easiest thing to do was to create an indexed, or materialized, view. Basically an indexed view is a view with a clustered index. Since a clustered index defines the physical order of data on the disk, the view is persisted to disk. When the underlying data changes, the data that is persisted to disk will change. Plus, you can supply other indexes on the view, just like it’s a table, which can make your queries run even faster.
Imagine, if you will, that you have a Location table full of data, as opposed to the empty one that we’re about to create.
CREATE TABLE Locations ( LocationID INT NOT NULL IDENTITY(1, 1) PRIMARY KEY, ClientID INT NOT NULL, Name VARCHAR(34) NOT NULL, State VARCHAR(2) NOT NULL );
So, we have a lot of locations. Each location is belongs to a client. Each location has a state. I want a list of allowed states for each client.
SELECT l.ClientID, l.[State] FROM Locations AS l GROUP BY l.ClientId, l.[State]
Well, that’s great, but that’s also the query that results in WAY too many reads. So, how do we go about fixing this situation?
The indexed view that I mentioned earlier, that’s how!
CREATE VIEW dbo.LocationStates WITH SCHEMABINDING -- you need this, can't very well persist changes if the table changes AS SELECT l.ClientID, l.[State], COUNT_BIG(*) AS the_count FROM dbo.Locations AS l GROUP BY l.ClientId, l.[State] GO CREATE UNIQUE CLUSTERED INDEX CIX_Locations_ClientID_State ON LocationStates (ClientID, State); GO
What’s up with that COUNT_BIG(*) in there? Well, any time you use an aggregate function, or aggregation, in an indexed view, you also need to make use of COUNT_BIG(*) (see the MSDN Link above for more info).
Once I created the indexed view, I changed my queries to use the indexed view:
SELECT State FROM dbo.LocationStates AS x WITH(NOEXPAND) WHERE ClientId = @client_id;
The WITH(NOEXPAND) hint forces SQL Server to use the indexed view. As I understand it, this is not necessary on Enterprise Edition. Since I’m on Standard Edition I cannot verify this, but I can verify that SQL Server would not use the indexed view until I supplied the hint.
What was the outcome of all of this? Well, instead of several thousand reads, this ended up taking 2 reads. Considering that this stored procedure gets called thousands and thousands of times a day, that’s a HUGE savings.
The downside to using an indexed view is that if the underlying data changes, the persisted view data has to be updated, so it’s best to avoid using indexed views on queries that change constantly.