The people who follow me on twitter have probably noticed that when I’m not tweeting absolute nonsense, I’m tweeting about performance tuning SQL. What most people don’t know is that I don’t have sysadmin access on our servers, nor do I have any of the other permissions that would let me collect trace information or access the dynamic management views. I’ll be honest, at first I thought this would be an insurmountable task – how could I possibly tune queries without access to the tools that I have typically used to tune queries? Eventually, I realized that I always had the knowledge that the tools provided, so I got to work putting this knowledge to work for me.
What the hell? Queries aren’t fruit salad! You’re absolutely right, but it’s still possible to do a bit of genetic programming when working on a query. Here’s how this process works for me:
Building the core query is where genetic programming comes into play. As I write a query, I’ll start with a core table and slowly add in additional tables and predicates one at a time until I’ve built my query. The whole time I’m doing this, I keep telling SSMS to return the actual execution plan. Normally, if everything runs quickly, I give the execution plan a quick once over with each step to look for glaring trouble spots that need attention before I move on. As I move through the query, I don’t add any aggregation, I just keep adding tables, predicates, and output columns. This lets see where I need to group, verify the data I’m getting, and actually see the raw data before I start aggregating values. This last part is extremely helpful when you’re dealing with CTEs and aggregates.
How does this help me tune the query? Whenever there is a slow down because of a bad join or missing index it’s immediately visible and I can take action. Remember how I said I always return the actual execution plan? There’s a reason for that. I can open up the execution plan and quickly determine what is causing the slow down. If a query goes from taking about a second to 30 seconds, I know that I’ve either done something incredibly wrong or else I’ve missed something in a join or where clause.
Whenever I start writing a query, I start with the table and conditions that will filter down results the fastest. This comes from knowing how SQL Server builds query results – I suggest taking a look at Itzik Ben-Gan’s fantastic book Inside Microsoft SQL Server 2005: T-SQL Querying for an introduction to this topic. Basically, I find the conditions that will be the most restrictive and build my query out from there. I work with a range of data that covers 7 years and the detail tables contain 40,000,000+ rows. This isn’t a lot in the grand scheme of things, but when seconds matter, it’s important to have the smallest possible number of records to join on before you get to the monster table.
When all else fails, or when the execution plan is just plain bad, I look at the indexes being used and I compare them to the possible indexes on the table. There are covering indexes in place for most common queries on any single table. When it makes sense, I force the engine to use specific indexes through judicious application of index hints. What I don’t do is feed my queries into the Database Tuning Advisor and apply the indexes it suggests. I do use the DTA tools to look for indexes that I might have missed, but I create indexes myself and occasionally, carefully, combine its suggestions with my own to create better covering indexes.
When working almost exclusively with dynamic SQL, it’s important to provide index hints since the number of possible execution plans is staggering and SQL Server will not be able to hold them all in memory. It’s better (and cheaper) to front load the brainpower during query development than to find out that you have poorly performing queries and have to rewrite them on the fly while customers are calling up and complaining about poor performance.
Parameter sniffing is when SQL Server caches the values of variables passed into the query for later re-use. The problem is that these might be infrequently used values or the values might always be different. When you have multiple clients accessing the database and @ClientId is a parameter for every query, you don’t want that value to be cached. Instead, I immediately copy stored procedure parameters to a second set of internal parameters. Unfortunately, I lose the advantage of caching if there are common queries that will always use the same or similar parameters, so I’m careful about how I do this. When it makes sense, I don’t copy parameter values.
One of the easiest places to lose performance is by writing poorly performing where clauses. Essentially, make sure the columns in your where conditions are always using an indexed column and that you are using the columns in a way that will make use of the index – WHERE DATEPART(yyyy, my_col) > 2006 isn’t going to cut it. Denis Gobo has a great write up on this topic over atLess Than Dot.
There are a lot of great tools out there, both within SQL Server and from third party vendors, that make this process easier. It’s also important to realize that these are skills that we all have and can put to use before resorting to tools to get the last 10 or 20% of performance tuning.
Here are two related articles about security within SQL Server.