June 2010
Mon Tue Wed Thu Fri Sat Sun
« May   Jul »
 123456
78910111213
14151617181920
21222324252627
282930  

Month June 2010

SQL Saturday 42 Musings

SQL Saturday 42 has been and gone. I don’t have a ton to say, but I wanted to try to post my thoughts on the event before they were completely gone from my mind.

The people who put this event on – Dave Schutz, Stuart Johnson, Marc Kuyper, Jim Stoltz, and John Jakubowski – deserve a big round of applause. They put together a very strong event and it seemed like they were incredibly well prepared.

Despite Brent Ozar suffering from incredible jet lag, and another speaker not showing up, things went off well. Allen White and I were able to cover Brent’s time slots, but not his presentations. This led to the funny moment of an attendee walking into the room and saying “You’re not Brent Ozar.” I replied, “No, I am not, but I’m talking about SQL Server Internals, it’ll be fun.” They just said, “Oh,” and walked away. So, apparently, I’m nowhere near as engaging as Brent talking about his stupid and dangerous T-SQL tricks. Clearly they don’t know that I’m just stupid and dangerous.

Outside of that, things went very well. My Dynamic SQL talk went over well, although I think I need to re-work it. I felt like I was flipping back and forth between demos and code a bit too much. It gave the entire experience a jerky feel. I suspect I’m overly critical of the presentation since I’ve given it a few times, but it’s always good to improve.

My internals presentation wasn’t well attended – this is probably due to me not being Brent Ozar. However, that didn’t stop us from having fun talking about SQL Server Internals. If anything, having a room of 6 people made the topic more fun to talk about. We were able to digress onto different topics, delve a little bit deeper than normal, and have a lot of fun watching me not trip over an extension cord.

Last but not least was my Indexes and Other Free Performance Boosts. I had a full room on this – I suspect it had something to do with the word “Free” in the presentation title. This is a really fun presentation because it’s a whirlwind tour of indexes, keys, statistics, and how they work together to coax SQL Server into giving us data faster. Once again, I think I need to smooth up the code samples, but on the whole things went really well.

I don’t have any of the evaluations, but I hope the SQL Saturday team is able to get those out to the speakers soon. From the ones I glanced at, I did a passable job. I’m looking forward to the next event in Columbus.

Free Training – SQL Saturday 42

Good news! I’m speaking at SQL Saturday 42 this Saturday. Got nothing to do? Head on down to Goodwill Columbus at 1331 Edgehill Rd, Columbus, OH. Got something to do? Cancel it.

I’m excited about the presentations I’m giving – I haven’t given the indexing presentation in a long time and it should be a lot of fun. And the Dynamic SQL presentation is one of the first in the day. It’s a nice easy way (I think) to get your day started.

A Dynamic World Demands Dynamic SQL

Dynamic SQL is a misunderstood and much maligned part of a DBA’s tool kit – it can be used to solve difficult business problems, respond to diverse data needs, and alleviate performance problems. Many DBAs reject dynamic SQL outright as a potential source of SQL injections, being poorly performing, or just for being a hacky solution in general. Not so! Jeremiah Peschka has been making extensive use of dynamic SQL throughout his career to solve a variety of problems. He’ll set about dispelling these misconceptions and demonstrate how dynamic SQL can become a part of every DBA’s tool kit.

Indexes And Other Free Performance Boosts

The database is often viewed as a major performance bottleneck. There are a number of quick, easy, painless techniques that can increase the performance of an application not just by a small amount, but by orders of magnitude. These techniques includes simple indexing techniques, T-SQL techniques, and general database application design patterns that give great gains in performance. In this session, you will learn how to look at a database to identify these problem areas and how to resolve common issues that you will encounter.

What else?

Let’s say you’re interested in something else. What should you go see? Well, Michael Swart (blog | twitter) put together a nice little blog post about How I plan to spend my weekend.

If you like business intelligence, I suggest you hit up Dave Rodabaugh’s presentations. I cannot speak highly enough of Dave’s work. Not only is he one of the brightest BI people I know, he’s also been a teacher, friend, and mentor to me for a long time.

There will be some kind of dinner/drinks/whatever going on afterwards at Barley’s Smokehouse (map). I plan on being there for a little bit. Even if you can’t make it to the event, head on over there around 6:00PM. I’ll be there.

Shrink, Damn’d Log! Shrink, I Say!

Ever have a database log file grow to epic proportions on the weekend? I have. Ever forget to set up job monitoring on that server? I have. Ever get so pissed off that you decided to write a job to automatically shrink the log files whenever they grow? I have.

T-SQL Tuesday – “Did he just say that?” edition

As in, I didn’t participate in the most recent T-SQL Tuesday about my favorite feature in SQL 2008 R2.

Want to know my favorite 2008R2 features? PostgreSQL 9.0 and MongoDB.

PostgreSQL and MongoDB are rapidly advancing features that solve my daily problems. I’m an OLTP guy. Honestly, I don’t care about the latest reporting ding dongs and doo dads. I already know PowerShell. I manage 6 production servers and we’re unlikely to grow, so these MDM and Utility Control Points don’t make me giddy with excitement.

I solve problems using SQL.

You know what makes me happy? Support for window functions or better yet, improved support for window functions. What about exclusion constraints? Or column level triggers so I don’t have to use branching logic in my triggers? Yes, I use triggers. Or any other of these features.

What about MongoDB? I’ve just started playing with it, but it solves a lot of the problems I face at work. Not just in the day job, but in side projects as well. I’ve bitched about O/R-Ms before, but one of the biggest problems that users of O/R-Ms (developers) face is that the ideal way to model data for object-oriented programming bears no resemblance to the ideal way to store relational data. A recent article about scaling Farmville hints at this – the developers of Farmville managed scale by storing everything in a key-value store (memcached) before persisting to a relational data store later. Digg does something similar with Cassandra. It’s not like these guys are idiots, the blog posts from Digg show that they know their stuff.

MongoDB lets me solve these problems. I can control the frequency of syncs to disk (just as I can in PostgreSQL) to improve raw write performance to memory. I only have to worry about storing data the way my application expects to see the data – arrays and hashes – without worrying about building many-to-many join tables.

What about DMVs and data integrity and a write-ahead log and indexes? MongoDB has instrumentation and indexes. Yeah, you sacrifice some durability but many applications don’t need that. Hell, Amazon has even designed their systems to account for the potential of failure.

When I start my next application, I’m going to look long and hard at the platform I’m building on. There’s a good chance it’s not going to be a relational database and a really good chance it’s not going to be using SQL Server. It’s not because I have a problem with SQL Server or even RDBMSes, but because there are other choices that give me the flexibility I need to solve the problems I’m facing.

This is nothing against SQL Server 2008 R2, it’s a great step forward in the direction that SQL Server seems to be going. Sometimes I wonder if SQL Server and I are on the same road.

This Is How We Handle Problems

I had a production issue tonight. Still am, actually. I’ve admitted to it and here’s the email I’m sending to my management.

At 9:00PM I took a backup of database_a and database_b prior to running the database migration scripts. Once the backups were finished, I began the migration process at approximately 9:20.

I stopped the migration process at 10:15 after multiple failures and restarts. There are too many unknown cross-dependencies to go on with the roll forward. At this time I called SUPPORT PERSON and explained the situation. I also called MANAGEMENT and left a voice mail. I then began the process of restoring the production databases on SERVER_A.

No changes were made to SERVER_FIGHTING_MONGOOSE or SERVER_C.

Once I had restored database_b and database_a on SERVER_A, I began seeing multiple failures from replication and the rest of SQL Server indicating severe problems with the physical disk structure. I immediately stopped all replication involving database_a and database_b on SERVER_A and I have begun a physical drive integrity check using SQL Server’s built-in integrity check tool: DBCC CHECKDB. The CHECKDB for database_b finished at 11:15 with a clean bill of health. database_a is still running as of 11:31PM.

Once the CHECKDB process for database_a is complete, I will begin re-initialize the subscription for the database_a database on SERVER_A. Following the successful completion of the database_a re-initialization, I will begin the process of re-initializing the subscription to database_b.

If you have any questions, feel free to contact me at 867-5309.

See what I did there?

  • I stated how we got into this mess – I dropped a running chainsaw into the SAN.
  • I outlined my decision making process and took ownership of rolling back our production migration.
  • I described the situation after the migration was rolled back and provided an assessment based on what I had observed.
  • I outlined a course of action to mitigate our problems and restore our production database to an operational state as soon as possible.

Am I proud? Not really. I like it when things work. Am I tired and cranky? Yes.

Will I get this fixed before I go to bed? Hell yeah.

Is this something that I, in a sick way, live for? Only because it reminds me to keep studying and to stay on my toes.

How I Analyzed the PASS 2010 Summit Submissions

Want to know how I analyzed the Summit session data? I exported the data from the abstract selection system into a tab delimited file. Since I use a Mac at home, I used the tools available to me: PostgreSQL.

I loaded the data using the PostgreSQL COPY command to bulk load the report output and then did some data mojo using PostgreSQL. Most of it was exactly the same as it would be for SQL Server with a few differences. Here’s the code that I used:

SELECT  COUNT(*) AS total_sessions
FROM    abstracts;
GO
-- Yup, I used the GO statement separator. This is a byproduct
-- of the client that I use, not because of PostgreSQL.

SELECT  COUNT(DISTINCT first_name || ' ' || last_name) AS name_count
FROM    abstracts ;
GO

SELECT  job_title,
        COUNT(job_title) AS jt_count
FROM    ( SELECT  job_title,
                  first_name,
                  last_name
          FROM    abstracts
          GROUP BY job_title,
                  first_name,
                  last_name
) AS x
GROUP BY job_title
ORDER BY jt_count DESC;
GO

WITH c AS (
  SELECT  job_title,
          first_name,
          last_name
  FROM    abstracts
  GROUP BY job_title,
          first_name,
          last_name
)
SELECT  SUM(CASE WHEN LOWER(job_title) LIKE '%engineer%' THEN 1 ELSE 0 END) AS engineer_count,
        SUM(CASE WHEN LOWER(job_title) LIKE '%manager%' THEN 1 ELSE 0 END) AS manager_count,
        SUM(CASE WHEN LOWER(job_title) LIKE '%developer%' THEN 1 ELSE 0 END) AS developer_count,
        SUM(CASE WHEN LOWER(job_title) LIKE '%consultant%' THEN 1 ELSE 0 END) AS consultant_count,
        SUM(CASE WHEN LOWER(job_title) LIKE '%business intelligence%' THEN 1
                 WHEN LOWER(job_title) LIKE '%b.i.%' THEN 1
                 WHEN LOWER(job_title) LIKE '%BI%' THEN 1
                 ELSE 0 END) AS bi_count,
        SUM(CASE WHEN LOWER(job_title) LIKE '%architect%' THEN 1 ELSE 0 END) AS architect_count
FROM    c;
GO

SELECT  COUNT(DISTINCT company) AS company_count
FROM    abstracts;
GO

SELECT  DISTINCT
        category,
        track,
        COUNT(track) OVER (PARTITION BY category) AS the_count_by_category,
        COUNT(track) OVER (PARTITION BY category, track) AS the_count_by_track
FROM    abstracts
ORDER BY category, track;
GO

SELECT  AVG(author_count) AS author_average
FROM    ( SELECT  DISTINCT
                  first_name,
                  last_name,
                  COUNT(session_title) OVER (PARTITION BY first_name, last_name) AS author_count
          FROM   abstracts
        ) AS x;
GO

-- You might be weirded out by this string_to_array and array_upper.
-- PostgreSQL has an array data type. We're using an array of strings
-- to get an accurate word count
SELECT  AVG(CHAR_LENGTH(abstract)) AS avg_char_count,
        STDDEV(CHAR_LENGTH(abstract)) AS char_count_stddev,
        AVG(array_upper(string_to_array(abstract, ' '), 1)) AS avg_word_count,
        STDDEV(array_upper(string_to_array(abstract, ' '), 1)) AS word_count_stddev
FROM    abstracts;
GO

-- Hey, look at that, we can use a single TRIM function to do all work
SELECT  SUM(CASE WHEN TRIM(BOTH FROM presented_before) = 'YES' THEN 1 ELSE 0 END) AS presented_before,
        SUM(CASE WHEN TRIM(BOTH FROM presented_before) = 'NO' THEN 1 ELSE 0 END) AS never_presented
FROM    abstracts;
GO

SELECT  level,
        COUNT(*)
FROM    abstracts
GROUP BY level
ORDER BY level;
GO

Presenting for the PASS Professional Development Virtual Chapter

Wednesday, June 16th, I will be presenting for the PASS Professional Development Virtual Chapter.

The Live Meeting starts at 1PM EST and will last for around an hour. You can click on this glorious link to attend.

Title: Taking Control of Your Career

Abstract: Raises, promotions, and job offers don’t happen by accident; you need a plan. Through careful planning you can create and reach impressive goals. But what’s the point of reaching your goals if nobody notices? “If you build it, they will come” doesn’t apply when you’re building your career.

In this session I will show you:

  • How to set achievable goals
  • The importance of planning your career
  • Methods for recording and communicating your accomplishments

What the heck are you people talking about?

The call for speakers is closed and the selection committees are about to begin reviewing the abstracts. Before they get started, here are a few little pieces of trivia about the sessions.

Keep in mind that I have done minimal cleansing on this data so it’s going to be a bit messy and unreliable for anything apart from entertainment purposes.

Raw Numbers

There were 574 sessions submitted this year by 209 speakers working for a total of 163 separate companies.

Out of a total of 127 job titles, the most common one was “Consultant”. Or, it would be until you take into account the different variations of DBA, Database Administrator, etc. Taking that into account, 20 of you are some kind of DBA.

If you want to get a bit fuzzier about the numbers:

  • 17 of you are an engineer and are probably measuring something with calipers right now
  • 12 of you identify as a manager
  • 12 of you think you’re a developer
  • 27 of you claim to be a consultant
  • 11 of you work in B.I.
  • 32 of you claim to be architects. You draw squares and arrows to represent work.

All of you work for a total of 163 different companies. Unless some of you can’t spell your company’s name right. Then you may work for fewer, I’m not sure. The data, as I said, is dirty.

The Submissions

The average length of a session abstract was 617.7 characters or 99.19 words with a standard deviation of 243 characters and 39 words. Out of all the presentations, 187 have never been presented before. They may very well be 100% fresh in November when you arrive.

Here’s the distribution between the different levels:

  • 100 Level (Novice) – 109 sessions
  • 200 Level (Intermediate) – 239 sessions
  • 300 Level (Advanced) – 178 sessions
  • 400 Level (Expert) – 44 sessions
  • 500 Level (Advanced Expert) – 2 sessions

Pre/Post Conference

42 total pre/post conference sessions were submitted. That’s a lot of competition for a few slots.

  • 14 were B.I. (9 for architecture and 5 for reporting/delivery)
  • 13 relate to DBA work
  • 12 relate to application development
  • 3 relate to professional development

The only conclusion I can draw is that very few of us are prepared to talk about our careers for a single work day. Are you playing Legend of Zelda at work all day? What’s up with that, guys? People making a living talking about this kind of stuff.

Spotlight Sessions

There were 58 spotlight sessions submitted and I’m afraid to say that your day jobs faired even worse in this section.

  • 20 were DBA related topics
  • 15 were development topics (this better not be all IronPerl … I mean PowerShell)
  • 13 were somehow related to B.I. architectural whatnot
  • 5 were related to B.I. reporting and delivery. Apparently these guys are too busy making money to talk at the PASS Summit for 8 hours.
  • 5 of you thought that you could talk about your job for 90 minutes. Good for you. That’s long enough to get to the good parts in Krull or watch any number of amazing kung fu movies.

The distribution is starting to skew a bit here. If we combined the BI tracks it would look a lot more like the pre/post conference situation.

Community Sessions

Here is where you guys clearly shine. You submitted a whopping 472 sessions. That’s an average of 2.25 per person. While not bad, you could have submitted 4. I’m just saying that maybe next year you should pick it up a little bit.

  • 137 application development topics. We’re number 1, we’re number 1
  • 135 DBA topics put this in at a close second.
  • 113 BI architecture topics
  • 48 Professional development (you can talk for 75 minutes about your job, but not 90?)
  • 39 B.I. reporting topics. I think these people really must have jobs to do, because that’s not a lot.

The Missing Speakers

Allen Kinsel (twitter | blog) and I have been fielding a draft email back and forth this afternoon. Rather than bore you with the backstory, here’s the body of the email:

After we heard from a few Microsoft employees who work outside of the normal speaking groups, we decided to do something about it. We’re asking you, and anyone else you know who falls into this category, to submit your sessions during the community call for speakers. This is our way of making sure that you don’t slip through the gap. Please pass this on to any of your colleagues who may find themselves in the same situation.

One last thing: The Call for Speakers ends Saturday, so you will have to hurry.

http://sqlpass.eventpoint.com/cft/

Why would we do this? Microsoft have their own call for speakers, right? They do, but it’s a Microsoft internal process – we can only make suggestions about the final speaker selections.

Here’s the deal – there are some speakers out there that we feel are an important part of the community, speakers who are outside of the normal Microsoft speaker track. They aren’t on the product development team, they aren’t architects, they aren’t general managers at Microsoft. These are the DBAs who keep Microsoft’s databases running, CSS/PSS, and other DBAs out there in the trenches. Most importantly, they’re also members of our community.

You probably know some of them – they’ve spoken at the PASS Summit, at SQL Saturdays, and at user groups. They’re highly rated speakers and they’re valuable members of the community.

If you know someone who falls into this group of people, pass this message along. We are working to make sure that this year’s PASS Summit is the best one that we’ve had and we know these people will make it even better.

Thank you.

Push Ups and String Concatenation

SQL sucks at string manipulation. It’s not just SQL Server, it’s the SQL language in general. I’ve mentioned this before on Stack Overflow in answer to the question “What are five things you hate about your favorite language?” It’s almost easier to return the raw rows to application code than it is to work with them inside of a database. There’s one place where SQL Server has made it easier to work with string data – string building.

This site is protected with Urban Giraffe's plugin 'HTML Purified' and Edward Z. Yang's Powered by HTML Purifier. 401 items have been purified.