Nov 21
Live Blogging PASS Keynote (#sqlpass)
Liveblogging the PASS Summit 2008 final day keynote. Refresh this page for news. Or, better yet, visit Brent Ozar’s coverage for additional info.
10:04 Parallel optimization is hard. Very hard. There’s a lot going on when the data is distributed across multiple nodes. Gray Systems Lab is working with DATAllegro to solve these problems. There are a great number of challenges that are up ahead. Big things are coming (har har har).
10:02 Partition skew is a concern when fragments/nodes don’t end up containing the same number of rows. How does this get solved? You can use range partitioning or you can change the hash function you’re using to partition the table.
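Here’s a rough sketch of what skew looks like, not anything shown in the keynote. The data is made up: 1,000 orders where one customer ("bob") owns 900 of them. Hashing on the customer column piles Bob’s rows onto one node, while range partitioning on order_id (with hand-picked boundaries, also an assumption) spreads them evenly. All function and column names are hypothetical.

```python
from bisect import bisect_right
import zlib

def hash_node(key, num_nodes):
    # crc32 gives the same answer on every run, unlike Python's built-in hash() for strings.
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes

def range_node(value, boundaries):
    # boundaries are sorted split points; values below the first go to node 0, and so on.
    return bisect_right(boundaries, value)

def rows_per_node(node_ids, num_nodes):
    counts = [0] * num_nodes
    for node_id in node_ids:
        counts[node_id] += 1
    return counts

NUM_NODES = 4
# Hypothetical data: 9 out of every 10 orders belong to "bob".
orders = [
    {"order_id": i, "customer": "bob" if i % 10 else f"cust_{i}"}
    for i in range(1000)
]

# Hash on customer: every one of Bob's 900 rows lands on the same node.
by_customer = rows_per_node(
    (hash_node(o["customer"], NUM_NODES) for o in orders), NUM_NODES
)

# Range partition on order_id: each node gets an equal slice of 250 rows.
by_range = rows_per_node(
    (range_node(o["order_id"], [250, 500, 750]) for o in orders), NUM_NODES
)

print("hash on customer:", by_customer)
print("range on order_id:", by_range)
```

The trade-off is that the range-partitioned layout no longer keeps all of one customer’s orders together.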
9:57 Table repartitioning makes it possible to shuffle rows around so that all of Bob’s order rows are on the same server as Bob’s customer data rows. Joins can happen locally once you do this, even though you have a giant lump of data spread across a huge number of nodes.
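A minimal sketch of the repartitioning idea, assuming two toy tables held as lists of dicts. This is my own illustration, not DATAllegro’s implementation, and the names (node_for, repartition, local_join, the column names) are all made up. Orders start out partitioned on order_id, get shuffled so they are partitioned on customer_id like the customers table, and then each node joins only its own rows.

```python
import zlib

def node_for(key, num_nodes):
    # Deterministic hash so the same key always maps to the same node.
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes

def partition(rows, key_column, num_nodes):
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[key_column], num_nodes)].append(row)
    return nodes

def repartition(nodes, key_column):
    # Shuffle: every node sends each of its rows to the node that owns the new key.
    shuffled = [[] for _ in range(len(nodes))]
    for node in nodes:
        for row in node:
            shuffled[node_for(row[key_column], len(nodes))].append(row)
    return shuffled

def local_join(customer_nodes, order_nodes):
    # Each node joins only the rows it holds; no cross-node lookups.
    results = []
    for customers, orders in zip(customer_nodes, order_nodes):
        by_id = {c["customer_id"]: c for c in customers}
        for order in orders:
            customer = by_id.get(order["customer_id"])
            if customer is not None:
                results.append((customer["name"], order["order_id"]))
    return results

NUM_NODES = 3
customers = [{"customer_id": i, "name": f"customer_{i}"} for i in range(5)]
orders = [{"order_id": 100 + i, "customer_id": i % 5} for i in range(20)]

customer_nodes = partition(customers, "customer_id", NUM_NODES)
order_nodes = partition(orders, "order_id", NUM_NODES)    # wrong key for this join
order_nodes = repartition(order_nodes, "customer_id")     # now co-located with customers

joined = local_join(customer_nodes, order_nodes)
print(len(joined), "joined rows")
```

The shuffle itself costs network traffic, which is presumably part of why optimizing this is hard.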
9:50 This is very interesting stuff and I would encourage anyone interested to get a hold of the video of the keynote. I’m trying to keep up with all of this and blogging is getting in the way. Blogging will resume when the subject changes.
9:35 He’s now explaining how this would all work in a real system, not just in theory. The magic is that the software makes this all transparent outside of the database. There are no indexes, sadly, but queries take less time because they are distributed. Brent Ozar has a good overview of what’s going on from an engine perspective. Check it out.
9:30 Hash partitioning explained now. This is great… he’s explaining how it works and what’s wrong with it.
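For anyone following along at home, here’s a minimal sketch of hash partitioning as I understand it: hash the partitioning column and take the result modulo the number of nodes. The function names and sample rows are my own, and a real engine would use its own hash function.

```python
import zlib

def node_for(key, num_nodes):
    # crc32 is stable across runs, so a given key always maps to the same node.
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes

def hash_partition(rows, key_column, num_nodes):
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[key_column], num_nodes)].append(row)
    return nodes

# Hypothetical sample data.
rows = [
    {"customer": "bob", "order_id": 1},
    {"customer": "alice", "order_id": 2},
    {"customer": "bob", "order_id": 3},
    {"customer": "carol", "order_id": 4},
]

nodes = hash_partition(rows, "customer", num_nodes=3)

# To find Bob's rows we only have to visit one node.
bobs_node = node_for("bob", 3)
print([r["order_id"] for r in nodes[bobs_node] if r["customer"] == "bob"])
```

Lookups on the partitioning column touch one node, but a query on any other column still has to visit them all.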
9:23 Horizontal partitioning is up now. This is some really really cool stuff… Round Robin partitioning is up now, also very cool. The problem is that you can’t tell where a row lives.
He’s showing all of this with animated slides. There’s very little to try to comprehend - he’s just showing it.
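A quick sketch of round robin partitioning and its downside, with made-up names and data: rows are dealt out to nodes like cards, so the load is even, but finding one customer’s rows means scanning every node.

```python
def round_robin_partition(rows, num_nodes):
    nodes = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        # Row i goes to node i mod N, regardless of what the row contains.
        nodes[i % num_nodes].append(row)
    return nodes

def find_customer(nodes, customer):
    # Nothing about the row tells us where it lives, so every node gets scanned.
    hits = []
    nodes_scanned = 0
    for node in nodes:
        nodes_scanned += 1
        hits.extend(r for r in node if r["customer"] == customer)
    return hits, nodes_scanned

# Hypothetical sample data: 10 orders spread over 3 customers.
rows = [{"order_id": i, "customer": f"cust_{i % 3}"} for i in range(10)]
nodes = round_robin_partition(rows, num_nodes=3)

hits, nodes_scanned = find_customer(nodes, "cust_0")
print([len(n) for n in nodes], len(hits), nodes_scanned)
```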
9:20 There’s a picture of a cluster of VAX machines. Oh, VAX.
9:16 Shared Memory (everything is shared in one machine) doesn’t scale up very well; the hardware doesn’t scale up very well at all.
Shared Disk is where nodes of commodity hardware each have their own memory but share common storage. There’s still limited scalability here, too.
Shared Nothing is where you have commodity hardware with dedicated disk and memory. Everything is connected via commodity networking hardware. This can scale as long as you have money to buy commodity hardware. This is how the big boys do it.
9:15 Apparently, eBay has two 2 petabyte systems and one 5 petabyte system. That’s a lot of data! He’s describing how the basic forms of scaling work.
9:10 We need to know about this because it’s the theory behind the new DATAllegro products that are coming out next year. The point of linear speed up and linear scale up is that you can add hardware resources incrementally (10% more data? Add 10% more resources).
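To put numbers on it, here’s a small sketch using the usual definitions from the parallel database literature (not copied from the slides): speed up compares the same job on a small and a big system, and scale up compares a small job on the small system against a proportionally bigger job on the bigger system. The function names, the 5% tolerance, and the timings are all made up.

```python
def speedup(old_seconds, new_seconds):
    # Same job, before and after adding hardware.
    return old_seconds / new_seconds

def scaleup(small_job_seconds, big_job_seconds):
    # Small job on the small system vs. a proportionally bigger job on the bigger system.
    # A value of 1.0 means the extra hardware fully absorbed the extra data.
    return small_job_seconds / big_job_seconds

def is_linear_speedup(old_nodes, new_nodes, old_seconds, new_seconds, tolerance=0.05):
    # Linear speed up: N times the hardware gives (about) N times the speed.
    ideal = new_nodes / old_nodes
    return abs(speedup(old_seconds, new_seconds) - ideal) <= tolerance * ideal

# Hypothetical timings: 4x the nodes, query drops from 100s to 25s.
print(speedup(100.0, 25.0))
print(is_linear_speedup(10, 40, 100.0, 25.0))

# Hypothetical timings: 10% more data on 10% more nodes, query still takes 100s.
print(scaleup(100.0, 100.0))
```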
9:06 David DeWitt, a Technical Fellow with Microsoft and Ph.D. holder, is coming on stage now for the last keynote. He gets Alice Cooper as welcome music. He’s going to be talking about parallel DBs for scalability.
9:00 Patrick is continuing to show different hardware that could be used and why you’d want to use it to meet your needs. This is a review of a white paper that’s available through Dell. Basically, add more servers to the query layer to meet load and distribute the data from the processing layer. Once you get more load in the processing layer… add more servers.
8:55 The first speaker is Patrick Otriz - Solutions Architect with Dell. What Dell doesn’t do is application development - Patrick’s job is to drive consistency around what Dell does - meet Service Level Agreements and establish business continuity plans. He’s describing the full stack and the problems that people will be facing at the hardware level.
8:48 SQL Heroes Contest winners are going to be announced. The contest was to create a project on CodePlex using SQL Server. There has been an effort to get community sample applications that run alongside the Microsoft sample databases - the SQL Heroes Contest. 60% of the submissions were from outside of the United States. Didn’t have time to type all of them out before the list was off the screen, hopefully the list will be published somewhere. (thanks to Adam Machanic, they’re Extended Events Manager, SSISUnit, CDC Helper, and QPee tools by Jason Massie!)
8:44 Bill Graziano came out riding on a tricycle. Early bird discount is $995 if you register before December 31st, act now! The summit will be in Seattle, Nov 3-6 in 2009.
PASS is looking for content either through videocasts (PASSTips) showing off new features or through technical articles.
Three new board members have been elected:
- Douglas McDowell
- Lynda Rab
- Andy Warren

November 21st, 2008 at 2:49 pm
[...] Jeremiah Peschka [...]