Ten Reasons PostgreSQL is Better Than SQL Server

Why would anyone want to use PostgreSQL instead of SQL Server? There are a lot of factors to consider when choosing how to store your data. Sometimes we need to look deeper than the standard choice and consider something new. If you’re starting a brand new project, where should you store your data? Here are ten reasons why you might want to consider PostgreSQL over SQL Server.

Releases Every Year

Let’s face it, waiting three to five years for new functionality to roll out in any product is painful. I don’t want to constantly be learning new functionality, but on the flip side I don’t want to write hack solutions to critical business problems because a feature I know is coming down the pipe is still a few years away. A rapid release cycle means the PostgreSQL development team can quickly ship the features that users need and make frequent improvements.

Starting with version 9.0, PostgreSQL switched to a yearly release cycle. Before that, PostgreSQL released whenever the features were done. Even so, the list of major releases on Wikipedia shows that major versions rolled out about once every 18 months. An 18 month release cycle isn’t bad for any software product, much less a mission critical one like a database.

True Serialization

Snapshot isolation guarantees that all reads in a transaction see a consistent snapshot of data. In addition, a transaction should only commit if the ways that it changes data don’t conflict with other changes made since the snapshot was taken. Unfortunately, snapshots allow anomalies to exist. It’s possible to create a situation where two valid transactions occur that leave the database in an inconsistent state – the database doesn’t pass its own rules for data integrity.

Serializable snapshot isolation was added to PostgreSQL in version 9.1. SSI emulates strict serial execution – transactions behave as if they are executing one after another. If there is a conflict, or even a potential conflict, the database engine throws an error back to the caller (who is left to figure out the appropriate next step).

Serializable snapshot isolation sounds painful. The kicker is that it lets the database guarantee a stronger level of consistency than snapshot isolation alone. Applications can be developed to assume that any data modification may fail and to retry failed transactions. The true benefit is that well written software can avoid data inconsistencies and maintain the illusion that all is operating as it should be.
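To make the retry idea concrete, here is a minimal sketch of a serializable transaction. The accounts table and its columns are hypothetical and exist only for illustration; the isolation level syntax and SQLSTATE 40001 are standard PostgreSQL.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE accounts (
    account_id integer PRIMARY KEY,
    balance    numeric(12,2) NOT NULL
);

INSERT INTO accounts (account_id, balance)
VALUES (1, 500.00), (2, 250.00);

-- Run the whole unit of work under serializable isolation.
BEGIN ISOLATION LEVEL SERIALIZABLE;

UPDATE accounts SET balance = balance - 100.00 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100.00 WHERE account_id = 2;

COMMIT;

-- If any statement (including COMMIT) fails with SQLSTATE 40001
-- (serialization_failure), the application should issue ROLLBACK
-- and run the whole transaction again from BEGIN.
```

The retry loop itself lives in application code; the database only reports the conflict.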

Sane Defaults, Ridiculous Tuning

Okay, to be fair PostgreSQL ships with some ridiculously conservative shared memory settings. Most other PostgreSQL settings are conservative, but general enough for most generic workloads. Many people deploying PostgreSQL will not have to make many changes to PostgreSQL (probably just increasing shared_buffers to 25% of total RAM to start).

Once a PostgreSQL installation is up and running, there are a number of settings that can be changed. The best part, though, is that most of these settings can be changed at the server, database, user, or even individual query level. It’s very common to have mixed workload servers – most activity on the server is basic CRUD, but a small percentage of the activity is reporting that needs to be aggressively tuned. Instead of moving the individual reports somewhere separate (separate servers, separate databases, or even separate resource pools in the same database), we can simply tune a few queries to use the appropriate parameters, including the memory to allocate for sorting and joins.
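As a rough sketch of what those levels look like in practice, the statements below set work_mem (the per-operation memory for sorts and hashes) at the database, role, and single-transaction level. The reporting database, the report_user role, and the orders table are hypothetical names assumed to exist already; the memory values are placeholders, not recommendations.

```sql
-- Database-wide default for a hypothetical reporting database.
ALTER DATABASE reporting SET work_mem = '32MB';

-- Default for every new session opened by a hypothetical reporting role.
ALTER ROLE report_user SET work_mem = '64MB';

-- One-off override for a single expensive query.
BEGIN;
SET LOCAL work_mem = '256MB';   -- reverts automatically at COMMIT or ROLLBACK

SELECT customer_id, sum(order_total) AS lifetime_total
FROM orders                      -- hypothetical table
GROUP BY customer_id
ORDER BY lifetime_total DESC;

COMMIT;
```

The server-wide default still lives in postgresql.conf; the statements above only layer overrides on top of it.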

Unlogged Tables

Are you sick of trying to get minimally logged bulk inserts to work? Me too. Instead of trying various mechanisms to minimally log some tables, PostgreSQL gives us the option of creating an unlogged table – simply add the UNLOGGED directive to a create table statement and everything is ready to go.

Unlogged tables bypass the write ahead log; they aren’t crash safe, but they’re stupid fast. Data in an unlogged table will be truncated after the server crashes or there is an unclean shutdown, otherwise it’ll still be there. They’re also excluded from replication to a standby server. This makes unlogged tables ideal for ETL or other data manipulation processes that can easily be repeated using source data.
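A minimal sketch of an ETL staging step built on an unlogged table might look like this. The table names, columns, and file path are all hypothetical, and the COPY statement assumes a CSV file that the database server itself can read.

```sql
-- Staging table that skips the write ahead log.
CREATE UNLOGGED TABLE staging_orders (
    order_id    bigint,
    customer_id bigint,
    order_total numeric(12,2)
);

-- Bulk load from a hypothetical server-side file.
COPY staging_orders FROM '/tmp/orders.csv' WITH (FORMAT csv, HEADER true);

-- Clean up in staging, where losing the data in a crash is acceptable.
DELETE FROM staging_orders WHERE order_total IS NULL;

-- Move the cleaned rows into a regular, fully logged table.
CREATE TABLE clean_orders AS
SELECT order_id, customer_id, order_total
FROM staging_orders;
```

If the server crashes partway through, staging_orders comes back empty and the load is simply repeated from the source file.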

KNN for Geospatial… and More

Yeah, I hear ya, SQL Server will have this soon, but PostgreSQL already has it. If K Nearest Neighbor searches are critical for your business, you’ve already gone through some pain trying to get this working in your RDBMS. Or you’ve given up and implemented the solution elsewhere. I can’t blame you for that – geospatial querying is nice, but not having KNN features is a killer.

PostgreSQL’s KNN querying works on specific types of indexes (there are a lot of index types in PostgreSQL). Not only can you use KNN querying to find the 5 nearest Dairy Queens, but you can also use a KNN search on other data types. It’s completely possible to perform a KNN search and find the 10 phrases that are closest to “ice cream”.

KNN search capability makes PostgreSQL a serious contender for anyone looking at implementing geospatial querying. The additional flexibility puts PostgreSQL in a leadership position for many other kinds of search driven applications.
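Here is a small sketch of both kinds of KNN search using the distance operator with GiST indexes. The restaurants and phrases tables are hypothetical, the coordinates are an arbitrary example, and the text search assumes the pg_trgm contrib module is available on the server.

```sql
-- Geospatial: the 5 locations nearest to a given point.
CREATE TABLE restaurants (
    restaurant_id serial PRIMARY KEY,
    name          text  NOT NULL,
    location      point NOT NULL
);

CREATE INDEX restaurants_location_idx
    ON restaurants USING gist (location);

SELECT name
FROM restaurants
ORDER BY location <-> point '(-87.63, 41.88)'
LIMIT 5;

-- Text: the 10 phrases most similar to 'ice cream', by trigram distance.
CREATE EXTENSION pg_trgm;

CREATE TABLE phrases (
    phrase text NOT NULL
);

CREATE INDEX phrases_trgm_idx
    ON phrases USING gist (phrase gist_trgm_ops);

SELECT phrase
FROM phrases
ORDER BY phrase <-> 'ice cream'
LIMIT 10;
```

In both queries the ORDER BY on the distance operator combined with LIMIT is what lets the planner walk the index in nearest-first order instead of sorting the whole table.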

Transaction-Controlled Synchronous Replication

One of the easiest ways to keep another copy of your database is to use some kind of database replication. SQL Server DBAs will largely be used to transactional replication – a dedicated agent reads the SQL Server log, collects outstanding commands, and then ships them over to the subscriber where they are applied.

PostgreSQL’s built-in replication is closer to SQL Server’s mirroring than SQL Server’s replication (PostgreSQL’s replication has a readable standby). Log activity is hardened on the primary and then streamed to the secondary. This can either happen synchronously or asynchronously. Up until PostgreSQL 9.1, replication was an all or nothing affair – every transaction was either synchronous or asynchronous. As of 9.1, developers can choose the behavior for a specific transaction by setting the synchronous_commit configuration value for that single transaction. This is important because it makes it possible to write copious amounts of data to logging tables for debugging purposes without taking the performance hit of synchronously committing writes to the log tables.
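As a sketch of the per-transaction control described above: in the released PostgreSQL 9.1 the relevant setting is synchronous_commit, and the example assumes a synchronous standby has already been configured on the server. The debug_log table is hypothetical.

```sql
-- Hypothetical logging table.
CREATE TABLE debug_log (
    logged_at timestamptz NOT NULL DEFAULT now(),
    message   text        NOT NULL
);

BEGIN;
-- Wait only for the local WAL flush; do not wait for the standby
-- to confirm this particular transaction.
SET LOCAL synchronous_commit TO local;

INSERT INTO debug_log (message)
VALUES ('entering nightly import');

COMMIT;
```

Transactions that leave the setting alone keep waiting for the standby, so the important writes stay synchronous while the noisy ones do not.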

Any time we have more choice in how we develop our applications, I’m happy.

Writeable CTEs

CTEs are great for reads, but if I need to do something more complex with them, there are other issues involved. An example is going to make this much easier. Let’s say I want to delete stale data, but I want to store it in an archive table. To do this with SQL Server, the easiest route (from a development standpoint) is going to be to elevate my isolation level to at least snapshot, if not serializable, and use isolation levels to guarantee that no data will be changed. I could also load the PK value of the comments to be deleted into a temp table and reference that multiple times.

Both methods work, but both methods have problems. The first requires that the code be run in a specific isolation level. This relies on specific settings to be in place that may not be available. The code could also be copied out of the procedure and run in SSMS, leading to potential anomalies where a few rows are deleted but not archived. That’s no big deal for spam comments, but it could be critical in other situations. The second method isn’t necessarily bad, there’s nothing wrong with it, but it involves extra code noise. That temporary table isn’t necessary to solve our problem and is a byproduct of dealing with different isolation levels.

PostgreSQL has a different way to solve this problem: writeable CTEs. The CTE is constructed the same way it would be constructed in T-SQL. The difference is that when we’re using PostgreSQL, the data can be modified inside the CTE. The output is then used just like the output of any other CTE:

CREATE TABLE old_text_data (text_data text); 

WITH deleted_comments AS (
  DELETE FROM comments
  WHERE comment_text LIKE '%spam%'
  RETURNING comment_id, email_address, created_at, comment_text
)
INSERT INTO spam_comments
SELECT *
FROM deleted_comments;

This can be combined with default values, triggers, or any other data modification to build very rich ETL chains. Under the covers it may be doing the same things that we’ve outlined from SQL Server, but the conciseness is beneficial.

Extensions

Ever want to add some functionality to SQL Server? What about keeping that functionality up to date? This can be a huge problem for DBAs. It’s very easy to skip a server when you roll out new administrative scripts across your production environment. Furthermore, how do you even know which version you have installed?

The PostgreSQL Extension Network is a centralized repository for extra functionality. It’s a trusted source for open source PostgreSQL libraries – no sneaky binaries are allowed. Plus, everything in PGXN is versioned. When updating PGXN provided functionality, the extension takes care of the update path for you – it knows how to make sure it’s up to date.

There are extensions for everything from K-means clustering and Oracle compatibility functions to remote queries against Amazon S3.

Pushing this functionality out into extensions makes it easy for developers and DBAs to build custom packages that look and act like core functionality of PostgreSQL without trying to get the package through the PostgreSQL release process. These packages can then be developed independently, advance at their own rate, and provide complex functionality that may not fit within the release plan of the PostgreSQL core team. In short, there’s a healthy ecosystem of software being built around PostgreSQL.
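PGXN handles distribution; once an extension’s files are on the server, the extension commands added in PostgreSQL 9.1 take care of installing, versioning, and upgrading. The sketch below uses pg_trgm purely as an example extension and assumes it is available on the server.

```sql
-- Install an extension into the current database.
CREATE EXTENSION pg_trgm;

-- Which extensions are installed here, and at which version?
SELECT extname, extversion
FROM pg_extension
ORDER BY extname;

-- Which extensions are available, and is a newer version on disk?
SELECT name, default_version, installed_version
FROM pg_available_extensions
ORDER BY name;

-- Upgrade to the newest version whose files are installed on the server.
ALTER EXTENSION pg_trgm UPDATE;
```

The catalog queries answer the “which version do I have installed?” question directly, per database.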

Rich Temporal Data Types

One of my favorite features of PostgreSQL is the rich support for temporal data types. Sure, SQL Server 2008 finally brought some sophistication to SQL Server’s support for temporal data, but it’s still a pretty barren landscape. Strong support for temporal data is critical in many industries and, unfortunately, there’s a lot of work that goes on in SQL Server to work around the limitations of SQL Server’s support for temporal data.

PostgreSQL brings intelligent handling of time zones. In addition to supporting the ISO 8601 standard (1999-01-08 04:05:06 -8:00), PostgreSQL supports identifying the time zone by an abbreviation (PST) or by specifying a location identifier (America/Tijuana). Abbreviations are treated as a fixed offset from UTC, while location identifiers change with daylight saving rules.

On top of time zone flexibility, PostgreSQL has an interval data type. The interval data type is capable of storing an interval of up to 178,000,000 years with precision out to 14 digits. Intervals can measure time at a number of precisions from as broad as a year to as narrow as the microsecond.
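A few short queries illustrate the time zone inputs and the interval type described above. The literal values are arbitrary examples, and the output is omitted because it depends on the session’s TimeZone setting.

```sql
-- The same wall-clock time, with the zone given three different ways.
SELECT TIMESTAMP WITH TIME ZONE '1999-01-08 04:05:06 -8:00';
SELECT TIMESTAMP WITH TIME ZONE '1999-01-08 04:05:06 PST';
SELECT TIMESTAMP WITH TIME ZONE '1999-01-08 04:05:06 America/Tijuana';

-- An interval literal mixing coarse and fine units.
SELECT INTERVAL '1 year 2 months 3 days 04:05:06.789';

-- Timestamp plus interval yields a timestamp.
SELECT TIMESTAMP WITH TIME ZONE '2011-03-12 12:00:00 America/Chicago'
       + INTERVAL '1 day';

-- Timestamp minus timestamp yields an interval.
SELECT TIMESTAMP WITH TIME ZONE '2011-10-14 17:00:00 America/Los_Angeles'
       - TIMESTAMP WITH TIME ZONE '2011-10-11 09:00:00 America/Los_Angeles';
```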

Exclusion Constraints

Have you ever tried to write any kind of scheduling functionality using SQL Server? If you have, you’ll know that a business requirement like “two people cannot occupy the same conference room at the same time” is difficult to enforce with code and usually requires additional trips to the database. There are many ways to implement this purely through application level code and none of them lead to happy users or developers.

PostgreSQL 9.0 introduced exclusion constraints. In short, we define a table and then add a constraint that lists a number of column comparisons; for any two rows in the table, at least one of those comparisons must be false. Exclusion constraints are backed by indexes under the hood, so these checks are as quick as our disks and the index that we’ve designed. It’s possible to use exclusion constraints in conjunction with temporal or geospatial data and make sure that different people aren’t reserving the same room at the same time or that plots of land don’t overlap.
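The conference room rule can be sketched as follows. This version assumes a later PostgreSQL release (9.2 or newer) because it relies on the built-in tstzrange type, and it needs the btree_gist contrib module so that a plain integer column can take part in a GiST index. The table and column names are hypothetical.

```sql
-- Lets the integer equality check participate in the GiST index.
CREATE EXTENSION btree_gist;

CREATE TABLE room_reservations (
    room_id     integer   NOT NULL,
    reserved_by text      NOT NULL,
    during      tstzrange NOT NULL,
    -- No two rows may have the same room AND overlapping time ranges.
    EXCLUDE USING gist (room_id WITH =, during WITH &&)
);

-- First booking succeeds.
INSERT INTO room_reservations (room_id, reserved_by, during)
VALUES (101, 'alice',
        tstzrange('2011-06-01 09:00-05', '2011-06-01 10:00-05'));

-- Overlapping booking for the same room is rejected with
-- SQLSTATE 23P01 (exclusion_violation).
INSERT INTO room_reservations (room_id, reserved_by, during)
VALUES (101, 'bob',
        tstzrange('2011-06-01 09:30-05', '2011-06-01 10:30-05'));
```

A booking for a different room in the same time slot is accepted, because the room_id comparison is false for that pair of rows.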

There was a presentation at PGCon 2010 that goes into the details of exclusion constraints. While there is no video, the slides are available and they contain enough examples and explanations to get you started.

Bonus Feature – Cost

It’s free. All the features are always there. There are no editions of PostgreSQL – the features always exist in the database. Commercial support is available from a number of companies, some of them even provide additional closed source features, but the core PostgreSQL database is always available, always free, and always contains the same features.

Getting Started

Want to get started with PostgreSQL? Head on over to the download page and pull down a copy for your platform of choice. If you want more details, the documentation is thorough and well written, or you can check out the tutorials in the wiki.

My Code Isn’t Fat, It’s Just Robust

I’ve been working on implementing some infrastructure code for a client. We’re building robust partition swapping to make it easy to load data without disrupting user queries. We’re doing everything else the right way, but partition swapping makes it really easy to correct a bad load of past data.

The upside is that this code is really easy to write. There are enough examples and samples out there that a lot of the basics can be easily implemented. Even the complex parts of implementing the partition swapping are fairly trivial. The trick is making the code robust enough to handle almost any failure scenario.

Table partitioning is good to use in different ETL scenarios, but we never want it to fail. If it does fail, we want to make sure that we’re in a recoverable state. Likewise, this code needs to be automated and recover from any potential failures.

It turns out that the actual functionality is just a few lines of code. The robust error handling, logging, and recovery code is about 30 times longer than the functionality. It can be difficult to go through the code and update all of the error handling and logic in response to minor changes to business requirements, but the end product is a stable piece of functionality.
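To give a feel for the ratio, here is a stripped-down T-SQL sketch of a partition swap with only the most basic error handling. It is not the client’s code. Every object name and the partition number are hypothetical, and it assumes the staging and holding tables already match the partitioned table’s structure, indexes, and filegroup, with the check constraint that SWITCH requires on the staging table.

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    -- Move the existing partition out to an empty holding table.
    ALTER TABLE dbo.FactSales
        SWITCH PARTITION 3 TO dbo.FactSales_Out;

    -- Move the freshly loaded staging table in as the new partition.
    ALTER TABLE dbo.FactSales_Staging
        SWITCH TO dbo.FactSales PARTITION 3;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Leave the table exactly as it was before the swap started.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    -- Surface the original error to the caller or job scheduler.
    DECLARE @msg nvarchar(2048) = ERROR_MESSAGE();
    RAISERROR('%s', 16, 1, @msg);
END CATCH;
```

The two ALTER TABLE statements are the functionality; logging, validation of the staged data, and recovery steps are where the rest of the code goes.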

PASS 2011 Session Abstracts

Every November, a bunch of database geeks gather for the Professional Association for SQL Server’s (PASS) international Summit. This year it’s going to be held October 11-14 in Seattle, Washington. I didn’t submit last year since I was involved with the abstract selection process. This year I’m not involved, so I decided to submit a few abstracts.

Rewrite Your T-SQL for Great Good!

Refactoring SQL is not like refactoring application code. This talk will cover proven SQL refactoring techniques that will help you identify where performance gains can be made, apply quick fixes, improve readability, and help you quickly locate places to make sweeping performance improvements. Jeremiah Peschka has years of hands on experience tuning SQL applications for performance, throughput, and concurrency.

Why I submitted this session: I submitted this session because it’s a fun session to give, it crosses boundaries between DBA and developer, and I’ve given it a few times before.

The Database is Dead, Long Live the Database

If relational databases are so great, why are people talking about NoSQL? Shouldn’t we explore other ways to store and manipulate data? We’ll look at four scenarios – caching, session state, flexible data models, and batch processing – and discuss how traditional databases perform in each situation and what other options exist on the market. At the end of this session, attendees will have a better understanding of how different workloads perform in RDBMSes, best practices, and alternative storage solutions to make your life easier.

Why I submitted this session: I wrote this session when I was asked to speak at Stir Trek: Thor Edition. Writing it has been a lot of fun and has started the process of crystallizing a lot of the ideas in my head around data storage. This talk focuses on a few areas where relational databases don’t do a good job and proposes solutions to pick up the slack.

Rules, Rules, and Rules

Computers are governed by the rules of physics: electrons, drive heads, and disk platters can only move so fast. Database systems are built according to those rules: memory is faster than disk which is faster than the network. Database schemas and queries are built within the rules of database systems. You will hit the limitations of these rules. If you know what the rules are and why they are in place, you’ll know when it’s time to break them… and how to succeed.

Why I submitted this session: This is also a session I’ve given before. Andy Leonard asked me to speak at the inaugural SQLPeople event about my passion. One of my passions is learning about computer science and how it can be applied to databases in a practical way. (There’s a lot of purely theoretical information that only matters when you’re implementing an RDBMS.) This session is an extended version of the talk I gave at SQLPeople. I’m incredibly excited about it and I’ll be bummed if it doesn’t get accepted.

The Other Side of the Fence: Lessons Learned from Leaving Home

Traveling the world changes your outlook on things; home just doesn’t look quite the same once you’ve traveled. The same can be said for SQL Server: working with databases like PostgreSQL, Cassandra, and Hadoop forced Jeremiah Peschka to re-learn concepts that he took as a given. Learn from his experiences about the importance of understanding isolation levels, data storage and retention, querying patterns, and even database functionality in this talk drawn from his experiences as a DBA, consultant, and developer.

Why I submitted this session: There’s a theme going on here – I’ve learned a lot about database and application design and how it’s sometimes necessary to move outside of my comfort zone to build an effective system. This is a 3.5 hour session that will cover a lot of features in SQL Server. I learned a lot working with other databases, and I hope that this information helps some other people.

In the Event That Everything Should Go Terribly Right

Astute readers and internet stalkers will have noticed that I left my job at Quest Software back in March. I wasn’t unhappy, I just had the opportunity to take my show on the road and go solo. I’ve had the idea of being my own boss in the back of my head for a long time. Suddenly I was confronted with a situation where a former pipe dream was all too real. I talked it over with a few friends and took the plunge.

Right around the same time, I started talking to Brent about his plans. This turned into talking to Brent and Tim about their plans. Then we looped Kendra in. It turns out that we all have similar goals and dreams. It only made sense to join forces and fight crime together! After evaluating the insurance costs of fighting crime we decided to become consultants. And thus Brent Ozar PLF was born.

I’ve never been more excited to work with a group of people. Brent, Tim, and Kendra have always gotten along. I’ve never felt more supported and challenged by a group of people. My business partners are three friends who have always encouraged me to excel. Whether it’s been learning about SQL Server, Ruby, or non-relational databases, these three have been there supporting me every step of the way, even when we’ve disagreed.

I could make a list of all of the other reasons that I’m looking forward to building this business, but it all boils down to the way we interact. Brent, Tim, and Kendra challenge me to be better at everything I do. Whether it’s my SQL Server skills, writing, or presenting, they’re always there helping me get better. I couldn’t ask for a better core group of friends to join me on this new endeavor.

What does the future hold? In terms of business, I’m excited to be building a business with Brent, Tim, and Kendra. Our interests are similar enough that we complement each other but they’re diverse enough that I know we’re going to educate and challenge each other.

You can learn more about our services at http://brentozar.com. If you want to get in touch, you can do that too.

I’m Presenting at SQL Saturday 67

No, this isn’t a re-run! I’ll be presenting about Refactoring SQL at SQL Saturday 67 in Chicago this coming Saturday.

I’m really excited about this opportunity. I had a blast presenting in Chicago last year and I’m looking forward to doing it again this year. There’s a great line up of speakers. If you’re in the Chicago area and want to get your learn on, I suggest you swing on by the DeVry Addison campus and check it out.

Here’s the title and abstract:

Refactoring SQL

Refactoring SQL is not like refactoring application code. This talk will demonstrate proven SQL refactoring techniques that will help you identify where performance gains can be made, apply quick fixes, improve readability, and help you quickly locate places to make sweeping performance improvements. Jeremiah Peschka has years of hands on experience tuning SQL applications for performance, throughput, and concurrency.

Database Restores – Where’s my Transaction Log Backup?

Developers! DBAs! Has this ever happened to you?

Surprise! It's a database migration error!

You’re chugging along on a Friday night getting ready for your weekend deployment. Your 2 liter of Shasta is ice cold, you have your all Rush mix tape, and you’re wearing tube socks with khakis. Things are looking up. You open up your deployment script. You’re confident because you’ve tested it in the QA environment and everything worked. You press F5 and lean back in your chair, confident that the script is going to fly through all of the changes. Suddenly, there’s an error and you’re choking in surprise on Shasta.

In an ideal world, you could pull out your trusty log backups and do a point in time restore, right? What if you’ve never taken a transaction log backup? What if you only have full database backups? Can you still recover from this situation? The answer, thankfully, is yes.

Let’s break something!

USE ftgu;
GO

-- at midnight, we took our initial back up
BACKUP DATABASE ftgu TO DISK = 'C:\ftgu-1.bak'
GO

-- customer data from the business is inserted
-- more customer data is inserted

-- some kind of migration goes here

-- insert a bad value
INSERT INTO Bins (Shelf, Bin)
VALUES ('B', 9)
GO

SELECT GETDATE();

SELECT * FROM Bins WHERE Shelf = 'B' ORDER BY BinID DESC;
GO

-- wait for a bit
WAITFOR DELAY '00:01:00';
GO

-- do something dumb
DELETE p
FROM Products p
JOIN Bins b ON p.BinID = b.BinID
WHERE b.Shelf = 'B';

DELETE FROM Bins WHERE Shelf = 'B'
GO

SELECT GETDATE();
GO

We have a starting backup, no t-log backups, and we’ve gone and deleted some important data from the production database. How do we get it back? If we restored the database from our first backup we might lose a lot of data. Who knows when the last database backup was taken? Oh, midnight. So, in this case, we’d lose a day of data. Well, bugger. In a panic, we save the state of our broken database.

-- ack!
BACKUP DATABASE ftgu TO DISK = 'C:\ftgu-2.bak';

And then we realize that we also need our transaction log:

-- ah crap, I need to back up my log to get point in time recovery!
BACKUP LOG ftgu
TO DISK = 'C:\ftgu-log-1.trn';
GO

Here’s the kicker – the transaction log has never been backed up. (In my experience, this is all too common.) This database has been running for a week or a year or three years without any kind of transaction log backups. We’re screwed right? I mean, wouldn’t we have to apply all of the transactions from the log to the very first full backup we have? No.

Let’s get started and restore our last good backup. We always have our backup with missing data, just in case we need it for some reason.

-- switch to master (need to make sure nobody else is using that database)
USE master;
GO

-- restore the last full backup with known good data
-- make sure to specify NORECOVERY so we can
-- apply our transaction log backup
RESTORE DATABASE ftgu
FROM DISK = 'C:\ftgu-1.bak'
WITH REPLACE, NORECOVERY;
GO

SQL Server is cunning and records the log sequence number (LSN) from the last full backup (technically it’s the start and end LSN from the last full backup). If we have a log backup that encompasses the relevant LSNs, we’re good to go. Since our transaction logs were never backed up before today, we’re safe.
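If you want to see those LSNs for yourself, the backup headers expose them. This is a small sketch using the backup files from this walkthrough; the comments describe which columns to compare.

```sql
-- Header of the full backup: note the FirstLSN and LastLSN columns.
RESTORE HEADERONLY
FROM DISK = 'C:\ftgu-1.bak';

-- Header of the log backup: its FirstLSN to LastLSN range should
-- cover the LastLSN of the full backup we just restored.
RESTORE HEADERONLY
FROM DISK = 'C:\ftgu-log-1.trn';
```

Reading the headers does not touch the database, so it is safe to run while the restore is sitting in NORECOVERY.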

We’re going to use something called point in time recovery:

-- restore the log backup until right before we started
-- this is called "point in time recovery"
RESTORE LOG ftgu
FROM DISK = 'C:\ftgu-log-1.trn'
WITH STOPAT = '2011-02-13 10:03:55.653';

Even though we never took a transaction log backup before today, we’re able to take a backup and recover from what initially seemed like a bad situation.

SQL Saturday 60 Resources

SQL Saturday 60 was a week ago and I completely failed to post resources from the presentation in a timely manner.

The SQL Server Internals resources have been available for a while: http://facility9.com/resources/sql-server-internals… You just had to know to look for them.

The Modeling Muddy Data talk is available on GitHub: https://github.com/peschkaj/Muddy-Data. This presentation is released under a Creative Commons Attribution-ShareAlike license which means that we can all make things better by collaborating on the presentation materials. I’ll slowly be adding more information to the write up of the talk that is in the README.

Data Durability

A friend of mine half-jokingly says that the only reason to put data into a database is to get it back out again. In order to get data out, we need to ensure some kind of durability.

Relational databases offer single server durability through write-ahead logging and checkpoint mechanisms. These are tried and true methods of writing data to a replay log on disk as well as caching writes in memory. Whenever a checkpoint occurs, dirty data is flushed to disk. The benefit of a write ahead log is that we can always recover from a crash (so long as we have the log files, of course).

How does single server durability work with non-relational databases? Most of them don’t have write-ahead logging.

MongoDB currently has limited single server durability. While some people consider this a weakness, it has some strengths – writes complete very quickly since there is no write-ahead log that needs to immediately sync to disk. MongoDB also has the ability to create replica sets for increased durability. There is one obvious upside to replica sets – the data is in multiple places. Another advantage of replica sets is that it’s possible to use getLastError({w:...}) to request acknowledgement from multiple replica servers before a write is reported as complete to a client. Just keep in mind that getLastError is not used by default – application code will have to call the method to force the sync.

Setting a w-value for writes is something that was mentioned in Getting Faster Writes with Riak, although in that article we were decreasing durability to increase write performance. In Amazon Dynamo inspired systems, writes are not considered complete until multiple nodes have responded. The advantage is that durable replication is enforced at the database and clients have to elect to use less durability for the data. Refer to the Cassandra documentation on Writes and Consistency or the Riak Replication documentation for more information on how Dynamo inspired replication works. Datastores using HDFS for storage can take advantage of HDFS’s built-in data replication.

Even HBase, a column-oriented database, uses HDFS to handle data replication. The trick is that rows may be chopped up based on columns and split into regions. Those regions are then distributed around the cluster on what are called region servers. HBase is designed for real-time read/write random-access. If we’re trying to get real-time reads and writes, we can’t expect HBase to immediately sync files to disk – there’s a commit log (RDBMS people will know this as a write-ahead log). Essentially, when a write comes in from a client, the write is first written to the commit log (which is stored using HDFS), then it’s written in memory and when the in-memory structure fills up, that structure is flushed to the filesystem. Here’s something cunning: since the commit log is being written to HDFS, it’s available in multiple places in the cluster at the same time. If one of the region servers goes down it’s easy enough to recover from – that region server’s commit log is split apart and distributed to other region servers which then take up the load of the failed region server.

There are plenty of HBase details that have been grossly oversimplified or blatantly ignored here for the sake of brevity. Additional details can be found in HBase Architecture 101 – Storage as well as this Advanced HBase presentation. As HBase is inspired by Google’s Bigtable, additional information can be found in Chang et al. Bigtable: A distributed storage system for structured data and The Google File System.

Interestingly enough, there is a proposed feature for PostgreSQL 9.1 to add synchronous replication to PostgreSQL. Current replication in PostgreSQL is more like asynchronous database mirroring in SQL Server, or the default replica set write scenario with MongoDB. Synchronous replication makes it possible to ensure that data is being written to every node in the RDBMS cluster. Robert Haas discusses some of the pros and cons of replication in PostgreSQL in his post What Kind of Replication Do You Need?.

Microsoft’s Azure environment also has redundancy built in. Much like Hadoop, the redundancy and durability is baked into Azure at the filesystem. Building the redundancy at such a low level makes it easy for every component of the Azure environment to use it to achieve higher availability and durability. The Windows Azure Storage team have put together an excellent overview. Needless to say, Microsoft have implemented a very robust storage architecture for the Azure platform – binary data is split into chunks and spread across multiple servers. Each of those chunks is replicated so that there are three copies of the data at any given time. Future features will allow for data to be seamlessly geographically replicated.

Even SQL Azure, Microsoft’s cloud based relational database, takes advantage of this replication. In SQL Azure when a row is written in the database, the write occurs on three servers simultaneously. Clients don’t even see an operation as having committed until the filesystem has responded from all three locations. Automatic replication is designed into the framework. This prevents the loss of a single server, rack, or rack container from taking down a large number of customers. And, just like in other distributed systems, when a single node goes down, the load and data are moved to other nodes. For a local database, this kind of durability is typically only obtained using a combination of SAN technology, database replication, and database mirroring.

There is a lot of solid technology backing the Azure platform, but I suspect that part of Microsoft’s ultimate goal is to hide the complexity of configuring data durability from the user. It’s foreseeable that future upgrades will make it possible to dial up or down durability for storage.

While relational databases are finding more ways to spread load out and retain consistency, there are changes in store for MongoDB to improve single server durability. MongoDB has been highly criticized for its lack of single server durability. Until recently, the default response has been that you should take frequent backups and write to multiple replicas. This is still a good idea, but it’s promising to see that the MongoDB development team are addressing single server durability concerns.

Why is single server durability important for any database? Aside from guaranteeing that data is correct in the instance of a crash, it also makes it easier to increase adoption of a database at the department level. A durable single database server makes it easy to build an application on your desktop, deploy it to the server under your desk, and move it into the corporate data center as the application gains importance.

Logging and replication are critical technologies for databases. They guarantee data is durable and available. There are also just as many options as there are databases on the market. It’s important to understand the requirements of your application before choosing mechanisms to ensure durability and consistency across multiple servers.

SQL Server Internals – Live at the TriPASS Live Meeting

Join me on Tuesday, January 11 at 12:00PM Eastern and take a break from your work day to learn about SQL Server Internals. There’s some information on the event page.

Add it to your calendar!

SQL Server Internals

Want to know what makes SQL Server tick? Ever wonder what SQL Server is doing when you run a query? Ever wonder which parts of SQL Server are responsible for specific functionality? Want to know what a HOBT is? I can’t promise answers to every question, but I can set you on the path to knowledge about the inner workings of SQL Server.

Twelve Days of SQL – Day (2 – 1)

The Story So Far

Brent Ozar (blog | twitter) asked me to pick a favorite blog post for the year. Since I couldn’t pick anything I wrote (yes, I love myself that much), I had to pick one from the community. Since just about everyone in Brent’s crazy list blogs about SQL, I had to pick someone from the SQL Server community.

My Favorite Blog Post This Year

Earlier this year, Mladen Prajdić posted SQL Server – Undelete a Table and Restore a Single Table from Backup. I love this post for a couple of reasons. First, it’s completely crazy. Mladen had a strange idea and then ran with it. Rather than accept conventional thinking that it isn’t possible to restore a single table from a backup, Mladen opened up SSMS and started prodding at the inside of SQL Server. The second reason I love this post is because the explanation is clear and the code well documented. Mladen ran the code by me before he published the post. Normally, I can’t read other people’s code without a tremendous amount of time and energy. His code was clear enough to stand on its own.

Coming Up Next

Grant Fritchey (blog | twitter) is up next. Grant has been a huge inspiration to me – he’s humble, intelligent, and genuinely interested in helping out. When he’s not terrifying developers at a large insurance company, he is a scout leader, father, geek, and kilt connoisseur. Grant’s post drops tomorrow – December 10 – so be on the lookout.
