A client recently asked me for help with their SQL Server environment. It seems that replication was running slowly and was getting further and further behind – replication had been turned off during heavy data modification and was turned on after several days.
Protip: This is why it’s important to have a full checklist for everything that you do on a server.
Check Everyone’s Health When you have a complicated system you want to take a look at everything, not just the symptoms of the problem.
While perusing Twitter, I saw that Google has open sourced Sawzall, one of their internal tools for data processing. WTF does this mean?
Sawzall, WTF? [Image caption: For Data Analytics or automotive modification, you will find no finer tool.] Apart from a tool that I once used to cut the muffler off of my car (true story), what is Sawzall? Sawzall is a procedural language for analyzing excessively large data sets.
They aren’t. 99% of what you do could be replicated by a fairly stupid shell script. When I started as a DBA, I didn’t have practical experience as a DBA. I had Books Online and Google. What’s necessary as a DBA has nothing to do with your knowledge of T-SQL or SQL Server’s internal fiddly bits. That’s icing on the cake. The skills necessary to become a DBA are things that we learn over time.
You know that you should be testing your code. You even know that you should be testing your SQL. But why? We need to make sure that changes to our code are safe, that we prevent regressions, and that we catch edge cases. But are you testing your code for performance? Changes to code can make your code faster or slower, depending on indexing as well as user-defined functions and built-in functions.
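Here is a rough sketch of what a performance-minded test could look like. It is not SQL Server: it uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere, and the orders table, the idx_orders_customer index, and the query_plan helper are all made up for illustration. Instead of timing anything (timings are flaky), it asks the query planner how it intends to run each query and checks that wrapping the indexed column in a built-in function costs us the index.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


def build_demo_db():
    """Create a throwaway in-memory table with one secondary index (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(i % 100, i * 1.5) for i in range(1000)],
    )
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    return conn


conn = build_demo_db()

# Plain comparison on the indexed column: the planner can use the index.
seek_plan = query_plan(conn, "SELECT * FROM orders WHERE customer_id = ?", (42,))

# Same filter wrapped in a built-in function: the index no longer applies.
scan_plan = query_plan(conn, "SELECT * FROM orders WHERE abs(customer_id) = ?", (42,))

print(seek_plan)
print(scan_plan)
```

The same idea should carry over to SQL Server by capturing the execution plan in your test harness, but the plan format and the tooling there are different, so treat this purely as an illustration of the approach.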
People have chimed in and talked about the Foursquare outage. The nice part about these discussions is that they’re focusing on the technical problems with the current setup at Foursquare. They’re picking it apart and looking at what is right, what went wrong, and what needs to be done differently in MongoDB to prevent problems like this in the future. Let’s play a “what if” game. What if Foursquare wasn’t using MongoDB?
Last week, Amazon announced that we could all get some free AWS if we signed up for a new account. Just what do you get for signing up? Take a look.
What you get with the AWS Free Usage Bundle
The First Catch First off, this is only free for the first 12 months. After that you’re going to have to pay as you go. In a way, this is like Microsoft’s BizSpark, but in the clouds.
Free Amazon Web Services for new customers. Amazon are giving away 12 months (from the date of your sign up) of a bunch of Amazon services. If you want to try to start up a business, now is the time to try! IronRuby is alive! There was some conjecture about the life of IronRuby after Microsoft cut the team. One of the original developers is picking up where he left off.
I should have written this right when I got back from Hadoop World, instead of a week or so later, but things don’t always happen the way you plan. Before I left to go to Hadoop World (and points in between), I put up a blog post asking for questions about Hadoop. You guys responded with some good questions and I think I owe you answers.
What Is Hadoop? Hadoop isn’t a simple database; it’s a bunch of different technologies built on top of the Hadoop common utilities, MapReduce, and HDFS (Hadoop Distributed File System).
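If MapReduce is new to you, the programming model itself is small enough to sketch in plain Python. This is only an in-memory toy, not Hadoop's API: the map_phase, shuffle, and reduce_phase names are mine, and a real cluster would run the map and reduce steps on many machines with the data sitting in HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["hadoop is not a database", "hadoop is a platform"])
print(counts)
```

The point is that the map and reduce functions never need to know about each other or about where the data lives, which is what lets a framework spread them across a cluster.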
Cassandra: RandomPartitioner vs OrderPreservingPartitioner Data order is important in relational databases and it’s something that you need to be aware of with a non-relational database, too. Improperly ordered data can put a huge load on a few nodes in a cluster. This article goes over the trade-offs in Cassandra of using a random data order vs key ordered data. liblfds Want to write your own NoSQL database in C? These (free) libraries should make it pretty easy to do.
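To make the Cassandra partitioner trade-off above a bit more concrete, here is a toy sketch. It is not how Cassandra is implemented: the four-node cluster, the split points, and both functions are assumptions for illustration, and a simple modulo stands in for Cassandra's token ring. The random version hashes the key with MD5, and the ordered version assigns keys to nodes by sorted key range.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

NODES = 4  # hypothetical cluster size

# Hypothetical range boundaries: node 0 owns keys below "g",
# node 1 owns "g" up to "n", node 2 owns "n" up to "t", node 3 owns the rest.
SPLIT_POINTS = ["g", "n", "t"]


def random_partition(key):
    """Pick a node from a hash of the key, ignoring key order."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % NODES


def ordered_partition(key):
    """Pick a node by where the key falls in sorted order."""
    return bisect_right(SPLIT_POINTS, key)


# Sequential keys with a shared prefix, a common pattern in real data.
keys = [f"user{i:04d}" for i in range(1000)]

random_load = Counter(random_partition(k) for k in keys)
ordered_load = Counter(ordered_partition(k) for k in keys)

print("random: ", dict(random_load))
print("ordered:", dict(ordered_load))
```

With these keys the ordered scheme sends every row to a single node while the hashed scheme spreads them around. The flip side is that ordered placement keeps neighboring keys together, which is what makes range scans over keys cheap.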
Sounds like I’m bragging, right? There is a free book involved. Or maybe you don’t care because it’s all NoSQLs and stuff and you’re a SQL Server DBA. And that’s where we differ. When I first heard about NoSQL databases, I had the same reaction that a lot of people are having right now: disbelief and mockery. I remember making fun of MySQL when I first ran into it. It was such an odd database: it didn’t have foreign keys, joins didn’t work, it sometimes ate all of your data, and writes put locks on tables.