About Jeremiah

I'm a senior database administrator with Cass Information Systems, a SQL Server MVP, director-at-large with PASS, and I also help out with my local chapter and the Application Development virtual chapter.

You can learn more about me or contact me directly.

Retrieve the top X random rows from a query

Let’s suppose for a minute that you want to retrieve the top x number of rows from any given query. Keep in mind that x is a value that your users can supply.

Sure, you could do string concatenation and EXEC sp_executesql ‘SELECT TOP ‘ + @count + ‘ x FROM y;’. But, then again, a vicious hacker could get all cute and decide to change your form variable to ‘; DECLARE @bad_var AS NVARCHAR(MAX); SET @bad_var = ””; SELECT @bad_var = @bad_var”DROP TABLE ” + name + ”;” FROM sys.tables; SET @bad_var = @bad_var + ”–”; EXEC sp_executesql @bad_var;’ Or something along those lines but probably infinitely more cunning.

Well, you could also accomplish this a different way using a subquery:

DECLARE @count AS int;
SELECT @count = 2;

SELECT ContactId,
       FirstName,
       LastName,
       EmployeeId,
       AddressId,
       AddressLine1,
       AddressLine2,
       City,
       PostalCode
FROM (
  SELECT ROW_NUMBER() OVER (ORDER BY NEWID()) AS rownum,
         c.ContactId,
         c.FirstName,
         c.LastName,
         e.EmployeeId,
         a.AddressId,
         a.AddressLine1,
         a.AddressLine2,
         a.City,
         a.PostalCode
    FROM HumanResources.Employee AS e
          INNER JOIN Person.Contact AS c
            ON e.ContactId = c.ContactId
          INNER JOIN HumanResources.EmployeeAddress AS ea
            ON e.EmployeeId = ea.EmployeeId
          INNER JOIN Person.Address AS a
            ON ea.AddressId = a.AddressId
) AS x
WHERE rownum <= @count;

It’s not the prettiest thing on earth, but it works. Thanks to Rick Kierner over at rickdoes.net for asking me to help out with a similar query late last week.

4 comments to Retrieve the top X random rows from a query

  • you can also use SET ROWCOUNT n, still works in 2008 but will be deprecated in the next version I believe, make sure that it is set to 0 after the query

    declare @id int
    select @id = 5

    set rowcount @id

    select * from sysobjects
    order by newid()

    set rowcount 0

  • I just found this:
    DECLARE @Count int
    SELECT @Count = 5

    SELECT TOP (@Count),
    c.ContactId,
    c.FirstName,
    c.LastName,
    e.EmployeeId,
    a.AddressId,
    a.AddressLine1,
    a.AddressLine2,
    a.City,
    a.PostalCode,
    at.Name AS AddressType
    FROM HumanResources.Employee AS e
    INNER JOIN Person.Contact AS c
    ON e.ContactId = c.ContactId
    INNER JOIN HumanResources.EmployeeAddreess AS ea
    ON e.EmployeeId = ea.EmployeeId
    INNER JOIN Person.Address AS a
    ON ea.AddressId = a.AddressId
    INNER JOIN Person.AddressType AS at
    ON a.AddressTypeId = at.AddressTypeId

    Is this better/worse/same?

  • It actually turns out that the execution plans are almost exactly the same (in this case). Using TOP (@Count) is slightly more efficient within this scenario in AdventureWorks (the plan cost for my query is 0.412827 and for TOP it is 0.412769). The difference is made up by some additional operators and a filter that I’m using to replace the functionality of TOP via my use of ROW_NUMBER and the WHERE clause.

    Which brings up the point that we’re always supposed to be using the parens with TOP and I just found that out this morning while reading Books Online.

Leave a Reply

 

 

 

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>