Filtering data is one of the bread-and-butter features of SQL. Filtering puts the Q in SQL. Because of that, filtering is a very large topic. We are just going to cover a smattering of items to help you start writing your own queries.
WHERE Do We Start?
Did you notice the SQL pun there? WHERE is the keyword you'll come to know and love. Some SQL gurus might also be HAVING affairs with other keywords but we'll try not to focus on their peculiar proclivities. Once you start using SQL regularly, WHERE will feel common and comfortable, but at first it may run counter to the way you've grown up using the word.
Normally WHERE is used for asking for the location of an item (Where's the Beef?) or entertainment activity (Where's the Party?). In SQL, however, WHERE is used for asking for the contents of an item. So, if we have a table of patient records, you can think of each record as being patient. When we query, we're asking for all patients that have a pulse of 85 or all patients that have a last name of Asimov or have last visited our office in 2012. These are each columns in the table and, if the table has been mapped out well, you can usually think of the columns as attributes of the record.
Let's take the examples I just gave and create queries for them:
SELECT FirstName, LastName, LastVisit FROM Patients WHERE Pulse = 85
SELECT FirstName, LastName, Pulse, LastVisit FROM Patients
WHERE LastName = 'Asimov' OR (LastVisit >= '1/1/2012' AND LastVisit < '1/1/2013')
If you've been following along, you might be thinking I threw a lot of stuff in those statement that you hadn't seen before and you'd be correct. The mathematically inclined among you will probably be able to puzzle these statements out, but let's break them down anyway...
SELECT FirstName, LastName, LastVisit FROM Patients
There's nothing new here. We're selecting the FirstName, LastName, and LastVisit columns from the Patients table. Note that I could have also included the Pulse column, but since were looking for pulses of only 85, there's not really any need to do so and it saves us from having to pass that value down the wire for each and every record that gets returned.
Finally, since we haven't used the TOP keyword, we're selecting all records that meet our criteria - which is specified in the WHERE clause.
WHERE Pulse = 85
Our criteria then, is all records that have a Pulse that is equal to 85
SELECT FirstName, LastName, Pulse, LastVisit FROM Patients
Here we're grabbing the FirstName, LastName, Pulse, and LastVisit columns from the Patients table. In the previous example, I chose not to include the Pulse column since pulse would always be 85. However, in this example where we know the LastName will always be "Asimov" I chose to include it anyway. Why? It's up to you. The same rule-of-thumb applies here, but I usually end up joining the first and last name when displaying names and having the last name readily available reduces the amount of work I have to do. It's up to you and your application as to which method is more appropriate.
Finally, as in the first example, we are not using a TOP clause so all records that meet the WHERE clause criteria will be returned.
WHERE LastName = 'Asimov' OR (LastVisit >= '1/1/2012' AND LastVisit < '1/1/2013')
This clause is only mildly more complex that the first, even though it may look more intimidating. Just as in example #1, we're looking for records that have a LastName equal to "Asimov". Because Asimov is text (geeks call it a string) we need to tell the database that. We do that by surrounding it with single quotes (not double quotes).
Next we encounter a pair of parentheses with some more comparisons inside. The parentheses may look uber-geeky but are really just a way to group things together logically. In the first part we're looking for records who have a LastVisit date that is greater than or equal to Jan. 1st, 2012. "AND", in part two, we're looking for records that also have a LastVisit data that is less than (i.e earlier than) Jan. 1st 2013. Take them together and we're looking for records that have a LastVisit date that falls sometime in the year 2012. When you use multiple ANDs and ORs and other operators together, SQL follows and order of precedence to determine how to evaluate your statement, by using the parentheses, you can ensure that SQL knows exactly how you want things evaluated. It also makes it easier for a human to read and understand later.
Remember we surrounded Asimov with single quotes to denote it was a string (text). We've also surrounded our dates with single quotes. Why? Because the SQL spec says so, that's why :) I've never gone so far as to research why the language designers did this, but I suspect the reason is that you tend to use characters like "/" and "-" in date values. These are operators in SQL that are used for dividing and subtracting numbers. So, if you wrote 1/1/2013 without the quotes, SQL would try to interpret that as a mathematical statement and try to divide 1 by 1 and then by 2,013.
There are many other operators you can use in your WHERE clause. You may use NOT to look for something that is NOT a certain value. We could modify our example to select patients whose last name is Asimov or who haven't been seen in 2012 like so:
WHERE LastName = 'Asimov' OR NOT (LastVisit >= '1/1/2012' AND LastVisit < '1/1/2013')
That's about all we can cover in this short beginner's introduction. If you have trouble falling asleep, you can always read the SQL language reference for MS SQL Server (the language is called T-SQL) here.
Articles in this this SQL for Beginners Series: