Microsoft Access is arguably the most powerful tool in the entire Microsoft Office suite, yet it mystifies (and sometimes scares) Office power users. With a steeper learning curve than Word or Excel, how is anyone supposed to wrap their head around the use of this tool? This week, Bruce Epper will look at some of the issues spurred by this question from one of our readers.
A Reader asks:
I’m having trouble writing a query in Microsoft Access.
I’ve got a database with two product tables containing a common column with a numeric product code and an associated product name.
I want to find out which products from Table A can be found in Table B. I want to add a column named Results which contains the product name from Table A if it exists, and the product name from Table B when it doesn’t exist in Table A.
Do you have any advice?
Microsoft Access is a relational database management system (RDBMS) for Windows; there is no Mac version. It uses Microsoft’s Jet database engine (known as the Access Database Engine, or ACE, since Access 2007) for data processing and storage. It also provides a graphical interface that nearly eliminates the need to understand Structured Query Language (SQL).
SQL is the command language used to add, delete, update, and return information stored in the database, as well as to create, alter, or delete core database components such as tables and indexes.
If you do not already have some familiarity with Access or another RDBMS, I would suggest you start with these resources before proceeding:
- So What is a Database? where Ryan Dube uses Excel to show the basics of relational databases.
- A Quick Guide To Get Started With Microsoft Access 2007 which is a high-level overview of Access and the components that comprise an Access database.
- A Quick Tutorial To Tables in Microsoft Access 2007 takes a look at creating your first database and tables to store your structured data.
- A Quick Tutorial On Queries In Microsoft Access 2007 looks at the means to return specific portions of the data stored in the database tables.
Having a basic understanding of the concepts provided in these articles will make the following a bit easier to digest.
Database Relations and Normalization
Imagine you are running a company selling 50 different types of widgets all over the world. You have a client base of 1,250 and in an average month sell 10,000 widgets to these clients. You are currently using a single spreadsheet to track all of these sales – effectively a single database table. And every year adds thousands of rows to your spreadsheet.
The above images are part of the order tracking spreadsheet you are using. Now say both of these clients buy widgets from you several times a year so you have far more rows for both of them.
If Joan Smith marries Ted Baines and takes his surname, every single row that contains her name now needs to be changed. The problem is compounded if you happen to have two different clients with the name ‘Joan Smith’. It has just become much harder to keep your sales data consistent due to a fairly common event.
By using a database and normalizing the data, we can separate out items into multiple tables such as inventory, clients, and orders.
Just looking at the client portion of our example, we would remove the columns for Client Name and Client Address and put them into a new table. In the image above, I have also broken things out better for more granular access to the data. The new table also contains a column for a Primary Key (ClientID) – a number that will be used to access each row in this table.
In the original table where we removed this data, we would add a column for a Foreign Key (ClientID) which is what links to the proper row containing the information for this particular client.
Now, when Joan Smith changes her name to Joan Baines, the change only needs to be made once in the Client table. Every other reference from joined tables will pull the proper client name. A report covering what Joan has purchased over the last 5 years will include all of her orders, whether they were placed under her maiden or married name, without any change to how the report is generated.
As an added benefit, this also reduces the overall amount of storage consumed.
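To make the primary key and foreign key idea concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Access. The table layouts, order numbers, and quantities are hypothetical and only loosely follow the example above; the point is that a single UPDATE in the client table is picked up by every joined order.

```python
import sqlite3

# In-memory SQLite database standing in for Access; all names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Clients (
        ClientID  INTEGER PRIMARY KEY,                  -- primary key
        FirstName TEXT,
        LastName  TEXT
    );
    CREATE TABLE Orders (
        OrderID  INTEGER PRIMARY KEY,
        ClientID INTEGER REFERENCES Clients(ClientID),  -- foreign key
        Quantity INTEGER
    );
    INSERT INTO Clients VALUES (1, 'Joan', 'Smith');
    INSERT INTO Orders VALUES (100, 1, 25), (101, 1, 40);
""")

# One UPDATE in the Clients table covers every order Joan has placed.
conn.execute("UPDATE Clients SET LastName = 'Baines' WHERE ClientID = 1")

orders = conn.execute("""
    SELECT Orders.OrderID, Clients.FirstName, Clients.LastName, Orders.Quantity
    FROM Orders INNER JOIN Clients ON Orders.ClientID = Clients.ClientID
    ORDER BY Orders.OrderID
""").fetchall()

for row in orders:
    print(row)
```

Both orders come back with the surname Baines even though the Orders table was never touched, because each order stores only the ClientID.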
SQL defines five different types of joins: INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER, and CROSS. The OUTER keyword is optional in the SQL statement.
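Microsoft Access allows the use of INNER (the default), LEFT OUTER, RIGHT OUTER, and CROSS joins, although a CROSS join has no keyword of its own in Access SQL: it is written by listing both tables in the FROM clause without a join condition. FULL OUTER is not supported as such, but it can be faked by combining a LEFT OUTER join and a RIGHT OUTER join with UNION (or with UNION ALL, provided the second query keeps only its unmatched rows), at the cost of more CPU cycles and I/O operations.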
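Here is a rough sketch of one way to fake a FULL OUTER join, again using Python's sqlite3 module rather than Access itself. The product rows are made up for illustration. Because older SQLite versions lack RIGHT JOIN, the second half swaps the tables and uses LEFT JOIN, which returns the same rows; in Access you could write it as a RIGHT JOIN instead. The WHERE clause keeps only the unmatched rows so that UNION ALL does not list the matched rows twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProdA (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
    CREATE TABLE ProdB (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
    -- Hypothetical sample rows: ID 11 exists only in ProdA, ID 1 only in ProdB.
    INSERT INTO ProdA VALUES (2, 'Bolt A'), (3, 'Gear A'), (11, 'Cog A');
    INSERT INTO ProdB VALUES (1, 'Axle B'), (2, 'Bolt B'), (3, 'Gear B');
""")

full_outer = conn.execute("""
    -- Every ProdA row, matched or not
    SELECT ProdA.ProductName, ProdB.ProductName
    FROM ProdA LEFT JOIN ProdB ON ProdA.ProductID = ProdB.ProductID
    UNION ALL
    -- Plus the ProdB rows that have no partner in ProdA
    SELECT ProdA.ProductName, ProdB.ProductName
    FROM ProdB LEFT JOIN ProdA ON ProdA.ProductID = ProdB.ProductID
    WHERE ProdA.ProductID IS NULL
""").fetchall()

for row in full_outer:
    print(row)
```

The result has four rows: the two matched pairs, the ProdA-only row with an empty ProdB side, and the ProdB-only row with an empty ProdA side.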
The output of a CROSS join contains every row of the left table paired with every row of the right table. The only time I have ever seen a CROSS join used is during load testing of database servers.
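A quick sketch shows how fast a CROSS join grows. It uses Python's sqlite3 module with two hypothetical tables, Sizes and Colors, that are not part of the reader's database. SQLite accepts the CROSS JOIN keyword directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables used only to show the row count of a CROSS join.
    CREATE TABLE Sizes (Size TEXT);
    CREATE TABLE Colors (Color TEXT);
    INSERT INTO Sizes VALUES ('Small'), ('Medium'), ('Large');
    INSERT INTO Colors VALUES ('Red'), ('Blue');
""")

pairs = conn.execute(
    "SELECT Sizes.Size, Colors.Color FROM Sizes CROSS JOIN Colors"
).fetchall()

# 3 sizes paired with 2 colors gives 3 * 2 = 6 rows.
print(len(pairs))
```

With tables of a few thousand rows each, the same query would return millions of rows, which is why it is handy for load testing and rarely for anything else.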
Let’s take a look at how the basic joins work, then we will modify them to suit our needs.
Let’s start by creating two tables, ProdA and ProdB, with the following design properties.
The AutoNumber is an automatically incrementing long integer assigned to entries as they are added to the table. The Text option was not modified, so it will accept a text string up to 255 characters long.
Now, populate them with some data.
To show the differences in how the 3 join types work, I have deleted entries 1, 5, and 8 from ProdA.
Next, create a new query by going to Create > Query Design. Select both tables from the Show Table dialog and click Add, then Close.
Click on ProductID in table ProdA, drag it to ProductID in table ProdB and release the mouse button to create the relationship between the tables.
Right-click on the line between the tables representing the relationship between the items and select Join Properties.
By default, join type 1 (INNER) is selected. Option 2 is a LEFT OUTER join and 3 is a RIGHT OUTER join.
We will look at the INNER join first, so click OK to dismiss the dialog.
In the query designer, select the fields we want to see from the drop-down lists.
When we run the query (the red exclamation point in the ribbon), it will show the ProductName field from both tables with the value from table ProdA in the first column and ProdB in the second.
Notice the results only show values where ProductID is equal in both tables. Even though there is an entry for ProductID = 1 in table ProdB, it does not show up in the results since ProductID = 1 does not exist in table ProdA. The same applies to ProductID = 11. It exists in table ProdA but not in table ProdB.
By using the View button on the ribbon and switching to SQL View, you can see the SQL query generated by the designer used to get these results.
SELECT ProdA.ProductName, ProdB.ProductName FROM ProdA INNER JOIN ProdB ON ProdA.ProductID = ProdB.ProductID;
Going back to Design View, change the join type to 2 (LEFT OUTER). Run the query to see the results.
As you can see, every entry in table ProdA is represented in the results, while only the entries in ProdB that have a matching ProductID in table ProdA show up.
The blank space in the ProdB.ProductName column is a special value (NULL) since there is not a matching value in table ProdB. This will prove important later.
SELECT ProdA.ProductName, ProdB.ProductName FROM ProdA LEFT JOIN ProdB ON ProdA.ProductID = ProdB.ProductID;
Try the same thing with the third type of join (RIGHT OUTER).
The results show everything from table ProdB, with blank (NULL) values wherever the ProdA table does not have a matching entry. So far, this brings us closest to the results desired in our reader’s question.
SELECT ProdA.ProductName, ProdB.ProductName FROM ProdA RIGHT JOIN ProdB ON ProdA.ProductID = ProdB.ProductID;
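The three joins can also be compared side by side outside of Access. The sketch below uses Python's sqlite3 module and makes some assumptions: the product names are invented placeholders, ProdA holds IDs 1 through 11 minus the deleted 1, 5, and 8, and ProdB holds IDs 1 through 10. Older SQLite versions have no RIGHT JOIN, so the last query swaps the tables and uses LEFT JOIN, which returns the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ProdA (ProductID INTEGER PRIMARY KEY, ProductName TEXT)")
conn.execute("CREATE TABLE ProdB (ProductID INTEGER PRIMARY KEY, ProductName TEXT)")
# Hypothetical rows: ProdA has IDs 1-11 without 1, 5 and 8; ProdB has IDs 1-10.
conn.executemany("INSERT INTO ProdA VALUES (?, ?)",
                 [(i, f"A-{i}") for i in range(1, 12) if i not in (1, 5, 8)])
conn.executemany("INSERT INTO ProdB VALUES (?, ?)",
                 [(i, f"B-{i}") for i in range(1, 11)])

SELECT = "SELECT ProdA.ProductName, ProdB.ProductName FROM "
ON = " ON ProdA.ProductID = ProdB.ProductID"

inner_rows = conn.execute(SELECT + "ProdA INNER JOIN ProdB" + ON).fetchall()
left_rows = conn.execute(SELECT + "ProdA LEFT JOIN ProdB" + ON).fetchall()
# Same rows as "ProdA RIGHT JOIN ProdB": swap the tables and use LEFT JOIN.
right_rows = conn.execute(SELECT + "ProdB LEFT JOIN ProdA" + ON).fetchall()

print(len(inner_rows), len(left_rows), len(right_rows))
# Rows with no partner show None (NULL) on the missing side.
print([a for a, b in left_rows if b is None])
print([b for a, b in right_rows if a is None])
```

With this sample data the INNER join returns 7 rows, the LEFT join 8 (A-11 has no partner), and the RIGHT join 10 (B-1, B-5, and B-8 have no partner).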
Using Functions in a Query
The results of a function may also be returned as part of a query. We want a new column named ‘Results’ to appear in our result set. Its value will be the content of the ProductName column of table ProdA if ProdA has a value (it is not NULL), otherwise it should be taken from table ProdB.
The Immediate IF (IIF) function can be used to generate this result. The function takes three parameters. The first is a condition that must evaluate to a True or False value. The second parameter is the value to be returned if the condition is True, and the third parameter is the value to be returned if the condition is False.
The full function construct for our situation looks like this:
IIF(ProdA.ProductID Is Null, ProdB.ProductName, ProdA.ProductName)
Notice that the condition parameter does not check for equality. A Null value in a database does not have a value that can be compared to any other value, including another Null. In other words, Null does not equal Null. Ever. To get past this, we instead check the value using the ‘Is’ keyword.
We could have also used ‘Is Not Null’ and changed the order of the True and False parameters to get the same result.
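The way Null behaves in comparisons is easy to check for yourself. This small sketch uses Python's sqlite3 module, which follows the same rule as Access here: comparing Null with the equals sign yields Null (shown as None in Python) rather than True, while the Is Null test yields a definite answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL = NULL is neither true nor false; NULL IS NULL is true.
equals_result, is_null_result = conn.execute(
    "SELECT NULL = NULL, NULL IS NULL"
).fetchone()

print(equals_result)   # None: the comparison itself is unknown
print(is_null_result)  # 1: SQLite's way of saying True
```

This is why a condition written with an equals sign against Null never matches any row.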
When putting this into the Query Designer, you must type the entire function into the Field: entry. To get it to create the column ‘Results’, you need to use an alias. To do this, preface the function with ‘Results:’ as seen in the following screenshot.
The equivalent SQL code to do this would be:
SELECT ProdA.ProductName, ProdB.ProductName, IIF(ProdA.ProductID Is Null, ProdB.ProductName, ProdA.ProductName) AS Results FROM ProdA RIGHT JOIN ProdB ON ProdA.ProductID = ProdB.ProductID;
Now, when we run this query, it will produce these results.
Here we see that for each entry where table ProdA has a value, that value is reflected in the Results column. If there isn’t an entry in the ProdA table, the entry from ProdB appears in Results, which is exactly what our reader asked for.
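If you want to try the same logic without Access, here is a sketch using Python's sqlite3 module. It rests on a few assumptions: the product names are placeholders, ProdA holds IDs 1 through 11 minus 1, 5, and 8, and ProdB holds IDs 1 through 10. Two substitutions are made for portability: a CASE expression replaces IIF, and the tables are swapped with LEFT JOIN in place of RIGHT JOIN, since older SQLite versions support neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ProdA (ProductID INTEGER PRIMARY KEY, ProductName TEXT)")
conn.execute("CREATE TABLE ProdB (ProductID INTEGER PRIMARY KEY, ProductName TEXT)")
# Hypothetical rows: ProdA has IDs 1-11 without 1, 5 and 8; ProdB has IDs 1-10.
conn.executemany("INSERT INTO ProdA VALUES (?, ?)",
                 [(i, f"A-{i}") for i in range(1, 12) if i not in (1, 5, 8)])
conn.executemany("INSERT INTO ProdB VALUES (?, ?)",
                 [(i, f"B-{i}") for i in range(1, 11)])

results = conn.execute("""
    SELECT ProdB.ProductID,
           -- CASE plays the role of IIF(ProdA.ProductID Is Null, ..., ...)
           CASE WHEN ProdA.ProductID IS NULL
                THEN ProdB.ProductName
                ELSE ProdA.ProductName
           END AS Results
    FROM ProdB LEFT JOIN ProdA ON ProdA.ProductID = ProdB.ProductID
    ORDER BY ProdB.ProductID
""").fetchall()

for product_id, name in results:
    print(product_id, name)
```

IDs 1, 5, and 8 fall back to the ProdB name, while every other ID shows the name from ProdA.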
For more resources for learning Microsoft Access, check out Joel Lee’s How to Learn Microsoft Access: 5 Free Online Resources.