Excel Data Queries
Use MS Query to Treat Excel as a Relational Data Source
Excel ranges can work like relational tables. You can join them by common fields. Query them with SQL. And use queries in PivotTables. Here's how.
By guest author: Marty Ryerson
I work for a manufacturing company with plants in five states. When I started working there I used Crystal Reports to answer people’s questions about their data.
But the Crystal reports took a long time to create. And about 95% of the time, after I had written a Crystal report, the
person I had done it for would ask to see it in Excel.
With more work, I figured out a way to export the reports from Crystal to Excel without spending a lot of time. But it still took a lot of time to create or modify new reports.
Several years ago, I discovered Microsoft Query, which is included
with Excel. Although the program is less advanced than other Office
programs, it obviously could give Excel users significant
power in working with external data. But unfortunately, I could find
very little documentation about the tool.
Finally, I learned that Timothy Zapawa had written about MS Query
extensively in his 2005 book,
Excel Advanced Report Development. With his information at hand, I finally have been able
to use MS Query on the job.
MS Query gives Excel users the ability to access 3rd party
databases, text files, and Excel workbooks as relational data sources.
With text files, you can place them all in one folder to form a database.
With Excel, you define several named ranges in a single workbook, and
then use the ranges as database tables.
MS Query doesn't give you many of the built-in features of a “real” database
program, such as Microsoft Query Analyzer or TOAD from Quest.
But you certainly can join two or more tables by their common fields. You can use SQL
queries to access these tables.
And you can send the SQL query results to worksheets or access them with
If you know SQL, you can slice and dice all you
want. If you don’t know SQL, it's a lot easier to learn than VBA.
In this article, I'll show you how to define three ranges in an Excel
workbook as relational tables, and then display queries against these
tables in a worksheet. I'll also explain how to access the tables using
If you want to follow along with this description, you can
download the sample MS Query workbook from this link.
The Sample Database
The mini-database that I created has three ranges organized as tables.
Each of the tables has more rows than are pictured below. I've included
to illustrate the data and how it's organized. The tables
1. A customer table named CUST:
2. An order table named ORD:
3. A sales rep table named SREP:
Notice that like standard relational tables, these have certain fields in
Also notice how the second table is formatted. These formats are
among the few that MS Query will recognize. To assign these formats,
choose columns B and C and then assign the first format listed in
Format, Cells, Number, Date. Then choose column F and assign the first
format listed in Format, Cells, Number, Currency.
When using Excel as the source of data, it's important that each of the tables be a named range,
because when MS Query uses workbooks as a data source it will recognize only named ranges as tables. I usually place the
ranges on separate sheets, but that isn't necessary.
set up these tables, save and close the workbook. The workbook must be closed when it is accessed by MS Query.
Create a Connection
Open another workbook where you will create your Excel report. Choose Data, Import External Data, New Database Query,
which launches the Choose Data Source dialog box.
(If MS Query isn't installed, a message will appear asking if you want to install it.
To do so, place your installation disk in the appropriate drive and follow the
The first time you access a database, including a workbook database,
you'll need to create a new Data Source. To do so, select the <New Data Source> line, and then click OK.
In the first edit box of the Create New Data Source dialog, give your data source a
name that will remind you what it is connected to. This is the name you will select
from a list when you create new queries later.
The item asks you to select the driver type. Because Excel
is the source of data for this exercise, select the Excel driver shown
from the drop-down list.
Choose the Connect button and select the version of Excel you're working with.
Notice that even if you use Excel 2003, the most-current version of
Excel listed is Excel 97-2000.
Choose the Select Workbook button, launching the Select Workbook
Use this dialog to navigate to the workbook that will serve as your
data source. Here, OEDATA.xls contains my Order Entry Data. Select the
workbook from the list.
Choose OK to accept your Database Name selection. In the ODBC
Microsoft Excel Setup dialog, choose OK to return to the Create New Data
Source dialog. This dialog now shows the path to your Excel workbook
that acts as your database.
Choose OK to return to the Choose Data Source dialog. Note that the
Data Source you just created is already selected in the list.
Make sure the check box at the bottom of the dialog, "Use the Query
Wizard to create/edit queries," is NOT checked.
Query Wizard can help if you are doing very simple queries, but I want
to show you more powerful features of the program. You can experiment with
the Query Wizard later, if you like.
Now that you've defined an Excel workbook as a relational database,
you can use it in queries.
Create a Query
The Choose Data Source dialog now includes the data source
(MSQuery--Excel) that we've defined for the OEDATA.xls workbook. Choose
OK to use this data source.
data source will appear each time you access the Choose Data Source
After you choose OK, Excel displays both the full-screen Microsoft
Query application window and the Add Tables dialog. You will use these
tools to specify what data you want returned, either by pointing and clicking, or by pasting an SQL statement into the SQL window. For this example,
we'll use the point and click method.
In the Add Tables dialog, double-click on each of the tables you want to add. Notice that all of the named ranges appear here.
For this example, let’s add all of the tables. To do so, select each
table in turn, and then choose Add or double-click. Doing so displays them in MS Query,
as shown here. After you've added each table, close the Add Tables
The grey pane near the top of this figure is called the Tables pane. The white area at the bottom is called the Data pane. When you execute the query, the data will be returned to a grid in the Data pane.
middle pane is called the Criteria pane. It isn't visible by default. To see the Criteria pane,
choose View, Criteria. You also can choose the Show/Hide Criteria button,
shown here, to
toggle whether this pane is visible.
Now, let’s join the tables shown in the Tables pane.
The matching field in CUST and ORD is CustNum. Click on CustNum in CUST, and drag
it to CustNum in ORD. When you drop, a line will appear, joining the two tables.
The matching field in ORD and SREP is SalID. In a similar fashion,
connect the SalID
field between ORD and SREP.
Now, let's use these tables to create a query.
Suppose we're interested only in sales in West Virginia. In that case, we would restrict the returned data set to just the records where the ST
(state code) field in the CUST table is equal to WV.
We set up
this filter by dragging the ST field from the CUST table to the top-left
cell of the Criteria pane, and then by expressing the filter we want to use. You tell MS Query what value you want this field to be equal to by typing the value in the second line of the criteria pane.
In this case, we type WV. (MS Query adds
single quotes around WV when you move off the cell.)
On the other hand, if we wanted to show sales everywhere except West Virginia, we could
enter the expression, <> WV in this cell. This would return
the state code does not have the value WV.
Please note that these criteria are not case sensitive when
you query Excel files, but they might be case sensitive when you query
other data sources. For example, queries against an Oracle or SQL Server
database may be case sensitive, depending on how your database is set
Next, we need to tell MS Query which columns we would like to see in
our Excel report. For this exercise, let's choose to see the customer number, the customer’s last name, the type of customer (cash or
credit), the amount of the order, the delivery date, and the name of the sales rep. To do this, double-click on the fields in the tables
shown in the following figure, and they'll appear as headings in the data grid.
After you've added all the fields you want, click on the Query Now button,
shown here. The data will be returned in the data grid, as shown in top
few rows of this figure.
Note that the data grid isn't limited to 65,536 rows. If you suspect
the dataset you've returned is larger, you can check this by clicking on
Record” button at the bottom of your window; it's the right-most button
shown here. Here, for example, the query produced 140 records.
Now would be a good time to save your query.
will allow you or another person to use the same query later in a new
workbook, with additional data, or both. To save the query, choose
File Save As in the Microsoft Query window and then name your query
anything you want. In the File Save As dialog you'll see two file
formats, dqy and qry. If dqy is specified as the default, use that
format. The qry file format was used in earlier versions of the tool.
At this point, you may be curious to know what the SQL statement you just generated looks like.
When you click on SQL toolbar button shown
can see the SQL statement in the SQL window. If you know SQL, you can edit the statement to add features that are not supported by the generator, but are supported by the ODBC driver you're using.
Export the Data to Excel
If the data in the grid is what you want to export to Excel, click on the
Return Data button,
You'll be returned to Excel, and the Import Data window will let you decide where you want to put the data. For this example, I'll accept the defaults, and put the data in the existing worksheet in Column A, Row 1, by clicking the OK button.
You now can apply any formats, formulas, and so on, that you wish.
Because this is just an introduction, I will leave it at that. You can
do a lot more.
- You can refresh this query by clicking a button, in case the data in the original tables has changed.
- You can add formulas and have them automatically “copy down” each time you refresh the query.
- You can add parameters, and have them refer to a cell in the worksheet, so that you can see different subsets of the data.
You can generate similar queries on text files and databases.
new data source you need to create a “Data Source
Name”. Once you've done so, you can use the data source repeatedly
to create any number of queries against that database. You can save the queries and use them in
a new workbook.
Return the Data to a Pivot Table
Let me show you one last trick, one that lets you analyze data when the data set you want to look at is too big to fit on an Excel spreadsheet.
From the Data menu, select Pivot Table and Pivot Chart. Select
External Data Source when the Wizard comes up.
In the "Step 2 of 3" dialog above, choose Get Data.
Choose your data source and proceed as before,
create an entirely new query. When you return,
you'll have a PivotTable with all the data in the pivot cache, but not on a spreadsheet. Even if the data would not fit on a spreadsheet, this will allow you to create all the pivot reports you need.
The thing I find most appealing about this approach is that it is relatively easy to learn if you have some good documentation. I have been able to reduce my workload and stress significantly by teaching the people who want relatively simple, one-time reports, or those who want to see the data in numerous different configurations, how to use these tools.
If you think this could be of value to you, I highly recommend Mr. Zapawa’s
book, “Excel Advanced Report Development” available now from Wiley