Querying an Excel Worksheet With Mixed Data Type in Column
If you will notice from the result of the SELECT statement above and from the
previous page, the ZIP code for the first two records became NULL even
though there is data from the Excel worksheet. The reason for this is
because the OLE DB provider tries to "guess" the data type of each column by
sampling the first few records from the file. Once it determines the data
type of each column, any values from that column that is not of the same data
type is returned as NULL. It does not provide an implicit conversion of
the column value, such as converting a numeric data to a string/text data type.
From the given example, after sampling the first few records from the Excel
worksheet the OLE DB provider was able to determine that the ZIP column is of
string/text data type because there are more columns which contains string/text
values than numeric values (3 against 2). Since it was able to determine
that the data type of the ZIP column is string/text, any values in that column
which is not of string/text data type is returned as NULL. The value of
the ZIP column of the first two records, which is 32830, is of numeric data
type and therefore is returned as NULL because it is not a string or
text. Although the numeric value can easily be implicitly converted to a
string or text data type, the OLE DB provider will not do so.
If it so happens that the ZIP column has more numeric values than string/text
values, that column will be determined to be of numeric data type and not
string/text data type. To illustrate, let's assume that the ZIP value of
the third record contains a numeric value as follows:
| A | B | C | D | E | F |
------------------------------------------------------------------------------------------------
1 | Mickey | Mouse | Walt Disney World | Lake Buena Vista | FL | 32830 |
2 | Donald | Duck | Walt Disney World | Lake Buena Vista | FL | 32830 |
3 | George | Bush | 1600 Pennsylvania Avenue NW | Washington | DC | 20500 |
4 | George | Clooney | 151 El Camino Drive | Beverly Hills | CA | 90212-2704 |
5 | Tom | Cruise | 9830 Wilshire Boulevard | Beverly Hills | CA | 90212-1804 |
Using the same SELECT statement above, the result will now be as follows:
F1 F2 F3 F4 F5 F6
-------- --------- ----------------------------- ------------------ ---- ----------
Mickey Mouse Walt Disney World Lake Buena Vista FL 32830
Donald Duck Walt Disney World Lake Buena Vista FL 32830
George Bush 1600 Pennsylvania Avenue NW Washington DC 20500
George Clooney 151 El Camino Drive Beverly Hills CA NULL
Tom Cruise 9830 Wilshire Boulevard Beverly Hills CA NULL
Since there are more numeric values in the ZIP column, the OLE DB provider has
now determined that the column is of numeric value and anything that is not
numeric is returned as NULL.
To overcome this problem and always return the values of a column even if the
column contains mixed data types, another provider_string property needs
to be included, the IMEX=1 (Intermixed) property.
SELECT
* FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0','Excel8.0;Database=C:\Source\Addresses.xls;HDR=No;IMEX=1',
'SELECT * FROM [Sheet1$]')
The above SELECT statement with the IMEX=1 property included generates
the following result:
F1 F2 F3 F4 F5 F6
-------- --------- ----------------------------- ------------------ ---- ----------
Mickey Mouse Walt Disney World Lake Buena Vista FL 32830
Donald Duck Walt Disney World Lake Buena Vista FL 32830
George Bush 1600 Pennsylvania Avenue NW Washington DC 20500-0003
George Clooney 151 El Camino Drive Beverly Hills CA 90212-2704
Tom Cruise 9830 Wilshire Boulevard Beverly Hills CA 90212-1804
With the IMEX=1 property included, columns that contain intermixed data
types are treated as string/text data types.
Part
1: Querying an Excel Worksheet
Part 3: Importing an Excel Worksheet
|