Home | Articles | Functions | Tips & Tricks | SQL Server 2005 | SQL Server 2008 | SQL Server 2012 | SQL Server 2014 | SQL Server 2016 | FAQ | Practice Test |    
Tip of the Day : How to Get a List of Databases Within a SQL Server Instance
Home > Tips and Tricks > Query and Import an Excel File
Part 2: Querying an Excel Worksheet Without Header and With Mixed Column Data Type

Part 2: Querying an Excel Worksheet Without Header and With Mixed Column Data Type

Querying an Excel Worksheet Without Header

To query/read an Excel worksheet that does not contain any column header record, the HDR (HeaDeR) property will be included in the provider_string.  From the previous example the provider_string only contains the Excel version and the Excel file name.  Since the HDR property was not specified, the OLE DB provider assumes that the first record in the file contains the column header.

To specify to the OLE DB provider that the Excel worksheet does not contain any column header, simply add "HDR=No" in the provider_string as shown here:

SELECT * 
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
                'Excel 8.0;Database=C:\Source\Addresses.xls;HDR=No',
                'SELECT * FROM [Sheet1$]')

Using the same Excel worksheet example on the previous page and removing the column header row:

  |     A     |     B    |               C             |         D        |   E   |     F      |
------------------------------------------------------------------------------------------------
1 | Mickey    | Mouse    | Walt Disney World           | Lake Buena Vista | FL    | 32830      |
2 | Donald    | Duck     | Walt Disney World           | Lake Buena Vista | FL    | 32830      |
3 | George    | Bush     | 1600 Pennsylvania Avenue NW | Washington       | DC    | 20500-0003 |
4 | George    | Clooney  | 151 El Camino Drive         | Beverly Hills    | CA    | 90212-2704 |
5 | Tom       | Cruise   | 9830 Wilshire Boulevard     | Beverly Hills    | CA    | 90212-1804 |

The result of the SELECT statement shown above will be as follows:

F1       F2        F3                            F4                 F5   F6        
-------- --------- ----------------------------- ------------------ ---- ---------- 
Mickey   Mouse     Walt Disney World             Lake Buena Vista   FL   NULL
Donald   Duck      Walt Disney World             Lake Buena Vista   FL   NULL
George   Bush      1600 Pennsylvania Avenue NW   Washington         DC   20500-0003
George   Clooney   151 El Camino Drive           Beverly Hills      CA   90212-2704
Tom      Cruise    9830 Wilshire Boulevard       Beverly Hills      CA   90212-1804

Since the Excel worksheet did not have a column header, the OLE DB provider assigned the column name to each column.  The column names convention is from F1 to Fn.

Querying an Excel Worksheet With Mixed Data Type in Column

If you will notice from the result of the SELECT statement above and from the previous page, the ZIP code for the first two records became NULL even though there is data from the Excel worksheet.  The reason for this is because the OLE DB provider tries to "guess" the data type of each column by sampling the first few records from the file.  Once it determines the data type of each column, any values from that column that is not of the same data type is returned as NULL.  It does not provide an implicit conversion of the column value, such as converting a numeric data to a string/text data type.

From the given example, after sampling the first few records from the Excel worksheet the OLE DB provider was able to determine that the ZIP column is of string/text data type because there are more columns which contains string/text values than numeric values (3 against 2).  Since it was able to determine that the data type of the ZIP column is string/text, any values in that column which is not of string/text data type is returned as NULL.  The value of the ZIP column of the first two records, which is 32830, is of numeric data type and therefore is returned as NULL because it is not a string or text.  Although the numeric value can easily be implicitly converted to a string or text data type, the OLE DB provider will not do so.

If it so happens that the ZIP column has more numeric values than string/text values, that column will be determined to be of numeric data type and not string/text data type.  To illustrate, let's assume that the ZIP value of the third record contains a numeric value as follows:

  |     A     |     B    |               C             |         D        |   E   |     F      |
------------------------------------------------------------------------------------------------
1 | Mickey    | Mouse    | Walt Disney World           | Lake Buena Vista | FL    | 32830      |
2 | Donald    | Duck     | Walt Disney World           | Lake Buena Vista | FL    | 32830      |
3 | George    | Bush     | 1600 Pennsylvania Avenue NW | Washington       | DC    | 20500      |
4 | George    | Clooney  | 151 El Camino Drive         | Beverly Hills    | CA    | 90212-2704 |
5 | Tom       | Cruise   | 9830 Wilshire Boulevard     | Beverly Hills    | CA    | 90212-1804 |

Using the same SELECT statement above, the result will now be as follows:

F1       F2        F3                            F4                 F5   F6        
-------- --------- ----------------------------- ------------------ ---- ---------- 
Mickey   Mouse     Walt Disney World             Lake Buena Vista   FL   32830
Donald   Duck      Walt Disney World             Lake Buena Vista   FL   32830
George   Bush      1600 Pennsylvania Avenue NW   Washington         DC   20500
George   Clooney   151 El Camino Drive           Beverly Hills      CA   NULL
Tom      Cruise    9830 Wilshire Boulevard       Beverly Hills      CA   NULL

Since there are more numeric values in the ZIP column, the OLE DB provider has now determined that the column is of numeric value and anything that is not numeric is returned as NULL.

To overcome this problem and always return the values of a column even if the column contains mixed data types, another provider_string property needs to be included, the IMEX=1 (Intermixed) property.

 SELECT
* FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0','Excel8.0;Database=C:\Source\Addresses.xls;HDR=No;IMEX=1',
                'SELECT * FROM [Sheet1$]')

The above SELECT statement with the IMEX=1 property included generates the following result:

F1       F2        F3                            F4                 F5   F6        
-------- --------- ----------------------------- ------------------ ---- ---------- 
Mickey   Mouse     Walt Disney World             Lake Buena Vista   FL   32830
Donald   Duck      Walt Disney World             Lake Buena Vista   FL   32830
George   Bush      1600 Pennsylvania Avenue NW   Washington         DC   20500-0003
George   Clooney   151 El Camino Drive           Beverly Hills      CA   90212-2704
Tom      Cruise    9830 Wilshire Boulevard       Beverly Hills      CA   90212-1804

With the IMEX=1 property included, columns that contain intermixed data types are treated as string/text data types.

Part 1: Querying an Excel Worksheet
Part 3: Importing an Excel Worksheet

Related Topics: