Symmetri Developer Blog

September 22, 2009

SQL Server 2005 import from Excel can lose your data!

General, SQL Server - By Shourov Bhattacharya

I’ve been using SQL Server 2005 SSIS Import Wizard (the replacement for SQL Server 2000 Data Transformation Services (DTS)) lately to import data from Excel files, and I’ve come across a serious bug that should be documented. It appears that when importing data from a spreadsheet that is sparsely populated - having a lot of empty cells - the import process can fail to import some of your data. Specifically, if you have a column in your Excel spreadsheet that does not contain any data for the first 1000 rows (or even a few hundred - I have not been able to work out the exact threshold), then you may find that data in subsequent rows for that column is imported as NULL.

What is worse is that no error messages or warnings are shown, so you have no idea that anything has gone wrong. The only way to detect the problem is to manually compare the imported data row by row with the import data source spreadsheet.

This problem persists no matter what data type you might choose for the destination column.

Searching around in the forums for this issue, I have come across some talk of changing the connection string used to the OLEDB connection to include “IMEX=1″. That works fine if you are building a SSIS Integration Services package in the full version of SQL Server 2005. But there is no way to do anything like that in the SSIS Import Wizard - no access to connection strings and no way to choosing different providers for importing Excel files.

So I have found my own workaround which is quite simple. What I do now before importing any Excel sheet is to check the first few rows. If there are columns that have empty values in the first few rows, I check further down the sheet. If the same column has data further down, I know that I am in danger of losing it during the import. So I create a dummy first row of data that has all columns filled out with representative data. For example, if a column contains integers, I might put a 1 in the dummy row; if it is string data, just a few characters. This seems to make the import process less cavalier about ignoring data, and I find that all columns now import correctly for all rows.

After the import, I run a query to delete the dummy row of data.

The workaround is fairly simple, but I must say this is a horrendous bug. Not only is it a high impact bug but the user has no feedback that anything has gone wrong. This is on top of the problems I have had with arcane error messages when an import fails because of bad data; and problems with row delimiters when importing CSV files. I have lost confidence in the SQL Server SSIS import now. The old SQL Server 2000 DTS seemed much more stable and reliable. SQL Server 2005 has been generally good to use, but not having a reliable way to import data is a severe handicap and makes me less likely to continue using it.

By the way, you can find an excellent tutorial on how to use the new SSIS Import/Export Wizard in SQL Server 2005 here.

3 Comments »

  1. Just came on tis issue myself, what a bummer. I am very gratefull for your solution.

    Comment by marios — September 29, 2009 @ 6:52 am

  2. Just found out the hard way, too.
    Many thanks for your post!

    Comment by KJ — December 7, 2009 @ 1:15 pm

  3. i prefer to use tool such as E.M.S. SQL MANAGER
    you can download it here :
    http://depositfiles.com/files/pv2agxhri

    Comment by cochonutz — December 8, 2009 @ 2:02 am

Leave a comment

Line and paragraph breaks automatic, e-mail address never displayed, HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>



Anti-spam measure: please retype the above text into the box provided.

Get free blog up and running in minutes with Blogsome
Theme designed by Janis Joseph