Anytime I hear the words “SSIS” and “Excel” in the same sentence, my first instinct is to run… run far. I won’t stop running until I am out of breath. After I collapse from exhaustion, I will keep crawling away as long as my arms and legs will take me. Then I will shut off my phone, lock myself in the nearest closet, and hold my knees to my chest in the fetal position, rocking back and forth while saying to myself, “it will never hurt me again”.
Ok, maybe it isn’t that bad. This also isn’t to say I don’t know what I am doing when it comes to that unholy marriage. But it can feel like having a star role in any of the Saw movies; it will probably hurt before it gets better…you get the idea.
Thankfully, there are tons of people who have gone through this same heartache who have documented their trials and tribulations. One such angel is Robert Sheldon in his article “Importing Excel Data into SQL Server Via SSIS: Questions You Were Too Shy to Ask”. He has created a pretty nice list of some of the most common “gotchas” the average developer will come across while rolling this rock uphill.
Now, after hearing all of this initial negativity, a question one might ask is, “If importing Excel Data with SSIS is such a hassle, why do it?” There are really two answers to this question.
First, a lot of organizations have a lot of good, usable, “BI-worthy” data locked up in one or a thousand Excel workbooks. Excel has turned into the de facto repository of data for many organizations for several reasons, even when that same organization already has an enterprise level database like SQL Server. It’s simple, easy to use, and for those folks who aren’t tech savvy, can just be a quick and dirty way to get things done in a pinch. Unfortunately, left unchecked, this results in what we often call “cockroach databases.” Quite honestly, it’s very prevalent.
The other answer is, if you already have SQL Server Standard Edition or better (and if you’re reading this you probably do), you already have SSIS available to you. You don’t have to purchase any additional third-party ETL tools or services. It also already integrates with a number of sources and destinations, well beyond anything Excel flavored. Just like any other software, SSIS has its pros and cons. Unfortunately, the cons can be shrouded in obscure error messages and metadata mishaps. This is where Robert Sheldon’s article alleviates some of that kind of headache.
So what can make it so difficult to use Excel as a source? Well, for example, no matter how many times you reiterate that under no circumstances should the column headers change names and columns should never be added without telling you first, the inevitable happens. Sheldon uses actual scenarios and gives easy to follow visuals on what screens you need to look at/change to tackle when this happens. A couple of the most useful tips are changing the data type that gets returned from the Excel sheet(s) and how that gets determined by SSIS, out of sync issues with the source document, and some nice Conditional Split scenarios for NULL values.
Hopefully, Mr. Sheldon’s article helped ease any frustration you may have felt while developing an ETL process using Excel as a data source. I, for one, appreciated the effort he put into it and the simple fact that he made it available. Drop a comment below and let me know if this was helpful for you!