Friday, March 23, 2012
Importing multiple Excel Files
Is there a method with Data Transformation Services where I can have it work off of all the files in a given directory. I can set up DTS to work off of specific Excel files with no problem, but what I would like to do is set up a DTS so it could pull from each of the 800+ files.
Is this possible, or do I need to look at a solution outside of SQL to consolidate the Excel files first?
The Excel file would have columns similar to the following: store_id, zip_code, sales, transactions.There are lots of different ways to handle this kind of problem.
By far my largest concern with 800+ Excel files is GIGO (Garbage In, Garbage Out). Even one sheet of undetected junque data, and you can't get a correct answer! Before I can even suggest a technical solution, I need to understand what the risks of bad data are for you, and how much manpower you are willing to commit to either detecting or correcting the bad data... That answer will strongly influence what I suggest for a solution.
-PatP|||In terms of the quality of the data, it comes directly from my client, so as long as what I have stored in our database matches what they gave me, I am in good shape. I run some additional reports from our application to spot anomalies in the data quality.
In terms of confidence in the file structure integrity: the client isn't very sophisticated (i.e. they cannot generate one file with all the stores in it), but their result file is very simple and routine. Last month was the first time I had gone through this process with them, and all 843 files were clean.
In terms of manpower... I'm a one man technical staff for a company with 50--70 employees, and several clients, and the database management / design is only one part of my responsibilities. Considering I would be doing this particular task once a month, 30 min - 45 min of effort would hopefully be the max I would have to spend.
Having done manipulation of customer / client files for several years now, I know to expect the unexpected; and GIGO is always one of my biggest concerns. Having no support staff to help, limits the effort I can make, but I need to know that what I received is what I loaded. If something cannot be loaded, identifying that would be great.
I don't mind loading the data into a staging area... and then doing some post-load validations in SQL before moving the data to the intended tables in the database. From my perspective, the key is getting it out of the Excel format, which is terribly unfriendly for me to validate against, into a format / structure that I can.
Does that help guide you?|||I am not sure about DTS, because scripting is my preference, so there's gotta be someone that would come up with a DTS solution (I've seen a couple of samples on the net that demonstrate ennumeration of files in a directory from a DTS/SSIS package).
But in scripting world you can have a batch file that would dump a list of files into a file queue table, and then extract a script-like output into a TSQL script that you can run in the next step with osql/sqlcmd.
When I had to deal with Excel files in the past, I ended up building linked servers on the fly for each file in order to be able to handle everything in TSQL (plenty of references on the net as to how to create a linked server to an excel spreadsheet).|||You can do this in DTS by defining a filename variable and using VB to cycle through the files. So what you do depends upon whether your comfort zone is in VB or in scripting.
Monday, March 12, 2012
importing data to new database
hi,
i have 2 databases (DT1 and DT2), both contains one table customers with almost identical structure, the only difference is the firstname and lastname in DT1 are char(30) and varchar(50) in DT2. i've created both tables already. i tried using SQL 2005 managment studio to do a 'import data' and it worked. all the information in DT1 were copied to DT2.
now i have to repeat the process for many databases and it would be too tedious to do it one by one. my question is, is there a script i can use to do it so that i could make a small application to automate the process?
thanks in advance.
bob
hi bob,
you can save the script generated by the wizard as an ssis or dts package
you can automate from ssis or dts
regards,
|||I addition to Joyea and the use of SSIS, which is very extensible you can also use the threepart notation to access the data from another database and write a stored procedure for this:
Select * From Database.Schema.Objectname
HTH, Jens Suessmeyer.
http://www.sqlserver2005.de
thanks for your suggestions.
Bobby