I have a few remote SQL Servers that I need to pull a large amount of data from regularly (say 5,000,000 rows per server). This data also needs to be formatted and FTPed to another server. The dial-up portion is fine: I can connect and communicate with the servers, but the connection is sometimes slow, maybe only 19 Kbps.
Once connected, I tried to use System.Data.SqlClient.SqlBulkCopy, but it just throws a timeout error. I've set the timeout on the bulk copy to 7200 seconds and the timeout in each database's connection string to 900 seconds, but it still times out after about 30 seconds.
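The setup looks roughly like this (a sketch; connection strings and table names are placeholders):

    using System.Data.SqlClient;

    class BulkCopyExample
    {
        static void Main()
        {
            // Connect Timeout only governs how long opening the connection may take.
            string src = "Data Source=REMOTE;Initial Catalog=Db;Integrated Security=SSPI;Connect Timeout=900";
            string dst = "Data Source=LOCAL;Initial Catalog=Db;Integrated Security=SSPI;Connect Timeout=900";

            using (SqlConnection source = new SqlConnection(src))
            using (SqlConnection destination = new SqlConnection(dst))
            {
                source.Open();
                destination.Open();

                SqlCommand select = new SqlCommand("SELECT * FROM dbo.BigTable", source);
                // SqlCommand.CommandTimeout defaults to 30 seconds, so the command feeding
                // the bulk copy needs raising too, not just BulkCopyTimeout.
                select.CommandTimeout = 7200;

                using (SqlDataReader reader = select.ExecuteReader())
                using (SqlBulkCopy bulk = new SqlBulkCopy(destination))
                {
                    bulk.BulkCopyTimeout = 7200; // seconds
                    bulk.DestinationTableName = "dbo.BigTable";
                    bulk.WriteToServer(reader);
                }
            }
        }
    }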
I tried using a DataReader and calling SqlCommand.ExecuteNonQuery() to insert each row, which works, but it's slow and the connection is sometimes dropped.
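Roughly like this, reusing the placeholder source and destination connections from the sketch above (column names are placeholders too):

    // One parameterized INSERT per row: simple, but one network round trip per row,
    // which is what makes it slow over a 19 Kbps link.
    SqlCommand select = new SqlCommand("SELECT Id, Payload FROM dbo.BigTable", source);
    select.CommandTimeout = 7200;

    SqlCommand insert = new SqlCommand(
        "INSERT INTO dbo.BigTable (Id, Payload) VALUES (@id, @payload)", destination);
    insert.Parameters.Add("@id", System.Data.SqlDbType.Int);
    insert.Parameters.Add("@payload", System.Data.SqlDbType.NVarChar, 4000);

    using (SqlDataReader reader = select.ExecuteReader())
    {
        while (reader.Read())
        {
            insert.Parameters["@id"].Value = reader.GetValue(0);
            insert.Parameters["@payload"].Value = reader.GetValue(1);
            insert.ExecuteNonQuery();
        }
    }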
I also tried setting up DTS packages on the remote servers, scheduling them to dump the data I need to text files, and then downloading those files. But the files can be a few hundred MB (possibly even GB on some servers), and the dial-up connection usually drops at some point.
-
How many miles away is it? Can you sneakernet a DVD?
scottm: The servers are spread out all over the country.
ConcernedOfTunbridgeWells: Not as silly as might be thought. The sneakernet could be a courier firm.
scottm: The facilities are also secure, and not just anyone will be allowed access (even if we authorize it).
-
Compress the data before sending it; GZip compression is built into .NET 2.0 (System.IO.Compression):
http://www.windowsdevcenter.com/pub/a/windows/2006/09/12/using-data-compression-in-net-20.html
If the data is still too big, you can try using an external library or executable such as RAR or 7-Zip, which will typically produce an archive around half the size of the zipped copy.
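A minimal sketch of the built-in approach (GZipStream from System.IO.Compression; paths are placeholders):

    using System.IO;
    using System.IO.Compression;

    class CompressExample
    {
        static void Main()
        {
            using (FileStream input = File.OpenRead(@"C:\dump\data.csv"))
            using (FileStream output = File.Create(@"C:\dump\data.csv.gz"))
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
            {
                byte[] buffer = new byte[64 * 1024];
                int read;
                while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                    gzip.Write(buffer, 0, read);
            }
        }
    }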
-
If it is an option, zip it up, FTP it, and do the bulk insert on your side.
scottm: I'm retrieving the data from the remote systems, so this is not an option. If I could pull it here, there'd be no problem.
Otávio Décio: I see. But zipping the data seems to be required in your case.
-
Some thoughts on the topic:
Build a routine at source to extract the data and chop it into several smaller files. FTP the files individually, and have some check mechanism to ensure they have all arrived (a sketch follows this list).
If you can identify changed or new data at source, add some sort of changed-data-capture function there and ship only the deltas (apologies if you're already doing that).
Compress the chunked files to reduce the amount of data to ship.
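A minimal sketch of the chunk-and-verify idea from the first point (the chunk size, paths, and manifest format are all assumptions):

    using System.IO;

    class SplitExample
    {
        static void Main()
        {
            const int chunkSize = 5 * 1024 * 1024; // 5 MB pieces are cheap to retry over dial-up
            byte[] buffer = new byte[chunkSize];
            int part = 0;

            using (FileStream input = File.OpenRead(@"C:\dump\extract.csv"))
            {
                int read;
                while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                {
                    string name = string.Format(@"C:\dump\extract.csv.part{0:D3}", part++);
                    using (FileStream output = File.Create(name))
                        output.Write(buffer, 0, read);
                }
            }

            // A manifest (part count, plus per-part sizes or checksums) lets the receiver
            // confirm every piece arrived before rejoining them.
            File.WriteAllText(@"C:\dump\extract.csv.manifest", part.ToString());
        }
    }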
-
bcp.exe supports an "out" parameter that dumps a table to a flat text file (and a "queryout" parameter for the output of an arbitrary SQL statement).
- On the source side: Dump, zip and split the file into smaller portions, upload
- On the destination side: download, decompress and rejoin the files once you have them all, and bcp in.
It's not an elegant solution. If you want to do it programmatically, you'll be making process calls. It does, however, get the job done.
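Those process calls might look something like this (a sketch; the server, database, and paths are placeholders):

    using System;
    using System.Diagnostics;

    class BcpOutExample
    {
        static void Main()
        {
            // Dump the table in character format: -S names the server,
            // -T uses a trusted connection, -c writes plain text.
            ProcessStartInfo psi = new ProcessStartInfo("bcp.exe",
                @"MyDb.dbo.BigTable out C:\dump\bigtable.txt -S SOURCESERVER -T -c");
            psi.UseShellExecute = false;

            using (Process bcp = Process.Start(psi))
            {
                bcp.WaitForExit();
                if (bcp.ExitCode != 0)
                    throw new Exception("bcp exited with code " + bcp.ExitCode);
            }
        }
    }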
The question you might want to ask, however, is how cost-effective it is to build a solution around the limitations of dial-up versus getting a broadband connection where possible. I expect that regardless of the solution you decide on, you'll have ongoing maintenance problems with dial-up.
scottm: I've been having "ongoing maintenance problems" with dial-up for years. But it's not my call, nor is it always an option.
Michael Meadows: Understood. The bcp option has been a solid solution for me. The only thing is that you'll need to be able to set the source server up to run command-line jobs in order to create the out file.
-
You'll probably want to use a combination approach.
BCP OUT the table in native format.
Compress the file with a tool that will make a series of files.
FTP the individual files. If the connection drops, you only have to re-send the piece that failed rather than starting the whole transmission over.
I've found that 7-Zip (free, open source) has the best compression and will make a series of numbered files of a size that you specify. I believe it can be run from the command line.
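For example, the command-line version can create numbered volumes with the -v switch (the volume size and file names here are just for illustration):

    7z a -v10m archive.7z bigtable.txt

That produces archive.7z.001, archive.7z.002, and so on, each 10 MB, so every piece can be FTPed and retried on its own.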
HTH Andy
-
What I ended up doing was creating a small app in C (a few of these machines are WinNT and that was the easiest way; it also allows others to retrieve the data manually if necessary, without the ability to alter the source) that takes a few arguments to build the query I need. It then runs the query and dumps the results in the required CSV format. Then it calls 7-Zip with the highest compression level to shrink the data as much as possible (this reduces a 500 MB file to about 20 MB).
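That 7-Zip call is along these lines (-mx=9 is 7-Zip's maximum compression level; file names are placeholders):

    7z a -mx=9 data.7z data.csv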
Because I have to bring the data back to me before I can FTP it to the necessary place, and the remote servers don't have any internet access, I'm still just copying the file to a Windows share, then decompressing it locally and FTPing the uncompressed data (as requested) to its destination.
This may not be the best approach, but it's working. Thanks