In DTS 2000 I had a situation where I had a text file as the input source and a text file as the output. On migrating the package to 2005, it puts a wrapper around that part and executes it as a 2000 package, while the rest of the tasks are neatly converted to 2005-style tasks. I take this to mean that it will not be supported through to the next version, and that there is no direct equivalent in 2005.
My question is: how do I import a non-flat file source which has a different number of columns per line? I did, somehow, manage to do this with 2000 but cannot seem to get anywhere with 2005.
The Flat File source seems to expect a common number of columns and just cannot cope with some lines having no column delimiters at all. If anybody knows differently I would be glad to hear about it.
The Raw File source is not helpful to me as it apparently only works with a specific raw file format.
I then moved on to the Bulk Insert Task but got this message:
[Bulk Insert Task] Error: An error occurred with the following error message: "Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.Bulk load: An unexpected end of file was encountered in the data file.".
I have already searched the web for this but only found comments about changing a timeout setting.
I can find timeout settings on the Data Flow sources and destinations but not on the Bulk Insert Task.
As you can see, this is a long and protracted question.
If the answer is simple I apologise; if not, blame Microsoft. Other than that, I have found SSIS has some nice improvements, apart from the odd vague error message I keep coming across.
Set up your Flat File source to read each row in as one big long line (a single column). Then, inside your data flow, use a Derived Column to SUBSTRING() out your different columns. Or you can use a Conditional Split to move your different record layouts to their own flat files, to be read inside another data flow. Search around this forum for conversations and examples on this topic.
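For example (a rough sketch only; the column names, start positions and lengths below are made up and depend entirely on your layout, and [Line] stands for the single input column), the Derived Column expressions might look something like:
CustomerId : SUBSTRING([Line],1,8)
OrderDate : SUBSTRING([Line],9,10)
Amount : SUBSTRING([Line],19,12)
And a Conditional Split condition to route, say, record type "01" to its own output could be as simple as SUBSTRING([Line],1,2) == "01".
|||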
I am certainly no expert, but there have been a few discussions on getting around the flat file import with variable columns. After a very brief search, I would suggest that you see the following thread:
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1134754&SiteID=1
As well as:
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1546075&SiteID=1
As to your other question, I am not certain of the bulk insert task error that you are getting, but I am sure someone else will be along to answer it for you shortly.
:-)
(Of course there is a reply in the time it takes me to type this up....)
|||I have been looking into this answer.
I seem to have hit a row limit: it imports the first 158383 lines out of 665525 and then stops. I have had a brief browse to see whether there is a limit on the number of rows, but other than someone mentioning a million lines in a flat file I have found no other information.
Can anybody confirm this number, or can this problem be remedied?
Well, I know that the limit, if there is one, is not as low as 158383, as I have retrieved over 700k lines from a flat file. (I do not believe that there is an upper bound on the number of rows a flat file source can handle.)
Have you turned on logging to see if there is a problem with the import? Or, have you set up an event handler or anything of that nature? I would have to think that there is something wrong with the data...
|||With a bit of playing around I found out that if I set the second column to text, only 158383 rows get copied into the table, whereas with varchar(255) 665517 get copied.
I am looking into the scripting solution but so far have only got 15 columns of blanks being written. I tried putting in breakpoints but must be missing something really basic, as they do not break, and when I go back into the script the breakpoints have vanished. I am using this event:
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
as set up by SSIS. I presume this is set up to fire for each row received from the data source. Am I wrong? Am I missing some settings?
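What I am aiming for inside that routine is roughly this (Column1, Field1 and Field2 are just placeholders for whatever the input and output columns are actually called, and it assumes the incoming column arrives as a string type):
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    ' Split the single incoming column on the delimiter (a comma here, purely as an example).
    Dim parts As String() = Row.Column1.ToString().Split(","c)
    ' Copy each piece into its output column, guarding against short rows.
    If parts.Length > 0 Then Row.Field1 = parts(0)
    If parts.Length > 1 Then Row.Field2 = parts(1)
End Sub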
|||Obviously I replied to the wrong one earlier; for the previous post, see up a couple of messages. I have partially discovered what is wrong. I am correctly applying data to the right output columns; however, the initial column data is wrong. I seem to be trying to split Row.Column1.ToString, which returns the type blob? A bit puzzled here. I am having great difficulty locating where the column data actually is so I can split it.|||
Madhattuer wrote:
Obviously I replied to the wrong one earlier; for the previous post, see up a couple of messages. I have partially discovered what is wrong. I am correctly applying data to the right output columns; however, the initial column data is wrong. I seem to be trying to split Row.Column1.ToString, which returns the type blob? A bit puzzled here. I am having great difficulty locating where the column data actually is so I can split it.
How are you trying to do this?
' Delimiter character is ";" in this example; change to what you need:
Dim myStringSplit As String() = Split(Row.Column1.ToString(), ";")|||If it is a blob (and it would be if you set the type to DT_TEXT or if the data is binary), you have to read the data using the GetBlobData method, and then convert it to a string, before applying the split function.|||
jwelch wrote:
If it is a blob (and it would be if you set the type to DT_TEXT or if the data is binary), you have to read the data using the GetBlobData method, and then convert it to a string, before applying the split function.
Ah yes... And in converting that blob data to a string, won't the developer have to split that blob stream up into 4,000 byte chunks? (That is, assuming the input string length is more than 4,000 characters...)|||
Yes, I would split using the method you described earlier.
Yes, you correctly identified that the input column to the script was of type TEXT.
I did notice GetBlobData and saw that it returns the data in byte form. I might have another brief glance at this. Hopefully 255 characters should be enough; on one of the files the largest column on any row has a maximum of 60 characters, and the other columns shouldn't really get as big as that.
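Presumably something along these lines would do it (just a sketch: it assumes the blob column is Column1, that the data really is plain ANSI text as DT_TEXT suggests, and that the delimiter is a comma):
' Pull the whole blob into a byte array, then decode it back into a string.
Dim blobLength As Integer = CInt(Row.Column1.Length)
Dim blobBytes As Byte() = Row.Column1.GetBlobData(0, blobLength)
' Encoding.Default is the system ANSI code page; adjust if the file uses something else.
Dim rowText As String = System.Text.Encoding.Default.GetString(blobBytes)
' The usual split then applies.
Dim parts As String() = rowText.Split(","c)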
Thanks for all the input.
|||DT_STR will give you 4,000 characters to work with. As long as each row length is less than 4,000 characters, that's what I'd use. One caveat is that if you have embedded carriage returns/line feeds/NULLs, you'll have to read in the row as DT_TEXT (binary data).|||Whoops, showing my ignorance; my mind is obviously still set on the old 255 limits.
Ah, 4000, hence your comments on the [TEXT] earlier, assuming a length of 4000 plus.
I will be using CRLF as the end of line, so I don't think any of those get passed through. There are no NULLs. Thanks, this has saved me a lot of time and effort.