显示标签为“source”的博文。显示所有博文
显示标签为“source”的博文。显示所有博文

2012年2月13日星期一

Batch insert 10000 rows at a time and commit

I want to Insert 10000 rows at a time and commit in sql server , Is
there a way to do it if the source tables have no id fields ?
What would be the most efficient method?

Thanks

AjayHi

Bulk Insert/BCP have batch size parameters which will do this, although the
expect imput from data files. If the source is a table then you could set
rowcount and loop (if you can differentiate the records (there may be
another natural key) or possibly use a cursor to break the sizes down. There
are issues with rowcount and inserting from remote tables (see BOL (Topic:
ROWCOUNT) for more information)

John

"Ajay Garg" <ajayz90@.hotmail.com> wrote in message
news:d9477327.0411200529.4cfc0386@.posting.google.c om...
>I want to Insert 10000 rows at a time and commit in sql server , Is
> there a way to do it if the source tables have no id fields ?
> What would be the most efficient method?
> Thanks
>
> Ajay

2012年2月11日星期六

Basic Text File Data Flow Source

In DTS 2000 I had a situation where I had a text file as input source and text file as output source. On migrating the package to 2005 it puts a wrapper around it which executes it as a 2000 package, the rest of the tasks are neatly converted to 2005 style tasks. I presume this to mean that this will not be supported through to the next version, and there is no direct equivalent in 2005.

My question is how do I import a non-flat file source which has different numbers of columns per line. I did ,somehow, manage to do this with 2000 but cannot seem to get anywhere with 2005.

The flat file source seems to be expecting a common number of columns and just can't seem to cope with no column delimiters on some lines. If anybody knows different I would be glad to hear about it.

Raw data is not helpful to me as only works with a specific raw type (apparently)

Went onto Bulk Insert Task but got this message

[Bulk Insert Task] Error: An error occurred with the following error message: "Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.Bulk load: An unexpected end of file was encountered in the data file.".

Have already browsed with this on web but only find comments about changing timeout setting.

Can find timeout settings in DataFlow source and DataFlow destinations but not in Bulk Insert Task.

As you can see this is a long and protracted question.

If the answer is simple I apologise if not blame Microsoft. Other than that have found SSIS has some nice improvements, apart from the odd vague error message I keep coming across.

Set up your flat file source to read in one big long line. Then inside your data flow, use a derived column to substring() out your different columns. OR you can use a conditional split to move your different record layouts to their own flat files, to be read inside another data flow.

Search around this forum for conversations and examples on this topic.|||

I am certainly no expert, but there have been a few discussions on getting around the flat file import with variable columns. After a very brief search, I would suggest that you see the following thread:

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1134754&SiteID=1

As well as:

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1546075&SiteID=1

As to your other question, I am not certain of the bulk insert task error that you are getting, but I am sure someone else will be along to answer it for you shortly.

:-)

(Of course there is a reply in the time it takes me to type this up....)

|||

Have been looking into this answer.

Seem to have hit a row limit. Imports the first 158383 lines out of 665525 and then stops. Have had a brief browse to see if there is a number of rows limit but other than someone mentioning a million lines in a flat file have found no other information.

Can anybody confirm this number or can this problem be remedied.

|||

Well, I know that the limit, if there is one, is not as low as 158383 as I have retrieved over 700k lines from a flat file. (I do not believe that there is an upper bound on the number of rows a flat file source can handle)

Have you turned on logging to see if there is a problem with the import? Or, have you set up an event handler or anything of that nature? I would have to think that there is something wrong with the data...

|||

With a bit of playing around I found out that if I set the second column to text only 158383 get copied into the table whereas with varchar(255) 665517 get copied.

Looking into the scripting solution but so far have only got 15 columns of blanks being written. Tried putting in breakpoints but must be missing something really basic as not breaking and when going back into script breakpoint vanished. I am using this event

PublicOverridesSub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

as set up by SSIS I presume this is set up to work with each row received from the data source. Am I wrong? Am I missing some settings?

|||Obviously replied to wrong one earlier for previous post see up a couple of messages. Have partially discovered what is wrong. Am correctly applying data to right output columns. However inital column data is wrong. I seem to trying to split the Row.Column1.ToString which returns the type blob? A bit puzzled here. Having great difficulty locating where the column data actually is so I can split it.|||

Madhattuer wrote:

Obviously replied to wrong one earlier for previous post see up a couple of messages. Have partially discovered what is wrong. Am correctly applying data to right output columns. However inital column data is wrong. I seem to trying to split the Row.Column1.ToString which returns the type blob? A bit puzzled here. Having great difficulty locating where the column data actually is so I can split it.

How are you trying to do this?

# Delimiter character is ";" in this example; change to what you need:
Dim myStringSplit as String() = Split(Row.Column1.ToString, ';')|||If it is a blob(and it would be if you set the type to DT_TEXT or if the data is binary), you have to read the data using the GetBlobData method, and then convert it to a string, before applying the split function.|||

jwelch wrote:

If it is a blob(and it would be if you set the type to DT_TEXT or if the data is binary), you have to read the data using the GetBlobData method, and then convert it to a string, before applying the split function.

Ah yes... And in converting that blob data to a string, won't the developer have to split that blob stream up into 4,000 byte chunks? (That is, assuming the input string length is more than 4,000 characters...)|||

Yes I would split using the method you described earlier.

Yes correctly identified that the input column to the scripting was of type TEXT.

I did notice the GetBlobData and noticed that it returned in byte form. I might have another brief glance at this. Hopefully 255 characters should be enough. On one of the files the largest column on one of the rows has a maximum of 60 but the other columns shouldn't really get as big as this.

Thanks for all the input.

|||DT_STR will give you 4,000 characters to work with. As long as each row length is less than 4,000 characters, that's what I'd use. One caveat is that if you have embedded carriage returns/line feeds/NULLs, you'll have to read in the row as DT_TEXT (binary data).|||

Whoops showing my ignorance, mind obviously still set on old 255 limits.

Ah 4000 hence your comments on the [TEXT] earler assuming length 4000 plus.

Will be using crlf as end of line so I don't thinks any of those get passed through. There are no NULLS. Thanks this has saved me a lot of time and effort.

Basic Table update/append question

I'm new to SQL server. I want to add or append a unique set of rows to a destination table from a source table, they are essentially the same table by definition. The source table is updated every hour via DTS, all rows deleted and new set added. Both tables have the same primary key. Approximately 40 unique rows are created each hour and I would think the best approach would be to append the new rows to the destination table. I think an Append query will run into a primary key conflict.

In Access, I did this within VB by checking the max value of the primary key and then running the append for any values greater than that.

In SQL, I'm not sure if this should be done as a stored procedure or if there is an easier approach altogether.In SQL Server (or in Access...) you are better off using a LEFT OUTER JOIN against your destination table, and filtering where the destination table key is null:

select [YourFields]...
from SourceTable
left outer join DestinationTable on SourceTable.PKey = DestinationTable.PKey
where Destination.PKey is null

You could also use NOT EXISTS, but I prefer this method.|||Thanks, that worked really well!

Basic SSIS Question

Lets say I want to create several flat files - one file for each row returned by a sql query. Source data resides in SQL.

In short, my problem is a "reverse" of the many examples out there where data originates from multiple flat files into a SQL database. I want to go from SQL to multiple flat files.

I suppose this means I need a dynamic flat file connection string ... but I'm really stuck. Please help.

Thanks in advance.

Do you want to create many files at the same time? Then create a Flat File Connection Manager for each file.

If you only want to create one file at a time, but use a different file name so as not to overwrite the existing file, like a daily sales file or something like that? The look at the expressions property on the Flat File Connection Manager. You can use a variable to alter the connection string property.

Try Google: "Using Property Expressions in Packages"

|||

Like TGnat says, you'll need to use a Flat File Destination Adapter in concert with a Flat File Connection Manager. There's loads of information out there if you google it!

-Jamie

|||

Thanks, I'll check it out.

John

2012年2月9日星期四

Basic Reporting Services / Analysis Services report - flattened rowset problem

I am just getting started with Reporting Services 2000 and I need some help PLEASE! My source is MSAS 2000 cube. This whole flattened rowset is confusing me. Insight will be HUGELY appreciated.

I have a MSAS hierarchy that looks like this:
WORLD
-North America
USA
Canada
Mexico
-Europe
UK
Germany
etc...

I use the following MDX to get my dataset
Select [Amount] on Columns, [World].AllMembers on ROWS From Cube

I get the following Reporting Services dataset with THREE columns:
(nothing) 5000
North America 3000
North America USA 1000
North America Canada 1000
North America Mexico 1000
Europe 2000
Europe UK 1000
Europe Germany 1000

What I want is a report that looks like this -- (Two columns)
DESIRED REPORT
Location Amount
+ North America 3000
+ Europe 2000
World 5000

When the user Clicks the + icon , The report should expand. Like below:
DESIRED REPORT AFTER DRILLDOWN
Location Amount
USA 1000
Canada 1000
Mexico 1000
+ North America 3000
+ Europe 2000
World 5000

This seems like it should be the most basic report possible for a cube with a hierarchy. However, it seems that no matter what I try, I get variations of the following:
Three columns:
Continent Country Amount
+ blank here!
+ North America (nothing!!!)
+ Europe (nothing!!!)

When a + icon is clicked I get (still no AMOUNTS!!!):
Continent Country Amount
+ blank here!
- North America (nothing!!!)
+USA (nothing!!!)
+Canada (nothing!!!)
+Mexico (nothing!!!)
+ Europe (nothing!!!)

and another clicks on USA for example yields:
Continent Country Amount
+ blank here!
- North America (nothing!!!)
-USA (nothing!!!)
1000
+Canada (nothing!!!)
+Mexico (nothing!!!)
+ Europe (nothing!!!)

The desired report is a simple report. It seems like a good candidate for a wizard. However, using the wizard, I get results such as the one above. The example above uses a wizard stepped format with subtotals and drilldown enabled.

Moving away from the wizard... If I build the report without the wizard some report rows have no titles and/or no amounts and/or amounts are repeated multiple times down the rows. The only thing I can think of is that I need to IIF statements in every row to hide these rows which seems inefficient. I think I must be doing something wrong.

Do I need to change my MDX in some way to eliminate rows with Blanks in the Continent and Country columns? Or is there something simple I can do in the report definition to make it behave differentlty?

Can someone get me started please?

This MSDN paper discusses how to work with flattened rowsets in Reporting Services, and includes some sample drilldown reports that should point you in the right direction. Basically, you can set filtering in your groups to select the right rows for a given group, or have the MDX query only return data from the lowest level:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/olapasandrs.asp?frame=true

>>

Integrating Analysis Services with Reporting Services

Sean Boon
Microsoft Corporation

June 2004

Applies to:
Microsoft SQL Server 2000

Summary: Create a compelling solution for your customer that defines and manages great-looking Analysis Services reports, and quickly answers analytical questions to improve traditional reporting scenarios. (33 printed pages)

...

Building "Drill-Down" Reports against OLAP Data

Up until this point, all of the data contained within the example reports has belonged to only one level from each dimension. For example, if you ask the question, "What are unit sales by promotion media type?" even though the user might be given the option to select any available time period, the resulting dataset will only contain data from a single level of each dimension. The included level of the time dimension might change each time the report is executed, but for each report execution only one level of the time dimension will be referenced. What happens when you need to add interactivity to the report by allowing the users to "drill-down" on members in the table?

...

>>

|||

I am having the same problem. When I run my MDX query in designer I do not get the "All" member for any of my dimensions. Displaying that member is a requirement for all of my reports.

I did review the above mentioned white paper as well as download the sample reports, however none of those reports uses the "ALL" member.

Any suggections?

Here is my simple code:

Thanks in advance !!

SELECT { [Measures].[Responses], [Measures].[Responders], [Measures].[Trans Amount] } ON COLUMNS,
{ { { [Providers].Members} * { [Product].Members } * { [Channel].Members } } }
Dimension Properties Member_Unique_Name ON ROWS
FROM [Programmatic_Balance_Build]

|||

The problem with returning an "All" member in a flattened rowset is discussed elsewhere in the above paper:

>>

Another behavior of the flattening algorithm is that the "[(ALL)]" level is not included in the dataset. This has a couple of implications. The first is that if you want to include data from the "All" member of a dimension, you'll need to create a calculated member to represent this member. This can be accomplished in a couple of different ways. The first method would be to create a calculated member on the Measures dimension and for the definition of the calculated member refer to the current member's name or unique name. There are several examples of this method represented in later sections of this whitepaper.

Note The "All" level of a dimension is not included in the field set that is returned to Reporting Services.

The second implication of the "All" level not being represented in the dataset is that calculated members, usually defined without a parent member, will need to change so that they do have a parent member. This only applies in cases where the calculated member does not belong to the Measures dimension. In many cases, when calculated members are defined on a non-Measures dimension, the parent member property is left blank. This can be changed in the calculated member dialog box as shown below

>>

|||

Thank you Deepak -

This whitepaper opened up a lot of concepts for RS and AS. It was just what I needed. Thank you for pointing me in this direction.

I still don't like some of the integration issues between RS and AS . But I now understand the behaviors and work with them. And most importantly, still deliver the reports.

Basic Reporting Services / Analysis Server report - flattened rowset problem

I am just getting started with Reporting Services 2000 and I need some
help PLEASE! My source is MSAS 2000 cube. This whole flattened rowset
is confusing me. Insight will be HUGELY appreciated.
I have a MSAS hierarchy that looks like this:
WORLD
-North America
--USA
--Canada
--Mexico
-Europe
--UK
--Germany
--etc...
I use the following MDX to get my dataset
Select [Amount] on Columns, [World].AllMembers on ROWS From Cube
I get the following Reporting Services dataset with THREE columns:
(nothing) 5000
North America 3000
North America USA 1000
North America Canada 1000
North America Mexico 1000
Europe 2000
Europe UK 1000
Europe Germany 1000
What I want is a report that looks like this -- (Two columns)
DESIRED REPORT
Location Amount
+ North America 3000
+ Europe 2000
World 5000
When the user Clicks the + icon , The report should expand. Like
below:
DESIRED REPORT AFTER DRILLDOWN
Location Amount
USA 1000
Canada 1000
Mexico 1000
+ North America 3000
+ Europe 2000
World 5000
This seems like it should be the most basic report possible for a cube
with a hierarchy. However, it seems that no matter what I try, I get
variations of the following:
Three columns:
Continent Country Amount
+ blank here!
+ North America (nothing!!!)
+ Europe (nothing!!!)
When a + icon is clicked I get (still no AMOUNTS!!!):
Continent Country Amount
+ blank here!
- North America (nothing!!!)
+USA (nothing!!!)
+Canada (nothing!!!)
+Mexico (nothing!!!)
+ Europe (nothing!!!)
and another clicks on USA for example yields:
Continent Country Amount
+ blank here!
- North America (nothing!!!)
-USA (nothing!!!)
1000
+Canada (nothing!!!)
+Mexico (nothing!!!)
+ Europe (nothing!!!)
The desired report is a simple report. It seems like a good candidate
for a wizard. However, using the wizard, I get results such as the one
above. The example above uses a wizard stepped format with subtotals
and drilldown enabled.
Moving away from the wizard... If I build the report without the wizard
some report rows have no titles and/or no amounts and/or amounts are
repeated multiple times down the rows. The only thing I can think of
is that I need to IIF statements in every row to hide these rows which
seems inefficient. I think I must be doing something wrong.
Do I need to change my MDX in some way to eliminate rows with Blanks in
the Continent and Country columns? Or is there something simple I can
do in the report definition to make it behave differentlty?
Can someone get me started please?Joel,
The reason you're getting the blank row is because of the 'total' rows
in your dataset (ie North America <blank> 3000). The alternative would
be to remove this total row (via modifying your MDX) and letting the
report total for the continent by doing a SUM on the Amount col.
Check the visibility property of the Amount textbox to make sure it is
toggled by the [+] Continent textbox.
Also, to hide duplicates, there is a property on a Textbox which will
hide repeated values.
Hopefully that will get you started.
Andy Potter|||Thank you for your response -
For your first suggestion, do you mean to create a row in the report
with formula USA+Canada+Mexica and a second row with formula
UK+Germany? This isn't really my hierarchy, in reality, the dimension
has 7 levels and hundreds of thousands of members so manual totals are
not an option.
Or do you mean to suppress the rows with a blank via my MDX? If so,
any ideas on how to formulate this?
I tried the toggle for the amount text box and it has no effect. the
Hide Repeated on the Amount text box has no effect either. I tried
hiding repeats for amount conditioned on Continent and Country.|||Thank you for your response -
For your first suggestion, do you mean to create a row in the report
with formula USA+Canada+Mexica and a second row with formula
UK+Germany? This isn't really my hierarchy, in reality, the dimension
has 7 levels and hundreds of thousands of members so manual totals are
not an option.
Or do you mean to suppress the rows with a blank via my MDX? If so,
any ideas on how to formulate this?
I tried the toggle for the amount text box and it has no effect. the
Hide Repeated on the Amount text box has no effect either. I tried
hiding repeats for amount conditioned on Continent and Country.|||Thank you for your response -
For your first suggestion, do you mean to create a row in the report
with formula USA+Canada+Mexica and a second row with formula
UK+Germany? This isn't really my hierarchy, in reality, the dimension
has 7 levels and hundreds of thousands of members so manual totals are
not an option.
Or do you mean to suppress the rows with a blank via my MDX? If so,
any ideas on how to formulate this?
I tried the toggle for the amount text box and it has no effect. the
Hide Repeated on the Amount text box has no effect either. I tried
hiding repeats for amount conditioned on Continent and Country.