2012年2月9日星期四

Basic Questions about SQL Server IO Optimization

Hi,
I am a programmer and my understanding of low-level database I/O
issues is limited. I've just taken a job with a company that is
currently using classic asp with inline sql. I come from an
environment with stored procs and I'm used to dealing in full records
and recordsets. We've been discussing database IO and my colleagues
here feel that it's necessary with classic asp to only pull in fields
you need, even if you're working with one record in a table. This,
they feel, will will cut down on IO time and bandwidth.
In my reading, I've gotten the impression that the days of worrying
about how many fields you bring in from one record are in the past
since I/O involves paging (meaning that more than a couple of fields
are brought in no matter what) and disks are so fast it doesn't
matter.
To support their arguments they've brought up the issue of bringing
in ntext fields which can slow down an I/0 operation. In this case I
agree with them but there methodology goes against the whole idea of
standard I/O concepts (e.g. creating stored procs to read and write to
at least an entire record).
I'm not certain of my take on things. I hope that someone might give
me some guidance and perhaps some articles to read on this subject.
What we have here is a lot of disparate in-line sql which I feel is
redudant and not easily modified since it's all over the place (i.e.
the addtion of one critical field in a table would require changing
hundreds of isolated inline sql strings).
Any advice would be appreciated.
Thanks,
Fig"fig000" <neilnewton001@.yahoo.com> wrote in message
news:1176731759.224287.121120@.y5g2000hsa.googlegroups.com...
> Hi,
> I am a programmer and my understanding of low-level database I/O
> issues is limited. I've just taken a job with a company that is
> currently using classic asp with inline sql. I come from an
> environment with stored procs and I'm used to dealing in full records
> and recordsets. We've been discussing database IO and my colleagues
> here feel that it's necessary with classic asp to only pull in fields
> you need, even if you're working with one record in a table. This,
> they feel, will will cut down on IO time and bandwidth.
Generally they're probably correct. If you're suggesting a select * is "ok"
your own example below seems to agree with the problems that might cause.
Also, consider a case where their select statement only brings back 2 fields
from a row, but those 2 fields are in a covering index. In that case, the
IO can be VERY effecient.
If they were bring back additional rows not in the covered index and not
using them, the IO usage can go way up.
Overall though I'd agree, getting rid of inline stuff and replacing with
stored procedures would probably be even better. :-)
> In my reading, I've gotten the impression that the days of worrying
> about how many fields you bring in from one record are in the past
> since I/O involves paging (meaning that more than a couple of fields
> are brought in no matter what) and disks are so fast it doesn't
> matter.
> To support their arguments they've brought up the issue of bringing
> in ntext fields which can slow down an I/0 operation. In this case I
> agree with them but there methodology goes against the whole idea of
> standard I/O concepts (e.g. creating stored procs to read and write to
> at least an entire record).
> I'm not certain of my take on things. I hope that someone might give
> me some guidance and perhaps some articles to read on this subject.
> What we have here is a lot of disparate in-line sql which I feel is
> redudant and not easily modified since it's all over the place (i.e.
> the addtion of one critical field in a table would require changing
> hundreds of isolated inline sql strings).
> Any advice would be appreciated.
> Thanks,
> Fig
>
--
Greg Moore
SQL Server DBA Consulting Remote and Onsite available!
Email: sql (at) greenms.com http://www.greenms.com/sqlserver.html|||Greg,
Thanks for answering. I can see that there is no simple answer to
this question. What I wonder then, is how the whole current trend got
started: creating a stored procedure for each of the four crud
operations for each table in a database. This is sort of underscored
by the development of all the wizards (datadapter in 2003 and
tableadapter in 2005). Are these simply bad practices which simplify
programming?
I guess I can see these issues if you are reading a large number of
records (which would make the use of the ado.net recordsets dangerious
if not used carefully). But for one record (or a few) wouldn't the
difference between reading two fields or 50 be minimal in terms of
database I/O? If I'm right, wouldn't the use of a standardized "select
*" type of functionality be more convenient from a coding standpoint
and cause little trouble in terms of database I/O?
I guess my final question would be: how could possibly put hundreds
of disparate queries selecting various sets of fields from a table or
tables, into stored procs? Wouldn't the complexity of that be an
argument for standardizing the SQL for a table into four simple crud
queries that could be called from any part of the system. Wouldn't
that also make modifying the queries (say for adding a new crictical
field) much easier.
I admit I don't know much about database internals so please forgive
my possible ignorance.
Thanks again for answering this sort of off the wall request. Are
there any articles or books that might clear this up for me?
Fig
On Apr 16, 10:58 am, "Greg D. Moore \(Strider\)"
<mooregr_deletet...@.greenms.com> wrote:
> "fig000" <neilnewton...@.yahoo.com> wrote in message
> news:1176731759.224287.121120@.y5g2000hsa.googlegroups.com...
> > Hi,
> > I am a programmer and my understanding of low-level database I/O
> > issues is limited. I've just taken a job with a company that is
> > currently using classic asp with inline sql. I come from an
> > environment with stored procs and I'm used to dealing in full records
> > and recordsets. We've been discussing database IO and my colleagues
> > here feel that it's necessary with classic asp to only pull in fields
> > you need, even if you're working with one record in a table. This,
> > they feel, will will cut down on IO time and bandwidth.
> Generally they're probably correct. If you're suggesting a select * is "ok"
> your own example below seems to agree with the problems that might cause.
> Also, consider a case where their select statement only brings back 2 fields
> from a row, but those 2 fields are in a covering index. In that case, the
> IO can be VERY effecient.
> If they were bring back additional rows not in the covered index and not
> using them, the IO usage can go way up.
> Overall though I'd agree, getting rid of inline stuff and replacing with
> stored procedures would probably be even better. :-)
>
>
> > In my reading, I've gotten the impression that the days of worrying
> > about how many fields you bring in from one record are in the past
> > since I/O involves paging (meaning that more than a couple of fields
> > are brought in no matter what) and disks are so fast it doesn't
> > matter.
> > To support their arguments they've brought up the issue of bringing
> > in ntext fields which can slow down an I/0 operation. In this case I
> > agree with them but there methodology goes against the whole idea of
> > standard I/O concepts (e.g. creating stored procs to read and write to
> > at least an entire record).
> > I'm not certain of my take on things. I hope that someone might give
> > me some guidance and perhaps some articles to read on this subject.
> > What we have here is a lot of disparate in-line sql which I feel is
> > redudant and not easily modified since it's all over the place (i.e.
> > the addtion of one critical field in a table would require changing
> > hundreds of isolated inline sql strings).
> > Any advice would be appreciated.
> > Thanks,
> > Fig
> --
> Greg Moore
> SQL Server DBA Consulting Remote and Onsite available!
> Email: sql (at) greenms.com http://www.greenms.com/sqlserver.html|||Fig,
SQL Server uses a cost based optimizer that is getting smarter with each
release. The more to-the-point your query gets, the more options the
optimizer has to achieve this as fast and efficient as possible. If you
don't need all columns of a row, then don't ask for it.
There are many examples where the optimizer can speed things up if you
ask less, so the ones below are just some examples.
- If there is a covering index (an index that holds all the columns
needed for your query) that it can be used instead of the base table. A
covering index is typically much smaller than the table, so on average
less I/O is required to retrieve the same number of rows
- If a row's size expands to over 4096 bytes, for example because you
just update a varchar column with a bigger string, then the data of this
column will be moved to a separate page. If you do not use this column
in your select query, this extra page does not have to be retrieved
- If you have text, ntext or image column in your table. Their values
are typically (but not always) stored in seperate pages
- If there is an indexed view defined for the table, it might be used
instead of the actual table (provided you are using Enterprise Edition)
One last thing. You are right, that if the storage engine has to
retrieve a page of table data, then (apart from the cases above) most or
all row data is retrieved. However, if the query plan has multiple
operators, all columns that are selected have to be kept in memory and
(possible) processed. This might involve I/O in tempdb. Also, the data
has to be piped to your application. Over a WAN that might cause
(serious) delays.
HTH,
Gert-Jan
fig000 wrote:
> Hi,
> I am a programmer and my understanding of low-level database I/O
> issues is limited. I've just taken a job with a company that is
> currently using classic asp with inline sql. I come from an
> environment with stored procs and I'm used to dealing in full records
> and recordsets. We've been discussing database IO and my colleagues
> here feel that it's necessary with classic asp to only pull in fields
> you need, even if you're working with one record in a table. This,
> they feel, will will cut down on IO time and bandwidth.
> In my reading, I've gotten the impression that the days of worrying
> about how many fields you bring in from one record are in the past
> since I/O involves paging (meaning that more than a couple of fields
> are brought in no matter what) and disks are so fast it doesn't
> matter.
> To support their arguments they've brought up the issue of bringing
> in ntext fields which can slow down an I/0 operation. In this case I
> agree with them but there methodology goes against the whole idea of
> standard I/O concepts (e.g. creating stored procs to read and write to
> at least an entire record).
> I'm not certain of my take on things. I hope that someone might give
> me some guidance and perhaps some articles to read on this subject.
> What we have here is a lot of disparate in-line sql which I feel is
> redudant and not easily modified since it's all over the place (i.e.
> the addtion of one critical field in a table would require changing
> hundreds of isolated inline sql strings).
> Any advice would be appreciated.
> Thanks,
> Fig|||Best practice for me is usually what you said, and what they said.
Return only what you need, and use stored procedures.
On Apr 16, 3:51 pm, "fig000" <neilnewton...@.yahoo.com> wrote:
> Greg,
> Thanks for answering. I can see that there is no simple answer to
> this question. What I wonder then, is how the whole current trend got
> started: creating a stored procedure for each of the four crud
> operations for each table in a database. This is sort of underscored
> by the development of all the wizards (datadapter in 2003 and
> tableadapter in 2005). Are these simply bad practices which simplify
> programming?
> I guess I can see these issues if you are reading a large number of
> records (which would make the use of the ado.net recordsets dangerious
> if not used carefully). But for one record (or a few) wouldn't the
> difference between reading two fields or 50 be minimal in terms of
> database I/O? If I'm right, wouldn't the use of a standardized "select
> *" type of functionality be more convenient from a coding standpoint
> and cause little trouble in terms of database I/O?
> I guess my final question would be: how could possibly put hundreds
> of disparate queries selecting various sets of fields from a table or
> tables, into stored procs? Wouldn't the complexity of that be an
> argument for standardizing the SQL for a table into four simple crud
> queries that could be called from any part of the system. Wouldn't
> that also make modifying the queries (say for adding a new crictical
> field) much easier.
> I admit I don't know much about database internals so please forgive
> my possible ignorance.
> Thanks again for answering this sort of off the wall request. Are
> there any articles or books that might clear this up for me?
> Fig
> On Apr 16, 10:58 am, "Greg D. Moore \(Strider\)"
>
> <mooregr_deletet...@.greenms.com> wrote:
> > "fig000" <neilnewton...@.yahoo.com> wrote in message
> >news:1176731759.224287.121120@.y5g2000hsa.googlegroups.com...
> > > Hi,
> > > I am a programmer and my understanding of low-level database I/O
> > > issues is limited. I've just taken a job with a company that is
> > > currently using classic asp with inline sql. I come from an
> > > environment with stored procs and I'm used to dealing in full records
> > > and recordsets. We've been discussing database IO and my colleagues
> > > here feel that it's necessary with classic asp to only pull in fields
> > > you need, even if you're working with one record in a table. This,
> > > they feel, will will cut down on IO time and bandwidth.
> > Generally they're probably correct. If you're suggesting a select * is "ok"
> > your own example below seems to agree with the problems that might cause.
> > Also, consider a case where their select statement only brings back 2 fields
> > from a row, but those 2 fields are in a covering index. In that case, the
> > IO can be VERY effecient.
> > If they were bring back additional rows not in the covered index and not
> > using them, the IO usage can go way up.
> > Overall though I'd agree, getting rid of inline stuff and replacing with
> > stored procedures would probably be even better. :-)
> > > In my reading, I've gotten the impression that the days of worrying
> > > about how many fields you bring in from one record are in the past
> > > since I/O involves paging (meaning that more than a couple of fields
> > > are brought in no matter what) and disks are so fast it doesn't
> > > matter.
> > > To support their arguments they've brought up the issue of bringing
> > > in ntext fields which can slow down an I/0 operation. In this case I
> > > agree with them but there methodology goes against the whole idea of
> > > standard I/O concepts (e.g. creating stored procs to read and write to
> > > at least an entire record).
> > > I'm not certain of my take on things. I hope that someone might give
> > > me some guidance and perhaps some articles to read on this subject.
> > > What we have here is a lot of disparate in-line sql which I feel is
> > > redudant and not easily modified since it's all over the place (i.e.
> > > the addtion of one critical field in a table would require changing
> > > hundreds of isolated inline sql strings).
> > > Any advice would be appreciated.
> > > Thanks,
> > > Fig
> > --
> > Greg Moore
> > SQL Server DBA Consulting Remote and Onsite available!
> > Email: sql (at) greenms.com http://www.greenms.com/sqlserver.html- Hide quoted text -
> - Show quoted text -|||"fig000" <neilnewton001@.yahoo.com> wrote in message
news:1176753103.971027.241230@.y80g2000hsf.googlegroups.com...
> Greg,
> Thanks for answering. I can see that there is no simple answer to
> this question. What I wonder then, is how the whole current trend got
> started: creating a stored procedure for each of the four crud
> operations for each table in a database. This is sort of underscored
> by the development of all the wizards (datadapter in 2003 and
> tableadapter in 2005). Are these simply bad practices which simplify
> programming?
Not really sure what you're asking here.
> I guess I can see these issues if you are reading a large number of
> records (which would make the use of the ado.net recordsets dangerious
> if not used carefully). But for one record (or a few) wouldn't the
> difference between reading two fields or 50 be minimal in terms of
> database I/O?
Not necessarily. Let's say you want to read one record.
In that case, (I'm going to assume basic, row, no large blobs), you need to
read one 8K page.
Fair enough. Even if you only need two fields (say a total of 100 bytes)
you're reading an 8K page.
However, consider the situation where each call only returns 1 row, but you
call that code 100 times.
Worse case scenario, you need to read 100 8K pages. (This assumes a select
* from...)
Best case with say a covering index and returning 100 bytes is reading in 2
8K pages.
That's 50 times more effecient.
Now, in the worst case, if the pages are scattered all over the disk, you
might need a physical read for each one. That's 100 physical reads.
In the best case, you need two physical reads and in some cases just ONE
physical read (since the engine will generally read multiple pages in a row
in one pass). Combine that with lack of arm movement and this suddenly
becomes far more effecient.
> If I'm right, wouldn't the use of a standardized "select
> *" type of functionality be more convenient from a coding standpoint
> and cause little trouble in terms of database I/O?
Well if you want a worse case of 100x slower I/O sure that's a little
trouble. :-)
(and note we're assuming you don't have any large blobs returned in a select
* that are thrown away.)
And actually a select * could be far worse than the 100 reads above in some
cases, depending on choices the optimizer will make.
> I guess my final question would be: how could possibly put hundreds
> of disparate queries selecting various sets of fields from a table or
> tables, into stored procs? Wouldn't the complexity of that be an
> argument for standardizing the SQL for a table into four simple crud
> queries that could be called from any part of the system. Wouldn't
> that also make modifying the queries (say for adding a new crictical
> field) much easier.
Well from the above, sounds more like the whole application might need to be
reviewed.
You can get other benefits from stored procs also, query/plan re-use,
protect against SQL injection attacks.
> I admit I don't know much about database internals so please forgive
> my possible ignorance.
> Thanks again for answering this sort of off the wall request. Are
> there any articles or books that might clear this up for me?
Yeah.. Inside SQL Server either for 2000 or 2005 depending on what you're
using.
> Fig
> On Apr 16, 10:58 am, "Greg D. Moore \(Strider\)"
> <mooregr_deletet...@.greenms.com> wrote:
>> "fig000" <neilnewton...@.yahoo.com> wrote in message
>> news:1176731759.224287.121120@.y5g2000hsa.googlegroups.com...
>> > Hi,
>> > I am a programmer and my understanding of low-level database I/O
>> > issues is limited. I've just taken a job with a company that is
>> > currently using classic asp with inline sql. I come from an
>> > environment with stored procs and I'm used to dealing in full records
>> > and recordsets. We've been discussing database IO and my colleagues
>> > here feel that it's necessary with classic asp to only pull in fields
>> > you need, even if you're working with one record in a table. This,
>> > they feel, will will cut down on IO time and bandwidth.
>> Generally they're probably correct. If you're suggesting a select * is
>> "ok"
>> your own example below seems to agree with the problems that might cause.
>> Also, consider a case where their select statement only brings back 2
>> fields
>> from a row, but those 2 fields are in a covering index. In that case,
>> the
>> IO can be VERY effecient.
>> If they were bring back additional rows not in the covered index and not
>> using them, the IO usage can go way up.
>> Overall though I'd agree, getting rid of inline stuff and replacing with
>> stored procedures would probably be even better. :-)
>>
>>
>> > In my reading, I've gotten the impression that the days of worrying
>> > about how many fields you bring in from one record are in the past
>> > since I/O involves paging (meaning that more than a couple of fields
>> > are brought in no matter what) and disks are so fast it doesn't
>> > matter.
>> > To support their arguments they've brought up the issue of bringing
>> > in ntext fields which can slow down an I/0 operation. In this case I
>> > agree with them but there methodology goes against the whole idea of
>> > standard I/O concepts (e.g. creating stored procs to read and write to
>> > at least an entire record).
>> > I'm not certain of my take on things. I hope that someone might give
>> > me some guidance and perhaps some articles to read on this subject.
>> > What we have here is a lot of disparate in-line sql which I feel is
>> > redudant and not easily modified since it's all over the place (i.e.
>> > the addtion of one critical field in a table would require changing
>> > hundreds of isolated inline sql strings).
>> > Any advice would be appreciated.
>> > Thanks,
>> > Fig
>> --
>> Greg Moore
>> SQL Server DBA Consulting Remote and Onsite available!
>> Email: sql (at) greenms.com
>> http://www.greenms.com/sqlserver.html
>
--
Greg Moore
SQL Server DBA Consulting Remote and Onsite available!
Email: sql (at) greenms.com http://www.greenms.com/sqlserver.html

没有评论:

发表评论