Sunday, March 25, 2012

DataSizer for 2000

Hi,
I'm trying to estimate the amount of space that will be required for several
import tables.
I've run the DataSizer tool and it calculates the following for a table with
3 INT fields and one REAL field (prec - 24).
Heap Table Data Sizer
Rows in table 137,242,307
Data Row Fixed Len Col Size 16
Number of Columns in data row 4
Number of Variable Length Columns in data row 0
Max Size of Variable Length Data in data row 0
Index Key Fixed Len Col Size 0
Number of Columns in Index Key 0
Number of Variable Length Columns in index key 0
Max Size of Variable Length Data in index key 0
Page size (not configurable) 8192
Page Hdr Size (not configurable) 96
Data Row Size Calculation
Data Row Header 4
Null Bitmap 3
Variable Length Columns Total Size 0
Minimum Row Length 16
Total Row Size 23
Data rows per page 324
Data pages 423588
Table Size 3,470,032,896
This example happens to be an existing table (heap) that reports the
following information via sp_spaceused
rows = 137242307
space used = 3569632 KB
There is about a 100MB difference between the two values returned (table
size est and table size actual). Is the DataSizer tool accurate for SQL
2000? Am I performing the calculation incorrectly?
Thanks
Jerry
Sp_spaceused might use stale information (see the @.updateusage parameter to sp_spaceused and also
DBCC UPDATEUSAGE). Also, you might have less than 100% full pages (see DBCC SHOWCONTIG).
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
"Jerry Spivey" <jspivey@.vestas-awt.com> wrote in message
news:%23yVzzudoFHA.1444@.TK2MSFTNGP10.phx.gbl...
> Hi,
> I'm trying to estimate the amount of space that will be required for several import tables.
> I've run the DataSizer tool and it calculates the following for a table with 3 INT fields and one
> REAL field (prec - 24).
> Heap Table Data Sizer
> Rows in table 137,242,307
> Data Row Fixed Len Col Size 16
> Number of Columns in data row 4
> Number of Variable Length Columns in data row 0
> Max Size of Variable Length Data in data row 0
> Index Key Fixed Len Col Size 0
> Number of Columns in Index Key 0
> Number of Variable Length Columns in index key 0
> Max Size of Variable Length Data in index key 0
> Page size (not configurable) 8192
> Page Hdr Size (not configurable) 96
> Data Row Size Calculation
> Data Row Header 4
> Null Bitmap 3
> Variable Length Columns Total Size 0
> Minimum Row Length 16
> Total Row Size 23
> Data rows per page 324
> Data pages 423588
> Table Size 3,470,032,896
>
> This example happens to be an existing table (heap) that reports the following information via
> sp_spaceused
>
> rows = 137242307
> space used = 3569632 KB
> There is about a 100MB difference between the two values returned (table size est and table size
> actual). Is the DataSizer tool accurate for SQL 2000? Am I performing the calculation
> incorrectly?
> Thanks
> Jerry
>
|||Hey Tibor.
I did use the updateusage param of sp_spaceused.
Yea I thought it might be due to fragmentation so I created a clustered
index on the table then dropped the clustered index to recompact the heap
pages. The 137 million rows now consume 3,399,200 KB (apx 70 MB less than
the estimated and about 170 MB less than before with the fragmentation).
Interesting ;-)
Thanks again for the reply.
Jerry
"Tibor Karaszi" <tibor_please.no.email_karaszi@.hotmail.nomail.com> wrote in
message news:eTrz2aeoFHA.3448@.TK2MSFTNGP12.phx.gbl...
> Sp_spaceused might use stale information (see the @.updateusage parameter
> to sp_spaceused and also DBCC UPDATEUSAGE). Also, you might have less than
> 100% full pages (see DBCC SHOWCONTIG).
> --
> Tibor Karaszi, SQL Server MVP
> http://www.karaszi.com/sqlserver/default.asp
> http://www.solidqualitylearning.com/
> Blog: http://solidqualitylearning.com/blogs/tibor/
>
> "Jerry Spivey" <jspivey@.vestas-awt.com> wrote in message
> news:%23yVzzudoFHA.1444@.TK2MSFTNGP10.phx.gbl...
>
|||Hi Jerry,
If I take the numbers from your calculation (3,470,032,896 bytes) and compare to the numbers from
sp_spaceused (I assume) (3,399,200 KB) and convert properly to MB etc, I get:
SELECT KB, KB/1024 AS MB, KB/(1024*1024) AS GB
FROM
(
SELECT 3470032896/1024 AS KB
UNION
SELECT 3399200 AS KB
) AS i
KB MB GB
-- -- --
3388704.000000 3309.28125000000 3.23171997070312500
3399200.000000 3319.53125000000 3.24172973632812500
I.e. a difference of 10 MB, which for a 3.2GB table is pretty close, IMO :-).
Btw, I always select from sysindexes (after updateusage) when I want to get these values. I never
remember whether sp_spaceused uses reserved, dpages or the used column. To be honest, I don't
remember the difference between dpages and used by heart either, so I always look up sysindexes in
BOL.
I should also say that I didn't actually look at your formula/calculation. I was initially focused
on the stale information in sysindexes.
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
Blog: http://solidqualitylearning.com/blogs/tibor/
"Jerry Spivey" <jspivey@.vestas-awt.com> wrote in message
news:%23CmI1geoFHA.2472@.TK2MSFTNGP15.phx.gbl...
> Hey Tibor.
> I did use the updateusage param of sp_spaceused.
> Yea I thought it might be due to fragmentation so I created a clustered index on the table then
> dropped the clustered index to recompact the heap pages. The 137 million rows now consume
> 3,399,200 KB (apx 70 MB less than the estimated and about 170 MB less than before with the
> fragmentation).
> Interesting ;-)
> Thanks again for the reply.
> Jerry
> "Tibor Karaszi" <tibor_please.no.email_karaszi@.hotmail.nomail.com> wrote in message
> news:eTrz2aeoFHA.3448@.TK2MSFTNGP12.phx.gbl...
>