In my prior post, I covered some of the basics of defragmenting your indexes – an essential for database performance maintenance. But for some, the question may more likely be: how do I even get to indexes? I have seen so many situations where indexes are an afterthought, and lucky if they even get applied – how this came to be, I can go on a soap box rant on how so many developers now a says consider the database passé….. but I'll stop here, blood pressure….
Moving forward -- how would someone either new to the concept, or for those to whom indexing is old hat, look at applying an indexing strategy to an existing database? There are many ways, but I'll cover one handy method here that makes use of SQL Server 2005's dynamic management views as a possible starting point. In particular, these three views, will ease our entry into a database that is completely foreign to you(you got dropped into the project or inherited the maintenance):
- sys.dm_db_missing_index_group_stats: contains performance metrics on a missing grouping of indexes
- sys.dm_db_missing_index_groups: links group stats to specific index details held in index_details
- sys.dm_db_missing_index_details: holds detailed information about the missing indexes, such as table and columns that can make up the missing index
The query below returns the detailed information for a specific database – if you do not filter, then you will get results for all databases on the server. The group_stats view contains many metrics(see references below), all of which are useful for various indexing strategies. Since I like to stick with examples that impact end user performance, I have found it most useful to rank in a descending order by the metric avg_user_impact in conjunction with user_scans; I have chosen to highlight this metric as important for the following reasons:
- It is a user based metric, and we are wanting to improve performance for users of the database
- We are looking for the most used missing index, or in other words, the highest number of queries into the system where an index is deemed missing
This by no means is a rule of thumb, but more food for thought – again, it all is situation dependent, but for anyone wondering where to get started and can't see the forest for the trees, this is one place to pick up and go forward:
select database_id, object_id, statement, equality_columns, inequality_columns, mid.included_columns
, unique_compiles, user_seeks, user_scans, last_user_seek, last_user_scan, avg_total_user_cost
, avg_user_impact, system_seeks, system_scans, last_system_seek, last_system_scan, avg_total_system_cost, avg_system_impact
from sys.dm_db_missing_index_group_stats migs
inner join sys.dm_db_missing_index_groups mig on migs.group_handle = mig.index_group_handle
inner join sys.dm_db_missing_index_details mid on mig.index_handle = mid.index_handle
where database_id = (select database_id from sys.databases where name = 'MyDatabaseToTune')
order by avg_user_impact desc
It is important to note that all of these statistics are cleared once the SQL Server service is restarted, so for those of you like me that run development instances on your laptop – you will need to archive this data off to a permanent table in order to look at historical data. From the data returned here, you can take a look at building new indexes against your most used tables – there is a lot to be said on database tuning and performance metrics – too much to be covered in a blog, so be sure to take the bits here in context and realise that everything in indexing and performance is part of a larger picture that shifts depending upon your project.
To save time if you choose to directly implement SQL Server's recommendations, you can easily take the results from the query above, or modify the query above, to build strings that are your DDL statements for creating indexes – if you have any questions, please feel free to post. Thanks!
References:
Group Stats table: http://msdn2.microsoft.com/en-us/library/ms345421.aspx
Index Details: http://msdn2.microsoft.com/en-us/library/ms345434.aspx
Tags: indexing,
performance,
storage subsystem
Categories: SQL Server
Whew – it's been a while since I last posted (not to mention my postings this year have been a little sparse), but I am now settled back in Austin, Texas after having spent the last two years down in San Antonio. My wife and I have been busy selling the house down there, getting everyhting packed, working my client projects and transitioning to the Austin branch of my company – but, its all finally over, so I have no more excuses for delayed postings J.
I've been doing a lot of interviewing lately for some open positions at Catapult, and one of my favorite subject areas to ask any SQL/BI candidate we bring is on indexes and index strategy. So many like to leave this area of the database stack for the DBA, however, I will always be of the mind that the developer of the solution has a responsibility to know the basics of SQL Server architecture if they plan on using it in their solution and thus plan for performance. I'll probably post a lot more about SQL performance since I have been doing a lot recently with various clients – no particluar theme or organization to it, but just post what I have been working with at the time.
One of the interesting items I like to point out to both developers and DBAs is about index fragmentation. Since a clustered index stores data in the physical sort order, a table that has a lot of CRUD operations against it will most likely have a lot of page splits as SQL inserts and removes rows from that table. Even more so with non-clustered indexes as the B-tree in a non-clustered index points to the index page and not a data page.
The script below illustrates a simple method you can use to defragment your indexes. While you can use the dbcc command to defrag your index, it is better to use the new SQL 2005 command alter index. Depending on your SQL 2005 version, you have various options you can set when using this command verus using the old dbcc commands, the most important of which is to set online rebuilds for your more fragmented indexes.
So how do you know if you need to defrag an index? Pretty simple, if your index is more than 10% fragmented, particularly for large tables, then you need to do something about it. How fragmented depends on what your next step is; the general rule of thumb is over 10 but <= 30%, you will use the alter index reorganize. Anything over 30%, it is best to use alter index rebuild. For complete detail on the commands, see here: http://msdn2.microsoft.com/en-us/library/ms188388.aspx
Again, the script below is more for illustrative purposes and logs to a basic table so I can see how long certain rebuilds take – do not use this directly for production purposes, but more as a template for understanding one piece of maintaining your database.
declare
@dbid int, @TableName varchar(100),
@IndexName varchar(200), @Metric varchar(50),
@MaintenanceOperation varchar(50), @StartTime datetime,
@EndTime datetime, @StrSQL nvarchar(4000),
@ChildObjectType varchar(50), @FragPercent float
set @dbid =(select database_id fromsys.databaseswherename='MyDatabaseToMaintain')
set @MaintenanceOperation ='Index Defrag'
ifexists(selectobject_idfrom tempdb.sys.objectswherenamelike'#IndexTemp%')
begin
droptable #IndexTemp
end
select DatabaseId = @dbid, TableName = s.name +'.['+ o.name +']', ChildObjectName = i.name
, ChildObjectType ='Index', avg_fragmentation_in_percent, IsDefragged =0
into #IndexTemp
from sys.dm_db_index_physical_stats(@dbid,null,null,null,null) ip
innerjoin sys.objects o
on ip.object_id= o.object_id
innerjoin sys.schemas s
on o.schema_id= s.schema_id
innerjoin sys.indexes i
on i.object_id= ip.object_id
and i.index_id = ip.index_id
where avg_fragmentation_in_percent > 10
and o.is_ms_shipped = 0
and i.name isnotnull
select*from #IndexTemp
whileexists(select 9 from #IndexTemp where IsDefragged = 0)
begin
selecttop 1 @TableName = TableName, @IndexName = ChildObjectName
, @ChildObjectType = ChildObjectType, @FragPercent = avg_fragmentation_in_percent
from #IndexTemp
where IsDefragged = 0
set @StartTime =getdate()
if(@FragPercent <= 30)
begin
set @StrSQL ='alter index '+ @IndexName +' on '+ @TableName +' REORGANIZE '
print @StrSQL
execsp_executesql @StrSQL
end
if(@FragPercent > 30)
begin
set @StrSQL ='alter index '+ @IndexName +' on '+ @TableName +' REBUILD with(SORT_IN_TEMPDB = ON, STATISTICS_NORECOMPUTE = OFF)'
print @StrSQL
execsp_executesql @StrSQL
end
set @EndTime =getdate()
insertinto Maintenance.Tables
(TableName, ChildObjectName, ChildObjectType, StartTime, EndTime, MaintenanceOperation)
values
(@TableName,@IndexName,@ChildObjectType,@StartTime,@EndTime,@MaintenanceOperation)
update #IndexTemp
set IsDefragged = 1
where TableName = @TableName
and ChildObjectName = @IndexName
end
References:
sys.dm_db_index_physical_stats: http://msdn2.microsoft.com/en-us/library/ms188917.aspx
dbcc indexdefrag: http://msdn2.microsoft.com/en-us/library/aa258286(SQL.80).aspx
Clustered Index: http://msdn2.microsoft.com/en-us/library/ms177443.aspx
Non-Clustered Index: http://msdn2.microsoft.com/en-us/library/ms177484.aspx
Feel free to post or e-mail with any questions, thanks!
Tags: indexing,
performance
Categories: SQL Server