I’m not sure how many of you know this, and it’s not something I’ve read much about in the literature available from Microsoft on SharePoint scalability and sizing.
So I thought I would share my experience: you can hit real issues on large-scale SharePoint 2010/2013 platforms if you haven’t sufficiently sized your TempDB and the transaction log that goes with it.
I know Microsoft recommend setting all databases to AutoGrow, but none of us want to over-allocate the storage for our SharePoint databases, and even with AutoGrow you can hit the same problem once the LUNs holding your TempDB and TempDB log fill up.
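If you want to see how your TempDB files are currently configured before deciding anything, a query along these lines against sys.master_files will show the current size, growth increment, and maximum size (sizes are stored in 8KB pages, hence the arithmetic):

```sql
-- Current size and AutoGrow settings for the tempdb files
SELECT name,
       type_desc,
       size * 8 / 1024 AS current_size_mb,
       CASE max_size
            WHEN -1 THEN 'Unlimited'
            ELSE CAST(max_size * 8 / 1024 AS varchar(20)) + ' MB'
       END AS max_size,
       CASE is_percent_growth
            WHEN 1 THEN CAST(growth AS varchar(20)) + ' %'
            ELSE CAST(growth * 8 / 1024 AS varchar(20)) + ' MB'
       END AS growth_increment
FROM sys.master_files
WHERE database_id = DB_ID('tempdb');
```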
On a medium to large scale it’s generally best practice with SharePoint to ensure your TempDB and its transaction log run on their own LUNs, with reasonable I/O (FC SAN, or even DAS SSDs). This prevents I/O contention between the content databases and the SharePoint stored procedures that manipulate data via temp tables.
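For reference, relocating TempDB onto its own LUNs is done with ALTER DATABASE ... MODIFY FILE; the drive letters, paths, and sizes below are placeholders for illustration, and the move only takes effect after the SQL Server service is restarted:

```sql
-- Move tempdb data and log onto dedicated LUNs
-- (T:\ and U:\ are example mount points -- substitute your own;
--  restart the SQL Server service for the new paths to take effect)
ALTER DATABASE tempdb
MODIFY FILE (NAME = tempdev, FILENAME = 'T:\TempDB\tempdb.mdf', SIZE = 50GB);

ALTER DATABASE tempdb
MODIFY FILE (NAME = templog, FILENAME = 'U:\TempDBLog\templog.ldf', SIZE = 20GB);
```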
I don’t know if you’ve ever looked into the stored procedures SharePoint uses to retrieve data or carry out operations within timer jobs, for search, or for general database reads, but many of these procedures make extensive use of temporary tables and cursors to work with data.
Thus once your environment starts to get larger (millions of items and up) you can run into significant demands on TempDB and TempDB log space, including random I/O requirements, to complete maintenance and crawl update operations, and that is in addition to maintaining normal SharePoint operation.
Even standard SPQueries against the content databases require TempDB space. If your TempDB or TempDB log fills up, or the LUNs they sit on become full, then until you resolve the issue many parts of SharePoint simply stop working.
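If you want to see where that TempDB space is actually going on a busy farm, the sys.dm_db_file_space_usage DMV breaks it down by consumer, and DBCC SQLPERF shows how full the TempDB log is:

```sql
-- How much of tempdb is in use, and by what
SELECT SUM(user_object_reserved_page_count)     * 8 / 1024 AS user_objects_mb,     -- temp tables, table variables
       SUM(internal_object_reserved_page_count) * 8 / 1024 AS internal_objects_mb, -- sorts, hashes, cursor worktables
       SUM(version_store_reserved_page_count)   * 8 / 1024 AS version_store_mb,
       SUM(unallocated_extent_page_count)       * 8 / 1024 AS free_space_mb
FROM tempdb.sys.dm_db_file_space_usage;

-- Percentage used of every database log, including tempdb's
DBCC SQLPERF (LOGSPACE);
```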
One particular problem I have encountered on SharePoint 2010 can occur during a full crawl. A table in the CrawlStoreDB (dbo.MSSCrawlUrlChanges) appears to track crawled items for reporting purposes, storing a TimeStamp (datetime) and Status (int) with the associated item DocID (int). This table has two indexes: one non-clustered on DocID and one clustered on TimeStamp.
Now, in a system with many millions of items this table can get rather large when you run a full crawl; I have seen it reach millions of rows and tens of gigabytes of space. The stored procedure that updates this table, proc_MSS_CrawlReportPreprocessChanges, attempts to load all records older than a particular time into a cursor (which means TempDB activity), then processes that cursor within a transaction, inserting results as appropriate into a further temp table (more TempDB activity).
Finally it clears down dbo.MSSCrawlUrlChanges and inserts all the changes from the temp table back into that CrawlStoreDB table.
This is all carried out under a single transaction, so it holds space in the TempDB log files for the duration of its execution.
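To make the shape of that clearer, here is a simplified sketch of the pattern — it is illustrative only and is NOT the real procedure body, and @CutoffTime and #Preprocessed are my own placeholder names:

```sql
-- Sketch of the cursor + temp table pattern (NOT Microsoft's actual code)
DECLARE @CutoffTime datetime = GETUTCDATE();  -- placeholder cutoff

BEGIN TRANSACTION;

-- Cursor worktable is materialised in tempdb
DECLARE change_cursor CURSOR FOR
    SELECT DocID, [TimeStamp], [Status]
    FROM dbo.MSSCrawlUrlChanges
    WHERE [TimeStamp] < @CutoffTime;

-- Processed results accumulate in a temp table -- more tempdb space
CREATE TABLE #Preprocessed (DocID int, [TimeStamp] datetime, [Status] int);

-- ... open the cursor, fetch and process each row,
--     insert the results into #Preprocessed ...

-- The permanent table is then cleared and repopulated; every row
-- touched stays logged for the life of the enclosing transaction
DELETE FROM dbo.MSSCrawlUrlChanges WHERE [TimeStamp] < @CutoffTime;
INSERT INTO dbo.MSSCrawlUrlChanges (DocID, [TimeStamp], [Status])
    SELECT DocID, [TimeStamp], [Status] FROM #Preprocessed;

COMMIT TRANSACTION;
```

With millions of qualifying rows, every stage of this — the cursor worktable, the temp table, and the single long-running transaction — consumes TempDB data or log space simultaneously.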
These combinations of temporary table and cursor usage can have a real impact on the amount of space you need in both TempDB and its associated transaction log.
This problem can be exacerbated if you also run DBCC checks and index rebuilds as maintenance jobs on your SQL servers that happen to coincide with these TempDB-heavy application operations, as DBCC and index rebuilds also make significant demands on TempDB.
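Two concrete examples of that overlap (the index and database names below are hypothetical): index rebuilds configured with SORT_IN_TEMPDB push their sort space into TempDB rather than the user database, and DBCC CHECKDB builds its internal worktables there too — WITH ESTIMATEONLY will at least tell you how much TempDB space a check would need before you schedule it:

```sql
-- Rebuild sorts happen in tempdb when this option is on
ALTER INDEX IX_Example ON dbo.ExampleTable
REBUILD WITH (SORT_IN_TEMPDB = ON);

-- Report the estimated tempdb space CHECKDB would consume,
-- without actually running the full check
DBCC CHECKDB ('WSS_Content_Example') WITH ESTIMATEONLY;
```

If those jobs land in the same window as a full crawl, they are all competing for the same TempDB files and log.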
I’ve seen these operations fail on systems with 100-150GB TempDBs and 20GB+ transaction logs.
I’m trying to work out some metrics covering TempDB and TempDB log sizing versus SharePoint content size and item counts, to provide a bit of guidance. So far I’ve only found this blog article, which suggests TempDB should be 25% of your largest content database, but it doesn’t cover the relationship between TempDB size, content database size, and TempDB log file size, which is crucial if SharePoint is to stay online. I’ll update this article once I have something more specific.
Thanks for reading.