SharePoint 2010 – SQL Timeouts

Hi,

If you notice many “SQL Timeouts” in the event logs of your SharePoint 2010 WFE servers, and you are running the SharePoint databases on a mirrored set of SQL servers, you may be encountering this problem.

There is a .NET Framework KB article from Microsoft that says this is a known problem with .NET and mirroring: see KB 2605597, which is a hotfix to resolve the problem.

However, there is also a further article, KB 2600211, with a section saying that the KB above is included in the .NET 4.0.3 update (see the note on the web page for the .NET 4.0.3 update).

Don’t let this fool you into thinking that this update contains all of the changes in the original KB 2605597. It does not. The original KB has updated DLLs for .NET 4.0 and for earlier versions of .NET back to v2.0; if you apply the .NET 4.0.3 update you only get the System.Data.dll update for v4.0 or later.

As SharePoint 2010 is built on Framework 3.5.1, you need to make sure you install the original KB 2605597, as this contains a new version of System.Data.dll for Framework 2.0, which is used by .NET v3.5.1 applications.

Hope this saves you the time of installing the rollup KB and then not being able to work out why it hasn’t fixed the problem.

A further update to this (22nd Feb 2013): even after applying all the above fixes we still saw occasional SQL timeouts between the SharePoint WFE and the SQL servers. Further diagnostics using DebugDiag pointed the problem squarely at System.Data.dll (part of .NET Framework 2.0) and suggested we were still suffering from the same problem, with SSPI causing timeouts during the initial connection attempt from WFE to SQL.

Given this, Microsoft recommended we deploy a further hotfix to .NET 2.0, KB 2784148. This was released in Dec 2012 and contains, amongst other things, a further update of System.Data.dll to v2.0.50727.7012.
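If you want to confirm that a server has actually picked up that build of System.Data.dll, the file version has to be compared number by number rather than as text. The following is a minimal Python sketch of that comparison only. The function names are mine, how you read the file version off the server is left out, and it assumes a higher build number means a later fix.

```python
def parse_version(text):
    """Turn a dotted version such as 'v2.0.50727.7012' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().lstrip("vV").split("."))


def has_hotfix(installed, required="2.0.50727.7012"):
    """True if the installed file version is at least the required one.

    The default is the System.Data.dll version quoted for KB 2784148 above.
    """
    return parse_version(installed) >= parse_version(required)
```

Comparing tuples of integers avoids the trap where a plain string comparison ranks a build ending in .10000 below one ending in .7012.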

It appears this latest version of System.Data.dll still hasn’t resolved our timeout issues. Next step: time for some BID tracing, to see if that sheds any more light.

An update on this saga (now August 2013): even though we have been through another two sets of hotfixes for our timeout issues, they still haven’t gone away completely. Working with us, Microsoft have now identified a SQL Server scheduling problem that appears to affect SQL 2008 R2 when deployed on HP DLX80 Generation 6 or newer servers. It causes the SQL ring buffer to drop a connection sporadically when running on specific hardware; the infrastructure I have seen this issue on is HP DL380 / DL580 6th Generation or newer.

I’ll post another update on this subject once Microsoft have worked out how we get around this problem, but if you have SQL timeouts with SharePoint and are using the later generations of HP servers you could be experiencing the same issue.

Update: Microsoft have recently issued a new KB that is supposed to solve (or greatly reduce) this problem. It seems the underlying problem is with the operating system rather than SQL, so if you are running Windows 2008 R2 SP1 or Windows 7 SP1 this could be affecting you.

Please see the article here which gives instructions on how to apply the hotfix.

Content Database Creation Gremlins

Hi,

Recently we had some problems creating new content databases and adding them to SharePoint 2010 SP1.

From looking in the ULS logs I found the following.

10/22/2012 04:55:14.83 PowerShell.exe (0x477C) 0x0FB0 SharePoint Foundation Database 5586 Critical Unknown SQL Exception 2812 occurred. Additional error information from SQL Server is included below. Could not find stored procedure 'dbo.proc_GetDatabaseInformation'. aef39614-22b1-4cb2-9f8f-bfc624b9e7ba
10/22/2012 04:55:19.00 PowerShell.exe (0x477C) 0x0FB0 SharePoint Foundation Database 5586 Critical Unknown SQL Exception 208 occurred. Additional error information from SQL Server is included below. Invalid object name 'Groups'. aef39614-22b1-4cb2-9f8f-bfc624b9e7ba
10/22/2012 04:55:19.22 PowerShell.exe (0x477C) 0x0FB0 SharePoint Foundation PowerShell 6tf2 High Invalid object name 'Groups'. aef39614-22b1-4cb2-9f8f-bfc624b9e7ba
10/22/2012 04:55:19.23 PowerShell.exe (0x477C) 0x0FB0 SharePoint Foundation PowerShell 91ux High Error Category: InvalidData Target

After examining the content database we were trying to mount to SharePoint, I noticed something strange: some of the schema objects in the database, specifically the ones mentioned above, had been prefixed with the user account we used to run the PowerShell command rather than “dbo”, hence the errors from SharePoint when attempting to mount the DB.

We traced this problem to the fact that, for some reason, our DBA team had created the new content database with the default schema set to our admin user account rather than “dbo”.

It seems there must be a minor bug in the scripts that the Mount command invokes when building the new DB schema. Most tables, procs, views etc. are created correctly regardless of the default schema setting of the containing database, but a couple of the scripts that provision the schema into a new content database appear to rely on the default schema value and use it as the prefix in the create commands. So if the default schema isn’t set to “dbo” you get the problem above.

One to look for if you run into this in your environments.
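If you want to scan a ULS log for this symptom, here is a rough Python sketch that pulls out the SQL exception number and the object SQL Server could not find. It assumes the messages are worded exactly like the entries quoted above; the function and pattern names are my own and nothing here is a SharePoint API.

```python
import re

# Wording taken from the ULS entries quoted above.
SQL_EXCEPTION = re.compile(r"Unknown SQL Exception (\d+) occurred\.")
MISSING_OBJECT = re.compile(
    r"(?:Could not find stored procedure|Invalid object name) '([^']+)'"
)


def find_missing_objects(lines):
    """Return (sql_error_number, object_name) for each matching ULS line."""
    hits = []
    for line in lines:
        number = SQL_EXCEPTION.search(line)
        name = MISSING_OBJECT.search(line)
        # Only report lines carrying both the SQL error number and the object.
        if number and name:
            hits.append((int(number.group(1)), name.group(1)))
    return hits
```

Any object reported here that does exist in the database, but under a schema other than “dbo”, points at the default schema problem described above.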

Thanks.

SharePoint 2010 SP1 Dec 11 CU – Issues with Office 2003

I know Microsoft don’t offer support for Office 2003 anymore (unless you have an extended support agreement), but as we all know there are still lots of places out there, especially larger enterprises, using Office 2003 on Windows XP.

Microsoft quote Office 2003 as supported by SharePoint 2010, and the document here http://go.microsoft.com/?linkid=9690494 suggests it offers a “Good” level of interoperation with SharePoint 2010.

However, here’s a tip for you, based on an issue we encountered during a SharePoint 2010 upgrade from SP1 to the Dec 11 CU: if you have Office 2003 clients and document libraries with multiple content types associated, I would suggest you don’t upgrade your SharePoint 2010 instances past SP1 (later versions, Office 2007 / 2010, work fine).

You’ll get the error message “‘this.frm.ctNameToId’ is not an object” when you attempt to save a document into a library for the first time.

this.frm.ctNameToId is null or not an object

It appears that, as part of the SP2010 Dec 11 CU, some changes were made to the JavaScript files BFORM.js and BFORM.debug.js in the TEMPLATE\LAYOUTS\1033 folder of the 14 hive.

These changes cause the web page rendered in the “Save As” dialog of Office 2003, which allows you to pick a content type for the file when it’s saved for the first time, to throw a JavaScript error, which then makes it impossible to complete the save operation.

I have compared the old and new versions of the function where the error is thrown, “function ChoiceFValidate()”, and can see that a few extra lines of code have been added compared to the earlier version, see below.

[Image: SP1 Dec 2011 CU (BFORM.js)]

[Image: SP1 (BFORM.js)]

Why this would cause Office 2003 to break is not clear at present; it may be due to the form configuration we have on the content types in our document libraries.

Either way, we see the issue on SharePoint 2010 SP1 with the Dec 2011 CU, and the file change date for BFORM.js is 16/11/2011.

So if you have Office 2003 and document libraries with more than one content type, stick on SP2010 SP1 until you can get your users onto a later version of Office. I might take this one up with Microsoft to see if they have a suggestion, and might also investigate whether a later CU from 2012 sorts the issue. Hope you don’t have the same problem we did.

Solution to Unlocking Locked Files in SharePoint 2010

Recently we came across a problem with our SharePoint 2010 (SP1) system (which we’ve been upgrading from SharePoint 2007), where we use a custom workflow process to clean up a set of files once the workflow has been completed.

To achieve this we use a timer job, kicked off periodically, which removes the files associated with any newly completed workflows.

The issue we had is that even though the users are completing the workflow, which means they have finished with the files, we have no way of being sure they have closed all the files they were working on as part of the workflow. Files may be left open in Office, or in some situations Office may have crashed! In both circumstances the files are left with short-term locks, as though they are being edited, and can’t be deleted. Yes, you can wait for the default short-term lock timeout (in the crash scenario), but this won’t help if the user still has the file open, as Office will continue to renew its short-term lock.

Our solution to this problem when we were on SharePoint 2007 was to make an RPC call via HTTP to author.dll. This enabled our workflow clean-up timer job to mimic the RPC-over-HTTP call Office makes when you close a document, which releases the lock on the file (it’s amazing what you can find out with Fiddler). This worked fine for us on SP2007. On SP2010, however, although the call didn’t error, it failed to unlock the files, so the delete process in the timer job couldn’t tidy up the documents.

We spent some time digging around inside the SharePoint 2010 assemblies (Reflector to the rescue), looking into the content databases etc., trying to spot any differences in internal implementation between SP2007 and SP2010 that might give us a clue as to why our RPC call over HTTP was no longer working.

In conjunction with a bit of SQL tracing, we found the stored procedure proc_UncheckoutDocument within SharePoint’s content database, which unlocks files when the RPC method on SP2010 is invoked. This procedure contains logic that checks that the SharePoint user ID passed to the proc matches the user who has the document checked out. In our case it doesn’t, as the SharePoint\System user is attempting to delete the documents (our timer jobs run under this context). The procedure has a forceunlock parameter that, if set, changes an internal logic path in the procedure to allow the unlock to happen regardless; interestingly, the HTTP RPC call to author.dll also has this parameter. So we thought, no problem, let’s just use the force unlock. However, it appears that a small bug exists: with SQL tracing on, we could never get the call via RPC to pass the forceunlock parameter down to the stored proc in the DB. It was always passed as 0, even with the force unlock parameter set to 1 in the RPC call.

Our next port of call led us to a new API on the SPFile object in SharePoint 2010, ReleaseLock. This appeared to offer what we wanted, but again we ran into the same problem as with the RPC method: our timer process wasn’t running under the user’s context, so it failed to unlock the file. Just to test things, we tried calling the function manually from a test component running under the context of the user who had a file checked out or short-term locked, and it does indeed release the lock.

We next thought we would need to impersonate the user who has the file open. The trouble is that this isn’t possible, as we don’t hold the credentials of all our users to create the runtime identity object from within our timer service.

After a little bit of head scratching: how were we going to solve this problem?

Our first clue was: hang on, these user IDs are SharePoint user IDs from the userinfo table, not Windows identities. Is there any way we could create an SPSite context inside our timer job that uses this user ID rather than that of SharePoint\System?

Thankfully the SPSite object has a constructor that can take an SPUserToken object (I assume for this exact purpose). What if we set this to the user who has the file checked out? We can do that by first getting the SPUser who holds the lock from the SPFile.LockedByUser property, then reading its UserToken value (all without needing to know the user’s credentials), and then using that token to create another SPSite with the correct user context. But will it work…

After a few minutes of development you can create a piece of prototype code. We do need two SPSite objects: one from the timer service context (SPWebApplication), which will be associated with the SharePoint\System account, to obtain the UserToken of the user holding the lock, and another SPSite object created using that UserToken so we can unlock the file.

So if you’re struggling to unlock files in a document library on SharePoint 2010 that others have left checked out, and you need to do it under the context of an account other than the user who has the file locked, for example a timer job, give this approach a go. It worked for us.

I take no responsibility for irate users if you try this method to forcibly unlock a file while somebody really is still working on it, as they will likely lose their last set of changes!

You may find other methods mentioned on the internet for doing this by directly manipulating the tables in the SharePoint content database, such as this. I would not recommend that you follow this approach, as interacting with and updating the databases directly may put you in breach of the EULA you signed up to when you installed SharePoint, and is definitely unsupported by Microsoft (see here). So if you do it on a production system and break it, you’re on your own. Be warned!

Import-SPEnterpriseSearchTopology / Export-SPEnterpriseSearchTopology

Recently I was working on a way to totally automate our SharePoint application build-out, including farm provisioning. One area where this is tricky is deploying SSAs (Search Service Applications).

We already had a nice PowerShell-based farm build and code release process that lets us easily build and configure our farms from scratch and deploy our SharePoint application onto multiple environment configurations, from dev (single server) through to production (multi-server farms).

However, one area that was tricky and hard to automate was the configuration of the SSA (setup of crawl, query and index partitions etc.), as this is very different on small-scale dev farms compared to large-scale production farms.

We could have built a totally custom set of PowerShell scripts to do the job for us. However, with a bit of simple token substitution and the use of these built-in SharePoint commands you can deploy any SSA setup you like, using the XML file that is output when you call Export-SPEnterpriseSearchTopology as your starting template.
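To give a flavour of the token substitution step, here is a minimal sketch in Python (our real build scripts are PowerShell). It assumes you have hand-edited the exported XML to replace environment-specific values with tokens of the form @@NAME@@. That token format, the function name and the XML fragment in the usage note are all made up for illustration; they are not the real export schema.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical token format: @@CRAWL_SERVER@@, @@INDEX_PATH@@ and so on.
TOKEN = re.compile(r"@@([A-Z0-9_]+)@@")


def render_topology(template_text, values):
    """Replace every @@NAME@@ token with its XML-escaped value.

    Raises KeyError if the template uses a token that has no value, so a
    half-configured topology file never reaches the farm.
    """
    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError("no value supplied for token " + name)
        # Escape &, <, > and double quotes so values are safe in attributes.
        return escape(values[name], {'"': "&quot;"})

    return TOKEN.sub(replace, template_text)
```

For example, render_topology('<Component Server="@@CRAWL_SERVER@@" />', {"CRAWL_SERVER": "DEV01"}) fills in the dev server name, and the same template can be rendered with a different set of values per environment before being handed to Import-SPEnterpriseSearchTopology.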

Specifics to follow later…