Our TFS died over the weekend.
SQL Server 2005 Standard wouldn't respond to anything. It could have been a hotfix we applied, but we haven't been able to find anything on the net about known issues. I suspect it's just some stupid combination of errors, but we can't work out what. The event log doesn't have any real info other than a single error type. Trying to connect locally, even after reinstalling SQL Server (and removing the hotfixes and then reinstalling SQL Server, and all sorts of other things), would always give the same error: "shared memory provider - no process at the other end of the pipe". All the info out there says to change the client protocol settings. We spent over 2 hours on that alone, trying and retrying every combination. Everything was working fine on Friday.
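For anyone chasing the same error, one quick sanity check is to see whether anything is listening on the SQL Server TCP port at all, which helps separate "the service isn't accepting connections" from "the client protocol settings are wrong". This is only a rough sketch, not something we ran at the time: the helper name `port_is_open` is made up for illustration, and it assumes a default instance on port 1433 on the local machine, which may not match your setup.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: default SQL Server instance listening on TCP 1433 locally.
    # Named instances and custom configurations use other ports.
    print("SQL Server TCP port reachable:", port_is_open("127.0.0.1", 1433))
```

A `False` here only says that nothing accepted a TCP connection on that port. It says nothing about shared memory or named pipes, so treat it as one data point and not a diagnosis.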
Really frustrating. Our project has lost all of Monday. It's Tuesday morning now, and our task this morning is to rebuild the TFS from scratch and hope it all comes up okay. That means we have probably lost more like 1.5 days.
If you are thinking about using TFS in a high-demand environment (one where you cannot afford to lose a couple of days), then it might be worth investing in some load balancing and clustering, or at least some kind of standby hardware that you can call upon at short notice.
I am reading through the TFS move types at the moment and also reviewing the installation guide.
By lunchtime I hope to have the whole team able to check things in again (and our client able to access the portal!).
PS: the updates installed before the SQL problem were 912475, 913446 and 911927.
The latest: we have reformatted the server and reinstalled the RC from scratch. We are now going through the process of "moving" the existing data back onto the TFS server.