diyAudio


Jason 28th May 2013 09:58 AM

Unexpected outage
 
About 4 hours ago, while I was sleeping, our regular web server shut down / was replaced by the generic apache web server. I'm currently trying to work out what happened.

Gyuri 28th May 2013 10:03 AM

sucks

Jason 28th May 2013 10:05 AM

Ok looks like "while you sleep" caretakers of our managed server detected a problem, logged in 4 hours ago, and then restarted the wrong apache.

I'm very sorry for the inconvenience.

I'll certainly be making sure that doesn't happen again.

zeonrider 28th May 2013 10:25 AM

Quote:

Originally Posted by Jason (Post 3506700)
About 4 hours ago, while I was sleeping, our regular web server shut down / was replaced by the generic apache web server. I'm currently trying to work out what happened.

Gremlins ;)

Regards zeoN_Rider

Jason 28th May 2013 03:33 PM

The hosting company's monitoring service detected that port 80 was not responding on the server. I'm currently trying to find out if that was the case or a false alarm. Things appeared to be going swimmingly up until a point and then everything dropped to zero. It could well have been a teething problem in the new setup. I'm going to be investigating that and monitoring things closely as time goes on.
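For anyone curious what a "port 80 not responding" check can look like, here is a minimal sketch in Python. It is not the hosting company's actual monitoring code: the function names, the retry count, and the delay are all hypothetical, and it only tests whether a TCP connection can be opened, not whether pages are served correctly. Requiring several failed checks in a row is one simple way to cut down on false alarms.

```python
import socket
import time


def port_is_open(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def confirmed_down(host, port=80, attempts=3, delay=5.0, check=port_is_open):
    """Report 'down' only if every one of several spaced checks fails.

    attempts and delay are illustrative values, not anyone's real policy.
    """
    for attempt in range(attempts):
        if check(host, port):
            return False  # one success is enough to call it a false alarm
        if attempt < attempts - 1:
            time.sleep(delay)
    return True
```

Calling `confirmed_down("www.diyaudio.com")` would only return True after three failed connection attempts spaced five seconds apart, so a single dropped connection would not page anyone.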

When that happened, one of their techs logged in to save the day, but didn't follow their standard operating procedures, which included instructions on how to properly restart the web server (this was documented 2 years ago when we last had an issue). A sequence of other non-SOP events unfolded from there, resulting in the site being down for 4 hours. The host has taken full responsibility for their failure to follow SOP and promises this won't happen again in the future. I'll be taking proactive steps to "future proof" things so even if SOP is not followed in the future, it will be hard to bork it up again.

Sorry for the downtime.

Other than this very consequential "glitch", the server is coasting along at record low load averages and serving up pages faster than ever. Hopefully it stays that way; I'll be babysitting things in the meantime...

Zen Mod 28th May 2013 04:17 PM

tnx Jason

we are grateful for your good work

Jason 28th May 2013 06:31 PM

Case closed.

Various things have been implemented to make this (more) idiot proof next time round. Here's hoping!

zany 29th May 2013 12:12 AM

It's a real problem when 'idiots' are managing server farms... :-(

Sprags 29th May 2013 12:28 AM

I think if the computers used tried and true 75-year-old tube technology there probably wouldn't have been an issue...but noooo...those computer geeks think transistors are more reliable.

Johno 29th May 2013 02:15 AM

I live on a farm and we use electric fencing to keep the cattle on the correct side. About 10kV but very low joules.
Works really well: one individual tests the envelope and gets a zap, and the rest of the herd learns from the squeal and respects all fencing for the next few weeks.

I have thought about introducing similar "learning" tools at work but the health and safety people tell me there are a number of issues and will not let me. Wonder if the issues apply to contractors too?



