Downtime today

Status
Not open for further replies.
Founder
Joined 2000
Paid Member
My sincere apologies for the 9 hours of downtime today. When I "improved" the backups a few months ago, I switched the write location to the main partition instead of the backup partition. Today, during the "PST/EST dawn backups", the main drive filled up, which Linux really doesn't like, and it killed the webserver. I was just going to sleep at that time and didn't get notification until I woke up, hence the 8 hour delay in my response.

For the record, our server is monitored and managed by our host, and they did detect the problem immediately, logged in and attempted to resolve it, but weren't familiar enough with the configuration to fix it. We're working with them to ensure they have enough information on our (rather hardy/secure/locked down) configuration to fix this kind of problem in the future without additional assistance from me.

As this is the kind of model we're shooting for now (if stuff breaks, it gets fixed immediately by the server management company), I guess this was a good real world test. In this case it failed, and again, my sincere apologies for the interruption to your service. We'll do what it takes to ensure this can't happen again.

Jason
 
Thanks, Jason, for letting us know what happened, which is more than other services would do!

Ironically, I was just on, went off to do a virus scan, and came right back to nothing, so I thought my scanner had messed something up, as has happened to me before.

But by using one of my backup computers I figured that it must have been something with your server. Shooo, wow, I'm glad that you are back online! 🙂

jer
 
I was a bit worried :scratch: as to what happened.
I am illiterate on such stuff (what you wrote sounds Greek 😀 to me), but I don't think it is a bad idea to deliberately program for some regular downtimes in the future, to give us a chance to do some real diyaudio work off-line. 🙄

Jason, I am glad that You are back and catering for this wonderful site. 🙂
I hope You are doing well.

Regards
George

PS. Against the “common wisdom” here, a few hours sleep each day is a normal activity. 😱
 
+1 on geraldfryjr's comment. Good to be informed, but it's OK to have a glitch once in a while. Unlike audio and electronics, I have some experience with being a Unix admin (only over 12 years). Let me know if I can help with anything. Getting the operations team on board will require some documentation. Maybe they have a wiki site for that? That's what I used for the last three jobs to document procedures and changes, if for nothing else than to remind me of "how it was done." 😉
 
The downtime proved to me that I'm waaaaaaay too addicted to diyAudio!!! Man - I wuz Jones'n for my diyAudio!!!! Maybe I should get a life......🙄

Thanks, Jason, for getting things put right and letting us know what is going on. 😀
 
Thanks ptempel for the offer.

Our host now has all the information and our special requirements in case of an emergency, and backups are now back on their own partition 🙄. I'm getting texts and phone calls if disk space ever runs low or the site becomes unresponsive, and our senior admins on here have a direct line and verbal passwords to our host in case of extended drama. It's been a long time since we last had any downtime; hopefully it will be another year or two until the next problem 🙂
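
For anyone curious what a low-disk-space check might look like, here is a minimal sketch in Python. It is an illustration only, not the monitoring the host actually runs: the function names, the 10% threshold, and the choice of "/" as the path to watch are all assumptions, and a real setup would send a text or phone call instead of printing.

```python
import shutil


def free_fraction(path):
    """Return the fraction (0.0 to 1.0) of free space on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def needs_alert(path, min_free=0.10):
    """Return True when free space falls below `min_free` (hypothetical 10% default)."""
    return free_fraction(path) < min_free


if __name__ == "__main__":
    # "/" stands in for the main partition; adjust to the partitions that matter.
    for mount in ("/",):
        if needs_alert(mount):
            # Placeholder for a real notification (text message, phone call, etc.).
            print(f"WARNING: {mount} has only {free_fraction(mount):.1%} free")
        else:
            print(f"OK: {mount} has {free_fraction(mount):.1%} free")
```

Run from cron every few minutes, a check like this would flag a filling partition well before it is completely full, which is the failure described in the first post.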
 
Hi Jason,

My sincere apologies for the 9 hours of downtime today.

Don't worry about it. It's not like DIYA is "business critical".

It's just a "downtime" thing for those who post here.

Occasional breaks are probably a good thing.

You should probably schedule 24 Hour outages every two weeks or so... 😉

Appreciate the hard work to keep everything running and getting things back up to scratch when Murphy strikes...

Ciao T

PS, I remember once a nearly 2 week outage on a business critical Mini system, it was the Payroll... 🙁
 
You should probably schedule 24 Hour outages every two weeks or so... 😉
😱😱😱😱😱😱 Nooooooooooooooo!!! :faint:

PS, I remember once nearly 2 Weeks outage on a business critical Mini system, it was the Payroll... 🙁

I used to work for Digital Equipment in QC - we regarded those "little" bugs as customer based R&D. Annnnnnd of course MS is known for regarding bugs as "added features" 🙄
 