Canada Was Offline Today

I went to work at the office before I realized that the network was down; we couldn't run our normal operations in production at first, and all our business cell phones were cut off too. But I had work to do and international video meetings to join, so I came home, used my Bell internet, VPN'd in through our US network, and all was good. Of course, I thought it was a cyber attack on Rogers 😀
 
No service for a day still means 99.7% uptime. Not that long ago we'd have dreamed of uptime like that. The fact that 99.7% is now so bad it flattens society for a day should give you pause.
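For the pedants, the arithmetic, as a quick back-of-envelope sketch in Python (assuming one full day out in a 365-day year):

# One full day of downtime in a year
downtime_hours = 24
hours_per_year = 365 * 24
uptime = 1 - downtime_hours / hours_per_year
print(f"{uptime:.2%}")  # 99.73%, i.e. the 99.7% above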

Tom

We went from using it for entertainment, to transmitting occasional business information, to depending on it for day-to-day business functioning. So what was once an annoyance is now crippling.
With so much in 'the cloud' now, a lot of businesses don't have local servers to run their software, so without the internet they are dead in the water. Our work software just stops without the internet; I couldn't tell you whether we stock an item, its price, or where in the warehouse it sits, even if you knew what you wanted. Anyone remember shipping UPS packages with a handwritten ledger?
There was some movie or book whose plot was to knock out the internet. At first it seemed silly, but after thinking about how many things wouldn't happen without the internet, it was scary.
 
Depends where you want the redundancy; at the telco level** it can be an expensive pain*. From what little has leaked out, it was a BGP snafu that took out the whole Rogers IP range. As everything is IP now, basically everything in their network stopped being able to talk (there's a toy sketch of the failure below, after the footnotes). Cloudflare went down a couple of weeks ago (and took this website off the air for a while) with a BGP issue. The BIG difference is that they fixed it quickly, and it's the ability to resolve things quickly that sorts the men from the boys.

*I'm sure everyone on here who's worked in or around datacentres has had to deal with $$$ redundant links where it turns out both pairs of fibre leave the building in the same conduit, so one badly placed backhoe knocks both out at once.

**Disclaimer: I currently work with stuff that IS partially redundant at the cellular level, so it can handle a full radio network failure. It sometimes works, but it still ends up backhauling out of the primary telco, so it would not survive this sort of outage.
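To make the snafu concrete, here's a toy sketch in Python (purely illustrative, not router config; the prefixes are made-up documentation ranges). Per Rogers' own explanation quoted downthread, a routing filter got deleted; a filter is basically a predicate over advertised prefixes, and with no predicate everything gets installed.

from ipaddress import ip_network

# Toy model: a routing filter decides which advertised prefixes
# are allowed into the routing table.
ALLOWED = [ip_network("203.0.113.0/24"), ip_network("198.51.100.0/24")]

def route_filter(prefix: str) -> bool:
    net = ip_network(prefix)
    return any(net.subnet_of(allowed) for allowed in ALLOWED)

advertised = ["203.0.113.0/25", "8.8.8.0/24", "198.51.100.128/25"]

# Normal operation: only permitted routes get installed.
table = [p for p in advertised if route_filter(p)]  # two routes

# The snafu: filter deleted, so every route gets installed and the
# table balloons toward the full Internet table until the gear chokes.
table_after_snafu = list(advertised)  # everything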
Apparently, Interac has learned something from this. They should have written "Strengthen our non-existent network redundancy"... Also, "adding a supplier" sounds a lot like Bell to me, as there aren't many other providers large enough.
 
Re: Reasons for Rogers outage:

"If anything can fail it will, and at the worst possible time" -- Peter Principle

Tsujigiri
"To test a new sword on a unsuspecting passer-by" This was an actual practice in medieval Japan. Now it can be used for black humour when a company releases a new product or revision (to get it out the door quickly) after shallow testing.

So Rogers. how are those blades working now?
 
We once bought fiber links to two sites 50 or so miles apart that were supposed to take different paths for redundancy. Come to find out, they wound up in the same manhole on the east side of a large city; we discovered that when an outage took out both at once.

One of the bigger failure modes with BGP configuration problems is that you may not be able to reach the big router in a datacenter once you've pushed a bad config. Having serial links and dial-up to log into that switch is sort of critical. It's often not done because it's costly. Sort of the "work" part of networking.
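For anyone who hasn't had the pleasure: the point of the serial link is that it doesn't care what you just did to BGP. A minimal sketch using the third-party pyserial library (the device path is an assumption, and 9600 8N1 is just the common default on network gear):

import serial  # third-party: pip install pyserial

# Out-of-band console: plain RS-232, still usable when IP routing is toast.
# /dev/ttyUSB0 and 9600 8N1 are assumptions; check your gear's defaults.
console = serial.Serial("/dev/ttyUSB0", baudrate=9600, bytesize=8,
                        parity="N", stopbits=1, timeout=2)
console.write(b"\r\n")                             # nudge the console
print(console.read(256).decode(errors="replace"))  # hopefully a login prompt
console.close()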
 
Yep. The internet is like a public utility. A power outage is only fun for the first few hours – especially if it's winter and a bit nippy out.

Tom
Haven't some states in the mob underneath you passed, or tried to pass, legislation making the internet legally a utility? I'm sure I read something about it a few years back.
Having serial links and dial-up to log into that switch is sort of critical. It's often not done because it's costly.
I find modern IP-based ILO very scary as a concept, but I figured I was a dinosaur for still finding comfort in a 9600-baud serial connection into the back for when all else fails 😀
 
I have used a lot of IP-based ILO to restore downed servers. I have reset Dell servers that wanted you to hit F2 on boot for some stupid reason. I've even driven 25 miles to the datacenter because some device was stuck on boot for no good reason and had no ILO.
You go and disable the "stop on keyboard error" BIOS setting on all of the servers, then find it reset when you update the firmware.

I have configured dial-up access to switches and routers. It can be a lifesaver when you are miles away.
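To illustrate the remote-reset part (a sketch, not anyone's actual setup): most of these management controllers also speak IPMI, so the usual last resort is a power cycle over the management LAN with ipmitool. The hostname and credentials below are placeholders, and it assumes ipmitool is installed and IPMI-over-LAN is enabled on the BMC.

import subprocess

# Power-cycle a wedged server via its BMC (IPMI v2.0 over LAN).
# Hostname, user, and password are placeholders.
subprocess.run(
    ["ipmitool", "-I", "lanplus", "-H", "bmc.example.com",
     "-U", "admin", "-P", "changeme", "chassis", "power", "cycle"],
    check=True,
)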
 
Re: Reasons for Rogers outage:

"If anything can fail it will, and at the worst possible time" -- Peter Principle
Isn't that Murphy's Law?
The Peter Principle is something else: the tendency in most organizational hierarchies, such as that of a corporation, is for every employee to rise in the hierarchy through promotion until they reach a level of respective incompetence.
 
Looks like Rogers has fessed up to the problem. From https://www.theregister.com/2022/07/25/canadian_isp_rogers_outage/

The configuration change deleted a routing filter and allowed for all possible routes to the Internet to pass through the routers. As a result, the routers immediately began propagating abnormally high volumes of routes throughout the core network. Certain network routing equipment became flooded, exceeded their capacity levels and were then unable to route traffic, causing the common core network to stop processing traffic. As a result, the Rogers network lost connectivity to the Internet for all incoming and outgoing traffic for both the wireless and wireline networks for our consumer and business customers.

Yeah, that'll do it. At least they've had the decency to sack the CTO (probably with a diamond-encrusted golden parachute, though).

Edit: I actually know the outgoing CTO from a previous gig, though back then he wasn't the CTO; he just appeared to be having an affair with the CTO.
 
It's not just Rogers to blame.

Why did the captains of industry not wake up to the vulnerability of THEIR decisions before this? Should they not have thought of that and lined up redundant suppliers even BEFORE the outage happened? Seems like a lot more CTOs need to be sacked, not just Rogers'.
Think about it: if you are running a business that depends on connectivity, would you rely solely on one supplier?

For example, I need the internet every day. So when I change my ISP, I at least allow a sizeable overlap between the two subscriptions to make sure everything is OK before I end the old one. If I absolutely needed the internet each day, I would pay for two subscriptions: an extra $100 a month would be nothing if saving it meant thousands of dollars of business going down the drain for a day.

Then there is the vulnerability of connected systems IF subsets cannot operate without full connectivity. What were they thinking?

I was not affected, because three months ago I left Rogers after being with them for over six years running. They treated a loyal customer like me as garbage. AFTER I left, the calls asking me to come back started. I had given them ample opportunities to retain me as a customer. Thank goodness their imbecilic marketing policies made me wake up.
 
The Rogers fess-up has too much blah-blah for the general public. Short story: their team did not have the discipline to carefully plan and check the changes made to critical system components.
 
The Rogers fess-up has too much blah-blah for the general public. Short story: their team did not have the discipline to carefully plan and check the changes made to critical system components.
Hey, even Twitter needed an edit button for a reason. Apple will introduce the ability to undo message errors. Accuracy, planning, and perfect execution no longer matter. We even want self-driving cars and collision avoidance systems. See a pattern?
 
The Rogers fess-up has too much blah-blah for the general public. Short story: their team did not have the discipline to carefully plan and check the changes made to critical system components.
And not enough for anyone even vaguely involved with this sort of stuff. ISPs go down regularly; things are too complex now to fully test. The measure of a good company is how quickly they can react and fix. For example, Cloudflare went down in June. The North American continent was asleep, but I lost my morning binge of DIYaudio. They fixed it in 90 minutes (and admitted that was too slow and that they need to work on coordinating rollbacks). In the Cloudflare case, the upgrades worked in their older, smaller Type A datacentres and borked their newer, larger Type B ones.

Planning for recovery from these sorts of outages is hard and time-consuming, which is why it gets skipped in organisations that, for example, have little competition, or a C-suite whose bonuses are based on getting rid of pesky overheads like expensive techies with 30 years of experience 😀
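The classic defence against the "pushed a bad config, now I can't reach the box" trap is a commit-confirmed pattern, as on gear that supports something like Junos' commit confirmed: apply the change, then roll it back automatically unless an operator confirms within a window. A rough sketch of the idea in Python, where apply_config and rollback are hypothetical callables standing in for whatever your gear actually does:

import threading

def commit_confirmed(apply_config, rollback, timeout_s=600):
    # Apply a config change, but undo it automatically unless a human
    # confirms within timeout_s. Hypothetical names, illustrative only.
    apply_config()
    timer = threading.Timer(timeout_s, rollback)
    timer.start()

    def confirm():
        timer.cancel()  # operator still has connectivity: keep the change
    return confirm

If the change locks you out, you never call confirm() and the old config comes back on its own, which is precisely the moment you need it.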