Trials, Triumphs & Trivialities #188:
by Shannon Appelcline
April 20, 2006
Tuesday morning at approximately 10:15am the T1 line leading to Skotos went deathly silent. What followed was a two-day nightmare made worse by the bureaucracy and lack of communication that today practically define the United States' telecommunication industry. I've had a tough week as a result, and so I'm not going to write too much about this issue, but I would like to hit the main points.
The main point is one that I've made before: running a game company involves a lot more than just writing games. Having to manage a multiday downtime--to constantly make hard decisions that might ultimately decide the long-term fate of your company--shouldn't be the norm. We've had telecommunication problems in the past, but this is the first one that's been this grave, and which has potentially put the company itself at risk.
However, you're going to constantly have to deal with something outside the norm, and if you're a designer who just wants to make games, you're going to have to find someone to help you out with all these other issues.
My second point is: redundancy, redundancy, redundancy. At one point we had two T1s and one DSL line running into Skotos. We had to give that up in the face of economic necessity, but an extended downtime was exactly the situation that such a redundant setup was intended to avoid.
Now it happens that this time around the problem was that our local phone company had an entire 1200-pair cable burn out. In the past all of our networks might or might not have been on that one cable, depending on how far away from our location it was, and how many cables they're running in the area. Of the phone lines to the same location, two went out and one stayed up, so it's clearly hit or miss. However, if we'd been able to maintain our earlier redundancy, we'd at least have had some chance of staying up.
Though we could no longer maintain cable redundancy, we had recently implemented a different type of redundancy for this sort of issue. We created a news page, with an RSS feed at http://status.skotos.net. Rather than being at the same location as the rest of our Skotos machines, it's instead at the hosting facility where we keep RPGnet. As a result, whenever we have a network issue, we can pop over to status and update things there, and people soon see it on their feeds and aggregators. During this particular crisis, our top engineer also installed a simple chat room on status, and it was a great place for our players to chat with each other and remain members of the community, even though the community largely isn't available.
That's all I got this week: multitasking and redundancy.
And downtimes suck.