A difficult weekend for Second Life


The Grid Is Down While We Bang on Things – used during downtimes many years ago! 

On Monday 11th January, April Linden, a member of the Second Life operations team, published a blog post titled “Why Things Were Less Than Optimal This Past Weekend in Second Life”. The post explains that the team had a “series of independent failures happen that produced the rough waters Residents experienced inworld”.

The master node of one of the central databases, among the most heavily used in Second Life, crashed on Saturday. The failure, and the problems that followed on Sunday, disrupted a lot of Second Life residents over the weekend. By Sunday evening the operations team had re-stabilized the grid.

Here is what happened, in April’s own words:

On Saturday 9th January

Shortly after midnight Pacific time on January 9th (Saturday) we had the master node of one of the central databases crash. The central database that happened to go down was one of the most used databases in Second Life. Without it Residents are unable to log in, or do, well, a lot of important things.

This sort of failure is something my team is good at handling, but it takes time for us to promote a replica up the chain to ultimately become the new master node. While we’re doing this we block logins and close other inworld services to help take the pressure off the newly promoted master node when it starts taking queries. (We reopen the grid slowly, turning on services one at a time, as the database is able to handle it.) The promotion process took about an hour and a half, and the grid returned to normal by 1:30am.

After this promotion took place the grid was stable the rest of the day on Saturday, and that evening.

On Sunday 10th January 

That brings us to Sunday morning.

Around 8:00am Pacific on January 10th (Sunday), one of our providers started experiencing issues, which resulted in very poor performance in loading assets inworld. I very quickly got on the phone with them as they tracked down the source of the issue. With my team and the remote team working together we were able to spot the problem, and get it resolved by early afternoon. All of our metrics looked good, and I and my colleagues were able to rez assets inworld just fine. It was at this point that we posted the first “All Clear” on the blog, because it appeared that things were back to normal.

It didn’t take us long to realize that things were about to get interesting again, however.

Shortly after we declared all clear, Residents rushed to return to the grid. (Sunday afternoon is a very busy time inworld, even under normal circumstances!) The rush of Residents returning to Second Life (a lot of whom now had empty caches that needed to be re-filled) at a time when our concurrency is the highest put many other subsystems under several times their normal load.

Rezzing assets was now fine, but we had other issues to figure out. It took us a few more hours after the first all clear for us to be able to stabilize our other services. As some folks noticed, the system that was under the highest load was the one that does what we call “baking” – it’s what makes the texture you see on your avatar – thus we had a large number of Residents that either appeared gray, or as clouds. (It was still trying to get caught up from the asset loading outage earlier!) By Sunday evening we were able to re-stabilize the grid, and Second Life returned to normal for real.

It’s really interesting to hear April’s perspective on what went on. Towards the end of the blog post, April says: “My team takes the stability of the grid extremely seriously, and no one dislikes downtime more than us”.

One of the things I like about my job is that Second Life is a totally unique and fun environment! (The infrastructure of a virtual world is amazing to me!) This is both good and bad. It’s good because we’re often challenged to come up with a solution to a problem that’s new and unique, but the flip side of this is that sometimes things can break in unexpected ways because we’re doing things that no one else does.

I’m really sorry for how rough things were inworld this weekend. My team takes the stability of the grid extremely seriously, and no one dislikes downtime more than us. Either one of these failures happening independently is bad enough, but having them occur in a series like that is fairly miserable.

See you inworld (after I get some sleep!),

April Linden

I remember the weekly downtimes Second Life had many years ago, which lasted for hours, and the old message “the grid is down while we bang on things”. Grid stability has improved since then, but things can still go wrong at any time and take everyone by surprise, even after 13 years of Second Life being online.

Thanks to April Linden for explaining what happened during the weekend and apologising for the situation.

14 thoughts on “A difficult weekend for Second Life”

  1. Nice to see an explanation about an issue that led to so many events being canceled or postponed. But in these times, many fear that this is a way of saying, move on, because SL is going down.
    And they act accordingly, moving to OpenSim or just ceasing to spend real money in SL.
    LL urgently needs to reassure us that this is not the case, and to promote effective measures to stem the outflow of fearful users who are neither waiting for nor willing to move to Sansar.


    1. I agree!! I’m one of those who have pretty much stopped spending money in SL. I don’t like what I’ve seen about Sansar and unless there are some significant changes I doubt I’ll even try it when the open beta is available. The trouble is I’ve tried OpenSim and in my opinion it is not a good substitute for SL. So where does that leave us?


      1. Thanks Willow.

        I wonder if Sansar will be good enough to use on a regular basis, and what impact it will have on Second Life when made public? I’ve been logging into OpenSim grids in recent years and they seem so empty, with no one around. However, the total number of OpenSim regions and active users is on the increase!

        OpenSim has improved over the years and the hypergrid is gaining a lot of interest.


    2. Thanks zzpearlbottom. Yeah, it was good to see an explanation about the weekend issues. It seems the lab has improved communications once again and is letting everyone know about things faster.

      The lab chat @2 will be held later this month and we will hear more about Sansar. Should be interesting. It will be interesting to see the impact Sansar will have on Second Life, we will have to wait and see I guess.


  2. People often complain about SL in terms of how hard the Lindens work for us, or how well the customer service team handles our needs. I’m Maggie Larimore in SL and I handle a small area of the mainland along with other landowning members of the Chilbo Community, and each of us manages tenants and land and tries to keep things good for our neighbors. I can’t say how grateful I am, or how satisfied I have been, with the help I’ve received from the Lindens when I’ve needed it. It’s lovely to see April’s description of the rush to keep everything going smoothly. I really appreciate the LL folks and what they do, and am hoping to maintain my SL stuff while dabbling in Sansar in the future.


    1. Thanks nancyzingrone / Maggie Larimore.

      It was good to see an explanation about the weekend issues. It seems the lab has improved communications once again and is letting everyone know about things faster. I’ve been to the Chilbo Community and it’s really nice.

      I think the lab operations team and customer service are getting better. It’s great to see Lindens helping out inworld when things don’t work.

      I think many will be giving Sansar a go at least. I’m wondering what impact Sansar will have on Second Life, but it is still too early to tell.


  3. I truly prefer to stay in SL for the long term; there are so many memories, so much history, that I doubt any other virtual world can reproduce or come close to.
    I just wish that LL would reassure us better about SL not going anywhere, even after Sansar.
    That is, I think, what many wish to hear as well.


    1. The good news is that Second Life will not be closing down anytime soon and will continue for many years to come. I agree, there have been so many happy memories and so much history made in Second Life during the past 13 years.


      1. I understand that is what is even said by such people as Ebbe Altberg (LL’s CEO). But he also said Sansar will cannibalize SL. That is what is SAID but from observing LL for a long time that may or may not happen in the long run.

        Personally I think that a reduction in land tier and an ACROSS THE BOARD reduction in land set up cost would go a long way to inspire confidence in SL users about the longevity of SL.


    2. Like you I prefer to be in SL. Not just because of history and memories. I have plenty of those since I started my second decade in SL this past summer. But I look at SL’s progress like the US stock market: the long-term trend is upward. Sure, it has its ups and downs, but if you take a long-term look the progress is upwards.

      In the past I haven’t been worried about the long term survival of SL until now because of Sansar. From the little I’ve been able to find out I don’t like it. Put the emphasis on how little information is really available. I’m really looking forward to the upcoming Lab Chat. Hopefully we’ll receive some definite assurance that SL will continue. Right now Sansar looks like an SL killer.

