
Issues This Week


Trent

Now that we're back up and should be stable, I'll go over the issues we hit this week.

 

Web Proxy

The trouble started last weekend, when the web panel began having problems loading. We saw the web proxy lagging and worked to address that, but the underlying problem was unclear. In the middle of the week we made a fix that got the web proxy performing normally again, but that pushed the issue one step deeper into the Minehut system.

 

APIs

After the web proxy stopped being the pain point, the issue showed up in the APIs. Both the network and web APIs are critical to Minehut and handle internal communication between Minehut services. We saw these services timing out, which meant the web panel and game servers were having issues and players couldn't log in to Minehut or join the lobby. We spent 2 days debugging bottlenecks here and found a few pain points, which we patched yesterday.

 

Redis

Once the API issues were cleared up, we got to the root of the problem. Redis is a cache that stores a lot of data, and it is where our APIs retrieve data when responding with server status, MOTD, and how many servers are online. We found 2 issues with this that, combined, were catastrophic:

  1. Redis is a cache and isn't meant to hold persistent data, but none of the data in Redis was ever being cleared.
  2. An endpoint that hits Redis would pull every object in Redis and iterate over it, which is not how a key-value store is supposed to be used.
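
To illustrate the second point, here is a rough sketch of the difference between scanning every object and doing a single key lookup. It uses a plain Python dict as a stand-in for a key-value store; it is not Minehut's code, and the key names ("server:<id>", "servers:online") and function names are made up for the example.

```python
# Toy in-memory stand-in for a key-value store; not Minehut's code.
# Key names such as "server:<id>" and "servers:online" are hypothetical.
store = {}


def set_status(server_id, online):
    """Write one server's status and keep the online counter in step."""
    key = f"server:{server_id}"
    was_online = store.get(key, {}).get("online", False)
    store[key] = {"online": online}
    # Adjust the counter only when the status actually changes.
    if online != was_online:
        delta = 1 if online else -1
        store["servers:online"] = store.get("servers:online", 0) + delta


def online_count_by_scan():
    """The anti-pattern: touch every object in the store on each request."""
    return sum(
        1
        for key, value in store.items()
        if key.startswith("server:") and value["online"]
    )


def online_count_by_key():
    """The key-value way: one lookup, no matter how many objects exist."""
    return store.get("servers:online", 0)
```

Both functions return the same number, but the scan gets slower as the store grows, while the keyed lookup stays constant. The trade-off is that the counter has to be kept up to date on every write.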

Figure (rediscount2.PNG): Graph of objects in Redis. This should be a flat line, with old objects expiring as new ones are created.
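
As a rough illustration of why expiry keeps that line flat, here is a minimal toy cache in which every entry carries a time-to-live. It is an in-memory sketch, not Redis and not Minehut's setup; the class name, the 60 second TTL used in the usage note, and the injectable clock are assumptions made for the example. In Redis itself the comparable idea is putting an expiry on keys, for example with the EXPIRE command.

```python
import time


class ExpiringCache:
    """Toy cache where every entry carries an expiry time (a TTL)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so the behaviour can be tested
        self._data = {}      # key -> (value, expires_at)

    def set(self, key, value):
        self._data[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._data[key]   # lazily drop the stale entry
            return None
        return value

    def purge(self):
        """Drop every expired entry and return how many live ones remain."""
        now = self.clock()
        self._data = {k: v for k, v in self._data.items() if v[1] > now}
        return len(self._data)
```

If one entry is written per second with a 60 second TTL, the number of live entries settles at about 60 instead of growing forever.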

Redis Fix

The above means that there were millions of objects in Redis that were never cleared, and they were all being iterated over every time that endpoint was called. This isn't new; it has been building up for a while. This morning it hit the tipping point, and multiple services that rely on Redis all failed. It took us a little while to track down the issue, and then we attempted to start Redis on a more powerful box and flush it without doing a full network restart. That helped resolve the problem, but left us in a broken state, as our cache doesn't rebuild itself safely in that case. It took a couple of restarts to get the new Redis configured correctly and in a healthy state, and now we're finally there.
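
For readers wondering what a cache that rebuilds itself after a flush could look like, one common pattern is "cache-aside": on a miss, read the authoritative record and put it back in the cache. The sketch below is only an illustration of that general pattern, not a description of how Minehut works; `get_server_status`, the "status:<id>" key format, and the `load_from_source` callable are all hypothetical.

```python
def get_server_status(server_id, cache, load_from_source):
    """Return a status, repopulating the cache on a miss.

    `cache` is any dict-like object and `load_from_source` is a hypothetical
    callable that reads the authoritative record (for example a database).
    """
    key = f"status:{server_id}"
    status = cache.get(key)
    if status is None:
        # Miss: the cache was flushed or the entry expired. Rebuild this one
        # entry from the source of truth instead of failing the request.
        status = load_from_source(server_id)
        cache[key] = status
    return status
```

With this shape, flushing the cache only costs one slower read per key afterwards, rather than leaving dependent services in a broken state.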

 

Other Things

  • Fixed RAM not allocating correctly when upgrading player slots.

 

It was a rough week, but we should be out of it for a bit. The fixes we made today will only delay the problem for a while, which is why we're working on rebuilding pieces of Minehut to allow us to continue to scale to 4000+ players.


Honestly I call bs. We have never had such a severe problem like this other than the many times minehut has been hacked. What makes it happen now, and it seems that you guys didn't even address the issue or make it clear that our server data wasn't about to be wiped again.


15 minutes ago, Colorlot said:

Honestly I call bs. We have never had such a severe problem like this other than the many times minehut has been hacked. What makes it happen now, and it seems that you guys didn't even address the issue or make it clear that our server data wasn't about to be wiped again.

'It took a couple of restarts to get the new redis configured correctly and in a healthy state, and now we're finally there.'

Is that not addressing the issue?? And what does this have to do with server data wiping? The cache is not for persistent data.

Edited by ecstacide

55 minutes ago, ecstacide said:

'It took a couple of restarts to get the new redis configured correctly and in a healthy state, and now we're finally there.'

Is that not addressing the issue?? And what does this have to do with server data wiping? The cache is not for persistent data.

Some people were worried btw, and also... MH didn't drop that while it was going on.


I have been messing around with the Minehut API and noticed that every now and then the $.getJSON() function in jQuery would either fail or time out. It's nice to know that you guys are on top of this and working to keep Minehut up and stable, along with some of its other services and aspects. 😀


14 hours ago, Colorlot said:

Honestly I call bs. We have never had such a severe problem like this other than the many times minehut has been hacked. What makes it happen now, and it seems that you guys didn't even address the issue or make it clear that our server data wasn't about to be wiped again.

I'm pretty sure that it was clearly stated in the discord server that nothing would be wiped.  Correct me if I'm wrong tho.


I'm glad this turned out to be a success. Good job.

- Cover

Been playing Minehut since late summer of 2015 and still continuing. 

Servers I have created in the past:

SkyblockSMP (2015-2016)

MagicalSkyblock (2016-2017)

Mythblock (2019 - ) 

 

SkyblockSMP and MagicalSkyblock were two of my successful servers. I'm not an active server creator, but I do create servers whenever it's the right time for me to create one.



Hello, I've been using Minehut for a few weeks now, but I've run into a problem: I'm being sent to the wrong server. Instead of the server I'm logging onto, it sends me to the first server I made, for no apparent reason. Shutting it down and opening it back up didn't help, and neither did restarting. I've already lost a server to this error. The cause of the error is the refresh button on the browser.

(sorry if this doesn't make any sense but I'm pretty sure other MH users can relate and explain this error a lot better than me)


now everything is slow... Forums 5m to load. Website not loading properly/Not Loading

Fix the website.

Fix the server.

Lag is better than not able to do anything. I started my server through the CP then i exited out went back and wasn't able to open it back up/properly.


2 hours ago, wm9 said:

now everything is slow... Forums 5m to load. Website not loading properly/Not Loading

Fix the website.

Fix the server.

Lag is better than not able to do anything. I started my server through the CP then i exited out went back and wasn't able to open it back up/properly.

We had an issue for about 10m where the website was lagging. We resolved it though.

 

2 hours ago, wm9 said:

now one of my servers isn't on the CP...

 

 

https://gyazo.com/74e707113128d2f4087007e1145aa23c

Please fix like FR. I have been playing since early 2017 like come on I remember Tiers and if u didn't have a tier u only had 2 plugin slots. Bring back the old days. Not the plugin slots though.

Can you please create a support ticket at https://minehut.com/support


When is the next Good Grief?

 


Likes are appreciated ❤️

 

Username: SuperOrca

Discord: Link to Profile

Rank: [VIP]

Joined Minehut: August 10, 2017

Joined Forums: June 10, 2019

 

Experienced in Python, Java (mainly spigot), Javascript (node.js, basics of react.js), and Web Development. DM on discord me if you want a custom discord bot for your server (i'm bored).


This topic is now closed to further replies.