Server Lockup Randomly?!?!?

For any problems with Dawn of Light website or game server, please direct questions and problems here.

Post by Crazys » Sat Oct 18, 2014 3:53 am

This has happened to us a few times now... The server just randomly blows up, spams the errors below, and never recovers until rebooted.
Looks like changes were made to "DOL.GS.Util.GetThreadStack(Thread thread) in c:\DOLSharp\DOLSharp\trunk\GameServer\gameutils\Util.cs:line 412"
in SVN revision 3358.

It seems like it's trying to resume a thread that has already been resumed???

I just reverted this section to revision 3357 on my server to see if it happens again... If you have any other tips, let me know. I haven't found any other errors pointing to other issues...
Code:
/// <summary>
/// Gets the stacktrace of a thread
/// </summary>
/// <remarks>
/// The use of the deprecated Suspend and Resume methods is necessary to get the StackTrace.
/// Suspend/Resume are not being used for thread synchronization (very bad).
/// It may be possible to get the StackTrace some other way, but this works for now
/// So, the related warning is disabled
/// --- This can cause a lot of trouble for Mono Users.
/// </remarks>
/// <param name="thread">Thread</param>
/// <returns>The thread's stacktrace</returns>
public static StackTrace GetThreadStack(Thread thread)
{
    #pragma warning disable 0618
    try
    {
        thread.Suspend();
    }
    catch(Exception e)
    {
        return new StackTrace(e);
    }
    finally
    {
        thread.Resume();
    }

    StackTrace trace;
    try
    {
        trace = new StackTrace(thread, true);
    }
    catch(Exception e)
    {
        trace = new StackTrace(e);
    }
    finally
    {
        thread.Resume();
    }
    #pragma warning restore 0618

    return trace;
}
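In the code above, the first finally block calls Resume() right after Suspend(), so by the time the second finally block runs, the thread is no longer suspended. That would be consistent with the "Thread is not user-suspended; it cannot be resumed" exception reported at Util.cs line 412. Below is a minimal sketch of a variant that resumes exactly once. It is only an illustration, not the code of any SVN revision. It assumes .NET Framework or Mono, where Thread.Suspend, Thread.Resume, and the StackTrace(Thread, bool) constructor are available (all deprecated). The class name ThreadStackSketch and the suspended flag are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper class; the name is not taken from the DOL sources.
public static class ThreadStackSketch
{
    /// <summary>
    /// Captures the stack trace of another thread, resuming it at most once.
    /// Assumes .NET Framework / Mono (deprecated Suspend/Resume APIs).
    /// </summary>
    public static StackTrace GetThreadStack(Thread thread)
    {
#pragma warning disable 0618
        // Tracks whether Suspend() succeeded, so Resume() is only
        // called on a thread this method actually suspended.
        bool suspended = false;
        try
        {
            thread.Suspend();
            suspended = true;

            // Capture the trace while the thread is still suspended.
            return new StackTrace(thread, true);
        }
        catch (Exception e)
        {
            // Fall back to the exception's own trace, as the original does.
            return new StackTrace(e);
        }
        finally
        {
            if (suspended)
            {
                try
                {
                    thread.Resume();
                }
                catch (ThreadStateException)
                {
                    // The thread is no longer user-suspended (for example,
                    // it ended); nothing left to resume.
                }
            }
        }
#pragma warning restore 0618
    }
}
```

Keeping the capture and the single Resume() inside one try/finally means the thread stays suspended while the trace is read, and a failed Resume() cannot escape to the caller (here, the stats callback). Suspending a thread that holds a lock can still deadlock, so this remains a diagnostic aid rather than something to call routinely.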

Code:
22:24:07,279 - [40] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Found Frozen Region Timer -----
Name: RegionTime1 - Current Time: 7303483
22:24:17,281 - [40] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Failed to stop the TimeManager: RegionTime1
22:24:22,747 - [18] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Found Frozen Region Timer -----
Name: RegionTime1 - Current Time: 7303483
22:24:33,225 - [18] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Failed to stop the TimeManager: RegionTime1
22:24:36,526 - [10] - ERROR - DOL.GS.GameEvents.StatPrint - stats Log callback
System.Threading.ThreadStateException: Thread is not user-suspended; it cannot be resumed.
at System.Threading.Thread.ResumeInternal()
at System.Threading.Thread.Resume()
at DOL.GS.Util.GetThreadStack(Thread thread) in c:\DOLSharp\DOLSharp\trunk\GameServer\gameutils\Util.cs:line 412
at DOL.GS.GameTimer.TimeManager.GetStacktrace() in c:\DOLSharp\DOLSharp\trunk\GameServer\gameutils\GameTimer.cs:line 528
at DOL.GS.GameEvents.StatPrint.PrintStats(Object state) in c:\DOLSharp\DOLSharp\trunk\GameServer\gameutils\StatPrint.cs:line 204
22:24:38,181 - [40] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Found Frozen Region Timer -----
Name: RegionTime1 - Current Time: 7303483
22:24:47,117 - [40] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Failed to stop the TimeManager: RegionTime1
22:24:51,990 - [18] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Found Frozen Region Timer -----
Name: RegionTime1 - Current Time: 7303483
22:25:02,257 - [18] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Failed to stop the TimeManager: RegionTime1
22:25:05,844 - [10] - ERROR - DOL.GS.GameEvents.StatPrint - stats Log callback
System.Threading.ThreadStateException: Thread is not user-suspended; it cannot be resumed.
at System.Threading.Thread.ResumeInternal()
at System.Threading.Thread.Resume()
at DOL.GS.Util.GetThreadStack(Thread thread) in c:\DOLSharp\DOLSharp\trunk\GameServer\gameutils\Util.cs:line 412
at DOL.GS.GameTimer.TimeManager.GetStacktrace() in c:\DOLSharp\DOLSharp\trunk\GameServer\gameutils\GameTimer.cs:line 528
at DOL.GS.GameEvents.StatPrint.PrintStats(Object state) in c:\DOLSharp\DOLSharp\trunk\GameServer\gameutils\StatPrint.cs:line 204
22:25:07,645 - [40] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Found Frozen Region Timer -----
Name: RegionTime1 - Current Time: 7303483
22:25:17,376 - [40] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Failed to stop the TimeManager: RegionTime1
22:25:22,362 - [40] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Found Frozen Region Timer -----
Name: RegionTime1 - Current Time: 7303483
22:25:32,362 - [40] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Failed to stop the TimeManager: RegionTime1
22:25:35,704 - [40] - ERROR - DOL.GS.GameEvents.StatPrint - stats Log callback
Crazys
Contributor
 
Posts: 346
Joined: Tue Nov 07, 2006 10:18 pm

Re: Server Lockup Randomly?!?!?

Post by Leodagan » Sat Oct 18, 2014 6:38 am

We were seeing this error around your mass mob death test...

Is this happening in other scenarios?

I think I made a "pseudo" patch for this (by pseudo I mean that I can't test it without reproducing the problem...)

I'll investigate some more!

Edit: If you have any error log that tracks a locked GameTimer prior to this error, I'd like to see it :)
Leodagan
Developer
 
Posts: 1350
Joined: Tue May 01, 2012 9:30 am
Website: https://daoc.freyad.net
Location: Lyon

Re: Server Lockup Randomly?!?!?

Post by Crazys » Sat Oct 18, 2014 12:40 pm

Leodagan wrote:
We were seeing this error around your mass mob death test...

Is this happening in other scenarios?

I think I made a "pseudo" patch for this (by pseudo I mean that I can't test it without reproducing the problem...)

I'll investigate some more!

Edit: If you have any error log that tracks a locked GameTimer prior to this error, I'd like to see it :)

As I said, the zone locks with mob death were easier to follow. They would build up and kill it, and you could see the errors going by.
This is just playing, playing, playing, dead. And no one seems to be doing anything out of the ordinary. It just blows up all of a sudden. I've had it happen with 4 people online, and we had it happen weeks ago with 40... I thought back then this was also part of the mob death issue, but I now realize it isn't.

And like I said, there isn't anything else in the logs at all. The only other entries prior to it are the server stat printouts with the number of players, in/out, blah blah blah... And when you kill it to restart, the window crashes.

So far no issues with this since last night... But like I said, it's been SOOO random... I've seen it within a minute of a server reboot... I've seen the server up for 4 days with nothing...

Re: Server Lockup Randomly?!?!?

Post by Crazys » Sat Oct 18, 2014 1:24 pm

First error after the change:
Unlike before, when it just continuously spammed attempts to restore the region along with the "not user-suspended" errors, this time it just dropped the thread. As expected, that gave very odd results in game until the reboot.
09:03:53,940 - [24] - INFO - DOL.GS.GameEvents.StatPrint - -stats- Mem=287MB Clients=22 Down=1kb/s (11MB) Up=6kb/s (197MB) In=25pck/s (323K) Out=168pck/s (6034K) Pool=1022/1023(2) IOCP=1000/1000(2) GH/OH=33/412 RegionTime1=135t/s (1226) CPU=15.1% DOL=4.4% pg/s=0.0 dsk/s=0.0
09:04:08,612 - [24] - ERROR - DOL.GS.GameEvents.RegionTimersResynch - ----- Found Frozen Region Timer -----
Name: RegionTime1 - Current Time: 30990416
09:04:19,278 - [24] - ERROR - DOL.GS.GameTimer+TimeManager - failed to stop the time thread "RegionTime1" in 10 seconds (thread state=Background, WaitSleepJoin); thread stacktrace:

09:04:19,279 - [24] - ERROR - DOL.GS.GameTimer+TimeManager - at System.Threading.Thread.SleepInternal in line:0 col:0
at System.Threading.Thread.Sleep in line:0 col:0
at DOL.GS.GameTimer+TimeManager.TimeThread in c:\DOLSharp\DOLSharp\trunk\GameServer\gameutils\GameTimer.cs line:1018 col:8
at System.Threading.ThreadHelper.ThreadStart_Context in line:0 col:0
at System.Threading.ExecutionContext.RunInternal in line:0 col:0
at System.Threading.ExecutionContext.Run in line:0 col:0
at System.Threading.ExecutionContext.Run in line:0 col:0
at System.Threading.ThreadHelper.ThreadStart in line:0 col:0

09:04:19,282 - [24] - ERROR - DOL.GS.GameTimer+TimeManager - aborting the thread.

09:04:19,285 - [RegionTime1] - WARN - DOL.GS.GameTimer+TimeManager - Time manager thread "RegionTime1" was aborted
System.Threading.ThreadAbortException: Thread was being aborted.
at System.Threading.Thread.SleepInternal(Int32 millisecondsTimeout)
at System.Threading.Thread.Sleep(Int32 millisecondsTimeout)
at DOL.GS.GameTimer.TimeManager.TimeThread() in c:\DOLSharp\DOLSharp\trunk\GameServer\gameutils\GameTimer.cs:line 1018
09:04:19,297 - [RegionTime1] - INFO - DOL.GS.GameTimer+TimeManager - started timer thread RegionTime1 (ID:34)
09:04:24,685 - [24] - INFO - DOL.GS.GameEvents.StatPrint - -stats- Mem=288MB Clients=22 Down=0kb/s (11MB) Up=1kb/s (197MB) In=21pck/s (324K) Out=39pck/s (6036K) Pool=1022/1023(2) IOCP=1000/1000(2) GH/OH=33/402 RegionTime1=12t/s (1143) CPU=16.2% DOL=2.1% pg/s=0.0 dsk/s=0.3

Re: Server Lockup Randomly?!?!?

Post by Leodagan » Mon Oct 20, 2014 6:28 am

I don't have any workaround for now...

I'm trying to get used to the profiler and other debug tools to pinpoint this kind of trouble...

I made a special admin command that triggers some expected bugs by running specific benchmarks. I'm trying to build a "death spam" benchmark for later experiments :)

Re: Server Lockup Randomly?!?!?

Post by elcotek » Mon Oct 20, 2014 4:27 pm

I had the same problem on my server. I then upgraded to the latest MySQL version and converted the inventory/itemunique tables from InnoDB to MyISAM.

This solved the problem on my server.
Brotherland Final RvR/PvE/ToA http://brotherland.phpbb8.de/
elcotek
Server Representative
 
Posts: 177
Joined: Mon May 12, 2008 9:28 pm
Website: http://brotherland-2.de
Location: Germany

Re: Server Lockup Randomly?!?!?

Post by Leodagan » Sat Oct 25, 2014 4:12 pm

I provided a patch to make the Region Timer Resync more useful :)

It shouldn't fail to restart a region anymore!!

This is committed in SVN revision 3370.