Sibolo

Dedicated server performance


Hi all, I'm Sibolo from the BDR and we are having lots of trouble running our DS for our ATC Campaign.

ATC is a multiclan PvP campaign with 80 players played once a week.

During the tests we ran before the start of the campaign, our old server showed severe lag, with very low server FPS and all kinds of strange behaviour in game (scripts failing and so on); it was really unplayable.

We tried different missions (both light, non-scripted missions and the well-known Valhalla) to find out whether the problem was tied to the mission we were using. The server kept lagging, so we turned our attention to server performance. We discovered heavy packet loss on our server connection, and the hosting company told us that they had recently set up a filter on UDP traffic to stop hacker attacks.

We tried playing on another clan's server and found that although the server FPS was quite low, nobody had lag or any other problem.

So we decided to rent a new server from this clan's hosting company, and we bought a more expensive and more powerful option to be sure we would get good performance.

These are the new server's specs:

CPU: Intel® Xeon® E3-1245 Quadcore, incl. Hyper-Threading Technology

RAM: 16 GB DDR3 RAM ECC

Hard disks: 2 x 3 TB SATA 6 Gb/s HDD, 7200 rpm (Software-RAID 1), Enterprise class

NIC: 1 GBit OnBoard, connected at 100 MBit

Backup Space: 100 GB

Traffic: Unlimited*

Once our server was up we started testing again, only to find the same lag problems. I investigated on the server and found that it was only using one of the 8 logical cores (really using, I mean: some threads were on the other cores, but they were not working hard). In fact, although we had very low server FPS, around 1-3, the maximum CPU usage was only 15%.

Reading some very interesting threads on this forum, I found out that the server should autodetect the number of cores, but that this sometimes fails, so I manually set -exThreads=7 and -cpuCount=4.

This produced very different behaviour: the server was now actually using up to three cores, and overall CPU usage went up to around 30%.

Reading further, I understood that since Hyper-Threading splits each physical core into two logical ones, in this case it would be better to switch it off so that those 3 cores could each have more power.

So now HT is off and the server can use up to 75% of the CPU during games.
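A quick way to sanity-check these percentages is to assume that the overall CPU figure is simply the average load across all logical cores. That is an assumption, since the post does not say how usage was measured, and `overall_usage` is a name made up for this sketch:

```python
# Rough sanity check of the CPU figures quoted above.
# Assumption: "overall CPU usage" is the average load across all logical cores.

def overall_usage(busy_cores: float, logical_cores: int) -> float:
    """Overall CPU usage in percent when `busy_cores` cores are fully loaded."""
    return 100.0 * busy_cores / logical_cores

# Hyper-Threading on: 4 physical cores appear as 8 logical cores.
print(overall_usage(1, 8))  # one saturated thread    -> 12.5
print(overall_usage(3, 8))  # three saturated threads -> 37.5

# Hyper-Threading off: 4 logical cores.
print(overall_usage(3, 4))  # three saturated threads -> 75.0
```

Under that assumption, one saturated thread on 8 logical cores lands near the 15% seen at first, and three busy cores out of 4 gives exactly the 75% ceiling reported with HT off.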

In game this means that the FPS has greatly improved: during the same mission with almost the same number of players, server FPS now stays in the 30-40 range... but then it suddenly drops to around zero, and after a second it comes back up to high values. This gives a smooth game for about 30 seconds and then a big lag for everybody: unplayable.

Further investigation suggested that these lags might be due to sudden large I/O operations, so I suspected a disk bottleneck, looked at the possibility of using a RAM disk and tried the FancyCache program, but we are still having these problems.

During the tests we tried fiddling with the server performance settings: we used the default values, the same values as our friend clan (whose server runs smoothly but with low FPS), and also values from the Kelly's Heroes site, which has a nice guide on these settings.

So now we are really lost. We are still paying for both servers, and any help would be greatly appreciated.


a) post your server.cfg and basic.cfg please

b) were your high player count tests run with 1.59, or also with 1.60?

c) did you try to optimize your mission?


Thank you for your reply:

a)

Server.cfg

HostName="ATC SERVER - {BDR} ITA Clan";

Password="xxx";

PasswordAdmin="xxxx";

reportingIP="arma2pc.master.gamespy.com"; // This is the default setting. If you change this, your server might not turn up in the public list. Leave empty for private servers

logFile="serveratc_console.log"; // Tells arma-server where the logfile should go and what it should be called

timeStampFormat=full;

motd[] = {

"", "", "",

"Benvenuto nel server BDR",

"Welcome to BDR server, welcome to Arma Tactical Combat",

" ",

"Regole/Rules:",

"Linguaggio offensivo non tollerato /Offensive language is not tolerated",

"Teamkilling non tollerato/ Teamkilling is not tolerated",

" ",

"Buon Divertimento!! / Have fun!!",

"Raggiungici sul sito : www.bdrclan.com",

"Join us on :www.bdrclan.com, www.arma-tactical-combat.com",

"",

};

motdInterval = 5;

// JOINING RULES

checkfiles[]={

};

maxPlayers=100; // Maximum amount of players. Civilians and watchers, beholder, bystanders and so on also count as player.

Kickduplicate=1; // Each ArmA version has its own ID. If kickduplicate is set to 1, a player will be kicked when he joins a server where another player with the same ID is playing.

verifySignatures=1; // Verifies the players files by checking them with the .bisign signatures. Works properly from 1.08 on

equalModRequired=0; // If set to 1, player has to use exactly the same -mod= startup parameter as the server.

// VOTING

voteMissionPlayers=1; // Tells the server how many people must connect so that it displays the mission selection screen.

voteThreshold=0.33; // 33% or more players need to vote for something, for example an admin or a new map, to become effective

// INGAME SETTINGS

disableVoN=1; // If set to 1, Voice over Net will not be available

vonCodecQuality=0; // Quality from 1 to 10

persistent=1; // If 1, missions still run on even after the last player disconnected.

// SCRIPTING ISSUES

onUserConnected=""; // self-explaining

onUserDisconnected="";

doubleIdDetected="";

regularCheck="";

//onDifferentData="";

// some ArmA specific stuff - signature verification

onUnsignedData = "kick (_this select 0)"; // unsigned data detected

onHackedData = "kick (_this select 0)"; // tampering of the signature detected

onDifferentData=""; // data with a valid signature, but different version than the one present on server detected

BattlEye = 0; //Server to use BattlEye system

Basic.cfg:

We have tried different configs including default values without solving our problems.

The config that I consider the best choice is the one I copied from the Kelly's Heroes site, but I am not sure it is:

/*

Example ArmA2 configuration file

by [KH]Jman, 8th September 2011. http://www.kellys-heroes.eu

These example numbers are for a 2.5 - 3Ghz Quad Core Xeon on a 100mBit connection.

*/

/*

Bandwidth the server is guaranteed to have (in bps).

This value helps server to estimate bandwidth available.

Increasing it to too optimistic values can increase lag and CPU load

as too many messages will be sent but discarded. Default: 131072

*/

MinBandwidth=10000000;

/*

Bandwidth the server is guaranteed to never have.

This value helps the server to estimate bandwidth available.

*/

MaxBandwidth=2147483647;

/*

Maximum number of messages that can be sent in one simulation cycle.

Increasing this value can decrease lag on high upload bandwidth servers. Default: 128

*/

MaxMsgSend = 1024;

/*

Maximum size of guaranteed packet in bytes (without headers).

Small messages are packed to larger frames.

Guaranteed messages are used for non-repetitive events like shooting. Default: 512

*/

MaxSizeGuaranteed = 1024;

/*

Maximum size of non-guaranteed packet in bytes (without headers).

Non-guaranteed messages are used for repetitive updates like soldier or vehicle position.

Increasing this value may improve bandwidth requirement, but it may increase lag. Default: 256

*/

MaxSizeNonguaranteed = 64;

/*

Minimal error to send updates across network.

Using a smaller value can make units observed by binoculars or sniper rifle to move smoother.

Default: 0.01

*/

MinErrorToSend = 0.0040000002;

/*

Users with custom faces or custom sounds larger than this size are kicked when trying to connect.

Use this wisely as it can be the cause of a lot of Join in Progress lag.

1600000 = 160k

*/

MaxCustomFileSize=0;

// EOF

b)

All our tests were done on the 1.59 stable release. We have now moved to 1.60 stable; only a few tests have been run so far, without much improvement, but further testing is needed (it is not easy to keep 80 people up for testing...).

c)

I have certainly done what I could to optimize the mission scripts.

These were the first suspected cause of the problem, but then we tried different missions which had already been played in the past and saw the same behaviour. We also tried some light missions and the well-known DAO Valhalla missions, and still had problems. This made us think it is mainly (but not only) a server issue.


This is just to "clean up"

Server.cfg

motd[] = {

"", "", "",

"Benvenuto nel server BDR",

..........

"", <- remove this last ","

};

Basic.cfg

MinBandwidth=41943040; (40 mbps)

MaxBandwidth=104857600; (100 mbps)

These are in bits per second, and of course they are a guesstimate.

And the ones to play with are these three, I think:

MaxMsgSend = 1024;

MaxSizeGuaranteed = 1024;

MaxSizeNonguaranteed = 64;

And why don't you use the default in v1.60 instead of MinErrorToSend = 0.0040000002;?

But the server probably won't use more than 4MB RAM.

You could set up the server with RAMDISK and put the most used addons in that RAMDISK.

Also you could use "mklink" to make links to the addons in RAMDISK.

After that you probably have to consider the mission.


"Basic.cfg

MinBandwidth=41943040; (40 mbps)

MaxBandwidth=104857600; (100 mbps)"

40Mbps is 40000000 and 100Mbps is 100000000 bits (b) per second.

Since he states the server is connected at 100Mbps, there is no reason not to use it all. There is still overhead; for 100Mbps it should be around 3~8Mbps.

MinBandwidth=92000000;

MaxBandwidth=100000000;

"and the ones to play with are these three I think

MaxMsgSend = 1024;

MaxSizeGuaranteed = 1024;

MaxSizeNonguaranteed = 64;"

MaxMsgSend is how many messages can be sent in a simulation cycle; this should allow you to use more of the bandwidth. A message cannot be larger than 1,500 bytes (B), and the only time you will see a full 1,500-byte message is on a LAN. So, to be conservative, assume a message will not be larger than 1,450~1,472 bytes; we will go with 1,450 bytes. 8,316 1,450-byte messages can be sent over 92Mbps.

MaxMsgSend=8192;

There's no real reason not to use the default MaxSizeGuaranteed or MaxSizeNonguaranteed. Since they are less than a full message size, the message is sent once it's full at 1,450 bytes.
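The 8,316 figure can be reproduced with a short calculation. The Python sketch below is only an illustration: `messages_per_second` is a hypothetical helper, and it assumes every message is a full 1,450 bytes. It suggests that 8,316 comes from treating 92Mbps as 92 x 1024 x 1024 bits; with the decimal convention stated just above (100Mbps = 100000000), the result is lower.

```python
# How many full-size messages fit into a given bandwidth in one second.
# Assumption: every message is exactly `message_bytes` long, with no extra overhead.

def messages_per_second(bandwidth_bps: int, message_bytes: int) -> int:
    """Whole messages of `message_bytes` that fit into `bandwidth_bps` bits per second."""
    return bandwidth_bps // (message_bytes * 8)

decimal_92_mbps = 92 * 1000 * 1000   # 92,000,000 bits per second
binary_92_mbps = 92 * 1024 * 1024    # 96,468,992 bits per second

print(messages_per_second(decimal_92_mbps, 1450))  # 7931
print(messages_per_second(binary_92_mbps, 1450))   # 8316
```

Either way, this is a count of messages per second, whereas the basic.cfg comment quoted earlier describes MaxMsgSend as a limit per simulation cycle.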

"But the server probably won't use more than 4MB RAM."

The server will never use a mere 4MB of RAM; it idles at around 100MB and gets as large as 1,200MB.

...Syn...


The server.cfg and basic.cfg look OK.

As ...Syn... points out you could tweak the MaxXXX values. The MinErrorToSend is pretty important too.

Or use the defaults by not setting the values.

However the problem is to test with 80 impatient people. :)

My advice would be to try to get 2-3 servers ready to test for one night.

This way you may have a higher chance for success and you can see if one server is the problem.

Also prepare a second mission that you know works, as a backup.

That way people can play it instead if your tests fail, which keeps them happy playing something at least.

One more idea: run the mission on your server alone for a few minutes and then use #missions.

After that, upload the mpStatistics.log here.

You said the server had low FPS with the 80 players, didn't you? Anyway, use #monitor 1 to get the info while testing.


You should use:

verifySignatures = 2;

And also, this value:

regularCheck="";

should be:

regularCheck="{}";

From the Wiki:

regularCheck is also known to cause sporadic (10 mins - 2 hrs) disconnects, terminating the client with "You were kicked off the game." on the client side and "Player Test disconnected." in the console log. To turn this function off, write regularCheck="{}";. But beware, this will also make the server more prone to cheating (even though most cheats are averted when connecting)

Probably doesn't help with the slowdowns, but just stuff I noticed. :)


@...Syn...

So he is guaranteed to get the full 100mbit?

I have never seen a 100mbit connection deliver 100mbit all the time.

I know my guesstimate of 40mbit is low, but then again, it is just for A2 to calculate whatever it calculates.

I think your values are far off, but you may be more correct than me, since I only have 12-18 people and only a 26mbit connection.

And of course 100mbit ain't 100000000 bits, since 1k ain't 1000, but 1024.

I see I wrote 4MB, but I meant 4GB of memory. But you guys got that one.

Then he has 12 GB of RAM left to play with.

And the explanation for MaxMsgSend was good. But I doubt your numbers will hold on "da intanet". But if he gets those values you say, it would be awesome.


Wasn't the regularCheck issue supposed to be resolved/improved with v1.60?

Is this not the case?

"Wasn't the regularCheck issue supposed to be resolved/improved with v1.60?

Is this not the case?"

I am wondering this myself.

"Wasn't the regularCheck issue supposed to be resolved/improved with v1.60?

Is this not the case?"

It was supposed to be, but in my testing with the current 1.60 Linux DS, things have only improved slightly. It still randomly kicks legitimate players and causes a significant amount of desync.

I haven't created a CIT ticket because I don't have any actual information on it - no logs or anything that could indicate the source of the problem.


Do you have -netlog running (does that even work on Linux machines)?


I'm also having a performance problem with my server running RP missions.

I start a mission and everything seems fine: checking with #monitor, the server FPS stays at a steady 48 all the time, then it suddenly goes to 0-5 FPS as if "freezing", desyncing players, and then it comes back, sometimes really quickly, sometimes not (2-15 sec).

CPU load stays under 50% all the time, then drops to zero during that "freeze" period and comes back again. I've checked the RPT files and everything seems fine.

Has anyone seen this?

[Attached image: 72003044.jpg]

It's annoying and happens every couple of minutes.

Edited by ArmaholicBR

"Do you have -netlog running (does that even work on Linux machines)?"

It works on the Linux DS.

"I'm also having a performance problem with my server running RP missions."

Use the Linux DS if you're running ZL.


I've read all your posts carefully.

One thing that really interested me, following your suggestion about the regularCheck command is this:

From the BIS WIKI:

If you do not include the regularCheck option or set regularCheck=""; it will be activated. This means the server checks files from time to time by hashing them and comparing the hash to the hash values of the clients. Since newer server versions this has led to some lag spikes on certain systems, because the whole file is hashed in one burst. (The heavy I/O operation essentially blocks the whole server application for 1-5 secs, depending on the file size.)

regularCheck is also known to cause sporadic (10 mins - 2 hrs) disconnects, terminating the client with "You were kicked off the game." on the client side and "Player Test disconnected." in the console log. To turn this function off, write regularCheck="{}";. But beware, this will also make the server more prone to cheating (even though most cheats are averted when connecting)
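To illustrate why hashing a whole file in one burst can stall a server loop, here is a minimal Python sketch. It is not the engine's code: the choice of SHA-1, the function names and the `between_chunks` callback are all assumptions made for the example. The first function reads and hashes everything in one blocking call; the second produces the same digest chunk by chunk, leaving room to do other work between reads.

```python
import hashlib
from typing import Callable, Optional

def sha1_in_one_burst(path: str) -> str:
    """Read the whole file and hash it in a single blocking step."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()

def sha1_in_chunks(path: str,
                   chunk_size: int = 1 << 20,
                   between_chunks: Optional[Callable[[], None]] = None) -> str:
    """Hash the file incrementally, calling `between_chunks` after each chunk."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            if between_chunks is not None:
                # Hypothetical hook: a real loop could process network
                # messages or simulation steps here.
                between_chunks()
    return digest.hexdigest()
```

Both functions return the same digest for the same file; the difference is only in how long the caller is blocked at a time.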

The description of the effects of regularCheck on "some" systems is amazingly similar to what I have seen. In fact, my own description of the problem in the first post was:

"we have higher FPS which actually stays in the 30-40 zone....but suddenly it goes down to around zero, then after a second it comes back up to high values. This gives smooth game for about 30 secs and then a big lag to everybody: unplayable.

Further investigating showed that maybe those lags can be due to sudden big I/O operations so I thought of a disk bottleneck and started looking at the possibility of using a ramdisk and tried the Fancycache program, but we are still having these problems."

How can I disable regularCheck?

Is it the regularCheck="{}"; that Das Attorney suggested?

Thank you.


"Use the Linux DS if you're running ZL."

Is there no way to fix the "freezing" under Win2003? I can't format the server :(

I have an application running there that needs Windows.


Server.cfg

persistent=1; // If 1, missions still run on even after the last player disconnected.

This option has a big impact on performance.

"No way to fix the 'freezing' under Win2003? I can't format the server :("

I wasn't able to fix it. The only thing I can suggest is to try changing -cpucount and -exthreads, but that may not work.

"Server.cfg

persistent=1; // If 1, missions still run on even after the last player disconnected.

This option has a big impact on performance."

Yes, persistent is turned on. It could cause performance problems, but I don't think that's what is causing this. As you can see, the CPU doesn't even reach 50% load. Even with no one playing, I can see that the freeze still occurs, and the server comes back to life after a few seconds.


I get this problem too (server)... after a while it looks like the game migrates to one CPU core... I have had CPU 1 go to max after a while, and then, re-running the same mission, I have had CPU 4 go to max... very strange, and I am interested in a tweak to get around this...


Definitely, as pointed out, the MinErrorToSend default value was changed to 0.001 as of 1.60, and a new setting was introduced: MinErrorToSendNear.

Read the updated page:

http://community.bistudio.com/wiki/basic.cfg

More details are explained by Suma: http://forums.bistudio.com/showpost.php?p=2029545&postcount=5

Additionally, keep in mind that new packet sizes were introduced (so make sure your MaxSize* values aren't higher than 1300, though that's not the case for the server in the OP).
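A check like this can be scripted. The following Python sketch is an illustration under stated assumptions, not an official tool: it parses `Name = value;` lines with a simple regular expression (it does not strip comments), and `parse_basic_cfg` and `oversized_settings` are names invented for the example. It lists any MaxSize* setting above the 1300 limit mentioned above.

```python
import re

# Matches lines such as "MaxSizeGuaranteed = 1024;" (numeric values only).
SETTING = re.compile(r"^\s*(\w+)\s*=\s*(-?\d+(?:\.\d+)?)\s*;", re.MULTILINE)

def parse_basic_cfg(text: str) -> dict:
    """Return {setting name: numeric value} for simple 'Name = value;' lines."""
    return {name: float(value) for name, value in SETTING.findall(text)}

def oversized_settings(settings: dict, limit: float = 1300) -> list:
    """Names of MaxSize* settings whose value exceeds `limit` bytes."""
    return [name for name, value in settings.items()
            if name.startswith("MaxSize") and value > limit]

# Values taken from the basic.cfg posted earlier in the thread.
sample = """
MaxMsgSend = 1024;
MaxSizeGuaranteed = 1024;
MaxSizeNonguaranteed = 64;
MinErrorToSend = 0.0040000002;
"""

print(oversized_settings(parse_basic_cfg(sample)))  # prints []
```

An empty list agrees with the remark above that the posted config is within the limit.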

---------- Post added at 10:56 ---------- Previous post was at 10:50 ----------

I've read all your posts carefully.

One thing that really interested me, following your suggestion about the regularCheck command is this:

From the BIS WIKI:

If you do not include the regularCheck option or set regularCheck=""; it will be activated. This means the server checks files from time to time by hashing them and comparing the hash to the hash values of the clients. Since newer server versions this has led to some lag spikes on certain systems, because the whole file is hashed in one burst. (The heavy I/O operation essentially blocks the whole server application for 1-5 secs, depending on the file size.)

regularCheck is also known to cause sporadic (10 mins - 2 hrs) disconnects, terminating the client with "You were kicked off the game." on the client side and "Player Test disconnected." in the console log. To turn this function off, write regularCheck="{}";. But beware, this will also make the server more prone to cheating (even though most cheats are averted when connecting)

The description of the effects of regularCheck on "some" systems is amazingly similar to what I have seen. In fact, my own description of the problem in the first post was:

"we have higher FPS which actually stays in the 30-40 zone....but suddenly it goes down to around zero, then after a second it comes back up to high values. This gives smooth game for about 30 secs and then a big lag to everybody: unplayable.

Further investigating showed that maybe those lags can be due to sudden big I/O operations so I thought of a disk bottleneck and started looking at the possibility of using a ramdisk and tried the Fancycache program, but we are still having these problems."

How can I disable regularCheck?

Is it the regularCheck="{}"; that Das Attorney suggested?

Thank you.

Obsolete info,

unless someone proves to me it's still happening on the actual latest betas....

Also, do not disable regularCheck, as you technically sack all the security with that decision.

I see where you got this: http://community.bistudio.com/wiki/server.cfg#Comments

The page was re-created in June 2011, which means the comment is even older (1.59 or 1.58 RC times).

Thus the info is completely flawed for the current 1.60 release and the 1.61 betas. I cleaned up the BIKI page and replaced it with a bit more valuable info.

Edited by Dwarden


Thank you Dwarden for clearing this up, although I was hoping it could be a solution.

I have disabled these checks and I must say that I haven't had those lag spikes anymore. In the most serious test we ran, the spikes were gone, but for some strange reason we had other problems, such as scripts not being initialized and about 20 people suddenly being moved to open sea after a few minutes of play. So the test was aborted about 10 minutes after the start. When we moved to the other server, we managed to play for about 1 hour and then had an enormous freeze lasting about 3 minutes. When the server recovered, a lot of people were kicked out by BE and could not even see the server in the list to rejoin. The remaining people were able to continue playing.

Another strange thing that has been happening (since 1.60) is at mission start: we have a synchronization process in which the clients wait for public variables to be initialized by the server. This used to go almost unnoticed, but now it always takes about 1 minute (in fact, the first time we believed the server had frozen or crashed); then it recovers and seems to work fine.

All these problems are making it really hard to keep our ATC campaign going, so I ask you all to continue suggesting solutions.

Thank you.


I'm a little confused about the math regarding MaxMsgSend. According to the Biki, the value determines the maximum number of messages that can be sent "per simulation cycle". By my understanding, a server can perform up to 50 simulation cycles per second, meaning any bandwidth calculation would have to be multiplied by 50.

With that in mind, setting the "MaxSizeXXX" values to 1300 (bytes per message) and MaxMsgSend to 1024, we would be sending 1300x1024=1331200 bytes per simulation cycle. Per second that is multiplied by 50, so 66560000 bytes, or ~63.48MB. Per second... :confused:

Isn't that a little too much for a 100MBit connection, which should have a maximum throughput of 12.5MB/sec? Looking at this post, it seems to me that either my understanding of the values is wrong, or VisceralSyn is missing something.
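The arithmetic in this post can be checked with a few lines of Python. The sketch assumes the worst case described above: every cycle sends the full MaxMsgSend count, every message has the maximum size, and the server runs 50 cycles per second. `peak_bytes_per_second` is a name made up for the illustration.

```python
# Worst-case outgoing traffic implied by MaxMsgSend and the MaxSize* values.
# Assumption: all messages are full size and every cycle uses the whole budget.

def peak_bytes_per_second(max_msg_send: int, max_msg_bytes: int,
                          cycles_per_second: int) -> int:
    """Upper bound on bytes sent per second under the assumptions above."""
    return max_msg_send * max_msg_bytes * cycles_per_second

peak = peak_bytes_per_second(1024, 1300, 50)
link_bytes_per_second = 100_000_000 // 8   # 100 Mbit/s link, decimal units

print(peak)                                 # 66560000
print(f"{peak / 2**20:.2f}")                # 63.48 (in units of 1024*1024 bytes)
print(link_bytes_per_second)                # 12500000
print(f"{peak / link_bytes_per_second:.1f}")  # 5.3 times the link capacity
```

So under these assumptions the numbers in the post hold: the configured ceiling is several times what a 100MBit link can carry, which means the link, not MaxMsgSend, would be the effective limit.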


That holds if the server's 'frames per second' is synonymous with "per simulation cycle". The only time the server will ever be at 50 FPS is when it's idle, or when you are running a mission with 0 scripts in it. You'll see the server sends very little data when it's not churning away on scripts; this is easy to test on a 100Mbps LAN.

That said, MaxMsgSend is used to set the maximum number of messages, which should allow you to control bandwidth. MaxSizeGuaranteed and MaxSizeNonguaranteed, when changed from their defaults, should net you some very large numbers. While it's very hard to confirm, from Google searching I vaguely gathered that smaller messages are supposed to make gaming more efficient, while larger message sizes help file transfers. This gets into the territory of Gigabit Ethernet's jumbo packets.

OK, my math: MaxMsgSend is the maximum number of messages you want sent. Remember that data is measured in bytes, and data transfer rates in bits.

MaxMsgSend=1024

MaxSizeGuaranteed=1300 ( 1300 is the max size in bytes, in bits thats 10,400bits )

1024*10,400bits = 10,649,600bits ( 10.6Mbps )

If you want to assume that the server is sending that up to 50 times a second, at the most. Then yes you should come up with an insane number like 532,480,000bits.

In 100Mbps LAN I use MaxMsgSend=8192; and MaxSizeGuaranteed=484; and MaxSizeNonguaranteed=242;

484byte = 3,872bits

3,872 bits * 8192 = 1,874,048bits

and If you multiply that times 50, the maxbandwidth used should be 93,702,400bits. Which, should come up to about 93.7Mbps. Or in case you want to use bytes/s= 11.71MB/s.

So yes, if you don't do your math right, you get real funny numbers.

Also I use this tool, for testing different network settings.

Course, I might also be missing something, too!!!

Edited by VisceralSyn
typos and other grammatical errors, as usual...

"1024*10,400bits = 10,649,600bits ( 10.6Mbps )

If you want to assume that the server is sending that up to 50 times a second, at the most. Then yes you should come up with an insane number like 532,480,000bits."

I don't want to assume anything; I'm simply trying to figure out how to optimize the server bandwidth settings. And since the Biki describes MaxMsgSend as messages per cycle, not per second, I believe this to be correct. Of course the server never really maintains that framerate in large missions, so we could use something more realistic for our calculations. Let's say 30. Still doesn't help much.

"So yes, if you don't do your math right, you get real funny numbers."

Which part of my math is wrong? Between this comment, the "insane number" comment, and your attitude as a whole, you're coming off a little condescending for my taste. Which is funny, because:

"484byte = 3,872bits

3,872 bits * 8192 = 1,874,048bits

and If you multiply that times 50, the maxbandwidth used should be 93,702,400bits. Which, should come up to about 93.7Mbps. Or in case you want to use bytes/s= 11.71MB/s."

Emphasis mine. That highlighted part (1,874,048bits) didn't even look plausible at first glance, so I did the math myself. 3,872 bits * 8192 is 31,719,424 bits. How the hell did you come up with a number over ten times lower? :confused:

Anyway, if you multiply that times 50, you get around 1.5Gbps. Even if you assume the server fps will never go over 30, it'll still need a gigabit connection to send that number of messages. Unless, of course, my math went insanely wrong somewhere. :rolleyes:
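The disputed product is easy to verify. The Python sketch below redoes the calculation with the values quoted above, and also inverts it to ask how many 484-byte messages per cycle would fit into a 100Mbps link. It assumes that every message has the maximum size and that the link carries nothing else, so the results are upper bounds, not recommendations. The helper names are hypothetical.

```python
# Re-check of the MaxMsgSend arithmetic discussed above.

def bits_per_cycle(max_msg_send: int, message_bytes: int) -> int:
    """Bits sent in one simulation cycle if every message is full size."""
    return max_msg_send * message_bytes * 8

def max_messages_per_cycle(link_bps: int, message_bytes: int,
                           cycles_per_second: int) -> int:
    """Largest per-cycle message count that still fits into the link."""
    return link_bps // (message_bytes * 8 * cycles_per_second)

per_cycle = bits_per_cycle(8192, 484)
print(per_cycle)        # 31719424
print(per_cycle * 50)   # 1585971200, about 1.59 Gbit/s at 50 cycles per second
print(per_cycle * 30)   # 951582720, about 0.95 Gbit/s at 30 cycles per second

# Inverse question for a 100 Mbit/s link (decimal units).
print(max_messages_per_cycle(100_000_000, 484, 50))  # 516
print(max_messages_per_cycle(100_000_000, 484, 30))  # 860
```

This supports the 31,719,424 figure. Incidentally, 1,874,048 is exactly 3,872 x 484, which suggests the message size was multiplied by itself rather than by MaxMsgSend.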

"Also I use this tool, for testing different network settings."

Thanks, I'll check that out.

