
December\January Development Objectives

Posted: December 4th, 2015, 12:03 pm
by Obi-Two
Greetings Galaxy, and happy holidays.

October and November were quiet on the development front. Stabilising the network layer and fixing a few bugs has uncovered more problems than anticipated, so we continue into the new year stabilising the code and updating some of the procedures.

I have been given admin access to the official GitHub and have pulled HTX's backports and bug fixes over; this will build on Windows platforms. I have also added Lei2's additions to enable it to be built on Linux.

In an ideal world I'd like nothing more than to have the project multi-platform. Sadly, at present we are unable to fulfil this. Further development will continue on the unofficial_hope git, which sadly will be Linux only.

December and January objectives:
I have opened up forum topics relating to the mmo-server project. If you have any questions or wish to help, please start there and we will help you if we can. I'd also like to start looking at updating the dependencies, taking a closer look at what boost does for us and at any alternatives that might be more useful to us (though it's early days yet).

Lei2 wrote: I have thread safety on the plan, but the goal remains CPU burn-in reduction and TC stability.

May the force be with you! Enjoy your Christmas, Hanukkah, Winter Solstice, Kwanzaa and New Year.

Re: December\January Development Objectives

Posted: December 6th, 2015, 9:19 am
by lei2
Greetings to all from my side too!

Good to hear the git will live on and is getting a central place. The
mmoserver trunk might be helpful later when it comes to detail questions.

The multi-platform issue ... well, not much to add to that, all has been said. It simply
is the wrong track for an mmoserver, just as it would be the wrong track to run it on a
PDA. Not because of ideal-world issues, but because it might be a bit boring to play
a single-player galaxy all the time.

Indeed, not much happened in public. Maybe I should report a bit on what
I'm currently doing.

One goal was to get rid of boost, and that is what I did first. So by now
I have a boost-free trunk. Most boost uses could be replaced by the STL; I needed to add
tiny plugins to the Utils to replace things like boost pool or - out of editing laziness -
the lexical cast, which is now more or less like

Code: Select all
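#include <sstream>

// Minimal lexical cast: parse the text in `from` into a value of type T.
template <typename T>
T lexical(const char *from) {
    T ret{};                        // value-initialised, so a failed parse returns T()
    std::istringstream in(from);    // the stream has to be fed with the input text
    in >> ret;
    return ret;
}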

Config file handling and debug output also used boost, so I wrote new modules to replace them.
Debug levels can now be changed at runtime, and the config is all in one file.

The config is then simply used like

Code: Select all
cfg svrcfg("Zoneserver",argc,argv);
if(svrcfg.getValue("ZoneName","")==""){helpmsg(); return;}

It will then search the config file for a section [Zoneserver], extract the
keyword ZoneName and make its value (ZoneName = tutorial) available to
getValue. If --ZoneName is given as an argument, it overrides the config file.
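
The lookup order described above (config file section first, command line argument on top) can be sketched roughly as follows. This is only an illustration under assumptions: the class name Cfg, the --Key=value argument form and passing the file contents as a string are all made up for the example; only getValue and the section/keyword idea come from the post.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for the cfg class from the post (names are assumptions).
class Cfg {
public:
    // section: e.g. "Zoneserver"; fileText: contents of the config file.
    Cfg(const std::string& section, const std::string& fileText,
        int argc, const char* const* argv) {
        assert(argc == 0 || argv != nullptr);
        parseFile(section, fileText);
        parseArgs(argc, argv);  // arguments are applied last, so they win
    }

    // Returns the stored value, or fallback if the key is unknown.
    std::string getValue(const std::string& key, const std::string& fallback) const {
        auto it = values_.find(key);
        return it == values_.end() ? fallback : it->second;
    }

private:
    static std::string trim(const std::string& s) {
        const char* ws = " \t\r\n";
        auto b = s.find_first_not_of(ws);
        if (b == std::string::npos) return "";
        auto e = s.find_last_not_of(ws);
        return s.substr(b, e - b + 1);
    }

    // Keeps only "key = value" lines that belong to the requested [section].
    void parseFile(const std::string& section, const std::string& text) {
        std::istringstream in(text);
        std::string line;
        bool inSection = false;
        while (std::getline(in, line)) {
            line = trim(line);
            if (line.empty()) continue;
            if (line.front() == '[' && line.back() == ']') {
                inSection = (line.substr(1, line.size() - 2) == section);
                continue;
            }
            auto eq = line.find('=');
            if (inSection && eq != std::string::npos)
                values_[trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
        }
    }

    // Assumed argument form: --Key=value
    void parseArgs(int argc, const char* const* argv) {
        for (int i = 1; i < argc; ++i) {
            std::string arg = argv[i];
            auto eq = arg.find('=');
            if (arg.rfind("--", 0) == 0 && eq != std::string::npos)
                values_[arg.substr(2, eq - 2)] = arg.substr(eq + 1);
        }
    }

    std::map<std::string, std::string> values_;
};
```

With the file text "[Zoneserver]\nZoneName = tutorial\n" and no arguments, getValue("ZoneName","") would give "tutorial"; adding --ZoneName=naboo to the arguments would replace it.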

The boost-free code has a 30% smaller footprint, both as a binary and in memory.

Another task was to get rid of the CPU burn-in "feature" at idle operation. I tried
a couple of things, like adjusting sleeps with signals and pipes, which partially
went in the right direction.

In the process I had to apply a couple of changes to the network stack too. Following
last month's discussion with eru, threading things up here looked promising.
But as it turned out, central network functions were not thread safe - not even
meant to be. For example, writing a message runs like this:

Message *m = gMessageFactory->EndMessage();

Between the start of a message and EndMessage(), all sorts of things happened in the code:
db queries, calculations, method calls, whatever. A threading nightmare.

Also, the factory allocated a couple of k of heap storage in which all the messages were
embedded - cyclically, in a row - with a regular garbage collection. In most parts of the code
the singleton factory was used, in some a dynamically allocated one. Even more weird -
the only error control was done by some asserts.

So, what to do? This is a central part, used frequently and everywhere in the code, and the
message lifetime monitoring is vital - so not a chance to rewrite all this stuff.
Allocating a factory for each thread would be a way to go, but at the price of
having a heap (and its garbage collection needs) for each thread.

So in the main I changed two things. First, the singleton was made a multiton, making
sure that each thread automatically allocates its own factory. Second, the message itself now
knows how to delete itself properly; all messages are subscribed to a central "Heap"
that takes care of disposing of old and unneeded messages. It seems to work quite well by
now, but I guess some review under load is needed.
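
One way to get "one factory per thread, allocated automatically" is a thread_local instance behind an accessor. The sketch below uses that technique as a stand-in; the post does not say how the multiton is actually implemented, and the reduced MessageFactory here (StartMessage, addUint8, a plain byte vector as the message) is hypothetical apart from the EndMessage name.

```cpp
#include <cassert>
#include <cstdint>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical, heavily reduced message factory (assumed API, for illustration only).
class MessageFactory {
public:
    void StartMessage() { buffer_.clear(); building_ = true; }

    void addUint8(std::uint8_t v) {
        assert(building_);  // data may only be added between start and end
        buffer_.push_back(v);
    }

    // Hands the finished bytes to the caller, who now owns them.
    std::vector<std::uint8_t> EndMessage() {
        building_ = false;
        return std::move(buffer_);
    }

private:
    std::vector<std::uint8_t> buffer_;
    bool building_ = false;
};

// Multiton accessor: the first call in each thread creates that thread's own
// factory, and it is destroyed when the thread exits. No locking is needed
// because no factory is ever shared between threads.
inline MessageFactory& threadMessageFactory() {
    thread_local MessageFactory factory;
    return factory;
}
```

Code that used the global singleton would call threadMessageFactory() instead, so whatever happens between the start and the end of a message only ever touches the calling thread's factory. The central "Heap" that disposes of old messages is not part of this sketch.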

Another issue I had was with the database manager. Running queries async is a nice idea, but
it has its price. The old way the async engine interface worked was like

async(sql,container,callback) - put all needed data into the container, run the query and
callback(container,result) - then evaluate the result using the container references.

The way added later was

async(sql, [=](result) { ... }); - ... evaluate the result like in the cppconnector

This is a nice one: neat, slim, fast and much more elegant, even if a bit tricky for
nested queries. But ... the evaluation is done in the worker thread.

The odd side is that callbacks and evaluations are executed in the database worker thread.
No real issue if the db management had been designed and protected properly. But
this wasn't the case, so the question was how to continue. One way would have been to
rewrite all the old-style asyncs to the new style and to make the db manager thread safe,
which - like in the message factory - isn't that easy if it wasn't on the table at design time.
Also this leads to awful backtraces ...

Another option - with a lot of rewriting too - was to replace the whole db manager with a
simple sync engine that can be used thread safe without a thread of its own, and that has a
very simple interface.

So by now all this rewrite is done, and the queries now go like

db << "SELECT 1";
int x = db.u32get(0,0);

This is - more or less - a PHP-like interface to the classic mysql driver. I do not like it
very much, because it doesn't fit very well in a C++ environment; also it is not thread safe
per se. But that is a general mysql issue. The cppconnector tricked around that a
bit, and so did I.

The advantage on the other hand is that it works with no special care needed, without
management overhead, containers and all this stuff. Things happen where and when they are
expected to happen, which makes debugging more fun than burden. It also opens an easy way
to add and experiment with other storage engines, which will be interesting when it comes to
speed issues.
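
To make the shape of such a synchronous interface concrete, here is a rough sketch. Only operator<< and u32get(row, col) are taken from the snippet above; the class name SyncDb, the Backend callback and the string-based result set are assumptions for illustration, with the storage engine reduced to a pluggable function so that no real MySQL call is involved. Thread safety is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

using Row = std::vector<std::string>;
using ResultSet = std::vector<Row>;

// Assumed shape of a storage engine: anything that turns SQL text into rows.
using Backend = std::function<ResultSet(const std::string&)>;

class SyncDb {
public:
    explicit SyncDb(Backend backend) : backend_(std::move(backend)) {}

    // Runs the query right here, in the calling thread, and keeps the result.
    SyncDb& operator<<(const std::string& sql) {
        result_ = backend_(sql);
        return *this;
    }

    // Cell (row, col) of the last result, converted to an unsigned 32-bit value.
    std::uint32_t u32get(std::size_t row, std::size_t col) const {
        assert(row < result_.size() && col < result_[row].size());
        return static_cast<std::uint32_t>(std::stoul(result_[row][col]));
    }

private:
    Backend backend_;
    ResultSet result_;
};
```

Because the backend is just a function, a test could plug in a lambda that returns canned rows, and a different storage engine would only need to provide another Backend.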

As an impression of how things currently work, here is a top snapshot. The machine is
a 1 GHz, 2-core box with 2 GB RAM, running idle with a single zone:

Code: Select all
28245 root      20   0  123m 3040 2036 S  0.7  0.1   0:01.70 ANH:Chat
28219 root      20   0  132m 2704 1800 S  0.7  0.1   0:02.24 ANH:ConX
28265 root      20   0 73800 2540 1760 S  0.3  0.1   0:00.80 ANH:L:UI
28237 root      20   0 14748  956  768 S  0.0  0.0   0:00.00 ANH:P:Main
28279 root      20   0  111m  26m 3716 S  1.0  1.3   0:10.17 ANH:Z:tutorial

and per thread, with top -H:

28249 root      20   0  123m 3012 2036 S  0.0  0.1   0:00.05 ANH:C:AO
28247 root      20   0  123m 3012 2036 S  0.0  0.1   0:00.00 ANH:C:CLK
28257 root      20   0  123m 3012 2036 S  0.3  0.1   0:00.01 ANH:C:SE1
28252 root      20   0  123m 3012 2036 S  0.0  0.1   0:00.00 ANH:C:SERVICE
28251 root      20   0  123m 3012 2036 S  0.0  0.1   0:00.00 ANH:C:SRT
28250 root      20   0  123m 3012 2036 S  0.0  0.1   0:00.04 ANH:C:SWT
28253 root      20   0  123m 3012 2036 S  0.0  0.1   0:00.00 ANH:C:TIMER
28254 root      20   0  123m 3012 2036 S  0.0  0.1   0:00.00 ANH:C:TIMER
28255 root      20   0  123m 3012 2036 S  0.0  0.1   0:00.00 ANH:C:TIMER
28256 root      20   0  123m 3012 2036 S  0.0  0.1   0:00.00 ANH:C:TIMER
28245 root      20   0  123m 3012 2036 S  0.0  0.1   0:00.01 ANH:Chat
28259 root      20   0  123m 3012 2036 S  0.0  0.1   0:00.00 ANH:Chat
28219 root      20   0  132m 2696 1796 S  0.0  0.1   0:00.00 ANH:ConX
28231 root      20   0  132m 2696 1796 S  0.0  0.1   0:00.04 ANH:ConX
28269 root      20   0 73800 2520 1756 S  0.3  0.1   0:00.04 ANH:L:AO
28267 root      20   0 73800 2520 1756 S  0.0  0.1   0:00.00 ANH:L:CLK
28273 root      20   0 73800 2520 1756 S  0.0  0.1   0:00.00 ANH:L:MAIN
28272 root      20   0 73800 2520 1756 S  0.0  0.1   0:00.00 ANH:L:SERVICE
28271 root      20   0 73800 2520 1756 S  0.0  0.1   0:00.00 ANH:L:SRT
28270 root      20   0 73800 2520 1756 S  0.0  0.1   0:00.00 ANH:L:SWT
28265 root      20   0 73800 2520 1756 S  0.0  0.1   0:00.00 ANH:L:UI
28237 root      20   0 14748  956  768 S  0.0  0.0   0:00.00 ANH:P:Main
28239 root      20   0 14748  956  768 S  0.0  0.0   0:00.00 ANH:P:TASK
28223 root      20   0  132m 2696 1796 S  0.3  0.1   0:00.06 ANH:X:AO
28227 root      20   0  132m 2696 1796 S  0.0  0.1   0:00.05 ANH:X:AO
28221 root      20   0  132m 2696 1796 S  0.0  0.1   0:00.00 ANH:X:CLK
28258 root      20   0  132m 2696 1796 S  0.0  0.1   0:00.01 ANH:X:SE1
28295 root      20   0  132m 2696 1796 S  0.0  0.1   0:00.00 ANH:X:SE2
28226 root      20   0  132m 2696 1796 S  0.0  0.1   0:00.00 ANH:X:SERVICE
28230 root      20   0  132m 2696 1796 S  0.0  0.1   0:00.00 ANH:X:SERVICE
28225 root      20   0  132m 2696 1796 S  0.0  0.1   0:00.00 ANH:X:SRT
28229 root      20   0  132m 2696 1796 S  0.0  0.1   0:00.00 ANH:X:SRT
28224 root      20   0  132m 2696 1796 S  0.0  0.1   0:00.00 ANH:X:SWT
28228 root      20   0  132m 2696 1796 S  0.0  0.1   0:00.05 ANH:X:SWT
28284 root      20   0  111m  26m 3700 S  0.3  1.3   0:00.02 ANH:Z:AO
28281 root      20   0  111m  26m 3700 S  0.0  1.3   0:00.00 ANH:Z:CLK
28294 root      20   0  111m  26m 3700 S  0.0  1.3   0:00.00 ANH:Z:SE1
28287 root      20   0  111m  26m 3700 S  0.0  1.3   0:00.00 ANH:Z:SERVICE
28286 root      20   0  111m  26m 3700 S  0.0  1.3   0:00.00 ANH:Z:SRT
28285 root      20   0  111m  26m 3700 S  0.3  1.3   0:00.02 ANH:Z:SWT
28279 root      20   0  111m  26m 3700 S  0.0  1.3   0:08.79 ANH:Z:tutorial
28296 root      20   0  111m  26m 3700 S  0.0  1.3   0:00.00 ANH:Z:tutorial

The next task on the schedule is to work off my own errata - I guess in some parts
of the rewrite I was a bit too quick and too dirty - via a prominent TODO and FIXME
list, and first of all to get rid of all the dynamic casts in the code, because this
addresses a known issue with C++ on Linux: they do not work across module borders.
I have also started to get rid of the ILC mechanism (the handleObjectReady callback),
since it does not make much sense any more. And ... yes ... even the emperor
does not use void as the one and only method return ;)

So much for the report on what was/is going on in Unofficial Edit land,

Greetz, lei

Re: December\January Development Objectives

Posted: December 7th, 2015, 2:16 pm
by Eruptor
Great work Lei, I would say much of your analysis is correct.

Let's take a little ride back in time, and let me try to explain why I think you are going in the right direction.

Looking back at 2009 when the project took another direction than mine, one of the very first things I did when leaving the team was to rewrite the message factory, making it support multithreading and using pre allocated buffers in different sizes and having built in diagnostics detecting memory leaks. And fixing the bugs involved with the old message handling, of course.

The new memory manager was similar to the ones used in small embedded systems (like OSE for early Ericsson cell phones) at that time.

One of the other things to fix was the network code, including removal of the garbage collection of "lost or forgotten" network messages, which was scheduled to run every 30 seconds. I can't even grasp the stupidity of that implementation and all the excuses when strange things happened. "It's client lag"...
Well it was not, it was the server discarding valid network packets.

The original implementation of messages was one of the biggest causes of all the strange things that happened randomly, especially in combination with network code relying on bad packet handling and using wipes of "forgotten network messages" as a means of keeping the server sane. I just had to repeat that...

Fixing these two issues gave me a much healthier server, able to hold 500 player bots in Bestine.
And spread out over a complete zone, in clusters of 25 player bots at each location, I managed to get the server up to Medium load; that's 1,500+ player bots loaded.

It took me about 3 weeks after I left ANH to fix the above, and then about a week more to get the bots up and running, many thanks to Mugly for the bot support.

My conclusion back then was that the server was not in need of a complete rewrite from scratch, it just needed someone to look into the issues and fix them! Sorry to say, I was not allowed to touch the network code since we had our "specialist" fooling around in that code :)

Well, back to present time, and sorry if I got carried away but I still think it was an enormous waste of good effort from the early devs to scrap that server.

My point is that *I* think you are on the right track again; almost everything Lei suggests is something I have either done similar things with or identified as an issue. But remember that I have had 6 years to reflect on the server implementation, even though I have not been working on it for most of that time. But I do, from time to time. This week I will get rid of the old third-party SpatialIndex, introducing a lightweight SI more suitable for a SWG server.

Re: December\January Development Objectives

Posted: December 9th, 2015, 11:15 am
by lei2
Good to hear that my analysis is correct, I hope my implementation is that too ;)

From my point of view there are no holy cows in code, but managing a complex codebase with a couple of coders isn't that easy. Sometimes the coordination takes more room than the coding itself.

You def are a bit ahead with the code and protocol, I'm still working on quite basic things. But I guess a bit of attention to WM and SI will improve things.

One issue I'm also on atm is the WorldManager's getObjectById, returning an Object* from the object map. This is then cast back and forth to other object types, casts that tend to fail in shared libs. My current approach here is in WorldManager.h
Code: Select all
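template <typename T>
T getById(uint64 id) {
    // Look the object up in the world's object map.
    ObjectMap::iterator i = mObjectMap.find(id);
    if (i == mObjectMap.end()) return NULL;

    // static_cast instead of dynamic_cast: unchecked, so T must match the real type.
    return static_cast<T>((*i).second);
}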

which gives the WM a nice Object getter:
Code: Select all
PlayerObject *player = gWorldManager->getById<PlayerObject *>(id);

That's pretty much the way I like things to be expressed in C++, even if it is not a clean solution to the shared lib casting issue. Since it needs to be done in the header, it will fan out to all the local types.

But that's only one thing, atm I'm working through the factories to get rid of the ILC, since it causes more issues than doing things directly.

Is there any serious source for a bot? I already thought of making one to help with debugging, but so far other things seemed to be more important.

Greetz, lei

Re: December\January Development Objectives

Posted: December 15th, 2015, 10:36 am
by lei2
Another few days in ANH zone paradise ... getting things sorted out piece by piece. The new db engine runs as expected. On load it is significantly slower than the async db, but load speed wasn't the objective. Once the zone is up it doesn't matter; the queries have to be done either way and mysql itself is the limiting factor.

So meanwhile most of the ILC/async container stuff is gone. It straightens things up a lot, and for my simple mind it helps a lot in understanding how and where things are done. I'm not glad to say so, but I have to admit that eru def is right about the historical issues.

One example is the state handling. What I found is the mother of all state handlers: a base class, inheritance, lists, maps, shared pointers, a manager class, event callback driven and last but not least an object-embedded class with its own methods. This is not just a simple state handler, it is more like a state infrastructure.

The only problem - as far as I understand the states topic - is that it handles state transitions, but not state permissions. Simply put, in a building a char may not enter the mounted state, for example, and in a water cell most states except swimming or mounted would not be appropriate. Or maybe I missed that part in the code.

For me the simple approach to this topic would be a binary state mask and maybe a map that holds a permit mask for each state:

state to enter: kneel
+----- stand
|+---- sit
||+--- kneel
|||+-- crouch
1011   permit mask

Permit mask 1011 means kneel can be entered from stand, kneel and crouch, but not from sit.

So I have my current state 1000, the new state 0010 and the mask 1011, which or'ed together result in 1000|0010|1011 == 1011, which is the mask, so the transition is allowed. For a sit to kneel transition, 0100|0010|1011 gives 1111, which is not the mask, so the test against 1011 fails.

Using an stl map<uint64 newState,uint64 permitMask> would enable 64 states to be checked: 21 the client knows, maybe some combat related as well. Making it smaller would not save much, since a 64-bit machine works on 64-bit words natively anyway. So this would be get and set methods in the object, a 64-bit currentState and maybe a 64-bit allowedState for cell and building objects and char skills as well, which would be and'ed with the newState. So a map, 2 vars, 3 methods and a lightweight manager that handles the transition and the resulting methods and packets to be issued ... states done.
Might be I missed something, this sounds too simple, doesn't it?
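
The mask check can be written down in a few lines. The following is a minimal sketch using the 4-bit values from the example; the names State, PermitMap, canEnter and allowedHere are made up for illustration, and refusing states that have no map entry is an assumption, not something the post specifies.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Bit values follow the 4-bit example; a real table would use more of the 64 bits.
enum State : std::uint64_t {
    Stand  = 0b1000,
    Sit    = 0b0100,
    Kneel  = 0b0010,
    Crouch = 0b0001,
};

// newState -> mask of states it may be entered from (its own bit included).
using PermitMap = std::map<std::uint64_t, std::uint64_t>;

// OR current state, new state and mask together; the transition is allowed
// only if the result is still exactly the mask, i.e. no foreign bit showed up.
inline bool canEnter(const PermitMap& permits, std::uint64_t current, std::uint64_t next) {
    assert(next != 0 && (next & (next - 1)) == 0);  // exactly one state bit
    auto it = permits.find(next);
    if (it == permits.end()) return false;          // no entry: refuse (assumption)
    const std::uint64_t mask = it->second;
    return (current | next | mask) == mask;
}

// Extra restriction from a cell, building or skill: and'ed with the new state.
inline bool allowedHere(std::uint64_t allowedStates, std::uint64_t next) {
    return (allowedStates & next) == next;
}
```

With PermitMap permits{{Kneel, 0b1011}}, canEnter(permits, Stand, Kneel) holds while canEnter(permits, Sit, Kneel) does not, matching the two calculations above.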

But this is just one issue among others. Lots of things have been implemented in a sandbox state, and a snappy replacement for the event handler is also needed, since this was the part that didn't survive the deboost. All in all I decided to set the version number for my "unofficial" trunk back to 0.6.0, which is more appropriate for the current code condition. 0.9.0 was quite a bit optimistic. Too many things are only realized as a proof of concept and need a drawing board review; it is not really close to being a ready-to-play game server.

Also there are still lots of predetermined breaking points, so the zoneserver still drops asserts or segvs from time to time. I hoped it would be possible to hand out a minimal version for TC before xmas, but currently I'm not sure if that will work out. Even if the servers now behave quite ok on a linux box, I don't want to emergency patch too many things - there are already enough of those.

So far for the moment,

Greetz, lei

Re: December\January Development Objectives

Posted: December 15th, 2015, 5:45 pm
by Obi-Two
Greetings guys, sorry for my absence, I've had a lot on lately.
Fantastic work Lei2, I'm looking forward to you uploading your trunk to the unofficial git so we can take her for a test drive :D

As for the states, this rings a bell: we had a problem with states and being able to (for example) go prone while riding a mount, which was always fun to see :D

keep up the good work, I hope to be around a little more over the holiday period, but we shall see haha.


Re: December\January Development Objectives

Posted: December 16th, 2015, 5:58 pm
by Eruptor

My strategy for a while has been to set up, once and for all, a structure of object ownership.
I use C++ Smart Pointers, and the one that "owns" an object holds a Shared Pointer to the object.
In the case of objects owned by the "world", WorldManager has some containers holding the Shared Pointers to such objects.

The goal is to get a clean chain of ownership, so that, for example, when a player leaves the game everything owned by him is automatically de-allocated. Player owns an Inventory, Inventory owns several objects, some of those objects may contain other objects and so on.

To avoid cross-referencing objects, all other "copies" of pointers to said object are of type Weak Pointer.
So containers holding duplicates like "knownObject" or the container used for getObjectById all hold Weak Pointers.

You never delete a Smart Pointer, you just let the pointer go out of scope, and when the last reference is gone, the destructor of the actual object will be called.

Simply put, you will never get any null pointer exceptions due to objects that have been deleted. No more dangling pointers. A Shared Pointer is "always" valid. Weak Pointers can't be used before being promoted to Shared Pointers, and if that is validated and handled during the promotion, you will always know if you have valid objects. And it works with multithreading. Maybe a slightly simplified description, but I hope you get my idea.
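
As a rough illustration of the ownership chain and the promote-before-use rule, a sketch with std::shared_ptr and std::weak_ptr could look like this. The Object and World types and their members are hypothetical stand-ins, not the server's classes, and the World container itself is not made thread safe here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <vector>

struct Object {
    explicit Object(std::uint64_t id) : id(id) {}
    virtual ~Object() = default;
    std::uint64_t id;
    std::vector<std::shared_ptr<Object>> children;  // owned objects, e.g. inventory contents
};

class World {
public:
    // The world owns top-level objects; the id index only observes them.
    void add(const std::shared_ptr<Object>& obj) {
        assert(obj);
        owned_.push_back(obj);
        index(obj);
    }

    // Make an object findable by id without taking ownership of it.
    void index(const std::shared_ptr<Object>& obj) { byId_[obj->id] = obj; }

    // Promote the weak pointer; returns nullptr if the object is already gone.
    std::shared_ptr<Object> getObjectById(std::uint64_t id) {
        auto it = byId_.find(id);
        if (it == byId_.end()) return nullptr;
        std::shared_ptr<Object> obj = it->second.lock();
        if (!obj) byId_.erase(it);  // drop the stale entry
        return obj;
    }

    // Dropping the owning pointer releases the object and everything it owns.
    void remove(std::uint64_t id) {
        for (auto it = owned_.begin(); it != owned_.end(); ++it) {
            if ((*it)->id == id) { owned_.erase(it); return; }
        }
    }

private:
    std::vector<std::shared_ptr<Object>> owned_;
    std::map<std::uint64_t, std::weak_ptr<Object>> byId_;
};
```

If a player object owns an inventory that owns an item, removing the player from the world destroys all three, and a later getObjectById for the item returns an empty pointer instead of a dangling one.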

It's a pita to do the conversion of pointers, and to get the server running again afterwards.

Regarding the states, there is a sort of command table that handles what commands you can do in different states, correction: postures. If you want to be able to attack players when mounted on a swoop, that's the place where you should go and fix that (assuming the correct states/postures are set up when you enter the swoop). There used to be validators and stuff using the data from the command table.
It's a long time since I was digging there, but if my memory serves me correctly, somewhere in that code is also the place where you filter keys and prevent spam (command_table.defaultTime in my DB).

Re: December\January Development Objectives

Posted: December 19th, 2015, 4:27 am
by lei2
Hey Eru,

the idea sounds somehow familiar, it's a bit like a managed object, isn't it?

From the point of view of having some reference to a gone object sitting in a queue and waiting to core the server ... yes, it would be great to keep the object alive until all references are gone. To encapsulate all the pointer stuff in Zone ... yup, that's indeed a pita. Might be a good thing to write an encapsulating class for that, so future changes could then take place at a central point.

The states/postures are indeed done via the command table, wiring up the event handler for the object controller. Nice idea to have mounted combat ;) This ObjectControl is quite twisted stuff ... a real pain to debug. I'm currently getting the Npc conversations back online.

Another thing I'm a bit concerned about is the complete absence of any kind of flow control in the zone, with all these lists, vectors and maps ... there is already some strange behaviour with npc spawns: sometimes this works as expected, sometimes some of them are missing.

What I have not found so far - armor stats. They seem to be missing in the db.

Re: December\January Development Objectives

Posted: December 19th, 2015, 7:14 am
by Eruptor
Hi Lei,

It's like a managed object that you have full control of. One big difference though: a managed object will not go away until the last reference is gone, which would be the same as using Shared Pointers only.

With the usage of Weak Pointers in containers and as references and such, the object goes away when the last Shared Pointer reference is gone, no matter how many Weak Pointers you have to the object.

Agree about the flow, I too still have that issue. Even with an infrastructure built for multithreading, most of the "functionality" is still a mess with public data, almost every piece of code is fiddling with data structures in other parts of the code etc.

I have started refactoring the object controller modules to be thread safe, but there is also some/much common and essential data, like an object's position, direction and similar, that you must be able to access instantly; it's not practical or sane to use any fancy messaging system for that kind of data.

I'm prepared to let the system pay the price of more copying of data instead of just pointer references.
For example, if Object A needs the position or direction of Object B, then Object A will call a method and get copies of the required data back. During the actual copy, access to the data object (like a Vector3 for a position) will be protected with a "lock". In this case I'm going with a read-write lock pattern, meaning the lock itself will only hinder access if someone tries to write to the data.
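
A copy-out accessor guarded by a read-write lock might look like the sketch below. It assumes C++17's std::shared_mutex as the lock; the class name Positioned and this Vector3 are placeholders for illustration, not the server's types.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>

struct Vector3 {
    float x = 0, y = 0, z = 0;
};

class Positioned {
public:
    // Readers take the shared lock and get a copy back, never a pointer or
    // reference into the object. Many readers can hold the lock at once.
    Vector3 position() const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        return position_;
    }

    // A writer takes the exclusive lock, so readers only wait while a write
    // is actually in progress.
    void setPosition(const Vector3& p) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        position_ = p;
    }

private:
    mutable std::shared_mutex mutex_;
    Vector3 position_;
};
```

Object A would then call b.position() and work on its own copy, which stays unchanged even if Object B moves in the meantime.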

The goal is to remove ALL uncontrolled references between objects. Without that fixed, I can never guarantee the functionality when running multiple threads at the same time.

To clarify, today the server runs multithreaded, but isolated within "blocks": Networking, database (down to the actual MySQL), Scheduler and Session handling (it's the sessions that feed the Object controllers).
Several pieces of WorldManager stuff also run independently.

So even if the server can run players with multiple threads, the operations are not safe since there is a lot of fiddling with each other's data at the same time. Sooner or later the server will protest.
When stability is needed I have to schedule one thread only for the handling of players, or more exactly, handle just one Session at a time.

I have never gotten to the point where I needed armor stats. I have some vague idea of implementing the combat rule set and supporting a very limited number of weapons and armor, just enough to prove the concept.
Implementing all weapons/armor/buffs/stats etc. is beyond my current scope.

Re: December\January Development Objectives

Posted: December 22nd, 2015, 4:11 pm
by lei2
Hmm, this shared-containing-weaks-container is indeed interesting, but as far as I have played with those stls, it is not that easy to convince the compiler to handle them properly - the same issue as with the dynamic casts. Copying sounds reasonable; most of the time only basic types or small structs are needed, and that would help to improve reliability.

Locking ... well, with threads nothing is as it was. Network, at least in my tiny tests, seems to work ok, Login and CX as well, Chat is mostly untested and Zone ... it currently runs, for a while, but more by luck than anything else. After having worked through most of the asserts, the majority of its segvs are identified; a few, though, are not, which points to locking issues. One of the issues is that most of the code works in a garbage in, segv out state. Hard lesson to learn working with references and casts and tests should be done without crashbug reports ;) Garbage in, assert out btw is better, but not much.

For the locking I do not separate read from write access, since this complicates the locking procedure a lot. To get some kind of understanding of when and where things collide, I use a timed lock with a logfile notice on timeout to trace issues; that works quite well for the moment.
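
A timed lock with a notice on timeout can be sketched with std::timed_mutex. This is only one possible reading of the approach: the class name TracedLock is made up, std::clog stands in for the logfile, and what the caller does after a timeout is left open.

```cpp
#include <cassert>
#include <chrono>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>

// Tries to take the mutex for a limited time. On timeout it writes a notice
// naming the call site and gives up, so the caller can decide how to go on.
class TracedLock {
public:
    TracedLock(std::timed_mutex& m, const std::string& where,
               std::chrono::milliseconds timeout)
        : lock_(m, std::defer_lock) {
        if (!lock_.try_lock_for(timeout)) {
            std::clog << "lock timeout at " << where << '\n';
        }
    }

    // True if the lock was actually obtained; it is released on destruction.
    bool owns() const { return lock_.owns_lock(); }

private:
    std::unique_lock<std::timed_mutex> lock_;
};
```

A call site would create a TracedLock with a label such as the function name, check owns(), and skip or retry the protected section if the lock was not obtained; the notices then show where threads collide.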

With the armor I was just curious, because I'm at the equip manager atm and thought that by chance there might be a table around somewhere that has the armor values. Tutorial is next, I need to find out why its npcs sometimes spawn and sometimes not ... maybe that's the first locking issue then.

So still a lot of work to do. The code currently is not in a publishable state: too many fundamental issues to solve, too many things with concept issues. If anyone would like to take a glance, I have a binary snapshot prepared. But I guess it does not make much sense to publish it, as it would need some decent linux experience to get it into operation, and it's ubuntu 12.04 32 bit only. But if there is interest I could upload it.

Only left to say merry xmas to all,

Greets, lei