Link Servers

    Joined
    Jan 31, 2015
    Messages
    1,696
    Reaction score
    1,199
    • Thinking Positive
    • Likeable
    Despite the recent performance improvements, even the most powerful servers seem to start choking once their number of online players approaches 30. It would seem that a good way to create a more expansive online multiplayer experience might be to allow community servers to network on an ad hoc basis.

    What's the potential for developing the ability to enable two (or more) community servers with identical settings to link their game worlds?

    For example if the network consisted of two servers, each could host an adjacent galaxy and all other galaxies on one side of a planar divide (i.e. ServerA hosts galaxy 0, 0, 0 and everything Z-positive of 0, 0, -1000 while ServerB hosts galaxy 0, 0, -2000 and everything Z-negative of 0, 0, -1000). Protocols would have to be developed for sharing assets on asymmetrical networks, but that's not impossible. I think the biggest problem area would be the border region that handles things moving from server to server. It would be nice if servers could distribute the hosting of elements within a single galaxy, but I fear it may be too problematic dealing with so many assets situated on server borders.
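
    A minimal sketch of the planar divide in that example, assuming galaxies are addressed by their Z coordinate. The function name and the rule for a galaxy sitting exactly on the divide are hypothetical, since the post does not say who owns the boundary itself.

```python
def host_for_galaxy(galaxy_z, divide_z=-1000):
    """Pick the hosting server for a galaxy from its Z coordinate.

    Assumption: anything strictly Z-positive of the divide goes to ServerA,
    everything else (including the divide itself) goes to ServerB.
    """
    return "ServerA" if galaxy_z > divide_z else "ServerB"


if __name__ == "__main__":
    print(host_for_galaxy(0))      # galaxy 0, 0, 0
    print(host_for_galaxy(-2000))  # galaxy 0, 0, -2000
```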

    Anyway - intergalactic/inter-server wars would be mad fun and though such a load distribution plan would do little to alleviate the strain of 20 players in a tactical engagement over a single system or so, it would greatly expand the strategic playability of the game.

    Note: I understand that this is not currently possible with the existing code - this is an inquiry regarding long-term development potential.
     

    CyberTao

    鬼佬
    Joined
    Nov 10, 2013
    Messages
    2,564
    Reaction score
    641
    • Legacy Citizen 4
    • Railman Gold
    • Thinking Positive
    One thing to note is that it wouldn't relieve server stress; it just spreads it out. Running 2 servers and linking them might be more expensive than running 1 with the combined specs.
    Another is that you'd need a way to send files back and forth from the different servers, which doesn't exactly exist and isn't within Schine's duties to create.
     
    Joined
    Jul 29, 2013
    Messages
    1,173
    Reaction score
    494
    • Competition Winner - Small Fleets
    • Top Forum Contributor
    • Legacy Citizen 5
    I seriously love this idea.

    Two individual servers would spread out the lag much better. Even the most powerful servers can only handle so much stress before slowing WAY down. So spreading those players over multiple servers would certainly help.

    This would probably be an absolute nightmare to program, however.
     
    Joined
    Jan 31, 2015
    Messages
    1,696
    Reaction score
    1,199
    • Thinking Positive
    • Likeable
    I seriously love this idea.

    Two individual servers would spread out the lag much better. Even the most powerful servers can only handle so much stress before slowing WAY down. So spreading those players over multiple servers would certainly help.

    This would probably be an absolute nightmare to program, however.
    It probably would, but it's worth considering. Anything can be accomplished with sufficient resources.
    One thing to note is that it wouldn't relieve server stress; it just spreads it out. Running 2 servers and linking them might be more expensive than running 1 with the combined specs.
    Another is that you'd need a way to send files back and forth from the different servers, which doesn't exactly exist and isn't within Schine's duties to create.
    Duties...
    It's not in their duties to make video games either. But since they have, and (once again, CT) I assume that this forum someone (not I) made called "Suggestions" is meant to solicit ideas and thoughts from the player base, it seemed appropriate to share my idea here. I never dreamt that making a friendly, constructive suggestion would imply that it was anyone's "duty" to do or not to do anything. I apologize if my OP came off as demanding or aggressive.

    I don't know that my suggestion would relieve any stress at all, really. It would probably slightly increase server stress to link them.

    I never mentioned anyone buying two computers to host with - I specifically suggested enabling ad hoc networking between existing servers. Meaning that if two or more people/entities currently hosting vanilla servers wanted, they could link their servers so that players could travel from one to the other. Anyway, in terms of deep-pocket super-servers, at a certain level of performance, it does actually become cheaper to double performance by doubling the number of server boxes than to double it by buying bleeding-edge hardware to install in a single box. This is why larger multi-player operations set up for gaming use distributed server networks rather than one $5M supercomputer.

    Java is quite network friendly, overall, and sending files back and forth is what the servers currently do all day long between themselves and our client applications. Sending files back and forth is a server's whole job. Teaching them to send files back and forth to other servers is a far cry from turning lead into gold. Part of why I suggested a load distribution based on in-game geography was to minimize the need for communication anyway - as outlined in the OP the only communication required between linked servers would be to transfer a player from one to another along with the database info on his current inventory and his ship. Beyond that there wouldn't be any need for servers to synchronize assets frequently (although occasional synchronization - such as at server save times - might help prevent exploits, I'm not entirely sure)
     

    CyberTao

    鬼佬
    Joined
    Nov 10, 2013
    Messages
    2,564
    Reaction score
    641
    • Legacy Citizen 4
    • Railman Gold
    • Thinking Positive
    Sending files back and forth is a server's whole job.
    I find it hard to believe 2 server hosting companies would be willing to send information back and forth to each other. It is a business, and I'm not sure how they would think about having to share data and such with a competitor. If you make it client side, there will be exploits and such, because it will be in the client's hands for a short while. Could be wrong, could be right, could be virus hell because some people host on their own computers. I remember security being a big concern back when it was last tossed around. Might even go against Schema's goal of a 'seamless' universe, who knows.

    I'm not against the idea, if you went back to oldsite you'd see I supported it. It just isn't something that feels like Schine should do, but you could probably do it with some wrappers, bots, and a shared storage drive (probably) if it was desperately needed.
     
    Joined
    Mar 10, 2015
    Messages
    122
    Reaction score
    50
    • Community Content - Bronze 1
    • Purchased!
    • Legacy Citizen 5
    Networking is weird. I just want to put that up front.

    The basic idea is pretty simple, and also pretty sound. Having two computers is more or less the same as having one computer with two cores, and it's pretty common knowledge that more cores = less lag. Easy enough. The problem is making sure every connected computer is on the same page. Ideally, we want a decentralized model - where no one server is more important than any other - because that way, one server crash can't bring down the whole network.

    I have a lot of time and nothing better to do right now, so I'm just going to write a specification for how you might accomplish this. All of this is going to hinge on one really convenient fact: no user can be in more than one place at any one time. This is a distinct advantage over, say, an IRC network, where one user can receive messages from thousands of channels at once.

    I'm gonna call this the first draft, because I probably forgot something.

    Spread the load
    Assume all servers in the network have all of the world information all the time. We're going to actually solve this problem later, but for now just pretend like we already have.

    Each server in the network is responsible for one or more isolated areas of space. An "isolated area of space" is a contiguous group of loaded sectors. So, imagine, for example, that you have two players in 0,0,5, and three players in 5,0,0. They can't interact with each other, because they're too far away - they don't have any loaded sectors in common - so they can be handled by different servers, and no one would ever know the difference.
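
    As a rough illustration, isolated areas can be found with a flood fill over the set of loaded sectors. This is only a sketch: the function name is made up, and it assumes two sectors count as contiguous when they differ by at most 1 on every axis, which the draft does not pin down.

```python
from itertools import product

# Offsets to the 26 neighbouring sectors (assumed adjacency rule).
NEIGHBOUR_OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def isolated_areas(loaded_sectors):
    """Group loaded sectors (x, y, z) into contiguous areas via flood fill."""
    remaining = set(loaded_sectors)
    areas = []
    while remaining:
        start = remaining.pop()
        area = {start}
        frontier = [start]
        while frontier:
            x, y, z = frontier.pop()
            for dx, dy, dz in NEIGHBOUR_OFFSETS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    area.add(neighbour)
                    frontier.append(neighbour)
        areas.append(area)
    return areas
```

    Each returned set is one package of sectors that a single server could own without any other server needing to know about it.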

    In order to determine what servers handle what players, two heuristics are used:
    1. "load": How complicated an isolated area is to simulate. Example:
      Code:
      Number of players * combined mass of all entities * number of projectiles fired in the last 3 minutes
    2. "power": How powerful the server is. Example:
      Code:
      (Total RAM * CPU speed * CPU cores) / current average ping

    Not gonna lie, heuristics are basically computer magic. The definition is pretty vague and hand-wavy, but picking good ones is really important, which is just about the most annoying thing. I group them with matrices in the category of "math that someone literally just made up". I will fully admit that I suck at heuristics, someone who's better with them might be able to come up with better ones; there's also no reason why you couldn't outsource this expression to a config file.

    In any case, the load:power ratio of a server indicates how much work it's doing.
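
    Here is a small sketch of the two example heuristics and the resulting ratio. The function names and units (GB, GHz, milliseconds) are assumptions made for illustration; the formulas are just the examples above written out.

```python
def area_load(players, combined_mass, recent_projectiles):
    """Example "load": players * combined mass * projectiles in the last 3 minutes."""
    return players * combined_mass * recent_projectiles


def server_power(ram_gb, cpu_ghz, cpu_cores, avg_ping_ms):
    """Example "power": (RAM * CPU speed * CPU cores) / current average ping."""
    return (ram_gb * cpu_ghz * cpu_cores) / avg_ping_ms


def load_power_ratio(area_loads, power):
    """How much work a server is doing: total load of its areas over its power."""
    return sum(area_loads) / power
```

    One consequence of the product form is that an area where nothing has been fired recently scores a load of 0 no matter how many players are in it, so a real expression would probably want additive terms or minimum values.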

    Servers under high load may attempt to pass off some of their isolated areas to servers with a low load - this is called a "netsplit". To do this, they first select an area they wish to pass off (ideally something small). The high load server then broadcasts a message to all servers saying that it wishes to pass off that area, and it includes the load calculation for the area. Other servers in the network reply by either accepting or denying the request. In addition to accepting a request to move a sector group, a server may also request priority. It may do this because it believes it's going to need those clients soon anyway; for example, if the area being moved is close to some sectors the acceptor already has loaded. If at least one server accepts, the high load server selects the acceptor with the lowest load:power ratio and synchronizes all information about the players and entities in those sectors with the accepting server. Clients are informed of the split and moved to the accepting server. Once the high load server has finished the synchronization, it simply unloads the area. The accepting server ensures everything is loaded before beginning to simulate the area and send information to players.
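
    A sketch of how the high load server might pick among the replies. The Reply type and its field names are hypothetical, and using the priority request only as a tie-breaker is an assumption; the draft says the lowest load:power ratio wins but not how priority is weighed against it.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    server: str                   # hypothetical server identifier
    accepts: bool                 # did this server accept the area?
    ratio: float                  # its current load:power ratio
    wants_priority: bool = False  # optional priority request


def choose_acceptor(replies):
    """Return the accepting server with the lowest ratio, or None if all denied.

    Assumption: a priority request only breaks ties between equal ratios.
    """
    acceptors = [r for r in replies if r.accepts]
    if not acceptors:
        return None
    best = min(acceptors, key=lambda r: (r.ratio, not r.wants_priority))
    return best.server
```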

    Periodically, all servers broadcast their load:power ratios to all other servers in the network. If convenient, two servers may elect to move sectors in order to more evenly distribute the load - for example, if one server is new to the network and has no load, the highest load server may try to give it some work.

    The entire netsplit process is going to suck for players in the area, but ideally splits should only affect a small portion of players for a brief period of time. There's not really any way around it, though; sometimes you just have to move people. Ideally, being moved should result in your ping going down anyway, since you're moving from a high load server to a low load server.

    Synchronization
    When you see "server autosaving", along with writing information to disk, all servers attempt to synchronize information with each other.

    New servers joining the network are always brought up to speed (see Registering new servers), so all we need to do here is inform everyone else about all the stuff that's changed (everything built, blown up, mined, etc.).

    The easiest way to do this is to broadcast a list of all recently changed chunks, their SHA1 hashes, and change timestamps. Servers on the network check this list and determine whether or not they need to download an update. If they do, they request the full chunk files from the originating server.
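
    A minimal sketch of the receiving side of that broadcast, using Python's hashlib for the SHA1 hashes. The data shapes and function names are made up, and "the newer timestamp wins" is an assumed conflict rule that the draft does not spell out.

```python
import hashlib


def sha1_of(chunk_bytes):
    """Hex SHA1 digest of a chunk's raw bytes."""
    return hashlib.sha1(chunk_bytes).hexdigest()


def chunks_to_request(local_index, advertised):
    """Decide which advertised chunks to download from the originating server.

    local_index: {chunk_id: (sha1_hex, timestamp)} for chunks this server has.
    advertised:  iterable of (chunk_id, sha1_hex, timestamp) from the broadcast.
    Assumption: a chunk is fetched if it is unknown locally, or if the hashes
    differ and the advertised copy is newer.
    """
    wanted = []
    for chunk_id, remote_hash, remote_time in advertised:
        local = local_index.get(chunk_id)
        if local is None:
            wanted.append(chunk_id)
            continue
        local_hash, local_time = local
        if local_hash != remote_hash and remote_time > local_time:
            wanted.append(chunk_id)
    return wanted
```

    Comparing hashes first means unchanged chunks cost only a few dozen bytes of list entry rather than a full file transfer.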

    This is going to involve a lot of information going a lot of places. To prevent ping spikes, the easiest thing to do is just throttle everything - this isn't a time sensitive process, every server has all the information it needs to serve all of its current clients. They can upload and download relevant information at their leisure. Realistically, if a server is under high load, it can afford to skip a synchronization and just wait for the next one to come around. There's also no reason why every server needs to synchronize at the same time as every other server, they can stagger the process. There's a lot of room for config options here.

    Moving clients
    Sometimes, a client will have to change servers. This is most likely to happen if a client enters a sector that is loaded by a different server. Imagine going through a warp gate from your base to your friend's base - if your base is on one server, and your friend's base is on another server, you'll have to connect to their server when you go through the gate.

    When a server determines that it needs to move a client, the client is given the address for the new server, and the old server closes the connection. The client displays some message and effectively pauses the game (ideally, if you're jumping, you just stay in the warp tunnel). The old server then synchronizes any relevant data with the new server (player inventory, chunk data for the player's ship, etc.). Once the new server has received and loaded all information, it synchronizes with the client. This completes the move.

    Players joining
    Whenever a player joins, they will contact one server (probably the one on the global server list, but not necessarily). When this happens, there are two possibilities (players that are entirely new to the world are effectively the same as players whose last known location is the spawn point):

    1. The player's last known position is in a sector that is currently loaded by some server on the network. In this case, if the player's last known location is currently loaded by the server contacted, it is responsible for the new player. Otherwise, the server contacted simply refers the client to the correct server and closes the connection. The client then automatically connects to the correct server, which is responsible for the new player.
    2. The player's last known position is not in any currently loaded sector. In this case, the server contacted is responsible for the new player if it believes it can handle the load. If it can't, it attempts to pass the connection off to the server with the lowest load. If it finds another server that can handle the request, it refers the client to that server and closes the connection. The client then automatically connects to the other server, which is then responsible for the new player.

    The server responsible for the new player informs all other servers on the network that the player has joined, loads in adjacent sectors (if required), and continues normally.
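
    The two cases above could be sketched as a single routing function. The names, the max_ratio threshold standing in for "believes it can handle the load", and the fallback when every server is busy are all assumptions made for this example.

```python
def route_join(contacted, last_sector, sector_owner, ratios, max_ratio=1.0):
    """Return the server that should become responsible for a joining player.

    contacted:    server the client reached first.
    last_sector:  player's last known sector (the spawn point for new players).
    sector_owner: {sector: server} for sectors currently loaded on the network.
    ratios:       {server: load:power ratio} for every server in the network.
    """
    # Case 1: the sector is already loaded somewhere, so its owner takes the player.
    owner = sector_owner.get(last_sector)
    if owner is not None:
        return owner
    # Case 2: nobody has it loaded; the contacted server keeps the player if it can.
    if ratios[contacted] < max_ratio:
        return contacted
    # Otherwise try the least loaded server on the network.
    best = min(ratios, key=ratios.get)
    # Assumption: if even that server is too busy, the contacted server copes.
    return best if ratios[best] < max_ratio else contacted
```

    If the result differs from the contacted server, the client is referred there and the original connection is closed.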

    Registering new servers
    Just to make things easy, a "network" is one or more servers simulating the same universe.

    When you have a network already (which may be only one computer), and you want to add a new server to it, the outside server asks one server in the network if it's allowed to join. As part of this handshake process, ideally a password should be involved, so that some random asshole can't connect to your network and pipe all of his traffic to /dev/null.

    If the network accepts the outside server, the contacted server sends to the outside server a full list of all servers in the network. The outside server then connects to every other server in the network. The outside server must then synchronize all of its world data the same way you torrent "legal" movies. Each server in the network sends the outside server all of its modified player and sector data. For the sake of simplicity, any data that isn't currently loaded by any server (sectors that just don't have any people in them right now, but still have stuff associated with them) should come from the initially contacted server.

    Once the outside server has received all world data, it broadcasts a message to all servers that it is ready to handle requests. This completes the process of adding the server to the network. Note that the server may spend a while doing nothing; this is because it's probably not worth it to cause a netsplit just to load up the new server if the network is currently under mild load. Instead, since its load is effectively 0, the new server will be the first to take sectors from high load servers, or accept new clients.

    To keep the global server list clean, each network should only be listed once. Without some kind of round robin style thing going on, the easiest way to accomplish this is to keep the list of all servers in the network ordered by connection time - that is, the first server on the network is first, then the second server to connect, then the third, and so on. The server at the front of the list is responsible for the global listing. If that server goes down, responsibility moves to the second server, and so on. As a side effect, this means that one server will handle the majority of all player join requests, but those aren't hard to handle (nor do they happen very often), so that's no big deal.
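
    The failover rule for the global listing is short enough to sketch directly. The function and argument names are hypothetical; the only rule taken from the draft is that the earliest-connected server that is still up owns the listing.

```python
def listing_server(servers_by_join_order, alive):
    """Return the server responsible for the global listing, or None.

    servers_by_join_order: servers ordered by when they joined the network.
    alive: set of servers currently reachable.
    """
    for server in servers_by_join_order:
        if server in alive:
            return server
    return None
```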

    Graceful shutdown
    If a server is to be terminated, it broadcasts a message that it will shut down soon and needs to move all of its clients. It includes in this message a list of each isolated area, and the load score for each. Servers reply by attempting to take a particular area, ideally the largest they can accept; sectors are then assigned in a way that minimizes the average load:power ratio.

    If any sectors are left over (no one accepted them), the closing server reserves the right to assign the sectors to another server in the network (with a message that basically says "you need to take this and it's an emergency"). The assigned server must accept, but it may try to move the sectors again later.
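
    A rough sketch of how a closing server might hand out its areas. This is a greedy approximation that keeps the worst load:power ratio low rather than an exact minimization of the average, and it skips the accept/deny round entirely by assuming every server takes what it is assigned (roughly the emergency case). All names are hypothetical.

```python
def assign_areas(area_loads, servers):
    """Greedily assign each area to the server that ends up least loaded.

    area_loads: {area_id: load score} for the closing server's areas.
    servers:    {server: (current_load, power)} for the remaining servers.
    Returns {area_id: server}.
    """
    current = {name: load for name, (load, _power) in servers.items()}
    power = {name: p for name, (_load, p) in servers.items()}
    plan = {}
    # Place the heaviest areas first so the small ones can fill the gaps.
    for area, load in sorted(area_loads.items(), key=lambda kv: kv[1], reverse=True):
        target = min(current, key=lambda s: (current[s] + load) / power[s])
        plan[area] = target
        current[target] += load
    return plan
```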

    Crashes
    If any server on the network crashes, all servers on the network must attempt to synchronize the data it was responsible for amongst themselves. It's unlikely that any of them have new information, but it's important for all of them to be on the same page, because the clients that crashed with the server are likely to reconnect. It is, unfortunately, up to each client to reconnect.

    Chat
    Turns out this is actually really easy. The new chat system uses an IRC backend, and IRC was basically designed for this.

    I would bet there's already something in the IRC library schema is using that would allow users to chat globally across the whole network, although it's not really any big deal if there isn't - just broadcast every chat message to all the other servers. Or, even better, if all servers know all user information all the time - what users are connected to which servers, what chat channels are open, what users are in each chat channel - servers that receive a chat message can determine themselves which other servers have to be notified of the message.
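
    For the second option, here is a minimal sketch of working out which servers need to hear about a chat message. It assumes every server already holds the channel membership and user location tables; the names are invented for the example and have nothing to do with whatever IRC library the game actually uses.

```python
def servers_to_notify(channel, origin, channel_members, user_server):
    """Return the set of other servers hosting at least one member of a channel.

    channel_members: {channel: iterable of user names}
    user_server:     {user name: server the user is connected to}
    origin:          server that received the message (never notified again).
    """
    targets = {
        user_server[user]
        for user in channel_members.get(channel, ())
        if user in user_server  # skip members who are currently offline
    }
    targets.discard(origin)
    return targets
```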

    Conclusion
    If this seems like a lot of work, it's not actually that bad. People tend to stay pretty dispersed in my experience; most of the time you probably won't ever have to move servers. When you do, it probably won't take long, and you'll probably be jumping - so the warp tunnel is a great loading screen. Unless you frequently fly into other people's space, you're unlikely to hop servers very often. Plus, the system is set up such that if you group a lot of people together, your server is likely to try to ditch its other loads and give you its undivided attention.

    This won't protect you from pingspikes from massive fleet battles with 50 titans, but it will protect everyone else from pingspikes when one stupid asshole is mining a planet. And that's worth something.

    In any case, this would be the mod to end all mods.
     
    Joined
    Jul 1, 2013
    Messages
    35
    Reaction score
    18
    It isn't really practical to link servers any time soon, nor is it really all that beneficial yet. The problem with servers is that they effectively use only one core, so only clock speed matters and multi-core is basically worthless to a server right now. The better thing would be to spread the load more efficiently across the one physical server first, before even talking about linking physical servers.

    Having two computers is more or less the same as having one computer with two cores, and it's pretty common knowledge that more cores = less lag. Easy enough.
    This is by no means completely accurate. More cores are irrelevant if the program isn't designed to make use of them.

    This is why larger multi-player operations set up for gaming use distributed server networks rather than one $5M supercomputer.
    Supercomputers are also clusters of computers and, more frequently nowadays, hybrid clusters that combine conventional CPUs and GPUs to get more power out of less space and fewer nodes, with better power and heat efficiency. But again, the software has to be designed to support this type of architecture for it to have any benefit over a single core of a single processor.
     

    Bench

    Creative Director
    Joined
    Jun 24, 2013
    Messages
    1,046
    Reaction score
    1,745
    • Schine
    • Wired for Logic
    • Legacy Citizen 6
    tl;dr will sometime later once i've had coffee. but just commenting off what i read in the OP, linking servers somehow is a suggestion we've heard before and we're not against that idea. We've thought about simple ways that might be achieved but nothing is planned at the moment. I'll try to remember to prefix this after reading through the whole thread.
     
    Joined
    Jul 29, 2013
    Messages
    1,173
    Reaction score
    494
    • Competition Winner - Small Fleets
    • Top Forum Contributor
    • Legacy Citizen 5
    Would a simple "shared personal inventory and credits" system work for this? Perhaps as a super simple interim system that allows two servers to become "sister" servers?
     
    Joined
    Jan 31, 2015
    Messages
    1,696
    Reaction score
    1,199
    • Thinking Positive
    • Likeable
    tl;dr will sometime later once i've had coffee. but just commenting off what i read in the OP, linking servers somehow is a suggestion we've heard before and we're not against that idea. We've thought about simple ways that might be achieved but nothing is planned at the moment. I'll try to remember to prefix this after reading through the whole thread.
    Thank you for the response.
    I'm glad the dev team has examined some of the simple approaches to this. I know that there are a lot of other key features being worked on at the moment, so it's good just to know that this is something under consideration.
     
    Joined
    Mar 10, 2015
    Messages
    122
    Reaction score
    50
    • Community Content - Bronze 1
    • Purchased!
    • Legacy Citizen 5
    It isn't really practical to link servers any time soon, nor is it really all that beneficial yet. The problem with servers is that they effectively use only one core, so only clock speed matters and multi-core is basically worthless to a server right now. The better thing would be to spread the load more efficiently across the one physical server first, before even talking about linking physical servers.



    This is by no means completely accurate. More cores are irrelevant if the program isn't designed to make use of them.



    Supercomputers are also clusters of computers and, more frequently nowadays, hybrid clusters that combine conventional CPUs and GPUs to get more power out of less space and fewer nodes, with better power and heat efficiency. But again, the software has to be designed to support this type of architecture for it to have any benefit over a single core of a single processor.
    Given that all servers on the network would be required to think about isolated areas as their own independent packages, threading this would be really easy.
     
    Joined
    Jul 1, 2013
    Messages
    35
    Reaction score
    18
    Given that all servers on the network would be required to think about isolated areas as their own independent packages, threading this would be really easy.
    You completely missed the part where the game doesn't even utilize all the hardware it's currently given, giving it more hardware wouldn't help anything...
     
    Joined
    Jan 31, 2015
    Messages
    1,696
    Reaction score
    1,199
    • Thinking Positive
    • Likeable
    Would a simple "shared personal inventory and credits" system work for this? Perhaps as a super simple interim system that allows two servers to become "sister" servers?
    Actually there are even easier solutions than constantly sharing player data like that, I just don't know if all of them can work with the existing data structures or the developers' vision. Some of the simpler solutions are also less robust and may not harmonize with player experience. There are multiple possible solutions though, and the first one a developer attempts to use may not end up being the best one.

    Two servers can continue to operate almost completely independently of each other until PlayerA crosses an in-game boundary between the half of the universe server1 is hosting, and the half of the universe server2 is hosting. At that point S1 would just need to "hand off" the player's client to S2, and send S2 the information packet regarding the playerA's ship, inventory, etc that are coming across. That's most of it right there. PlayerA might experience a slight delay while their client synched with the new server, but would essentially continue to play as if the changeover had never occurred. I think the only data servers would need to constantly update each other about would be the players currently online (so that in-game you could see total players online) and the faction relationship & membership portions of the DB, since playerA on S1 might kick playerB off his faction while playerB is running around on space hosted by S2.

    CyberTao - The server is already constantly communicating with various clients over a variety of networks, hosted by many different companies. When you log on, all you are doing is establishing a constant communication with the server - like making a phone call. It doesn't matter if you use one service provider and the person you call uses another; you can still send information back and forth. Verizon doesn't balk at making calls to AT&T, nor does Comcast balk at a server on its network communicating with clients and servers hosted by competitors. Establishing a link between servers has nothing to do with the companies hosting them - all those companies do is provide processing power to the application and bandwidth for it to communicate across the network. They don't discriminate based on whether the application you pay them to host talks to an application hosted by another company. That's how there's a World Wide Web of networks. You think all the companies hosting all the nodes on the web are in some kind of agreement with each other?
     
    Joined
    Mar 10, 2015
    Messages
    122
    Reaction score
    50
    • Community Content - Bronze 1
    • Purchased!
    • Legacy Citizen 5
    You completely missed the part where the game doesn't even utilize all the hardware it's currently given, giving it more hardware wouldn't help anything...
    Threading isn't hardware, threading is software. More computers would absolutely help, that's literally the basis for every CDN ever. Why are you so upset about this?

    Two servers can continue to operate almost completely independently of each other until PlayerA crosses an in-game boundary between the half of the universe server1 is hosting, and the half of the universe server2 is hosting. At that point S1 would just need to "hand off" the player's client to S2, and send S2 the information packet regarding the playerA's ship, inventory, etc that are coming across. That's most of it right there. PlayerA might experience a slight delay while their client synched with the new server, but would essentially continue to play as if the changeover had never occurred. I think the only data servers would need to constantly update each other about would be the players currently online (so that in-game you could see total players online) and the faction relationship & membership portions of the DB, since playerA on S1 might kick playerB off his faction while playerB is running around on space hosted by S2.
    This is a little bit lighter weight, but it doesn't scale very well. Schema's not one to half-ass stuff; he tends to go for a long-term solution as soon as he can. There's enough information moving around here that if you were to create this system, you could go for the full-package scalable solution with only a little more effort.
     
    Joined
    Jul 1, 2013
    Messages
    35
    Reaction score
    18
    Threading isn't hardware, threading is software. More computers would absolutely help, that's literally the basis for every CDN ever. Why are you so upset about this?
    How am I upset? I'm pointing out the fact that making the game use more servers would be wasteful and pointless before it was able to utilize the full power of a server it's already given... Currently the server utilizes about 30% of a quad core server. That tells me that while it has a minor amount of work farmed out to other threads, the bulk of the game is single-threaded. Adding more servers would be wasteful, and expensive for server owners while providing no benefit, as it doesn't use all the hardware we already give it...
     
    Joined
    Jul 21, 2013
    Messages
    2,932
    Reaction score
    460
    • Hardware Store
    How am I upset? I'm pointing out the fact that making the game use more servers would be wasteful and pointless before it was able to utilize the full power of a server it's already given... Currently the server utilizes about 30% of a quad core server. That tells me that while it has a minor amount of work farmed out to other threads, the bulk of the game is single-threaded. Adding more servers would be wasteful, and expensive for server owners while providing no benefit, as it doesn't use all the hardware we already give it...
    Correction: observing a server during its normal runtime will never result in 100% load, as any thread of the server will have a sleep-timer built in [and a sleeping thread obviously uses up next to no CPU power].
    You must overload the server to test how far it can go.
     
    Joined
    Jul 1, 2013
    Messages
    35
    Reaction score
    18
    Correction: observing a server during its normal runtime will never result in 100% load, as any thread of the server will have a sleep-timer built in [and a sleeping thread obviously uses up next to no CPU power].
    You must overload the server to test how far it can go.
    I never said it should always use 100% but using 100% of one core and like 5% of all the others combined, hence the lag, is not utilizing the hardware at all...

    Also, under normal load, no, a server should never use 100%, but when it's under a truly demanding load, i.e. peak time on game servers, using 100% is not really bad. It's better to let the server iterate as fast as it can, even if that uses all of the CPU, whenever it's running below some pre-defined speed past which there's no point going faster...
     
    Joined
    Jan 31, 2015
    Messages
    1,696
    Reaction score
    1,199
    • Thinking Positive
    • Likeable
    Networking is weird. I just want to put that up front.

    The basic idea is pretty simple, and also pretty sound. Having two computers is more or less the same as having one computer with two cores, and it's pretty common knowledge that more cores = less lag. Easy enough. The problem is making sure every connected computer is on the same page. Ideally, we want a decentralized model - where no one server is more important than any other - because that way, one server crash can't bring down the whole network.

    I have a lot of time and nothing better to do right now, so I'm just going to write a specification for how you might accomplish this. All of this is going to hinge on one really convenient fact: no user can be in more than one place at any one time. This is a distinct advantage over, say, an IRC network, where one user can receive messages from thousands of channels at once.

    I'm gonna call this the first draft, because I probably forgot something.

    Spread the load
    Assume all servers in the network have all of the world information all the time. We're going to actually solve this problem later, but for now just pretend like we already have.

    Each server in the network is responsible for one or more isolated areas of space. An "isolated area of space" is a contiguous group of loaded sectors. So, imagine, for example, that you have two players in 0,0,5, and three players in 5,0,0. They can't interact with each other, because they're too far away - they don't have any loaded sectors in common - so they can be handled by different servers, and no one would ever know the difference.
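
    A minimal sketch of how isolated areas could be found, assuming each player loads a cube of sectors around their position. The radius, the function names, and the player dictionary are all made up for illustration and say nothing about how the game actually tracks loaded sectors.

```python
from itertools import product

LOAD_RADIUS = 1  # hypothetical: sectors loaded around a player in each direction


def loaded_sectors(pos, radius=LOAD_RADIUS):
    """Return the set of sector coordinates loaded around a player at pos."""
    x, y, z = pos
    offsets = range(-radius, radius + 1)
    return {(x + dx, y + dy, z + dz) for dx, dy, dz in product(offsets, repeat=3)}


def isolated_areas(players):
    """Group players whose loaded sectors overlap, directly or via other players."""
    areas = []  # list of (sector set, player name set); sector sets stay disjoint
    for name, pos in players.items():
        sectors, names = loaded_sectors(pos), {name}
        remaining = []
        for area_sectors, area_names in areas:
            if sectors & area_sectors:
                # The new player bridges into this area, so merge it in.
                sectors |= area_sectors
                names |= area_names
            else:
                remaining.append((area_sectors, area_names))
        areas = remaining + [(sectors, names)]
    return [sorted(names) for _, names in areas]
```

    With two players in 0,0,5 and three in 5,0,0, this returns two groups, and each group could be handed to a different server.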

    In order to determine what servers handle what players, two heuristics are used:
    1. "load": How complicated an isolated area is to simulate. Example:
      Code:
      Number of players * combined mass of all entities * number of projectiles fired in the last 3 minutes
    2. "power": How powerful the server is. Example:
      Code:
      (Total RAM * CPU speed * CPU cores) / current average ping

    Not gonna lie, heuristics are basically computer magic. The definition is pretty vague and hand-wavy, but picking good ones is really important, which is just about the most annoying thing. I group them with matrices in the category of "math that someone literally just made up". I will fully admit that I suck at heuristics; someone who's better with them might be able to come up with better ones. There's also no reason why you couldn't outsource this expression to a config file.

    In any case, the load:power ratio of a server indicates how much work it's doing.
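
    The two example heuristics and the ratio could be written out roughly like this. The function names and the units (GB, GHz, milliseconds) are assumptions; the formulas are just the example ones from above.

```python
def area_load(players, total_mass, recent_projectiles):
    """Example "load": players * combined entity mass * projectiles in the last 3 minutes."""
    return players * total_mass * recent_projectiles


def server_power(ram_gb, cpu_ghz, cores, avg_ping_ms):
    """Example "power": (RAM * CPU speed * cores) / current average ping."""
    return (ram_gb * cpu_ghz * cores) / avg_ping_ms


def load_power_ratio(area_loads, power):
    """How much work a server is doing: total load of its areas over its power."""
    return sum(area_loads) / power
```

    One thing the sketch makes obvious: because the example load is a pure product, an area where nobody has fired anything in the last 3 minutes scores 0 no matter how many players are in it, so a real formula would probably want offsets or a weighted sum instead.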

    Servers under high load may attempt to pass off some of their isolated areas to servers with a low load - this is called a "netsplit". To do this, they first select an area they wish to pass off (ideally something small). The high load server then broadcasts a message to all servers saying that it wishes to pass off that area, and it includes the load calculation for the area. Other servers in the network reply by either accepting or denying the request. In addition to accepting a request to move a sector group, a server may also request priority. It may do this because it believes it's going to need those clients soon anyway; for example, if the area being moved is close to some sectors the acceptor already has loaded. If at least one server accepts, the high load server selects the acceptor with the lowest load:power ratio and synchronizes all information about the players and entities in those sectors with the accepting server. Clients are informed of the split and moved to the accepting server. Once the high load server has finished the synchronization, it simply unloads the area. The accepting server ensures everything is loaded before beginning to simulate the area and send information to players.
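
    A sketch of the selection step at the end of a netsplit request. The `Reply` type is hypothetical, and the text above doesn't say how a priority request should be weighed against the ratio, so this assumes priority requests win first and the lowest load:power ratio breaks ties.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reply:
    server: str              # name of the replying server
    accepts: bool            # True if it is willing to take the area
    wants_priority: bool     # True if it asked for priority (e.g. nearby sectors loaded)
    load_power_ratio: float  # its current load:power ratio


def choose_acceptor(replies) -> Optional[str]:
    """Pick the server that takes over the area, or None if everyone denied."""
    acceptors = [r for r in replies if r.accepts]
    if not acceptors:
        return None
    # Assumption: priority requests sort first, then the lowest ratio wins.
    best = min(acceptors, key=lambda r: (not r.wants_priority, r.load_power_ratio))
    return best.server
```

    If nobody accepts, the high load server just keeps the area and can try again later.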

    Periodically, all servers broadcast their load:power ratios to all other servers in the network. If convenient, two servers may elect to move sectors in order to more evenly distribute the load - for example, if one server is new to the network and has no load, the highest load server may try to give it some work.

    The entire netsplit process is going to suck for players in the area, but ideally splits should only affect a small portion of players for a brief period of time. There's not really any way around it, though; sometimes you just have to move people. Ideally, being moved should result in your ping going down anyway, since you're moving from a high load server to a low load server.

    Synchronization
    When you see "server autosaving", along with writing information to disk, all servers attempt to synchronize information with each other.

    New servers joining the network are always brought up to speed (see Registering new servers), so all we need to do here is inform everyone else about all the stuff that's changed (everything built, blown up, mined, etc.).

    The easiest way to do this is to broadcast a list of all recently changed chunks, their SHA1 hashes, and change timestamps. Servers on the network check this list and determine whether or not they need to download an update. If they do, they request the full chunk files from the originating server.
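
    Roughly, the broadcast list and the check on the receiving side could look like this. The manifest layout and function names are invented for the example, and "newer timestamp wins" is an assumption about how conflicting changes get resolved.

```python
import hashlib


def build_manifest(changed_chunks):
    """Map {chunk_id: (data_bytes, changed_at)} to {chunk_id: (sha1_hex, changed_at)}."""
    return {
        chunk_id: (hashlib.sha1(data).hexdigest(), changed_at)
        for chunk_id, (data, changed_at) in changed_chunks.items()
    }


def chunks_to_request(remote_manifest, local_manifest):
    """Return the chunk ids this server should download from the originating server."""
    wanted = []
    for chunk_id, (remote_hash, remote_ts) in remote_manifest.items():
        local = local_manifest.get(chunk_id)
        if local is None:
            wanted.append(chunk_id)  # never seen this chunk before
            continue
        local_hash, local_ts = local
        # Assumption: only fetch when the content differs and the remote change is newer.
        if local_hash != remote_hash and remote_ts > local_ts:
            wanted.append(chunk_id)
    return sorted(wanted)
```

    Only the small manifest gets broadcast; the full chunk files move only when a server actually asks for them.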

    This is going to involve a lot of information going a lot of places. To prevent ping spikes, the easiest thing to do is just throttle everything - this isn't a time sensitive process, every server has all the information it needs to serve all of its current clients. They can upload and download relevant information at their leisure. Realistically, if a server is under high load, it can afford to skip a synchronization and just wait for the next one to come around. There's also no reason why every server needs to synchronize at the same time as every other server, they can stagger the process. There's a lot of room for config options here.
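
    Two of those config options, sketched under assumptions: a per-server offset so that not everyone synchronizes at the same moment, and a rule for skipping a round when busy. The names and the 0.8 threshold are invented; any stable hash would do in place of CRC32.

```python
import zlib


def sync_offset(server_name, interval_s):
    """Spread servers across the sync interval using a stable hash of their name."""
    return zlib.crc32(server_name.encode("utf-8")) % interval_s


def should_skip_sync(load_power_ratio, busy_threshold=0.8):
    """A busy server can sit this round out and catch up on the next one."""
    return load_power_ratio >= busy_threshold
```

    Each server would then start its synchronization `sync_offset(name, interval)` seconds into every interval, unless `should_skip_sync` says it has better things to do.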

    Moving clients
    Sometimes, a client will have to change servers. This is most likely to happen if a client enters a sector that is loaded by a different server. Imagine going through a warp gate from your base to your friend's base - if your base is on one server, and your friend's base is on another server, you'll have to connect to their server when you go through the gate.

    When a server determines that it needs to move a client, the client is given the address for the new server, and the old server closes the connection. The client displays some message and effectively pauses the game (ideally, if you're jumping, you just stay in the warp tunnel). The old server then synchronizes any relevant data with the new server (player inventory, chunk data for the player's ship, etc.). Once the new server has received and loaded all information, it synchronizes with the client. This completes the move.

    Players joining
    Whenever a player joins, they will contact one server (probably the one on the global server list, but not necessarily). When this happens, there are two possibilities (players that are entirely new to the world are effectively the same as players whose last known location is the spawn point):

    1. The player's last known position is in a sector that is currently loaded by some server on the network. In this case, if the player's last known location is currently loaded by the server contacted, it is responsible for the new player. Otherwise, the server contacted simply refers the client to the correct server and closes the connection. The client then automatically connects to the correct server, which is responsible for the new player.
    2. The player's last known position is not in any currently loaded sector. In this case, the server contacted is responsible for the new player if it believes it can handle the load. If it can't, it attempts to pass the connection off to the server with the lowest load. If it finds another server that can handle the request, it refers the client to that server and closes the connection. The client then automatically connects to the other server, which is then responsible for the new player.

    The server responsible for the new player informs all other servers on the network that the player has joined, loads in adjacent sectors (if required), and continues normally.
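
    The two join cases can be sketched as one routing function. The `max_ratio` cutoff for "can handle the load" is a made-up parameter, and what happens when no server has room isn't specified above, so this assumes the contacted server keeps the player in that case.

```python
def route_join(contacted, last_sector, sector_owner, ratios, max_ratio=1.0):
    """Return the server that becomes responsible for a joining player.

    sector_owner: {sector: server} for currently loaded sectors
    ratios: {server: load:power ratio} for every server in the network
    """
    owner = sector_owner.get(last_sector)
    if owner is not None:
        # Case 1: the sector is already loaded somewhere; that server takes the player.
        return owner
    # Case 2: nobody has the sector loaded.
    if ratios[contacted] < max_ratio:
        return contacted
    lowest = min(ratios, key=ratios.get)
    # Assumption: if even the least loaded server is full, the contacted one keeps the player.
    return lowest if ratios[lowest] < max_ratio else contacted
```

    Whenever the result differs from `contacted`, the contacted server refers the client to that server and closes the connection.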

    Registering new servers
    Just to make things easy, a "network" is one or more servers simulating the same universe.

    When you have a network already (which may be only one computer), and you want to add a new server to it, the outside server asks one server in the network if it's allowed to join. As part of this handshake process, ideally a password should be involved, so that some random asshole can't connect to your network and pipe all of his traffic to /dev/null.

    If the network accepts the outside server, the contacted server sends to the outside server a full list of all servers in the network. The outside server then connects to every other server in the network. The outside server must then synchronize all of its world data the same way you torrent "legal" movies. Each server in the network sends the outside server all of its modified player and sector data. For the sake of simplicity, any data that isn't currently loaded by any server (sectors that just don't have any people in them right now, but still have stuff associated with them) should come from the initially contacted server.

    Once the outside server has received all world data, it broadcasts a message to all servers that it is ready to handle requests. This completes the process of adding the server to the network. Note that the server may spend a while doing nothing; this is because it's probably not worth it to cause a netsplit just to load up the new server if the network is currently under mild load. Instead, since its load is effectively 0, the new server will be the first to take sectors from high load servers, or accept new clients.

    To keep the global server list clean, each network should only be listed once. Without some kind of round robin style thing going on, the easiest way to accomplish this is to keep the list of all servers in the network ordered by connection time - that is, the first server on the network is first, then the second server to connect, then the third, and so on. The server at the front of the list is responsible for the global listing. If that server goes down, responsibility moves to the second server, and so on. As a side effect, this means that one server will handle the majority of all player join requests, but those aren't hard to handle (nor do they happen very often), so that's no big deal.
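
    The failover rule is small enough to sketch directly. The names are illustrative, and how servers detect that another one went down is left out here.

```python
def listing_server(join_order, alive):
    """Return the first still-running server in connection order, or None if none are up."""
    for server in join_order:
        if server in alive:
            return server
    return None
```

    Since every server holds the same ordered list, they all reach the same answer without having to vote on it.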

    Graceful shutdown
    If a server is to be terminated, it broadcasts a message that it will shut down soon and needs to move all of its clients. It includes in this message a list of each isolated area, and the load score for each. Servers reply by attempting to take a particular area, ideally the largest they can accept; sectors are then assigned in a way that minimizes the average load:power ratio.

    If any sectors are left over (no one accepted them), the closing server reserves the right to assign the sectors to another server in the network (with a message that basically says "you need to take this and it's an emergency"). The assigned server must accept, but it may try to move the sectors again later.
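
    A sketch of the assignment during a graceful shutdown. This is a simple greedy pass (biggest area first, each going to whichever candidate ends up with the lowest ratio), which only approximates "minimizes the average load:power ratio". The data layout is assumed.

```python
def assign_areas(area_loads, bids, servers):
    """Assign each area of a closing server to a remaining server.

    area_loads: {area: load score}
    bids: {area: [servers that offered to take it]}
    servers: {server: (current_load, power)}
    Returns {area: (server, forced)}; forced is True for emergency assignments.
    """
    load = {name: current for name, (current, _) in servers.items()}
    power = {name: pwr for name, (_, pwr) in servers.items()}
    result = {}
    for area in sorted(area_loads, key=area_loads.get, reverse=True):
        bidders = bids.get(area)
        forced = not bidders  # nobody volunteered, so someone gets told to take it
        candidates = bidders or list(servers)
        target = min(candidates, key=lambda s: (load[s] + area_loads[area]) / power[s])
        load[target] += area_loads[area]
        result[area] = (target, forced)
    return result
```

    A server that receives a forced area has to accept it, but nothing stops it from starting a normal netsplit for that area afterwards.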

    Crashes
    If any server on the network crashes, all servers on the network must attempt to synchronize the data it was responsible for amongst themselves. It's unlikely that any of them have new information, but it's important for all of them to be on the same page, because the clients that crashed with the server are likely to reconnect. It is, unfortunately, up to each client to reconnect.

    Chat
    Turns out this is actually really easy. The new chat system uses an IRC backend, and IRC was basically designed for this.

    I would bet there's already something in the IRC library Schema is using that would allow users to chat globally across the whole network, although it's not really any big deal if there isn't - just broadcast every chat message to all the other servers. Or, even better, if all servers know all user information all the time - what users are connected to which servers, what chat channels are open, what users are in each chat channel - servers that receive a chat message can determine themselves which other servers have to be notified of the message.
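
    The "even better" option could be sketched like this, assuming every server holds the same two lookup tables. The table names are hypothetical and have nothing to do with the actual IRC backend.

```python
def servers_to_notify(channel, origin, channel_members, user_server):
    """Return the other servers that host at least one member of the channel.

    channel_members: {channel: set of user names}
    user_server: {user name: server the user is connected to}
    """
    targets = {
        user_server[user]
        for user in channel_members.get(channel, ())
        if user in user_server  # offline members have no server
    }
    targets.discard(origin)  # the originating server already has the message
    return targets
```

    A channel whose members all sit on one server then never generates any cross-server traffic at all.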

    Conclusion
    If this seems like a lot of work, it's not actually that bad. People tend to stay pretty dispersed in my experience; most of the time you probably won't ever have to move servers. When you do, it probably won't take long, and you'll probably be jumping - so the warp tunnel is a great loading screen. Unless you frequently fly into other people's space, you're unlikely to hop servers very often. Plus, the system is set up such that if you group a lot of people together, your server is likely to try to ditch its other loads and give you its undivided attention.

    This won't protect you from pingspikes from massive fleet battles with 50 titans, but it will protect everyone else from pingspikes when one stupid asshole is mining a planet. And that's worth something.

    In any case, this would be the mod to end all mods.
    This is a more robust approach to networking servers for multiplayer in the long run. I would love to see this be the final solution, because it would enable closer interaction between large numbers of players in a universe. And as you say, the beauty is that it would insulate most players in a galaxy from idiots ramming titans and face-planting into planets and such.

    Without taking away from the fact that I really like the dynamic networking model you're outlining, I fear that it would also be harder to implement, and could be substantially more demanding on servers. Each server would add processing overhead by running its load statistics through an algorithm to decide whether it should hand off some of the work, and by constantly comparing notes with other servers on the same network/universe to determine who could absorb that load if a partial handover were needed. In most situations the extra overhead from coordinating a dynamic network like this wouldn't be noticeable at all, of course, because of the workload distribution. In certain situations it might be, though: probably fleet battles mostly, where so many things are happening at once in so small an area that distributing the work would cause unacceptable lag (even fractions of a second of drag become noticeable), so one CPU just has to deal with it. There, the extra weight of also constantly checking whether it should and can hand off any load to other network nodes seems like it would reduce the server's performance specifically in the most high-stress situations.

    So as you mention, a dynamic network would insulate most players from massive lag spikes (which would be SO nice), but for the players entangled in them it might actually compound the problem. Or not - I'm not even sure that your dynamic model would in fact increase processor overhead. I just don't personally know of a system to dynamically share loads that wouldn't, but it's not really my specialty.

    I could be wrong, but I also feel that - because of more frequent hand-offs - a dynamic network like this is best suited to rigs that are hardwired together. It's definitely a more streamlined solution than dividing up the universe into halves, quadrants or even eighths, but maybe only viable *if* there are community server hosts with pockets deep enough to implement it, isn't it? Again - I don't even know. I guess if the hand-off thresholds/tolerances were set up to be extra sticky for long-distance networking (i.e. only hand off load when a very high threshold had been breached, to prevent constantly swapping back and forth) it might not matter.
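
    To illustrate what I mean by "sticky", here is a tiny sketch of a hand-off check with a high-water mark and a minimum gap. Both numbers and the function name are invented; they would presumably be config options tuned per network.

```python
def should_hand_off(own_ratio, target_ratio, high_water=0.9, min_gap=0.3):
    """Hand off only when this server is badly loaded AND the target is clearly better off.

    high_water: load:power ratio that has to be exceeded before handing off at all
    min_gap: how much lower the target's ratio must be, so areas don't bounce back
    """
    return own_ratio > high_water and (own_ratio - target_ratio) >= min_gap
```

    Raising both values for servers that are far apart would make hand-offs rare over slow links while still allowing them in a real emergency.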
    I never said it should always use 100%, but using 100% of one core and like 5% of all the others combined, hence the lag, is not utilizing the hardware at all...

    Also, under normal load, no, a server should never use 100%, but when it's under a truly demanding load, i.e. peak time on game servers, using 100% is not really bad. It's better to let the server iterate as fast as it can, even if that uses all of the CPU, as long as it's running below some pre-defined speed beyond which there's no point going faster...
    This is a good insight into why single events (ramming, planet mining) lag out entire servers. The game isn't isolating them onto a core and letting everyone else continue playing on other cores with lower loads. If this is true, it's especially wasteful on a rig with 4 or 8 cores. If solving this issue is relatively equal in difficulty to enabling server networks, then this should absolutely take priority, since an 8 core rig would probably be able to serve several times as many players as it currently can. Server networks would still be something nice in the long term, and moreover may be a more viable solution if teaching SM to properly multi-core is substantially more difficult than teaching it to network servers.
    This is a little bit lighter weight, but it doesn't scale very well. Schema's not one to half-ass stuff; he tends to go for a long-term solution as soon as he can. There's enough information moving around here that if you were to create this system, you could go for the full-package scalable solution with only a little more effort.
    Agreed. I don't know the developers' intentions, but a dynamic network is absolutely going to be more robust at larger scales. Preferable in all ways, assuming my concerns about its preference for co-located networking and its potential for increased processor overhead are unfounded. My solution was based on a projection about the size (and available funding) of the StarMade community remaining within an order of magnitude of current levels, as well as involving the least amount of trouble to implement. I think an epic-scale game like this (infinite galaxies) really would shine on a more scalable, powerful co-located network sharing loads dynamically, and if I were working on a project like this that would probably be my preferred long-term vision over simple ad hoc server linkups. The latter would still be preferable to the current total inability to link servers though!

    In the end, the developers will have to decide. I would prefer to see whatever solution fits best with the resources and vision of the development team. And at least now we know we may eventually see improved multi-player numbers through some kind of server networking :)
     
    Joined
    Jul 1, 2013
    Messages
    35
    Reaction score
    18
    This is a good insight into why single events (ramming, planet mining) lag out entire servers. The game isn't isolating them onto a core and letting everyone else continue playing on other cores with lower loads. If this is true, it's especially wasteful on a rig with 4 or 8 cores. If solving this issue is relatively equal in difficulty to enabling server networks, then this should absolutely take priority, since an 8 core rig would probably be able to serve several times as many players as it currently can. Server networks would still be something nice in the long term, and moreover may be a more viable solution if teaching SM to properly multi-core is substantially more difficult than teaching it to network servers.
    This is absolutely true. I've run RedShift for 2 years, and know quite well the load behavior and issues with the server. Spreading out across threads is easier than networking individual servers together, as you can share memory; you just have to design the system to spread out. Obviously there are still issues like thread synchronization, but it's still far simpler than networking individual servers together.

    We used to have a 16 core server at a much lower clock speed, and after seeing how SM behaved we moved to what we have now, which is a quad core 4GHz server, and it performs far better thanks to the speed increase.
     
    Joined
    Aug 22, 2013
    Messages
    18
    Reaction score
    18
    To prove Koderz's point, here is my server. Dual Xeons with a total of 16 cores (8 cores each).

    Here is an example of how things normally should be, and most of the time are with ~20 people on:
    http://puu.sh/gVpdP/7580afa45b.png


    Here is an example of our server under "very high load" and things begin to lag with ~5 people and 1 mining a lot. Notice how the other cores actually have less load because the single task on the core taking the load is locking up other tasks from running properly.
    http://puu.sh/gVoYv/007318f586.png


    There is no need to allow the game to use multiple servers when it cannot even make use of all the processing power it already has available to it. If the game were actually better optimized to make use of what it has available, almost any need for server clusters would be removed until the game grew much larger.
     