After the BigWorld cluster has been deployed, an operations team should be in charge of maintaining and updating it as required.
The operations team should monitor cluster usage daily using StatGrapher to detect unexpected issues. Disk usage should also be monitored, as running out of disk space is a common cause of multiple server issues, particularly on tools machines.
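As an illustration of the disk monitoring mentioned above, the following is a minimal sketch of a check that could be run periodically on each machine. It is not part of BigWorld or StatGrapher; the function name, the 90% threshold, and the example mount points are assumptions made for this example.

```python
import shutil


def paths_over_threshold(paths, threshold=0.9, usage_fn=shutil.disk_usage):
    """Return (path, used_fraction) for every path at or above the threshold.

    usage_fn defaults to shutil.disk_usage, which returns an object with
    total, used, and free attributes (in bytes). It is a parameter only so
    that the check can be exercised without touching a real filesystem.
    """
    over = []
    for path in paths:
        usage = usage_fn(path)
        # Guard against a zero-sized filesystem to avoid dividing by zero.
        fraction = usage.used / usage.total if usage.total else 0.0
        if fraction >= threshold:
            over.append((path, fraction))
    return over


def format_warnings(over):
    """Turn the result of paths_over_threshold into readable warning lines."""
    return [f"WARNING: {path} is {fraction:.0%} full" for path, fraction in over]
```

A cron job on a tools machine could call `paths_over_threshold(["/", "/var"])` (hypothetical mount points) and mail the output of `format_warnings` to the operations team whenever the list is not empty.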
The operations team should pay attention to any server crashes and report them to the BigWorld support team, making sure that all relevant information is collected and sent. See the Server Operations Guide's chapter First Aid After a Crash for the procedure for investigating and reporting a crash.
As a server cluster can encompass a large number of machines, it is critical to have a deployment plan in place for rolling out regular game resource updates, as well as BigWorld server binary updates for critical patch releases.
Some common methods of distributing resources around a large cluster include:
- NFS (Network File System)
This has the advantage of keeping all your cluster machines up to date at once. The disadvantage is a large network load during startup and shutdown, and potentially during critical load situations, where a process that needs to dump a core file to the filesystem will further increase network load.
- RSync
The rsync utility is an extremely fast and efficient tool for copying files between machines in your network. With some customisation for your server cluster, it can be a good alternative for pushing releases to server machines.
- RPMs (Package Management)
As outlined in the Server Operations Guide's chapter RPM, creating RPMs and setting up an in-house YUM repository can be an effective way of keeping machines synchronised with the latest versions through a single push operation to the master repository.
Using this approach, an RPM would be generated for the BigWorld server binaries (e.g., bigworld-server-2.0.1.x86_64.rpm), while a separate RPM would be generated for your game resources (e.g., customer-gamename-1.0.2.noarch.rpm). This would allow updates of game resources and BigWorld server binaries to be pushed to machines independently.
An added advantage of this approach is that packages can be downgraded in an emergency, for example when a bad game script update has been pushed onto a live production server. This would be as simple as running the yum downgrade command.
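The rsync and RPM options above can be sketched as small command builders. This is only an illustration under stated assumptions: the host name, directory paths, and the earlier package version are hypothetical, and the use of ssh and sudo to reach each machine is an assumption about the cluster setup rather than a BigWorld requirement. The rsync options used (`-a`, `-z`, `--delete`, `--dry-run`) and `yum -y downgrade` are standard options of those tools.

```python
def rsync_push_command(source_dir, host, dest_dir, dry_run=False):
    """Build the argument list for pushing a resource tree to one machine.

    -a preserves permissions and timestamps, -z compresses data in transit,
    and --delete removes files on the destination that no longer exist in
    the source, so the destination mirrors the release exactly.
    """
    # A trailing slash makes rsync copy the directory contents rather than
    # creating an extra nested directory on the destination.
    source = source_dir.rstrip("/") + "/"
    command = ["rsync", "-az", "--delete"]
    if dry_run:
        # Report what would change without modifying the destination.
        command.append("--dry-run")
    command += [source, f"{host}:{dest_dir}"]
    return command


def yum_downgrade_command(host, package, version):
    """Build the argument list for rolling one package back on one machine.

    Assumes passwordless ssh to the host and sudo rights for yum there.
    """
    return ["ssh", host, "sudo", "yum", "-y", "downgrade", f"{package}-{version}"]
```

Each list can be passed to `subprocess.run(command, check=True)` from a deployment script that loops over the cluster machines. For example, `yum_downgrade_command("cell01", "customer-gamename", "1.0.1")` (hypothetical host and version) builds the rollback for a bad game resource release, provided the older package is still available in the repository.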
Other options are also discussed in Secondary Storage Considerations.