We illustrate the guidelines from the previous chapters on BigWorld's fault tolerance and disaster recovery mechanisms with an example that uses transactions, and show how to make those transactions work with both mechanisms.
The example is a trading transaction that transfers an item between two player entities.
The transaction logic between the two player entities Alice and Bob is as follows:
-
Alice and Bob are within each other's Area of Interest (AoI). Alice's client informs her base entity that she would like to give Bob a Sword item.
Alice's base entity is passed the player name of Bob as part of the request from the client, along with the representation of the Sword item within Alice's inventory.
Alice's base entity adds an entry into her transaction list. This entry contains a unique transaction ID that identifies this transaction, Bob's player name, the state of the transaction (set to a symbolic constant called BEGIN), and the item in Alice's inventory.
Alice's base entity removes the Sword from Alice's inventory.
Alice requests a write to the database.
-
When the write to the database calls back, it indicates whether the write was successful or not.
If it was not successful, then there is a problem with the database, and the transaction is aborted (a message is sent back to the client informing Alice of this situation), and the Sword is added back to Alice's inventory.
Otherwise, if the write to the database was successful, the transaction action starts with Alice's base entity requesting an entity base mailbox lookup based on the player name, via the BigWorld.lookUpBaseByName() method, and registering a callback to a functor containing the transaction ID.
-
Alice's base entity gets notification with the base mailbox of Bob.
If Bob's base entity can't be found, the transaction is aborted by removing the transaction entry from the transaction list, adding the Sword back to Alice's inventory, informing Alice's client and calling another writeToDB().
Otherwise, Alice's base entity calls a method on Bob's base mailbox, requesting that it add the Sword item to his inventory. We pass the item, transaction ID, Alice's player name and a mailbox back to Alice's base entity.
-
Bob's base entity adds the Sword item to its inventory (but marks it as unusable by Bob's client for the moment).
Bob's base entity adds an entry into its transaction list, with Alice's player name, the state of the transaction set to the symbolic constant REMOVE, and the same transaction ID that was passed in from Alice.
Bob's base entity then starts a write to the database, registering a callback to a functor object that holds Alice's base entity mailbox and the transaction ID.
-
Bob's base entity is called back with the result of the database write.
If it was unsuccessful, Bob should remove the Sword from his inventory and remove the transaction entry from his transaction list (the transaction ID is stored in the functor callback).
If the write was successful, the item should be marked as usable for Bob's client.
Whether or not the database write was successful, Bob's base entity informs Alice of the success of the database write through her base mailbox that is supplied through the functor callback. Bob passes in the success flag, a mailbox to Bob's base entity and the transaction ID.
-
Alice's base entity receives the result of the transaction from Bob's side.
Alice removes the transaction entry from her transaction list.
If Bob indicated that the transaction was unsuccessful, Alice adds the item back to her inventory and informs her client of the trading failure.
Alice writes to the database, and registers a callback to a functor that holds Bob's base mailbox and the transaction ID.
Alice notifies Bob that the transaction on her side is complete, by passing in the transaction ID.
-
Bob receives this notification, and removes the transaction entry with the given transaction ID.
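The exchange above can be sketched as a small simulation in plain Python. This is only an illustration under stated assumptions, not BigWorld code: PlayerBase, write_to_db, the registry dictionary (standing in for the lookup by player name) and the db_ok flag are all hypothetical names, and every asynchronous call is replaced by an immediate one. We also assume that Alice notifies Bob from inside the callback of her final database write, and that Bob's transaction entry keeps a copy of the item for convenience.

```python
import itertools

BEGIN, REMOVE = "BEGIN", "REMOVE"  # symbolic transaction states from the text
_transaction_ids = itertools.count(1)


class PlayerBase:
    """Stand-in for a player base entity; not the BigWorld Base class."""

    def __init__(self, name, registry, db_ok=True):
        self.name = name
        self.inventory = []       # items usable by the client
        self.locked = []          # items received but not yet usable
        self.transactions = {}    # transaction ID -> entry
        self.messages = []        # what would be sent to the client
        self.registry = registry  # name -> entity, replaces the name lookup
        self.db_ok = db_ok        # simulated outcome of every database write
        registry[name] = self

    def write_to_db(self, callback=None):
        # Hypothetical synchronous replacement for the asynchronous write.
        if callback is not None:
            callback(self.db_ok)

    # --- Alice's side ---
    def give(self, item, peer_name):
        tid = next(_transaction_ids)
        self.inventory.remove(item)
        self.transactions[tid] = {"peer": peer_name, "state": BEGIN, "item": item}
        self.write_to_db(lambda ok: self.on_begin_written(tid, ok))
        return tid

    def on_begin_written(self, tid, ok):
        if ok:
            self.send_to_peer(tid)
        else:
            self.abort(tid, write=False)

    def send_to_peer(self, tid):
        entry = self.transactions[tid]
        peer = self.registry.get(entry["peer"])
        if peer is None:
            self.abort(tid, write=True)  # peer not found: undo and write again
        else:
            peer.receive(entry["item"], tid, self.name, self)

    def abort(self, tid, write):
        entry = self.transactions.pop(tid)
        self.inventory.append(entry["item"])
        self.messages.append("trade %d failed" % tid)
        if write:
            self.write_to_db()

    def on_peer_result(self, ok, peer, tid):
        if ok:
            del self.transactions[tid]
        else:
            self.abort(tid, write=False)
        # Assumption: the peer is notified once this final write calls back.
        self.write_to_db(lambda _ok: peer.on_complete(tid))

    # --- Bob's side ---
    def receive(self, item, tid, sender_name, sender):
        self.locked.append(item)
        self.transactions[tid] = {"peer": sender_name, "state": REMOVE, "item": item}
        self.write_to_db(lambda ok: self.on_receive_written(ok, sender, tid))

    def on_receive_written(self, ok, sender, tid):
        item = self.transactions[tid]["item"]
        self.locked.remove(item)
        if ok:
            self.inventory.append(item)  # now usable by the client
        else:
            del self.transactions[tid]
        sender.on_peer_result(ok, self, tid)

    def on_complete(self, tid):
        # The entry is already gone if this side's own write failed.
        self.transactions.pop(tid, None)
```

In this sketch, creating two players in a shared registry, placing "Sword" in the first player's inventory and calling give("Sword", "Bob") moves the item across and leaves both transaction lists empty. A failed write on either side, or an unknown player name, returns the item to the giver.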
If the CellApp that Alice's and/or Bob's cell entity resides on exits, all cell entities that have base entities will be restored to another CellApp. With the transaction as described, there is little concern about the behaviour of restored cell entities, as the transaction involves only BaseApps.
However, suppose the inventory system were implemented such that the player cell entities required knowledge of items, for example which item a player is holding in its hands. This would need to be an OTHER_CLIENTS or ALL_CLIENTS cell entity property so that other players could see the item that a player is holding. If the cell entity is restored from an older version of its cell entity data, as last backed up to the base entity, the cell entity state could be inconsistent with the base entity state.
For example, if Alice was restored to another CellApp, her cell entity could check with her base entity whether she still owned the item that she was holding, and if not, her cell entity should remove that item.
Cell entities that are restored do not have their __init__() method called; instead, after they are restored with the cell backup data from their base entity, they have their onRestore() method called, and checks such as these can be done in this method to make sure the state is consistent with the base entity state.
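A minimal sketch of such a check follows, written against stand-in classes so that it runs outside the engine. The property name heldItem and the methods checkOwnership and ownershipResult are hypothetical, and the round trip between cell and base, which would be asynchronous on a real server, is modelled here as a direct call.

```python
class BaseStub:
    """Stand-in for the base entity, which holds the authoritative inventory."""

    def __init__(self, inventory):
        self.inventory = set(inventory)
        self.cell = None  # filled in by the cell stand-in below

    def checkOwnership(self, item):
        # Hypothetical base method: reply to the cell entity with the answer.
        self.cell.ownershipResult(item, item in self.inventory)


class PlayerCell:
    """Stand-in for the player cell entity."""

    def __init__(self, base, heldItem=None):
        # Used here only to set up the stub; a restored entity would skip it.
        self.base = base
        self.heldItem = heldItem  # hypothetical property visible to clients
        base.cell = self

    def onRestore(self):
        # The backup may be stale, so ask the base about the held item.
        if self.heldItem is not None:
            self.base.checkOwnership(self.heldItem)

    def ownershipResult(self, item, owned):
        # Drop the item if the base no longer lists it in the inventory.
        if not owned and self.heldItem == item:
            self.heldItem = None
```

If Alice's cell entity was backed up while holding the Sword and the trade completed before the CellApp exited, calling onRestore() on the stand-in clears heldItem; if she still owns the item, nothing changes.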
If the BaseApp that Alice's and/or Bob's base entity resides on exits, those base entities will be restored to other BaseApps, if any exist (with only one BaseApp, they cannot be restored).
As with cell entities, restored base entities do not have __init__() called on them; instead, they have onRestore() called on them when they are each restored from their most recent base entity backup data. This is a good place to do checks on uncompleted transactions.
For example, if the BaseApp that contained Alice exited, and Alice was restored onto another BaseApp (and perhaps Bob was too, and it could be a different BaseApp to where he was), then we need to replay any transactions that may have been underway.
For each transaction entry in Alice's transaction list, the entity needs to replay each transaction depending on the state that it's in.
For example, if it is in the BEGIN state, we resume the transaction from step 3 by looking up Bob's base entity, and continuing on.
If we are Bob, we may have transactions in the REMOVE state, and so we resume the transaction from step 6, and we tell Alice (or whoever the transaction's player name refers to) that they should complete the transaction on their end.
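The replay can be sketched as a dispatch on the stored state of each entry. The class below is a stand-in that only records which action it would take; the names RestoredPlayer, lookUpPeer and informPeer are hypothetical, and the entry layout (a dictionary with "peer" and "state" keys) is an assumption made for this example.

```python
BEGIN, REMOVE = "BEGIN", "REMOVE"  # symbolic transaction states from the text


class RestoredPlayer:
    """Stand-in for a restored base entity holding a transaction list."""

    def __init__(self, transactions):
        self.transactions = dict(transactions)  # transaction ID -> entry
        self.resumed = []  # records the actions this sketch would take

    def onRestore(self):
        # Iterate over a copy, since resuming may remove entries.
        for tid, entry in list(self.transactions.items()):
            if entry["state"] == BEGIN:
                # Giver's side: look up the peer again and carry on.
                self.lookUpPeer(tid, entry["peer"])
            elif entry["state"] == REMOVE:
                # Receiver's side: ask the peer to complete its end.
                self.informPeer(tid, entry["peer"])

    def lookUpPeer(self, tid, peer):
        self.resumed.append(("lookup", tid, peer))

    def informPeer(self, tid, peer):
        self.resumed.append(("inform", tid, peer))
```

A replayed step may repeat a message that was already delivered before the BaseApp exited, so in a design like this the receiving handlers would presumably need to tolerate duplicates, which the transaction ID makes possible to detect.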
When we are starting the server and restoring from the database, the base entities will be restored, and each of these will have __init__() called on them. The variable BigWorld.hasStarted will be False for restored base entities, so we can do similar checks to what we have in the BaseApp fault tolerance section.
It is also the responsibility of the base entities to recreate the cell entities, usually via createCellEntity().
The space ID is archived with the entity when it is written to the database, and this is present in the base entity's cellData dictionary.
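The following sketch shows how the two responsibilities could sit together in __init__(). It runs against stand-ins rather than the engine: FakeBigWorld models only the hasStarted flag, the createCellEntity() defined here merely records the space it would use, and the "spaceID" key and the replayTransactions helper are assumptions made for illustration.

```python
class FakeBigWorld:
    """Stand-in for the BigWorld module; models only the hasStarted flag."""
    hasStarted = False


BigWorld = FakeBigWorld  # in a real script this would be the engine module


class PlayerBase:
    """Stand-in for a player base entity loaded from the database."""

    def __init__(self, cellData, transactions):
        self.cellData = cellData          # archived cell data, with space ID
        self.transactions = transactions  # transaction ID -> entry
        self.replayed = []
        self.cellSpace = None
        if not BigWorld.hasStarted:
            # Restoring from the database at start-up: same checks as after
            # a BaseApp failure, then recreate the cell entity.
            self.replayTransactions()
            self.createCellEntity()

    def replayTransactions(self):
        # Hypothetical helper: only records which transactions need replaying.
        self.replayed = sorted(self.transactions)

    def createCellEntity(self):
        # Stand-in for the engine method; "spaceID" is an assumed key name.
        self.cellSpace = self.cellData["spaceID"]
```

An entity created while hasStarted is False replays its transaction list and recreates its cell entity in the archived space. An entity created after the server has started skips both steps in this sketch.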