Multi-datacenter deployment has been a hot topic at WSO2, as many of our users and customers ask how they can deploy WSO2 API Manager across multiple data centers. This blog post elaborates on the technical details and internals of the multi-data-center (multi-dc) deployment architecture.
Caching, clustering, deployment synchronization, worker-manager separation, API subscription/metadata sharing and traffic routing are some of the important areas engineers and devops have to focus on at the deployment stage. Each of these aspects will be explained in detail to provide more clarity.
Caching and clustering
The WSO2 platform uses Hazelcast as the clustering framework/engine, which is also a JSR-107 (JCache) provider. The platform has support for L1 and L2 cache, and the L2 cache is implemented as a distributed cache that adds and removes values from a Hazelcast distributed map. More information about WSO2 caching implementation can be found here: http://blog.afkham.org/2013/11/wso2-multi-tenant-cache-jsr-107-jcache.html
When it comes to a multi-dc deployment, WSO2’s recommendation is to set up the cluster local to a data center. This is done mainly to avoid environmental stability issues (such as the cache not getting synced instantaneously) caused by network latencies across geographic locations.
Worker-manager separation follows the traditional setup, carried out per data center: each data center has its own manager node managing the local cluster.
Deployment synchronization in a multi-dc deployment is two-fold: local synchronization between the nodes within a data center, and artifact synchronization across data centers.
In this step, one of the data centers needs to be selected as the master. Even though there is a manager node within each data center, only one manager is configured to check in artifacts (<AutoCommit>true</AutoCommit>); the other managers will not commit anything. The data center with this privileged manager node is the master data center.
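The commit rule above can be written down as a small decision function. This is only a sketch: the names DepSyncFlags and depsync_flags are hypothetical, and the auto_checkout flag (standing for an <AutoCheckout> setting next to <AutoCommit>) is an assumption that every node checks artifacts out; the text above only states the <AutoCommit> rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DepSyncFlags:
    auto_commit: bool    # corresponds to <AutoCommit> in the text above
    auto_checkout: bool  # assumed companion setting, not stated in the post


def depsync_flags(is_manager: bool, in_master_dc: bool) -> DepSyncFlags:
    """Return the synchronizer flags for a node, given its role and location."""
    # Only the manager node in the master data center checks artifacts in.
    auto_commit = is_manager and in_master_dc
    # Assumption: every node, manager or worker, checks artifacts out.
    return DepSyncFlags(auto_commit=auto_commit, auto_checkout=True)


if __name__ == "__main__":
    # Print the flags for all four role/location combinations.
    for is_manager in (True, False):
        for in_master_dc in (True, False):
            print(is_manager, in_master_dc, depsync_flags(is_manager, in_master_dc))
```

Exactly one combination (manager in the master data center) yields auto_commit=True, which matches the single privileged manager described above.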
We also have to set up an SVN repository at each data center. Only the master will have a read/write repository; the others will be read-only. There needs to be some mechanism (there are plenty of tools for this task) to synchronize these SVN repositories in a unidirectional manner, from the master to the others.
Once new APIs are added from the master data center’s manager, it will take some time for them to synchronize across data centers, because no cluster message is broadcast across DCs to trigger an update. The nodes in the slave data centers will therefore be eventually consistent (artifacts are polled periodically from the SVN). If required, this can be expedited by reducing the polling interval and the SVN sync delay.
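As a rough illustration of how the two delays add up, the following sketch estimates an upper bound on how long a slave data center node may lag behind the master. The function name and parameters are hypothetical, and the model is simplified: it assumes the artifact is already committed to the master repository and ignores checkout and deployment time.

```python
def worst_case_propagation_seconds(svn_sync_delay_s: float,
                                   polling_interval_s: float) -> float:
    """Upper bound on the lag before a slave DC node picks up a new artifact.

    svn_sync_delay_s: longest time for the master SVN repository to be
        replicated to the slave data center's read-only repository.
    polling_interval_s: how often a node polls its local SVN repository.
    """
    if svn_sync_delay_s < 0 or polling_interval_s < 0:
        raise ValueError("delays must be non-negative")
    # Worst case: the artifact lands in the slave repository just after a
    # poll, so the node waits one full polling interval on top of the sync.
    return svn_sync_delay_s + polling_interval_s


if __name__ == "__main__":
    print(worst_case_propagation_seconds(60.0, 15.0))  # 75.0
```

Reducing either value lowers the bound, which is why both the polling interval and the SVN sync delay are the tuning points mentioned above.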
API publishing will only happen from the master data center; hence, the publisher application will only be deployed at the master.
API subscription and metadata sharing
When an API is published, it has associated metadata such as tags, throttling information, scope information, comments, and ratings. An API also has subscriptions coming in from both data centers, which means OAuth tokens, keys, and secrets. All this information is stored in the WSO2 registry database schema and the API Manager database schema. These schemas need to be replicated across the data centers. Traditionally, tools like Oracle GoldenGate are used for this task, and if the deployment is in EC2, RDS replication can be used.
API traffic routing
Since the data centers will be eventually consistent, enabling session affinity (sticky sessions*) at the load balancer is the best option to avoid intermittent "resource not found" errors. This will also avoid any throttling inconsistencies that can occur in a multi-dc setup.
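To make the idea of session affinity concrete, here is a minimal sketch of hash-based affinity: the same client key is always mapped to the same data center. This is an illustration only, not how any particular load balancer implements sticky sessions (many use cookies instead). The function name, the client key, and the data center names are all hypothetical.

```python
import hashlib
from typing import Sequence


def pick_data_center(client_key: str, data_centers: Sequence[str]) -> str:
    """Deterministically map a client key (e.g. a session ID) to a data center."""
    if not data_centers:
        raise ValueError("at least one data center is required")
    # A stable hash (unlike Python's built-in hash()) gives the same answer
    # across processes and restarts.
    digest = hashlib.sha256(client_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(data_centers)
    return data_centers[index]


if __name__ == "__main__":
    dcs = ["dc-east", "dc-west"]  # hypothetical data center names
    first = pick_data_center("session-1234", dcs)
    second = pick_data_center("session-1234", dcs)
    print(first == second)  # True: repeated requests land on the same DC
```

Because a given client keeps hitting one data center, it never observes the window in which the other data center has not yet received a newly published API.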
Complete deployment architecture