CMX High Availability

I have had the privilege of beta testing the new CMX High Availability version. This post runs through how it was set up and the different options for setting up HA. After testing, I will post my findings on how well HA handles different situations.

For this testing I installed the CMX servers in AWS, as I didn't have on-site VM resources available. Details of how to get CMX into AWS are available here

Setup

CMX high availability requires two servers. One is defined as the primary. This is the server that CMX prefers to keep active and that the virtual IP address initially points to. A primary CMX server is any server installed normally with a node type of Location or Presence selected. Another way to tell whether a CMX server is a primary is whether the system can be logged into. The other server is defined as the secondary. The secondary is synchronized with the primary and is failed over to when the primary has issues. A server can be defined as a secondary during the installation process, or a CLI command can be used to convert a primary to a secondary.

Requirements

Both the primary and secondary instances must be in the same subnet. The following ports must be accessible between the primary and secondary instances:

Ports and descriptions:

  • 6378, 6379, 6380, 6381, 6382, 6383, 6385, 16378, 16379, 16380, 16381, 16382, 16383, 16385: Redis
  • 7000, 7001, 9042: Cassandra database
  • 5432: Postgres database
  • 4242: High availability REST and web service
  • 22: SSH port and used to synchronize files between servers
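As a quick pre-flight check, the port requirements above can be tested from one server against the other. The following is a minimal Python sketch and not part of CMX: the helper names are my own, and it only tests that a TCP connection can be opened. It does not confirm that the right service is listening on each port.

```python
import socket

# TCP ports listed in the requirements above, grouped by service.
HA_PORTS = {
    "Redis": [6378, 6379, 6380, 6381, 6382, 6383, 6385,
              16378, 16379, 16380, 16381, 16382, 16383, 16385],
    "Cassandra": [7000, 7001, 9042],
    "Postgres": [5432],
    "HA REST and web service": [4242],
    "SSH": [22],
}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, etc.
        return False


def unreachable_ports(host, ports_by_service=HA_PORTS, timeout=2.0):
    """Return {service: [closed ports]} for every service with a blocked port."""
    blocked = {}
    for service, ports in ports_by_service.items():
        closed = [p for p in ports if not port_open(host, p, timeout)]
        if closed:
            blocked[service] = closed
    return blocked
```

Run on the primary, `unreachable_ports("10.22.0.8")` would probe the secondary used in the examples in this post. An empty dictionary means every listed port accepted a connection.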

Virtual IP Address

With the HA system in place, users must be redirected after a failover to the new CMX instance running on the secondary. To keep the failover transparent from a network connectivity point of view, CMX uses the concept of a virtual IP (VIP). When both the primary and secondary are in the same subnet, a VIP address mapping is used. In this setup, external systems are exposed to a VIP. This VIP is mapped to the real IP of the running primary CMX. When a failover happens, the VIP is remapped to the address of the secondary CMX. All of this happens automatically, without any human intervention.

Primary Install

Install CMX normally. In the web installer, select the node type of Presence or Location. This type of installation does not require the node type to be specified as primary. The result is considered a standalone server, which can run as a primary.

Instructions for how to install CMX can be found here

Secondary Install

Install CMX as normal until the node type needs to be selected in the web installer (instructions available here). A third option, Secondary, is provided.

[Screenshot: CMXHA1.png]

Selecting this option will configure the system as a secondary and provide a link to the CMX High Availability Admin interface.

[Screenshot: CMXHA2.png]

The CMX High Availability Admin web interface runs on port 4242 of the CMX server and can be accessed at http://cmx_ip_address:4242/. Log in to the web interface using the user ID cmxadmin and the password configured for the cmxadmin user ID during installation. After login, the user interface shows status and configuration information. The role will be shown as Secondary for the system.

Convert existing Primary to Secondary

A server that has already been configured as a primary can be converted to a secondary. Caution should be taken when converting a system to a secondary: the existing data on it will be lost when it is paired with a primary server. The conversion must be done using the cmxha CLI command. The command to run is cmxha secondary convert. Upon confirmation and completion of the command, the system will be converted to a secondary server.

$ cmxha secondary convert
Are you sure you wish to convert the system to a secondary?
[y/N]: y
Stopping all services. This may take a while..
Stopped all services

Enabling High Availability

Once the primary and secondary servers have been prepared, high availability can be enabled, either in the CMX web interface or from the CMX command line. The following options are required to enable high availability:

  • Secondary IP Address: IP address of the secondary server
  • Secondary Password: Password for the cmxadmin account on the secondary server
  • Virtual IP Address: Virtual IP address to be used by the active server
  • Failover Type:
    • Auto failover allows CMX to fail over to the secondary server automatically when a serious issue is detected.
    • Manual failover requires the user to initiate the failover from the web interface or command line. The failure is reported to the user via notifications, but no further action is taken.
  • Notification Email Address: Email address to send notifications about high availability information or issues
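Since the primary and secondary must sit in the same subnet, it can be worth sanity-checking the addresses before enabling HA. A minimal Python sketch follows. The function name and the /24 prefix length are my assumptions for illustration, as is the rule that the VIP must be in the same subnet too; the addresses are the ones used in the CLI examples in this post. CMX performs its own verification when HA is enabled, so treat this only as a pre-check.

```python
import ipaddress


def check_ha_addresses(primary_ip, secondary_ip, virtual_ip, prefix_len):
    """Return a list of problems found with the HA address plan (empty if none)."""
    problems = []
    # Network the primary lives in, e.g. 10.22.0.0/24 for 10.22.0.7 with prefix 24.
    network = ipaddress.ip_interface(f"{primary_ip}/{prefix_len}").network
    primary = ipaddress.ip_address(primary_ip)
    secondary = ipaddress.ip_address(secondary_ip)
    vip = ipaddress.ip_address(virtual_ip)

    if secondary not in network:
        problems.append(f"secondary {secondary} is not in {network}")
    # Assumption: the VIP has to be an address in the same subnet as well.
    if vip not in network:
        problems.append(f"virtual IP {vip} is not in {network}")
    if len({primary, secondary, vip}) != 3:
        problems.append("primary, secondary and virtual IP must all be different")
    return problems


print(check_ha_addresses("10.22.0.7", "10.22.0.8", "10.22.0.9", 24))  # prints []
```

Returning a list of problems rather than raising on the first one makes it easier to fix an address plan in one pass.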

Enable High Availability Web

In CMX navigate to the System tab and click the Settings icon. This will display a modal dialog with a variety of settings in CMX. Select the High Availability option to display the options required to enable High Availability.

[Screenshot: cmxha3]

Click the Enable button when all the options are provided to start enabling high availability. CMX will verify high availability settings and start to enable high availability between the primary and secondary. The web UI will return when the configuration has started successfully.

[Screenshot: cmxha4]

However, high availability is not fully enabled at this point. The initial synchronization of all the data between the primary and secondary servers can take a significant amount of time to complete. The user interface will show the state as Primary Syncing while the synchronization is in progress.

[Screenshot: cmxha5]

When the synchronization has completed successfully, the primary server will enter the state Primary Active.

[Screenshot: cmxha6]

When completed, an information alert will be generated in CMX. In addition, an email alert will be sent indicating the system is active and syncing properly.

[Screenshot: cmxha7]

Enable High Availability Command Line

From the command line, run cmxha config enable. The command will then prompt for each of the high availability options. The same options as in the web interface are used on the command line. Unlike the web interface, the command line waits for the enablement of high availability to complete. Dots are printed to show that the system is making progress. A connection issue occurring while waiting for completion will not affect the enablement of high availability: the process will continue to run and can be monitored later from the web or command line interface.

$ cmxha config enable
Are you sure you wish to enable high availability? [y/N]: y
Please enter secondary IP address: 10.22.0.8
Please enter the cmxadmin user password for secondary:
Please enter the virtual IP address: 10.22.0.9
Please enter failover type [manual|automatic]: automatic
Please enter an email address for high availability notifications: info@thewlan.net
Attempting to configure high availability with server: 10.22.0.8
Attempt to synchronize data from primary to secondary server.
........................
Successfully started high availability. Primary is syncing with secondary.


Manual Failover

WebUI

[Screenshot: cmxha8]

[Screenshot: cmxha9]

Click the Failover button to start failover to the secondary. The web button will gray out at first and finally disappear as the failover proceeds. The failover is complete when the servers enter the Failover Active states.

CLI

From the command line run cmxha failover. The command will prompt for a confirmation and then start the failover to the secondary.

$ cmxha failover
Are you sure you wish to failover to the secondary? [y/N]: y
Starting failover from primary to secondary server: 10.22.0.8
Failover to secondary server has completed successfully

Failback

WebUI

[Screenshot: cmxha10]

[Screenshot: cmxha9]

Click the Failback button to start failback to the primary. The web button will gray out at first and finally disappear as the failback proceeds. The failback will be completed when the servers enter Active states.

CLI

From the command line, run cmxha failback. The command will prompt for a confirmation and then start the failback to the primary.

$ cmxha failback
Are you sure you wish to failback to the primary? [y/N]: y
Starting to failback to primary server from secondary server: 10.22.0.8
Starting to synchronize data from secondary to primary server
........................................................
Completed synchronization of data from secondary to primary server
Starting to synchronize data from primary to secondary server
...................................................
Completed failback to primary server

Disable HA

In its current form, CMX requires HA to be disabled in order to perform an upgrade (which I will cover in another post).

To disable HA from the command line, run cmxha config disable:

[cmxadmin@10.22.0.7]$ cmxha config disable
Are you sure you wish to disable high availability? [y/N]: y
Attempting to disable high availability with server: 10.22.0.8
Successfully disabled high availability. Primary is no longer syncing with secondary.

10 thoughts on “CMX High Availability”

    1. Hi Miniwalks,
      From the information I have been given by Cisco:
      Licensing is currently based on a per-AP licence, the same as for CMX without HA (and I believe it will stay this way). I will ask the CMX BU to confirm whether this is how it is going to stay.
      Limits: Currently the HA pair needs to be in the same subnet, but I believe there are plans to change this to allow each server to be in a separate subnet. This is a feature I have requested, to provide true redundancy.


  1. Vinodh

    Thanks Haydn. This article is very helpful.

    Is it mandatory to match the hardware specs on primary and secondary CMX servers? i.e. RAM, vCPU and HDD should be of same capacity on both the servers in HA.


    1. Hi Vinodh, as far as I know both servers must have the same system specs, but I will confirm and update.
      Another new feature that is about to be released is the ability to use an elastic IP, and potentially your own load balancers, to provide the VIP, enabling you to have each box in a separate subnet. I have yet to be able to test this, though.


  2. Sandy

    Hey Haydan

    Good to see your post mate.

    We are in the process of configuring our CMX in HA. I have the following questions about this HA configuration:
    – I am assuming the database will be synced on both servers automatically.
    – Is there a keepalive or heartbeat running between the two CMX instances? If yes, what's the time frame (a few seconds)?
    – Is there a preempt when the primary server comes back online, or will the secondary stay as primary until a failover?

    Regards
    Sandy


    1. Hi Sandy
      – The DB will be synced automatically when you establish the HA.
      – I believe there is a keep alive/ Heartbeat but not sure on the timing, I have requested this from Cisco and will update when I find out.
      – Currently there is no Preempt, you have to manually fail back. I have asked for this as a feature request so we might see it in future releases.
      On another note, Cisco did mention that they are going to support elastic IPs to enable the HA pair to be in different subnets.


  3. Hi Sandy,

    From Cisco on the HA:
    – The keepalived heartbeat is every 10 seconds.

    On the current master, the following runs:
    – Network checks every 60 seconds
    – Self-checks every 2.5 minutes
    – Replication checks every 5 minutes (non-fatal)

    On the backup server:
    – Network checks every 30 seconds
    – Self-checks every 2.5 minutes
    – Replication checks every 5 minutes

    If the keepalived heartbeat detects a problem, it will move the VIP from the primary to the secondary. During our network check, we check the VIP and if it is not where we expect it to be, then we initiate the failover. So this failover trigger would happen within 60 seconds of detection.
    Checks that run on the master may also initiate a failover and could take up to 2.5 minutes to detect.
    Lastly, the backup server is checking the master and could also trigger the failover. One wrinkle here is that if it detects a problem, it will then “ask” the master when the master’s self-check last ran. If the master has reported healthy within the last 5 minutes, the backup server will not trigger the failover.
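To make the backup server's rule concrete, here is a small Python sketch of how I read it. This is my interpretation of Cisco's description, not CMX code. The function, parameter and constant names are made up for illustration, and the handling of a master that does not answer at all is an assumption on my part.

```python
# Intervals quoted above, in seconds.
HEARTBEAT_INTERVAL = 10
MASTER_NETWORK_CHECK = 60
MASTER_SELF_CHECK = 150        # 2.5 minutes
MASTER_HEALTH_WINDOW = 300     # the "last 5 minutes" the backup asks about


def backup_should_failover(problem_detected, seconds_since_master_healthy):
    """Decide whether the backup triggers a failover.

    seconds_since_master_healthy is None when the master gave no answer.
    """
    if not problem_detected:
        return False
    if seconds_since_master_healthy is None:
        # Assumption: an unreachable master cannot veto the failover.
        return True
    # A master that reported healthy recently vetoes the failover.
    return seconds_since_master_healthy > MASTER_HEALTH_WINDOW
```

For example, `backup_should_failover(True, 120)` returns False, because the master reported healthy two minutes ago, while `backup_should_failover(True, 301)` returns True.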

    No plans for auto fail back once the primary recovers.

    Cheers
    Haydn


  4. Sandy

    Oh, that's great information, Haydn. Thanks for the details on HA. One more question, on virtual boxes: as per Cisco's website, a high-end VMSE can track up to 90,000 unique devices. If I double the resources on a VM, can I track double the devices (180,000 devices)?

