Clusters

  • Cluster Types
  • Supported Services
  • Frontend Servers
  • Cluster Server Configuration
  • Static Clusters
  • Dynamic Clusters
  • Assigning IP Addresses to Shared Domains
  • Security Issues
  • Cluster Configuration Details
  • Cluster Of Clusters
    When your site serves more than 150,000-200,000 accounts, or when you expect very heavy IMAP/WebMail traffic, you should consider using a Cluster configuration.

    If your site serves many domains, you may want to install several independent CommuniGate Pro Servers and distribute the load by distributing domains between the servers. In this case you do not need to employ the special Cluster Support features. However, if you have one or several domains with 100,000 or more accounts each, and you cannot guarantee that clients will always connect to the proper server, or when you need dynamic load balancing and very high availability, you should implement a CommuniGate Pro Cluster on your site.

    Many vendors use the term Cluster for simple fail-over or hot stand-by configurations. The CommuniGate Pro software can be used in fail-over as well as in Distributed Domains configurations; however, these configurations are not referred to as Cluster configurations.

    A CommuniGate Pro Cluster is a set of server computers that handle the site mail load together. Each Cluster Server hosts a set of regular, non-shared domains (the CommuniGate Pro Main Domain is always a non-shared one), and it also serves (together with other Cluster Servers) a set of Shared Domains.

    To use CommuniGate Pro servers in a Cluster, you need a special CommuniGate Pro Cluster License.

    Please read the Scalability section first to learn how to estimate your mail server load and how to get the most out of each CommuniGate Pro Server in your multi-server (Cluster) site.


    Cluster Types

    There are two main types of Cluster configurations: Static and Dynamic.

    Each Account in a Shared Domain served with a Static Cluster is created (hosted) on a certain Server, and only that Server can access the account data directly. When a Static Cluster Server needs to perform any operation with an account hosted on a different Server, it establishes a TCP/IP connection with the account Host Server and accesses account data via that Host Server. This architecture allows you to use local (i.e. non-shared) storage devices for account data.
    Note: some vendors have "Mail Multiplexor"-type products. Those products usually implement a subset of Static Cluster frontend functionality.

    Accounts in Shared Domains served with a Dynamic Cluster are stored on a shared storage, so each Cluster Server (except for frontend Servers, see below) can access the account data directly. At any given moment, one of the Cluster Servers acts as a Cluster Controller synchronizing access to Accounts in Shared Domains. When a Dynamic Cluster Server needs to perform any operation with an account currently opened on a different Server, it establishes a TCP/IP connection with that "current host" Server and accesses account data via that Server. This architecture provides the highest availability (all accounts can be accessed as long as at least one Server is running), and does not require file-locking operations on the storage device.


    Supported Services

    The CommuniGate Pro Clustering features support the following services:

    The WebUser Interface module maintains user sessions even if subsequent page requests come to the Backend Server through different Frontend Servers.


    Frontend Servers

    Clusters of both types are usually equipped with Frontend Servers. Frontend Servers cannot access account data directly - they always open connections to other (Backend) Servers to perform any operation on account data.

    Frontend servers accept TCP/IP connections from client computers (usually - from the Internet). In a pure Frontend-Backend configuration no accounts are created on any Frontend Server, but nothing prohibits you from serving some domains (with accounts and mailing lists) directly on the Frontend servers.

    When a client establishes a connection with one of the Frontend Servers and sends the authentication information (the account name), the Frontend Server detects on which Backend Server the addressed account actually resides, and establishes a connection with that Backend Server.

    The Frontend Servers:

    If the Frontend Servers are directly exposed to the Internet, and the security of a Frontend Server operating system is compromised so that someone gets unauthorized access to that Server OS, the security of the site is not totally compromised. Frontend Servers do not keep any Account information (mailboxes, passwords) on their disks. The "cracker" would then have to go through the firewall and break the security of the Backend Server OS in order to get access to any account information. Since the network between Frontend and Backend Servers can be disabled for all types of communications except the CommuniGate Pro inter-server communications, breaking the Backend Server OS is virtually impossible.

    Both Static and Dynamic Clusters can work without dedicated Frontend Servers. This is called a symmetric configuration, where each Cluster Server implements both Frontend and Backend functions.

    In the example below, the domain1.dom and domain2.dom domain Accounts are distributed between three Static Cluster Servers, and each Server accepts incoming connections for these domains. If the Server SV1 receives a connection for the account kate@domain1.dom located on the Server SV2, the Server SV1 starts to operate as a Frontend Server, connecting to the Server SV2 as the Backend Server hosting the addressed Account.
    At the same time, an external connection established with the server SV2 can request access to the ada@domain1.dom account located on the Server SV1. The Server SV2 acting as a Frontend Server will open a connection to the Server SV1 and will use it as the Backend Server hosting the addressed account.

    In a symmetric configuration, the number of inter-server connections can be equal to the number of external (user) access-type (POP, IMAP, HTTP) connections. For a symmetric Static Cluster, the average number of inter-server connections is M*(N-1)/N, where M is the number of external (user) connections, and N is the number of Servers in the Static Cluster. For a symmetric Dynamic Cluster, the average number of inter-server connections is M*(N-1)/N * A/T, where T is the total number of Accounts in Shared Domains, and A is the average number of Accounts opened on each Server. For large ISP-type and portal-type sites, the A/T ratio is small (usually - not more than 1:100).
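    These estimates can be checked with a few lines of arithmetic. The M, N, A, and T values below are illustrative assumptions, not measurements from a real site.

```python
# Average inter-server connection estimates for symmetric Clusters.
# All numeric values here are illustrative assumptions.

def static_interserver(M, N):
    # A connection must be proxied unless the addressed account happens
    # to be hosted on the Server that received it: probability (N-1)/N.
    return M * (N - 1) / N

def dynamic_interserver(M, N, A, T):
    # Proxying is needed only when the account is already opened on
    # another Server, which happens with probability about A/T.
    return M * (N - 1) / N * (A / T)

M = 9000    # external (user) connections
N = 3       # Servers in the symmetric Cluster
A = 1000    # average accounts opened per Server
T = 300000  # total accounts in Shared Domains

print(static_interserver(M, N))         # 6000.0
print(dynamic_interserver(M, N, A, T))  # approximately 20
```

    As the A/T ratio shrinks, a symmetric Dynamic Cluster proxies far fewer connections than a symmetric Static Cluster of the same size.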

    In a pure Frontend-Backend configuration, the number of inter-server connections is usually the same as the number of external (user) connections: for each external connection, a Frontend Server opens a connection to a Backend Server. A small number of inter-server connections can be opened between Backend Servers, too.

    Withdrawing Frontend Servers from a Cluster

    To remove a Frontend Server from a Cluster (for maintenance, hardware upgrade, etc.), reconfigure your Load Balancer or the round-robin DNS server to stop redirecting incoming requests to this Frontend Server address. After all current POP, IMAP, and SMTP sessions are closed, the Frontend Server can be shut down. Since WebMail sessions do not use persistent HTTP connections, a Frontend Server in a WebMail-only Cluster can be shut down almost immediately.

    Access to all Shared Domain Accounts is provided without interruption as long as at least one Frontend Server is running.

    If a Frontend Server fails, no account becomes unavailable and no mail is lost. While POP and IMAP sessions conducted via the failed Frontend Server are interrupted, all WebUser Interface sessions remain active, and WebUser Interface clients can continue to work via the remaining Frontend Servers. POP and IMAP users can immediately re-establish their connections via the remaining Frontend Servers.


    Cluster Server Configuration

    This section specifies how each CommuniGate Pro Server should be configured to participate in a Static or Dynamic Cluster. These settings control inter-server communications in your Cluster.

    First, install CommuniGate Pro Software on all Servers that will take part in your Cluster. Specify the Main Domain Name for all Cluster Servers. Those names should differ in the first domain name element only:

    back1.isp.dom, back2.isp.dom, front1.isp.dom, front2.isp.dom, etc.
    Remember that Main Domains are never shared, so all these names should be different. You may want to create only the Server administrator accounts in the Main Domains - these accounts can be used to connect to that particular Server and configure its local, Server-specific settings.

    Use the WebAdmin Interface to open the Settings->General->Cluster page on each Backend Server, and enter all Frontend and Backend Server IP addresses. Backend CommuniGate Pro Servers will accept Cluster connections from the specified IP addresses only. If the Frontend Servers use dedicated Network Interface Cards (NICs) to communicate with Backend Servers, specify the IP addresses the Frontend Servers have on that internal network:

    Backend Server Addresses
    Frontend Server Addresses


    Static Clusters

    Shared Domains in a Static Cluster are created in exactly the same manner as regular CommuniGate Pro Domains. Each Server in a Static Cluster contains a subset of all Shared Domain Accounts. As a result, each Shared Domain Account has a "Host Server". Only the Host Server needs access to the Account data, so Static Clusters can use regular, non-shared disk storage. Static Clusters rely on some method that allows each Cluster Server to learn the name of the Host Server for any Shared Domain account. This type of routing can be implemented using a shared Directory Server, in the same way it is implemented for Distributed Domains:

    If your Backend Servers use non-standard port numbers for mail services, change the Backend Server Ports table on the Settings->General->Cluster page:

    Backend       Port   Cache        Backend       Port   Cache
    Cluster:                          SMTP:
    POP:                              IMAP:
    ACAP:                             PWD:
    HTTP User:                        HTTP Admin:

    For example, if your Backend Servers accept WebUser Interface connections not on the port number 8100, but on the standard HTTP port 80, set 80 in the HTTP User field and click the Update button.

    The Cluster and SMTP ports, as well as their Cache values, should be specified for Dynamic Cluster Servers only.

    Backend and Frontend Server Settings

    The CommuniGate Pro Static Cluster setup is an extension of the Distributed Domains configuration.
    Static Clustering
    Static Member Name        Member Address

    If an address is routed to a domain listed in this table, the CommuniGate Pro Server uses its Clustering mechanism to connect to the Backend server at the specified address and performs the requested operations on that Backend server.

    The logical setup of the Backend and Frontend Servers is the same - you simply do not create Shared Domain Accounts on any Frontend Server, but create them on your Backend Servers.

    Computers in a Static Cluster can use different operating systems.

    A complete Frontend-Backend Static Cluster configuration uses Load Balancers and several separate networks:

    In a simplified configuration, you can connect Frontend Servers directly to the Internet, and balance the load using the DNS round-robin mechanism. In this case, it is highly recommended to install a firewall between Frontend and Backend Servers.

    Adding a Server to a Static Cluster

    You can add Frontend and Backend Servers to a Static Cluster at any time.

    To add a Server to a Static Cluster:

    After a new Frontend Server is configured and added to the Static Cluster, reconfigure the Load Balancer or the round-robin DNS server to direct incoming requests to the new Server, too.

    After a new Backend Server is configured and added to the Static Cluster, you can start creating Accounts in its Shared Domains.

    Withdrawing a Server from a Static Cluster

    If you decide to shut down a Static Cluster Backend Server, all Accounts hosted on that Server become unavailable. Incoming messages to unavailable Accounts will be collected in the Frontend Server queues, and they will be delivered as soon as the Backend Server is added back or these Accounts become available on a different Backend Server (see below).

    Backend Failover in a Static Cluster

    If a Backend Server in a Static Cluster is shut down, all Accounts hosted on that Server become unavailable (there is no interruption of service for Accounts hosted on other Backend Servers).

    To restore access to the Accounts hosted on the failed Server, its Account Storage should be connected to any other Backend server. You can either:

    After a sibling Backend Server gets physical access to the Account Storage of the failed Server, you should modify the Directory so that all Servers will contact the new "home" for Accounts in that Storage. This can be done with an LDAP utility that modifies all records in the Domains Subtree that contain the name of the failed Server as the hostServer attribute value. The utility should set the attribute value to the name of the new Host Server, and should add the oldHostServer attribute with the name of the original Host Server. This additional attribute will allow you to restore the hostServer attribute value after the original Host Server is restored and the Account Storage is reconnected to it. If CommuniGate Pro is used as the site Directory Server, 100,000 Directory records can be modified within 1-2 minutes.
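    The two-pass update can be sketched as follows. This is an illustration only: Directory records are modeled as in-memory dictionaries rather than real LDAP entries, and the server names are hypothetical.

```python
# Sketch of the hostServer reassignment described above. Records are
# plain dictionaries standing in for LDAP Directory entries; the server
# names (back1.isp.dom etc.) are hypothetical.

def reassign_host(records, failed_host, new_host):
    """Point every record hosted on failed_host at new_host,
    remembering the original host in oldHostServer."""
    for rec in records:
        if rec.get("hostServer") == failed_host:
            rec["oldHostServer"] = rec["hostServer"]
            rec["hostServer"] = new_host

def restore_host(records, original_host):
    """Undo the reassignment once the original Host Server is back."""
    for rec in records:
        if rec.get("oldHostServer") == original_host:
            rec["hostServer"] = original_host
            del rec["oldHostServer"]

records = [
    {"uid": "kate", "hostServer": "back2.isp.dom"},
    {"uid": "ada",  "hostServer": "back1.isp.dom"},
]
reassign_host(records, "back2.isp.dom", "back1.isp.dom")
# kate is now served by back1.isp.dom; ada is untouched
restore_host(records, "back2.isp.dom")
# kate is back on back2.isp.dom
```

    Against a real Directory, the same two passes would be issued as LDAP modify operations over the Domains Subtree.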


    Dynamic Clusters

    The Static Clusters described above can be used to handle extremely large (practically unlimited) Internet sites, providing 24x7 site access. In the rare case of a Backend Server failure, the Static Cluster continues to operate, and access to accounts on the failed Server can be restored within 2-10 minutes (depending on how easily the disk storage can be reassigned and how fast the Routing tables/Directory can be updated).

    If it is necessary to provide 100% site uptime and 24x7 access to all Accounts even when some of the Backend Servers fail, a Dynamic Cluster should be deployed.

    The main difference between Static and Dynamic Clusters is account hosting. While each account in a Static Cluster has its Host Server, and only that Server can access the Account data directly, all Backend Servers in a Dynamic Cluster can access the Account data directly. The most common method to implement shared Account Storage for a Dynamic Cluster is to employ dedicated File Servers.

    Traditional File-Locking Approach

    Many legacy mail servers can employ file servers for account storage. Since those servers are usually implemented as multi-process systems (under Unix), they use the same synchronization methods in both single-server and multi-server environments: file locks implemented on the Operating System/File System level.

    This method has the following problems:

    In an attempt to decrease the negative effects of file-locking, some legacy mail servers support only the MailDir mailbox format (one file per message) and rely on the "atomic" nature of file directory operations rather than on file-level locks. This approach can theoretically solve some of the outlined problems (in real-life implementations it hardly solves any), but it wastes most of the file server storage: many high-end file servers use 64-KByte blocks for files, while an average mail message is about 4 KBytes, so storing each message in a separate file wastes more than 90% of the file server disk space and overloads the file server's internal file tables. File server performance also declines severely when an application uses many small files instead of a few larger files.
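    The storage-waste figure follows directly from the block and message sizes quoted above:

```python
# Space wasted by one-file-per-message storage, using the figures
# quoted in the text: 64-KByte allocation blocks, ~4-KByte messages.

BLOCK_SIZE = 64 * 1024        # file server allocation block
AVG_MESSAGE_SIZE = 4 * 1024   # average mail message

# Each message occupies a full block, so the unused remainder is wasted.
wasted_fraction = 1 - AVG_MESSAGE_SIZE / BLOCK_SIZE
print(f"{wasted_fraction:.1%} of the allocated space is wasted")  # 93.8%
```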

    While simple clustering based on Operating System/File System multi-access capabilities works fine for Web servers (where the data is not modified too often), it does not work well for Mail servers where the data modification traffic is almost the same as the data retrieval traffic.

    Simple Clustering does not provide any additional value (like Single Service Image), so administering a 10-Server cluster is more difficult than administering 10 independent Servers.

    The CommuniGate Pro software supports the External INBOX feature, so file-based clustering can be implemented with CommuniGate Pro, too. But because of the problems outlined above, it is highly recommended to avoid this type of solution and use a real CommuniGate Pro Dynamic Cluster instead.

    Cluster Controller

    CommuniGate Pro Servers in a Dynamic Cluster do not use Operating System/File System locks to synchronize Account access operations. As in a Static Cluster, only one Server in a Dynamic Cluster has direct access to a given Account at any given moment. All other Servers work through that Server when they want to access the same Account. But this assignment is not static: any Server can open any Account directly if that Account is not opened by some other Server.

    This architecture provides the maximum uptime: if a Backend Server fails, all Accounts can be accessed via other Backend Servers - without any manual operator intervention, and without any downtime. The site continues to operate and provide access to all its Accounts as long as at least one Backend Server is running.

    One of the Backend Servers in a Dynamic Cluster acts as the Cluster Controller. It synchronizes all other Servers in the Cluster and executes operations such as creating Shared Domains, creating and removing accounts in the shared domains, etc. The Cluster Controller also provides the Single Service Image functionality: not only a site user, but also a site administrator can connect to any Server in the Dynamic Cluster and perform any Account operation (even if the Account is currently opened on a different Server), as well as any Domain-level operations (like Domain Settings modification), and all modifications will be automatically propagated to all Cluster Servers.

    Note: most of the Domain-level update operations, such as updating Domain Settings, Default Account Settings, WebUser Interface Settings, and Domain-Level Alerts may take up to 30 seconds to propagate to all Servers in the Cluster. Account-Level modifications come into effect on all Servers immediately.

    The Cluster Controller collects load level information from the Backend Servers. When a Frontend Server receives a session request for an Account not currently opened on any Backend Server, the Controller directs the Frontend Server to the least loaded Backend Server. This second-level load balancing for Backend Servers is based on actual load levels, and it supplements the basic first-level Frontend load balancing (DNS round-robin or traffic-based).
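    The Controller's decision can be sketched as follows; the data structures and server names are illustrative, not the actual inter-server protocol.

```python
# Sketch of the second-level balancing described above: a request for an
# already-opened Account goes to its current Server; otherwise the least
# loaded Backend Server is chosen. Names and structures are illustrative.

def pick_backend(account, open_accounts, loads):
    """open_accounts maps account name -> Server currently holding it;
    loads maps Server name -> current load level."""
    if account in open_accounts:
        return open_accounts[account]      # already open: use that Server
    server = min(loads, key=loads.get)     # otherwise: least loaded
    open_accounts[account] = server
    return server

loads = {"back1.isp.dom": 40, "back2.isp.dom": 15}
opened = {}
print(pick_backend("kate@domain1.dom", opened, loads))  # back2.isp.dom
print(pick_backend("kate@domain1.dom", opened, loads))  # back2.isp.dom (sticky)
```

    Note how an Account stays with its current Server for as long as it is open, even if that Server's load later rises; only newly opened Accounts are placed by load level.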

    While the Dynamic Cluster can maintain a Directory with Account records, the Dynamic Cluster functionality does not rely on the Directory. If the Directory is used, it should be implemented as a Shared Directory.

    A complete Frontend-Backend Dynamic Cluster configuration uses Load Balancers and several separate networks:

    Since all Backend Servers in a Dynamic Cluster have direct access to Account data, they should run operating systems that use the same EOL (end-of-line) conventions. This means that all Backend Servers should either run the same or different flavors of the Unix OS, or they should all run the same or different flavors of the MS Windows OS. Frontend Servers do not have direct access to the Account data, so you can use any OS for your Frontend Servers (for example, a site can use the Solaris OS for Backend Servers and Microsoft Windows 2000 for Frontend Servers).

    Backend Server Settings

    Use the WebAdmin Interface of the first Backend Server to verify that the Cluster Controller is running. Open the Domains page to check that:

    Use the Create Shared Domain button to create additional Shared Domains to be served with the Dynamic Cluster.

    When the Cluster Controller is running, the site can start serving clients (if you do not use Frontend Servers). If your configuration employs Frontend servers, at least one Frontend Server should be started (see below).

    Adding a Backend Server to a Dynamic Cluster

    Additional Backend Servers can be added to the Cluster at any moment. They should be pre-configured in exactly the same way as the first Backend Server was configured.

    To add a Backend Server to your Dynamic Cluster, start it with the --ClusterMember address Command Line option (it can be added to the CommuniGatePro startup script). The address parameter should specify the IP address of the current Cluster Controller Server.

    Use the WebAdmin interface to verify that the Backend Server is running. Use the Domains page to check that all Shared Domains are visible and that you can administer Accounts in the Shared Domains.

    When the Cluster Controller and at least one Backend Server are running, they both can serve all accounts in the Shared Domains. If you do not use Frontend Servers, load-balancing should be implemented using a regular load-balancer switch, DNS round-robin, or similar technique that distributes incoming requests between all Backend Servers.

    Adding a Frontend Server to a Dynamic Cluster

    You can add additional Frontend servers to the Cluster at any moment.

    Install and configure the CommuniGate Pro software on a Frontend Server computer. Since Frontend Servers do not access Account data directly, there is no need to make the SharedDomains file directory available ("mounted" or "mapped") to any Frontend Server.

    To add a Frontend Server to your Dynamic Cluster, start it with the --ClusterFrontend address Command Line option (it can be added to the CommuniGatePro startup script). The address parameter should specify the IP address of the current Cluster Controller Server.

    Use the WebAdmin interface to verify that the Frontend Server is running. Use the Domains page to check that all Shared Domains are visible.

    When Frontend Servers try to open one of the Shared Domain accounts, the Controller directs them to one of the running Backend Servers, distributing the load between all available Backend Servers.

    Withdrawing Servers from a Dynamic Cluster

    If a Backend Server fails, all Shared Domain Accounts that were open on that Server at the time of failure become unavailable. They become available again within 10-20 seconds, when the Cluster Controller detects the failure. A Backend Server failure does not cause any data loss.


    Assigning IP Addresses to Shared Domains

    A CommuniGate Pro Cluster can serve several Shared Domains. If you plan to provide POP and IMAP access to Accounts in those Domains, you may want to assign dedicated IP addresses to those Domains to simplify client mailer setups. See the Access section for more details.

    If you use Frontend Servers, only Frontend Servers should have dedicated IP Addresses for Shared Domains. Inter-server communications always use full account names (accountname@domainname), so there is no need to dedicate IP Addresses to Shared Domains on Backend Servers.

    If you use the DNS round-robin mechanisms to distribute the site load, you need to assign N IP addresses to each Shared Domain that needs dedicated IP addresses, where N is the number of your Frontend Servers. Configure the DNS Server to return these addresses in the round-robin manner:

    In this example, the Cluster is serving two Shared Domains: domain1.dom and domain2.dom, and the Cluster has three Frontend Servers. Three IP addresses are assigned to each domain name in the DNS server tables, and the DNS server returns all three addresses when a client requests the A-records for one of these domain names. Each time, the DNS server "rotates" the order of the IP addresses in its responses, implementing the DNS "round-robin" load balancing (client applications usually use the first address in the DNS server response, and use other addresses only if an attempt to establish a TCP/IP connection with the first address fails).
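    As an illustration, the A-records for such a setup might look like the BIND-style zone fragment below. The 203.0.113.x addresses are placeholders (the documentation address range), and most DNS servers rotate the order of a multi-address record set by default.

```
domain1.dom.   IN  A   203.0.113.11
domain1.dom.   IN  A   203.0.113.12
domain1.dom.   IN  A   203.0.113.13

domain2.dom.   IN  A   203.0.113.21
domain2.dom.   IN  A   203.0.113.22
domain2.dom.   IN  A   203.0.113.23
```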

    When configuring these Shared Domains in your CommuniGate Pro Servers, you assign all three IP addresses to each Domain.

    If you use a Load Balancer to distribute the site load, you need to assign only one IP address to each Shared Domain. You assign a unique IP address (in your internal LAN address range) to each Shared Domain on each Frontend Server:

    In this example, the Cluster is serving two Shared Domains: domain1.dom and domain2.dom, and the Cluster has three Frontend Servers. One IP address is assigned to each Shared Domain in the DNS server tables, and those addresses are external (Internet) addresses of your Load Balancer. You should instruct the Load Balancer to distribute connections received on each of its external IP addresses to three internal IP addresses - the addresses assigned to your Frontend Servers.

    When configuring these Shared Domains in your CommuniGate Pro Servers, you assign these three internal IP addresses to each Domain.

    DNS MX-records for Shared Domains can point to their A-records.


    Security Issues

    The Frontend-Backend topology allows you to protect the site information and Backend Servers not only if one of the Frontend Servers crashes under some type of network attack, but even if the Frontend Server OS is "cracked" and an intruder gets complete ("root") access to the Frontend Server OS using a security hole in that OS.

    To protect the site from these "cracks":

    These measures do not cause any problems for users who have domain administrator rights and want to administer their Shared Domains (using the WebAdmin Interface or CLI). They also do not cause any problems for regular users who want to use the PWD module to update their passwords.


    Cluster Configuration Details

    SMTP

    The outgoing mail traffic generated with regular (POP/IMAP) clients is submitted to the site using the A-records of the site Domains. As a result, the submitted messages go to the Frontend Servers and the messages are distributed from there.

    Messages generated with WebUser clients and messages generated automatically (using the Automated Rules) are created on the Backend Servers. Since the Backend Servers are usually behind the firewall, and since you usually do not want the Backend Servers to spend their resources maintaining SMTP queues, it is recommended to use the forwarding feature of the CommuniGate Pro SMTP module. Select the Forward to option and specify the domain name that resolves into the IP addresses of all (or some) Frontend Servers. In this case all mail generated on a Backend Server will be quickly sent to the Frontend Servers and distributed from there.


    Cluster Of Clusters

    For extremely large sites (more than 5,000,000 active accounts), you can deploy a Static Cluster of Dynamic Clusters. It is essentially the same as a regular Static Cluster with Frontend Servers, but instead of Backend Servers you install Dynamic Clusters. This solves the redundancy problem of Static Clusters, but does not require the extremely large Shared Storage devices and the excessive network traffic of extra-large Dynamic Clusters:
    Frontend Servers in such a super-cluster need access to the Directory in order to implement Static Clustering. The Frontend Servers only read information from the Directory, while the Backend Servers modify the Directory when accounts are added, renamed, or removed. The hostServer attribute of Account directory records contains the name of the Backend Dynamic Cluster hosting the Account (the name is supplied by the current Dynamic Cluster Controller, since all Account add/rename/remove operations are handled by the Controller).

    Frontend Servers can be grouped into subsets for traffic segmentation. Each subset can have its own load balancer(s), and a switch that connects this Frontend Subset with every Backend Dynamic Cluster.

    If you plan to deploy many (50 and more) Frontend Servers, the Directory Server itself can become the main site bottleneck. To remove this bottleneck and to provide redundancy on the Directory level, you can deploy several Directory Servers (each serving one or several Frontend subsets). Backend Dynamic Clusters can be configured to update only one "Master" Directory Server, and other Directory Servers can use replication mechanisms to synchronize with the Master Directory Server; alternatively, the Backend Clusters can be configured to modify all Directory Servers at once.


    CommuniGate® Pro Guide. Copyright © 1998-2000, Stalker Software, Inc.