
I need to set up the IP addressing for a cluster of application servers so that it can connect to a cluster of database servers. With a one-to-one mapping of application server to database server this is easy, but in a clustered environment I am not sure how to configure the application servers to connect to more than one database server, especially in an active/passive setup. I am confused about how the application server knows which database server is currently active, and when it should switch to the other database server because the master is down. Can somebody point me in the right direction?

This is the configuration of the cluster/HA setup:

  • I have a layer 2 switch that is connected to the datacenter's router.
  • I have 2 application servers and 2 database servers.
  • The application servers and the database servers each form their own cluster, running Debian with Corosync, Pacemaker and DRBD.
  • Both the application server cluster and the database server cluster are connected directly to the switch.
  • Between the application/database servers, I use a dedicated NIC for Corosync to monitor the heartbeat.
  • The applications running on my application servers are just APIs written with CodeIgniter.
  • My databases are PostgreSQL and MongoDB.

Network Infrastructure Design

kasperd
John
  • Please see my answer, and let me know if this helps or you're in the need of additional information / details. – gxx Nov 12 '15 at 07:09

1 Answer


I can only speak for PostgreSQL, as I lack knowledge of MongoDB:

  • I wouldn't use DRBD for PostgreSQL.
  • Instead, I would strongly recommend using the pgsql OCF resource agent.
  • This allows you to set up an active/passive cluster using PostgreSQL synchronous streaming replication, which has been available since version 9.1.
  • Add a virtual/floating IP to the cluster and, using colocation constraints, force it to run on the node that currently holds the PostgreSQL master role.
  • Point your application servers to this virtual/floating IP.
  • However, this approach also has a downside: after a fail-over, you might have to copy data from the "now-master" to the "now-slave" (to get the cluster back into a fully working state). This isn't such a big deal, because

    • fail-over doesn't happen that often (normally),
    • you can use a handy tool for that (pg_basebackup),
    • depending on how long replication between the two nodes was interrupted and how much data was written to or deleted from the database during that time, PostgreSQL will catch up automatically once the failed node is brought back into operation.
  • If you implement this approach, I would recommend monitoring the replication state and some of its details (automatically), for example via `su - postgres -c "psql -c \"SELECT application_name, client_addr, client_hostname, sync_state, state, sync_priority, replay_location FROM pg_stat_replication;\""`, which gives something similar to

    application_name    | client_addr | client_hostname | sync_state |   state   | sync_priority | replay_location
    --------------------+-------------+-----------------+------------+-----------+---------------+-----------------
    node2.example.com   | 10.0.15.21  |                 | sync       | streaming |             0 | 0/40000C8
    

    so you are notified if things start to look fishy.

  • Disclosure: I've been using this in production for about two years.
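To make the floating-IP point above more concrete, here is a rough sketch in crm shell syntax. It is an illustration under assumptions, not a tested configuration: the address 10.0.15.100/24, the resource name vip-master and the constraint names are made up, and msPostgresql is assumed to be the name of an already configured master/slave resource built on the pgsql resource agent.

```shell
# Hypothetical floating IP managed by the IPaddr2 resource agent.
crm configure primitive vip-master ocf:heartbeat:IPaddr2 \
    params ip="10.0.15.100" cidr_netmask="24" \
    op monitor interval="10s"

# Keep the IP on whichever node currently holds the PostgreSQL master role
# (msPostgresql is an assumed name for the existing master/slave resource).
crm configure colocation vip-with-pg-master inf: vip-master msPostgresql:Master

# Bring the IP up only after PostgreSQL has been promoted on that node.
crm configure order pg-promote-before-vip inf: msPostgresql:promote vip-master:start symmetrical=false
```

The application servers would then use 10.0.15.100 (or whatever address you pick) as their only PostgreSQL host, so a fail-over needs no change on the application side.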



Edit: Although this tutorial uses Fedora 19, it might be of interest and help to you as well.
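The replication check recommended above could be automated along these lines. This is only a sketch: the function name check_replication and the cron/logger usage in the comments are my own assumptions, and it only looks at sync_state and state, nothing else.

```shell
#!/bin/sh
# Succeeds only if stdin contains at least one standby row that is both
# synchronous and streaming. Expected row format (psql -At output):
#   application_name|sync_state|state
check_replication() {
    awk -F'|' '$2 == "sync" && $3 == "streaming" { ok = 1 } END { exit !ok }'
}

# Hypothetical usage on the current master, run as the postgres user from cron:
#   psql -At -c "SELECT application_name, sync_state, state FROM pg_stat_replication;" \
#       | check_replication || logger -p daemon.err "PostgreSQL sync replication degraded"
```

An empty result (no standby connected at all) is treated as a failure too, which is usually what you want to be alerted about.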

gxx
  • Thanks @gf_, your explanation, especially of the floating IP, pointed me in the right direction. For MongoDB, I'm still confused after googling. It seems difficult to find an example or a method like OCF for MongoDB. – John Nov 12 '15 at 09:09
  • @John: Glad to hear. Regarding `MongoDB`: You could try to write an OCF resource agent yourself. Have a look at [More About OCF Resource Agents](http://clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ap-ocf.html). – gxx Nov 12 '15 at 10:02
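
As a starting point for such a self-written agent, a very reduced skeleton might look like the following. Treat it as a sketch, not a usable agent: the pidfile and config paths are hypothetical defaults, it assumes mongod's config file sets a log destination (needed for --fork), and it leaves out the meta-data and validate-all actions that a real OCF agent has to implement.

```shell
#!/bin/sh
# Reduced OCF-style resource agent sketch for mongod (NOT production-ready).
# Hypothetical defaults; Pacemaker passes parameters as OCF_RESKEY_* variables.
: "${OCF_RESKEY_pidfile:=/var/run/mongodb/mongod.pid}"
: "${OCF_RESKEY_config:=/etc/mongod.conf}"

# Exit codes defined by the OCF resource agent API.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

mongod_monitor() {
    # "Running" here means: the pidfile exists and that process is alive.
    [ -r "$OCF_RESKEY_pidfile" ] || return $OCF_NOT_RUNNING
    pid=$(cat "$OCF_RESKEY_pidfile")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null || return $OCF_NOT_RUNNING
    return $OCF_SUCCESS
}

mongod_start() {
    mongod_monitor && return $OCF_SUCCESS      # already running
    mongod --config "$OCF_RESKEY_config" --fork \
        --pidfilepath "$OCF_RESKEY_pidfile" || return $OCF_ERR_GENERIC
    mongod_monitor
}

mongod_stop() {
    mongod_monitor || return $OCF_SUCCESS      # already stopped
    kill "$(cat "$OCF_RESKEY_pidfile")" || return $OCF_ERR_GENERIC
    # Wait for a clean shutdown; Pacemaker's operation timeout bounds this loop.
    while mongod_monitor; do sleep 1; done
    return $OCF_SUCCESS
}

mongod_agent() {
    case "$1" in
        start)   mongod_start ;;
        stop)    mongod_stop ;;
        monitor) mongod_monitor ;;
        *)       return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}

# Pacemaker calls the agent with the action as first argument.
[ $# -eq 0 ] || { mongod_agent "$@"; exit $?; }
```

Combined with a floating IP colocated with this resource, that would give a plain active/passive MongoDB; it does not handle data replication between the nodes, which would still have to come from DRBD or from MongoDB itself.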