Our company recently needed to replicate data across machine rooms in different locations. MySQL natively supports only dual-master replication, so we had to look elsewhere, which led us to Percona XtraDB Cluster. You can think of it as MySQL with a patch applied to support multi-master synchronization.

Test environment: CentOS 6.5

IP allocation:

  1. 192.168.0.154 (DB)
  2. 192.168.0.152 (DB)
  3. 192.168.0.153 (DB)
  4. 192.168.0.151 (HA)

First, install the Percona XtraDB Cluster repository:
yum install http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm

Then install the packages:
yum install Percona-XtraDB-Cluster-56

After installation, edit /etc/my.cnf:

[mysqld]
server_id=4
datadir=/var/lib/mysql
user=mysql
# Path to Galera library
wsrep_provider=/usr/lib64/libgalera_smm.so
# Cluster connection URL contains the IPs of node#1, node#2 and node#3 (all nodes in the cluster)
wsrep_cluster_address=gcomm://192.168.0.152,192.168.0.154,192.168.0.153
# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB
# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node #1 address (this node's own IP)
wsrep_node_address=192.168.0.154
# SST method (how nodes transfer state to each other)
wsrep_sst_method=xtrabackup-v2
# Cluster name
wsrep_cluster_name=my_centos_cluster
# Authentication for SST (the account used for inter-node state transfer)
wsrep_sst_auth="root:asdasd"
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

[mysqld_safe]
log-error=/var/log/mysqld.log

Note: each node needs its own server_id and wsrep_node_address.

Then, on 192.168.0.152, run /etc/init.d/mysql bootstrap-pxc

Some articles online say you must set wsrep_cluster_address=gcomm:// (empty) to bootstrap the first node; recent versions no longer need that — the bootstrap-pxc command above initializes the cluster. The result:

[root@localhost ~]# /etc/init.d/mysql bootstrap-pxc
Bootstrapping PXC (Percona XtraDB Cluster)
Starting MySQL (Percona XtraDB Cluster).. SUCCESS!

Next, set up the account configured in wsrep_sst_auth for inter-node synchronization. Since this is a test environment, I lazily went with root:

mysqladmin -u root password asdasd
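For anything beyond a test setup you would create a dedicated SST account instead of reusing root. A minimal sketch — the sstuser name and password are placeholders of my own, and the privilege list follows the Percona documentation for xtrabackup-based SST:

```shell
# Create a dedicated account for xtrabackup-based SST
mysql -u root -p <<'SQL'
CREATE USER 'sstuser'@'localhost' IDENTIFIED BY 's3cretPass';
GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost';
FLUSH PRIVILEGES;
SQL
```

Then set wsrep_sst_auth="sstuser:s3cretPass" in /etc/my.cnf on every node.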

From the mysql client you can inspect the current cluster state:

mysql> show status like 'wsrep%';
+----------------------------+--------------------------------------+
| Variable_name              | Value                                |
+----------------------------+--------------------------------------+
| wsrep_local_state_uuid     | c2883338-834d-11e2-0800-03c9c68e41ec |
...
| wsrep_local_state          | 4                                    |
| wsrep_local_state_comment  | Synced                               |
...
| wsrep_incoming_addresses   | 192.168.0.152:3306                   |
| wsrep_cluster_size         | 1                                    |
| wsrep_cluster_status       | Primary                              |
| wsrep_connected            | ON                                   |
...
| wsrep_ready                | ON                                   |
+----------------------------+--------------------------------------+

Once the first node is up, start the remaining nodes. Note that on these nodes the command is:

/etc/init.d/mysql start

Normally they come up within seconds. If startup hangs for a long time and then fails like this:

Shutting down MySQL (Percona XtraDB Cluster)..... SUCCESS!
Starting MySQL (Percona XtraDB Cluster)........................................ ERROR!
ERROR! MySQL (Percona XtraDB Cluster) server startup failed!
ERROR! Failed to restart server.

but the log shows no relevant errors, then check that SELinux is disabled and that firewall ports 4444 and 4567 are open!!! (I forgot about the firewall and was stuck on this for quite a while.)
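On CentOS 6 the quick fix looks something like the following. This is a sketch: port 4568 (used by Galera for incremental state transfer) is opened as well, and making the SELinux change permanent (/etc/selinux/config) or persisting the rules in your own firewall setup is left to you:

```shell
# Temporarily switch SELinux to permissive mode
setenforce 0

# Open the ports Galera needs: 3306 (MySQL), 4567 (group communication),
# 4444 (SST), 4568 (IST)
for port in 3306 4444 4567 4568; do
    iptables -I INPUT -p tcp --dport $port -j ACCEPT
done
service iptables save
```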

A successful start looks like this:

[root@test4 ~]# service mysql start
Starting MySQL (Percona XtraDB Cluster)...... SUCCESS!
[root@test4 ~]# netstat -anp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 963/sshd
tcp 0 0 0.0.0.0:4567 0.0.0.0:* LISTEN 5184/mysqld
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 1041/master
tcp 0 0 192.168.0.154:22 192.168.0.37:38500 ESTABLISHED 1080/sshd
tcp 0 0 192.168.0.154:4567 192.168.0.153:53681 ESTABLISHED 5184/mysqld
tcp 0 0 192.168.0.154:59348 192.168.0.152:4567 ESTABLISHED 5184/mysqld

As you can see, port 4567 is also listening. From this point on, a database operation on any one machine is automatically replicated to the other two.
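A quick smoke test, run against two different nodes (the demo database and table names are arbitrary):

```shell
# On 192.168.0.154: create a table and insert a row
mysql -u root -pasdasd -e "CREATE DATABASE IF NOT EXISTS demo;
    CREATE TABLE IF NOT EXISTS demo.t1 (id INT PRIMARY KEY, v VARCHAR(20)) ENGINE=InnoDB;
    INSERT INTO demo.t1 VALUES (1, 'hello');"

# On 192.168.0.153: the row should already be there
mysql -u root -pasdasd -e "SELECT * FROM demo.t1;"
```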

Next I manually shut down 152 and inserted data on 154: 153 stayed in sync, and 152 was automatically dropped from the cluster:

| wsrep_incoming_addresses | 192.168.0.154:3306,192.168.0.153:3306 |

After starting 152 again, its data caught up automatically.

If you are very, very unlucky and every node in the cluster goes down, run the bootstrap-pxc command on the node that failed last once it is repaired — that way you salvage as much data as possible.
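To identify the node that failed last, compare the seqno recorded in each node's grastate.dat: the node with the highest value holds the most recent data. A self-contained sketch with sample file contents (the real file lives in the datadir, /var/lib/mysql/grastate.dat; a seqno of -1 means the node shut down uncleanly and the value cannot be trusted):

```shell
# Sample grastate.dat (values are illustrative)
cat > /tmp/grastate.dat <<'EOF'
# GALERA saved state
version: 2.1
uuid:    c2883338-834d-11e2-0800-03c9c68e41ec
seqno:   1534
cert_index:
EOF

# Extract the seqno; bootstrap the node with the highest one
awk '/^seqno:/ {print $2}' /tmp/grastate.dat
# → 1534
```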

If a restart fails with:

[root@test3 ~]# service mysql start
ERROR! MySQL (Percona XtraDB Cluster) is not running, but lock file (/var/lock/subsys/mysql) exists
Stale sst_in_progress file in datadir
Starting MySQL (Percona XtraDB Cluster)State transfer in progress, setting sleep higher
.. ERROR! The server quit without updating PID file (/var/lib/mysql/test3.pid).
ERROR! MySQL (Percona XtraDB Cluster) server startup failed!

just delete /var/lock/subsys/mysql. One more caveat: tables must use the InnoDB engine rather than MyISAM, otherwise the table structure replicates but the data does not.
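To hunt down offending MyISAM tables and convert them, something like this works (the demo.t1 table in the ALTER is just an example name; back up before altering):

```shell
# List user tables still on MyISAM
mysql -u root -pasdasd -e "SELECT table_schema, table_name
    FROM information_schema.tables
    WHERE engine = 'MyISAM'
      AND table_schema NOT IN ('mysql','information_schema','performance_schema');"

# Convert one of them to InnoDB
mysql -u root -pasdasd -e "ALTER TABLE demo.t1 ENGINE=InnoDB;"
```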

Install HAProxy: yum install haproxy

Edit its configuration file, /etc/haproxy/haproxy.cfg:

#---------------------------------------------------------------------
# Example configuration for a possible web application. See the
# full configuration options online.
#   http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events. This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #    file. A line like the following can be added to
    #    /etc/sysconfig/syslog
    #
    #    local2.*    /var/log/haproxy.log
    #
    log         127.0.0.1 local0
    log         127.0.0.1 local1 notice
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  tcplog
    option                  dontlognull
#   option                  http-server-close
#   option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    maxconn                 2000
    timeout connect         5s
    timeout client          50s
    timeout server          50s
#   timeout http-keep-alive 10s
    timeout check           10s

listen mysql-cluster 0.0.0.0:3306
    mode tcp
    balance roundrobin
    server node1 192.168.0.152:3306 check
    server node2 192.168.0.153:3306 check
    server node3 192.168.0.154:3306 check

listen status 192.168.0.151:8080
    stats enable
    stats uri /status
    stats auth admin:admin
    stats realm (haproxy\ statistic)

After starting the service, browse to 192.168.0.151:8080/status and log in to see the stats page. Clients access the database through 192.168.0.151:3306.
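You can verify that round-robin balancing works by asking for @@server_id through the proxy a few times; since every node is configured with a different server_id, the value should rotate across connections:

```shell
for i in 1 2 3; do
    mysql -h 192.168.0.151 -P 3306 -u root -pasdasd -e "SELECT @@server_id;"
done
```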

---- Update 2015-01-20 ----

The HA configuration above only monitors at layer 4. In other words, if port 3306 is open but MySQL is not actually serving requests, HAProxy cannot tell. To simulate this I stopped MySQL on one node and listened on port 3306 with nc — HAProxy was successfully fooled. To handle this case we need to monitor at the application layer.
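For reference, the simulation goes like this (nc option syntax varies between netcat builds; some require -p before the port):

```shell
# On one DB node: stop MySQL, then fake the port with netcat
service mysql stop
nc -l 3306

# HAProxy's layer-4 check still sees the port as open and keeps
# routing connections to the dead node
```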

First, install xinetd on each DB node: yum install -y xinetd

Then edit /etc/services and add:

mysqlchk 9200/tcp
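The PXC packages ship an xinetd service definition at /etc/xinetd.d/mysqlchk that wires port 9200 to the clustercheck script. If yours is missing, it looks roughly like this (a sketch based on the stock file):

```
# /etc/xinetd.d/mysqlchk
service mysqlchk
{
        disable         = no
        flags           = REUSE
        socket_type     = stream
        port            = 9200
        wait            = no
        user            = nobody
        server          = /usr/bin/clustercheck
        log_on_failure  += USERID
        only_from       = 0.0.0.0/0
        per_source      = UNLIMITED
}
```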

Then edit /usr/bin/clustercheck and set:

MYSQL_USERNAME="${1-root}"
MYSQL_PASSWORD="${2-asdasd}"

Again I lazily used root here; adjust to your own setup. Save, then start the xinetd service:

[root@test4 ~]# /etc/init.d/xinetd start
Starting xinetd: [ OK ]

Port 9200 should now be listening; run the check command:

[root@test5 ~]# clustercheck
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40
Percona XtraDB Cluster Node is synced.
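The check can also be exercised remotely, e.g. from the HA node — clustercheck speaks just enough HTTP for curl (replace the IP with the node you want to probe):

```shell
curl -i http://192.168.0.152:9200/
# A synced node answers HTTP 200; an unsynced or dead one answers 503,
# which is what HAProxy's httpchk keys off
```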

Finally, change the mysql-cluster section in the HA node's configuration to:

listen mysql-cluster 0.0.0.0:3306
    mode tcp
    balance roundrobin
    option httpchk
    server node1 192.168.0.152:3306 check port 9200 inter 12000 rise 3 fall 3
    server node2 192.168.0.153:3306 check port 9200 inter 12000 rise 3 fall 3
    server node3 192.168.0.154:3306 check port 9200 inter 12000 rise 3 fall 3

Save and restart HAProxy. If you use Keepalived (here), apply the same change to the backup HA node's configuration. Health checking now happens at the application layer.