注册
DM8 数据守护集群(DataWatch)搭建
专栏/培训园地/ 文章详情 /

DM8 数据守护集群(DataWatch)搭建

river+++ 2024/03/13 1069 1 0
摘要

1 场景描述

Data Watch 是一种高可用、高性能数据库解决方案,可以实现异地容灾 。
原理:将主库产生的 Redo 日志传输到备库,备库接收并重新应用 Redo 日志,从而实现备库与主库的数据同步;主要由主库、备库、Redo 日志、Redo 日志传输、Redo 日志重演、守护进程、监视器组成。

2 环境说明

2.1 配置环境

主机类型 IP 地址 实例名 操作系统
主库 192.168.182.129 MMJ_P CentOS 7
备库 192.168.182.130 MMJ_S CentOS 7

2.2 集群规划

前期最好先规划集群相关参数信息,为后续搭建做好准备。
(此例子未区分心跳 IP 和业务 IP)

参数 主库 备库
实例名(INSTANCE_NAME) MMJ_P MMJ_S
系统监听 TCP 连接的 IP 地址(MAL_HOSTMAL) 192.168.182.129 192.168.182.130
实例端口(PORT_NUM) 5236 5236
实例对外服务端口(MAL_INST_PORT) 5236 5236
守护进程端口(MAL_INST_DW_PORT) 5237 5237
MAL 端口(MAL_PORT) 5238 5238
MAL守护进程端口(MAL_DW_PORT) 5239 5239
OGUID(集群唯一标识号) 45331 45331
守护组 GRP1 GRP1
安装目录 /dm8/dmdbms /dm8/dmdbms
实例目录 /dm8/dmdbms/data /dm8/dmdbms/data
归档上限 5120 5120

3 主备集群搭建

3.1 数据库安装

主备库均需要先安装数据库。详细安装步骤可以参考:linux 下安装达梦数据库

3.2 配置主库

3.2.1 初始化实例并备份数据

  • 初始化实例
[dmdba@localhost bin]$ ./dminit path=/dm8/dmdbms/data INSTANCE_NAME=MMJ_P initdb V8 db version: 0x7000c file dm.key not found, use default license! License will expire on 2024-11-09 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL log file path: /dm8/dmdbms/data/DAMENG/DAMENG01.log log file path: /dm8/dmdbms/data/DAMENG/DAMENG02.log write to dir [/dm8/dmdbms/data/DAMENG]. create dm database success. 2023-11-27 16:41:50
  • 启动服务
[dmdba@localhost /]$ /dm8/dmdbms/bin/dmserver /dm8/dmdbms/data/DAMENG/dm.ini file dm.key not found, use default license! version info: develop DM Database Server 64 V8 03134284094-20231109-208042-20067 startup... Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL Database mode = 0, oguid = 0 License will expire on 2024-11-09 file lsn: 41322 ndct db load finished, code:0 ndct second level fill fast pool finished ndct third level fill fast pool finished ndct second level fill fast pool finished ndct third level fill fast pool finished ndct fill fast pool finished pseg_set_gtv_trxid_low next_trxid in mem:[8008] pseg_collect_mgr_items, total collect 0 active_trxs, 0 cmt_trxs, 0 pre_cmt_trxs, 0 to_release_trxs, 0 active_pages, 0 cmt_pages, 0 pre_cmt_pages, 0 to_release_pages, 0 mgr pages, 0 mgr recs! next_trxid in mem:[10010] next_trxid = 12012. pseg recv finished nsvr_startup end. uthr_pipe_create, create pipe[read:12, write:13] uthr_pipe_create, create pipe[read:14, write:15] uthr_pipe_create, create pipe[read:16, write:17] uthr_pipe_create, create pipe[read:18, write:19] uthr_pipe_create, create pipe[read:20, write:21] uthr_pipe_create, create pipe[read:22, write:23] uthr_pipe_create, create pipe[read:24, write:25] uthr_pipe_create, create pipe[read:26, write:27] uthr_pipe_create, create pipe[read:28, write:29] uthr_pipe_create, create pipe[read:30, write:31] uthr_pipe_create, create pipe[read:32, write:33] uthr_pipe_create, create pipe[read:34, write:35] uthr_pipe_create, create pipe[read:36, write:37] uthr_pipe_create, create pipe[read:38, write:39] uthr_pipe_create, create pipe[read:40, write:41] uthr_pipe_create, create pipe[read:42, write:43] aud sys init success. aud rt sys init success. systables desc init success. ndct_db_load_info finished, code:0. nsvr_process_before_open begin. nsvr_process_before_open success. SYSTEM IS READY.
  • 开启归档
    上面是以前台的方式启动,直接在该终端上退出启动之后会停止服务,所以建议新开一个终端来开启归档。
[dmdba@localhost ~]$ /dm8/dmdbms/bin/disql SYSDBA/SYSDBA@192.168.182.129:5236 服务器[192.168.182.129:5236]:处于普通打开状态 登录使用时间 : 116.368(ms) disql V8

进入 sql 中开启归档:

SQL> ALTER DATABASE MOUNT; 操作已执行 已用时间: 82.352(毫秒). 执行号:0. SQL> ALTER DATABASE ARCHIVELOG; 操作已执行 已用时间: 8.581(毫秒). 执行号:0. SQL> ALTER DATABASE ADD ARCHIVELOG 'DEST=/dm8/dmdbms/data/DAMENG/arch, TYPE=LOCAL, FILE_SIZE=1024, SPACE_LIMIT=5120'; 操作已执行 已用时间: 81.448(毫秒). 执行号:0. SQL> ALTER DATABASE OPEN; 操作已执行 已用时间: 40.580(毫秒). 执行号:0.
  • 备份数据
SQL> BACKUP DATABASE BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/BACKUP_FILE'; 操作已执行 已用时间: 00:00:03.281. 执行号:501.
  • 修改配置文件 dm.ini
SQL> SP_SET_PARA_VALUE (2,'DW_INACTIVE_INTERVAL',60); DMSQL 过程已成功完成 已用时间: 6.358(毫秒). 执行号:502. SQL> SP_SET_PARA_VALUE (2,'ALTER_MODE_STATUS',0); DMSQL 过程已成功完成 已用时间: 5.228(毫秒). 执行号:503. SQL> SP_SET_PARA_VALUE (2,'ENABLE_OFFLINE_TS',2); DMSQL 过程已成功完成 已用时间: 5.029(毫秒). 执行号:504. SQL> SP_SET_PARA_VALUE (2,'MAL_INI',1); DMSQL 过程已成功完成 已用时间: 5.229(毫秒). 执行号:505. SQL> SP_SET_PARA_VALUE (2,'RLOG_SEND_APPLY_MON',64); DMSQL 过程已成功完成 已用时间: 5.142(毫秒). 执行号:506.
  • 关闭前台实例服务
SQL> exit

叉掉前面前台启动实例服务的窗口即可关闭。

3.2.2 修改 dmarch.ini

[dmdba@localhost ~]$ vi /dm8/dmdbms/data/DAMENG/dmarch.ini #DaMeng Database Archive Configuration file #this is comments ARCH_WAIT_APPLY = 0 [ARCHIVE_LOCAL1] ARCH_TYPE = LOCAL ARCH_DEST = /dm8/dmdbms/data/DAMENG/arch ARCH_FILE_SIZE = 1024 ARCH_SPACE_LIMIT = 5120 ARCH_FLUSH_BUF_SIZE = 0 ARCH_HANG_FLAG = 1 [ARCHIVE_REALTIME1] ARCH_TYPE = REALTIME #实时归档类型 ARCH_DEST = MMJ_S #实时归档目标实例名

3.2.3 创建 dmmal.ini

[dmdba@localhost ~]$ vi /dm8/dmdbms/data/DAMENG/dmmal.ini MAL_CHECK_INTERVAL = 10 #MAL 链路检测时间间隔 MAL_CONN_FAIL_INTERVAL = 10 #判定 MAL 链路断开的时间 MAL_TEMP_PATH = /dm8/dmdbms/data/malpath/ #临时文件目录 MAL_BUF_SIZE = 512 #单个 MAL 缓存大小,单位 MB MAL_SYS_BUF_SIZE = 2048 #MAL 总大小限制,单位 MB MAL_COMPRESS_LEVEL = 0 #MAL 消息压缩等级,0 表示不压缩 [MAL_INST1] MAL_INST_NAME = MMJ_P #实例名,和 dm.ini 的 INSTANCE_NAME 一致 MAL_HOST = 192.168.182.129 #MAL 系统监听 TCP 连接的 IP 地址 MAL_PORT = 5238 #MAL 系统监听 TCP 连接的端口 MAL_INST_HOST = 192.168.182.129 #实例的对外服务 IP 地址 MAL_INST_PORT = 5236 #实例对外服务端口,和 dm.ini 的 PORT_NUM 一致 MAL_DW_PORT = 5239 #实例对应的守护进程监听 TCP 连接的端口 MAL_INST_DW_PORT = 5237 #实例监听守护进程 TCP 连接的端口 [MAL_INST2] MAL_INST_NAME = MMJ_S MAL_HOST = 192.168.182.130 MAL_PORT = 5238 MAL_INST_HOST = 192.168.182.130 MAL_INST_PORT = 5236 MAL_DW_PORT = 5239 MAL_INST_DW_PORT = 5237

3.2.4 创建 dmwatcher.ini

[dmdba@localhost ~]$ vi /dm8/dmdbms/data/DAMENG/dmwatcher.ini [GRP1] DW_TYPE = GLOBAL #全局守护类型 DW_MODE = MANUAL #MANUAL:故障手切 AUTO:故障自切 DW_ERROR_TIME = 20 #远程守护进程故障认定时间 INST_ERROR_TIME = 20 #本地实例故障认定时间 INST_RECOVER_TIME = 60 #主库守护进程启动恢复的间隔时间 INST_OGUID = 45331 #守护系统唯一 OGUID 值 INST_INI = /dm8/dmdbms/data/DAMENG/dm.ini #dm.ini 文件路径 INST_AUTO_RESTART = 1 #打开实例的自动启动功能 INST_STARTUP_CMD = /dm8/dmdbms/bin/dmserver #命令行方式启动 RLOG_SEND_THRESHOLD = 0 #指定主库发送日志到备库的时间阈值,默认关闭 RLOG_APPLY_THRESHOLD = 0 #指定备库重演日志的时间阈值,默认关闭

3.2.5 拷贝备份文件

拷贝备份文件到备库(可以备库先建好实例之后,再从主库拷贝备份文件到备库)

[dmdba@localhost ~]$ scp -r /dm8/dmdbms/data/DAMENG/bak/BACKUP_FILE dmdba@192.168.182.130:/dm8/dmdbms/data/DAMENG/bak The authenticity of host '192.168.182.130 (192.168.182.130)' can't be established. ECDSA key fingerprint is SHA256:ppzuJYyDD1f/OnS65AJjCWMuECgk34oQyN8b7x7QAVQ. ECDSA key fingerprint is MD5:aa:0b:36:79:d6:74:18:96:3c:d3:13:32:02:46:23:ef. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '192.168.182.130' (ECDSA) to the list of known hosts. dmdba@192.168.182.130's password: BACKUP_FILE.bak 100% 8974KB 42.2MB/s 00:00 BACKUP_FILE_1.bak 100% 82KB 19.3MB/s 00:00 BACKUP_FILE.meta

3.2.6 注册服务

  • 注册实例启动服务
[root@localhost ~]# /dm8/dmdbms/script/root/dm_service_installer.sh -t dmserver -p MMJ_P -dm_ini /dm8/dmdbms/data/DAMENG/dm.ini -m mount Created symlink from /etc/systemd/system/multi-user.target.wants/DmServiceMMJ_P.service to /usr/lib/systemd/system/DmServiceMMJ_P.service. 创建服务(DmServiceMMJ_P)完成

注意这里是以 mount 状态注册的服务。

  • 注册守护进程服务
[root@localhost ~]# /dm8/dmdbms/script/root/dm_service_installer.sh -t dmwatcher -p Watcher -watcher_ini /dm8/dmdbms/data/DAMENG/dmwatcher.ini Created symlink from /etc/systemd/system/multi-user.target.wants/DmWatcherServiceWatcher.service to /usr/lib/systemd/system/DmWatcherServiceWatcher.service. 创建服务(DmWatcherServiceWatcher)完成

3.3 配置备库

3.3.1 初始化实例

[dmdba@localhost bin]$ ./dminit path=/dm8/dmdbms/data INSTANCE_NAME=MMJ_S initdb V8 db version: 0x7000c file dm.key not found, use default license! License will expire on 2024-11-09 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL log file path: /dm8/dmdbms/data/DAMENG/DAMENG01.log log file path: /dm8/dmdbms/data/DAMENG/DAMENG02.log write to dir [/dm8/dmdbms/data/DAMENG]. create dm database success. 2023-11-29 10:06:07

3.3.2 还原并恢复数据库

  • 还原数据库
[dmdba@localhost bin]$ /dm8/dmdbms/bin/dmrman CTLSTMT="RESTORE DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/BACKUP_FILE'" dmrman V8 RESTORE DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/BACKUP_FILE' file dm.key not found, use default license! Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL [Percent:100.00%][Speed:0.00M/s][Cost:00:00:02][Remaining:00:00:00] restore successfully. time used: 00:00:02.581
  • 恢复数据库
[dmdba@localhost bin]$ /dm8/dmdbms/bin/dmrman CTLSTMT="RECOVER DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/BACKUP_FILE'" dmrman V8 RECOVER DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/BACKUP_FILE' file dm.key not found, use default license! Database mode = 2, oguid = 0 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL EP[0]'s cur_lsn[42739], file_lsn[42739] [Percent:100.00%][Speed:0.00PKG/s][Cost:00:00:00][Remaining:00:00:00] recover successfully! time used: 00:00:02.544
  • 更新 magic
[dmdba@localhost bin]$ /dm8/dmdbms/bin/dmrman CTLSTMT="RECOVER DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' UPDATE DB_MAGIC" dmrman V8 RECOVER DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' UPDATE DB_MAGIC file dm.key not found, use default license! Database mode = 2, oguid = 0 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL EP[0]'s cur_lsn[42975], file_lsn[42975] recover successfully! time used: 00:00:01.133

3.3.3 开启归档

  • 启动实例服务
    以前台方式启动
[dmdba@localhost ~]$ /dm8/dmdbms/bin/dmserver /dm8/dmdbms/data/DAMENG/dm.ini file dm.key not found, use default license! version info: develop DM Database Server 64 V8 03134284094-20231109-208042-20067 startup... Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL Database mode = 0, oguid = 0 License will expire on 2024-11-09 file lsn: 43329 ndct db load finished, code:0 ndct second level fill fast pool finished ndct third level fill fast pool finished ndct second level fill fast pool finished ndct third level fill fast pool finished ndct fill fast pool finished pseg_set_gtv_trxid_low next_trxid in mem:[44068] pseg_collect_mgr_items, total collect 0 active_trxs, 0 cmt_trxs, 0 pre_cmt_trxs, 0 to_release_trxs, 0 active_pages, 0 cmt_pages, 0 pre_cmt_pages, 0 to_release_pages, 0 mgr pages, 0 mgr recs! next_trxid in mem:[46070] next_trxid = 48072. pseg recv finished nsvr_startup end. uthr_pipe_create, create pipe[read:12, write:13] uthr_pipe_create, create pipe[read:14, write:15] uthr_pipe_create, create pipe[read:16, write:17] uthr_pipe_create, create pipe[read:18, write:19] uthr_pipe_create, create pipe[read:20, write:21] uthr_pipe_create, create pipe[read:22, write:23] uthr_pipe_create, create pipe[read:24, write:25] uthr_pipe_create, create pipe[read:26, write:27] uthr_pipe_create, create pipe[read:28, write:29] uthr_pipe_create, create pipe[read:30, write:31] uthr_pipe_create, create pipe[read:32, write:33] uthr_pipe_create, create pipe[read:34, write:35] uthr_pipe_create, create pipe[read:36, write:37] uthr_pipe_create, create pipe[read:38, write:39] uthr_pipe_create, create pipe[read:40, write:41] uthr_pipe_create, create pipe[read:42, write:43] aud sys init success. aud rt sys init success. systables desc init success. ndct_db_load_info finished, code:0. nsvr_process_before_open begin. nsvr_process_before_open success. SYSTEM IS READY.
  • 开启归档
[dmdba@localhost ~]$ /dm8/dmdbms/bin/disql SYSDBA/SYSDBA@192.168.182.130:5236 服务器[192.168.182.130:5236]:处于普通打开状态 登录使用时间 : 4.665(ms) disql V8 SQL> ALTER DATABASE MOUNT; 操作已执行 已用时间: 1.439(毫秒). 执行号:0. SQL> ALTER DATABASE ARCHIVELOG; 操作已执行 已用时间: 6.174(毫秒). 执行号:0. SQL> ALTER DATABASE ADD ARCHIVELOG 'DEST=/dm8/dmdbms/data/DAMENG/arch, TYPE=LOCAL, FILE_SIZE=1024, SPACE_LIMIT=5120'; 操作已执行 已用时间: 1.048(毫秒). 执行号:0. SQL> ALTER DATABASE OPEN; 操作已执行 已用时间: 6.265(毫秒). 执行号:0.

3.3.4 配置 dm.ini

由于 dm.ini 里面的配置参数太多,这里是直接用 SQL 语句进行修改配置。

SQL> SP_SET_PARA_VALUE (2,'DW_INACTIVE_INTERVAL',60); DMSQL 过程已成功完成 已用时间: 4.766(毫秒). 执行号:501. SQL> SP_SET_PARA_VALUE (2,'ALTER_MODE_STATUS',0); DMSQL 过程已成功完成 已用时间: 4.651(毫秒). 执行号:502. SQL> SP_SET_PARA_VALUE (2,'ENABLE_OFFLINE_TS',2); DMSQL 过程已成功完成 已用时间: 3.993(毫秒). 执行号:503. SQL> SP_SET_PARA_VALUE (2,'MAL_INI',1); DMSQL 过程已成功完成 已用时间: 4.470(毫秒). 执行号:504. SQL> SP_SET_PARA_VALUE (2,'RLOG_SEND_APPLY_MON',64); DMSQL 过程已成功完成 已用时间: 3.858(毫秒). 执行号:505.

关闭数据库实例服务(前台启动窗口关闭即可)。

3.3.5 配置 dmarch.ini

[dmdba@localhost ~]$ vi /dm8/dmdbms/data/DAMENG/dmarch.ini #DaMeng Database Archive Configuration file #this is comments ARCH_WAIT_APPLY = 0 [ARCHIVE_LOCAL1] ARCH_TYPE = LOCAL ARCH_DEST = /dm8/dmdbms/data/DAMENG/arch ARCH_FILE_SIZE = 1024 ARCH_SPACE_LIMIT = 5120 ARCH_FLUSH_BUF_SIZE = 0 ARCH_HANG_FLAG = 1 [ARCHIVE_REALTIME1] ARCH_TYPE = REALTIME #实时归档类型 ARCH_DEST = MMJ_P #实时归档目标实例名

3.3.6 配置 dmmal.ini

与主库配置一致。

[dmdba@localhost /]$ vi /dm8/dmdbms/data/DAMENG/dmmal.ini MAL_CHECK_INTERVAL = 10 #MAL 链路检测时间间隔 MAL_CONN_FAIL_INTERVAL = 10 #判定 MAL 链路断开的时间 MAL_TEMP_PATH = /dm8/dmdbms/data/malpath/ #临时文件目录 MAL_BUF_SIZE = 512 #单个 MAL 缓存大小,单位 MB MAL_SYS_BUF_SIZE = 2048 #MAL 总大小限制,单位 MB MAL_COMPRESS_LEVEL = 0 #MAL 消息压缩等级,0 表示不压缩 [MAL_INST1] MAL_INST_NAME = MMJ_P #实例名,和 dm.ini 的 INSTANCE_NAME 一致 MAL_HOST = 192.168.182.129 #MAL 系统监听 TCP 连接的 IP 地址 MAL_PORT = 5238 #MAL 系统监听 TCP 连接的端口 MAL_INST_HOST = 192.168.182.129 #实例的对外服务 IP 地址 MAL_INST_PORT = 5236 #实例对外服务端口,和 dm.ini 的 PORT_NUM 一致 MAL_DW_PORT = 5239 #实例对应的守护进程监听 TCP 连接的端口 MAL_INST_DW_PORT = 5237 #实例监听守护进程 TCP 连接的端口 [MAL_INST2] MAL_INST_NAME = MMJ_S MAL_HOST = 192.168.182.130 MAL_PORT = 5238 MAL_INST_HOST = 192.168.182.130 MAL_INST_PORT = 5236 MAL_DW_PORT = 5239 MAL_INST_DW_PORT = 5237

3.3.7 配置 dmwatcher.ini

与主库配置一致。

[dmdba@localhost /]$ vi /dm8/dmdbms/data/DAMENG/dmwatcher.ini [GRP1] DW_TYPE = GLOBAL #全局守护类型 DW_MODE = MANUAL #MANUAL:故障手切 AUTO:故障自切 DW_ERROR_TIME = 20 #远程守护进程故障认定时间 INST_ERROR_TIME = 20 #本地实例故障认定时间 INST_RECOVER_TIME = 60 #主库守护进程启动恢复的间隔时间 INST_OGUID = 45331 #守护系统唯一 OGUID 值 INST_INI = /dm8/dmdbms/data/DAMENG/dm.ini #dm.ini 文件路径 INST_AUTO_RESTART = 1 #打开实例的自动启动功能 INST_STARTUP_CMD = /dm8/dmdbms/bin/dmserver #命令行方式启动 RLOG_SEND_THRESHOLD = 0 #指定主库发送日志到备库的时间阈值,默认关闭 RLOG_APPLY_THRESHOLD = 0 #指定备库重演日志的时间阈值,默认关闭

3.3.8 注册服务

  • 注册实例服务
[root@localhost ~]# /dm8/dmdbms/script/root/dm_service_installer.sh -t dmserver -p MMJ_S -dm_ini /dm8/dmdbms/data/DAMENG/dm.ini -m mount Created symlink from /etc/systemd/system/multi-user.target.wants/DmServiceMMJ_S.service to /usr/lib/systemd/system/DmServiceMMJ_S.service. 创建服务(DmServiceMMJ_S)完成

注意这里是以 mount 状态注册的服务。

  • 注册守护进程服务
[root@localhost ~]# /dm8/dmdbms/script/root/dm_service_installer.sh -t dmwatcher -p Watcher -watcher_ini /dm8/dmdbms/data/DAMENG/dmwatcher.ini Created symlink from /etc/systemd/system/multi-user.target.wants/DmWatcherServiceWatcher.service to /usr/lib/systemd/system/DmWatcherServiceWatcher.service. 创建服务(DmWatcherServiceWatcher)完成

3.4 配置监视器

  • 手动切换:集群各节点的 bin 目录中,存放非确认监视器配置文件。
  • 自动切换:在确认监视器上(非集群节点),存放确认监视器配置文件,并注册后台自启服务。

以下是配置的非确认监视器,可以在主库/备库任选一个;
确认监视器需要在第三台服务器上配置。

3.4.1 创建 dmmonitor.ini

  • 主库
[dmdba@localhost ~]$ vi /dm8/dmdbms/bin/dmmonitor.ini MON_DW_CONFIRM = 0 #0:非确认(故障手切) 1:确认(故障自切) MON_LOG_PATH = ../log #监视器日志文件存放路径 MON_LOG_INTERVAL = 60 #每隔 60s 定时记录系统信息到日志文件 MON_LOG_FILE_SIZE = 512 #单个日志大小,单位 MB MON_LOG_SPACE_LIMIT = 2048 #日志上限,单位 MB [GRP1] MON_INST_OGUID = 45331 #组 GRP1 的唯一 OGUID 值 MON_DW_IP = 192.168.182.129:5239 #IP 对应 MAL_HOST,PORT 对应 MAL_DW_PORT MON_DW_IP = 192.168.182.130:5239
  • 备库
    备库与主库一致。
[dmdba@localhost ~]$ vi /dm8/dmdbms/bin/dmmonitor.ini MON_DW_CONFIRM = 0 #0:非确认(故障手切) 1:确认(故障自切) MON_LOG_PATH = ../log #监视器日志文件存放路径 MON_LOG_INTERVAL = 60 #每隔 60s 定时记录系统信息到日志文件 MON_LOG_FILE_SIZE = 512 #单个日志大小,单位 MB MON_LOG_SPACE_LIMIT = 2048 #日志上限,单位 MB [GRP1] MON_INST_OGUID = 45331 #组 GRP1 的唯一 OGUID 值 MON_DW_IP = 192.168.182.129:5239 #IP 对应 MAL_HOST,PORT 对应 MAL_DW_PORT MON_DW_IP = 192.168.182.130:5239

3.4.2 注册服务

  • 主库
[root@localhost ~]# /dm8/dmdbms/script/root/dm_service_installer.sh -t dmmonitor -p Monitor -monitor_ini /dm8/dmdbms/bin/dmmonitor.ini Created symlink from /etc/systemd/system/multi-user.target.wants/DmMonitorServiceMonitor.service to /usr/lib/systemd/system/DmMonitorServiceMonitor.service. 创建服务(DmMonitorServiceMonitor)完成
  • 备库
    备库与主库一致。
[root@localhost ~]# /dm8/dmdbms/script/root/dm_service_installer.sh -t dmmonitor -p Monitor -monitor_ini /dm8/dmdbms/bin/dmmonitor.ini Created symlink from /etc/systemd/system/multi-user.target.wants/DmMonitorServiceMonitor.service to /usr/lib/systemd/system/DmMonitorServiceMonitor.service. 创建服务(DmMonitorServiceMonitor)完成

4 启动集群服务

4.1 启动主库

设置 oguid、设置数据库为 primary。

[dmdba@localhost bin]$ ll DmService* -rwxr-xr-x. 1 dmdba dinstall 17502 1129 10:15 DmServiceMMJ_P [dmdba@localhost bin]$ ./DmServiceMMJ_P start Starting DmServiceMMJ_P: [ OK ] [dmdba@localhost bin]$ ./disql SYSDBA/SYSDBA@192.168.182.129:5236 服务器[192.168.182.129:5236]:处于普通配置状态 登录使用时间 : 6.266(ms) disql V8 SQL> SP_SET_OGUID(45331); DMSQL 过程已成功完成 已用时间: 100.790(毫秒). 执行号:1. SQL> ALTER DATABASE PRIMARY; 操作已执行 已用时间: 34.605(毫秒). 执行号:0.

4.2 启动备库

设置 oguid、设置数据库为 standby。

[dmdba@localhost bin]$ ll DmService* -rwxr-xr-x. 1 dmdba dinstall 17502 1129 11:12 DmServiceMMJ_S [dmdba@localhost bin]$ ./DmServiceMMJ_S start Starting DmServiceMMJ_S: [ OK ] [dmdba@localhost bin]$ ./disql SYSDBA/SYSDBA@192.168.182.130:5236 服务器[192.168.182.130:5236]:处于普通配置状态 登录使用时间 : 3.783(ms) disql V8 SQL> SP_SET_OGUID(45331); DMSQL 过程已成功完成 已用时间: 2.348(毫秒). 执行号:1. SQL> ALTER DATABASE STANDBY; 操作已执行 已用时间: 4.553(毫秒). 执行号:0.

4.3 启动守护进程

分别启动主库和备库的守护进程。

[dmdba@localhost bin]$ /dm8/dmdbms/bin/DmWatcherServiceWatcher start Starting DmWatcherServiceWatcher: [ OK ]

4.4 启动监视器

分别启动主库和备库的监视器。

[dmdba@localhost bin]$ /dm8/dmdbms/bin/DmMonitorServiceMonitor start Starting DmMonitorServiceMonitor: [ OK ]

5 监视器查看主备集群状态

5.1 主备集群状态一直为mount

通过监视器查看,主库和备库的状态都是mount状态:

[dmdba@localhost bin]$ ./dmmonitor /dm8/dmdbms/bin/dmmonitor.ini [monitor] 2023-11-29 11:39:00: DMMONITOR[4.0] V8 [monitor] 2023-11-29 11:39:01: DMMONITOR[4.0] IS READY. [monitor] 2023-11-29 11:39:01: #--------------------------------------------------------------------------------# GET MONITOR CONNECT INFO FROM DMWATCHER(MMJ_P), THE FIRST LINE IS SELF INFO. DW_CONN_TIME MON_CONFIRM MID MON_IP MON_VERSION 2023-11-29 11:39:02 FALSE 1891667898 ::ffff:192.168.182.130 DMMONITOR[4.0] V8 2023-11-29 11:37:09 FALSE 468913185 ::ffff:192.168.182.129 DMMONITOR[4.0] V8 2023-11-29 11:37:47 FALSE 855767332 ::ffff:192.168.182.130 DMMONITOR[4.0] V8 #--------------------------------------------------------------------------------# [monitor] 2023-11-29 11:39:01: 收到守护进程(MMJ_P)消息 WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2023-11-29 11:39:02 STARTUP OK MMJ_P MOUNT PRIMARY VALID 6 42979 42979 [monitor] 2023-11-29 11:39:01: 收到守护进程(MMJ_S)消息 WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2023-11-29 11:39:01 STARTUP OK MMJ_S MOUNT STANDBY VALID 9 44164 44164

5.1.1检查防火墙是否关闭(主备都要关闭)

## 开启防火墙并设置开机自启 ## 启动 systemctl start firewalld ## 关闭 systemctl stop firewalld ## 查看状态 systemctl status firewalld ## 开机禁用 systemctl disable firewalld ## 开机启用 systemctl enable firewalld

5.1.2 FLSN不一致

在监视器里面执行 show 命令,可以查看到主库和备库的 FLSN 不一样,可以重新备份主库文件,在备库上还原:

show 2023-11-29 11:41:49 #================================================================================# GROUP OGUID MON_CONFIRM MODE MPP_FLAG GRP1 45331 FALSE MANUAL FALSE GROUP SPLIT: 1: DATABASE(MMJ_P): <<DATABASE GLOBAL INFO:>> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 192.168.182.129 5239 2023-11-29 11:41:50 GLOBAL VALID STARTUP MMJ_P OK 1 1 MOUNT PRIMARY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 192.168.182.129 5236 OK MMJ_P MOUNT PRIMARY 0 0 REALTIME VALID 4901 42979 4901 42979 NONE 2: DATABASE(MMJ_S): <<DATABASE GLOBAL INFO:>> DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT 192.168.182.130 5239 2023-11-29 11:41:49 GLOBAL VALID STARTUP MMJ_S OK 1 1 MOUNT STANDBY DSC_OPEN REALTIME VALID EP INFO: INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG 192.168.182.130 5236 OK MMJ_S MOUNT STANDBY 0 0 REALTIME VALID 4917 44164 4917 44164 NONE DATABASE(MMJ_S) APPLY INFO FROM (MMJ_P), REDOS_PARALLEL_NUM (1), WAIT_APPLY[FALSE]: DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[4917, 4917, 4917], (RLSN, SLSN, KLSN)[44164, 44164, 44164], N_TSK[0], TSK_MEM_USE[0] REDO_LSN_ARR: (44164) #================================================================================#

5.2 由于主备库库的FLSN不一致,重新搭建备库的方法

5.2.1 关闭主备集群

  • 退出 DM 数据守护
  • 退出监视器
[dmdba@localhost ~]$ /dm8/dmdbms/bin/DmMonitorServiceMonitor stop Stopping DmMonitorServiceMonitor: [ OK ]
  • 关闭备机的守护进程
[dmdba@localhost ~]$ /dm8/dmdbms/bin/DmWatcherServiceWatcher stop Stopping DmWatcherServiceWatcher: [ OK ]
  • 关闭主机的守护进程
[dmdba@localhost ~]$ /dm8/dmdbms/bin/DmWatcherServiceWatcher stop Stopping DmWatcherServiceWatcher: [ OK ]
  • 关闭主机实例
[dmdba@localhost ~]$ /dm8/dmdbms/bin/DmServiceMMJ_P stop Stopping DmServiceMMJ_P: [ OK ]
  • 关闭备机实例
[dmdba@localhost ~]$ /dm8/dmdbms/bin/DmServiceMMJ_S stop Stopping DmServiceMMJ_S: [ OK ]

5.2.2 备份主库数据库

由于现在主库已经设置为了 primary 状态,需要先将主库状态改为 normal,再去备份,否则直接备份还原可能会出现两个库都变成主库的状态,详细可以参考下本篇章节 7 如何集群变单机

因为主库备库均处于 mount 状态,这个时候可以选择冷备(或者将主库装填修改为 open 之后,与前面一样做热备也可以):

[dmdba@localhost ~]$ /dm8/dmdbms/bin/DmAPService status DmAPService (pid 1207) is running. [dmdba@localhost ~]$ /dm8/dmdbms/bin/dmrman dmrman V8 RMAN> BACKUP DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FULL BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/db_full_bak_01'; BACKUP DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FULL BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/db_full_bak_01'; file dm.key not found, use default license! Database mode = 1, oguid = 45331 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL EP[0]'s cur_lsn[42979], file_lsn[42979] Processing backupset /dm8/dmdbms/data/DAMENG/bak/db_full_bak_01 [Percent:100.00%][Speed:0.00M/s][Cost:00:00:02][Remaining:00:00:00] backup successfully! time used: 00:00:03.451 RMAN> exit time used: 0.220(ms)

复制备份文件到备库:

[dmdba@localhost ~]$ scp -r /dm8/dmdbms/data/DAMENG/bak/db_full_bak_01 dmdba@192.168.182.130:/dm8/dmdbms/data/DAMENG/bak dmdba@192.168.182.130's password: db_full_bak_01.bak 100% 9102KB 55.5MB/s 00:00 db_full_bak_01.meta

5.2.3 还原备库数据库

[dmdba@localhost ~]$ /dm8/dmdbms/bin/dmrman dmrman V8 RMAN> RESTORE DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/db_full_bak_01' RESTORE DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/db_full_bak_01' file dm.key not found, use default license! Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL [Percent:100.00%][Speed:0.00M/s][Cost:00:00:02][Remaining:00:00:00] restore successfully. time used: 00:00:02.579 RMAN> RECOVER DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/db_full_bak_01' RECOVER DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/db_full_bak_01' Database mode = 2, oguid = 45331 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL EP[0]'s cur_lsn[42979], file_lsn[42979] recover successfully! time used: 591.152(ms) RMAN> RECOVER DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' UPDATE DB_MAGIC RECOVER DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' UPDATE DB_MAGIC Database mode = 2, oguid = 45331 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL EP[0]'s cur_lsn[42979], file_lsn[42979] recover successfully! time used: 00:00:01.134

5.2.4 重新启动集群

  • 启动主库实例服务
[dmdba@localhost bin]$ ./DmServiceMMJ_P start Starting DmServiceMMJ_P: [ OK ]
  • 启动备库实例服务
[dmdba@localhost bin]$ ./DmServiceMMJ_S start Starting DmServiceMMJ_S: [ OK ]
  • 启动主库守护进程
[dmdba@localhost bin]$ ./DmWatcherServiceWatcher start Starting DmWatcherServiceWatcher: [ OK ]
  • 启动备库守护进程
[dmdba@localhost bin]$ ./DmWatcherServiceWatcher start Starting DmWatcherServiceWatcher: [ OK ]
  • 启动监视器
[dmdba@localhost bin]$ ./DmMonitorServiceMonitor start Starting DmMonitorServiceMonitor: [ OK ]

5.2.5 监视器查看集群状态

主备集群已经是open状态,搭建成功。

[dmdba@localhost bin]$ ./dmmonitor /dm8/dmdbms/bin/dmmonitor.ini [monitor] 2023-11-29 17:22:13: DMMONITOR[4.0] V8 [monitor] 2023-11-29 17:22:14: DMMONITOR[4.0] IS READY. [monitor] 2023-11-29 17:22:14: #--------------------------------------------------------------------------------# GET MONITOR CONNECT INFO FROM DMWATCHER(MMJ_S), THE FIRST LINE IS SELF INFO. DW_CONN_TIME MON_CONFIRM MID MON_IP MON_VERSION 2023-11-29 17:22:14 FALSE 1465653700 ::ffff:192.168.182.130 DMMONITOR[4.0] V8 2023-11-29 17:02:11 FALSE 1411321164 ::ffff:192.168.182.130 DMMONITOR[4.0] V8 #--------------------------------------------------------------------------------# [monitor] 2023-11-29 17:22:14: 收到守护进程(MMJ_S)消息 WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2023-11-29 17:22:14 OPEN OK MMJ_S OPEN STANDBY NULL 8 43551 43551 [monitor] 2023-11-29 17:22:15: 收到守护进程(MMJ_P)消息 WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2023-11-29 17:22:15 OPEN OK MMJ_P OPEN PRIMARY VALID 8 43551 43551

6 常见问题

6.1 主库备份数据报错

  • Q1:
    备份报错:磁盘空间不足;
    原因是归档参数SPACE_LIMIT设置为51200超过了最大磁盘容量;

  • Q2:
    备份报错:ARCH_SPACE_LIMIT value 512 is invalid;
    原因是SPACE_LIMIT大小设置未达到最小取值;
    注:ARCH_SPACE_LIMIT REDO:日志归档空间限制,当所有本地归档文件达到限制值时,系统自动删除最老的归档文件。0 表示无空间限制,取值范围 1024~2147483647,单位 MB,缺省值为 0;

  • Q3:
    备份报错:8086,无效的TRACE文件;
    原因是修改了SPACE_LIMIT的值之后,没有重启服务(修改位置是实例路径下arch.ini文件

6.2 备库恢复数据报错

  • Q4:
[dmdba@localhost /]$ /dm8/dmdbms/bin/dmrman CTLSTMT="RESTORE DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/BACKUP_FILE'" /dm8/dmdbms/bin/dmrman: error while loading shared libraries: libdmnlssort.so: cannot open shared object file: No such file or directory

原因是没有进入安装目录bin下执行,需要 cd /dm8/dmdbms/bin

  • Q5:
[dmdba@localhost bin]$ /dm8/dmdbms/bin/dmrman CTLSTMT="RESTORE DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/BACKUP_FILE'" dmrman V8 RESTORE DATABASE '/dm8/dmdbms/data/DAMENG/dm.ini' FROM BACKUPSET '/dm8/dmdbms/data/DAMENG/bak/BACKUP_FILE' file dm.key not found, use default license! [-7170]:bakres连接DMAP失败

原因是没有启动 dmap 服务,需要手动启动:

[dmdba@localhost bin]$ ./DmAPService start Starting DmAPService: [ OK ]

6.3 开启归档报错

  • Q6:
[dmdba@localhost /]$ /dm8/dmdbms/bin/dmserver /dm8/dmdbms/data/DAMENG/dm.ini file dm.key not found, use default license! openssl lib load failed! code: -115

原因是因为环境变量设置不对,可以参考:

[dmdba@localhost ~]$ vi .bash_profile # .bash_profile # Get the aliases and functions if [ -f ~/.bashrc ]; then . ~/.bashrc fi # User specific environment and startup programs PATH=$PATH:$HOME/.local/bin:$HOME/bin export PATH export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/dm8/dmdbms/bin" export DM_HOME="/dm8/dmdbms" export PATH=$PATH:$DM_HOME/bin

7 如何集群变单机

有时候在搭建主备集群的时候出现问题,需要将主库变为单机,再重新搭建。

[dmdba@localhost bin]$ /dm8/dmdbms/bin/DmServiceMMJ_P start Starting DmServiceMMJ_P: [ OK ] [dmdba@localhost bin]$ ./disql SYSDBA/SYSDBA@192.168.182.129:5236 服务器[LOCALHOST:5236]:处于主库配置状态 登录使用时间 : 4.346(ms) SQL> SP_SET_PARA_VALUE(1, 'ALTER_MODE_STATUS', 1); DMSQL 过程已成功完成 已用时间: 40.762(毫秒). 执行号:201. SQL> alter database normal; 操作已执行 已用时间: 6.643(毫秒). 执行号:0. SQL> alter database open; 操作已执行 已用时间: 38.446(毫秒). 执行号:0.

重新对主库进行备份之后,再次搭建即可。

注意,在重新备份还原之后,由于之前主库在注册服务的时候是以 mount 方式注册,后面启动应该也是 mount 状态,上述操作是直接将主库的状态改为了 open,所以在启动的时候,需要将主库状态修改为 mount 之后,再去正常启动。

评论
后发表回复

作者

文章

阅读量

获赞

扫一扫
联系客服