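DM Data Watch is an integrated high-availability, high-performance database solution, positioned as the preferred option for remote disaster recovery. Deploying it protects data against damage or loss in extreme situations such as hardware failure (for example, a damaged disk) or natural disasters (earthquake, fire), and lets database service be restored quickly, meeting the requirement for uninterrupted database service. Compared with conventional backup and restore, Data Watch restores service much faster: as data volume grows, recovering through a restore often takes several hours or longer, whereas Data Watch is largely unaffected by data volume and can switch a standby to primary within seconds.
This test uses DM8 in primary/standby mode with real-time synchronization. The environment is: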
| Role | Hostname | Instance | IP | Port |
|---|---|---|---|---|
| Primary | panda01 | SCM01 | 192.168.66.201 | 15236 |
| Standby | panda02 | SCM02 | 192.168.66.202 | 25236 |
Create a tablespace and a user on the primary, then insert data:
-- Create the tablespace and user
create tablespace DATA datafile '/dm8/dmdata/SCM01/DATA01.DBF' size 200 AUTOEXTEND OFF;
create user panda identified by "dm_OPS_123" DEFAULT tablespace DATA;
grant dba to panda;
-- Create the business table panda.testdw, insert 4 rows, and commit
[dmdba@panda01 SCM01]$ /dm8/dmdbms/bin/disql panda/dm_OPS_123@192.168.66.201:15236
服务器[192.168.66.201:15236]:处于主库打开状态
登录使用时间 : 4.349(ms)
create table panda.testdw (id int,time TIMESTAMP DEFAULT SYSDATE);
insert into panda.testdw (id) values(1);
insert into panda.testdw (id) values(2);
insert into panda.testdw (id) values(3);
insert into panda.testdw (id) values(4);
commit;
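The primary accepts writes as expected (the disql banner 处于主库打开状态 means the server is a primary in the open state).
After logging in to the standby (banner 处于备库打开状态, a standby in the open state), you can see that it cannot create a tablespace or insert data: both attempts fail with error -710 (试图在STANDBY模式下,修改用户库, an attempt to modify a user database in STANDBY mode). It can, however, query the data written on the primary: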
[dmdba@panda02 ~]$ /dm8/dmdbms/bin/disql SYSDBA/dm_OPS_123@192.168.66.202:25236
服务器[192.168.66.202:25236]:处于备库打开状态
登录使用时间 : 4.538(ms)
disql V8
SQL> create tablespace DATA datafile '/dm8/dmdata/SCM01/DATA01.DBF' size 200 AUTOEXTEND OFF;
create tablespace DATA datafile '/dm8/dmdata/SCM01/DATA01.DBF' size 200 AUTOEXTEND OFF;
[-710]:试图在STANDBY模式下,修改用户库.
已用时间: 4.917(毫秒). 执行号:0.
SQL> select path from v$datafile;
行号 PATH
---------- ----------------------------
1 /dm8/dmdata/SCM01/DATA01.DBF
2 /dm8/dmdata/SCM02/MAIN.DBF
3 /dm8/dmdata/SCM02/ROLL.DBF
4 /dm8/dmdata/SCM02/TEMP.DBF
5 /dm8/dmdata/SCM02/SYSTEM.DBF
已用时间: 15.518(毫秒). 执行号:102.
SQL> select * from panda.testdw;
行号 ID TIME
---------- ----------- --------------------------
1 1 2025-11-04 15:40:57.000000
2 2 2025-11-04 15:42:22.000000
3 3 2025-11-04 15:42:22.000000
4 4 2025-11-04 15:42:22.000000
已用时间: 0.392(毫秒). 执行号:101.
SQL> insert into panda.testdw (id) values(1);
insert into panda.testdw (id) values(1);
[-710]:试图在STANDBY模式下,修改用户库.
已用时间: 15.965(毫秒). 执行号:0.
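The data is fully synchronized, which shows that real-time replication from primary to standby is working. **The standby rejects write operations**, which is exactly how a primary/standby architecture should behave.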
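The role check above can be scripted. The following is a minimal shell sketch, not part of the original test: `role_from_banner` and `write_rejected` are hypothetical helper names, and they rely only on strings that appear in the transcripts above (the banners 处于主库打开状态 and 处于备库打开状态, and the error code -710). How the disql output is captured is left to the reader.

```shell
# Hypothetical helpers: classify captured disql output using only strings
# that appear in the transcripts above.

# Print primary, standby, or unknown based on the disql login banner in $1.
role_from_banner() {
  case "$1" in
    *主库打开状态*) echo primary ;;
    *备库打开状态*) echo standby ;;
    *) echo unknown ;;
  esac
}

# Succeed if the captured output in $1 contains the standby write error -710.
write_rejected() {
  case "$1" in
    *'[-710]'*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Matching on banner text is fragile if the client language changes, so treat this as a convenience for reading test transcripts rather than a health check.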
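A manual switchover (Switchover) suits planned maintenance or a planned cutover while the primary is healthy.
The switch commands are as follows: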
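-- Primary panda01 SCM01 192.168.66.201:15236
-- Standby panda02 SCM02 192.168.66.202:25236
choose switchover GRP1 -- primary healthy: list the instances that can be switched to primary
switchover GRP1.<instance_name> -- primary healthy: switch the given instance of the given group to primary
choose takeover GRP1 -- primary failed: list the instances that can take over as primary
takeover GRP1.<instance_name> -- primary failed: make the given instance of the given group take over as primary
choose takeover force GRP1 -- forced takeover: list the instances that can take over as primary
takeover force GRP1.<instance_name> -- forced takeover: make the given instance of the given group take over as primary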
In the DM monitor, log in as sysdba first, then run the switch command:
login
用户名:sysdba
密码:
[monitor] 2025-11-04 16:18:04: 登录监视器成功!
choose switchover grp1
Can choose one of the following instances to do switchover:
1: GRP1_SCM_S
switchover grp1.GRP1_SCM_S
此操作需谨慎, 将会导致主库发生切换, 是否继续使用GRP1.GRP1_SCM_S执行SWITCHOVER操作(YES/NO/Y/N)?
Y
[monitor] 2025-11-04 16:18:59: 开始切换实例GRP1_SCM_S
[monitor] 2025-11-04 16:18:59: 通知守护进程GRP1_SCM_P切换SWITCHOVER状态
[monitor] 2025-11-04 16:18:59: 守护进程(GRP1_SCM_P)状态切换 [OPEN-->SWITCHOVER]
[monitor] 2025-11-04 16:19:00: 切换守护进程GRP1_SCM_P为SWITCHOVER状态成功
[monitor] 2025-11-04 16:19:00: 通知守护进程GRP1_SCM_S切换SWITCHOVER状态
[monitor] 2025-11-04 16:19:00: 守护进程(GRP1_SCM_S)状态切换 [OPEN-->SWITCHOVER]
[monitor] 2025-11-04 16:19:00: 切换守护进程GRP1_SCM_S为SWITCHOVER状态成功
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_P开始执行SP_SET_GLOBAL_DW_STATUS(0, 6)语句
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_P执行SP_SET_GLOBAL_DW_STATUS(0, 6)语句成功
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_S开始执行SP_SET_GLOBAL_DW_STATUS(0, 6)语句
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_S执行SP_SET_GLOBAL_DW_STATUS(0, 6)语句成功
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_P开始执行ALTER DATABASE MOUNT语句
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_P执行ALTER DATABASE MOUNT语句成功
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_S开始执行SP_APPLY_KEEP_PKG()语句
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_S执行SP_APPLY_KEEP_PKG()语句成功
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_S开始执行ALTER DATABASE MOUNT语句
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_S执行ALTER DATABASE MOUNT语句成功
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_P开始执行ALTER DATABASE STANDBY语句
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_P执行ALTER DATABASE STANDBY语句成功
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_S开始执行ALTER DATABASE PRIMARY语句
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_S执行ALTER DATABASE PRIMARY语句成功
[monitor] 2025-11-04 16:19:00: 通知实例GRP1_SCM_S修改所有归档状态无效
[monitor] 2025-11-04 16:19:00: 修改所有实例归档为无效状态成功
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_P开始执行ALTER DATABASE OPEN FORCE语句
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_P执行ALTER DATABASE OPEN FORCE语句成功
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_S开始执行ALTER DATABASE OPEN FORCE语句
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_S执行ALTER DATABASE OPEN FORCE语句成功
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_P开始执行SP_SET_GLOBAL_DW_STATUS(6, 0)语句
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_P执行SP_SET_GLOBAL_DW_STATUS(6, 0)语句成功
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_S开始执行SP_SET_GLOBAL_DW_STATUS(6, 0)语句
[monitor] 2025-11-04 16:19:00: 实例GRP1_SCM_S执行SP_SET_GLOBAL_DW_STATUS(6, 0)语句成功
[monitor] 2025-11-04 16:19:00: 通知守护进程GRP1_SCM_P切换OPEN状态
[monitor] 2025-11-04 16:19:00: 守护进程(GRP1_SCM_P)状态切换 [SWITCHOVER-->OPEN]
[monitor] 2025-11-04 16:19:01: 切换守护进程GRP1_SCM_P为OPEN状态成功
[monitor] 2025-11-04 16:19:01: 通知守护进程GRP1_SCM_S切换OPEN状态
[monitor] 2025-11-04 16:19:01: 守护进程(GRP1_SCM_S)状态切换 [SWITCHOVER-->OPEN]
[monitor] 2025-11-04 16:19:01: 切换守护进程GRP1_SCM_S为OPEN状态成功
[monitor] 2025-11-04 16:19:01: 通知组(GRP1)的守护进程执行清理操作
[monitor] 2025-11-04 16:19:01: 清理守护进程(GRP1_SCM_P)请求成功
[monitor] 2025-11-04 16:19:01: 清理守护进程(GRP1_SCM_S)请求成功
[monitor] 2025-11-04 16:19:01: 实例GRP1_SCM_S切换成功
2025-11-04 16:19:01
#================================================================================#
GROUP OGUID MON_CONFIRM MODE MPP_FLAG
GRP1 25114 TRUE AUTO FALSE
<<DATABASE GLOBAL INFO:>>
DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT DETACHED
192.168.66.102 25238 2025-11-04 16:19:00 GLOBAL VALID OPEN GRP1_SCM_S OK 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID FALSE
EP INFO:
INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG
192.168.66.202 25236 OK GRP1_SCM_S OPEN PRIMARY 0 0 REALTIME VALID 3230 46481 3230 46482 NONE
<<DATABASE GLOBAL INFO:>>
DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT DETACHED
192.168.66.101 15238 2025-11-04 16:19:01 GLOBAL VALID OPEN GRP1_SCM_P OK 1 1 OPEN STANDBY DSC_OPEN REALTIME INVALID FALSE
EP INFO:
INST_IP INST_PORT INST_OK INAME ISTATUS IMODE DSC_SEQNO DSC_CTL_NODE RTYPE RSTAT FSEQ FLSN CSEQ CLSN DW_STAT_FLAG
192.168.66.201 15236 OK GRP1_SCM_P OPEN STANDBY 0 0 REALTIME INVALID 3228 46405 3228 46405 NONE
DATABASE(GRP1_SCM_P) APPLY INFO FROM (GRP1_SCM_S), REDOS_PARALLEL_NUM (1), WAIT_APPLY[FALSE]:
DSC_SEQNO[0], (RSEQ, SSEQ, KSEQ)[3228, 3228, 3228], (RLSN, SLSN, KLSN)[46405, 46405, 46405], N_TSK[0], TSK_MEM_USE[0]
#================================================================================#
[monitor] 2025-11-04 16:19:04: 守护进程(GRP1_SCM_S)状态切换 [OPEN-->RECOVERY]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:19:03 RECOVERY OK GRP1_SCM_S OPEN PRIMARY VALID 3 46482 46482
[monitor] 2025-11-04 16:19:07: 守护进程(GRP1_SCM_S)状态切换 [RECOVERY-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:19:06 OPEN OK GRP1_SCM_S OPEN PRIMARY VALID 3 46482 46482
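The primary has been demoted to STANDBY and the standby promoted to PRIMARY, with the data kept consistent. DM carries out the whole sequence automatically; the monitor timestamps (16:18:59 to 16:19:01) show that it finished in about 2 seconds, so service is affected only for that brief window. Logging in again shows that the former primary is now a standby: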
[dmdba@panda01 SCM01]$ /dm8/dmdbms/bin/disql panda/dm_OPS_123@192.168.66.201:15236
服务器[192.168.66.201:15236]:处于备库打开状态
登录使用时间 : 6.168(ms)
disql V8
SQL> insert into panda.testdw (id) values(5);
insert into panda.testdw (id) values(5);
[-710]:试图在STANDBY模式下,修改用户库.
已用时间: 3.520(毫秒). 执行号:0.
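The duration of a switch can be read off the monitor log mechanically. Below is a minimal sketch, assuming the monitor output has been saved to a file (the name `dmmonitor.log` is hypothetical), that each line has the `[monitor] DATE TIME: message` shape shown above, and that both markers fall on the same day. `monitor_elapsed` is an illustrative helper, not a DM tool.

```shell
# monitor_elapsed START_TEXT END_TEXT
# Reads monitor log lines on stdin and prints the number of seconds between
# the first line containing START_TEXT and the next line containing END_TEXT.
# Assumes both lines carry a time of day in field 3 and fall on the same day.
monitor_elapsed() {
  awk -v start_pat="$1" -v end_pat="$2" '
    # Convert HH:MM:SS to seconds since midnight.
    function secs(t,   p) { split(t, p, ":"); return p[1] * 3600 + p[2] * 60 + p[3] }
    # Field 3 looks like "16:18:59:"; drop the trailing colon.
    { t = $3; sub(/:$/, "", t) }
    !started && index($0, start_pat) { s = secs(t); started = 1; next }
    started && index($0, end_pat)    { print secs(t) - s; exit }
  '
}

# Example (hypothetical file name):
#   monitor_elapsed '开始切换实例' '切换成功' < dmmonitor.log
```

For the switchover log above, the two markers are stamped 16:18:59 and 16:19:01, so the function prints 2. The same helper works for the takeover logs by passing 检测到PRIMARY实例故障 and 自动接管成功 as the markers.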
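Scenario 1: the primary host loses power. Simulation method: reboot
Watch the monitor output:
-- After the manual switchover above, the roles are as listed below; use reboot to simulate a power loss and restart of host panda02, which now runs the primary
-- Standby panda01 SCM01 192.168.66.201:15236
-- Primary panda02 SCM02 192.168.66.202:25236
[root@panda02 ~]# reboot
-- Monitor output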
[monitor] 2025-11-04 16:26:31: 接收守护进程(GRP1_SCM_S)消息超时
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:26:09 ERROR OK GRP1_SCM_S OPEN PRIMARY VALID 3 46489 46489
[monitor] 2025-11-04 16:26:31: 检测到PRIMARY实例故障,开始对组(GRP1)执行自动接管
[monitor] 2025-11-04 16:26:31: 通知组(GRP1)当前活动的守护进程设置MID
[monitor] 2025-11-04 16:26:31: 通知组(GRP1)当前活动的守护进程设置MID成功
[monitor] 2025-11-04 16:26:31: 开始使用实例GRP1_SCM_P接管
[monitor] 2025-11-04 16:26:31: 通知守护进程GRP1_SCM_P切换TAKEOVER状态
[monitor] 2025-11-04 16:26:31: 守护进程(GRP1_SCM_P)状态切换 [OPEN-->TAKEOVER]
[monitor] 2025-11-04 16:26:31: 切换守护进程GRP1_SCM_P为TAKEOVER状态成功
[monitor] 2025-11-04 16:26:31: 实例GRP1_SCM_P开始执行SP_SET_GLOBAL_DW_STATUS(0, 7)语句
[monitor] 2025-11-04 16:26:31: 实例GRP1_SCM_P执行SP_SET_GLOBAL_DW_STATUS(0, 7)语句成功
[monitor] 2025-11-04 16:26:31: 实例GRP1_SCM_P开始执行SP_APPLY_KEEP_PKG()语句
[monitor] 2025-11-04 16:26:31: 实例GRP1_SCM_P执行SP_APPLY_KEEP_PKG()语句成功
[monitor] 2025-11-04 16:26:31: 实例GRP1_SCM_P开始执行ALTER DATABASE MOUNT语句
[monitor] 2025-11-04 16:26:31: 实例GRP1_SCM_P执行ALTER DATABASE MOUNT语句成功
[monitor] 2025-11-04 16:26:31: 实例GRP1_SCM_P开始执行ALTER DATABASE PRIMARY语句
[monitor] 2025-11-04 16:26:31: 实例GRP1_SCM_P执行ALTER DATABASE PRIMARY语句成功
[monitor] 2025-11-04 16:26:31: 通知实例GRP1_SCM_P修改所有归档状态无效
[monitor] 2025-11-04 16:26:31: 修改所有实例归档为无效状态成功
[monitor] 2025-11-04 16:26:31: 实例GRP1_SCM_P开始执行ALTER DATABASE OPEN FORCE语句
[monitor] 2025-11-04 16:26:31: 实例GRP1_SCM_P执行ALTER DATABASE OPEN FORCE语句成功
[monitor] 2025-11-04 16:26:31: 实例GRP1_SCM_P开始执行SP_SET_GLOBAL_DW_STATUS(7, 0)语句
[monitor] 2025-11-04 16:26:31: 实例GRP1_SCM_P执行SP_SET_GLOBAL_DW_STATUS(7, 0)语句成功
[monitor] 2025-11-04 16:26:31: 通知守护进程GRP1_SCM_P切换OPEN状态
[monitor] 2025-11-04 16:26:31: 守护进程(GRP1_SCM_P)状态切换 [TAKEOVER-->OPEN]
[monitor] 2025-11-04 16:26:31: 切换守护进程GRP1_SCM_P为OPEN状态成功
[monitor] 2025-11-04 16:26:31: 通知组(GRP1)的守护进程执行清理操作
[monitor] 2025-11-04 16:26:31: 清理守护进程(GRP1_SCM_P)请求成功
[monitor] 2025-11-04 16:26:31: 使用实例GRP1_SCM_P接管成功
[monitor] 2025-11-04 16:26:31: 组(GRP1)使用实例GRP1_SCM_P自动接管成功 <<<<
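By detecting the primary's heartbeat timeout, the DM monitor triggers TAKEOVER automatically and promotes the standby to primary. The takeover itself completes within seconds (every takeover step in the log above is stamped 16:26:31), after which the application can continue writing.
-- Log in to the former standby: it is now the primary, and the switch took only seconds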
[dmdba@panda01 SCM01]$ /dm8/dmdbms/bin/disql panda/dm_OPS_123@192.168.66.201:15236
服务器[192.168.66.201:15236]:处于主库打开状态
登录使用时间 : 4.724(ms)
disql V8
-- After its reboot, panda02 came back automatically as the standby
Last login: Tue Nov 4 15:26:59 2025 from 192.168.66.11
[root@panda02 ~]# su - dmdba
上一次登录:二 11月 4 15:29:37 CST 2025pts/1 上
[dmdba@panda02 ~]$ /dm8/dmdbms/bin/disql SYSDBA/dm_OPS_123@192.168.66.202:25236
服务器[192.168.66.202:25236]:处于备库打开状态
登录使用时间 : 7.300(ms)
disql V8
[monitor] 2025-11-04 16:26:58: 守护进程(GRP1_SCM_S)状态切换 [NONE-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:26:57 STARTUP OK GRP1_SCM_S MOUNT PRIMARY VALID 3 46516 46516
[monitor] 2025-11-04 16:26:59: 守护进程(GRP1_SCM_S)状态切换 [STARTUP-->UNIFY EP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:26:58 UNIFY EP OK GRP1_SCM_S MOUNT PRIMARY VALID 3 46516 46516
[monitor] 2025-11-04 16:26:59: 守护进程(GRP1_SCM_S)状态切换 [UNIFY EP-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:26:58 STARTUP OK GRP1_SCM_S MOUNT STANDBY INVALID 3 46516 46516
[monitor] 2025-11-04 16:26:59: 守护进程(GRP1_SCM_S)状态切换 [STARTUP-->UNIFY EP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:26:58 UNIFY EP OK GRP1_SCM_S MOUNT STANDBY INVALID 3 46516 46516
[monitor] 2025-11-04 16:26:59: 守护进程(GRP1_SCM_S)状态切换 [UNIFY EP-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:26:58 STARTUP OK GRP1_SCM_S OPEN STANDBY INVALID 3 46516 46516
[monitor] 2025-11-04 16:26:59: 守护进程(GRP1_SCM_S)状态切换 [STARTUP-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:26:58 OPEN OK GRP1_SCM_S OPEN STANDBY INVALID 3 46516 46516
[monitor] 2025-11-04 16:26:59: 守护进程(GRP1_SCM_P)状态切换 [OPEN-->RECOVERY]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:26:59 RECOVERY OK GRP1_SCM_P OPEN PRIMARY VALID 4 46561 46561
[monitor] 2025-11-04 16:27:01: 守护进程(GRP1_SCM_P)状态切换 [RECOVERY-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:27:01 OPEN OK GRP1_SCM_P OPEN PRIMARY VALID 4 46561 46561
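After a takeover like the one above, a script that needs to write must find whichever node is currently the primary. The sketch below tries a list of endpoints in order and prints the first one whose login banner reports the primary role. `find_primary` and the probe command are hypothetical: the article does not show a non-interactive disql invocation, so the probe is passed in as a parameter and must simply print a disql-style banner for the endpoint it is given.

```shell
# find_primary PROBE ENDPOINT...
# Runs "PROBE endpoint" for each endpoint in order. PROBE is a caller-supplied
# command (an assumption of this sketch) that prints the disql login banner.
# Prints the first endpoint whose banner contains 主库打开状态 and returns 0;
# returns 1 if no endpoint reports the primary role.
find_primary() {
  local probe=$1
  shift
  local ep
  for ep in "$@"; do
    if "$probe" "$ep" 2>/dev/null | grep -q '主库打开状态'; then
      echo "$ep"
      return 0
    fi
  done
  return 1
}

# Example with the two endpoints from this test (my_probe is hypothetical):
#   find_primary my_probe 192.168.66.201:15236 192.168.66.202:25236
```

Keeping the probe separate makes the loop easy to exercise with a stub before pointing it at real instances.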
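Scenario 2: the primary loses its network. Simulation method: systemctl stop network
-- Simulate a network outage on host panda01, which runs the primary
[root@panda01 ~]# systemctl stop network
-- After 60 s, the standby has taken over as primary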
[dmdba@panda02 ~]$ /dm8/dmdbms/bin/disql SYSDBA/dm_OPS_123@192.168.66.202:25236
服务器[192.168.66.202:25236]:处于主库打开状态
登录使用时间 : 5.273(ms)
disql V8
[monitor] 2025-11-04 16:30:56: 接收守护进程(GRP1_SCM_P)消息超时
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:30:35 ERROR OK GRP1_SCM_P OPEN PRIMARY VALID 4 46594 46594
[monitor] 2025-11-04 16:30:56: 检测到PRIMARY实例故障,开始对组(GRP1)执行自动接管
[monitor] 2025-11-04 16:30:56: 通知组(GRP1)当前活动的守护进程设置MID
[monitor] 2025-11-04 16:30:56: 通知组(GRP1)当前活动的守护进程设置MID成功
[monitor] 2025-11-04 16:30:56: 开始使用实例GRP1_SCM_S接管
[monitor] 2025-11-04 16:30:56: 通知守护进程GRP1_SCM_S切换TAKEOVER状态
[monitor] 2025-11-04 16:30:56: 守护进程(GRP1_SCM_S)状态切换 [OPEN-->TAKEOVER]
[monitor] 2025-11-04 16:30:56: 切换守护进程GRP1_SCM_S为TAKEOVER状态成功
[monitor] 2025-11-04 16:30:56: 实例GRP1_SCM_S开始执行SP_SET_GLOBAL_DW_STATUS(0, 7)语句
[monitor] 2025-11-04 16:30:57: 实例GRP1_SCM_S执行SP_SET_GLOBAL_DW_STATUS(0, 7)语句成功
[monitor] 2025-11-04 16:30:57: 实例GRP1_SCM_S开始执行SP_APPLY_KEEP_PKG()语句
[monitor] 2025-11-04 16:30:57: 实例GRP1_SCM_S执行SP_APPLY_KEEP_PKG()语句成功
[monitor] 2025-11-04 16:30:57: 实例GRP1_SCM_S开始执行ALTER DATABASE MOUNT语句
[monitor] 2025-11-04 16:30:57: 实例GRP1_SCM_S执行ALTER DATABASE MOUNT语句成功
[monitor] 2025-11-04 16:30:57: 实例GRP1_SCM_S开始执行ALTER DATABASE PRIMARY语句
[monitor] 2025-11-04 16:30:57: 实例GRP1_SCM_S执行ALTER DATABASE PRIMARY语句成功
[monitor] 2025-11-04 16:30:57: 通知实例GRP1_SCM_S修改所有归档状态无效
[monitor] 2025-11-04 16:30:57: 修改所有实例归档为无效状态成功
[monitor] 2025-11-04 16:30:57: 实例GRP1_SCM_S开始执行ALTER DATABASE OPEN FORCE语句
[monitor] 2025-11-04 16:30:57: 实例GRP1_SCM_S执行ALTER DATABASE OPEN FORCE语句成功
[monitor] 2025-11-04 16:30:57: 实例GRP1_SCM_S开始执行SP_SET_GLOBAL_DW_STATUS(7, 0)语句
[monitor] 2025-11-04 16:30:57: 实例GRP1_SCM_S执行SP_SET_GLOBAL_DW_STATUS(7, 0)语句成功
[monitor] 2025-11-04 16:30:57: 通知守护进程GRP1_SCM_S切换OPEN状态
[monitor] 2025-11-04 16:30:57: 守护进程(GRP1_SCM_S)状态切换 [TAKEOVER-->OPEN]
[monitor] 2025-11-04 16:30:57: 切换守护进程GRP1_SCM_S为OPEN状态成功
[monitor] 2025-11-04 16:30:57: 通知组(GRP1)的守护进程执行清理操作
[monitor] 2025-11-04 16:30:57: 清理守护进程(GRP1_SCM_S)请求成功
[monitor] 2025-11-04 16:30:57: 使用实例GRP1_SCM_S接管成功
[monitor] 2025-11-04 16:30:57: 组(GRP1)使用实例GRP1_SCM_S自动接管成功
-- Restore the network on panda01
systemctl start network
[monitor] 2025-11-04 16:34:06: 实例GRP1_SCM_P[PRIMARY, SUSPEND, ISTAT_SAME:TRUE]故障
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:34:06 STARTUP ERROR GRP1_SCM_P SUSPEND PRIMARY VALID 4 46594 46594
[monitor] 2025-11-04 16:34:06: 守护进程(GRP1_SCM_P)状态切换 [NONE-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:34:06 STARTUP ERROR GRP1_SCM_P SUSPEND PRIMARY VALID 4 46594 46594
[monitor] 2025-11-04 16:34:21: 实例GRP1_SCM_P[PRIMARY, MOUNT, ISTAT_SAME:TRUE]恢复正常
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:34:21 STARTUP OK GRP1_SCM_P MOUNT PRIMARY VALID 4 46594 46594
[monitor] 2025-11-04 16:34:34: 守护进程(GRP1_SCM_P)状态切换 [STARTUP-->UNIFY EP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:34:34 UNIFY EP OK GRP1_SCM_P MOUNT PRIMARY VALID 4 46594 46594
[monitor] 2025-11-04 16:34:34: 守护进程(GRP1_SCM_P)状态切换 [UNIFY EP-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:34:34 STARTUP OK GRP1_SCM_P MOUNT STANDBY INVALID 4 46594 46594
[monitor] 2025-11-04 16:34:34: 守护进程(GRP1_SCM_P)状态切换 [STARTUP-->UNIFY EP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:34:34 UNIFY EP OK GRP1_SCM_P MOUNT STANDBY INVALID 4 46594 46594
[monitor] 2025-11-04 16:34:34: 守护进程(GRP1_SCM_P)状态切换 [UNIFY EP-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:34:34 STARTUP OK GRP1_SCM_P OPEN STANDBY INVALID 4 46594 46594
[monitor] 2025-11-04 16:34:34: 守护进程(GRP1_SCM_P)状态切换 [STARTUP-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:34:34 OPEN OK GRP1_SCM_P OPEN STANDBY INVALID 4 46594 46594
[monitor] 2025-11-04 16:34:35: 守护进程(GRP1_SCM_S)状态切换 [OPEN-->RECOVERY]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:34:34 RECOVERY OK GRP1_SCM_S OPEN PRIMARY VALID 5 46688 46688
[monitor] 2025-11-04 16:34:36: 守护进程(GRP1_SCM_S)状态切换 [RECOVERY-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:34:35 OPEN OK GRP1_SCM_S OPEN PRIMARY VALID 5 46688 46688
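Observed behavior: after the primary's network was cut, the monitor detected the failure within about 60 seconds (the log reports the timeout at 16:30:56, roughly 21 seconds after the last WTIME recorded for the daemon, 16:30:35), and the standby automatically took over as primary.
Once the network was restored, the former primary rejoined automatically as a standby and caught up on the data.
-- panda01 has automatically become the standby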
Last login: Tue Nov 4 15:29:17 2025 from 192.168.66.11
[root@panda01 ~]# su - dmdba
上一次登录:二 11月 4 16:35:55 CST 2025pts/2 上
[dmdba@panda01 ~]$ /dm8/dmdbms/bin/disql SYSDBA/dm_OPS_123@192.168.66.201:15236
服务器[192.168.66.201:15236]:处于备库打开状态
登录使用时间 : 5.475(ms)
disql V8
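The rejoin sequence in the monitor output above is easier to follow when reduced to the daemon's state transitions. This is a minimal sketch, assuming the monitor output is available on standard input in the format shown; `daemon_transitions` is a hypothetical helper name.

```shell
# daemon_transitions NAME
# Reads monitor log lines on stdin and prints, one per line, the state
# transitions reported for daemon NAME (for example GRP1_SCM_P).
daemon_transitions() {
  # Keep only this daemon's "状态切换" lines, then extract the text inside
  # the final pair of square brackets, e.g. "STARTUP-->UNIFY EP".
  grep -F "守护进程($1)状态切换" |
    sed 's/.*\[\(.*\)\][[:space:]]*$/\1/'
}
```

Applied to the Scenario 2 log for GRP1_SCM_P, this yields the path NONE-->STARTUP, STARTUP-->UNIFY EP, UNIFY EP-->STARTUP (twice over, as the instance moves from PRIMARY to STANDBY and from MOUNT to OPEN), and finally STARTUP-->OPEN.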
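Scenario 3 resembles Scenario 1: it also simulates the primary becoming unavailable. Simulation method: kill -9 dmserver_pid
-- Use kill -9 to simulate an abnormal crash of the primary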
[dmdba@panda02 ~]$ /dm8/dmdbms/bin/disql SYSDBA/dm_OPS_123@192.168.66.202:25236
服务器[192.168.66.202:25236]:处于主库打开状态
登录使用时间 : 4.625(ms)
disql V8
SQL> exit
[dmdba@panda02 ~]$ ps -ef|grep dms
dmdba 9704 1 0 16:26 ? 00:00:04 /dm8/dmdbms/bin/dmserver path=/dm8/dmdata/SCM02/dm.ini -noconsole mount
dmdba 10645 10478 0 16:37 pts/0 00:00:00 grep --color=auto dms
[dmdba@panda02 ~]$
[dmdba@panda02 ~]$ kill -9 9704
[dmdba@panda02 ~]$ ps -ef|grep dms
dmdba 10671 10478 0 16:37 pts/0 00:00:00 grep --color=auto dms
-- Monitor output
[monitor] 2025-11-04 16:37:31: 实例GRP1_SCM_S[PRIMARY, OPEN, ISTAT_SAME:TRUE]故障
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:37:30 STARTUP ERROR GRP1_SCM_S OPEN PRIMARY VALID 5 46733 46733
[monitor] 2025-11-04 16:37:31: 守护进程(GRP1_SCM_S)状态切换 [OPEN-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:37:30 STARTUP ERROR GRP1_SCM_S OPEN PRIMARY VALID 5 46733 46733
[monitor] 2025-11-04 16:37:31: 检测到PRIMARY实例故障,开始对组(GRP1)执行自动接管
[monitor] 2025-11-04 16:37:31: 通知组(GRP1)当前活动的守护进程设置MID
[monitor] 2025-11-04 16:37:31: 通知组(GRP1)当前活动的守护进程设置MID成功
[monitor] 2025-11-04 16:37:31: 开始使用实例GRP1_SCM_P接管
[monitor] 2025-11-04 16:37:31: 通知守护进程GRP1_SCM_P切换TAKEOVER状态
[monitor] 2025-11-04 16:37:31: 守护进程(GRP1_SCM_P)状态切换 [OPEN-->TAKEOVER]
[monitor] 2025-11-04 16:37:31: 切换守护进程GRP1_SCM_P为TAKEOVER状态成功
[monitor] 2025-11-04 16:37:31: 实例GRP1_SCM_P开始执行SP_SET_GLOBAL_DW_STATUS(0, 7)语句
[monitor] 2025-11-04 16:37:32: 实例GRP1_SCM_P执行SP_SET_GLOBAL_DW_STATUS(0, 7)语句成功
[monitor] 2025-11-04 16:37:32: 实例GRP1_SCM_P开始执行SP_APPLY_KEEP_PKG()语句
[monitor] 2025-11-04 16:37:32: 实例GRP1_SCM_P执行SP_APPLY_KEEP_PKG()语句成功
[monitor] 2025-11-04 16:37:32: 实例GRP1_SCM_P开始执行ALTER DATABASE MOUNT语句
[monitor] 2025-11-04 16:37:32: 实例GRP1_SCM_P执行ALTER DATABASE MOUNT语句成功
[monitor] 2025-11-04 16:37:32: 实例GRP1_SCM_P开始执行ALTER DATABASE PRIMARY语句
[monitor] 2025-11-04 16:37:32: 实例GRP1_SCM_P执行ALTER DATABASE PRIMARY语句成功
[monitor] 2025-11-04 16:37:32: 通知实例GRP1_SCM_P修改所有归档状态无效
[monitor] 2025-11-04 16:37:32: 修改所有实例归档为无效状态成功
[monitor] 2025-11-04 16:37:32: 实例GRP1_SCM_P开始执行ALTER DATABASE OPEN FORCE语句
[monitor] 2025-11-04 16:37:32: 实例GRP1_SCM_P执行ALTER DATABASE OPEN FORCE语句成功
[monitor] 2025-11-04 16:37:32: 实例GRP1_SCM_P开始执行SP_SET_GLOBAL_DW_STATUS(7, 0)语句
[monitor] 2025-11-04 16:37:32: 实例GRP1_SCM_P执行SP_SET_GLOBAL_DW_STATUS(7, 0)语句成功
[monitor] 2025-11-04 16:37:32: 通知守护进程GRP1_SCM_P切换OPEN状态
[monitor] 2025-11-04 16:37:32: 守护进程(GRP1_SCM_P)状态切换 [TAKEOVER-->OPEN]
[monitor] 2025-11-04 16:37:32: 切换守护进程GRP1_SCM_P为OPEN状态成功
[monitor] 2025-11-04 16:37:32: 通知组(GRP1)的守护进程执行清理操作
[monitor] 2025-11-04 16:37:32: 清理守护进程(GRP1_SCM_P)请求成功
[monitor] 2025-11-04 16:37:32: 清理守护进程(GRP1_SCM_S)请求成功
[monitor] 2025-11-04 16:37:32: 使用实例GRP1_SCM_P接管成功
[monitor] 2025-11-04 16:37:32: 组(GRP1)使用实例GRP1_SCM_P自动接管成功
[monitor] 2025-11-04 16:37:53: 实例GRP1_SCM_S[PRIMARY, MOUNT, ISTAT_SAME:TRUE]恢复正常
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:37:52 STARTUP OK GRP1_SCM_S MOUNT PRIMARY VALID 5 46733 46733
[monitor] 2025-11-04 16:37:53: 守护进程(GRP1_SCM_S)状态切换 [STARTUP-->UNIFY EP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:37:52 UNIFY EP OK GRP1_SCM_S MOUNT PRIMARY VALID 5 46733 46733
[monitor] 2025-11-04 16:37:53: 守护进程(GRP1_SCM_S)状态切换 [UNIFY EP-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:37:53 STARTUP OK GRP1_SCM_S MOUNT STANDBY INVALID 5 46733 46733
[monitor] 2025-11-04 16:37:53: 守护进程(GRP1_SCM_S)状态切换 [STARTUP-->UNIFY EP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:37:53 UNIFY EP OK GRP1_SCM_S MOUNT STANDBY INVALID 5 46733 46733
[monitor] 2025-11-04 16:37:54: 守护进程(GRP1_SCM_S)状态切换 [UNIFY EP-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:37:53 STARTUP OK GRP1_SCM_S OPEN STANDBY INVALID 5 46733 46733
[monitor] 2025-11-04 16:37:54: 守护进程(GRP1_SCM_S)状态切换 [STARTUP-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:37:53 OPEN OK GRP1_SCM_S OPEN STANDBY INVALID 5 46733 46733
[monitor] 2025-11-04 16:37:54: 守护进程(GRP1_SCM_P)状态切换 [OPEN-->RECOVERY]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:37:54 RECOVERY OK GRP1_SCM_P OPEN PRIMARY VALID 6 46785 46785
[monitor] 2025-11-04 16:37:56: 守护进程(GRP1_SCM_P)状态切换 [RECOVERY-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2025-11-04 16:37:56 OPEN OK GRP1_SCM_P OPEN PRIMARY VALID 6 46785 46785
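DM behavior: the PRIMARY failure is detected immediately, and the standby automatically performs a TAKEOVER and opens as PRIMARY.
After the former primary restarts, it automatically becomes a standby and synchronizes fully; all data written during the failover is present.
-- Log in to the former standby, which is now the primary, and run some transactions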
truncate table panda.testdw;
insert into panda.testdw (id) values(100);
create table panda.hang (id int,time TIMESTAMP DEFAULT SYSDATE);
insert into panda.hang (id) values(1);
insert into panda.hang (id) values(2);
insert into panda.hang (id) values(3);
insert into panda.hang (id) values(4);
insert into panda.hang (id) values(5);
insert into panda.hang (id) values(6);
insert into panda.hang (id) values(7);
insert into panda.hang (id) values(8);
insert into panda.hang (id) values(9);
commit;
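-- Going to start the killed instance on panda02 again (GRP1_SCM_S in the monitor output) shows that it has already restarted automatically and that the data is in sync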
[dmdba@panda02 ~]$ /dm8/dmdbms/bin/disql SYSDBA/dm_OPS_123@192.168.66.202:25236
服务器[192.168.66.202:25236]:处于备库打开状态
登录使用时间 : 5.622(ms)
disql V8
SQL>
SQL> select START_TIME from v$instance;
行号 START_TIME
---------- -------------------
1 2025-11-04 16:37:49
已用时间: 5.317(毫秒). 执行号:2.
SQL> select * from panda.testdw;
行号 ID TIME
---------- ----------- --------------------------
1 100 2025-11-04 16:41:02.000000
已用时间: 4.554(毫秒). 执行号:3.
SQL> select * from panda.hang;
行号 ID TIME
---------- ----------- --------------------------
1 1 2025-11-04 16:41:02.000000
2 2 2025-11-04 16:41:02.000000
3 3 2025-11-04 16:41:02.000000
4 4 2025-11-04 16:41:02.000000
5 5 2025-11-04 16:41:02.000000
6 6 2025-11-04 16:41:02.000000
7 7 2025-11-04 16:41:02.000000
8 8 2025-11-04 16:41:02.000000
9 9 2025-11-04 16:41:02.000000
9 rows got
已用时间: 3.715(毫秒). 执行号:4.
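The manual comparison above can be turned into a repeatable check by running the same query on both nodes and comparing the captured output. The sketch below assumes each node's disql output has been saved to a file by some means not shown here; `normalize` and `same_result` are hypothetical helpers. Timing lines are dropped because they differ between runs.

```shell
# Drop lines that legitimately differ between runs: query timing
# (已用时间) and login timing (登录使用时间).
normalize() {
  grep -v -e '^已用时间' -e '^登录使用时间'
}

# same_result FILE_A FILE_B
# Succeeds when the two captured outputs match after normalization.
same_result() {
  [ "$(normalize < "$1")" = "$(normalize < "$2")" ]
}

# Example (hypothetical file names):
#   same_result primary_testdw.out standby_testdw.out && echo "in sync"
```

Row order is not guaranteed without an ORDER BY clause, so queries used for this kind of comparison should sort their results explicitly.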
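This test shows that the DM primary/standby architecture provides real-time data synchronization, a write-on-primary, read-only-standby protection model, smooth manual switchover (Switchover), and automatic failover (Takeover), and that it handles abnormal scenarios including power loss, network loss, and process crashes.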
