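To help us respond efficiently, please provide the following information when asking a question. Clearly described problems are answered first.
[DM version]: dm8
[Operating system]:
[CPU]:
[Problem description]*: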
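The primary database in our test environment went into a suspended state at around 5 p.m. yesterday. dmmonitor had been shut down some time ago at the test team's strong request. No OOM occurred, and the standby's logs show nothing abnormal.
The dmwatcher log on the primary shows the following: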
2024-03-05 17:09:17.437 [INFO] dmwatcher P0000005425 T0000000000000005430 检测到实例(IDP2)发送归档成功,设置为当前恢复实例
2024-03-05 17:09:17.442 [INFO] dmwatcher P0000005425 T0000000000000005430 dw2_notify_sql_exec, dseq = 1706082609, sql: ALTER DATABASE SUSPEND
2024-03-05 17:09:17.442 [INFO] dmwatcher P0000005425 T0000000000000005430 Send tcp msg to local ep IDP1, hpc_seqno:0, code:0
2024-03-05 17:09:17.443 [INFO] dmwatcher P0000005425 T0000000000000005430 设置GRP1守护进程子状态为WAIT_TO_SUSPEND状态
2024-03-05 17:09:17.443 [INFO] dmwatcher P0000005425 T0000000000000005430 向实例(IDP2)发送归档日志成功,实例(IDP1)转入suspend状态
2024-03-05 17:09:17.444 [INFO] dmwatcher P0000005425 T0000000000000005430 dw2_group_get_curr_ep_retcode, ep(IDP1) cmd_ret:cmd=1, dseq=1706082609, code=100
2024-03-05 17:09:17.572 [INFO] dmwatcher P0000005425 T0000000000000005430 dw2_group_get_curr_ep_retcode, ep(IDP1) cmd_ret:cmd=1, dseq=1706082609, code=0
2024-03-05 17:09:17.572 [INFO] dmwatcher P0000005425 T0000000000000005430 dw2_clear_ep_cmd_info_with_recv_inst_low, clear ep(IDP1) cmd info, and reset curr_ep to NULL.
2024-03-05 17:09:17.572 [INFO] dmwatcher P0000005425 T0000000000000005430 转入suspend状态后,再次发送归档日志
2024-03-05 17:09:17.573 [INFO] dmwatcher P0000005425 T0000000000000005430 dw2_rarch_send to IDP2[seqno: 0], dseq = 1706082610
2024-03-05 17:09:17.573 [INFO] dmwatcher P0000005425 T0000000000000005430 Send tcp msg to local ep IDP1, hpc_seqno:0, code:0
2024-03-05 17:09:17.573 [INFO] dmwatcher P0000005425 T0000000000000005430 设置GRP1守护进程子状态为WAIT_SEND_ALL_ARCH状态
2024-03-05 17:09:17.575 [INFO] dmwatcher P0000005425 T0000000000000005430 dw2_group_get_curr_ep_retcode, ep(IDP1) cmd_ret:cmd=210, dseq=1706082610, code=100
2024-03-05 17:09:17.712 [INFO] dmwatcher P0000005425 T0000000000000005430 dw2_group_get_curr_ep_retcode, ep(IDP1) cmd_ret:cmd=210, dseq=1706082610, code=100
After that, the log just keeps repeating the last few lines. Could anyone explain what causes this?
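Check the database server log for any error messages. Also, why was the monitor shut down? You can start the confirm monitor now and see whether the instance is brought back up automatically. If a core file was generated, you can analyze that as well.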
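To see how long the watcher has been waiting on a command, one option is to group the `dw2_group_get_curr_ep_retcode` lines by command and look at the last return code of each. The sketch below is only an illustration. The regular expression is based solely on the line format shown in the log above, and the function name `summarize_pending` is made up for this example. It also assumes that `code=0` means the command finished and any other value (such as `code=100`) means it has not. The `cmd=1` sequence in the log suggests this, but the post does not confirm it.

```python
import re

# Pattern derived from the dmwatcher lines quoted in this thread, e.g.
# "... dw2_group_get_curr_ep_retcode, ep(IDP1) cmd_ret:cmd=210, dseq=1706082610, code=100"
RETCODE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"dw2_group_get_curr_ep_retcode, ep\((?P<ep>\w+)\) "
    r"cmd_ret:cmd=(?P<cmd>\d+), dseq=(?P<dseq>\d+), code=(?P<code>-?\d+)"
)


def summarize_pending(lines):
    """Return the polled commands whose most recent return code is not 0.

    Assumption (not confirmed by the source): code=0 means "finished",
    anything else means the watcher is still waiting.
    """
    polls = {}
    for line in lines:
        m = RETCODE_RE.search(line)
        if m is None:
            continue  # not a return-code poll line
        key = (m["ep"], int(m["cmd"]), int(m["dseq"]))
        entry = polls.setdefault(
            key, {"first": m["ts"], "last": m["ts"], "count": 0, "code": None}
        )
        entry["last"] = m["ts"]
        entry["count"] += 1
        entry["code"] = int(m["code"])
    return {k: v for k, v in polls.items() if v["code"] != 0}


# Lines copied from the log excerpt in the question.
SAMPLE = [
    "2024-03-05 17:09:17.444 [INFO] dmwatcher P0000005425 T0000000000000005430 dw2_group_get_curr_ep_retcode, ep(IDP1) cmd_ret:cmd=1, dseq=1706082609, code=100",
    "2024-03-05 17:09:17.572 [INFO] dmwatcher P0000005425 T0000000000000005430 dw2_group_get_curr_ep_retcode, ep(IDP1) cmd_ret:cmd=1, dseq=1706082609, code=0",
    "2024-03-05 17:09:17.575 [INFO] dmwatcher P0000005425 T0000000000000005430 dw2_group_get_curr_ep_retcode, ep(IDP1) cmd_ret:cmd=210, dseq=1706082610, code=100",
    "2024-03-05 17:09:17.712 [INFO] dmwatcher P0000005425 T0000000000000005430 dw2_group_get_curr_ep_retcode, ep(IDP1) cmd_ret:cmd=210, dseq=1706082610, code=100",
]

for (ep, cmd, dseq), info in summarize_pending(SAMPLE).items():
    print(ep, cmd, dseq, info["first"], info["last"], info["count"], info["code"])
```

To run this against a real log, pass the open file object in place of `SAMPLE`. The timestamp of the first unfinished poll can then be matched against the database server log for the same moment.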