How to resolve pcs resource failures on controller nodes
A pcs resource failure occurred on the controller nodes.
Analysis
--------------------------------------
$ grep crmd\: corosync.log-20180704 | grep "Jul 03 13:2" | grep failed
Jul 03 13:29:13 [3717] ggcsactr-0 crmd: info: update_failcount: Updating failcount for rabbitmq on ggcsactr-1 after failed monitor: rc=7 (update=value++, time=1530592153)
Jul 03 13:29:13 [3717] ggcsactr-0 crmd: info: process_graph_event: Detected action (1985.69) rabbitmq_monitor_10000.417=not running: failed
Jul 03 13:29:22 [3717] ggcsactr-0 crmd: info: update_failcount: Updating failcount for rabbitmq on ggcsactr-2 after failed monitor: rc=1 (update=value++, time=1530592162)
Jul 03 13:29:22 [3717] ggcsactr-0 crmd: info: process_graph_event: Detected action (1987.71) rabbitmq_monitor_10000.170=unknown error: failed
Jul 03 13:29:24 [3717] ggcsactr-0 crmd: info: update_failcount: Updating failcount for rabbitmq on ggcsactr-0 after failed monitor: rc=7 (update=value++, time=1530592164)
Jul 03 13:29:24 [3717] ggcsactr-0 crmd: info: process_graph_event: Detected action (1986.76) rabbitmq_monitor_10000.253=not running: failed
Jul 03 13:29:48 [3717] ggcsactr-0 crmd: info: update_failcount: Updating failcount for rabbitmq on ggcsactr-2 after failed monitor: rc=7 (update=value++, time=1530592188)
Jul 03 13:29:48 [3717] ggcsactr-0 crmd: info: process_graph_event: Detected action (1987.71) rabbitmq_monitor_10000.170=not running: failed
$ grep 'Out of memory' messages
Jul 3 13:29:09 ggcsactr-0 kernel: Out of memory: Kill process 1022198 (beam.smp) score 269 or sacrifice child
Jul 3 13:29:09 ggcsactr-0 kernel: Out of memory: Kill process 1022198 (beam.smp) score 269 or sacrifice child
Jul 3 16:18:44 ggcsactr-0 kernel: Out of memory: Kill process 752783 (beam.smp) score 274 or sacrifice child
Jul 3 16:18:44 ggcsactr-0 kernel: Out of memory: Kill process 752783 (beam.smp) score 274 or sacrifice child
$ grep 'Out of memory' messages
Jul 3 13:28:57 ggcsactr-1 kernel: Out of memory: Kill process 386499 (beam.smp) score 329 or sacrifice child
Jul 3 13:28:57 ggcsactr-1 kernel: Out of memory: Kill process 386499 (beam.smp) score 329 or sacrifice child
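The rabbitmq monitor failures occurred because the oom-killer killed the rabbitmq beam.smp process:
the kills at 13:28:57 on ggcsactr-1 and 13:29:09 on ggcsactr-0 come just before the monitor failures recorded for those nodes at 13:29:13 and 13:29:24.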
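The same comparison can be scripted so that OOM kills and monitor failures are easier to line up by time and node. The following is only a rough sketch: the function names `list_oom_kills` and `count_monitor_failures` are made up for illustration, and the field positions assume the exact log line formats shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of pcs or pacemaker); they only reshape
# log lines that have the same format as the excerpts in this post.

# Print one line per distinct OOM kill: "<time> <host> <pid> <process>".
# The kernel often logs the same kill twice, so duplicates are dropped.
list_oom_kills() {
  awk '/Out of memory: Kill process/ {
         gsub(/[()]/, "", $12)      # "(beam.smp)" -> "beam.smp"
         print $3, $4, $11, $12
       }' | sort -u
}

# Count failed monitor operations: "<node> <resource> <count>".
count_monitor_failures() {
  sed -n 's/.*Updating failcount for \([^ ]*\) on \([^ ]*\) after failed monitor.*/\2 \1/p' \
    | sort | uniq -c | awk '{ print $2, $3, $1 }'
}

# Usage, with the file names from this post:
#   list_oom_kills < messages
#   grep 'crmd:' corosync.log-20180704 | count_monitor_failures
```

Run `list_oom_kills` against the `messages` file of each controller, then compare the times with the per-node counts from `count_monitor_failures`.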
---------------------------------------------------------------
Related documents
Resolved by reviewing a server memory upgrade, with reference to https://access.redhat.com/articles/1360063
Check the recommended memory size in the document below:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/director_installation_and_usage/chap-requirements#sect-Controller_Node_Requirements
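To compare a controller's installed memory against the size recommended in that document, something like the following could be used. This is a minimal sketch: `check_mem_total` is a hypothetical helper, and the required size is deliberately left as an argument, to be taken from the Red Hat requirements page rather than from this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare MemTotal with a required size in GiB.
# Usage: check_mem_total <required_GiB> [meminfo_file]
# Returns 0 if MemTotal >= required size, 1 otherwise.
check_mem_total() {
  local required_gib=$1
  local meminfo=${2:-/proc/meminfo}   # default to the live system
  awk -v need="$required_gib" '
    /^MemTotal:/ {
      gib = $2 / 1024 / 1024          # /proc/meminfo reports MemTotal in kB
      printf "MemTotal: %.1f GiB (required: %d GiB)\n", gib, need
      found = 1
      ok = (gib >= need)
    }
    END { if (found && ok) exit 0; exit 1 }
  ' "$meminfo"
}

# Example (N is the recommended size taken from the document above):
#   check_mem_total N || echo "consider adding memory"
```

MemTotal is somewhat lower than the physically installed RAM because the kernel reserves part of it, so a node with exactly the recommended amount installed may still report slightly less.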