Open-IM-Server/config/instance-down-rules.yml


Add Prometheus alerting functionality (#1424)

Signed-off-by: Xinwei Xiong(cubxxw) <3293172751nss@gmail.com>
Co-authored-by: lin.huang <lin.huang@apulis.com>
Co-authored-by: xuexihuang <1339326187@qq.com>
Co-authored-by: xuexihuang <xuexihuang@users.noreply.github.com>
Co-authored-by: cubxxw <cubxxw@users.noreply.github.com>
1 year ago
groups:
  - name: instance_down
    rules:
      - alert: InstanceDown
        expr: up == 0  # up is 0 when the most recent scrape of the target failed
        for: 1m  # the condition must hold for 1 minute before the alert fires
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
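
# A minimal sketch of how this rule file might be loaded, assuming a
# prometheus.yml that sits in the same directory (the relative path below is
# an assumption, not taken from the repository):
#
#   rule_files:
#     - "instance-down-rules.yml"
#
# The rule syntax can be checked before deployment with:
#
#   promtool check rules instance-down-rules.yml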