From 07eb8d9ca4fe58747ba5aeaa4fd2bb152a419495 Mon Sep 17 00:00:00 2001 From: TigErJin Date: Mon, 15 Sep 2025 13:56:20 +0900 Subject: [PATCH] convert to gitea --- README.md | 774 ++++++ alertmanager/alertmanager.yml | 16 + alertmanager/templates/msteams.tmpl | 53 + dashboard001.json | 957 +++++++ dashboard002.json | 331 +++ grafana/grafana.ini | 2464 +++++++++++++++++ grafana/ldap.toml | 75 + .../provisioning/access-control/sample.yaml | 68 + grafana/provisioning/alerting/sample.yaml | 227 ++ grafana/provisioning/dashboards/sample.yaml | 11 + grafana/provisioning/datasources/sample.yaml | 71 + grafana/provisioning/plugins/sample.yaml | 11 + prometheus/prometheus.yml | 199 ++ prometheus/rules/resource_alert.yml | 46 + promteams/config.env | 10 + promteams/start_promteams.sh | 26 + promteams/stop_promteams.sh | 11 + 17 files changed, 5350 insertions(+) create mode 100644 README.md create mode 100644 alertmanager/alertmanager.yml create mode 100644 alertmanager/templates/msteams.tmpl create mode 100644 dashboard001.json create mode 100644 dashboard002.json create mode 100644 grafana/grafana.ini create mode 100644 grafana/ldap.toml create mode 100644 grafana/provisioning/access-control/sample.yaml create mode 100644 grafana/provisioning/alerting/sample.yaml create mode 100644 grafana/provisioning/dashboards/sample.yaml create mode 100644 grafana/provisioning/datasources/sample.yaml create mode 100644 grafana/provisioning/plugins/sample.yaml create mode 100644 prometheus/prometheus.yml create mode 100644 prometheus/rules/resource_alert.yml create mode 100644 promteams/config.env create mode 100644 promteams/start_promteams.sh create mode 100644 promteams/stop_promteams.sh diff --git a/README.md b/README.md new file mode 100644 index 0000000..2d76fcf --- /dev/null +++ b/README.md @@ -0,0 +1,774 @@ +# Prometheus & Grafana 모니터링 시스템 구축 가이드 + +**Version:** 1.0.0 +**Last Modified:** 2025-08-08 + +## 개요 + +본 문서는 Prometheus, Grafana, Alertmanager를 포함하는 모니터링 스택을 구축하는 엔지니어링 절차를 
상세히 기술한다. 시스템은 서버 인프라의 핵심 메트릭을 수집, 시각화하며, 정의된 임계값에 따라 MS Teams로 자동화된 알림을 전송하는 것을 목표로 한다. + +구축 과정에서 발생한 기술적 문제와 해결 과정을 '트러블슈팅' 섹션에 상세히 기록하여, 향후 유사 시스템 구축 시 재현성과 안정성을 보장하고 기술적 부채를 최소화하는 데 중점을 둔다. + +## 목차 + +1. [시스템 아키텍처](#1-시스템-아키텍처) +2. [설치 및 구성 절차](#2-설치-및-구성-절차) + 1. [사전 준비 (`ds-commandcenter` 서버)](#21-사전-준비-ds-commandcenter-서버) + 2. [Node Exporter 설치 (모든 대상 서버)](#22-node-exporter-설치-모든-대상-서버) + 3. [Prometheus 설치 및 구성](#23-prometheus-설치-및-구성) + 4. [Alertmanager 설치 및 구성](#24-alertmanager-설치-및-구성) + 5. [Prometheus-MSTeams (Docker) 설치](#25-prometheus-msteams-docker-설치) + 6. [Grafana 설치 및 구성](#26-grafana-설치-및-구성) + 7. [Load Balancer 및 TLS 설정](#27-load-balancer-및-tls-설정) +3. [최종 확인 및 테스트](#3-최종-확인-및-테스트) +4. [트러블슈팅 (Troubleshooting)](#4-트러블슈팅-troubleshooting) +5. [부록 (Appendix)](#5-부록-appendix) + 1. [주요 컴포넌트 버전](#51-주요-컴포넌트-버전) + 2. [보안 권장 사항](#52-보안-권장-사항) + +--- + +## 1. 시스템 아키텍처 + +![시스템 아키텍처 다이어그램](https://i.imgur.com/your-architecture-diagram.png) + +1. **데이터 수집 (Data Collection):** 모든 대상 서버에 설치된 `Node Exporter`가 시스템 메트릭을 `:9500` 포트로 노출한다. +2. **수집 및 평가 (Scraping & Evaluation):** `ds-commandcenter` 서버의 `Prometheus`가 모든 Node Exporter로부터 메트릭을 수집하고, `resource_alert.yml` 규칙에 따라 알림 조건을 평가한다. +3. **알림 라우팅 (Alert Routing):** 알림 조건이 충족되면, `Prometheus`는 `Alertmanager`에게 알림을 전송한다. +4. **알림 처리 및 프록시 (Alert Processing & Proxying):** `Alertmanager`는 알림을 그룹화하고, `msteams.tmpl` 템플릿을 적용하여 `prometheus-msteams` Docker 컨테이너로 웹훅을 전송한다. +5. **최종 발송 (Final Delivery):** `prometheus-msteams` 컨테이너는 Alertmanager로부터 받은 데이터를 MS Teams 카드 형식으로 변환하여 최종적으로 MS Teams 채널에 알림을 보낸다. +6. **시각화 (Visualization):** `Grafana`는 Prometheus를 데이터 소스로 사용하여 모든 메트릭을 대시보드로 시각화한다. +7. **외부 접속 (External Access):** 사용자는 `Load Balancer`를 통해 HTTPS로 안전하게 Grafana와 Prometheus UI에 접근한다. + +--- + +## 2. 설치 및 구성 절차 + +### 2.1. 사전 준비 (`ds-commandcenter` 서버) +* **목적:** 서비스 실행에 필요한 시스템 계정, 디렉토리 생성 및 패키지 다운로드. +* **실행 위치:** `ds-commandcenter` 서버.
+* **명령어:** + ```bash + # 시스템 계정 생성 + useradd --no-create-home --shell /bin/false prometheus + useradd --no-create-home --shell /bin/false alertmanager + + # 디렉토리 생성 및 소유권 변경 + mkdir -p /etc/prometheus/rules /etc/alertmanager/templates /data/prometheus /data/alertmanager /data/promteams + chown -R prometheus:prometheus /etc/prometheus /data/prometheus + chown -R alertmanager:alertmanager /etc/alertmanager /data/alertmanager + + # 패키지 다운로드 + cd /data + wget https://github.com/prometheus/prometheus/releases/download/v3.5.0/prometheus-3.5.0.linux-amd64.tar.gz + wget https://github.com/prometheus/alertmanager/releases/download/v0.28.1/alertmanager-0.28.1.linux-amd64.tar.gz + wget https://github.com/prometheus/node_exporter/releases/download/v1.9.1/node_exporter-1.9.1.linux-amd64.tar.gz + + # 압축 해제 + tar -xvf prometheus-3.5.0.linux-amd64.tar.gz + tar -xvf alertmanager-0.28.1.linux-amd64.tar.gz + tar -xvf node_exporter-1.9.1.linux-amd64.tar.gz + ``` + +### 2.2. Node Exporter 설치 (모든 대상 서버) +* **목적:** 메트릭 수집 에이전트 설치 및 실행. +* **실행 위치:** 모니터링할 모든 서버 (공용, 게임 서버 포함). +* **실행 스크립트:** + ```bash + #!/bin/bash + # Node Exporter 설치 및 실행 스크립트 + + # 바이너리 파일 이동 + mv /data/node_exporter-1.9.1.linux-amd64/node_exporter /usr/local/bin/ + + # 시스템 계정 생성 + useradd --no-create-home --shell /bin/false node_exporter + + # systemd 서비스 파일 생성 + cat <<EOF > /etc/systemd/system/node_exporter.service + [Unit] + Description=Node Exporter + Wants=network-online.target + After=network-online.target + + [Service] + User=node_exporter + Group=node_exporter + Type=simple + ExecStart=/usr/local/bin/node_exporter --web.listen-address=":9500" + + [Install] + WantedBy=multi-user.target + EOF + + # 서비스 등록 및 시작 + systemctl daemon-reload + systemctl enable node_exporter + systemctl start node_exporter + systemctl status node_exporter + ``` + +### 2.3. Prometheus 설치 및 구성 +* **목적:** 메트릭 수집 서버 설정. +* **실행 위치:** `ds-commandcenter` 서버.
+* **설치 명령어:** + ```bash + # 바이너리 및 설정 파일 이동 + mv /data/prometheus-3.5.0.linux-amd64/{prometheus,promtool} /usr/local/bin/ + mv /data/prometheus-3.5.0.linux-amd64/{consoles,console_libraries} /etc/prometheus + chown -R prometheus:prometheus /etc/prometheus/* + chown prometheus:prometheus /usr/local/bin/prometheus /usr/local/bin/promtool + ``` +* **`/etc/prometheus/prometheus.yml`** + ```yaml + global: + scrape_interval: 1m + evaluation_interval: 1m + + alerting: + alertmanagers: + - static_configs: + - targets: + - localhost:9094 + + rule_files: + - "/etc/prometheus/rules/resource_alert.yml" + + scrape_configs: + - job_name: "prometheus" + static_configs: + - targets: ["localhost:9091"] + + - job_name: "common_servers" + static_configs: + - targets: ["10.0.10.21:9500"] + labels: + hostname: "ds-battlefield" + ip: "10.0.10.21" + - targets: ["10.0.10.6:9500"] + labels: + hostname: "ds-commandcenter" + ip: "10.0.10.6" + - targets: ["10.0.10.7:9500"] + labels: + hostname: "ds-crashreport" + ip: "10.0.10.7" + - targets: ["10.0.10.17:9500"] + labels: + hostname: "ds-maingate001" + ip: "10.0.10.17" + - targets: ["10.0.10.18:9500"] + labels: + hostname: "ds-maingate002" + ip: "10.0.10.18" + - targets: ["10.0.10.14:9500"] + labels: + hostname: "ds-mongodb001" + ip: "10.0.10.14" + - targets: ["10.0.10.15:9500"] + labels: + hostname: "ds-mongodb002" + ip: "10.0.10.15" + - targets: ["10.0.10.16:9500"] + labels: + hostname: "ds-mongodb003" + ip: "10.0.10.16" + - targets: ["10.0.10.8:9500"] + labels: + hostname: "ds-opensearch001" + ip: "10.0.10.8" + - targets: ["10.0.10.9:9500"] + labels: + hostname: "ds-opensearch002" + ip: "10.0.10.9" + - targets: ["10.0.10.10:9500"] + labels: + hostname: "ds-opensearch003" + ip: "10.0.10.10" + - targets: ["10.0.10.22:9500"] + labels: + hostname: "ds-promotor" + ip: "10.0.10.22" + - targets: ["10.0.10.24:9500"] + labels: + hostname: "ds-racetrack" + ip: "10.0.10.24" + - targets: ["10.0.10.11:9500"] + labels: + hostname: "ds-redis001" + ip: 
"10.0.10.11" + - targets: ["10.0.10.12:9500"] + labels: + hostname: "ds-redis002" + ip: "10.0.10.12" + - targets: ["10.0.10.13:9500"] + labels: + hostname: "ds-redis003" + ip: "10.0.10.13" + - targets: ["10.0.10.23:9500"] + labels: + hostname: "ds-social" + ip: "10.0.10.23" + - targets: ["10.0.10.25:9500"] + labels: + hostname: "ds-tavern" + ip: "10.0.10.25" + - targets: ["10.0.10.19:9500"] + labels: + hostname: "ds-warehouse001" + ip: "10.0.10.19" + - targets: ["10.0.10.20:9500"] + labels: + hostname: "ds-warehouse002" + ip: "10.0.10.20" + + - job_name: "game_servers" + static_configs: + - targets: ["110.234.163.37:9500"] + labels: + hostname: "ds-jpn-game001" + ip: "110.234.163.37" + - targets: ["110.234.163.30:9500"] + labels: + hostname: "ds-jpn-game002" + ip: "110.234.163.30" + - targets: ["110.234.161.170:9500"] + labels: + hostname: "ds-jpn-game003" + ip: "110.234.161.170" + - targets: ["110.234.160.149:9500"] + labels: + hostname: "ds-jpn-game004" + ip: "110.234.160.149" + - targets: ["110.234.162.181:9500"] + labels: + hostname: "ds-jpn-game005" + ip: "110.234.162.181" + - targets: ["110.234.160.50:9500"] + labels: + hostname: "ds-jpn-game006" + ip: "110.234.160.50" + - targets: ["110.234.165.61:9500"] + labels: + hostname: "ds-jpn-game007" + ip: "110.234.165.61" + - targets: ["110.234.163.151:9500"] + labels: + hostname: "ds-jpn-game008" + ip: "110.234.163.151" + - targets: ["110.234.195.8:9500"] + labels: + hostname: "ds-sgn-game001" + ip: "110.234.195.8" + - targets: ["110.234.193.164:9500"] + labels: + hostname: "ds-sgn-game002" + ip: "110.234.193.164" + - targets: ["110.234.193.189:9500"] + labels: + hostname: "ds-sgn-game003" + ip: "110.234.193.189" + - targets: ["110.234.192.213:9500"] + labels: + hostname: "ds-sgn-game004" + ip: "110.234.192.213" + - targets: ["110.234.194.108:9500"] + labels: + hostname: "ds-sgn-game005" + ip: "110.234.194.108" + - targets: ["110.234.194.199:9500"] + labels: + hostname: "ds-sgn-game006" + ip: "110.234.194.199" + - 
targets: ["110.234.194.179:9500"] + labels: + hostname: "ds-sgn-game007" + ip: "110.234.194.179" + - targets: ["110.234.193.159:9500"] + labels: + hostname: "ds-sgn-game008" + ip: "110.234.193.159" + - targets: ["44.198.4.245:9500"] + labels: + hostname: "ds-us-game001" + ip: "44.198.4.245" + - targets: ["52.5.176.32:9500"] + labels: + hostname: "ds-us-game002" + ip: "52.5.176.32" + - targets: ["98.86.208.130:9500"] + labels: + hostname: "ds-us-game003" + ip: "98.86.208.130" + - targets: ["98.87.57.10:9500"] + labels: + hostname: "ds-us-game004" + ip: "98.87.57.10" + - targets: ["18.153.131.248:9500"] + labels: + hostname: "ds-de-game001" + ip: "18.153.131.248" + - targets: ["18.185.201.217:9500"] + labels: + hostname: "ds-de-game002" + ip: "18.185.201.217" + - targets: ["3.124.28.212:9500"] + labels: + hostname: "ds-de-game003" + ip: "3.124.28.212" + - targets: ["3.69.139.75:9500"] + labels: + hostname: "ds-de-game004" + ip: "3.69.139.75" + + - job_name: "game_info" + static_configs: + - targets: ["10.0.10.22:9200"] + labels: + hostname: "ds-promotor" + ip: "10.0.10.22" + - targets: ["110.234.163.37:9200"] + labels: + hostname: "ds-jpn-game001" + ip: "110.234.163.37" + - targets: ["110.234.163.30:9200"] + labels: + hostname: "ds-jpn-game002" + ip: "110.234.163.30" + - targets: ["110.234.161.170:9200"] + labels: + hostname: "ds-jpn-game003" + ip: "110.234.161.170" + - targets: ["110.234.160.149:9200"] + labels: + hostname: "ds-jpn-game004" + ip: "110.234.160.149" + - targets: ["110.234.162.181:9200"] + labels: + hostname: "ds-jpn-game005" + ip: "110.234.162.181" + - targets: ["110.234.160.50:9200"] + labels: + hostname: "ds-jpn-game006" + ip: "110.234.160.50" + - targets: ["110.234.165.61:9200"] + labels: + hostname: "ds-jpn-game007" + ip: "110.234.165.61" + - targets: ["110.234.163.151:9200"] + labels: + hostname: "ds-jpn-game008" + ip: "110.234.163.151" + - targets: ["110.234.195.8:9200"] + labels: + hostname: "ds-sgn-game001" + ip: "110.234.195.8" + - targets: 
["110.234.193.164:9200"] + labels: + hostname: "ds-sgn-game002" + ip: "110.234.193.164" + - targets: ["110.234.193.189:9200"] + labels: + hostname: "ds-sgn-game003" + ip: "110.234.193.189" + - targets: ["110.234.192.213:9200"] + labels: + hostname: "ds-sgn-game004" + ip: "110.234.192.213" + - targets: ["110.234.194.108:9200"] + labels: + hostname: "ds-sgn-game005" + ip: "110.234.194.108" + - targets: ["110.234.194.199:9200"] + labels: + hostname: "ds-sgn-game006" + ip: "110.234.194.199" + - targets: ["110.234.194.179:9200"] + labels: + hostname: "ds-sgn-game007" + ip: "110.234.194.179" + - targets: ["110.234.193.159:9200"] + labels: + hostname: "ds-sgn-game008" + ip: "110.234.193.159" + - targets: ["44.198.4.245:9200"] + labels: + hostname: "ds-us-game001" + ip: "44.198.4.245" + - targets: ["52.5.176.32:9200"] + labels: + hostname: "ds-us-game002" + ip: "52.5.176.32" + - targets: ["98.86.208.130:9200"] + labels: + hostname: "ds-us-game003" + ip: "98.86.208.130" + - targets: ["98.87.57.10:9200"] + labels: + hostname: "ds-us-game004" + ip: "98.87.57.10" + - targets: ["18.153.131.248:9200"] + labels: + hostname: "ds-de-game001" + ip: "18.153.131.248" + - targets: ["18.185.201.217:9200"] + labels: + hostname: "ds-de-game002" + ip: "18.185.201.217" + - targets: ["3.124.28.212:9200"] + labels: + hostname: "ds-de-game003" + ip: "3.124.28.212" + - targets: ["3.69.139.75:9200"] + labels: + hostname: "ds-de-game004" + ip: "3.69.139.75" + + ``` +* **`/etc/prometheus/rules/resource_alert.yml`** + ```yaml + groups: + - name: 리소스 사용량 경고 + rules: + - alert: CPU사용량경고 + expr: 100 - (avg by(instance, hostname, ip) (rate(node_cpu_seconds_total{mode="idle"}[10m])) * 100) > 70 + for: 10m + labels: + severity: warning + annotations: + summary: "높은 CPU 사용량 감지" + description: "{{ $labels.hostname }}에서 지난 10분 동안 CPU 사용량이 70%를 초과했습니다." 
+ value: "{{ $value | printf \"%.2f\" }}" + runbook_url: "https://grafana.dungeonstalkers.com:8443" + + - alert: CPU사용량심각 + expr: 100 - (avg by(instance, hostname, ip) (rate(node_cpu_seconds_total{mode="idle"}[10m])) * 100) > 80 + for: 10m + labels: + severity: critical + annotations: + summary: "심각한 CPU 사용량 감지" + description: "{{ $labels.hostname }}에서 지난 10분 동안 CPU 사용량이 80%를 초과했습니다." + value: "{{ $value | printf \"%.2f\" }}" + runbook_url: "https://grafana.dungeonstalkers.com:8443" + + - alert: 메모리사용량경고 + expr: (1 - (node_memory_MemAvailable_bytes{hostname!=""} / node_memory_MemTotal_bytes{hostname!=""})) * 100 > 70 + for: 10m + labels: + severity: warning + annotations: + summary: "높은 메모리 사용량 감지" + description: "{{ $labels.hostname }}에서 지난 10분 동안 메모리 사용량이 70%를 초과했습니다." + value: "{{ $value | printf \"%.2f\" }}" + runbook_url: "https://grafana.dungeonstalkers.com:8443" + + - alert: 메모리사용량심각 + expr: (1 - (node_memory_MemAvailable_bytes{hostname!=""} / node_memory_MemTotal_bytes{hostname!=""})) * 100 > 80 + for: 10m + labels: + severity: critical + annotations: + summary: "심각한 메모리 사용량 감지" + description: "{{ $labels.hostname }}에서 지난 10분 동안 메모리 사용량이 80%를 초과했습니다." + value: "{{ $value | printf \"%.2f\" }}" + runbook_url: "https://grafana.dungeonstalkers.com:8443" + ``` +* **`/etc/systemd/system/prometheus.service`** + ```ini + [Unit] + Description=Prometheus + Wants=network-online.target + After=network-online.target + + [Service] + User=prometheus + Group=prometheus + Type=simple + ExecStart=/usr/local/bin/prometheus \ + --config.file /etc/prometheus/prometheus.yml \ + --storage.tsdb.path /data/prometheus/ \ + --web.listen-address=":9091" \ + --web.enable-lifecycle \ + --web.external-url=https://prometheus.dungeonstalkers.com:8444/ + + [Install] + WantedBy=multi-user.target + ``` +* **서비스 시작:** + ```bash + systemctl daemon-reload + systemctl enable prometheus + systemctl start prometheus + ``` + +### 2.4. Alertmanager 설치 및 구성 +* **목적:** 알림 처리 및 라우팅 서버 설정. 
+* **실행 위치:** `ds-commandcenter` 서버. +* **설치 명령어:** + ```bash + mv /data/alertmanager-0.28.1.linux-amd64/{alertmanager,amtool} /usr/local/bin/ + chown alertmanager:alertmanager /usr/local/bin/alertmanager /usr/local/bin/amtool + ``` +* **`/etc/alertmanager/alertmanager.yml`** + ```yaml + route: + group_by: ['alertname', 'hostname'] + group_wait: 15s + group_interval: 1m + repeat_interval: 10m + receiver: "resource_alert" + + receivers: + - name: "resource_alert" + webhook_configs: + - url: "http://127.0.0.1:2000/resource_alert" + + templates: + - '/etc/alertmanager/templates/msteams.tmpl' + ``` +* **`/etc/alertmanager/templates/msteams.tmpl`** + ```go-template + {{ define "teams.card" }} + { + "@type": "MessageCard", + "@context": "http://schema.org/extensions", + "summary": "{{ .CommonAnnotations.summary }}", + "themeColor": "0078D7", + "title": "🚨 {{ .CommonAnnotations.summary }}", + "sections": [ + {{ $root := . }} + {{ range $index, $alert := .Alerts }} + { + "activityTitle": "{{ $alert.Annotations.description }}", + "facts": [ + { + "name": "상태", + "value": "**{{ printf "%.0f" $alert.Annotations.value }}%**" + }, + { + "name": "심각도", + "value": "{{ $alert.Labels.severity }}" + }, + { + "name": "호스트명", + "value": "{{ $alert.Labels.hostname }}" + }, + { + "name": "IP 주소", + "value": "{{ $alert.Labels.ip }}" + }, + { + "name": "발생 일시", + "value": "{{ $alert.StartsAt }}" + } + ], + "markdown": true + }{{ if ne (add $index 1) (len $root.Alerts) }},{{ end }} + {{ end }} + ], + "potentialAction": [ + { + "@type": "OpenUri", + "name": "Grafana에서 보기", + "targets": [ + { + "os": "default", + "uri": "{{ .CommonAnnotations.runbook_url }}" + } + ] + } + ] + } + {{ end }} + ``` +* **`/etc/systemd/system/alertmanager.service`** + ```ini + [Unit] + Description=Alertmanager + Wants=network-online.target + After=network-online.target + + [Service] + User=alertmanager + Group=alertmanager + Type=simple + ExecStart=/usr/local/bin/alertmanager \ + 
--config.file=/etc/alertmanager/alertmanager.yml \ + --storage.path=/data/alertmanager/ \ + --web.listen-address=":9094" \ + --cluster.listen-address=":9095" + + [Install] + WantedBy=multi-user.target + ``` +* **서비스 시작:** + ```bash + systemctl daemon-reload + systemctl enable alertmanager + systemctl start alertmanager + ``` + +### 2.5. Prometheus-MSTeams (Docker) 설치 +* **목적:** Alertmanager와 MS Teams를 연결하는 프록시 설치. +* **실행 위치:** `ds-commandcenter` 서버. +* **`/data/promteams/config.env`** + ```env + # MS Teams Webhook URL + WEBHOOK_URL="https://oneunivrs.webhook.office.com/webhookb2/7248d32a-3473-43bd-961b-c2a2516f28f5@1e8605cc-8007-46b0-993f-b388917f9499/IncomingWebhook/ac17804386cc4efdad5c78b3a8c182f7/f5368752-03f7-4e64-93e6-b40991c04c0c/V2jpWgnliaoihAzy3iMA2p_2KWou2hMIj4T32F8MCMVH01" + + # Alertmanager의 webhook_configs.url 경로와 일치해야 하는 요청 URI + REQUEST_URI="resource_alert" + + # 사용할 템플릿 파일의 호스트 경로 (마운트할 원본 파일) + TEMPLATE_HOST_PATH="/etc/alertmanager/templates/msteams.tmpl" + # 컨테이너 내부에서 템플릿 파일이 위치할 경로 + TEMPLATE_CONTAINER_PATH="/app/default-message-card.tmpl" + ``` +* **`/data/promteams/start_promteams.sh`** + ```bash + #!/bin/bash + + # 설정 파일 로드 + source /data/promteams/config.env + + # 필수 변수 확인 + if [ -z "$WEBHOOK_URL" ] || [ -z "$REQUEST_URI" ]; then + echo "필수 설정 값이 누락되었습니다. config.env 파일을 확인하세요." + exit 1 + fi + + echo "기존 promteams 컨테이너를 중지하고 삭제합니다." + docker stop promteams >/dev/null 2>&1 + docker rm promteams >/dev/null 2>&1 + + echo "환경변수 방식을 사용하는 구버전 이미지(v1.5.2)로 Prometheus-MSTeams 컨테이너를 시작합니다." + docker run -d -p 2000:2000 \ + --name="promteams" \ + --restart=always \ + -e TEAMS_INCOMING_WEBHOOK_URL="$WEBHOOK_URL" \ + -e TEAMS_REQUEST_URI="$REQUEST_URI" \ + -v "$TEMPLATE_HOST_PATH:$TEMPLATE_CONTAINER_PATH" \ + quay.io/prometheusmsteams/prometheus-msteams:v1.5.2 + + echo "컨테이너가 시작되었습니다. 
아래 명령어로 상태를 확인하세요:" + echo "docker ps | grep promteams" + ``` +* **`/data/promteams/stop_promteams.sh`** + ```bash + #!/bin/bash + CONTAINER_NAME="promteams" + + if [ $(docker ps -q -f name=$CONTAINER_NAME) ]; then + echo "Prometheus-MSTeams 컨테이너($CONTAINER_NAME)를 중지하고 삭제합니다." + docker stop $CONTAINER_NAME + docker rm $CONTAINER_NAME + echo "완료되었습니다." + else + echo "실행 중인 Prometheus-MSTeams 컨테이너가 없습니다." + fi + ``` +* **컨테이너 실행:** + ```bash + chmod +x /data/promteams/*.sh + /data/promteams/start_promteams.sh + ``` + +### 2.6. Grafana 설치 및 구성 +* **목적:** 시각화 대시보드 설치 및 설정. +* **실행 위치:** `ds-commandcenter` 서버. +* **설치 명령어:** + ```bash + apt-get update && apt-get install -y adduser libfontconfig1 musl + wget https://dl.grafana.com/enterprise/release/grafana-enterprise_12.1.0_amd64.deb + dpkg -i grafana-enterprise_12.1.0_amd64.deb + ``` +* **`/etc/grafana/grafana.ini` 수정:** 아래 `sed` 명령어는 주요 설정을 변경한다. + ```bash + # 외부 접속 주소 및 포트 설정 + sed -i 's/;http_port = 3000/http_port = 3001/' /etc/grafana/grafana.ini + sed -i 's/;domain = localhost/domain = grafana.dungeonstalkers.com/' /etc/grafana/grafana.ini + sed -i "s|;root_url = .*|root_url = https://grafana.dungeonstalkers.com:8443/|" /etc/grafana/grafana.ini + + # 임베딩 및 익명 접속 설정 추가 + cat <<'EOF' | tee -a /etc/grafana/grafana.ini + + [security] + allow_embedding = true + + [auth.anonymous] + enabled = true + org_name = Main Org. + org_role = Viewer + EOF + ``` +* **서비스 시작:** + ```bash + systemctl enable grafana-server + systemctl start grafana-server + ``` + +### 2.7. Load Balancer 및 TLS 설정 +* **목적:** 외부 접속을 위한 HTTPS 통신 및 포트 포워딩 설정. +* **설정 위치:** AWS, GCP 등 클라우드 콘솔 또는 L4 장비. +* **구성 요약:** + * `https://grafana.dungeonstalkers.com:8443` -> `http://10.0.10.6:3001` + * `https://prometheus.dungeonstalkers.com:8444` -> `http://10.0.10.6:9091` + * `dungeonstalkers.com`에 대한 유효한 TLS 인증서 필요. + +--- + +## 3. 최종 확인 및 테스트 + +1. **웹 UI 접속:** + * `https://grafana.dungeonstalkers.com:8443` + * `https://prometheus.dungeonstalkers.com:8444` +2. 
**Prometheus Targets 확인:** Prometheus UI의 'Status' -> 'Targets' 페이지에서 모든 대상이 `UP` 상태인지 확인. +3. **전체 알림 파이프라인 테스트:** + ```bash + amtool alert add \ + --alertmanager.url="http://localhost:9094" \ + alertname="Final-System-Test" \ + severity="critical" \ + hostname="ds-commandcenter" \ + ip="10.0.10.6" \ + summary="전체 시스템 최종 테스트" \ + description="이 알림이 도착하고 모든 링크가 올바르게 작동하면 성공이다." \ + value="99" \ + runbook_url="https://grafana.dungeonstalkers.com:8443" + ``` + MS Teams 채널에 알림 카드 도착 및 'Grafana에서 보기' 버튼 링크의 정상 작동 여부를 확인한다. + +--- + +## 4. 트러블슈팅 (Troubleshooting) + +본 섹션은 구축 과정에서 발생했던 주요 문제와 해결 과정을 기술한다. + +| 문제 현상 | 원인 | 해결 방안 | +| -------------------------------------------------------------- | ------------------------------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `alertmanager` 서비스 시작 실패 (`address already in use`) | 웹 포트(`:9094`)와 클러스터 포트(기본값 `:9094`)가 충돌함. | `alertmanager.service` 파일에 `--cluster.listen-address=":9095"` 옵션을 추가하여 클러스터 포트를 명시적으로 변경. | +| `alertmanager` 서비스 시작 실패 (`function "add"/"floor" not defined`) | MS Teams 템플릿 파일(`msteams.tmpl`)에 Alertmanager가 지원하지 않는 함수가 포함됨. | `floor`는 `printf "%.0f"`로 대체하고, `add` 함수를 사용하는 로직은 더 단순하고 호환성 높은 방식으로 수정. | +| `prometheus` 서비스 시작 실패 (`yaml: did not find expected key`) | `prometheus.yml` 파일의 YAML 문법 오류 (주로 들여쓰기 문제). | `external_url` 설정을 `prometheus.yml`에서 제거하고, 대신 `prometheus.service` 파일의 실행 옵션에 `--web.external-url`을 추가하는 방식으로 변경하여 YAML 파일의 무결성을 보장. | +| `prometheus-msteams` 컨테이너 Crash Loop (재시작 반복) | 최신 Docker 이미지와 구버전 설정 방식(환경 변수) 간의 비호환성 문제. | 이전에 성공했던 방식이 환경 변수를 사용했음을 확인하고, 해당 방식을 지원하는 구버전 이미지 태그(`v1.5.2`)를 명시적으로 사용하여 컨테이너를 실행. | +| MS Teams 알림은 실패하는데 템플릿 관련 에러가 발생함. | `amtool`로 보낸 테스트 알림 데이터에 템플릿이 요구하는 필드(`ip`, `runbook_url`)가 누락됨. 
| Prometheus 설정에서 모든 타겟에 `ip` 라벨을 추가하고, 알림 규칙에 `runbook_url` 어노테이션을 추가하여 실제 알림 데이터에 해당 필드가 포함되도록 구성. `amtool` 테스트 시에도 해당 필드를 직접 포함하여 전송. | +| `node_exporter` 포트 충돌 (`:9300`) | OpenSearch 등 다른 서비스가 이미 `:9300`번대 포트를 사용하고 있었음. | 모든 서버의 `node_exporter` 포트를 `:9500`으로 변경하고, `prometheus.yml`의 수집 대상 포트도 모두 `:9500`으로 수정. | + +--- + +## 5. 부록 (Appendix) + +### 5.1. 주요 컴포넌트 버전 +* **Prometheus:** `3.5.0` +* **Alertmanager:** `0.28.1` +* **Node Exporter:** `1.9.1` +* **Grafana:** `12.1.0` +* **Prometheus-MSTeams (Docker):** `quay.io/prometheusmsteams/prometheus-msteams:v1.5.2` + +### 5.2. 보안 권장 사항 +* **Grafana 관리자 비밀번호 변경:** 설치 후 즉시 Grafana의 `admin` 계정 비밀번호를 변경해야 한다. + ```bash + grafana-cli admin reset-admin-password <새롭고-안전한-비밀번호> + ``` +* **네트워크 방화벽:** LB의 공인 포트 외에, 각 서비스의 내부 포트(`9091`, `9094`, `3001` 등)는 외부에서 직접 접근할 수 없도록 방화벽으로 차단하는 것을 권장한다. +* **Webhook URL 보안:** MS Teams Webhook URL은 민감 정보이므로, `config.env` 파일의 권한을 제한(`chmod 600`)하고 Git 등 버전 관리 시스템에 포함되지 않도록 주의해야 한다. diff --git a/alertmanager/alertmanager.yml b/alertmanager/alertmanager.yml new file mode 100644 index 0000000..44d5cbf --- /dev/null +++ b/alertmanager/alertmanager.yml @@ -0,0 +1,16 @@ +route: + group_by: ['...'] + group_wait: 15s + group_interval: 1m + repeat_interval: 10m + receiver: "resource_alert" + +receivers: + - name: "resource_alert" + webhook_configs: + - url: "http://127.0.0.1:2000/resource_alert" + send_resolved: false + + +templates: + - '/etc/alertmanager/templates/msteams.tmpl' diff --git a/alertmanager/templates/msteams.tmpl b/alertmanager/templates/msteams.tmpl new file mode 100644 index 0000000..2336fc6 --- /dev/null +++ b/alertmanager/templates/msteams.tmpl @@ -0,0 +1,53 @@ +{{ define "teams.card" }} +{ + "@type": "MessageCard", + "@context": "http://schema.org/extensions", + "summary": "{{ .CommonAnnotations.summary }}", + "themeColor": "0078D7", + "title": "🚨 {{ .CommonAnnotations.summary }}", + "sections": [ + {{- /* add 함수 대신 .CommonAnnotations를 사용하여 쉼표를 처리하는 안정적인 방식 */ -}} + {{- range 
.Alerts }} + { + "activityTitle": "{{ .Annotations.description }}", + "facts": [ + { + "name": "상태", + "value": "**{{ printf "%.0f" .Annotations.value }}%**" + }, + { + "name": "심각도", + "value": "{{ .Labels.severity }}" + }, + { + "name": "호스트명", + "value": "{{ .Labels.hostname }}" + }, + { + "name": "IP 주소", + "value": "{{ .Labels.ip }}" + }, + { + "name": "발생 일시", + "value": "{{ .StartsAt }}" + } + ], + "markdown": true + } + {{- if .CommonAnnotations -}},{{- end }} + {{- end }} + ], + "potentialAction": [ + { + "@type": "OpenUri", + "name": "Grafana에서 보기", + "targets": [ + { + "os": "default", + "uri": "{{ .CommonAnnotations.runbook_url }}" + } + ] + } + ] +} +{{ end }} diff --git a/dashboard001.json b/dashboard001.json new file mode 100644 index 0000000..a7bb704 --- /dev/null +++ b/dashboard001.json @@ -0,0 +1,957 @@ +{ + "__inputs": [ + { + "name": "DS_PROMETHEUS", + "label": "prometheus", + "description": "", + "type": "datasource", + "pluginId": "prometheus", + "pluginName": "Prometheus" + } + ], + "__elements": {}, + "__requires": [ + { + "type": "grafana", + "id": "grafana", + "name": "Grafana", + "version": "11.6.0" + }, + { + "type": "datasource", + "id": "prometheus", + "name": "Prometheus", + "version": "1.0.0" + }, + { + "type": "panel", + "id": "table", + "name": "Table", + "version": "" + } + ], + "annotations": { + "list": [ + { + "$$hashKey": "object:2875", + "builtIn": 1, + "datasource": { + "type": "datasource", + "uid": "grafana" + }, + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "target": { + "limit": 100, + "matchAny": false, + "tags": [], + "type": "dashboard" + }, + "type": "dashboard" + } + ] + }, + "description": "Command Center Frontend Dashboard", + "editable": true, + "fiscalYearStartMonth": 0, + "graphTooltip": 0, + "id": null, + "links": [ + { + "$$hashKey": "object:2302", + "asDropdown": true, + "icon": "external link", + "tags": [], + "targetBlank": true, + "title": "", 
+ "type": "dashboards" + } + ], + "panels": [ + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "description": "", + "fieldConfig": { + "defaults": { + "color": { + "mode": "thresholds" + }, + "custom": { + "align": "center", + "cellOptions": { + "type": "auto" + }, + "filterable": false, + "inspect": false + }, + "decimals": 1, + "mappings": [], + "max": 100, + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green" + } + ] + }, + "unit": "none" + }, + "overrides": [ + { + "matcher": { + "id": "byName", + "options": "Memory" + }, + "properties": [ + { + "id": "unit", + "value": "bytes" + }, + { + "id": "decimals" + }, + { + "id": "custom.width", + "value": 89 + }, + { + "id": "decimals", + "value": 0 + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Uptime" + }, + "properties": [ + { + "id": "unit", + "value": "none" + }, + { + "id": "custom.width", + "value": 90 + }, + { + "id": "decimals" + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Disk read" + }, + "properties": [ + { + "id": "unit", + "value": "binBps" + }, + { + "id": "custom.cellOptions", + "value": { + "mode": "gradient", + "type": "color-background" + } + }, + { + "id": "thresholds", + "value": { + "mode": "absolute", + "steps": [ + { + "color": "rgba(50, 172, 45, 0.97)" + }, + { + "color": "rgba(237, 129, 40, 0.89)", + "value": 10485760 + }, + { + "color": "rgba(245, 54, 54, 0.9)", + "value": 20485760 + } + ] + } + }, + { + "id": "custom.width", + "value": 108 + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Disk write" + }, + "properties": [ + { + "id": "unit", + "value": "binBps" + }, + { + "id": "custom.cellOptions", + "value": { + "mode": "gradient", + "type": "color-background" + } + }, + { + "id": "thresholds", + "value": { + "mode": "absolute", + "steps": [ + { + "color": "rgba(50, 172, 45, 0.97)" + }, + { + "color": "rgba(237, 129, 40, 0.89)", + "value": 10485760 + }, + { + "color": "rgba(245, 54, 54, 
0.9)", + "value": 20485760 + } + ] + } + }, + { + "id": "custom.width", + "value": 107 + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Download" + }, + "properties": [ + { + "id": "unit", + "value": "binbps" + }, + { + "id": "custom.cellOptions", + "value": { + "mode": "gradient", + "type": "color-background" + } + }, + { + "id": "thresholds", + "value": { + "mode": "absolute", + "steps": [ + { + "color": "rgba(50, 172, 45, 0.97)" + }, + { + "color": "rgba(237, 129, 40, 0.89)", + "value": 30485760 + }, + { + "color": "rgba(245, 54, 54, 0.9)", + "value": 104857600 + } + ] + } + }, + { + "id": "custom.width", + "value": 109 + }, + { + "id": "decimals" + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Upload" + }, + "properties": [ + { + "id": "unit", + "value": "binbps" + }, + { + "id": "custom.cellOptions", + "value": { + "mode": "gradient", + "type": "color-background" + } + }, + { + "id": "thresholds", + "value": { + "mode": "absolute", + "steps": [ + { + "color": "rgba(50, 172, 45, 0.97)" + }, + { + "color": "rgba(237, 129, 40, 0.89)", + "value": 30485760 + }, + { + "color": "rgba(245, 54, 54, 0.9)", + "value": 104857600 + } + ] + } + }, + { + "id": "custom.width", + "value": 95 + }, + { + "id": "decimals" + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "TCP conn" + }, + "properties": [ + { + "id": "custom.cellOptions", + "value": { + "mode": "gradient", + "type": "color-background" + } + }, + { + "id": "thresholds", + "value": { + "mode": "absolute", + "steps": [ + { + "color": "rgba(50, 172, 45, 0.97)" + }, + { + "color": "rgba(237, 129, 40, 0.89)", + "value": 1000 + }, + { + "color": "rgba(245, 54, 54, 0.9)", + "value": 1500 + } + ] + } + }, + { + "id": "custom.width", + "value": 106 + }, + { + "id": "decimals" + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "CPU" + }, + "properties": [ + { + "id": "custom.width", + "value": 75 + }, + { + "id": "decimals", + "value": 0 + } + ] + }, + { + "matcher": { 
+ "id": "byRegexp", + "options": "/.*used.*/" + }, + "properties": [ + { + "id": "unit", + "value": "percent" + }, + { + "id": "custom.cellOptions", + "value": { + "mode": "gradient", + "type": "gauge" + } + }, + { + "id": "color", + "value": { + "mode": "continuous-GrYlRd" + } + }, + { + "id": "custom.width", + "value": 110 + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Memory used%" + }, + "properties": [ + { + "id": "custom.width", + "value": 144 + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "CPU used%" + }, + "properties": [ + { + "id": "custom.width", + "value": 132 + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "IO used" + }, + "properties": [ + { + "id": "custom.width", + "value": 116 + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Partition used" + }, + "properties": [ + { + "id": "custom.width", + "value": 122 + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "IP" + }, + "properties": [ + { + "id": "links", + "value": [ + { + "title": "Show details", + "url": "d/rYdddlPWk/node-exporter-full?orgId=1&var-job=${job}&var-node=${__value.raw}" + } + ] + }, + { + "id": "custom.align", + "value": "left" + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Hostname" + }, + "properties": [ + { + "id": "links", + "value": [ + { + "title": "Show details", + "url": "/d/feh6u5st2pou8b/node-exporter-full-with-node-name?orgId=1&var-name=${__value.raw}" + } + ] + }, + { + "id": "custom.align", + "value": "left" + } + ] + } + ] + }, + "gridPos": { + "h": 15, + "w": 24, + "x": 0, + "y": 0 + }, + "id": 198, + "options": { + "cellHeight": "sm", + "footer": { + "countRows": false, + "enablePagination": false, + "fields": [ + "Value #B", + "Value #C", + "Value #L", + "Value #H", + "Value #I", + "Value #M", + "Value #N", + "Value #J", + "Value #K" + ], + "reducer": [ + "sum" + ], + "show": false + }, + "showHeader": true, + "sortBy": [ + { + "desc": false, + "displayName": "Hostname" + } + ] 
+ }, + "pluginVersion": "11.6.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "node_uname_info{job=~\"$job\"} - 0", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "Hostname", + "refId": "A" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "node_memory_MemTotal_bytes{job=~\"$job\"} - 0", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "Memory", + "refId": "B" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "count(node_cpu_seconds_total{job=~\"$job\",mode='system'}) by (instance)", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "CPU cores", + "refId": "C" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "sum(time() - node_boot_time_seconds{job=~\"$job\"})by(instance)/86400", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "Uptime", + "refId": "D" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "node_load5{job=~\"$job\"}", + "format": "table", + "instant": true, + "interval": "", + "legendFormat": "5m load", + "refId": "L" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "(1 - avg(irate(node_cpu_seconds_total{job=~\"$job\",mode=\"idle\"}[$interval])) by (instance)) * 100", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "CPU used%", + "refId": "F" + }, + { + "datasource": { + "type": "prometheus", + 
"uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "(1 - (node_memory_MemAvailable_bytes{job=~\"$job\"} / (node_memory_MemTotal_bytes{job=~\"$job\"})))* 100", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "Memory used%", + "refId": "G" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "max((node_filesystem_size_bytes{job=~\"$job\",fstype=~\"ext.?|xfs\"}-node_filesystem_free_bytes{job=~\"$job\",fstype=~\"ext.?|xfs\"}) *100/(node_filesystem_avail_bytes {job=~\"$job\",fstype=~\"ext.?|xfs\"}+(node_filesystem_size_bytes{job=~\"$job\",fstype=~\"ext.?|xfs\"}-node_filesystem_free_bytes{job=~\"$job\",fstype=~\"ext.?|xfs\"})))by(instance)", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "Partition used", + "refId": "E" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "max(irate(node_disk_read_bytes_total{job=~\"$job\"}[$interval])) by (instance)", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "Disk read", + "refId": "H" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "max(irate(node_disk_written_bytes_total{job=~\"$job\"}[$interval])) by (instance)", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "Disk write", + "refId": "I" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "node_netstat_Tcp_CurrEstab{job=~\"$job\"} - 0", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "TCP connections", + "refId": "M" + }, + { + "datasource": { + "type": "prometheus", + "uid": 
"${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "node_sockstat_TCP_tw{job=~\"$job\"} - 0", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "TCP sockets", + "refId": "N" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "max(irate(node_network_receive_bytes_total{job=~\"$job\"}[$interval])*8) by (instance)", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "Download", + "refId": "J" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "max(irate(node_network_transmit_bytes_total{job=~\"$job\"}[$interval])*8) by (instance)", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "Upload", + "refId": "K" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "((1-(1 - avg(irate(node_cpu_seconds_total{job=~\"$job\",mode=\"idle\"}[$interval])) by (instance))^1.3)^(1/3)*0.5 + \r\n(1-(1 - avg(node_memory_MemAvailable_bytes{job=~\"$job\"} / node_memory_MemTotal_bytes{job=~\"$job\"})by (instance))^6)^(1/3)*0.3 + \r\n(1 - max(irate(node_disk_io_time_seconds_total{job=~\"$job\"}[$interval]))by (instance)^1.1)^(1/2)*0.2)*100", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "__auto", + "refId": "O" + }, + { + "datasource": { + "type": "prometheus", + "uid": "${DS_PROMETHEUS}" + }, + "editorMode": "code", + "exemplar": false, + "expr": "max(irate(node_disk_io_time_seconds_total{job=~\"$job\"}[$interval])) by (instance) *100", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "legendFormat": "IO used", + "refId": "P" + } + ], + "transformations": [ + { + "id": "merge", + "options": { + "reducers": 
[] + } + }, + { + "id": "organize", + "options": { + "excludeByName": { + "Value #C": false, + "Value #L": true, + "Value #N": true, + "Value #O": true, + "exp": false, + "iid": false + }, + "includeByName": {}, + "indexByName": { + "Time": 20, + "Value #A": 36, + "Value #B": 7, + "Value #C": 8, + "Value #D": 4, + "Value #E": 13, + "Value #F": 10, + "Value #G": 11, + "Value #H": 14, + "Value #I": 15, + "Value #J": 18, + "Value #K": 19, + "Value #L": 9, + "Value #M": 16, + "Value #N": 17, + "Value #O": 6, + "Value #P": 12, + "__name__": 37, + "account": 21, + "cservice": 22, + "domainname": 23, + "exp": 5, + "group": 24, + "iaccount": 25, + "igroup": 26, + "iid": 3, + "iname": 27, + "instance": 2, + "job": 28, + "machine": 29, + "name": 1, + "nodename": 0, + "origin_prometheus": 30, + "region": 31, + "release": 32, + "sysname": 33, + "vendor": 34, + "version": 35 + }, + "renameByName": { + "Value #B": "Memory", + "Value #C": "CPU", + "Value #D": "Uptime", + "Value #E": "Partition used", + "Value #F": "CPU used%", + "Value #G": "Memory used%", + "Value #H": "Disk read", + "Value #I": "Disk write", + "Value #J": "Download", + "Value #K": "Upload", + "Value #L": "5m load", + "Value #M": "TCP conn", + "Value #N": "TCP sockets", + "Value #O": "Health", + "Value #P": "IO used", + "exp": "Expiry date", + "iid": "Instance ID", + "instance": "IP", + "name": "", + "nodename": "Hostname" + } + } + }, + { + "id": "filterFieldsByName", + "options": { + "include": { + "names": [ + "Hostname", + "IP", + "Uptime", + "Health", + "Memory", + "CPU", + "CPU used%", + "Memory used%", + "IO used", + "Partition used", + "Disk read", + "Disk write", + "TCP conn", + "TCP sockets", + "Download", + "Upload", + "5m load" + ] + } + } + } + ], + "type": "table" + } + ], + "refresh": "", + "schemaVersion": 41, + "tags": [ + "Dashboard", + "CommandCenter" + ], + "templating": { + "list": [ + { + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "3m", + "value": "3m" + }, + "label": 
"Interval", + "name": "interval", + "options": [ + { + "selected": true, + "text": "3m", + "value": "3m" + } + ], + "query": "3m", + "refresh": 2, + "type": "interval" + }, + { + "current": {}, + "definition": "label_values(node_uname_info,job)", + "label": "JOB", + "name": "job", + "options": [], + "query": { + "qryType": 1, + "query": "label_values(node_uname_info,job)", + "refId": "PrometheusVariableQueryEditor-VariableQuery" + }, + "refresh": 1, + "regex": "", + "sort": 5, + "type": "query" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "30s", + "1m", + "3m", + "5m", + "15m", + "30m" + ] + }, + "timezone": "browser", + "title": "dashboard", + "uid": "behgmepd9v08wd", + "version": 35, + "weekStart": "" +} \ No newline at end of file diff --git a/dashboard002.json b/dashboard002.json new file mode 100644 index 0000000..6e568df --- /dev/null +++ b/dashboard002.json @@ -0,0 +1,331 @@ +{ + "__inputs": [ + { + "name": "DS_PROMETHEUS", + "label": "Prometheus", + "description": "", + "type": "datasource", + "pluginId": "prometheus", + "pluginName": "Prometheus" + } + ], + "__elements": {}, + "__requires": [ + { + "type": "grafana", + "id": "grafana", + "name": "Grafana", + "version": "11.1.0" + }, + { + "type": "datasource", + "id": "prometheus", + "name": "Prometheus", + "version": "1.0.0" + }, + { + "type": "panel", + "id": "table", + "name": "Table", + "version": "" + } + ], + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": { + "type": "grafana", + "uid": "grafana" + }, + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "description": "Server Resource Dashboard", + "editable": true, + "fiscalYearStartMonth": 0, + "graphTooltip": 0, + "id": null, + "links": [], + "panels": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus_ds" + }, + "description": "", + "fieldConfig": { + 
"defaults": { + "color": { + "mode": "thresholds" + }, + "custom": { + "align": "auto", + "cellOptions": { + "type": "auto" + }, + "filterable": false + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green", + "value": null + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [ + { + "matcher": { + "id": "byName", + "options": "Hostname" + }, + "properties": [ + { + "id": "custom.align", + "value": "left" + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "IP" + }, + "properties": [ + { + "id": "custom.align", + "value": "left" + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "CPU used%" + }, + "properties": [ + { + "id": "unit", + "value": "percent" + }, + { + "id": "custom.cellOptions", + "value": { + "type": "gauge", + "mode": "gradient" + } + }, + { + "id": "thresholds", + "value": { + "mode": "absolute", + "steps": [ + { "color": "#73BF69", "value": null }, + { "color": "#F2CC0C", "value": 70 }, + { "color": "#F2495C", "value": 80 } + ] + } + }, + { + "id": "min", + "value": 0 + }, + { + "id": "max", + "value": 100 + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Memory used%" + }, + "properties": [ + { + "id": "unit", + "value": "percent" + }, + { + "id": "custom.cellOptions", + "value": { + "type": "gauge", + "mode": "gradient" + } + }, + { + "id": "thresholds", + "value": { + "mode": "absolute", + "steps": [ + { "color": "#73BF69", "value": null }, + { "color": "#F2CC0C", "value": 70 }, + { "color": "#F2495C", "value": 80 } + ] + } + }, + { + "id": "min", + "value": 0 + }, + { + "id": "max", + "value": 100 + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Disk used" + }, + "properties": [ + { + "id": "unit", + "value": "percent" + }, + { + "id": "custom.cellOptions", + "value": { + "type": "gauge", + "mode": "gradient" + } + }, + { + "id": "thresholds", + "value": { + "mode": "absolute", + "steps": [ + { "color": "#73BF69", "value": null }, 
+ { "color": "#F2CC0C", "value": 70 }, + { "color": "#F2495C", "value": 80 } + ] + } + }, + { + "id": "min", + "value": 0 + }, + { + "id": "max", + "value": 100 + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Uptime" + }, + "properties": [ + { + "id": "unit", + "value": "dtdurations" + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Download" + }, + "properties": [ + { + "id": "unit", + "value": "bps" + } + ] + }, + { + "matcher": { + "id": "byName", + "options": "Upload" + }, + "properties": [ + { + "id": "unit", + "value": "bps" + } + ] + } + ] + }, + "gridPos": { "h": 24, "w": 24, "x": 0, "y": 0 }, + "id": 1, + "options": { + "cellHeight": "sm", + "footer": { "show": false }, + "showHeader": true + }, + "pluginVersion": "11.1.0", + "targets": [ + { "refId": "A", "datasource": { "type": "prometheus", "uid": "prometheus_ds" }, "expr": "node_uname_info{job=~\"$job\"}", "format": "table", "instant": true, "legendFormat": "Hostname" }, + { "refId": "B", "datasource": { "type": "prometheus", "uid": "prometheus_ds" }, "expr": "100 - (avg by(hostname) (rate(node_cpu_seconds_total{mode=\"idle\", job=~\"$job\"}[$interval])) * 100)", "format": "table", "instant": true, "legendFormat": "CPU used%" }, + { "refId": "C", "datasource": { "type": "prometheus", "uid": "prometheus_ds" }, "expr": "(1 - (node_memory_MemAvailable_bytes{job=~\"$job\"} / node_memory_MemTotal_bytes{job=~\"$job\"})) * 100", "format": "table", "instant": true, "legendFormat": "Memory used%" }, + { "refId": "D", "datasource": { "type": "prometheus", "uid": "prometheus_ds" }, "expr": "(1 - (node_filesystem_avail_bytes{mountpoint=~\"/|/data\", job=~\"$job\"} / node_filesystem_size_bytes{mountpoint=~\"/|/data\", job=~\"$job\"})) * 100", "format": "table", "instant": true, "legendFormat": "Disk used" }, + { "refId": "E", "datasource": { "type": "prometheus", "uid": "prometheus_ds" }, "expr": "time() - node_boot_time_seconds{job=~\"$job\"}", "format": "table", "instant": true, 
"legendFormat": "Uptime" }, + { "refId": "F", "datasource": { "type": "prometheus", "uid": "prometheus_ds" }, "expr": "sum by (hostname) (rate(node_network_receive_bytes_total{job=~\"$job\"}[$interval])) * 8", "format": "table", "instant": true, "legendFormat": "Download" }, + { "refId": "G", "datasource": { "type": "prometheus", "uid": "prometheus_ds" }, "expr": "sum by (hostname) (rate(node_network_transmit_bytes_total{job=~\"$job\"}[$interval])) * 8", "format": "table", "instant": true, "legendFormat": "Upload" } + ], + "transformations": [ + { "id": "merge", "options": {} }, + { "id": "organize", "options": { "indexByName": {}, "renameByName": { "Value #A": "Hostname", "Value #B": "CPU used%", "Value #C": "Memory used%", "Value #D": "Disk used", "Value #E": "Uptime", "Value #F": "Download", "Value #G": "Upload", "instance": "IP" } } } + ], + "type": "table" + } + ], + "refresh": "1m", + "schemaVersion": 39, + "tags": ["command-center", "overview"], + "templating": { + "list": [ + { + "current": { "selected": true, "text": "common_servers", "value": "common_servers" }, + "hide": 0, + "includeAll": false, + "multi": false, + "name": "job", + "options": [ + { "selected": true, "text": "common_servers", "value": "common_servers" }, + { "selected": false, "text": "game_servers", "value": "game_servers" } + ], + "query": "common_servers,game_servers", + "skipUrlSync": false, + "type": "custom" + }, + { + "current": { "selected": true, "text": "1m", "value": "1m" }, + "hide": 0, + "name": "interval", + "options": [ + { "selected": false, "text": "30s", "value": "30s" }, + { "selected": true, "text": "1m", "value": "1m" }, + { "selected": false, "text": "5m", "value": "5m" } + ], + "query": "30s,1m,5m", + "skipUrlSync": false, + "type": "interval" + } + ] + }, + "time": { "from": "now-1h", "to": "now" }, + "timepicker": {}, + "timezone": "browser", + "title": "Server Overview", + "uid": "server-overview-dashboard", + "version": 1, + "weekStart": "" +} \ No newline at 
end of file diff --git a/grafana/grafana.ini b/grafana/grafana.ini new file mode 100644 index 0000000..539decf --- /dev/null +++ b/grafana/grafana.ini @@ -0,0 +1,2464 @@ +##################### Grafana Configuration Example ##################### +# +# Everything has defaults so you only need to uncomment things you want to +# change + +# possible values : production, development +;app_mode = production + +# instance name, defaults to HOSTNAME environment variable value or hostname if HOSTNAME var is empty +;instance_name = ${HOSTNAME} + +#################################### Paths #################################### +[paths] +# Path to where grafana can store temp files, sessions, and the sqlite3 db (if that is used) +;data = /var/lib/grafana + +# Temporary files in `data` directory older than given duration will be removed +;temp_data_lifetime = 24h + +# Directory where grafana can store logs +;logs = /var/log/grafana + +# Directory where grafana will automatically scan and look for plugins +;plugins = /var/lib/grafana/plugins + +# folder that contains provisioning config files that grafana will apply on startup and while running. +;provisioning = conf/provisioning + +# Directories that are permitted to contain local repositories. +# This is a list. Each entry is delimited by a pipe (|). No leading or trailing spaces are supported. +# These do not need to be absolute paths, in which case they'll be relative to the path where you are running Grafana. +# Empty entries will return an error, unless the string is just a single pipe. +# Example: permitted_provisioning_paths = /tmp|/etc/grafana/repositories|conf/provisioning +;permitted_provisioning_paths = devenv/dev-dashboards|conf/provisioning + +#################################### Server #################################### +[server] +# Protocol (http, https, h2, socket) +;protocol = http + +# Minimum TLS version allowed. By default, this value is empty. Accepted values are: TLS1.2, TLS1.3. 
If nothing is set TLS1.2 would be taken +;min_tls_version = "" + +# The ip address to bind to, empty will bind to all interfaces +;http_addr = + +# The http port to use +http_port = 3001 + +# The public facing domain name used to access grafana from a browser +domain = grafana.dungeonstalkers.com + +# Redirect to correct domain if host header does not match domain +# Prevents DNS rebinding attacks +;enforce_domain = grafana.dungeonstalkers.com + +# The full public facing url you use in browser, used for redirects and emails +# If you use reverse proxy and sub path specify full url (with sub path) +root_url = https://grafana.dungeonstalkers.com:8443/ + +# Serve Grafana from subpath specified in `root_url` setting. By default it is set to `false` for compatibility reasons. +;serve_from_sub_path = false + +# Log web requests +;router_logging = false + +# the path relative working path +;static_root_path = public + +# enable gzip +;enable_gzip = false + +# https certs & key file +;cert_file = +;cert_key = + +# optional password to be used to decrypt key file +;cert_pass = + +# Certificates file watch interval +;certs_watch_interval = + +# Unix socket gid +# Changing the gid of a file without privileges requires that the target group is in the group of the process and that the process is the file owner +# It is recommended to set the gid as http server user gid +# Not set when the value is -1 +;socket_gid = + +# Unix socket mode +;socket_mode = + +# Unix socket path +;socket = + +# CDN Url +;cdn_url = + +# Sets the maximum time using a duration format (5s/5m/5ms) before timing out read of an incoming request and closing idle connections. +# `0` means there is no timeout for reading the request. +;read_timeout = 0 + +# This setting enables you to specify additional headers that the server adds to HTTP(S) responses. 
+[server.custom_response_headers] +#exampleHeader1 = exampleValue1 +#exampleHeader2 = exampleValue2 + +[environment] +# Sets whether the local file system is available for Grafana to use. Default is true for backward compatibility. +;local_file_system_available = true + +#################################### GRPC Server ######################### +;[grpc_server] +;network = "tcp" +;address = "127.0.0.1:10000" +;use_tls = false +;cert_file = +;key_file = +;max_recv_msg_size = +;max_send_msg_size = +# this will log the request and response for each unary gRPC call +;enable_logging = false + +#################################### Database #################################### +[database] +# You can configure the database connection by specifying type, host, name, user and password +# as separate properties or as one string using the url property. + +# Either "mysql", "postgres" or "sqlite3", it's your choice +;type = sqlite3 +;host = 127.0.0.1:3306 +;name = grafana +;user = root +# If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;""" +;password = +# Use either URL or the previous fields to configure the database +# Example: mysql://user:secret@host:port/database +;url = + +# Set to true or false to enable or disable high availability mode. +# When it's set to false some functions will be simplified and only run in-process +# instead of relying on the database. +# +# Only set it to false if you run only a single instance of Grafana. +;high_availability = true + +# Max idle conn setting default is 2 +;max_idle_conn = 2 + +# Max conn setting default is 0 (means not set) +;max_open_conn = + +# Connection Max Lifetime default is 14400 (means 14400 seconds or 4 hours) +;conn_max_lifetime = 14400 + +# Set to true to log the sql calls and execution times. +;log_queries = + +# For "postgres", use either "disable", "require" or "verify-full" +# For "mysql", use either "true", "false", or "skip-verify". 
+;ssl_mode = disable + +# For "postgres", use either "1" to enable or "0" to disable SNI +;ssl_sni = + +# Database drivers may support different transaction isolation levels. +# Currently, only "mysql" driver supports isolation levels. +# If the value is empty - driver's default isolation level is applied. +# For "mysql" use "READ-UNCOMMITTED", "READ-COMMITTED", "REPEATABLE-READ" or "SERIALIZABLE". +;isolation_level = + +;ca_cert_path = +;client_key_path = +;client_cert_path = +;server_cert_name = + +# For "sqlite3" only, path relative to data_path setting +;path = grafana.db + +# For "sqlite3" only. cache mode setting used for connecting to the database. (private, shared) +;cache_mode = private + +# For "sqlite3" only. Enable/disable Write-Ahead Logging, https://sqlite.org/wal.html. Default is false. +;wal = false + +# For "mysql" and "postgres" only. Lock the database for the migrations, default is true. +;migration_locking = true + +# For "mysql" and "postgres" only. How many seconds to wait before failing to lock the database for the migrations, default is 0. +;locking_attempt_timeout_sec = 0 + +# For "sqlite" only. How many times to retry query in case of database is locked failures. Default is 0 (disabled). +;query_retries = 0 + +# For "sqlite" only. How many times to retry transaction in case of database is locked failures. Default is 5. +;transaction_retries = 5 + +# Set to true to add metrics and tracing for database queries. +;instrument_queries = false + +#################################### Cache server ############################# +[remote_cache] +# Either "redis", "memcached" or "database" default is "database" +;type = database + +# cache connectionstring options +# database: will use Grafana primary database. +# redis: config like redis server e.g. `addr=127.0.0.1:6379,pool_size=100,db=0,username=grafana,password=grafanaRocks,ssl=false`. Only addr is required. ssl may be 'true', 'false', or 'insecure'. 
+# memcache: 127.0.0.1:11211 +;connstr = + +# prefix prepended to all the keys in the remote cache +; prefix = + +# This enables encryption of values stored in the remote cache +;encryption = + +#################################### Data proxy ########################### +[dataproxy] + +# This enables data proxy logging, default is false +;logging = false + +# How long the data proxy waits to read the headers of the response before timing out, default is 30 seconds. +# This setting also applies to core backend HTTP data sources where query requests use an HTTP client with timeout set. +;timeout = 30 + +# How long the data proxy waits to establish a TCP connection before timing out, default is 10 seconds. +;dialTimeout = 10 + +# How many seconds the data proxy waits before sending a keepalive probe request. +;keep_alive_seconds = 30 + +# How many seconds the data proxy waits for a successful TLS Handshake before timing out. +;tls_handshake_timeout_seconds = 10 + +# How many seconds the data proxy will wait for a server's first response headers after +# fully writing the request headers if the request has an "Expect: 100-continue" +# header. A value of 0 will result in the body being sent immediately, without +# waiting for the server to approve. +;expect_continue_timeout_seconds = 1 + +# Optionally limits the total number of connections per host, including connections in the dialing, +# active, and idle states. On limit violation, dials will block. +# A value of zero (0) means no limit. +;max_conns_per_host = 0 + +# The maximum number of idle connections that Grafana will keep alive. +;max_idle_connections = 100 + +# How many seconds the data proxy keeps an idle connection open before timing out. +;idle_conn_timeout_seconds = 90 + +# If enabled and user is not anonymous, data proxy will add X-Grafana-User header with username into the request, default is false. 
+;send_user_header = false + +# Limit the amount of bytes that will be read/accepted from responses of outgoing HTTP requests. +;response_limit = 0 + +# Limits the number of rows that Grafana will process from SQL data sources. +;row_limit = 1000000 + +# Sets a custom value for the `User-Agent` header for outgoing data proxy requests. If empty, the default value is `Grafana/<BuildVersion>` (for example `Grafana/9.0.0`). +;user_agent = + +#################################### Analytics #################################### +[analytics] +# Server reporting, sends usage counters to stats.grafana.org every 24 hours. +# No ip addresses are being tracked, only simple counters to track +# running instances, dashboard and error counts. It is very helpful to us. +# Change this option to false to disable reporting. +;reporting_enabled = true + +# The name of the distributor of the Grafana instance. Ex hosted-grafana, grafana-labs +;reporting_distributor = grafana-labs + +# Set to false to disable all checks to https://grafana.com +# for new versions of grafana. The check is used +# in some UI views to notify that a grafana update exists. +# This option does not cause any auto updates, nor send any information +# only a GET request to https://grafana.com/api/grafana/versions/stable to get the latest version. +;check_for_updates = true + +# Set to false to disable all checks to https://grafana.com +# for new versions of plugins. The check is used +# in some UI views to notify that a plugin update exists. +# This option does not cause any auto updates, nor send any information +# only a GET request to https://grafana.com to get the latest versions. 
+;check_for_plugin_updates = true + +# Google Analytics universal tracking code, only enabled if you specify an id here +;google_analytics_ua_id = + +# Google Analytics 4 tracking code, only enabled if you specify an id here +;google_analytics_4_id = + +# When Google Analytics 4 Enhanced event measurement is enabled, we will try to avoid sending duplicate events and let Google Analytics 4 detect navigation changes, etc. +;google_analytics_4_send_manual_page_views = false + +# Google Tag Manager ID, only enabled if you specify an id here +;google_tag_manager_id = + +# Rudderstack write key, enabled only if rudderstack_data_plane_url is also set +;rudderstack_write_key = + +# Rudderstack data plane url, enabled only if rudderstack_write_key is also set +;rudderstack_data_plane_url = + +# Rudderstack SDK url, optional, only valid if rudderstack_write_key and rudderstack_data_plane_url is also set +;rudderstack_sdk_url = + +# Rudderstack Config url, optional, used by Rudderstack SDK to fetch source config +;rudderstack_config_url = + +# Rudderstack Integrations URL, optional. Only valid if you pass the SDK version 1.1 or higher +;rudderstack_integrations_url = + +# Intercom secret, optional, used to hash user_id before passing to Intercom via Rudderstack +;intercom_secret = + +# Application Insights connection string. Specify an URL string to enable this feature. +;application_insights_connection_string = + +# Optional. Specifies an Application Insights endpoint URL where the endpoint string is wrapped in backticks ``. 
+;application_insights_endpoint_url = + +# Controls if the UI contains any links to user feedback forms +;feedback_links_enabled = true + +# Static context that is being added to analytics events +;reporting_static_context = grafanaInstance=12, os=linux + +# Logs interaction events to the browser javascript console, intended for development only +;browser_console_reporter = false + +#################################### Security #################################### +[security] +# disable creation of admin user on first start of grafana +;disable_initial_admin_creation = false + +# default admin user, created on startup +;admin_user = admin + +# default admin password, can be changed before first start of grafana, or in profile settings +;admin_password = admin + +# default admin email, created on startup +;admin_email = admin@localhost + +# used for signing +;secret_key = SW2YcwTIb9zpOOhoPsMm + +# current key provider used for envelope encryption, default to static value specified by secret_key +;encryption_provider = secretKey.v1 + +# list of configured key providers, space separated (Enterprise only): e.g., awskms.v1 azurekv.v1 +;available_encryption_providers = + +# disable gravatar profile images +;disable_gravatar = false + +# data source proxy whitelist (ip_or_domain:port separated by spaces) +;data_source_proxy_whitelist = + +# disable protection against brute force login attempts +;disable_brute_force_login_protection = false + +# max number of failed login attempts before user gets locked +;brute_force_login_protection_max_attempts = 5 + +# disable protection against brute force login attempts by IP address +; disable_ip_address_login_protection = true + +# set to true if you host Grafana behind HTTPS. default is false. +;cookie_secure = false + +# set cookie SameSite attribute. defaults to `lax`. can be set to "lax", "strict", "none" and "disabled" +;cookie_samesite = lax + +# set to true if you want to allow browsers to render Grafana in a ,