> Author: 丁辉

# Integrating N9e with Kube-Prometheus-Stack

## Update Kube-Prometheus-Stack

1. Create the values.yaml file

   ```bash
   vi kube-prometheus-stack-values.yaml
   ```

2. Add the following content

   ```yaml
   prometheusOperator:
     admissionWebhooks:
       patch:
         enabled: true
         image:
           registry: registry.aliyuncs.com # Use a domestic (China) registry mirror for faster pulls
           repository: google_containers/kube-webhook-certgen

   grafana:
     enabled: false
   alertmanager:
     enabled: false
   defaultRules:
     create: false

   # These settings make the listed selectors (rules, ServiceMonitors, PodMonitors,
   # probes, and scrape configs) independent of the Helm chart values.
   # Otherwise your ServiceMonitor may not be discovered automatically.
   prometheus:
     prometheusSpec:
       ruleSelectorNilUsesHelmValues: false
       serviceMonitorSelectorNilUsesHelmValues: false
       podMonitorSelectorNilUsesHelmValues: false
       probeSelectorNilUsesHelmValues: false
       scrapeConfigSelectorNilUsesHelmValues: false
       # Enable the --web.enable-remote-write-receiver flag on the server
       enableRemoteWriteReceiver: true
       # Enable features that are disabled by default in Prometheus
       enableFeatures:
         - remote-write-receiver
       # Mount persistent storage
       storageSpec:
         volumeClaimTemplate:
           spec:
             # Use the default StorageClass to provision storage (I have already prepared nfs-client in the cluster)
             storageClassName:
             accessModes: ["ReadWriteOnce"]
             resources:
               requests:
                 storage: 10Gi # Size the PVC according to your needs
       # Mount the local time zone
       volumes:
         - name: timezone
           hostPath:
             path: /usr/share/zoneinfo/Asia/Shanghai
       volumeMounts:
         - name: timezone
           mountPath: /etc/localtime
           readOnly: true
   ```

3. Upgrade

   ```bash
   helm upgrade kube-prometheus-stack -f kube-prometheus-stack-values.yaml --set "kube-state-metrics.image.registry=k8s.dockerproxy.com" prometheus-community/kube-prometheus-stack -n monitor
   ```

## Update N9e

1. Get the ClusterIP of the nightingale-center svc

   ```bash
   kubectl get svc nightingale-center -n monitor | grep -v NAME | awk '{print $3}'
   ```

2. Create the values.yaml file

   ```bash
   vi n9e-values.yaml
   ```

   Add the following content

   ```yaml
   expose:
     type: clusterIP # Use clusterIP

   externalURL: https://hello.n9e.info # Change to your own external access URL

   persistence:
     enabled: true

   categraf:
     internal:
       docker_socket: unix:///var/run/docker.sock # If your Kubernetes runtime is containerd or something else, leave this empty.

   n9e:
     internal:
       image:
         repository: flashcatcloud/nightingale
         tag: latest # Use the latest image

   prometheus:
     type: external
     external:
       host: "10.43.119.105" # The nightingale-center svc ClusterIP from step 1
       port: "9090"
       username: ""
       password: ""
     podAnnotations: {}
   ```

3. Upgrade

   ```bash
   helm upgrade nightingale ./n9e-helm -n monitor -f n9e-values.yaml
   ```

4. Create the ServiceMonitor

   ```bash
   vi n9e-servicemonitor.yaml
   ```

   Add the following content

   ```yaml
   apiVersion: monitoring.coreos.com/v1
   kind: ServiceMonitor
   metadata:
     name: n9e-center-monitor
     namespace: monitor
   spec:
     endpoints:
       - path: /metrics
         port: port
     namespaceSelector:
       matchNames:
         - monitor
     selector:
       matchLabels:
         app: n9e
   ```

5. Deploy

   ```bash
   kubectl apply -f n9e-servicemonitor.yaml
   ```

6. Add the data source in N9e

   ```bash
   http://kube-prometheus-stack-prometheus:9090/
   ```

## Known Issues

> ```
> WARNING writer/writer.go:129 push data with remote write:http://10.43.119.105:9090/api/v1/write request got status code: 400, response body: out of order sample
> WARNING writer/writer.go:79 post to http://10.43.119.105:9090/api/v1/write got error: push data with remote write:http://10.43.119.105:9090/api/v1/write request got status code: 400, response body: out of order sample
> ```
>
> Pushing data returns a 400 error. I have no idea how to solve this yet.
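
One setting that may be worth trying for the `out of order sample` error, offered as an untested sketch rather than a confirmed fix: Prometheus 2.39 and later can accept samples that arrive out of order within a configurable time window, and kube-prometheus-stack exposes this as `prometheus.prometheusSpec.tsdb.outOfOrderTimeWindow`. The `10m` value below is an assumption for illustration, not something taken from this setup, and it has not been verified that it resolves the 400 seen here. The fragment would be merged into `kube-prometheus-stack-values.yaml` and applied with the same `helm upgrade` command as before.

```yaml
prometheus:
  prometheusSpec:
    tsdb:
      # Assumed value: accept samples up to 10 minutes older than the newest
      # sample already stored for the same series (the default window is 0s,
      # which rejects every out-of-order sample).
      outOfOrderTimeWindow: 10m
```

If the error persists with the window enabled, the cause is likely something other than late-arriving samples, for example two writers producing the same series.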