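1. Get the Prometheus binary
Method 1: download it in a browser directly from the official website.
Method 2: download it with wget. This needs access to the external network; from inside mainland China the connection is slow and the download will most likely fail. To find the URL, hover the cursor over the name of the binary you want on the download page and the link appears in the bottom-left corner of the browser, for example: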
wget https://github.com/prometheus/prometheus/releases/download/v2.5.0/prometheus-2.5.0.linux-amd64.tar.gz
2. After the download finishes, extract the archive with tar -zxvf (the extracted directory contains an initial prometheus.yml file).
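The download and extraction steps above can be sketched as a small script. This is only an illustration: the function names prom_tarball_name, prom_release_url and fetch_prometheus are hypothetical helpers, and the URL pattern is inferred from the single v2.5.0 link shown above, so other releases or platforms may differ.

```shell
# Hypothetical helper: file name of the linux-amd64 tarball for a version.
prom_tarball_name() {
  printf 'prometheus-%s.linux-amd64.tar.gz' "$1"
}

# Hypothetical helper: release URL, following the pattern of the v2.5.0 link.
prom_release_url() {
  printf 'https://github.com/prometheus/prometheus/releases/download/v%s/%s' \
    "$1" "$(prom_tarball_name "$1")"
}

# Download and extract one release into the current directory.
# Nothing is downloaded until this function is called explicitly.
fetch_prometheus() {
  wget "$(prom_release_url "$1")" || return 1
  tar -zxvf "$(prom_tarball_name "$1")"
}

# Usage (needs network access):
#   fetch_prometheus 2.5.0
```

Keeping the version as a parameter makes it easier to keep the downloaded release and the paths used in later steps in sync.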
3. Check the version / validate the yml file
./prometheus --version
./promtool check config /etc/prom/prometheus-2.13.1.linux-amd64/prometheus.yml
4. Preparation before starting
Before starting, check whether the port is already in use:
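netstat -anpl | grep 8091
Synchronize the system time to fix a mismatch between the browser's clock and the server's clock. Note that in a k8s cluster, the node whose time needs to be synchronized is not the master node but the node on which Prometheus runs.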
date -s 11:45:12
5. Start Prometheus
(1) Start directly
./prometheus --config.file=prometheus.yml --web.listen-address=:8091 --web.enable-lifecycle
(2) Start in the background (log output is appended to output.log in the current directory)
nohup ./prometheus --config.file=prometheus.yml --web.listen-address=:8091 --web.enable-lifecycle >> output.log 2>&1 &
Some relevant startup flags:
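# Database storage location; when deployed with docker, this is the path inside the container
- "--storage.tsdb.path=/prometheus/data/"
# Data retention time, set to 15 days here
# (the 2.13.1 help output below marks this flag as deprecated in favor of --storage.tsdb.retention.time)
- "--storage.tsdb.retention=15d"
# The reload command only works if Prometheus was started with this flag
# A reload takes effect immediately
- "--web.enable-lifecycle"
Reload command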
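# When the rules file changes, reload Prometheus, not alertmanager
# (replace ip and 9090 with the host and the port given in --web.listen-address, e.g. 8091 in the commands above)
curl -X POST http://ip:9090/-/reload
Note: Prometheus help output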
[root@node-gvngrmix prometheus-2.13.1.linux-amd64]# ./prometheus --help
usage: prometheus [<flags>]

The Prometheus monitoring server

Flags:
  -h, --help
        Show context-sensitive help (also try --help-long and --help-man).
      --version
        Show application version.
      --config.file="prometheus.yml"
        Prometheus configuration file path.
      --web.listen-address="0.0.0.0:9090"
        Address to listen on for UI, API, and telemetry.
      --web.read-timeout=5m
        Maximum duration before timing out read of the request, and closing idle connections.
      --web.max-connections=512
        Maximum number of simultaneous connections.
      --web.external-url=<URL>
        The URL under which Prometheus is externally reachable (for example, if Prometheus is served via a reverse proxy). Used for generating relative and absolute links back to Prometheus itself. If the URL has a path portion, it will be used to prefix all HTTP endpoints served by Prometheus. If omitted, relevant URL components will be derived automatically.
      --web.route-prefix=<path>
        Prefix for the internal routes of web endpoints. Defaults to path of --web.external-url.
      --web.user-assets=<path>
        Path to static asset directory, available at /user.
      --web.enable-lifecycle
        Enable shutdown and reload via HTTP request.
      --web.enable-admin-api
        Enable API endpoints for admin control actions.
      --web.console.templates="consoles"
        Path to the console template directory, available at /consoles.
      --web.console.libraries="console_libraries"
        Path to the console library directory.
      --web.page-title="Prometheus Time Series Collection and Processing Server"
        Document title of Prometheus instance.
      --web.cors.origin=".*"
        Regex for CORS origin. It is fully anchored. Example: 'https?://(domain1|domain2)\.com'
      --storage.tsdb.path="data/"
        Base path for metrics storage.
      --storage.tsdb.retention=STORAGE.TSDB.RETENTION
        [DEPRECATED] How long to retain samples in storage. This flag has been deprecated, use "storage.tsdb.retention.time" instead.
      --storage.tsdb.retention.time=STORAGE.TSDB.RETENTION.TIME
        How long to retain samples in storage. When this flag is set it overrides "storage.tsdb.retention". If neither this flag nor "storage.tsdb.retention" nor "storage.tsdb.retention.size" is set, the retention time defaults to 15d.
      --storage.tsdb.retention.size=STORAGE.TSDB.RETENTION.SIZE
        [EXPERIMENTAL] Maximum number of bytes that can be stored for blocks. Units supported: KB, MB, GB, TB, PB. This flag is experimental and can be changed in future releases.
      --storage.tsdb.no-lockfile
        Do not create lockfile in data directory.
      --storage.tsdb.allow-overlapping-blocks
        [EXPERIMENTAL] Allow overlapping blocks, which in turn enables vertical compaction and vertical query merge.
      --storage.tsdb.wal-compression
        Compress the tsdb WAL.
      --storage.remote.flush-deadline=<duration>
        How long to wait flushing sample on shutdown or config reload.
      --storage.remote.read-sample-limit=5e7
        Maximum overall number of samples to return via the remote read interface, in a single query. 0 means no limit. This limit is ignored for streamed response types.
      --storage.remote.read-concurrent-limit=10
        Maximum number of concurrent remote read calls. 0 means no limit.
      --storage.remote.read-max-bytes-in-frame=1048576
        Maximum number of bytes in a single frame for streaming remote read response types before marshalling. Note that client might have limit on frame size as well. 1MB as recommended by protobuf by default.
      --rules.alert.for-outage-tolerance=1h
        Max time to tolerate prometheus outage for restoring "for" state of alert.
      --rules.alert.for-grace-period=10m
        Minimum duration between alert and restored "for" state. This is maintained only for alerts with configured "for" time greater than grace period.
      --rules.alert.resend-delay=1m
        Minimum amount of time to wait before resending an alert to Alertmanager.
      --alertmanager.notification-queue-capacity=10000
        The capacity of the queue for pending Alertmanager notifications.
      --alertmanager.timeout=10s
        Timeout for sending alerts to Alertmanager.
      --query.lookback-delta=5m
        The maximum lookback duration for retrieving metrics during expression evaluations.
      --query.timeout=2m
        Maximum time a query may take before being aborted.
      --query.max-concurrency=20
        Maximum number of queries executed concurrently.
      --query.max-samples=50000000
        Maximum number of samples a single query can load into memory. Note that queries will fail if they try to load more samples than this into memory, so this also limits the number of samples a query can return.
      --log.level=info
        Only log messages with the given severity or above. One of: [debug, info, warn, error]
      --log.format=logfmt
        Output format of log messages. One of: [logfmt, json]
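
Putting the start and reload commands from step 5 together with the help output, one possible way to keep the listen port consistent between the two is sketched below. The helper names prom_cmd and prom_reload_url are hypothetical, and the sketch assumes a release such as 2.13.1 whose help output lists --storage.tsdb.retention.time; older releases may only accept the deprecated --storage.tsdb.retention.

```shell
# Hypothetical helper: print the start command for a given port and retention.
# Uses --storage.tsdb.retention.time, which the help output above recommends
# over the deprecated --storage.tsdb.retention.
prom_cmd() {
  printf './prometheus --config.file=prometheus.yml --web.listen-address=:%s --storage.tsdb.retention.time=%s --web.enable-lifecycle' \
    "$1" "$2"
}

# Hypothetical helper: reload endpoint for a given host and port.
# The port must be the same one passed to --web.listen-address.
prom_reload_url() {
  printf 'http://%s:%s/-/reload' "$1" "$2"
}

# Usage (run from the extracted Prometheus directory; not executed here):
#   nohup $(prom_cmd 8091 15d) >> output.log 2>&1 &
#   ./promtool check config prometheus.yml && curl -X POST "$(prom_reload_url localhost 8091)"
```

Validating the configuration with promtool before posting to the reload endpoint keeps an invalid file from being submitted for reload.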