Installing and Configuring Graphite + Collectd + Statsd + Grafana
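This week I happened to get some hands-on time with Statsd. Looking into it more closely, I found there is quite a lot involved, so I am writing down the deployment and configuration steps for future reference.
It started with a project, written in golang, that needed to collect some statistics, so I used g2s. But first I had to deploy a Statsd service on my own machine. A quick Google search showed that the backend depends on quite a few services, and that there are many possible combinations. Common choices are: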
- grafana + influxdb
- statsd + graphite
- collectd + graphite
- grafana + graphite
Following the overview in introduction-to-tracking-statistics-on-servers, I chose Graphite + Collectd + Statsd + Grafana for this test.
Graphite
Graphite has three main parts: the web interface graphite-web, the aggregation and storage daemon carbon-cache, and the database whisper. By default, whisper stores its data under /var/lib/graphite/whisper.
$ sudo apt-get install graphite-web graphite-carbon
graphite-web
graphite-web is a WSGI application built with Python and Django. Its own data is not stored together with carbon and whisper, so we need to configure a separate database. Here we use postgres.
$ sudo apt-get install postgresql libpq-dev python-psycopg2
# postgres usage is not covered in detail here
$ sudo su postgres
postgres$ createdb -E utf-8 graphite
postgres$ createuser --interactive graphite -P
Edit the graphite configuration
$ sudo vim /etc/graphite/local_settings.py
# set secret_key and time_zone
SECRET_KEY = 'a_salty_string'
TIME_ZONE = 'Asia/Shanghai'
# configure the database
DATABASES = {
    'default': {
        'NAME': 'graphite',
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'USER': 'graphite',
        'PASSWORD': 'password',
        'HOST': '127.0.0.1',
        'PORT': '5432'
    }
}
Update the database
$ sudo graphite-manage syncdb
That covers the basics of Graphite-web. Next comes Carbon-cache.
carbon-cache
Enable the carbon-cache service at boot
$ sudo vim /etc/default/graphite-carbon
CARBON_CACHE_ENABLED=true
Edit the configuration file
$ sudo vim /etc/carbon/carbon.conf
# enable log rotation; the other settings can mostly stay as they are
ENABLE_LOGROTATION = True
Configure the storage schemas
$ sudo vim /etc/carbon/storage-schemas.conf
[carbon]
pattern = ^carbon\.
retentions = 60:90d
# add one rule for testing
[test]
pattern = ^test\.
retentions = 10s:10m,1m:1h,10m:1d
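# this is the default rule; it must come last, because the first matching pattern wins and any rule placed after it would never apply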
[default_1min_for_1day]
pattern = .*
retentions = 60s:1d
About retentions:
The retention policy is defined by sets of numbers. Each set consists of a metric interval (how often a metric is recorded), followed by a colon and then the length of time to store those values. You can define multiple sets of numbers separated by commas.
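To make the retention syntax concrete, here is a minimal Python sketch that splits a retentions string into archives and works out how many points each one keeps. The names `to_seconds` and `parse_retentions` are made up for this illustration, and the handling of bare numbers (seconds in the first field, a point count in the second) reflects my understanding of how whisper reads these definitions rather than its actual code.

```python
import re

# Unit suffixes handled by this sketch, mapped to seconds.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(token):
    """Convert a token such as '10s', '1m' or '90d' to seconds."""
    match = re.fullmatch(r"(\d+)([smhdwy])", token.strip())
    if match is None:
        raise ValueError("unsupported retention token: %r" % token)
    number, unit = match.groups()
    return int(number) * UNIT_SECONDS[unit]


def parse_retentions(definition):
    """Return one (seconds_per_point, number_of_points) tuple per archive."""
    archives = []
    for part in definition.split(","):
        interval, duration = [field.strip() for field in part.split(":")]
        # A bare number in the first field is taken as seconds.
        step = int(interval) if interval.isdigit() else to_seconds(interval)
        # A bare number in the second field is taken as a point count.
        if duration.isdigit():
            points = int(duration)
        else:
            points = to_seconds(duration) // step
        archives.append((step, points))
    return archives


if __name__ == "__main__":
    for step, points in parse_retentions("10s:10m,1m:1h,10m:1d"):
        print("one point every %d s, %d points kept" % (step, points))
```

For the [test] rule above, this yields three archives: 60 points at 10 s, 60 points at 1 min, and 144 points at 10 min.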
Configure data aggregation
$ sudo cp /usr/share/doc/graphite-carbon/examples/storage-aggregation.conf.example /etc/carbon/storage-aggregation.conf
$ sudo vim /etc/carbon/storage-aggregation.conf
# just skim the contents; no changes are needed for now
[min]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min
……
Start the service
$ sudo service carbon-cache start
whisper
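If whisper data files have already been created and you then change the retentions in storage-schemas, the new rule has no effect on the existing files. There are two ways to fix this:
- adjust the existing files with the whisper-resize command
- delete the existing whisper data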
nginx & uwsgi
graphite-web ships with an apache configuration file (/usr/share/graphite-web/apache2-graphite.conf); see here for how to set it up. We will instead run the graphite-web WSGI application with nginx and uwsgi. The configuration files below are based on a gist:
# nginx graphite
upstream graphite {
    server unix:///tmp/uwsgi.sock;
}
server {
    listen 9002;
    server_name localhost;
    access_log /var/log/nginx/graphite-access.log;
    error_log /var/log/nginx/graphite-error.log;
    root /usr/share/graphite/static;
    location / {
        add_header Access-Control-Allow-Origin "*";
        add_header Access-Control-Allow-Methods "GET, OPTIONS";
        add_header Access-Control-Allow-Headers "origin, authorization, accept";
        uwsgi_pass graphite;
        include /etc/nginx/uwsgi_params;
    }
    location /media {
        # This makes static media available at the /media/ url. The
        # media will continue to be available during site downtime,
        # allowing you to use styles and images in your maintenance page.
        alias /usr/lib/python2.7/dist-packages/django/contrib/admin/media;
    }
}
# /etc/uwsgi/apps-enabled/graphite.ini
[uwsgi]
vacuum = true
master = true
processes = 4
pidfile = /tmp/uwsgi.pid
socket = /tmp/uwsgi.sock
chmod-socket = 666
gid = _graphite
uid = _graphite
chdir = /usr/share/graphite-web
wsgi-file = graphite.wsgi
pymodule-alias = graphite.local_settings=/etc/graphite/local_settings.py
buffer-size = 65536
plugin = python
Required packages:
$ sudo apt-get install graphite-carbon graphite-web python-rrdtool \
    python-memcache libapache2-mod-wsgi python-psycopg python-flup \
    python-sqlite python-yaml geoip-database-contrib libgdal1 \
    nginx-full uwsgi uwsgi-plugin-python
Collectd
collectd collects system metrics from a server, such as CPU, memory and disk usage. It can also collect the status of software such as nginx.
$ sudo apt-get install collectd collectd-utils
Edit the configuration
$ sudo vim /etc/collectd/collectd.conf
# change Hostname to suit your setup
Hostname "graph_host"
Enable the commonly used plugins
LoadPlugin cpu
LoadPlugin df
LoadPlugin entropy
LoadPlugin interface
LoadPlugin load
LoadPlugin memory
LoadPlugin processes
LoadPlugin rrdtool
LoadPlugin users
LoadPlugin write_graphite
Configure the df plugin
<Plugin df>
    Device "/dev/vda"
    MountPoint "/"
    FSType "ext3"
</Plugin>
Configure write_graphite, which controls how collectd sends its data to graphite/carbon. By default carbon-cache listens on TCP ports 2003 (plaintext protocol) and 2004 (pickle protocol).
<Plugin write_graphite>
    <Node "graphing">
        Host "localhost"
        Port "2003"
        Protocol "tcp"
        LogSendErrors true
        Prefix "collectd."
        StoreRates true
        AlwaysAppendDS false
        EscapeCharacter "_"
    </Node>
</Plugin>
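What write_graphite sends to port 2003 is Carbon's plaintext protocol: one line per data point, in the form "metric.path value timestamp". The Python sketch below sends one such line by hand, which can be handy for checking that carbon-cache is reachable. The function names and the metric name `test.manual.count` are hypothetical, chosen here so that the metric falls under the [test] schema defined earlier.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext-protocol line: path, value, Unix time, newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, timestamp)


def send_metric(path, value, host="localhost", port=2003):
    """Open a TCP connection to carbon-cache and send a single data point."""
    line = format_metric(path, value)
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("ascii"))
```

With carbon-cache running locally, `send_metric("test.manual.count", 42)` should create a whisper file for that metric under the whisper data directory.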
Edit the carbon-cache configuration and add a collectd rule. It must come before the default rule [default_1min_for_1day]; otherwise the new rule has no effect.
$ sudo vim /etc/carbon/storage-schemas.conf
[collectd]
pattern = ^collectd.*
retentions = 10s:1d,1m:7d,10m:1y
# note that the smallest interval here is 10s, which matches collectd's default interval
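Why the order matters can be shown with a small Python sketch of first-match selection. This is an illustration of the rule, not Carbon's own code; `SCHEMAS` and `match_schema` are names invented for the example, and the list mirrors the sections configured in this post.

```python
import re

# Sections in file order, mirroring the storage-schemas.conf used in this post.
SCHEMAS = [
    ("carbon", r"^carbon\.", "60:90d"),
    ("collectd", r"^collectd.*", "10s:1d,1m:7d,10m:1y"),
    ("default_1min_for_1day", r".*", "60s:1d"),
]


def match_schema(metric, schemas=SCHEMAS):
    """Return (section, retentions) for the first pattern that matches."""
    for name, pattern, retentions in schemas:
        if re.search(pattern, metric):
            return name, retentions
    return None


if __name__ == "__main__":
    metric = "collectd.graph_host.load.load.shortterm"  # hypothetical metric name
    print(match_schema(metric))
    # With the catch-all rule listed first, the collectd rule is never reached.
    print(match_schema(metric, [SCHEMAS[2], SCHEMAS[1]]))
```

The second call returns the default section because `.*` matches every metric name before the collectd pattern is even tried.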
Restart the services
$ sudo service carbon-cache stop
$ sudo service carbon-cache start
$ sudo service collectd stop
$ sudo service collectd start
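Once both services are back up, one way to check that collectd data is arriving is to ask graphite-web for it through its render API. The sketch below assumes graphite-web is served on port 9002, as in the nginx configuration above. The helper names and the example target are hypothetical, and the exact metric path depends on your Hostname and enabled plugins.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def render_url(target, base="http://localhost:9002", since="-10min"):
    """Build a render API URL that returns JSON for one target."""
    query = urlencode({"target": target, "from": since, "format": "json"})
    return "%s/render?%s" % (base, query)


def fetch_series(target, **kwargs):
    """Fetch the series for a target; each entry carries 'target' and 'datapoints'."""
    with urlopen(render_url(target, **kwargs), timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `fetch_series("collectd.graph_host.load.*")` would return a list of series, each with its datapoints as [value, timestamp] pairs, if collectd is reporting under that prefix.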
Statsd
statsd is a lightweight data-collection daemon written in nodejs. Here is how it works together with graphite:
StatsD flushes stats to Graphite in sync with Graphite’s configured write interval. To do this, it aggregates all of the data between flush intervals and creates single points for each statistic to send to Graphite.
In this way, StatsD lets applications work around the effective rate-limit for sending Graphite stats. It has many libraries written in different programming languages that make it trivial to build in stats tracking with your applications.
If collectd is already installed, you can also just add the statsd plugin to collectd; see StatsD embedded into CollectD for details. For the sake of tinkering a bit more, though, we install Statsd directly:
$ sudo apt-get install git nodejs devscripts debhelper
$ mkdir ~/build
$ cd ~/build
$ git clone https://github.com/etsy/statsd.git
$ cd statsd
$ dpkg-buildpackage
$ cd ..
# statsd is written in nodejs, so extra dependencies such as npm may be needed here
# stop the carbon-cache service before installing
$ sudo service carbon-cache stop
$ sudo dpkg -i statsd*.deb
Edit the configuration file
$ sudo vim /etc/statsd/localConfig.js
{
  graphitePort: 2003
, graphiteHost: "localhost"
, port: 8125
, graphite: {
    legacyNamespace: false
  }
}
Edit the carbon-cache configuration and add a statsd rule
$ sudo vim /etc/carbon/storage-schemas.conf
[statsd]
pattern = ^stats.*
retentions = 10s:1d,1m:7d,10m:1y
Edit the carbon-cache aggregation methods
[min]
pattern = \.min$
xFilesFactor = 0.1
aggregationMethod = min
[max]
pattern = \.max$
xFilesFactor = 0.1
aggregationMethod = max
[count]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum
[lower]
pattern = \.lower(_\d+)?$
xFilesFactor = 0.1
aggregationMethod = min
[upper]
pattern = \.upper(_\d+)?$
xFilesFactor = 0.1
aggregationMethod = max
[sum]
pattern = \.sum$
xFilesFactor = 0
aggregationMethod = sum
[gauges]
pattern = ^.*\.gauges\..*
xFilesFactor = 0
aggregationMethod = last
[default_average]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average
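The two settings in each section decide how fine-grained points are rolled up into a coarser archive: aggregationMethod picks the function, and xFilesFactor is the minimum fraction of non-empty points required for the result to be stored at all. The Python sketch below models that behaviour as I understand it; `aggregate` is a made-up helper, not a whisper API.

```python
def aggregate(values, method="average", x_files_factor=0.5):
    """Roll up one coarse point from finer points; None marks a missing point."""
    known = [v for v in values if v is not None]
    # With no data, or too few known points, nothing is stored for this slot.
    if not known or len(known) / len(values) < x_files_factor:
        return None
    if method == "average":
        return sum(known) / len(known)
    if method == "sum":
        return sum(known)
    if method == "min":
        return min(known)
    if method == "max":
        return max(known)
    if method == "last":
        return known[-1]
    raise ValueError("unknown aggregation method: %r" % method)


if __name__ == "__main__":
    sparse = [1, None, None, None, None, None]
    # A counter seen once in six slots survives with xFilesFactor = 0 ...
    print(aggregate(sparse, "sum", 0))
    # ... but is dropped under the default rule (average, 0.5).
    print(aggregate(sparse, "average", 0.5))
```

This is why the [count] and [sum] sections use sum with xFilesFactor = 0: averaging counters would distort totals, and sparse counters would otherwise vanish from the coarser archives.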
Restart the services
$ sudo service statsd start
$ sudo service carbon-cache start
The StatsD service connects to the Graphite service using a TCP connection. This allows for a reliable transfer of information.
However, StatsD itself listens for UDP packets. It collects all of the packets sent to it over a period of time (10 seconds by default). It then aggregates the packets it has received and flushes a single value for each metric to Carbon.
It is important to realize that the 10 second flush interval is exactly what we configured in our storage-schema as the shortest interval for storage. It is essential that these two configuration values match because it is what allows StatsD to get around the Carbon limitation of only accepting one value for each interval.
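Put simply: third-party applications talk to Statsd over UDP, while Statsd talks to Graphite over TCP. Statsd flushes its data every 10s by default, so the shortest interval configured in carbon must also be 10s.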
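On the application side, a StatsD datagram is just a short text string of the form "name:value|type", where the type is c for a counter, ms for a timer, or g for a gauge. Here is a minimal Python sketch of a client, assuming StatsD listens on UDP port 8125 as configured above. The function names and metric names are hypothetical; a real project would more likely use an existing client library (g2s in the golang case mentioned at the start).

```python
import socket


def format_stat(name, value, kind):
    """Build a StatsD datagram such as 'page.views:1|c'."""
    return "%s:%s|%s" % (name, value, kind)


def send_stat(name, value, kind="c", host="localhost", port=8125):
    """Fire-and-forget: send one stat to StatsD over UDP."""
    payload = format_stat(name, value, kind).encode("ascii")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()
```

Calling `send_stat("page.views", 1)` increments a counter, and `send_stat("db.query", 320, "ms")` records a timing. Because UDP is connectionless, the sender gets no confirmation that anything was received.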
Note: It is important to realize that if you send Graphite data points more frequently than the shortest archive interval length, some of your data will be lost!
This is because Graphite only applies aggregation when going from detailed archives to generalized archives. When creating the detailed data point, it only writes the most recent data sent to it when the interval has passed. We will discuss StatsD in another guide, which can help alleviate this problem by caching and aggregating data that comes in at a more frequent interval.
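A toy simulation can make this difference concrete. The Python sketch below is not how either program is implemented; it only models the behaviour described above, with hypothetical function names: writing directly keeps the last value per 10-second slot, while a StatsD-style aggregator sums counter increments before flushing.

```python
def graphite_slots(points, step=10):
    """Keep only the last value written into each `step`-second slot."""
    slots = {}
    for timestamp, value in points:
        slots[timestamp - timestamp % step] = value  # later writes overwrite
    return slots


def statsd_flush(points, step=10):
    """Sum counter increments per flush window before sending them on."""
    totals = {}
    for timestamp, value in points:
        window = timestamp - timestamp % step
        totals[window] = totals.get(window, 0) + value
    return totals


# Three increments arrive within one 10-second slot, one in the next.
events = [(100, 1), (103, 1), (108, 1), (112, 1)]
print(graphite_slots(events))  # {100: 1, 110: 1}
print(statsd_flush(events))    # {100: 3, 110: 1}
```

Sent directly, two of the three increments in the first slot are lost; aggregated first, the slot holds the full count of 3.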
Grafana
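The default graphite-web interface is rather bare-bones, and browsing data in it is cumbersome, so we also install Grafana. For installation, follow the official instructions (the official documentation is clear and easy to follow).
After that, edit the configuration ($ vim /etc/grafana/), start the service ($ sudo service grafana-server start), configure nginx, and add data sources, dashboards, and so on.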