Monitoring Docker container TcpState with cAdvisor, InfluxDB, and Grafana
enjolras1205
After setting up a cAdvisor + InfluxDB + Grafana monitoring cluster, I found that no TCP-related data was showing up.
Source version:
https://github.com/google/cad...
git commit hash: 9db8c7dee20a0c41627b208977ab192a0411bf93
Reference for setting up cAdvisor, InfluxDB, and Grafana:
https://botleg.com/stories/mo...
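Investigation
Is cAdvisor not recording TCP state at all?
A quick search turns up the suggestion that, because of cAdvisor's high CPU usage, you need to pass --disable_metrics="" (that is, disable nothing) to get these metrics:
https://github.com/google/cad...
At this commit, that turns out not to be the cause.
Start cAdvisor locally without any extra parameters:
~/gopath/src/github.com/google/cadvisor(master*) » sudo ./cadvisor -logtostderr
Open http://127.0.0.1:8080/containers/ in a browser. The response already includes TcpState.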
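The same check can be scripted instead of eyeballed in a browser. The following is a minimal sketch, not cAdvisor's own client code: it decodes a container-info JSON payload and reports whether the last stats sample carries a TCP block. The JSON field names ("stats", "network", "tcp", "Established", and so on) are assumptions about cAdvisor's v1 API shape, and the sample payload is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tcpStat holds a few TCP connection-state counters.
// Field names are an assumption about cAdvisor's JSON output.
type tcpStat struct {
	Established uint64
	TimeWait    uint64
	CloseWait   uint64
	Listen      uint64
}

// containerInfo models only the part of the payload this check needs.
type containerInfo struct {
	Name  string `json:"name"`
	Stats []struct {
		Network struct {
			Tcp *tcpStat `json:"tcp"`
		} `json:"network"`
	} `json:"stats"`
}

// latestTcpStat returns the TCP counters of the last stats sample in the
// payload. The boolean is false when there are no samples or no TCP block.
func latestTcpStat(payload []byte) (tcpStat, bool, error) {
	var info containerInfo
	if err := json.Unmarshal(payload, &info); err != nil {
		return tcpStat{}, false, err
	}
	if len(info.Stats) == 0 {
		return tcpStat{}, false, nil
	}
	last := info.Stats[len(info.Stats)-1]
	if last.Network.Tcp == nil {
		return tcpStat{}, false, nil
	}
	return *last.Network.Tcp, true, nil
}

func main() {
	// Hypothetical payload; a real one would come from the cAdvisor HTTP API.
	sample := []byte(`{"name":"/docker/abc","stats":[{"network":{"tcp":{"Established":3,"TimeWait":1,"CloseWait":0,"Listen":2}}}]}`)
	stat, ok, err := latestTcpStat(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("has tcp: %v, established: %d\n", ok, stat.Established)
}
```

Using a pointer for the TCP block lets the check distinguish "block missing" from "all counters are zero".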
Was it written to InfluxDB?
- Open the InfluxDB shell
InfluxDB shell 0.9.6.1
> show databases
name: databases
---------------
name
_internal
mydb
cadvisor
> use cadvisor
Using database cadvisor
> show tag keys
name: cpu_usage_system
----------------------
tagKey
container_name
machine
As you can see, these tag keys correspond to the "select column" options in Grafana.
So is cAdvisor simply not writing the TCP data to InfluxDB?
cadvisor/storage/influxdb/influxdb.go:174
func (self *influxdbStorage) containerStatsToPoints(
	cInfo *info.ContainerInfo,
	stats *info.ContainerStats,
) (points []*influxdb.Point) {
	// CPU usage: Total usage in nanoseconds
	points = append(points, makePoint(serCpuUsageTotal, stats.Cpu.Usage.Total))
	// CPU usage: Time spent in system space (in nanoseconds)
	points = append(points, makePoint(serCpuUsageSystem, stats.Cpu.Usage.System))
	// CPU usage: Time spent in user space (in nanoseconds)
	points = append(points, makePoint(serCpuUsageUser, stats.Cpu.Usage.User))
	// CPU usage per CPU
	for i := 0; i < len(stats.Cpu.Usage.PerCpu); i++ {
		point := makePoint(serCpuUsagePerCpu, stats.Cpu.Usage.PerCpu[i])
		tags := map[string]string{"instance": fmt.Sprintf("%v", i)}
		addTagsToPoint(point, tags)
		points = append(points, point)
	}
	// Load Average
	points = append(points, makePoint(serLoadAverage, stats.Cpu.LoadAverage))
	// Memory Usage
	points = append(points, makePoint(serMemoryUsage, stats.Memory.Usage))
	// Working Set Size
	points = append(points, makePoint(serMemoryWorkingSet, stats.Memory.WorkingSet))
	// Network Stats
	points = append(points, makePoint(serRxBytes, stats.Network.RxBytes))
	points = append(points, makePoint(serRxErrors, stats.Network.RxErrors))
	points = append(points, makePoint(serTxBytes, stats.Network.TxBytes))
	points = append(points, makePoint(serTxErrors, stats.Network.TxErrors))
	self.tagPoints(cInfo, stats, points)
	return points
}
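Conclusion
containerStatsToPoints only emits CPU, load average, memory, and rx/tx network series, so no TCP state is ever converted into a point. You need to modify the cAdvisor code and add the metrics you need yourself.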
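As a rough idea of what such a change could look like, here is a self-contained sketch that turns a set of TCP state counters into points, one per state, tagged the same way the per-CPU loop above tags "instance". Everything in it is a stand-in: tcpCounters, point, the series name tcp_state, and the "state" tag are hypothetical and are not part of cAdvisor or the InfluxDB client. In a real patch the counters would come from the container stats, and the points would be built with the existing makePoint and addTagsToPoint helpers.

```go
package main

import "fmt"

// tcpCounters is a local stand-in for the per-container TCP state counters.
// The field list is an assumption, not copied from the cAdvisor source.
type tcpCounters struct {
	Established, SynSent, SynRecv, FinWait1, FinWait2 uint64
	TimeWait, Close, CloseWait, LastAck, Listen, Closing uint64
}

// point is a simplified stand-in for an InfluxDB point.
type point struct {
	Measurement string
	Tags        map[string]string
	Value       uint64
}

// serTcpState is a hypothetical series name for this sketch.
const serTcpState = "tcp_state"

// tcpStatToPoints emits one point per TCP state, distinguished by a
// "state" tag, so Grafana can group or filter on that tag.
func tcpStatToPoints(s tcpCounters) []point {
	states := []struct {
		name  string
		value uint64
	}{
		{"established", s.Established},
		{"syn_sent", s.SynSent},
		{"syn_recv", s.SynRecv},
		{"fin_wait1", s.FinWait1},
		{"fin_wait2", s.FinWait2},
		{"time_wait", s.TimeWait},
		{"close", s.Close},
		{"close_wait", s.CloseWait},
		{"last_ack", s.LastAck},
		{"listen", s.Listen},
		{"closing", s.Closing},
	}
	points := make([]point, 0, len(states))
	for _, st := range states {
		points = append(points, point{
			Measurement: serTcpState,
			Tags:        map[string]string{"state": st.name},
			Value:       st.value,
		})
	}
	return points
}

func main() {
	for _, p := range tcpStatToPoints(tcpCounters{Established: 12, TimeWait: 4, Listen: 2}) {
		fmt.Printf("%s,state=%s value=%d\n", p.Measurement, p.Tags["state"], p.Value)
	}
}
```

A single series with a "state" tag keeps the number of measurements small. One series per state (for example a separate established series) would also work and may be simpler to select in Grafana.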