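Container Observability

Step 2: View monitoring dashboards

Option 1: View in the Cloud Container Engine console

Log in to the Cloud Container Engine console and open the cluster list page. Click the name of the target cluster, then choose O&M Management > Monitoring in the left navigation pane. On the Monitoring page, click the dashboard you want to view to see its monitoring data.

Option 2: View in the Prometheus service console

1. Log in to the Prometheus monitoring service console and click Access Management in the left navigation pane.
2. On the Access Management page, click the Dashboard Query tab.
3. Under Container Environment, select the cluster you want to view to see its monitoring dashboards.

Step 3: Configure alerts

1. Log in to the Prometheus monitoring service console and click Alert Rules in the left navigation pane.
2. At the top of the page, select the relevant Prometheus instance to view the built-in Prometheus alerts. You can edit, enable, or disable the alert rules.

Basic container monitoring metrics

The following are some of the basic metrics for container clusters. Basic metrics are stored free of charge for 15 days. For instances whose retention period exceeds 15 days, storage fees for these otherwise free metrics are charged for the days beyond 15.

| Metric | Description | Group | Unit |
| --- | --- | --- | --- |
| container_memory_failures_total | Container memory failure count | cadvisor | |
| container_memory_rss | Container memory RSS | cadvisor | bytes |
| container_spec_memory_limit_bytes | Container memory limit | cadvisor | bytes |
| container_memory_failcnt | Container memory failcnt | cadvisor | |
| container_memory_cache | Container memory cache | cadvisor | |
| container_memory_swap | Container memory swap | cadvisor | |
| container_memory_usage_bytes | Container memory usage | cadvisor | bytes |
| container_memory_max_usage_bytes | Container maximum memory usage | cadvisor | bytes |
| container_cpu_load_average_10s | Container CPU load average | cadvisor | |
| container_fs_reads_total | Total container filesystem reads | cadvisor | |
| container_fs_writes_total | Total container filesystem writes | cadvisor | |
| container_network_transmit_packets_total | Container network packets transmitted | cadvisor | |
| container_network_transmit_errors_total | Container network transmit errors | cadvisor | |
| container_network_receive_errors_total | Container network receive errors | cadvisor | |
| container_network_transmit_bytes_total | Bytes transmitted over the container network | cadvisor | bytes |
| container_network_receive_bytes_total | Bytes received over the container network | cadvisor | bytes |
| container_memory_working_set_bytes | Container memory usage (working set) | cadvisor | bytes |
| container_cpu_usage_seconds_total | Container CPU usage | cadvisor | |
| container_fs_reads_bytes_total | Total bytes read from the container filesystem | cadvisor | bytes |
| container_fs_writes_bytes_total | Total bytes written to the container filesystem | cadvisor | bytes |
| container_spec_cpu_quota | Container CPU quota | cadvisor | |
| container_cpu_cfs_periods_total | Total number of CPU CFS (Completely Fair Scheduler) periods in the container | cadvisor | |
| container_cpu_cfs_throttled_periods_total | Total number of throttled CPU CFS periods in the container | cadvisor | |
| container_cpu_cfs_throttled_seconds_total | Total time the container was throttled by CPU CFS | cadvisor | |
| container_fs_inodes_free | Container free inodes | cadvisor | |
| container_fs_io_time_seconds_total | CPU time the container spent on I/O | cadvisor | |
| container_fs_io_time_weighted_seconds_total | Total weighted time of container filesystem I/O | cadvisor | |
| container_fs_limit_bytes | Container filesystem limit | cadvisor | bytes |
| container_tasks_state | State of tasks in the container | cadvisor | |
| container_fs_read_seconds_total | Container filesystem read time | cadvisor | |
| container_fs_write_seconds_total | Container filesystem write time | cadvisor | |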
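| container_fs_usage_bytes | Bytes used by the container filesystem | cadvisor | bytes |
| container_fs_inodes_total | Total inodes in the container filesystem | cadvisor | |
| container_fs_io_current | Current I/O activity on the container filesystem | cadvisor | |
| machine_cpu_cores | Number of machine CPU cores | cadvisor | |
| machine_memory_bytes | Machine memory in bytes | cadvisor | bytes |
| go_gc_duration_seconds | Go GC duration (seconds) | prometheusnodeexporter | |
| go_goroutines | Number of running goroutines | prometheusnodeexporter | |
| node_boot_time_seconds | Node boot time (seconds) | prometheusnodeexporter | |
| node_context_switches_total | Total node context switches | prometheusnodeexporter | |
| kube_node_labels | Node labels | kubestatemetrics | |
| node_cpu_seconds_total | Total node CPU time | prometheusnodeexporter | |
| node_disk_io_now | Node disk I/O operations currently in progress | prometheusnodeexporter | |
| node_disk_io_time_seconds_total | Total node disk I/O time (seconds) | prometheusnodeexporter | |
| node_disk_io_time_weighted_seconds_total | Total weighted node disk I/O time (seconds) | prometheusnodeexporter | |
| node_disk_read_bytes_total | Total bytes read from node disks | prometheusnodeexporter | bytes |
| node_disk_read_time_seconds_total | Total node disk read time (seconds) | prometheusnodeexporter | |
| node_disk_reads_completed_total | Total completed node disk reads | prometheusnodeexporter | |
| node_disk_write_time_seconds_total | Total node disk write time (seconds) | prometheusnodeexporter | |
| node_disk_writes_completed_total | Total completed node disk writes | prometheusnodeexporter | |
| node_disk_written_bytes_total | Total bytes written to node disks | prometheusnodeexporter | bytes |
| node_exporter_build_info | Node exporter build information | prometheusnodeexporter | |
| node_filefd_allocated | Allocated file descriptors on the node | prometheusnodeexporter | |
| node_filesystem_avail_bytes | Available bytes on the node filesystem | prometheusnodeexporter | bytes |
| node_filesystem_files | Number of files on the node filesystem | prometheusnodeexporter | |
| node_filesystem_files_free | Number of free files on the node filesystem | prometheusnodeexporter | |
| node_filesystem_free_bytes | Free bytes on the node filesystem | prometheusnodeexporter | bytes |
| node_filesystem_readonly | Read-only status of the node filesystem | prometheusnodeexporter | |
| node_filesystem_size_bytes | Total size of the node filesystem in bytes | prometheusnodeexporter | bytes |
| node_intr_total | Total node interrupts | prometheusnodeexporter | |
| node_load1 | Node 1-minute load | prometheusnodeexporter | |
| node_load15 | Node 15-minute load | prometheusnodeexporter | |
| node_load5 | Node 5-minute load | prometheusnodeexporter | |
| node_memory_Buffers_bytes | Node buffers memory size (bytes) | prometheusnodeexporter | bytes |
| node_memory_Cached_bytes | Node cached memory size (bytes) | prometheusnodeexporter | bytes |
| node_memory_MemAvailable_bytes | Node available memory size (bytes) | prometheusnodeexporter | bytes |
| node_memory_MemFree_bytes | Node free memory size (bytes) | prometheusnodeexporter | bytes |
| node_memory_MemTotal_bytes | Node total memory size (bytes) | prometheusnodeexporter | bytes |
| node_netstat_Tcp_ActiveOpens | Number of active TCP connection opens | prometheusnodeexporter | |
| node_netstat_Tcp_CurrEstab | Number of currently established TCP connections | prometheusnodeexporter | |
| node_netstat_Tcp_PassiveOpens | Number of passive TCP connection opens | prometheusnodeexporter | |
| node_network_receive_bytes_total | Cumulative bytes received | prometheusnodeexporter | bytes |
| node_network_receive_drop_total | Total received packets dropped | prometheusnodeexporter | |
| node_network_receive_errs_total | Total receive errors | prometheusnodeexporter | |
| node_network_receive_packets_total | Total packets received | prometheusnodeexporter | |
| node_network_transmit_bytes_total | Cumulative bytes transmitted | prometheusnodeexporter | bytes |
| node_network_transmit_drop_total | Total transmitted packets dropped | prometheusnodeexporter | |
| node_network_transmit_errs_total | Total transmit errors | prometheusnodeexporter | |
| node_network_transmit_packets_total | Total packets transmitted | prometheusnodeexporter | |
| node_network_up | Whether the network interface is up | prometheusnodeexporter | |
| node_nf_conntrack_entries | Number of connection tracking table entries | prometheusnodeexporter | |
| node_nf_conntrack_entries_limit | Connection tracking table entry limit | prometheusnodeexporter | |
| kube_node_role | Kubernetes node role | kubestatemetrics | |
| node_processes_max_processes | Maximum number of processes | prometheusnodeexporter | |
| node_processes_pids | Number of process IDs | prometheusnodeexporter | |
| kube_node_info | Node information | kubestatemetrics | |
| node_sockstat_TCP_alloc | Number of allocated TCP sockets | prometheusnodeexporter | |
| node_sockstat_TCP_inuse | Number of TCP sockets in use | prometheusnodeexporter | |
| node_sockstat_TCP_tw | Number of TCP sockets in TIME_WAIT | prometheusnodeexporter | |
| node_timex_offset_seconds | Time offset (seconds) | prometheusnodeexporter | |
| node_timex_sync_status | Clock synchronization status | prometheusnodeexporter | |
| node_uname_info | System information (uname) | prometheusnodeexporter | |
| node_vmstat_pgfault | VM statistics: page faults | prometheusnodeexporter | |
| node_vmstat_pgmajfault | VM statistics: major page faults | prometheusnodeexporter | |
| node_vmstat_pgpgin | VM statistics: pages paged in | prometheusnodeexporter | |
| node_vmstat_pgpgout | VM statistics: pages paged out | prometheusnodeexporter | |
| process_cpu_seconds_total | Total process CPU time (seconds) | prometheusnodeexporter | |
| process_resident_memory_bytes | Process resident memory in bytes | prometheusnodeexporter | bytes |
| scrape_duration_seconds | Scrape duration (seconds) | prometheusnodeexporter | |
| kube_cronjob_created | Kubernetes CronJob creation time | kubestatemetrics | |
| kube_daemonset_created | Kubernetes DaemonSet creation time | kubestatemetrics | |
| kube_daemonset_status_current_number_scheduled | Number of nodes on which the Kubernetes DaemonSet is currently scheduled | kubestatemetrics | |
| kube_daemonset_status_desired_number_scheduled | Number of nodes on which the Kubernetes DaemonSet is expected to be scheduled | kubestatemetrics | |
| kube_daemonset_status_number_available | Number of available nodes for the Kubernetes DaemonSet | kubestatemetrics | |
| kube_daemonset_status_number_misscheduled | Number of misscheduled nodes for the Kubernetes DaemonSet | kubestatemetrics | |
| kube_daemonset_status_number_ready | Number of ready nodes for the Kubernetes DaemonSet | kubestatemetrics | |
| kube_daemonset_updated_number_scheduled | Number of scheduled nodes running the updated Kubernetes DaemonSet | kubestatemetrics | |
| kube_deployment_created | Kubernetes Deployment creation time | kubestatemetrics | |
| kube_deployment_labels | Kubernetes Deployment labels | kubestatemetrics | |
| kube_deployment_metadata_generation | Kubernetes Deployment metadata generation | kubestatemetrics | |
| kube_deployment_spec_replicas | Replicas specified for the Kubernetes Deployment | kubestatemetrics | |
| kube_deployment_spec_strategy_rollingupdate_max_unavailable | Maximum unavailable replicas during a Kubernetes Deployment rolling update | kubestatemetrics | |
| kube_deployment_status_observed_generation | Observed generation of the Kubernetes Deployment | kubestatemetrics | |
| kube_deployment_status_replicas | Total replicas of the Kubernetes Deployment | kubestatemetrics | |
| kube_deployment_status_replicas_available | Available replicas of the Kubernetes Deployment | kubestatemetrics | |
| kube_deployment_status_replicas_unavailable | Unavailable replicas of the Kubernetes Deployment | kubestatemetrics | |
| kube_deployment_status_replicas_updated | Updated replicas of the Kubernetes Deployment | kubestatemetrics | |
| kube_ingress_info | Ingress information | kubestatemetrics | |
| kube_job_created | Job creation time | kubestatemetrics | |
| kube_namespace_labels | Namespace labels | kubestatemetrics | |
| kube_namespace_status_phase | Namespace status phase | kubestatemetrics | |
| kube_node_spec_taint | Node taint configuration | kubestatemetrics | |
| kube_node_spec_unschedulable | Flag indicating whether the node is marked unschedulable | kubestatemetrics | |
| kube_node_status_allocatable_cpu_cores | Allocatable CPU cores on the node | kubestatemetrics | |
| kube_node_status_allocatable_memory_bytes | Allocatable memory on the node in bytes | kubestatemetrics | bytes |
| kube_node_status_allocatable_pods | Allocatable Pods on the node | kubestatemetrics | |
| kube_node_status_capacity | Node capacity | kubestatemetrics | |
| kube_node_status_capacity_cpu_cores | Node CPU core capacity | kubestatemetrics | |
| kube_node_status_capacity_memory_bytes | Node memory capacity in bytes | kubestatemetrics | bytes |
| kube_node_status_capacity_pods | Node Pod capacity | kubestatemetrics | |
| kube_node_status_condition | Node status condition | kubestatemetrics | |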
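| kube_persistentvolume_status_phase | PersistentVolume status phase | kubestatemetrics | |
| kube_persistentvolumeclaim_status_phase | PersistentVolumeClaim status phase | kubestatemetrics | |
| kube_pod_container_info | Pod container information | kubestatemetrics | |
| kube_pod_container_resource_limits | Pod container resource limits | kubestatemetrics | |
| kube_pod_container_resource_limits_cpu_cores | Pod container CPU core limit | kubestatemetrics | |
| kube_pod_container_resource_limits_memory_bytes | Pod container memory limit in bytes | kubestatemetrics | bytes |
| kube_pod_container_resource_requests_cpu_cores | Pod container CPU core request | kubestatemetrics | |
| kube_pod_container_resource_requests_memory_bytes | Pod container memory request in bytes | kubestatemetrics | bytes |
| kube_pod_container_status_last_terminated_reason | Reason for the Pod container's last termination | kubestatemetrics | |
| kube_pod_container_status_restarts_total | Total Pod container restarts | kubestatemetrics | |
| kube_pod_container_status_running | Pod container running status | kubestatemetrics | |
| kube_pod_container_status_terminated | Pod container terminated status | kubestatemetrics | |
| kube_pod_container_status_terminated_reason | Pod container termination reason | kubestatemetrics | |
| kube_pod_container_status_waiting | Pod container waiting status | kubestatemetrics | |
| kube_pod_container_status_waiting_reason | Pod container waiting reason | kubestatemetrics | |
| kube_pod_info | Pod information | kubestatemetrics | |
| kube_pod_labels | Pod labels | kubestatemetrics | |
| kube_pod_owner | Pod owner object | kubestatemetrics | |
| kube_pod_status_phase | Pod status phase | kubestatemetrics | |
| kube_pod_status_ready | Pod ready status | kubestatemetrics | |
| kube_resourcequota | Resource quota | kubestatemetrics | |
| kube_secret_info | Secret information | kubestatemetrics | |
| kube_service_info | Service information | kubestatemetrics | |
| kube_statefulset_created | StatefulSet creation time | kubestatemetrics | |
| kube_statefulset_replicas | StatefulSet replica count | kubestatemetrics | |
| kube_statefulset_status_replicas | StatefulSet status replica count | kubestatemetrics | |
| rest_client_requests_total | Total REST client requests | kubestatemetrics | |
| apiserver_admission_controller_admission_duration_seconds_bucket | APIServer admission controller admission duration in seconds (bucket) | kubeapiserver | |
| apiserver_admission_webhook_admission_duration_seconds_bucket | APIServer admission webhook admission duration in seconds (bucket) | kubeapiserver | |
| apiserver_admission_webhook_admission_duration_seconds_count | APIServer admission webhook admission duration in seconds (count) | kubeapiserver | |
| apiserver_current_inflight_requests | Number of requests the APIServer is currently processing | kubeapiserver | |
| apiserver_request_duration_seconds_bucket | APIServer request duration in seconds (bucket) | kubeapiserver | |
| apiserver_request_duration_seconds_count | APIServer request duration in seconds (count) | kubeapiserver | |
| apiserver_request_duration_seconds_sum | APIServer request duration in seconds (sum) | kubeapiserver | |
| apiserver_request_total | Total API requests | kubeapiserver | |
| rest_client_request_duration_seconds_bucket | REST client request duration in seconds (bucket) | kubeapiserver | |
| etcd_debugging_mvcc_db_total_size_in_bytes | ETCD debugging: total MVCC database size (bytes) | etcd | bytes |
| etcd_debugging_mvcc_keys_total | ETCD debugging: total MVCC keys | etcd | |
| etcd_disk_backend_commit_duration_seconds_bucket | ETCD disk backend commit duration in seconds (bucket) | etcd | |
| etcd_server_has_leader | Whether the ETCD server has a leader | etcd | |
| etcd_server_leader_changes_seen_total | Total leader changes seen by the ETCD server | etcd | |
| scheduler_pending_pods | Number of Pods pending in the scheduler | kubescheduler | |
| scheduler_pod_scheduling_attempts_bucket | Scheduler Pod scheduling attempts (bucket) | kubescheduler | |
| scheduler_scheduler_cache_size | Scheduler cache size | kubescheduler | |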
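
As an illustration of how the cadvisor metrics listed above might be combined when deciding what to alert on, the following Python sketch derives two common ratios from raw sample values: memory utilization (working set divided by limit) and the fraction of CFS periods that were throttled between two scrapes. The function names and sample numbers are hypothetical and not part of the product; the metrics are referred to by their usual underscore-separated Prometheus names, and the sketch assumes the values have already been fetched from your Prometheus instance.

```python
from typing import Optional


def memory_utilization(working_set_bytes: float, limit_bytes: float) -> Optional[float]:
    """Ratio of container_memory_working_set_bytes to container_spec_memory_limit_bytes.

    Returns None when no positive limit is set, because the ratio is
    meaningless without one.
    """
    if limit_bytes <= 0:
        return None
    return working_set_bytes / limit_bytes


def throttled_fraction(
    periods_start: float,
    periods_end: float,
    throttled_start: float,
    throttled_end: float,
) -> Optional[float]:
    """Fraction of CFS periods throttled between two samples.

    Inputs are two readings each of container_cpu_cfs_periods_total and
    container_cpu_cfs_throttled_periods_total. Both are counters, so only the
    difference between readings is meaningful. Returns None if no periods
    elapsed or if either counter appears to have reset.
    """
    elapsed = periods_end - periods_start
    throttled = throttled_end - throttled_start
    if elapsed <= 0 or throttled < 0:
        return None
    return throttled / elapsed


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(memory_utilization(512 * 1024**2, 1024**3))
    print(throttled_fraction(1000, 1600, 50, 200))
```

A threshold on either ratio (for example, sustained memory utilization close to 1) is one possible starting point for a custom alert rule; suitable values depend on the workload.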