searchusermenu
  • 发布文章
  • 消息中心
点赞
收藏
评论
分享
原创

Rocketmq常见问题处理

2023-12-01 02:19:10
13
0

1.磁盘超过85%,如何快速删除数据

调整过期时间
更新文件过期时间配置  主从的配置的broker配置(只会更新主节点)
./mqadmin updateBrokerConfig -n 'namesrv地址' -c '集群名' -k fileReservedTime -v 24

从节点需要指定broker地址
./mqadmin updateBrokerConfig -b 'broker地址端口' -k fileReservedTime -v 24

2.使用rocketmq自带的脚本测试集群性能

启动消费者
sh consumer.sh -n 'namesrv地址' -t 主题 -a true -ak '账号' -sk '密码' -g 消费组名 -m true

启动生产者
sh producer.sh -n 'namesrv地址' -t 主题 -a true -ak '账号' -sk '密码' -m true


broker的store日志

查看tps情况
cat store.log|grep 'tps'
例如
2023-10-18 22:21:34 INFO StoreStatsService - [STORETPS] put_tps 10941.750808090907 get_found_tps 2.0326896482780454 get_miss_tps 50.85056398807045 get_transfered_tps 2.0326896482780454
2023-10-18 22:22:34 INFO StoreStatsService - [STORETPS] put_tps 10948.832017061548 get_found_tps 2.032757385407468 get_miss_tps 51.118849659263205 get_transfered_tps 2.032757385407468
2023-10-18 22:23:34 INFO StoreStatsService - [STORETPS] put_tps 11030.726675886892 get_found_tps 2.032892873210804 get_miss_tps 51.50550714011965 get_transfered_tps 2.032892873210804
2023-10-18 22:24:34 INFO StoreStatsService - [STORETPS] put_tps 10833.225027076563 get_found_tps 2.0328251270515705 get_miss_tps 50.70399066899942 get_transfered_tps 2.0328251270515705


查看pagecache 写入压力
cat store.log|grep 'CACHE'
2023-10-18 22:46:35 INFO StoreStatsService - [PAGECACHERT] TotalPut 610170, PutMessageDistributeTime [<=0ms]:6747 [0~10ms]:599516 [10~50ms]:3907 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0 
2023-10-18 22:47:35 INFO StoreStatsService - [PAGECACHERT] TotalPut 586070, PutMessageDistributeTime [<=0ms]:7074 [0~10ms]:576681 [10~50ms]:2315 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0 


 

3.查看集群运行情况

./mqadmin clusterList -n 'namesrv地址'

                #Cluster Name     #Broker Name            #BID  #Addr                  #Version                #InTPS(LOAD)       #OutTPS(LOAD) #PCWait(ms) #Hour #SPACE

d4fe994c6d0442dcabf741753d217c1f  broker_1                0     192.168.0.54:9600                                 0.00(,ms)           0.00(,ms)             0.00 0.0000
d4fe994c6d0442dcabf741753d217c1f  broker_1                2     192.168.0.180:9600     V4_9_6                   0.00(0,0ms)         0.00(0,0ms)          0 471565.57 0.0300
d4fe994c6d0442dcabf741753d217c1f  broker_2                0     192.168.0.95:9600      V4_9_6                   0.50(0,0ms)         2.00(0,0ms)          0 273.72 0.0300
d4fe994c6d0442dcabf741753d217c1f  broker_2                2     192.168.0.86:9600      V4_9_6                   0.50(0,0ms)         0.00(0,0ms)          0 273.72 0.0300

inTps   节点每秒写入的消息数量
outTPS  节点每秒读出的消息数量
PCWait  数据有点奇怪   消息落盘会加锁,这个值是当前时间与最后一次加锁之间的时间差值

 

4.集群吞吐量情况

./mqadmin statsAll -n 'nameserv地址' 

可选选项
-a 只打印活动的主题
-t 只打印指定的主题

例如

#Topic                                                            #Consumer Group                                                  #Accumulation      #InTPS     #OutTPS   #InMsg24Hour  #OutMsg24Hour
connector-status-topic                                            StatusManage-rmq_connect_d4fe994c6d0442dcabf741753d217c1f_10                2        0.50        0.50          43167          43167
connector-status-topic                                            StatusManage-rmq_connect_d4fe994c6d0442dcabf741753d217c1f_21                1        0.50        0.50          43167          43167
connector-status-topic                                            StatusManage-rmq_connect_d4fe994c6d0442dcabf741753d217c1f_20                1        0.50        0.50          43167          43167
connector-status-topic                                            StatusManage-rmq_connect_d4fe994c6d0442dcabf741753d217c1f_11                2        0.50        0.50          43167          43167

Accumulation 消息堆积量

InMsg24Hour    24小时内写入消息总数
OutMsg24Hour   24小时内消费者组消费总数

 

5.分析性能瓶颈

写入性能分析

cat store.log |grep 'CACHE'

2023-10-15 22:56:07 INFO StoreStatsService - [PAGECACHERT] TotalPut 25921, PutMessageDistributeTime [<=0ms]:25790 [0~10ms]:129 [10~50ms]:2 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 22:57:08 INFO StoreStatsService - [PAGECACHERT] TotalPut 131186, PutMessageDistributeTime [<=0ms]:131018 [0~10ms]:149 [10~50ms]:19 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 22:58:08 INFO StoreStatsService - [PAGECACHERT] TotalPut 134335, PutMessageDistributeTime [<=0ms]:134239 [0~10ms]:65 [10~50ms]:31 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 22:59:09 INFO StoreStatsService - [PAGECACHERT] TotalPut 129568, PutMessageDistributeTime [<=0ms]:129270 [0~10ms]:286 [10~50ms]:12 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:00:09 INFO StoreStatsService - [PAGECACHERT] TotalPut 164857, PutMessageDistributeTime [<=0ms]:164321 [0~10ms]:536 [10~50ms]:0 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:01:09 INFO StoreStatsService - [PAGECACHERT] TotalPut 174822, PutMessageDistributeTime [<=0ms]:174295 [0~10ms]:526 [10~50ms]:1 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:02:17 INFO StoreStatsService - [PAGECACHERT] TotalPut 152161, PutMessageDistributeTime [<=0ms]:151697 [0~10ms]:464 [10~50ms]:0 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:03:50 INFO StoreStatsService - [PAGECACHERT] TotalPut 26, PutMessageDistributeTime [<=0ms]:26 [0~10ms]:0 [10~50ms]:0 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:05:25 INFO StoreStatsService - [PAGECACHERT] TotalPut 22, PutMessageDistributeTime [<=0ms]:22 [0~10ms]:0 [10~50ms]:0 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:06:26 INFO StoreStatsService - [PAGECACHERT] TotalPut 110666, PutMessageDistributeTime [<=0ms]:110389 [0~10ms]:276 [10~50ms]:1 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:07:26 INFO StoreStatsService - [PAGECACHERT] TotalPut 156556, PutMessageDistributeTime [<=0ms]:156085 [0~10ms]:466 [10~50ms]:5 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:08:27 INFO StoreStatsService - [PAGECACHERT] TotalPut 98427, PutMessageDistributeTime [<=0ms]:98139 [0~10ms]:286 [10~50ms]:2 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:09:27 INFO StoreStatsService - [PAGECACHERT] TotalPut 124845, PutMessageDistributeTime [<=0ms]:124412 [0~10ms]:431 [10~50ms]:2 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:10:28 INFO StoreStatsService - [PAGECACHERT] TotalPut 130766, PutMessageDistributeTime [<=0ms]:130299 [0~10ms]:466 [10~50ms]:1 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:11:28 INFO StoreStatsService - [PAGECACHERT] TotalPut 125288, PutMessageDistributeTime [<=0ms]:124991 [0~10ms]:280 [10~50ms]:17 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:12:31 INFO StoreStatsService - [PAGECACHERT] TotalPut 128216, PutMessageDistributeTime [<=0ms]:127812 [0~10ms]:404 [10~50ms]:0 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0
2023-10-15 23:13:31 INFO StoreStatsService - [PAGECACHERT] TotalPut 72418, PutMessageDistributeTime [<=0ms]:72254 [0~10ms]:164 [10~50ms]:0 [50~100ms]:0 [100~200ms]:0 [200~500ms]:0 [500ms~1s]:0 [1~2s]:0 [2~3s]:0 [3~4s]:0 [4~5s]:0 [5~10s]:0 [10s~]:0

 

 

确定异步刷盘,主从同步是哪里导致慢
cat store.log |grep 'putMessage in lock'

因为写入pagecache导致的慢
2023-07-12 17:12:46 WARN SendMessageThread_5 - [NOTIFYME]putMessage in lock cost time(ms)=596, bodyLength=1533 AppendMessageResult=AppendMessageResult{status=PUT_OK, wroteOffset=2158519262842, wroteBytes=1716, msgId='8540AB0C0000238C000001F691C61E7A', storeTimestamp=1689153165830, logicsOffset=14044411, pagecacheRT=596, msgNum=1}
2023-07-12 17:12:46 WARN SendMessageThread_5 - putMessage not in lock eclipse time(ms)=596, bodyLength=1533
2023-07-12 17:24:09 WARN SendMessageThread_13 - [NOTIFYME]putMessage in lock cost time(ms)=522, bodyLength=146 AppendMessageResult=AppendMessageResult{status=PUT_OK, wroteOffset=2158530088764, wroteBytes=321, msgId='8540AB0C0000238C000001F6926B4F3C', storeTimestamp=1689153848515, logicsOffset=10885, pagecacheRT=522, msgNum=1}
2023-07-12 17:24:09 WARN SendMessageThread_13 - putMessage not in lock eclipse time(ms)=522, bodyLength=146
2023-07-12 17:24:17 WARN SendMessageThread_11 - [NOTIFYME]putMessage in lock cost time(ms)=600, bodyLength=831 AppendMessageResult=AppendMessageResult{status=PUT_OK, wroteOffset=2158530167568, wroteBytes=1022, msgId='8540AB0C0000238C000001F6926C8310', storeTimestamp=1689153856514, logicsOffset=17423725, pagecacheRT=600, msgNum=1}
2023-07-12 17:24:17 WARN SendMessageThread_11 - putMessage not in lock eclipse time(ms)=600, bodyLength=831

因为主从同步慢导致的超时(但不会超时) 主从网络存在阻塞
2023-07-12 17:24:17 WARN SendMessageThread_9 - putMessage not in lock eclipse time(ms)=575, bodyLength=788

 

0条评论
0 / 1000
c****n
4文章数
2粉丝数
c****n
4 文章 | 2 粉丝