CPU Usage on Nodes
ufm
46 Posts
April 18, 2020, 2:33 am
Tell me, please: is a 20-30 percent CPU load on all nodes of an empty cluster (not a single pool created, no connected clients, only OSDs) normal behavior, or did I break something?
ufm
46 Posts
April 18, 2020, 2:46 am
An additional question: is this normal in the ceph.log file on a mon node?
2020-04-18 05:36:58.806997 mgr.petasan-mon3 (mgr.64131) 51343 : cluster [DBG] pgmap v48231: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:00.807434 mgr.petasan-mon3 (mgr.64131) 51344 : cluster [DBG] pgmap v48232: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:02.807846 mgr.petasan-mon3 (mgr.64131) 51345 : cluster [DBG] pgmap v48233: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:04.808280 mgr.petasan-mon3 (mgr.64131) 51346 : cluster [DBG] pgmap v48234: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:06.808819 mgr.petasan-mon3 (mgr.64131) 51347 : cluster [DBG] pgmap v48235: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:08.809340 mgr.petasan-mon3 (mgr.64131) 51348 : cluster [DBG] pgmap v48236: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:10.809739 mgr.petasan-mon3 (mgr.64131) 51349 : cluster [DBG] pgmap v48237: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:12.810118 mgr.petasan-mon3 (mgr.64131) 51350 : cluster [DBG] pgmap v48238: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:14.810534 mgr.petasan-mon3 (mgr.64131) 51351 : cluster [DBG] pgmap v48239: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:16.810917 mgr.petasan-mon3 (mgr.64131) 51352 : cluster [DBG] pgmap v48240: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:18.811333 mgr.petasan-mon3 (mgr.64131) 51353 : cluster [DBG] pgmap v48241: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:20.811710 mgr.petasan-mon3 (mgr.64131) 51354 : cluster [DBG] pgmap v48242: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
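As a side note, the pgmap lines in the excerpt arrive almost exactly every two seconds, which looks like the mgr's periodic tick rather than real client activity. A small sketch to confirm the cadence from a saved log; the timestamp format is assumed from the excerpt above:

```python
# Sketch: measure how often the mgr writes pgmap lines to ceph.log.
# Assumes the Nautilus-era log format shown above; adjust for other versions.
from datetime import datetime

def pgmap_intervals(lines):
    """Return the gaps (in seconds) between consecutive pgmap entries."""
    times = []
    for line in lines:
        if "pgmap v" not in line:
            continue
        stamp = " ".join(line.split()[:2])   # e.g. "2020-04-18 05:36:58.806997"
        times.append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f"))
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

sample = [
    "2020-04-18 05:36:58.806997 mgr.petasan-mon3 (mgr.64131) : pgmap v48231: 0 pgs",
    "2020-04-18 05:37:00.807434 mgr.petasan-mon3 (mgr.64131) : pgmap v48232: 0 pgs",
    "2020-04-18 05:37:02.807846 mgr.petasan-mon3 (mgr.64131) : pgmap v48233: 0 pgs",
]
print(pgmap_intervals(sample))   # gaps of roughly 2 seconds
```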
ceph.audit.log is also growing very fast (current size 36520615 bytes and increasing quickly):
2020-04-18 05:45:06.968275 mon.petasan-mon3 (mon.0) 78180 : audit [DBG] from='client.? 10.5.108.13:0/619389959' entity='client.admin' cmd=[{,",f,o,r,m,a,t,",:, ,",j,s,o,n,",,, ,",p,r,e,f,i,x,",:, ,",s,t,a,t,u,s,",}]: dispatch
2020-04-18 05:45:06.992723 mon.petasan-mon3 (mon.0) 78181 : audit [INF] from='client.? 10.5.108.51:0/2981322423' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.021436 mon.petasan-mon3 (mon.0) 78182 : audit [DBG] from='client.? 10.5.108.52:0/2954037331' entity='client.admin' cmd=[{"prefix": "config generate-minimal-conf"}]: dispatch
2020-04-18 05:45:07.034556 mon.petasan-mon2 (mon.2) 59685 : audit [DBG] from='client.? 10.5.108.48:0/1090715648' entity='client.admin' cmd=[{"prefix": "config generate-minimal-conf"}]: dispatch
2020-04-18 05:45:07.051300 mon.petasan-mon3 (mon.0) 78183 : audit [DBG] from='client.? 10.5.108.52:0/3913457327' entity='client.admin' cmd=[{,",f,o,r,m,a,t,",:, ,",j,s,o,n,",,, ,",p,r,e,f,i,x,",:, ,",s,t,a,t,u,s,",}]: dispatch
2020-04-18 05:45:07.068457 mon.petasan-mon2 (mon.2) 59686 : audit [DBG] from='client.? 10.5.108.48:0/3508208063' entity='client.admin' cmd=[{,",f,o,r,m,a,t,",:, ,",j,s,o,n,",,, ,",p,r,e,f,i,x,",:, ,",s,t,a,t,u,s,",}]: dispatch
2020-04-18 05:45:07.084533 mon.petasan-mon3 (mon.0) 78184 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.086991 mon.petasan-mon1 (mon.1) 52080 : audit [INF] from='client.? 10.5.108.47:0/2084591108' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.087473 mon.petasan-mon3 (mon.0) 78185 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.090076 mon.petasan-mon1 (mon.1) 52081 : audit [INF] from='client.? 10.5.108.49:0/2679780626' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.104355 mon.petasan-mon2 (mon.2) 59687 : audit [INF] from='client.? 10.5.108.12:0/338527332' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.104760 mon.petasan-mon3 (mon.0) 78186 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
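To see what is hammering the monitors, it can help to tally the audit log by command prefix and by source address. This is only a sketch: the regexes assume the exact log format shown above, and the comma-mangled cmd entries (apparently a log-formatting quirk) are simply skipped:

```python
# Sketch: tally which commands and which clients dominate ceph.audit.log.
# Regexes assume the audit-log format in the excerpt above.
import re
from collections import Counter

CMD_RE = re.compile(r'"prefix":\s*"([^"]+)"')
SRC_RE = re.compile(r"from='client\.\? ([\d.]+):")

def audit_summary(lines):
    """Count occurrences of each command prefix and each source IP."""
    cmds, sources = Counter(), Counter()
    for line in lines:
        m = CMD_RE.search(line)
        if m:
            cmds[m.group(1)] += 1
        s = SRC_RE.search(line)
        if s:
            sources[s.group(1)] += 1
    return cmds, sources

sample = """
audit [INF] from='client.? 10.5.108.51:0/298' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
audit [DBG] from='client.? 10.5.108.48:0/109' entity='client.admin' cmd=[{"prefix": "config generate-minimal-conf"}]: dispatch
audit [INF] from='client.? 10.5.108.48:0/350' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
""".strip().splitlines()

cmds, sources = audit_summary(sample)
print(cmds.most_common())      # which command is issued most often
print(sources.most_common())   # which clients issue them
```

Feeding the real ceph.audit.log through this would show whether a few nodes are issuing `config assimilate-conf` in a tight loop.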
admin
2,930 Posts
April 18, 2020, 9:22 am
No, it is not normal.
Are you using real hardware or a virtual lab setup? How much RAM and how many CPU cores? Even with the latter it should not be this high.
Can you close all management browsers and see if that makes a difference? If so, does it depend on which page you are accessing?
If you run
ceph pg ls-by-pool POOL_NAME
(replacing POOL_NAME with the name of a pool, such as rbd), does it take a long time to complete?
Last edited on April 18, 2020, 9:23 am by admin · #3
ufm
46 Posts
April 18, 2020, 9:54 am
mon servers are virtual: 4 cores and 32 GB RAM each.
OSD nodes are real servers: 4 SSDs for OSDs, 4 cores, 32 GB RAM, 2×10G Ethernet.
After closing the management browser there is no difference.
ceph pg ls-by-pool iscsi-ssd completes in less than a second:
real 0m0.261s
user 0m0.186s
sys 0m0.005s
As far as I can see in atop on an OSD server, 5% is used by petasan_config_upload and 2% each by many ceph threads.
Last edited on April 18, 2020, 9:58 am by ufm · #4
admin
2,930 Posts
April 18, 2020, 10:37 am
Does the cluster health show OK?
Do you see any errors in /opt/petasan/log/PetaSAN.log?
Is this a fresh install, or was it upgraded?
Last edited on April 18, 2020, 10:38 am by admin · #5
ufm
46 Posts
April 18, 2020, 11:22 am
Yes, the health status is OK (in the dashboard and in ceph -s):
root@petasan-mon1:~# ceph -s
  cluster:
    id:     982c2213-6936-4285-a641-56d1ab906e04
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum petasan-mon3,petasan-mon1,petasan-mon2 (age 7h)
    mgr: petasan-mon3(active, since 8h), standbys: petasan-mon2, petasan-mon1
    osd: 24 osds: 24 up (since 8h), 24 in (since 8h)
  data:
    pools:   1 pools, 1024 pgs
    objects: 0 objects, 0 B
    usage:   7.8 GiB used, 20 TiB / 20 TiB avail
    pgs:     1024 active+clean
There are currently no errors in PetaSAN.log.
It's a fresh install of 2.5.0 upgraded to 2.5.1 (all nodes upgraded; the storage nodes were upgraded before creating the OSDs).
admin
2,930 Posts
April 18, 2020, 12:09 pm
Is the 20-30% on all nodes, or only on the VMs?
Is petasan_config_upload at 5% on all nodes?
In atop, what are the 2% processes: OSDs?
Last edited on April 18, 2020, 12:10 pm by admin · #7
ufm
46 Posts
April 18, 2020, 12:17 pm
At the moment the 20-30% is only on the storage (hardware) nodes. On the VMs (monitors) it is about 15%, used by ceph-mon.
petasan_config runs only on the storage nodes.
atop on a storage node:
ATOP - S-26-5-2-3 2020/04/18 15:16:24 - 10 2020/04/18 15:16:34 ---- 10
PRC | sys 0.67s | user 8.10s | #proc 496 | #trun 1 | #tslpi 535 | #tslpu 0 | #zombie 0 | clones 1617 | #exit 294 |
CPU | sys 12% | user 84% | irq 0% | idle 304% | wait 0% | steal 0% | guest 0% | curf 3.30GHz | curscal 94% |
cpu | sys 3% | user 28% | irq 0% | idle 69% | cpu002 w 0% | steal 0% | guest 0% | curf 3.30GHz | curscal 94% |
cpu | sys 3% | user 21% | irq 0% | idle 76% | cpu000 w 0% | steal 0% | guest 0% | curf 3.30GHz | curscal 94% |
cpu | sys 3% | user 20% | irq 0% | idle 77% | cpu001 w 0% | steal 0% | guest 0% | curf 3.30GHz | curscal 94% |
cpu | sys 3% | user 15% | irq 0% | idle 82% | cpu003 w 0% | steal 0% | guest 0% | curf 3.30GHz | curscal 94% |
CPL | avg1 0.92 | avg5 0.97 | avg15 0.99 | | csw 45254 | | intr 23734 | | numcpu 4 |
MEM | tot 31.2G | free 29.9G | cache 245.5M | buff 123.0M | slab 107.3M | shmem 7.5M | vmbal 0.0M | hptot 0.0M | hpuse 0.0M |
SWP | tot 0.0M | free 0.0M | | | | | | vmcom 4.4G | vmlim 15.6G |
DSK | sda | busy 0% | read 0 | write 72 | KiB/r 0 | KiB/w 5 | MBr/s 0.0 | MBw/s 0.0 | avio 0.22 ms |
NET | transport | tcpi 8497 | tcpo 11006 | udpi 21 | udpo 21 | tcpao 442 | tcppo 0 | tcprs 0 | udpie 0 |
NET | network | ipi 8518 | ipo 8953 | ipfrw 0 | deliv 8518 | | | icmpi 0 | icmpo 0 |
NET | eth3 0% | pcki 8683 | pcko 1645 | sp 10 Gbps | si 6441 Kbps | so 868 Kbps | erri 0 | erro 0 | drpo 0 |
NET | bond0 0% | pcki 15683 | pcko 10665 | sp 20 Gbps | si 11 Mbps | so 2918 Kbps | erri 0 | erro 0 | drpo 0 |
NET | bond0.7 0% | pcki 8310 | pcko 8711 | sp 20 Gbps | si 10 Mbps | so 2801 Kbps | erri 0 | erro 0 | drpo 0 |
NET | eth2 0% | pcki 7000 | pcko 9020 | sp 10 Gbps | si 4663 Kbps | so 2050 Kbps | erri 0 | erro 0 | drpo 0 |
NET | bond0.5 0% | pcki 10 | pcko 44 | sp 20 Gbps | si 0 Kbps | so 12 Kbps | erri 0 | erro 0 | drpo 0 |
NET | lo ---- | pcki 198 | pcko 198 | sp 0 Mbps | si 162 Kbps | so 162 Kbps | erri 0 | erro 0 | drpo 0 |
PID SYSCPU USRCPU VGROW RGROW RDDSK WRDSK RUID EUID ST EXC THR S CPUNR CPU CMD 1/14
1205 0.22s 0.21s 0K -0.1M 62884K 0K 252K root root -- - 1 S 2 4% petasan_config
1869914 0.02s 0.19s 0K 0K -M 52084K 39552K - root - NE 0 0 E - 2% <ceph>
1869940 0.02s 0.19s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870556 0.01s 0.20s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1869401 0.02s 0.18s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1869991 0.00s 0.20s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870171 0.00s 0.20s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870222 0.01s 0.19s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870248 0.02s 0.18s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870453 0.00s 0.20s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870530 0.01s 0.19s 0K 05.2M 73776K - - root ceph - NE 0 0 E - 2% <ceph>
1870787 0.02s 0.18s 0K 05.2M 70936K - - root ceph - NE 0 0 E - 2% <ceph>
1869375 0.01s 0.18s 0K 05.2M 78816K - - root ceph - NE 0 0 E - 2% <ceph>
1869452 0.00s 0.19s 0K 04.2M 70352K - - root ceph - NE 0 0 E - 2% <ceph>
1869529 0.00s 0.19s 4520K 0K - - root - NE 0 0 E - 2% <ceph>
1869555 0.01s 0.18s 0K 2556K 55580K - - 17164K root - NE 0 0 E - 2% <ceph>
1869632 0.00s 0.19s 0K 03.5M 53772K - - root - NE 0 0 E - 2% <ceph>
1869683 0.02s 0.17s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1869709 0.00s 0.19s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1869786 0.02s 0.17s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1869837 0.02s 0.17s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1869863 0.01s 0.18s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870017 0.01s 0.18s 0 97028K 0K - - root - zabbix NE 0 0 E - 2% <ceph>
Last edited on April 18, 2020, 12:18 pm by ufm · #8
ufm
46 Posts
April 18, 2020, 12:45 pm
Hm. I rebooted one of the storage servers and the CPU load on it dropped to 0% (no more petasan_config_upload in the process list). To my mind this is not a good situation; I do not like "strange knocking underground"...
P.S. I rebooted another node with the same result. Should I reboot all the nodes one by one, or is this situation interesting to you, in which case I should leave one or two nodes unrebooted?
admin
2,930 Posts
April 18, 2020, 12:54 pm
Can you show which ceph processes are running at 2%, via
ps aux | grep ceph
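To group the output without eyeballing the raw list, here is a quick sketch that tallies `ps aux` output by command name. The column layout of procps `ps aux` is assumed, and the sample data below is made up for illustration:

```python
# Sketch: group `ps aux` output by command to see how many ceph processes
# exist. On a live node, capture the output of `ps aux` (e.g. via
# subprocess.run(["ps", "aux"], ...)) and feed it in; a sample is used here.
from collections import Counter

def ceph_process_counts(ps_output):
    """Count ceph-related processes by the first word of their command."""
    counts = Counter()
    for line in ps_output.splitlines()[1:]:        # skip the header row
        fields = line.split(None, 10)              # 11th field = full command
        if len(fields) == 11 and "ceph" in fields[10]:
            counts[fields[10].split()[0]] += 1
    return counts

sample = """USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
ceph 1200 2.0 1.1 1 1 ? Ssl 05:00 0:10 /usr/bin/ceph-osd -i 0
ceph 1201 2.0 1.1 1 1 ? Ssl 05:00 0:10 /usr/bin/ceph-osd -i 1
root 1300 2.0 0.5 1 1 ? S 05:01 0:02 /usr/bin/python /usr/bin/ceph status
"""
print(ceph_process_counts(sample))
```

A repeatedly respawning short-lived command in this tally would match the exited `<ceph>` entries that atop was showing.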
Pages: 1 2
CPU Usage on Nodes
ufm
46 Posts
Quote from ufm on April 18, 2020, 2:33 amTell me, please - loading 20-30 percent on all nodes in an empty cluster (not a single pool was created, there are no connected clients, only OSD) - is this normal behavior, or did I break something?
Tell me, please - loading 20-30 percent on all nodes in an empty cluster (not a single pool was created, there are no connected clients, only OSD) - is this normal behavior, or did I break something?
ufm
46 Posts
Quote from ufm on April 18, 2020, 2:46 amAdditional question: It's normal in ceph.log file on mon node:
2020-04-18 05:36:58.806997 mgr.petasan-mon3 (mgr.64131) 51343 : cluster [DBG] pgmap v48231: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:00.807434 mgr.petasan-mon3 (mgr.64131) 51344 : cluster [DBG] pgmap v48232: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:02.807846 mgr.petasan-mon3 (mgr.64131) 51345 : cluster [DBG] pgmap v48233: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:04.808280 mgr.petasan-mon3 (mgr.64131) 51346 : cluster [DBG] pgmap v48234: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:06.808819 mgr.petasan-mon3 (mgr.64131) 51347 : cluster [DBG] pgmap v48235: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:08.809340 mgr.petasan-mon3 (mgr.64131) 51348 : cluster [DBG] pgmap v48236: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:10.809739 mgr.petasan-mon3 (mgr.64131) 51349 : cluster [DBG] pgmap v48237: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:12.810118 mgr.petasan-mon3 (mgr.64131) 51350 : cluster [DBG] pgmap v48238: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:14.810534 mgr.petasan-mon3 (mgr.64131) 51351 : cluster [DBG] pgmap v48239: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:16.810917 mgr.petasan-mon3 (mgr.64131) 51352 : cluster [DBG] pgmap v48240: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:18.811333 mgr.petasan-mon3 (mgr.64131) 51353 : cluster [DBG] pgmap v48241: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:20.811710 mgr.petasan-mon3 (mgr.64131) 51354 : cluster [DBG] pgmap v48242: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
ceph.audit.log also grow very fast (now size - 36520615 and fast increase):
2020-04-18 05:45:06.968275 mon.petasan-mon3 (mon.0) 78180 : audit [DBG] from='client.? 10.5.108.13:0/619389959' entity='client.admin' cmd=[{,",f,o,r,m,a,t,",:, ,",j,s,o,n,",,, ,",p,r,e,f,i,x,",:, ,",s,t,a,t,u,s,",}]: dispatch
2020-04-18 05:45:06.992723 mon.petasan-mon3 (mon.0) 78181 : audit [INF] from='client.? 10.5.108.51:0/2981322423' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.021436 mon.petasan-mon3 (mon.0) 78182 : audit [DBG] from='client.? 10.5.108.52:0/2954037331' entity='client.admin' cmd=[{"prefix": "config generate-minimal-conf"}]: dispatch
2020-04-18 05:45:07.034556 mon.petasan-mon2 (mon.2) 59685 : audit [DBG] from='client.? 10.5.108.48:0/1090715648' entity='client.admin' cmd=[{"prefix": "config generate-minimal-conf"}]: dispatch
2020-04-18 05:45:07.051300 mon.petasan-mon3 (mon.0) 78183 : audit [DBG] from='client.? 10.5.108.52:0/3913457327' entity='client.admin' cmd=[{,",f,o,r,m,a,t,",:, ,",j,s,o,n,",,, ,",p,r,e,f,i,x,",:, ,",s,t,a,t,u,s,",}]: dispatch
2020-04-18 05:45:07.068457 mon.petasan-mon2 (mon.2) 59686 : audit [DBG] from='client.? 10.5.108.48:0/3508208063' entity='client.admin' cmd=[{,",f,o,r,m,a,t,",:, ,",j,s,o,n,",,, ,",p,r,e,f,i,x,",:, ,",s,t,a,t,u,s,",}]: dispatch
2020-04-18 05:45:07.084533 mon.petasan-mon3 (mon.0) 78184 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.086991 mon.petasan-mon1 (mon.1) 52080 : audit [INF] from='client.? 10.5.108.47:0/2084591108' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.087473 mon.petasan-mon3 (mon.0) 78185 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.090076 mon.petasan-mon1 (mon.1) 52081 : audit [INF] from='client.? 10.5.108.49:0/2679780626' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.104355 mon.petasan-mon2 (mon.2) 59687 : audit [INF] from='client.? 10.5.108.12:0/338527332' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.104760 mon.petasan-mon3 (mon.0) 78186 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
Additional question: It's normal in ceph.log file on mon node:
2020-04-18 05:36:58.806997 mgr.petasan-mon3 (mgr.64131) 51343 : cluster [DBG] pgmap v48231: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:00.807434 mgr.petasan-mon3 (mgr.64131) 51344 : cluster [DBG] pgmap v48232: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:02.807846 mgr.petasan-mon3 (mgr.64131) 51345 : cluster [DBG] pgmap v48233: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:04.808280 mgr.petasan-mon3 (mgr.64131) 51346 : cluster [DBG] pgmap v48234: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:06.808819 mgr.petasan-mon3 (mgr.64131) 51347 : cluster [DBG] pgmap v48235: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:08.809340 mgr.petasan-mon3 (mgr.64131) 51348 : cluster [DBG] pgmap v48236: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:10.809739 mgr.petasan-mon3 (mgr.64131) 51349 : cluster [DBG] pgmap v48237: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:12.810118 mgr.petasan-mon3 (mgr.64131) 51350 : cluster [DBG] pgmap v48238: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:14.810534 mgr.petasan-mon3 (mgr.64131) 51351 : cluster [DBG] pgmap v48239: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:16.810917 mgr.petasan-mon3 (mgr.64131) 51352 : cluster [DBG] pgmap v48240: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:18.811333 mgr.petasan-mon3 (mgr.64131) 51353 : cluster [DBG] pgmap v48241: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
2020-04-18 05:37:20.811710 mgr.petasan-mon3 (mgr.64131) 51354 : cluster [DBG] pgmap v48242: 0 pgs: ; 0 B data, 7.8 GiB used, 20 TiB / 20 TiB avail
ceph.audit.log also grow very fast (now size - 36520615 and fast increase):
2020-04-18 05:45:06.968275 mon.petasan-mon3 (mon.0) 78180 : audit [DBG] from='client.? 10.5.108.13:0/619389959' entity='client.admin' cmd=[{,",f,o,r,m,a,t,",:, ,",j,s,o,n,",,, ,",p,r,e,f,i,x,",:, ,",s,t,a,t,u,s,",}]: dispatch
2020-04-18 05:45:06.992723 mon.petasan-mon3 (mon.0) 78181 : audit [INF] from='client.? 10.5.108.51:0/2981322423' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.021436 mon.petasan-mon3 (mon.0) 78182 : audit [DBG] from='client.? 10.5.108.52:0/2954037331' entity='client.admin' cmd=[{"prefix": "config generate-minimal-conf"}]: dispatch
2020-04-18 05:45:07.034556 mon.petasan-mon2 (mon.2) 59685 : audit [DBG] from='client.? 10.5.108.48:0/1090715648' entity='client.admin' cmd=[{"prefix": "config generate-minimal-conf"}]: dispatch
2020-04-18 05:45:07.051300 mon.petasan-mon3 (mon.0) 78183 : audit [DBG] from='client.? 10.5.108.52:0/3913457327' entity='client.admin' cmd=[{,",f,o,r,m,a,t,",:, ,",j,s,o,n,",,, ,",p,r,e,f,i,x,",:, ,",s,t,a,t,u,s,",}]: dispatch
2020-04-18 05:45:07.068457 mon.petasan-mon2 (mon.2) 59686 : audit [DBG] from='client.? 10.5.108.48:0/3508208063' entity='client.admin' cmd=[{,",f,o,r,m,a,t,",:, ,",j,s,o,n,",,, ,",p,r,e,f,i,x,",:, ,",s,t,a,t,u,s,",}]: dispatch
2020-04-18 05:45:07.084533 mon.petasan-mon3 (mon.0) 78184 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.086991 mon.petasan-mon1 (mon.1) 52080 : audit [INF] from='client.? 10.5.108.47:0/2084591108' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.087473 mon.petasan-mon3 (mon.0) 78185 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.090076 mon.petasan-mon1 (mon.1) 52081 : audit [INF] from='client.? 10.5.108.49:0/2679780626' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.104355 mon.petasan-mon2 (mon.2) 59687 : audit [INF] from='client.? 10.5.108.12:0/338527332' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
2020-04-18 05:45:07.104760 mon.petasan-mon3 (mon.0) 78186 : audit [INF] from='client.? ' entity='client.admin' cmd=[{"prefix": "config assimilate-conf"}]: dispatch
admin
2,930 Posts
Quote from admin on April 18, 2020, 9:22 amNo it is not normal.
Are you using real hardware, or a virtual lab setup ? how much RAM/cpu cores ? even with later it should not be this much.
Can you close all management browsers and see if makes a difference, if so does it depend on which page you are accessing ?
if you run
ceph pg ls-by-pool POOL_NAME
replace POOL_NAME with name of pool such as rbd
does it take time to complete ?
No it is not normal.
Are you using real hardware, or a virtual lab setup ? how much RAM/cpu cores ? even with later it should not be this much.
Can you close all management browsers and see if makes a difference, if so does it depend on which page you are accessing ?
if you run
ceph pg ls-by-pool POOL_NAME
replace POOL_NAME with name of pool such as rbd
does it take time to complete ?
ufm
46 Posts
Quote from ufm on April 18, 2020, 9:54 ammon servers - virtual. 4 core 32G ram each.
OSD - real servers. 4 SSD for OSD, 4 core, 32G ram, 2*10G ethernet
After close management browser no difference.
ceph pg ls-by-pool iscsi-ssd - time of execution - less than second.
real 0m0.261s
user 0m0.186s
sys 0m0.005sAs I can see in atop on OSD server - 5% used by petasan_config_upload and by 2% many ceph threads.
admin
2,930 Posts
Quote from admin on April 18, 2020, 10:37 am
Does the cluster health show OK?
Do you see any errors in /opt/petasan/log/PetaSAN.log?
Is this a fresh install, or was it upgraded?
ufm
46 Posts
Quote from ufm on April 18, 2020, 11:22 am
Yes, health status is OK (in the dashboard and in ceph -s):
root@petasan-mon1:~# ceph -s
  cluster:
    id:     982c2213-6936-4285-a641-56d1ab906e04
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum petasan-mon3,petasan-mon1,petasan-mon2 (age 7h)
    mgr: petasan-mon3(active, since 8h), standbys: petasan-mon2, petasan-mon1
    osd: 24 osds: 24 up (since 8h), 24 in (since 8h)
  data:
    pools:   1 pools, 1024 pgs
    objects: 0 objects, 0 B
    usage:   7.8 GiB used, 20 TiB / 20 TiB avail
    pgs:     1024 active+clean
There are currently no errors in PetaSAN.log.
It is a fresh install of 2.5.0 upgraded to 2.5.1 (all nodes upgraded; the storage nodes were upgraded before creating the OSDs).
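As a side note, 7.8 GiB used on a cluster with zero objects is plausible as per-OSD BlueStore metadata overhead rather than user data. A small illustrative Python snippet that turns the `ceph -s` usage line into per-OSD overhead (the `parse_usage` helper is hypothetical and assumes the exact Nautilus-era wording shown above):

```python
import re

# Binary unit multipliers as used in ceph status output.
UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

def parse_usage(line):
    """Parse a `ceph -s` usage line into (used_bytes, total_bytes).

    Illustrative helper, not a ceph API; assumes the
    `X used, Y / Z avail` wording from the post above.
    """
    m = re.search(
        r"([\d.]+)\s*(\w+) used, [\d.]+\s*\w+ / ([\d.]+)\s*(\w+) avail", line
    )
    used = float(m.group(1)) * UNITS[m.group(2)]
    total = float(m.group(3)) * UNITS[m.group(4)]
    return used, total

used, total = parse_usage("usage:   7.8 GiB used, 20 TiB / 20 TiB avail")
# 7.8 GiB spread across the 24 OSDs reported by `ceph -s`.
print(round(used / 24 / 2**20), "MiB per OSD")
```

Roughly 333 MiB per OSD of baseline overhead is in the normal range for BlueStore, so the usage figure itself is not a symptom.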
admin
2,930 Posts
Quote from admin on April 18, 2020, 12:09 pm
The 20-30% - is this on all nodes, or only on the VMs?
Is petasan_config_upload at 5% on all nodes?
In atop, what are the 2% processes: OSDs?
ufm
46 Posts
Quote from ufm on April 18, 2020, 12:17 pm
At the current moment 20-30% is only on the storage (hardware) nodes. On the VMs (monitors) it is about 15%, used by ceph-mon.
petasan_config is started only on the storage nodes.
atop on a storage node:
ATOP - S-26-5-2-3   2020/04/18 15:16:24   -   10   2020/04/18 15:16:34   ----   10
PRC | sys 0.67s | user 8.10s | #proc 496 | #trun 1 | #tslpi 535 | #tslpu 0 | #zombie 0 | clones 1617 | #exit 294 |
CPU | sys 12% | user 84% | irq 0% | idle 304% | wait 0% | steal 0% | guest 0% | curf 3.30GHz | curscal 94% |
cpu | sys 3% | user 28% | irq 0% | idle 69% | cpu002 w 0% | steal 0% | guest 0% | curf 3.30GHz | curscal 94% |
cpu | sys 3% | user 21% | irq 0% | idle 76% | cpu000 w 0% | steal 0% | guest 0% | curf 3.30GHz | curscal 94% |
cpu | sys 3% | user 20% | irq 0% | idle 77% | cpu001 w 0% | steal 0% | guest 0% | curf 3.30GHz | curscal 94% |
cpu | sys 3% | user 15% | irq 0% | idle 82% | cpu003 w 0% | steal 0% | guest 0% | curf 3.30GHz | curscal 94% |
CPL | avg1 0.92 | avg5 0.97 | avg15 0.99 | | csw 45254 | | intr 23734 | | numcpu 4 |
MEM | tot 31.2G | free 29.9G | cache 245.5M | buff 123.0M | slab 107.3M | shmem 7.5M | vmbal 0.0M | hptot 0.0M | hpuse 0.0M |
SWP | tot 0.0M | free 0.0M | | | | | | vmcom 4.4G | vmlim 15.6G |
DSK | sda | busy 0% | read 0 | write 72 | KiB/r 0 | KiB/w 5 | MBr/s 0.0 | MBw/s 0.0 | avio 0.22 ms |
NET | transport | tcpi 8497 | tcpo 11006 | udpi 21 | udpo 21 | tcpao 442 | tcppo 0 | tcprs 0 | udpie 0 |
NET | network | ipi 8518 | ipo 8953 | ipfrw 0 | deliv 8518 | | | icmpi 0 | icmpo 0 |
NET | eth3 0% | pcki 8683 | pcko 1645 | sp 10 Gbps | si 6441 Kbps | so 868 Kbps | erri 0 | erro 0 | drpo 0 |
NET | bond0 0% | pcki 15683 | pcko 10665 | sp 20 Gbps | si 11 Mbps | so 2918 Kbps | erri 0 | erro 0 | drpo 0 |
NET | bond0.7 0% | pcki 8310 | pcko 8711 | sp 20 Gbps | si 10 Mbps | so 2801 Kbps | erri 0 | erro 0 | drpo 0 |
NET | eth2 0% | pcki 7000 | pcko 9020 | sp 10 Gbps | si 4663 Kbps | so 2050 Kbps | erri 0 | erro 0 | drpo 0 |
NET | bond0.5 0% | pcki 10 | pcko 44 | sp 20 Gbps | si 0 Kbps | so 12 Kbps | erri 0 | erro 0 | drpo 0 |
NET | lo ---- | pcki 198 | pcko 198 | sp 0 Mbps | si 162 Kbps | so 162 Kbps | erri 0 | erro 0 | drpo 0 |
PID SYSCPU USRCPU VGROW RGROW RDDSK WRDSK RUID EUID ST EXC THR S CPUNR CPU CMD 1/14
1205 0.22s 0.21s 0K -0.1M 62884K 0K 252K root root -- - 1 S 2 4% petasan_config
1869914 0.02s 0.19s 0K 0K -M 52084K 39552K - root - NE 0 0 E - 2% <ceph>
1869940 0.02s 0.19s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870556 0.01s 0.20s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1869401 0.02s 0.18s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1869991 0.00s 0.20s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870171 0.00s 0.20s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870222 0.01s 0.19s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870248 0.02s 0.18s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870453 0.00s 0.20s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870530 0.01s 0.19s 0K 05.2M 73776K - - root ceph - NE 0 0 E - 2% <ceph>
1870787 0.02s 0.18s 0K 05.2M 70936K - - root ceph - NE 0 0 E - 2% <ceph>
1869375 0.01s 0.18s 0K 05.2M 78816K - - root ceph - NE 0 0 E - 2% <ceph>
1869452 0.00s 0.19s 0K 04.2M 70352K - - root ceph - NE 0 0 E - 2% <ceph>
1869529 0.00s 0.19s 4520K 0K - - root - NE 0 0 E - 2% <ceph>
1869555 0.01s 0.18s 0K 2556K 55580K - - 17164K root - NE 0 0 E - 2% <ceph>
1869632 0.00s 0.19s 0K 03.5M 53772K - - root - NE 0 0 E - 2% <ceph>
1869683 0.02s 0.17s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1869709 0.00s 0.19s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1869786 0.02s 0.17s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1869837 0.02s 0.17s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1869863 0.01s 0.18s 0K 0K - - root - NE 0 0 E - 2% <ceph>
1870017 0.01s 0.18s 0 97028K 0K - - root - zabbix NE 0 0 E - 2% <ceph>
ufm
46 Posts
Quote from ufm on April 18, 2020, 12:45 pm
Hm. I rebooted one of the storage servers and the CPU load on it dropped to 0% (no more petasan_config_upload in the process list). For me this is not a good situation - I do not like "strange underground knocks"...
P.S. I rebooted another node - the same situation. Should I reboot all the nodes one by one, or is this situation interesting for you, so that I should leave one or two nodes unrebooted?
admin
2,930 Posts
Quote from admin on April 18, 2020, 12:54 pm
Can you show which ceph processes are running at 2%, via
ps aux | grep ceph
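As a companion to `ps aux | grep ceph`, the per-command CPU totals can be aggregated with a few lines of Python. This is a sketch over made-up sample output, not real data from this cluster; it assumes the standard `ps aux` column layout (%CPU in the third field, the command starting at the eleventh):

```python
from collections import defaultdict

def cpu_by_command(ps_output):
    """Aggregate %CPU per command name from `ps aux` text.

    Illustrative only: assumes standard `ps aux` columns, where
    %CPU is field 3 and the command starts at field 11.
    """
    totals = defaultdict(float)
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)
        command = fields[10].split()[0]  # executable without arguments
        totals[command] += float(fields[2])
    return dict(totals)

# Hypothetical sample resembling the thread's symptoms (many small ceph
# threads plus the petasan_config_upload process).
sample = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
ceph 1001 2.0 1.1 52084 39552 ? Ssl 05:30 0:12 /usr/bin/ceph-osd -f --cluster ceph --id 0
ceph 1002 2.1 1.2 52084 39552 ? Ssl 05:30 0:13 /usr/bin/ceph-osd -f --cluster ceph --id 1
root 1205 4.0 0.2 62884 252 ? S 05:30 0:25 python petasan_config_upload
"""
print(cpu_by_command(sample))
```

Sorting the resulting dict by value quickly shows whether the load comes from the OSD daemons themselves or from a management script such as petasan_config_upload.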