My MON Service crash
maxthetor
24 Posts
March 23, 2018, 10:05 pm
One of my mon services has crashed, probably due to DB corruption:
mon/MonitorDBStore.h: 306: FAILED assert(0 == "failed to write to db")
This server is a mon and an OSD at the same time.
I would like to know the best practice for replacing a mon.
Do I shut it down and reinstall? Do I remove the OSD disks first?
This mon is mon02; if I create a new server, call it mon05, and try to add it as a new mon server, will the PetaSAN cluster accept it?
There are only 3 mon servers, correct?
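For reference, the current monitor membership and quorum can be checked from any management node. A minimal sketch, assuming the cluster name "cloud" that appears in the status output later in this thread:

ceph mon dump --cluster cloud    # lists the mons in the monmap (should show 3)
ceph mon stat --cluster cloud    # shows which mons are currently in quorum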
admin
2,930 Posts
March 24, 2018, 12:00 am
For v1.5 and 2.0:
Replace the OS disk.
Install using the installer; you need to specify the same hostname, same management interface, and same management IP that you had specified before.
When deploying the node, choose "Replace Management Node", join via any of the 2 existing management nodes, then click Next, Next.
When done, reboot the system to restart the old OSDs (this is a bug).
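After the node is redeployed it should rejoin the monitor quorum on its own. A minimal sketch for confirming this, assuming the cluster name "cloud" and the node names shown in the status output below:

ceph quorum_status --cluster cloud    # quorum_names should include all three mons again
ceph -s --cluster cloud               # overall health; PGs may still backfill for a while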
maxthetor
24 Posts
March 26, 2018, 2:17 am
Done.
But the new node only joins the cluster once it's clean/OK, right?
https://imgur.com/ho0lPuI
root@san01:~# ceph -w --cluster cloud
cluster 9f99e76f-1f50-4aa3-b876-dbf194a3cadf
health HEALTH_WARN
676 pgs backfill_wait
3 pgs backfilling
679 pgs degraded
679 pgs stuck unclean
679 pgs undersized
recovery 229202/698840 objects degraded (32.797%)
recovery 194120/698840 objects misplaced (27.777%)
1 mons down, quorum 0,2 san01,san03
monmap e3: 3 mons at {san01=10.0.10.1:6789/0,san02=10.0.10.2:6789/0,san03=10.0.10.3:6789/0}
election epoch 882, quorum 0,2 san01,san03
osdmap e36286: 11 osds: 8 up, 8 in; 679 remapped pgs
flags sortbitwise,require_jewel_osds
pgmap v20312235: 1000 pgs, 1 pools, 1314 GB data, 330 kobjects
1740 GB used, 8176 GB / 9916 GB avail
229202/698840 objects degraded (32.797%)
194120/698840 objects misplaced (27.777%)
676 active+undersized+degraded+remapped+wait_backfill
321 active+clean
3 active+undersized+degraded+remapped+backfilling
recovery io 112 MB/s, 29 objects/s
client io 3973 kB/s rd, 468 kB/s wr, 31 op/s rd, 98 op/s wr
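The output above shows recovery still in progress (degraded and misplaced objects, one mon down). One way to follow it until the cluster returns to HEALTH_OK; a minimal sketch, using the same cluster name as above:

ceph -w --cluster cloud                 # live stream of health and recovery events
watch -n 10 'ceph -s --cluster cloud'   # or poll a status summary every 10 seconds
ceph health detail --cluster cloud      # lists the PGs that are still degraded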
admin
2,930 Posts
March 26, 2018, 8:53 am
The deployment should complete quickly. Are you using 1.5 or 2.0?
maxthetor
24 Posts
March 29, 2018, 12:20 am
1.5.
The cluster is OK.
Thanks.