iSCSI not starting
Lerner
7 Posts
December 4, 2018, 12:46 pm
Hi,
When I added the new node, PetaSAN showed osd.2 as down and only gave me the option to remove it.
The new node48 shows OSDs 3 and 7, and all of its OSDs are up and started.
root@node48:~# ceph health detail --cluster loks
HEALTH_WARN Reduced data availability: 55 pgs inactive, 55 pgs incomplete; Degraded data redundancy: 55 pgs unclean
PG_AVAILABILITY Reduced data availability: 55 pgs inactive, 55 pgs incomplete
pg 1.7 is incomplete, acting [7,4]
pg 1.a is incomplete, acting [5,7]
pg 1.10 is incomplete, acting [3,5]
pg 1.15 is incomplete, acting [3,5]
pg 1.1c is incomplete, acting [5,3]
pg 1.22 is incomplete, acting [7,5]
pg 1.24 is incomplete, acting [5,7]
pg 1.2d is incomplete, acting [3,5]
pg 1.2e is incomplete, acting [3,5]
pg 1.32 is incomplete, acting [7,4]
pg 1.34 is incomplete, acting [4,3]
pg 1.38 is incomplete, acting [4,7]
pg 1.41 is incomplete, acting [3,4]
pg 1.42 is incomplete, acting [4,3]
pg 1.4e is incomplete, acting [3,5]
pg 1.55 is incomplete, acting [4,3]
pg 1.56 is incomplete, acting [3,4]
pg 1.5e is incomplete, acting [7,4]
pg 1.64 is incomplete, acting [4,7]
pg 1.6a is incomplete, acting [3,4]
pg 1.6e is incomplete, acting [5,7]
pg 1.70 is incomplete, acting [5,7]
pg 1.72 is incomplete, acting [3,4]
pg 1.81 is incomplete, acting [4,7]
pg 1.84 is incomplete, acting [4,3]
pg 1.8d is incomplete, acting [4,3]
pg 1.92 is incomplete, acting [5,3]
pg 1.94 is incomplete, acting [3,5]
pg 1.97 is incomplete, acting [3,4]
pg 1.9d is incomplete, acting [4,3]
pg 1.a1 is incomplete, acting [5,7]
pg 1.a3 is incomplete, acting [4,3]
pg 1.a4 is incomplete, acting [3,5]
pg 1.a7 is incomplete, acting [4,3]
pg 1.ab is incomplete, acting [7,5]
pg 1.ac is incomplete, acting [3,4]
pg 1.b2 is stuck inactive for 308250.991831, current state incomplete, last acting [5,3]
pg 1.c8 is incomplete, acting [5,7]
pg 1.ca is incomplete, acting [7,4]
pg 1.ce is incomplete, acting [3,5]
pg 1.d3 is incomplete, acting [5,3]
pg 1.d4 is incomplete, acting [5,7]
pg 1.d5 is incomplete, acting [3,5]
pg 1.d7 is incomplete, acting [7,5]
pg 1.d8 is incomplete, acting [7,4]
pg 1.d9 is incomplete, acting [3,5]
pg 1.e4 is incomplete, acting [4,3]
pg 1.e5 is incomplete, acting [4,7]
pg 1.e8 is incomplete, acting [3,4]
pg 1.ea is incomplete, acting [7,6]
pg 1.fc is incomplete, acting [7,4]
PG_DEGRADED Degraded data redundancy: 55 pgs unclean
pg 1.7 is stuck unclean since forever, current state incomplete, last acting [7,4]
pg 1.a is stuck unclean for 338320.011063, current state incomplete, last acting [5,7]
pg 1.10 is stuck unclean since forever, current state incomplete, last acting [3,5]
pg 1.15 is stuck unclean since forever, current state incomplete, last acting [3,5]
pg 1.1c is stuck unclean for 339842.526709, current state incomplete, last acting [5,3]
pg 1.22 is stuck unclean since forever, current state incomplete, last acting [7,5]
pg 1.24 is stuck unclean for 337308.425823, current state incomplete, last acting [5,7]
pg 1.2d is stuck unclean since forever, current state incomplete, last acting [3,5]
pg 1.2e is stuck unclean since forever, current state incomplete, last acting [3,5]
pg 1.32 is stuck unclean since forever, current state incomplete, last acting [7,4]
pg 1.34 is stuck unclean for 337069.758337, current state incomplete, last acting [4,3]
pg 1.38 is stuck unclean for 337601.590150, current state incomplete, last acting [4,7]
pg 1.41 is stuck unclean since forever, current state incomplete, last acting [3,4]
pg 1.42 is stuck unclean for 337060.232791, current state incomplete, last acting [4,3]
pg 1.4e is stuck unclean since forever, current state incomplete, last acting [3,5]
pg 1.55 is stuck unclean for 337017.317347, current state incomplete, last acting [4,3]
pg 1.56 is stuck unclean since forever, current state incomplete, last acting [3,4]
pg 1.5e is stuck unclean since forever, current state incomplete, last acting [7,4]
pg 1.64 is stuck unclean for 350982.214792, current state incomplete, last acting [4,7]
pg 1.6a is stuck unclean since forever, current state incomplete, last acting [3,4]
pg 1.6e is stuck unclean for 337086.227150, current state incomplete, last acting [5,7]
pg 1.70 is stuck unclean for 337088.813288, current state incomplete, last acting [5,7]
pg 1.72 is stuck unclean since forever, current state incomplete, last acting [3,4]
pg 1.81 is stuck unclean for 337200.926400, current state incomplete, last acting [4,7]
pg 1.84 is stuck unclean for 337072.863306, current state incomplete, last acting [4,3]
pg 1.8d is stuck unclean for 337026.276733, current state incomplete, last acting [4,3]
pg 1.92 is stuck unclean for 337286.897884, current state incomplete, last acting [5,3]
pg 1.94 is stuck unclean since forever, current state incomplete, last acting [3,5]
pg 1.97 is stuck unclean since forever, current state incomplete, last acting [3,4]
pg 1.9d is stuck unclean for 337027.837860, current state incomplete, last acting [4,3]
pg 1.a1 is stuck unclean for 337019.656995, current state incomplete, last acting [5,7]
pg 1.a3 is stuck unclean for 337896.060862, current state incomplete, last acting [4,3]
pg 1.a4 is stuck unclean since forever, current state incomplete, last acting [3,5]
pg 1.a7 is stuck unclean for 339452.951245, current state incomplete, last acting [4,3]
pg 1.ab is stuck unclean since forever, current state incomplete, last acting [7,5]
pg 1.ac is stuck unclean since forever, current state incomplete, last acting [3,4]
pg 1.b2 is stuck unclean for 337076.245823, current state incomplete, last acting [5,3]
pg 1.c8 is stuck unclean for 337027.407565, current state incomplete, last acting [5,7]
pg 1.ca is stuck unclean since forever, current state incomplete, last acting [7,4]
pg 1.ce is stuck unclean since forever, current state incomplete, last acting [3,5]
pg 1.d3 is stuck unclean for 338267.489514, current state incomplete, last acting [5,3]
pg 1.d4 is stuck unclean for 337923.960109, current state incomplete, last acting [5,7]
pg 1.d5 is stuck unclean since forever, current state incomplete, last acting [3,5]
pg 1.d7 is stuck unclean since forever, current state incomplete, last acting [7,5]
pg 1.d8 is stuck unclean since forever, current state incomplete, last acting [7,4]
pg 1.d9 is stuck unclean since forever, current state incomplete, last acting [3,5]
pg 1.e4 is stuck unclean for 337071.145460, current state incomplete, last acting [4,3]
pg 1.e5 is stuck unclean for 337885.022022, current state incomplete, last acting [4,7]
pg 1.e8 is stuck unclean since forever, current state incomplete, last acting [3,4]
pg 1.ea is stuck unclean since forever, current state incomplete, last acting [7,6]
pg 1.fc is stuck unclean since forever, current state incomplete, last acting [7,4]
root@node48:~# ceph pg 1.10 query --cluster loks
"recovery_state": [
{
"name": "Started/Primary/Peering/Incomplete",
"enter_time": "2018-12-03 23:01:24.176814",
"comment": "not enough complete instances of this PG"
},
{
"name": "Started/Primary/Peering",
"enter_time": "2018-12-03 23:01:24.140892",
"past_intervals": [
{
"first": "7293",
"last": "8250",
"all_participants": [
{
"osd": 0
},
{
"osd": 1
},
{
"osd": 2
},
{
"osd": 3
},
{
"osd": 5
},
{
"osd": 6
}
],
"intervals": [
{
"first": "7613",
"last": "7625",
"acting": "2"
},
{
"first": "7957",
"last": "7958",
"acting": "0"
},
{
"first": "8224",
"last": "8225",
"acting": "3"
},
{
"first": "8248",
"last": "8250",
"acting": "5"
}
]
}
],
"probing_osds": [
"0",
"1",
"3",
"5",
"6"
],
"down_osds_we_would_probe": [
2
],
"peering_blocked_by": [],
"peering_blocked_by_detail": [
{
"detail": "peering_blocked_by_history_les_bound"
}
]
},
{
"name": "Started",
"enter_time": "2018-12-03 23:01:24.140852"
}
],
root@node48:~# ceph pg 1.b2 query --cluster loks
],
"recovery_state": [
{
"name": "Started/Primary/Peering/Incomplete",
"enter_time": "2018-12-03 23:01:24.101226",
"comment": "not enough complete instances of this PG"
},
{
"name": "Started/Primary/Peering",
"enter_time": "2018-12-03 23:01:24.098373",
"past_intervals": [
{
"first": "7293",
"last": "8250",
"all_participants": [
{
"osd": 2
},
{
"osd": 3
},
{
"osd": 5
},
{
"osd": 6
}
],
"intervals": [
{
"first": "7613",
"last": "7625",
"acting": "2"
},
{
"first": "7957",
"last": "7960",
"acting": "6"
},
{
"first": "8224",
"last": "8225",
"acting": "3"
},
{
"first": "8248",
"last": "8250",
"acting": "5"
}
]
}
],
"probing_osds": [
"3",
"5",
"6"
],
"down_osds_we_would_probe": [
2
],
"peering_blocked_by": [],
"peering_blocked_by_detail": [
{
"detail": "peering_blocked_by_history_les_bound"
}
]
},
{
"name": "Started",
"enter_time": "2018-12-03 23:01:24.098337"
}
],
"agent_state": {}
}
root@node48:~# ceph pg 1.b2 query --cluster loks
admin
2,930 Posts
December 4, 2018, 1:24 pm
It is not clear to me whether OSD 2 is still around or not. It is not listed in the ceph osd tree output, which will happen if it has been deleted, but I need you to confirm this. Also, is OSD 7 another disk, or was OSD 2 deleted and re-added? OSD 3 has not been deleted, correct?
If OSD 2 is still physically around, then the next best thing, aside from trying to restart it, is to attempt to extract the objects on it, so please clarify this so I can help you with the extraction.
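As a rough, hedged sketch of what such an extraction could look like (the data paths and the PG id below are placeholders, and this assumes the old OSD 2 data directory is still mountable with its OSD daemon stopped), ceph-objectstore-tool can export a PG from the dead OSD and import it into a current replica:
# On the host that still holds the old OSD 2 data (OSD daemon stopped):
ceph-objectstore-tool --data-path /var/lib/ceph/osd/loks-2 --pgid 1.10 --op export --file /tmp/pg-1.10.export
# On a node holding a current copy of that PG (its OSD stopped as well):
ceph-objectstore-tool --data-path /var/lib/ceph/osd/loks-3 --pgid 1.10 --op import --file /tmp/pg-1.10.export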
If, however, OSD 2 was deleted, then we can proceed with more risk, which may result in some data loss. Hopefully it will only be the data that was being written at the moment the power failure occurred, but it may be more, so it is not without risk. We will make two attempts:
In /etc/ceph/loks.conf on all nodes, add this at the bottom:
osd_find_best_info_ignore_history_les=true
Apply it at the bottom of the file on all nodes, then reboot.
If after 20 minutes the cluster is stuck and not recovering/changing (check ceph status to see whether the pg states are changing, even slowly), do a
ceph osd lost 2 --cluster loks
Again, do not do this unless you know that OSD 2 is no longer available. Even if it is only available as a physical disk outside the cluster, we may be able to extract data from it.
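Put together, the sequence above would look roughly like this (a sketch only, assuming the override is appended as a single line at the end of /etc/ceph/loks.conf on each node):
# On every node, append the override, then reboot that node:
echo "osd_find_best_info_ignore_history_les = true" >> /etc/ceph/loks.conf
# After the reboots, watch whether the incomplete PGs start peering/recovering:
ceph status --cluster loks
ceph pg dump_stuck inactive --cluster loks
# Only if nothing changes after ~20 minutes and OSD 2 is confirmed gone
# (Ceph may also require an extra --yes-i-really-mean-it confirmation here):
ceph osd lost 2 --cluster loks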
Last edited on December 4, 2018, 1:27 pm by admin · #12
Lerner
7 Posts
December 5, 2018, 1:17 am
Hi,
osd.2 really had been deleted, so I made the change in /etc/ceph/loks.conf and it worked; iSCSI is OK again.
Thank you so much for your time and help!
admin
2,930 Posts
December 6, 2018, 6:55 am
Excellent 🙂 Remember to delete this setting from the config file and restart the nodes one at a time once the cluster is active/clean. If you have I/O, it is better to re-assign the iSCSI paths to other nodes while rebooting.
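For example (a sketch, assuming the override was appended as a single line at the end of /etc/ceph/loks.conf):
# Check the cluster is healthy before touching each node:
ceph status --cluster loks
# Remove the override line from the config on that node:
sed -i '/osd_find_best_info_ignore_history_les/d' /etc/ceph/loks.conf
# Restart or reboot the node, wait for active+clean again, then move to the next one.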
Last edited on December 6, 2018, 6:55 am by admin · #14