
OSD down, can't add it anymore

Hello,

suddenly one of my OSDs went down. I deleted it successfully, but I can't add it back anymore: when I choose "Add device" from the disk list, the yellow "Adding" icon appears under Status, but after a few seconds it disappears and the plus (+) icon is shown again.
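
For reference, this is what I understand the GUI's "Add device" action to be doing under the hood, so I could also retry it by hand from the CLI (this is just my assumption about the GUI, and /dev/sdX is a placeholder for the disk in question):

   # confirm osd.5 is really gone from the cluster
   ceph osd tree
   # wipe any leftover LVM metadata / labels on the disk
   ceph-volume lvm zap /dev/sdX --destroy
   # create and activate a fresh OSD on the disk
   ceph-volume lvm create --data /dev/sdX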

These are the last lines of the log file of the failed OSD (osd.5):

-25> 2020-04-28 12:44:40.130 7f55b5d00700  5 bluestore.MempoolThread(0x563b56d02b68) _trim_shards cache_size: 2845415832 kv_alloc: 1073741824 kv_used: 92051200 meta_alloc: 1040187392 meta_used: 58043937 data_alloc: 721420288 data_used: 97234944
-24> 2020-04-28 12:44:41.138 7f55b5d00700  5 prioritycache tune_memory target: 4294967296 mapped: 671752192 unmapped: 1819533312 heap: 2491285504 old mem: 2845415832 new mem: 2845415832
-23> 2020-04-28 12:44:42.134 7f55b5d00700  5 prioritycache tune_memory target: 4294967296 mapped: 671752192 unmapped: 1819533312 heap: 2491285504 old mem: 2845415832 new mem: 2845415832
-22> 2020-04-28 12:44:43.138 7f55b5d00700  5 prioritycache tune_memory target: 4294967296 mapped: 671752192 unmapped: 1819533312 heap: 2491285504 old mem: 2845415832 new mem: 2845415832
-21> 2020-04-28 12:44:44.050 7f55a34db700  5 osd.5 6236 heartbeat osd_stat(store_statfs(0x916e8ce0000/0x40000000/0x9273fffe000, data 0x10f82528a/0x117310000, compress 0x0/0x0/0x0, omap 0x1fd3, meta 0x3fffe02d), peers [0,1,2,3,4,6,10,11,12,13,14,15,16,17,19,20,21,24,25,26,27,28] op hist [])
-20> 2020-04-28 12:44:44.138 7f55b5d00700  5 prioritycache tune_memory target: 4294967296 mapped: 671752192 unmapped: 1819533312 heap: 2491285504 old mem: 2845415832 new mem: 2845415832
-19> 2020-04-28 12:44:44.546 7f55a34db700  5 osd.5 6236 heartbeat osd_stat(store_statfs(0x916e8ce0000/0x40000000/0x9273fffe000, data 0x10f82528a/0x117310000, compress 0x0/0x0/0x0, omap 0x1fd3, meta 0x3fffe02d), peers [0,1,2,3,4,6,10,11,12,13,14,15,16,17,19,20,21,24,25,26,27,28] op hist [])
-18> 2020-04-28 12:44:45.138 7f55b5d00700  5 prioritycache tune_memory target: 4294967296 mapped: 671752192 unmapped: 1819533312 heap: 2491285504 old mem: 2845415832 new mem: 2845415832
-17> 2020-04-28 12:44:45.138 7f55b5d00700  5 bluestore.MempoolThread(0x563b56d02b68) _trim_shards cache_size: 2845415832 kv_alloc: 1073741824 kv_used: 92051200 meta_alloc: 1040187392 meta_used: 58043937 data_alloc: 721420288 data_used: 97234944
-16> 2020-04-28 12:44:46.142 7f55b5d00700  5 prioritycache tune_memory target: 4294967296 mapped: 671752192 unmapped: 1819533312 heap: 2491285504 old mem: 2845415832 new mem: 2845415832
-15> 2020-04-28 12:44:47.138 7f55b5d00700  5 prioritycache tune_memory target: 4294967296 mapped: 671752192 unmapped: 1819533312 heap: 2491285504 old mem: 2845415832 new mem: 2845415832
-14> 2020-04-28 12:44:47.674 7f55b273b700 10 monclient: tick
-13> 2020-04-28 12:44:47.674 7f55b273b700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2020-04-28 12:44:17.679409)
-12> 2020-04-28 12:44:48.146 7f55b5d00700  5 prioritycache tune_memory target: 4294967296 mapped: 671752192 unmapped: 1819533312 heap: 2491285504 old mem: 2845415832 new mem: 2845415832
-11> 2020-04-28 12:44:49.050 7f55a34db700  5 osd.5 6236 heartbeat osd_stat(store_statfs(0x916e8ce0000/0x40000000/0x9273fffe000, data 0x10f82528a/0x117310000, compress 0x0/0x0/0x0, omap 0x1fd3, meta 0x3fffe02d), peers [0,1,2,3,4,6,10,11,12,13,14,15,16,17,19,20,21,24,25,26,27,28] op hist [])
-10> 2020-04-28 12:44:49.142 7f55b5d00700  5 prioritycache tune_memory target: 4294967296 mapped: 671752192 unmapped: 1819533312 heap: 2491285504 old mem: 2845415832 new mem: 2845415832
-9> 2020-04-28 12:44:49.546 7f55a34db700  5 osd.5 6236 heartbeat osd_stat(store_statfs(0x916e8ce0000/0x40000000/0x9273fffe000, data 0x10f82528a/0x117310000, compress 0x0/0x0/0x0, omap 0x1fd3, meta 0x3fffe02d), peers [0,1,2,3,4,6,10,11,12,13,14,15,16,17,19,20,21,24,25,26,27,28] op hist [])
-8> 2020-04-28 12:44:50.142 7f55b5d00700  5 prioritycache tune_memory target: 4294967296 mapped: 671752192 unmapped: 1819533312 heap: 2491285504 old mem: 2845415832 new mem: 2845415832
-7> 2020-04-28 12:44:50.142 7f55b5d00700  5 bluestore.MempoolThread(0x563b56d02b68) _trim_shards cache_size: 2845415832 kv_alloc: 1073741824 kv_used: 92051200 meta_alloc: 1040187392 meta_used: 58043937 data_alloc: 721420288 data_used: 97234944
-6> 2020-04-28 12:44:51.142 7f55b5d00700  5 prioritycache tune_memory target: 4294967296 mapped: 671752192 unmapped: 1819533312 heap: 2491285504 old mem: 2845415832 new mem: 2845415832
-5> 2020-04-28 12:44:52.150 7f55b5d00700  5 prioritycache tune_memory target: 4294967296 mapped: 671752192 unmapped: 1819533312 heap: 2491285504 old mem: 2845415832 new mem: 2845415832
-4> 2020-04-28 12:44:52.482 7f55aecf2700  2 osd.5 6236 ms_handle_reset con 0x563b770ba480 session 0x563b6cc22280
-3> 2020-04-28 12:44:52.482 7f55aecf2700  3 osd.5 6236 handle_osd_map epochs [6237,6237], i have 6236, src has [5561,6237]
-2> 2020-04-28 12:44:52.482 7f55aecf2700 -1 bluestore(/var/lib/ceph/osd/ceph-5) _do_read bdev-read failed: (5) Input/output error
-1> 2020-04-28 12:44:52.482 7f55aecf2700 -1 /mnt/ceph-14.2.7/src/osd/OSD.cc: In function 'void OSD::handle_osd_map(MOSDMap*)' thread 7f55aecf2700 time 2020-04-28 12:44:52.486437
/mnt/ceph-14.2.7/src/osd/OSD.cc: 8378: FAILED ceph_assert(p != added_maps_bl.end())

ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x563b4ae12e4c]
2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x563b4ae13027]
3: (OSD::handle_osd_map(MOSDMap*)+0x1c6a) [0x563b4aead4ba]
4: (OSD::_dispatch(Message*)+0xa1) [0x563b4aebe701]
5: (OSD::ms_dispatch(Message*)+0x68) [0x563b4aebeae8]
6: (DispatchQueue::entry()+0x110b) [0x563b4b80e13b]
7: (DispatchQueue::DispatchThread::entry()+0xd) [0x563b4b665a3d]
8: (()+0x76db) [0x7f55c799f6db]
9: (clone()+0x3f) [0x7f55c673f88f]

0> 2020-04-28 12:44:52.482 7f55aecf2700 -1 *** Caught signal (Aborted) **
in thread 7f55aecf2700 thread_name:ms_dispatch

ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)
1: (()+0x12890) [0x7f55c79aa890]
2: (gsignal()+0xc7) [0x7f55c665ce97]
3: (abort()+0x141) [0x7f55c665e801]
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a3) [0x563b4ae12e9d]
5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x563b4ae13027]
6: (OSD::handle_osd_map(MOSDMap*)+0x1c6a) [0x563b4aead4ba]
7: (OSD::_dispatch(Message*)+0xa1) [0x563b4aebe701]
8: (OSD::ms_dispatch(Message*)+0x68) [0x563b4aebeae8]
9: (DispatchQueue::entry()+0x110b) [0x563b4b80e13b]
10: (DispatchQueue::DispatchThread::entry()+0xd) [0x563b4b665a3d]
11: (()+0x76db) [0x7f55c799f6db]
12: (clone()+0x3f) [0x7f55c673f88f]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
0/ 5 none
0/ 1 lockdep
0/ 1 context
1/ 1 crush
1/ 5 mds
1/ 5 mds_balancer
1/ 5 mds_locker
1/ 5 mds_log
1/ 5 mds_log_expire
1/ 5 mds_migrator
0/ 1 buffer
0/ 1 timer
0/ 1 filer
0/ 1 striper
0/ 1 objecter
0/ 5 rados
0/ 5 rbd
0/ 5 rbd_mirror
0/ 5 rbd_replay
0/ 5 journaler
0/ 5 objectcacher
0/ 5 client
1/ 5 osd
0/ 5 optracker
0/ 5 objclass
1/ 3 filestore
1/ 3 journal
0/ 0 ms
1/ 5 mon
0/10 monc
1/ 5 paxos
0/ 5 tp
1/ 5 auth
1/ 5 crypto
1/ 1 finisher
1/ 1 reserver
1/ 5 heartbeatmap
1/ 5 perfcounter
1/ 5 rgw
1/ 5 rgw_sync
1/10 civetweb
1/ 5 javaclient
1/ 5 asok
1/ 1 throttle
0/ 0 refs
1/ 5 xio
1/ 5 compressor
1/ 5 bluestore
1/ 5 bluefs
1/ 3 bdev
1/ 5 kstore
4/ 5 rocksdb
4/ 5 leveldb
4/ 5 memdb
1/ 5 kinetic
1/ 5 fuse
1/ 5 mgr
1/ 5 mgrc
1/ 5 dpdk
1/ 5 eventtrace
1/ 5 prioritycache
-2/-2 (syslog threshold)
-1/-1 (stderr threshold)
max_recent     10000
max_new         1000
log_file /var/log/ceph/ceph-osd.5.log
--- end dump of recent events ---
2020-04-28 12:44:52.726 7f5f54457c00  0 set uid:gid to 64045:64045 (ceph:ceph)
2020-04-28 12:44:52.726 7f5f54457c00  0 ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable), process ceph-osd, pid 1564574
2020-04-28 12:44:52.726 7f5f54457c00  0 pidfile_write: ignore empty --pid-file
2020-04-28 12:44:52.726 7f5f54457c00 -1 bluestore(/var/lib/ceph/osd/ceph-5/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-5/block: (5) Input/output error
2020-04-28 12:44:52.726 7f5f54457c00 -1  ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-5: (2) No such file or directory
2020-04-28 12:44:53.034 7f01473dec00  0 set uid:gid to 64045:64045 (ceph:ceph)
2020-04-28 12:44:53.034 7f01473dec00  0 ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable), process ceph-osd, pid 1564606
2020-04-28 12:44:53.034 7f01473dec00  0 pidfile_write: ignore empty --pid-file
2020-04-28 12:44:53.034 7f01473dec00 -1 bluestore(/var/lib/ceph/osd/ceph-5/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-5/block: (5) Input/output error
2020-04-28 12:44:53.034 7f01473dec00 -1  ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-5: (2) No such file or directory
2020-04-28 12:44:53.286 7fe7dc0abc00  0 set uid:gid to 64045:64045 (ceph:ceph)
2020-04-28 12:44:53.286 7fe7dc0abc00  0 ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable), process ceph-osd, pid 1564624
2020-04-28 12:44:53.286 7fe7dc0abc00  0 pidfile_write: ignore empty --pid-file
2020-04-28 12:44:53.286 7fe7dc0abc00 -1 bluestore(/var/lib/ceph/osd/ceph-5/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-5/block: (5) Input/output error
2020-04-28 12:44:53.286 7fe7dc0abc00 -1  ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-5: (2) No such file or directory

Might this be due to a broken disk? It is brand new... 🙁
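
If it helps, these are the checks I'm planning to run to see whether the disk itself is failing (again, /dev/sdX is a placeholder for the OSD's data device, and smartctl comes from the smartmontools package):

   # look for kernel-level I/O errors on the device
   dmesg | grep -i error
   # SMART health status and error counters of the disk
   smartctl -a /dev/sdX
   # try reading the BlueStore label directly, if the block symlink still exists
   ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-5/block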