
Performance issues, possible misconfiguration

Hello!

I have the following configuration of three servers:

  1. physical server - PetaSAN v2.2 - 6x 12 TB drives
  2. physical server - PetaSAN v2.2 - 6x 12 TB drives
  3. virtual server - PetaSAN v2.2 node as monitor and management, no OSD drives

I have poor performance, and while the holiday backup is in progress the cluster gets overloaded, so the VMs located on the PetaSAN iSCSI LUNs crash.

Performance is as follows:

Sequential Read : 58.620 MB/s
Sequential Write : 22.492 MB/s
Random Read 512KB : 9.720 MB/s
Random Write 512KB : 15.143 MB/s
Random Read 4KB (QD=1) : 0.507 MB/s [ 123.9 IOPS]
Random Write 4KB (QD=1) : 0.756 MB/s [ 184.5 IOPS]
Random Read 4KB (QD=32) : 5.786 MB/s [ 1412.5 IOPS]
Random Write 4KB (QD=32) : 6.569 MB/s [ 1603.8 IOPS]
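
If it helps to compare, a roughly equivalent 4K QD=32 test can be reproduced with fio; this is only a sketch, /dev/sdX is a placeholder for the iSCSI LUN as seen by the test machine, and the write run is destructive:

fio --name=randread-qd32 --filename=/dev/sdX --ioengine=libaio --direct=1 --rw=randread --bs=4k --iodepth=32 --runtime=60 --time_based --group_reporting
fio --name=randwrite-qd32 --filename=/dev/sdX --ioengine=libaio --direct=1 --rw=randwrite --bs=4k --iodepth=32 --runtime=60 --time_based --group_reporting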

I thought I needed to set up more OSDs, so I installed 12 more disks, and my current configuration is:

  1. physical server - PetaSAN v2.2 - 12x 12 TB drives
  2. physical server - PetaSAN v2.2 - 12x 12 TB drives
  3. virtual server - PetaSAN v2.2 node as monitor and management

Altogether I have 24 OSDs; however, I am still stuck with the same issues: slow performance and errors while a backup is in progress...

I checked the PetaSAN management statistics, and it seems weird to me that HDD usage does not go above 30% (see attached images), while commit latency is quite high.
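
For reference, the same figures can also be pulled from the command line on a management node; nothing PetaSAN-specific, just plain Ceph commands:

ceph osd perf     # per-OSD commit and apply latency in ms
ceph -s           # overall cluster health and current client IO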

Any help would be appreciated!

/////////////////////

The Ceph config is as follows (the config is the default - maybe it is misconfigured?):

[global]
fsid = e443b5a2-152d-4f10-b8d0-cc5666852905
mon_host = 10.129.2.195,10.129.2.193,10.129.2.194

public_network = 10.129.2.192/27
cluster_network = 10.129.2.224/27

osd_pool_default_pg_num = 256
osd_pool_default_pgp_num = 256
osd_pool_default_size = 2
osd_pool_default_min_size = 1

auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
max_open_files = 131072
rbd_default_features = 3

mon_pg_warn_min_per_osd = 10
mon_pg_warn_max_per_osd = 300
mon_max_pg_per_osd = 300
osd_max_pg_per_osd_hard_ratio = 2.5
mon_osd_min_in_ratio = 0.3
mon_allow_pool_delete = true
bluestore_block_db_size = 64424509440

[mon]
setuser_match_path = /var/lib/ceph/$type/$cluster-$id
mon_clock_drift_allowed = .300
mon_compact_on_start = true

[osd]
osd_crush_update_on_start = true
osd_heartbeat_grace = 20
osd_heartbeat_interval = 5

osd_max_backfills = 1
osd_recovery_max_active = 1
osd_recovery_priority = 1
osd_recovery_op_priority = 1
osd_recovery_threads = 1
osd_client_op_priority = 63
osd_recovery_max_start = 1

osd_max_scrubs = 1
osd_scrub_during_recovery = false
osd_scrub_priority = 1
osd_scrub_sleep = 1
osd_scrub_chunk_min = 1
osd_scrub_chunk_max = 5
osd_scrub_load_threshold = 0.3
osd_scrub_begin_hour = 20
osd_scrub_end_hour = 6
# Generic Entry Level Hardware, use defaults
osd_op_num_shards_hdd = 5
osd_op_num_shards_ssd = 8
osd_op_num_threads_per_shard_hdd = 1
osd_op_num_threads_per_shard_ssd = 2

bluestore_prefer_deferred_size_hdd = 32768
bluestore_prefer_deferred_size_ssd = 0
bluestore_cache_size_ssd = 3221225472
bluestore_cache_size_hdd = 1073741824
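
For completeness, the values a running OSD actually uses, and how the PGs and data are spread over the OSDs, can be checked like this (osd.0 is just an example ID; the daemon command has to be run on the node hosting that OSD):

ceph osd df tree                                      # per-OSD utilization and PG count
ceph daemon osd.0 config show | grep bluestore_cache  # effective cache settings on a running OSD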

Hi,

Do you have:

  • 10 Gbit network or greater?
  • Journal SSD devices?

I have never installed a PetaSAN node in a VM, so maybe that could also be a bottleneck.

 

I have all 1 Gbit Ethernet; however, when I check network usage, it does not seem to be the bottleneck either.

Each physical server has:

  • 1x SSD for the system;
  • 1x SSD for journal;
  • 12x HDD for OSD drives;

In the beginning I planned to use that one SSD as a journal, however I found that PetaSAN allows one journal drive to be added to only one OSD drive. So, as I don't have 12 SSDs for journals, I don't use any.

Then the cause is likely the 1 Gbit network. The latency is too high. Ceph (and PetaSAN) needs at least 10 Gbit for production.
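
You can verify this quickly between two storage nodes; a rough sketch, assuming iperf3 is installed and using one of the monitor IPs from your config as an example target:

ping -c 10 10.129.2.193        # round-trip latency on the public network
iperf3 -s                      # start a server on the target node
iperf3 -c 10.129.2.193 -t 30   # run from the other node to measure throughput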

Hmm, weird... I installed 10 Gbit Ethernet between the Ceph storage nodes and speed increased roughly 2x, but the cluster is still unstable... During backup processes or other IO-intensive operations it can crash. I am wondering whether I have to update these values:

# Generic Entry Level Hardware, use defaults
osd_op_num_shards_hdd = 5
osd_op_num_shards_ssd = 8
osd_op_num_threads_per_shard_hdd = 1
osd_op_num_threads_per_shard_ssd = 2

As I have 24 OSDs in total (all HDD) and no SSDs, should I update the config like this?

osd_op_num_shards_hdd = 24
osd_op_num_shards_ssd = 0
osd_op_num_threads_per_shard_hdd = 1
osd_op_num_threads_per_shard_ssd = 2

Could this be an issue?
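
To see what each OSD currently runs with before changing anything, I can query one OSD daemon directly (osd.0 is just an example ID, the command has to run on the node hosting that OSD, and as far as I understand these options only take effect after an OSD restart):

ceph daemon osd.0 config get osd_op_num_shards_hdd
ceph daemon osd.0 config get osd_op_num_threads_per_shard_hdd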

O.

Pure HDD will not give good performance. Even if your HDD %busy is only between 30-50%, the latency is still high.

If you do not use SSDs for storage, use them for journals and cache devices.

 I found that PetaSAN allows one journal drive to be added to only one OSD drive. So, as I don't have 12 SSDs for journals, I don't use any.

This is not correct. We recommend 1 SSD per 4-5 HDDs for journals.
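
As a rough sizing sketch, using the bluestore_block_db_size = 64424509440 already set in your config:

64424509440 bytes = 60 GiB of DB/journal space per OSD
60 GiB x 5 OSDs = 300 GiB needed per journal SSD, plus some headroom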

Thank you for the reply!

Yes, but right now I am not so worried about performance; it can be slow, as I am creating archive storage. I am more worried about the cluster crashing when there are multiple requests at a time (e.g. backup agent jobs, or some other massive data transfer). I am looking for a way to stop the cluster from crashing.

 

P.S. How can I add one journal to multiple OSDs? In the PetaSAN GUI I can only find a way to add a journal to one OSD.

Hi Oskars,

every OSD has only one journal. This can be on the HDD itself or it can be provided by an SSD. So one OSD has one journal, but one SSD can serve multiple journals (one for each of the OSDs it backs).

You have to mark your SSD as a journal device before creating OSDs.
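
PetaSAN handles this through its GUI, but at the Ceph level the idea is roughly the following; only a sketch with placeholder device names (/dev/sdb as one HDD, /dev/sdc1 as a pre-created partition on the journal SSD), not the PetaSAN workflow itself:

ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/sdc1

Repeating this for each HDD, each time with a different partition on the SSD, gives several OSDs sharing one SSD for their DB/journal.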