504 Gateway Timeout - S3User Config
podilarius
44 Posts
May 4, 2022, 4:57 am
I am getting Error 504 - Gateway Timeout when trying to view/edit S3 users.
It seems that every time this page loads, it fetches live data via radosgw-admin commands.
Since I have over 110M objects in S3, it takes about 2.5 minutes to load.
I can see the commands that get user stats traversing all users.
I am guessing it is timing out waiting on those commands.
I have fixed this by changing the proxy timeouts in /etc/nginx/sites-enabled/petasan_admin for the server on port 443.
It is now:
server {
    listen 443 ssl;
    server_name 10.1.7.182;
    ssl_certificate /opt/petasan/config/certificates/server.crt;
    ssl_certificate_key /opt/petasan/config/certificates/server.key;

    location /grafana/ {
        proxy_pass http://stats/;
        proxy_connect_timeout 5s;
        proxy_send_timeout 5s;
        proxy_read_timeout 5s;
    }

    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_redirect http://$http_host/ https://$http_host/;
        proxy_pass http://127.0.0.1:5002;
        proxy_connect_timeout 300s;
        proxy_send_timeout 300s;
        proxy_read_timeout 300s;
    }
}
Perhaps a better way is to load user stats via cron into an s3stats file, perhaps with a configurable load interval.
Or don't show the stats on the front page, and only load them in the user config or info context menu.
The default 60-second timeouts would work for most of our S3 users; 120 seconds would be better, but only if we are loading single-user stats.
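The cron idea could be sketched roughly like this. To be clear, this is my own illustration, not anything PetaSAN ships: the function name and output path are invented, and the script would need to run on a node with radosgw-admin access.

```shell
# Rough sketch of the cron-cache idea: dump stats for every RGW user into
# one file, atomically, so the UI can read the file instead of shelling
# out to radosgw-admin on every page load. Names/paths are illustrative.
cache_s3_stats() {
    local out=$1 tmp
    tmp=$(mktemp)
    # "radosgw-admin user list" prints a JSON array of user IDs
    for uid in $(radosgw-admin user list |
                 python3 -c 'import json,sys; print("\n".join(json.load(sys.stdin)))'); do
        radosgw-admin user stats --uid="$uid" >> "$tmp"
    done
    mv "$tmp" "$out"   # atomic rename: readers never see a half-written file
}
```

Wrapped in a script, a crontab line such as `*/15 * * * * /usr/local/bin/s3stats-cache.sh` (interval picked arbitrarily) would keep the file fresh.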
It's late here and I hope this makes sense.
Last edited on May 4, 2022, 4:59 am by podilarius · #1
admin
2,930 Posts
May 5, 2022, 2:26 pm
Do you have a multisite setup?
How many users do you have?
podilarius
44 Posts
May 5, 2022, 4:31 pm
This is not a multisite setup.
We have 15 users now and are adding 2 more in the next week or two.
When I was testing, I had up to 20 or so, but with a very small number of objects.
By the time we are done, we are going to have about 160 to 200 million objects.
This will grow more over time, and I might have to adjust the timeout for this page to load all the stats (size and object count) if this remains the same.
admin
2,930 Posts
May 12, 2022, 6:08 pm
Can you try the following:
Open the file:
/usr/lib/python3/dist-packages/PetaSAN/core/ceph/api.py
Find the method named get_rgw_user_stats.
In the body of the method, find the following line:
cmd = "radosgw-admin user stats --uid={} --sync-stats".format(id)
Remove the option "--sync-stats" from this line.
Save and close the file.
On the first 3 nodes, restart the management service:
systemctl restart petasan-admin
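For anyone applying this edit on several nodes, the removal can be done with sed. The sketch below demonstrates it on a scratch copy; on a real node you would back up api.py first and point the same sed at the path quoted above (verify the line matches exactly once before editing in place):

```shell
# Demonstrate the --sync-stats removal on a scratch copy of the line.
# On a node: back up api.py, then run the same sed against the real path.
API=/tmp/api_demo.py
printf 'cmd = "radosgw-admin user stats --uid={} --sync-stats".format(id)\n' > "$API"
sed -i 's/ --sync-stats//' "$API"
cat "$API"   # the --sync-stats option is gone
```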
podilarius
44 Posts
May 13, 2022, 4:39 am
I have made the change.
The page now loads in ~38 seconds.
That is about 2 minutes faster and should not time out.
What is the data refresh rate in this scenario?
admin
2,930 Posts
May 13, 2022, 10:54 am
Very good. The refresh rate is within 3 minutes; it is defined by rgw_user_quota_bucket_sync_interval.
Can you manually run the command for a specific user id:
radosgw-admin user stats --uid=XX --sync-stats
and see roughly how long it takes?
Are you using SSD, HDD, or a mix?
podilarius
44 Posts
May 13, 2022, 1:58 pm
We have a mix of HDD and SSD. Only the S3 data is on the HDDs, in an EC32 pool.
The metadata and index are on an SSD pool.
I ran the 3 largest buckets (40 million, 35 million, and 22 million objects) with and without --sync-stats from the command line.
The result is that they run in about the same time, roughly 5 seconds each.
I now have 16 buckets, so it should run in 80 seconds or something close to it.
That is still longer than the default 60 seconds.
I also cannot get near 38 seconds again; that might have been an anomaly.
I have run it on all 3 nodes multiple times, and each takes about 130 seconds to load.
That is still a lot better than the 210 seconds with --sync-stats in there.
Last edited on May 14, 2022, 12:24 pm by podilarius · #7