Podman with Ganesha: log file fills the root disk; FULL_DEBUG enabled by default.
wid
47 Posts
August 6, 2023, 9:44 am
Hey,
After the last update, free space on the root (/) disk began to disappear very quickly.
The culprit was a log file created inside a Podman container: the container
localhost/petasan-nfs-ganesha:3.2.0
generated a file of more than 250 GB at /var/log/ganesha/ganesha.log.
Inside that container, in the file
/etc/ganesha/ganesha.conf
I see that logging is set to FULL_DEBUG:
LOG
{
components
{
ALL = FULL_DEBUG; # this will likely kill performance;
}
}
Can this be disabled from within PetaSAN, or do you have to rebuild the entire container image?
And a question for other readers: do you see the same setting on your clusters? You can check with:
podman exec -it `podman ps -aq` cat /etc/ganesha/ganesha.conf | grep ALL
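If a node runs more than one container, a per-container loop is a bit safer than backticking podman ps -aq. A rough sketch only (the NFS- name filter is an assumption based on how my containers are named; adjust to what podman ps shows on your node):
# check the log level and current log size in every NFS container
for c in $(podman ps --filter name=NFS- -q); do
  echo "== container $c =="
  podman exec "$c" grep ALL /etc/ganesha/ganesha.conf
  podman exec "$c" du -h /var/log/ganesha/ganesha.log
done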
admin
2,930 Posts
August 6, 2023, 3:05 pm
We are looking into this and will get back.
admin
2,930 Posts
August 6, 2023, 5:36 pm
Download the patch:
https://www.petasan.org/fixes/321/nfs-remove-debug-logs.patch
Apply the patch on all nodes:
patch -p1 -d / < nfs-remove-debug-logs.patch
On the admin nodes:
systemctl restart petasan-admin
On the NFS server nodes:
systemctl restart petasan-nfs-server
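For a small cluster, the same steps could be scripted from one shell, something like this sketch (node1, node2 and node3 are placeholder hostnames, and passwordless ssh from the current node is assumed; afterwards restart the service that matches each node's role):
wget https://www.petasan.org/fixes/321/nfs-remove-debug-logs.patch
for n in node1 node2 node3; do
  scp nfs-remove-debug-logs.patch "$n":/root/
  ssh "$n" 'patch -p1 -d / < /root/nfs-remove-debug-logs.patch'
done
# then: systemctl restart petasan-admin on admin nodes,
#       systemctl restart petasan-nfs-server on NFS server nodes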
wid
47 Posts
August 6, 2023, 7:57 pm
It works perfectly!
Thanks for the fast response!
wid
47 Posts
December 4, 2023, 1:49 pm
UPDATE:
After a recent online update, the problem returned.
After applying the patch again, the problem was fixed.
admin
2,930 Posts
December 4, 2023, 4:06 pm
The patch is already included in 3.2.1, so it is strange you saw the issue after the upgrade. Is it possible that you re-ran the patch after the upgrade by mistake and it reversed the fix?
wid
47 Posts
December 8, 2023, 3:53 pm
I'm sure nothing has changed.
This is a test cluster for learning.
The Grafana stats were not working; I read that the online update fixes that problem.
I ran one command:
/opt/petasan/scripts/online-updates/update.sh
on each of the 3 nodes.
The stats fixed themselves.
After an hour, monitoring started alerting about huge data growth. On verification, the logs were growing again.
Patching and restarting worked again.
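In case it helps anyone who gets the same alert, this is roughly how I locate the growth (a sketch only; the container name is the one from my node):
# largest items under /var on the host
du -ah /var 2>/dev/null | sort -rh | head -n 20
# size of the Ganesha log inside the running NFS container
podman exec -it NFS-172-30-0-141 ls -lh /var/log/ganesha/ganesha.log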
wid
47 Posts
December 26, 2023, 12:53 pm
Once again I have a disk starting to run out of space; this time it is a different node than last time.
I checked the file:
/usr/lib/python3/dist-packages/PetaSAN/core/nfs/config_builder.py
It does not contain the phrase "DEBUG".
Debug is enabled in the container:
podman exec -it NFS-172-30-0-141 grep -i debug /etc/ganesha/ganesha.conf
ALL = FULL_DEBUG; # this will likely kill performance;
Adding a test entry to NFS Exports changes the ganesha.conf file, but does not remove FULL_DEBUG.
--
I thought that maybe the problem was baked into the container image when it was built, so I ran a fresh container from the image to check for any leftover DEBUG setting. There isn't one; the image's ganesha.conf contains only a simple test export:
EXPORT
{
# Export Id (mandatory, each EXPORT must have a unique Export_Id)
Export_Id = 77;
# Exported path (mandatory)
Path = /nonexistant;
--
I handled this by removing the DEBUG line from ganesha.conf inside the container and restarting it. It works.
--
I went through each node to check whether DEBUG logging is visible in /var/lib/containers/storage/*/merged/*/ganesha.conf -- all clean.
It looks like something is adding that line on the fly.
If you have an idea where else the problem might be, I'd be happy to check.
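For reference, these are roughly the checks described above, collected into one sketch (the NFS- name filter is an assumption that matches my container names):
# 1) the PetaSAN config builder itself should not reference DEBUG
grep -in debug /usr/lib/python3/dist-packages/PetaSAN/core/nfs/config_builder.py
# 2) the live config inside every running NFS container
for c in $(podman ps --filter name=NFS- --format '{{.Names}}'); do
  echo "== $c =="
  podman exec "$c" grep -i debug /etc/ganesha/ganesha.conf
done
# 3) any ganesha.conf sitting in the container storage layers on the host
find /var/lib/containers/storage -path '*/merged/*ganesha.conf' \
  -exec grep -il FULL_DEBUG {} + 2>/dev/null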
admin
2,930 Posts
December 26, 2023, 2:11 pm
Only config_builder.py writes this config file; it is read by the service when it starts. Maybe something went wrong during the upgrade that prevented the service from restarting. If you add a test export and all nodes/containers show the correct config file, then everything is working well.