Opened 3 weeks ago

Closed 3 weeks ago

#1542 closed defect (fixed)

update-smart-drivedb doesn't kill its gpg-agent and throws errors on systems with root on NFS

Reported by: rwofyiyj.awdawdq
Owned by: Christian Franke
Priority: minor
Milestone: Release 7.3
Component: all
Version: 7.2
Keywords:
Cc:

Description (last modified by Christian Franke)

I use the smartd daemon on machines with the root filesystem on NFS.

update-smart-drivedb doesn't kill its gpg-agent process and then tries to remove the agent's home directory (line 494).
On NFS this doesn't work: the directory cannot be removed because NFS keeps deleted files and sockets around while they are still open. (The script would have to run rm -rf on the directory twice.)

This error writes a message to stderr, which triggers an additional (false positive) mail when cron is configured to send mail.

This small patch kills the gpg-agent and allows error-free deletion of $gnupgtmp.

diff -c backup /usr/sbin/update-smart-drivedb
*** backup	2021-11-04 14:29:14.665707108 +0100
--- /usr/sbin/update-smart-drivedb	2021-11-04 14:31:08.927219722 +0100
***************
*** 491,496 ****
--- 491,497 ----
      echo "$out" >&2
    fi
  
+   kill $(ps ax | awk '$5=="gpg-agent" && /smartmontools/{print $2}')
    rm -f -r "$gnupgtmp"
    return $r

Reproducible: Always

Steps to Reproduce:

  1. Run /etc/cron.monthly/smartmontools-update-drivedb on a machine with root on NFS.

Actual Results:

LC_ALL=C /etc/cron.monthly/smartmontools-update-drivedb
rm: cannot remove '/var/db/smartmontools/.gnupg.139163.tmp': Directory not empty

The script executes normally but has a problem deleting its gpg directory.

Expected Results:
There should be no stderr output when the script executes normally.

The script doesn't delete its temporary directory.

Change History (7)

comment:1 Changed 3 weeks ago by Christian Franke

Description: modified (diff)

comment:2 in reply to:  description Changed 3 weeks ago by Christian Franke

Milestone: Release 7.3

Thanks for reporting this. In the early days of the update-smart-drivedb script, gpg did not start gpg-agent if secret keys were not used.

diff -c backup /usr/sbin/update-smart-drivedb
...
+   kill $(ps ax | awk '$5=="gpg-agent" && /smartmontools/{print $2}')
    rm -f -r "$gnupgtmp"
    return $r

Sorry, no. If this command finds nothing (because an older gpg is used), kill may itself write an error message to stderr. The command may also find other unrelated agents.
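The first objection (kill complaining when nothing is found) can be illustrated with a small guard. This is only a sketch: `kill_if_any` is a hypothetical helper, not part of update-smart-drivedb, and it does nothing about the second objection, since a ps-based search can still match unrelated agents.

```shell
# Hypothetical helper: send SIGTERM only when at least one PID was passed.
kill_if_any() {
  # With no arguments, a bare "kill" would print a usage error to stderr,
  # so return early and stay silent instead.
  [ $# -gt 0 ] || return 0
  # Ignore failures such as a process that has already exited.
  kill "$@" 2>/dev/null || true
}
```

Called as `kill_if_any $pids` with an unquoted, possibly empty `$pids`, the helper stays silent when the search finds nothing.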

Fortunately gpgconf is provided, so please test this line instead of the kill ...:

gpgconf --homedir="$gnupgtmp" --kill gpg-agent
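A minimal sketch of how that line could be wrapped so that it stays quiet on systems where gpgconf is missing. The function name `stop_gpg_agent` and the decision to discard all gpgconf output are assumptions for illustration, not the script's actual code.

```shell
# Hypothetical helper: stop the gpg-agent that belongs to one home directory.
stop_gpg_agent() {
  homedir=$1
  # An older GnuPG may not ship gpgconf; treat that as "nothing to stop".
  command -v gpgconf >/dev/null 2>&1 || return 0
  # Only the agent using this home directory is addressed, so unrelated
  # agents are left alone. Output is discarded to keep cron mail quiet.
  gpgconf --homedir="$homedir" --kill gpg-agent >/dev/null 2>&1 || true
}
```

Unlike the ps-based approach, this does not depend on the layout of ps output, and it cannot hit an agent started by another user or tool.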

comment:3 Changed 3 weeks ago by rwofyiyj.awdawdq

Thank you, I'm not very familiar with gpg-agent.
I've tried to use

gpgconf --homedir="$gnupgtmp" --kill gpg-agent

but it's not fast enough and still causes issues with rm:

+ gpgconf --homedir=/var/db/smartmontools/.gnupg.79944.tmp --kill gpg-agent
+ rm -f -r /var/db/smartmontools/.gnupg.79944.tmp
rm: cannot remove '/var/db/smartmontools/.gnupg.79944.tmp/.nfs00000000003b5b9a0000368f': Device or resource busy
rm: cannot remove '/var/db/smartmontools/.gnupg.79944.tmp/.nfs00000000003b5b9b00003690': Device or resource busy
rm: cannot remove '/var/db/smartmontools/.gnupg.79944.tmp/.nfs00000000003b5b9c00003691': Device or resource busy
rm: cannot remove '/var/db/smartmontools/.gnupg.79944.tmp/.nfs00000000003b5b9d00003692': Device or resource busy
+ return 0

This could be fixed by adding a small delay after the gpgconf command:

diff -u backup /usr/sbin/update-smart-drivedb
--- backup	2021-11-04 14:29:14.665707108 +0100
+++ /usr/sbin/update-smart-drivedb	2021-11-05 15:17:09.772832571 +0100
@@ -491,6 +491,8 @@
     echo "$out" >&2
   fi
 
+  gpgconf --homedir="$gnupgtmp" --kill gpg-agent
+  sleep 0.2
   rm -f -r "$gnupgtmp"
   return $rc
 }

This version works well on my system and doesn't produce any errors.
In my test, sleep 0.01 was enough to avoid the error, but I'm using a more conservative 0.2 seconds.
If you want, you could check whether gpg-agent is still running, but I'm not sure what the proper way is to get its PID without parsing ps output.

(I'm using gnupg-2.2.32)
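On the question of finding the agent's PID without parsing ps output, one possible approach is to ask the agent itself. The sketch below assumes gpg-connect-agent from GnuPG 2.1 or later is installed and that its `--no-autostart` option and the agent's `GETINFO pid` command behave as documented; `agent_pid` is a hypothetical name, and none of this was tested in this ticket.

```shell
# Hypothetical helper: print the PID of the gpg-agent serving a home
# directory, or print nothing if no agent is running there.
agent_pid() {
  homedir=$1
  # Without gpg-connect-agent there is no way to ask; report nothing.
  command -v gpg-connect-agent >/dev/null 2>&1 || return 0
  # --no-autostart prevents the query itself from launching a new agent.
  # A running agent is assumed to answer with a data line "D <pid>".
  gpg-connect-agent --homedir "$homedir" --no-autostart 'GETINFO pid' /bye 2>/dev/null |
    sed -n 's/^D \([0-9][0-9]*\)$/\1/p'
}
```

An empty result from `agent_pid "$gnupgtmp"` would then mean that no agent is attached to the temporary directory any more.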

comment:4 in reply to:  3 Changed 3 weeks ago by Christian Franke

+ gpgconf --homedir=/var/db/smartmontools/.gnupg.79944.tmp --kill gpg-agent
+ rm -f -r /var/db/smartmontools/.gnupg.79944.tmp
rm: cannot remove '/var/db/smartmontools/.gnupg.79944.tmp/.nfs00000000003b5b9d00003692': Device or resource busy

Possibly an NFS timing issue, as gpgconf apparently waits until gpg-agent exits.

+  sleep 0.2

This is not portable. I would suggest something like:

if ! rm -f -r "$gnupgtmp" >/dev/null 2>&1; then
  sleep 1
  rm -f -r "$gnupgtmp"
fi
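The same retry idea, written as a self-contained function for clarity. This is a sketch only: `remove_tmpdir` is a hypothetical name and not the code that was committed. It waits a whole second because POSIX only specifies integer arguments for sleep, so a value such as 0.2 may be rejected by some implementations.

```shell
# Hypothetical helper: remove a directory tree, retrying once after a pause.
remove_tmpdir() {
  dir=$1
  # First attempt is silent, because a failure here may be transient
  # (for example, NFS placeholder files that have not been released yet).
  if ! rm -f -r "$dir" >/dev/null 2>&1; then
    # Whole seconds only: fractional sleep arguments are not portable.
    sleep 1
    # Second attempt is not silenced, so a persistent failure is still
    # reported on stderr.
    rm -f -r "$dir"
  fi
}
```

In the common case the first rm succeeds and the function adds no delay; the one-second pause is paid only when the first attempt fails.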

comment:5 Changed 3 weeks ago by rwofyiyj.awdawdq

I've tested your solution.

   fi
 
+  gpgconf --homedir="$gnupgtmp" --kill gpg-agent
+  if ! rm -f -r "$gnupgtmp" >/dev/null 2>&1; then
+    sleep 1
+    rm -f -r "$gnupgtmp"
+  fi
   rm -f -r "$gnupgtmp"

It works as expected.
There are no errors and no orphaned temporary directories.

I think you can close this issue and commit the fix.

Thank you for your help.

comment:6 Changed 3 weeks ago by Christian Franke

Owner: set to Christian Franke
Status: new → accepted

Thanks for quick testing.

comment:7 Changed 3 weeks ago by Christian Franke

Resolution: fixed
Status: accepted → closed