Discussion:
stuck on ghak100 testsuite script
Richard Guy Briggs
2018-11-09 21:48:31 UTC
Hi Paul, Ondrej,

I've got a couple of patches with two different approaches to address
ghak100:
https://github.com/linux-audit/audit-kernel/issues/100

The patches work, but I've not posted them yet because I wanted to
update the audit-testsuite first to consistently test it.

I've written an automated regression test for audit-testsuite, based on
the reproducer recipe provided in ghak100. The manual procedure in the
description of ghak100 works, but I'm having some trouble with the
script. In particular, the script hangs on the
"kill 'SIGSTOP' $pid_fuse" line. Once it hangs, the main script, the
test subscript and both backgrounded processes (fuse and umount) are
still hanging around.

Here's the script:
https://github.com/linux-audit/audit-testsuite/compare/master...rgbriggs:ghak100-missing-mount-hang

Does either of you have any insight into why this might be happening and
how to fix or work around it?

A couple of minor notes:
- The $pid_fuse += 1 is necessary since it forks from the PID reported
to the shell.
- The SIGSTOP is necessary to simulate the hung filesystem.


- RGB

--
Richard Guy Briggs <***@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635
Ondrej Mosnacek
2018-11-11 16:24:04 UTC
Hi Richard,
Post by Richard Guy Briggs
- The $pid_fuse += 1 is necessary since it forks from the PID reported
to the shell.
I don't understand... why do you expect the forked PID to be exactly
one higher? This doesn't seem to be the case in general:

$ (echo $$; exec bash -c 'echo $$' &)
10995
13693
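
For comparison, the shell records the PID of the most recent background job in $!, so no arithmetic on PIDs is needed as long as the backgrounded command does not fork again. A minimal sketch, with "sleep 30" as a stand-in for the real client (an assumption made purely for illustration):

```shell
#!/bin/sh
# "sleep 30" is a stand-in for the real client (assumption for illustration).
sleep 30 &
bg_pid=$!    # PID of the job just started, as recorded by the shell

# Signal 0 delivers nothing; it only checks that the PID can be signalled.
if kill -0 "$bg_pid" 2>/dev/null; then
    job_state=running
else
    job_state=missing
fi
echo "background job $bg_pid is $job_state"

# Clean up the stand-in job.
kill "$bg_pid"
wait "$bg_pid" 2>/dev/null || true
```

This only covers the fork done by the shell for '&'; a program that forks again on its own needs a different approach.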
--
Linux-audit mailing list
https://www.redhat.com/mailman/listinfo/linux-audit
--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.
Richard Guy Briggs
2018-11-11 22:36:16 UTC
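I was not happy with this hack, but this was the most expedient way to
try to get a first attempt working... I suppose a better way might be
to spawn the client which forks, then use something like pgrep to find
all the instances and eliminate the PID that was returned by the launch.

As far as I can tell, I was hitting the right task since hitting the
wrong or non-existent task didn't hang the test.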
Ondrej Mosnacek
2018-11-12 11:32:05 UTC
Post by Richard Guy Briggs
I was not happy with this hack, but this was the most expedient way to
try to get a first attempt working... I suppose a better way might be
to spawn the client which forks, then use something like pgrep to find
all the instances and eliminate the PID that was returned by the launch.
How about something like:

system("cd $basedir/$clientdir; mkfifo /tmp/fifo; sh -c 'echo \$\$ > /tmp/fifo; exec ./$client -f -s $tmpdir' & cat /tmp/fifo");

That should always give you the right PID. You just need to tweak it
to create the FIFO as a temporary file and clean it up afterwards. It
is more complicated, but should be reliable.
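
The temporary-FIFO variant could look roughly like the following. This is only a sketch under assumptions: "sleep 30" is a placeholder for the real client command line, and the names tmpd, fifo, shell_pid and client_pid are made up for illustration.

```shell
#!/bin/sh
# Private scratch directory so the FIFO name cannot collide with other runs.
tmpd=$(mktemp -d)
fifo="$tmpd/pid.fifo"
mkfifo "$fifo"

# The inner shell writes its own PID to the FIFO, then exec replaces it with
# the long-running command, which therefore keeps that same PID.
# "sleep 30" is a placeholder for the real client command line.
sh -c 'echo $$ > "$1"; exec sleep 30' sh "$fifo" &
shell_pid=$!    # kept only to cross-check the value read from the FIFO

# Opening the FIFO blocks until the writer side has opened it as well.
read -r client_pid < "$fifo"
rm -rf "$tmpd"

echo "client pid: $client_pid"

# Clean up the placeholder process.
kill "$client_pid"
wait 2>/dev/null || true
```

With a placeholder that does not fork again, the PID read from the FIFO matches $!; the FIFO hand-off matters when the launching command line is wrapped in further shell layers.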
Post by Richard Guy Briggs
As far as I can tell, I was hitting the right task since hitting the
wrong or non-existent task didn't hang the test.
Yes, when I fork from a fresh shell, I also get the forked PID one
greater practically every time, so that will be a different problem...
I didn't look at the hang problem yet, I will try it later in the
afternoon.
Richard Guy Briggs
2018-11-12 13:32:45 UTC
Post by Ondrej Mosnacek
That should always give you the right PID. You just need to tweak it
to create the FIFO as a temporary file and clean it up afterwards. It
is more complicated, but should be reliable.
I don't understand why all this extra work is needed. It will still
list the first PID of the newly forked task. There are several examples
of capturing the PID of a backgrounded process already in the
audit-testsuite, all of which work fine:
tests/filter_sessionid/test (touch)
tests/login_tty/test (echo)
tests/lost_reset/test (auditctl --reset-lost)
tests/user_msg/test (auditctl -m ...)

The problem here is that the client that is executed then forks, which
isn't the case with any of the examples listed above.
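
The difference can be reproduced with a small stand-in. The sketch below is an illustration only: the "launcher" is a hypothetical shell one-liner that forks a worker ("sleep 30") and exits, which is an assumption about how a self-forking client behaves, not a description of the actual fuse client.

```shell
#!/bin/sh
pidfile=$(mktemp)

# Hypothetical launcher: forks a worker, records the worker's PID, then exits.
sh -c 'sleep 30 >/dev/null 2>&1 & echo $! > "$1"' sh "$pidfile" &
launcher_pid=$!    # what the calling shell sees

# The launcher exits right after forking, so this returns promptly.
wait "$launcher_pid" || true
read -r worker_pid < "$pidfile"
rm -f "$pidfile"

echo "launcher: $launcher_pid  worker: $worker_pid"

# Clean up the orphaned worker.
kill "$worker_pid"
```

The PID captured with $! belongs to the launcher, which is already gone by the time a signal would be sent; the process that keeps running has a different PID.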
Ondrej Mosnacek
2018-11-12 14:02:47 UTC
Post by Richard Guy Briggs
The problem here is that the client that is executed then forks, which
isn't the case with any of the examples listed above.
Do you mean that the client itself forks, too? I thought you were only
trying to deal with the fork done by the shell via the '&' operator.
My solution of course handles only that (by forking a shell, reporting
its PID into a FIFO and then exec'ing the fuse client). But shouldn't
the '-f' and '-s' options of fusexmp prevent it from forking on its
own?
Richard Guy Briggs
2018-11-12 14:46:32 UTC
Yes, the client itself forks as well. I hadn't even looked at the
client's options... I'll look at those.
Ondrej Mosnacek
2018-11-12 14:37:50 UTC
Post by Ondrej Mosnacek
I didn't look at the hang problem yet, I will try it later in the
afternoon.
I think I figured it out. When you send SIGTERM to the fuse process in
the cleanup section, it is still stopped, so it can't handle it. You
need to send it SIGCONT first (or kill it with SIGKILL):
[...]
###
# cleanup
kill 'SIGCONT', $pid_fuse;
kill 'SIGTERM', $pid_umnt;
kill 'SIGTERM', $pid_fuse;
system("auditctl -D >& /dev/null");

With the above tweak it no longer hangs for me.
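
The same behaviour can be seen outside the test suite. In this sketch "sleep 30" stands in for the fuse process (an assumption for illustration): a SIGTERM sent to a stopped process stays pending until SIGCONT arrives.

```shell
#!/bin/sh
# "sleep 30" stands in for the stopped fuse process (illustrative assumption).
sleep 30 &
pid=$!

kill -STOP "$pid"    # simulate the hung process
kill -TERM "$pid"    # stays pending: a stopped process cannot act on it
sleep 1

# The process still exists because SIGTERM has not been delivered yet.
if kill -0 "$pid" 2>/dev/null; then
    still_alive=yes
else
    still_alive=no
fi

kill -CONT "$pid"    # resume; the pending SIGTERM is now delivered

# Collect the exit status without tripping "set -e" style wrappers.
exit_status=0
wait "$pid" || exit_status=$?
echo "alive while stopped: $still_alive, exit status: $exit_status"
```

Without the SIGCONT, the final wait would block indefinitely, which matches the hang seen in the cleanup section.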
Richard Guy Briggs
2018-11-13 20:45:29 UTC
Post by Ondrej Mosnacek
With the above tweak it no longer hangs for me.
It no longer hangs, but waitpid doesn't work as expected either...

I was making this overly complicated. A simple check for the existence
of /proc/$pid_umnt was sufficient to see whether the process was still
hanging around.
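
As a rough illustration of that kind of check (Linux-specific, assuming /proc is mounted; "sleep 30" is a stand-in for the umount process and the variable names are made up):

```shell
#!/bin/sh
# "sleep 30" is a stand-in for the backgrounded umount (assumption).
sleep 30 &
pid=$!

# A /proc/<pid> directory exists for as long as the process does.
if [ -d "/proc/$pid" ]; then before=present; else before=gone; fi

kill "$pid"
wait "$pid" 2>/dev/null || true    # reap it so no zombie entry remains

if [ -d "/proc/$pid" ]; then after=present; else after=gone; fi
echo "before: $before, after: $after"
```

Note that a terminated but unreaped child (a zombie) still has a /proc entry, so the check is only meaningful once something has reaped the process.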

Thanks for looking at this. I'll update my test and post some patches...