configuration for busy docker host

Discussion:

Frederik Bosch

2018-08-20 09:56:04 UTC

Hello Audit team,

As I have not found a location anywhere else on the web, I am sending my
question to this list. I have an Ubuntu 18.04 machine with auditd and it
acts as a Docker Host machine. I have hardened the system via this
package: https://github.com/konstruktoid/hardening which installs auditd
with the configuration to be found here:
https://github.com/konstruktoid/hardening/blob/master/misc/audit.rules.

The problems I have are related to the directives -f and -b. The
hardening package uses -b 8192 and -f 2. That results in a kernel panic
very quickly because the audit backlog limit is exceeded, and that
causes a reboot of the system. Now I wonder what a good configuration
would be. I started reading on the subject and read that -f 2 is
probably the best for security reasons. However, I do not want to have a
system that panics very quickly and reboots.

Should I simply increase the backlog to much higher numbers? Or should I
change -f to not cause a kernel panic? Or am I missing something and
should I change some other configuration? Thanks for your help.

Kind regards,
Frederik Bosch

Frederik Bosch

2018-08-20 14:10:44 UTC

In my initial message I did not include the output of auditctl -s. In
the meantime I have disabled the failure action (-f 0) and increased the
backlog limit heavily. As you can see, I still have a lost count of 52.

While browsing the archives of the list I found MSG00127,
https://www.redhat.com/archives/linux-audit/2017-September/msg00127.html.
Maybe there are similarities with that problem. That user also reported
a high number of lost messages.

enabled 2
failure 0
pid 760
rate_limit 0
backlog_limit 524288
lost 52
backlog 0
backlog_wait_time 0
loginuid_immutable 0 unlocked

Hopefully someone is able to help.

--
Linux-audit mailing list
https://www.redhat.com/mailman/listinfo/linux-audit

Steve Grubb

2018-08-20 17:48:51 UTC

Post by Frederik Bosch
As I have not found a location anywhere else on the web, I am sending my
question to this list. I have an Ubuntu 18.04 machine with auditd and it
acts as a Docker Host machine. I have hardened the system via this
package: https://github.com/konstruktoid/hardening which installs auditd
with the configuration to be found here:
https://github.com/konstruktoid/hardening/blob/master/misc/audit.rules.

These rules could be improved upon by condensing:

# File deletions
# Capture all unauthorized file accesses
# Capture all failures to access on critical elements
# Permissions

down to 2 rules in each, 4 for the second one. That, however, is not the
actual problem. My guess is that it is capturing way more information than is
necessary.

Post by Frederik Bosch
The problems I have are related to the directives -f and -b. The
hardening package uses -b 8192 and -f 2. That results in a kernel panic
very quickly because of audit backlog limit exceeded, and that causes a
reboot of the system. Now I wonder what a good configuration would be. I
started reading on the subject and read that -f 2 is probably the best
for security reasons. However, I do not want to have a system that
panics very quickly and reboots.

I'd say that you need to run:

aureport --start today --key --summary

and see what rule is triggering all the events. Do you really want all
deletes? Or just deletes in a specific directory? Do you really want to know
that a user changed dir permissions on a file in their homedir?

Post by Frederik Bosch
Should I simply increase the backlog to much higher numbers? Or should I
change -f to not cause a kernel panic? Or am I missing something and
should I change some other configuration? Thanks for your help.

For the moment change -f not to cause a kernel panic. I think the rules are
probably too aggressive.
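
A minimal sketch of that interim change, written as a rules fragment. In auditctl terms, -f 0 is silent, -f 1 reports the failure via printk, and -f 2 panics; -b sets the backlog limit. The file name is hypothetical, and the sketch writes to the current directory by default so it can be tried without root.

```shell
# Hypothetical output path. On a real host the fragment would typically live
# under /etc/audit/rules.d/ (assumption) and be loaded with augenrules --load
# or auditctl -R <file>.
rules_file="${RULES_FILE:-./10-failure-mode.rules}"

cat > "$rules_file" <<'EOF'
# Backlog limit (number of outstanding audit buffers); value from the thread
-b 8192
# Failure mode: 0 = silent, 1 = printk, 2 = panic
-f 1
EOF

echo "wrote $rules_file"
```

With -f 1 an exceeded backlog is logged by the kernel instead of halting the host, which leaves room to trim the rules first and decide about -f 2 afterwards.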

-Steve

Frederik Bosch

2018-08-22 11:40:56 UTC

Hi Steve,

Thank you very much for your reply and your suggestion. I appreciate
that. The summary looks as follows.

Key Summary Report
===========================
total key
===========================
63164 tmp
16060 docker
7206 delete
6007 admin_user_home
2760 auditlog
1595 specialfiles
675 perm_mod
69 systemd
54 systemd_tools
36 init
15 sshd
12 cron
5 login
5 actions
4 access
3 privileged
1 audit_rules_networkconfig_modification

Now I wonder why one would watch /tmp and /var/tmp. It seems these cause
most of the entries in the logs. Could you think of any reason why that
would be? I have also asked this question to the owner of the package. I
will limit the delete rules to specific locations and disable the
watches for /home, as they seem to be inappropriate for my use case.
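
To see at a glance how much each key contributes, the key summary can be converted into percentage shares. The helper below is a sketch only: the function name key_share is made up for illustration, and it assumes the "count key" line layout shown in the report above.

```shell
# Read an aureport key summary on stdin and print each key with its
# percentage of all reported events, largest share first.
# Header and separator lines are skipped because their first field is
# not a number.
key_share() {
  awk '
    NF == 2 && $1 ~ /^[0-9]+$/ { count[$2] = $1; total += $1 }
    END {
      if (total == 0) exit
      for (k in count) printf "%s %.1f\n", k, 100 * count[k] / total
    }
  ' | LC_ALL=C sort -k2,2nr
}

# Example with the two largest keys from the report above.
# On a live system: aureport --start today --key --summary | key_share
printf '63164 tmp\n16060 docker\n' | key_share
```

Keys that dominate the list are the first candidates for narrowing or removing the corresponding rule.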

Regards,
Frederik

Steve Grubb

2018-08-22 12:42:00 UTC

Post by Frederik Bosch
Now I wonder why one would watch /tmp and /var/tmp.

This can be a staging ground for exploits. However, if they are mounted with
the noexec option, they should be harmless. Also, the whole section titled:

# Capture all failures to access on critical elements

really is not necessary. Do you really need to know that an open failed
because of ENOENT? For example, every time a program is executed, ld.so
tries to open 3 or 4 nonexistent files. This is normal system activity
and is not needed for security purposes. The only time a failure matters
is when an open fails for permission reasons.
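
As an illustration of recording only permission failures, the sketch below writes four rules (two architectures times two exit codes) that match failed opens with EACCES or EPERM and ignore ENOENT. The file name, the syscall list, and the auid threshold of 1000 are assumptions, not taken from the hardening package; the key name access is the one that appears in the report.

```shell
# Hypothetical output path; written to the current directory so the sketch
# can be tried without root.
rules_file="${RULES_FILE:-./30-access-failures.rules}"

cat > "$rules_file" <<'EOF'
# Failed opens for permission reasons only (EACCES, EPERM), not ENOENT.
# Assumes x86_64 (b64 plus b32 compatibility syscalls). auid>=1000 is an
# assumed cut-off for regular users; 4294967295 is the unset auid.
-a always,exit -F arch=b64 -S open -S openat -S creat -S truncate -S ftruncate -F exit=-EACCES -F auid>=1000 -F auid!=4294967295 -k access
-a always,exit -F arch=b64 -S open -S openat -S creat -S truncate -S ftruncate -F exit=-EPERM -F auid>=1000 -F auid!=4294967295 -k access
-a always,exit -F arch=b32 -S open -S openat -S creat -S truncate -S ftruncate -F exit=-EACCES -F auid>=1000 -F auid!=4294967295 -k access
-a always,exit -F arch=b32 -S open -S openat -S creat -S truncate -S ftruncate -F exit=-EPERM -F auid>=1000 -F auid!=4294967295 -k access
EOF

echo "wrote $rules_file"
```

Filtering on the exit code in the kernel keeps the ld.so lookups of missing files out of the backlog entirely, rather than discarding them later in reports.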

About the docker section...why do you need to know all reads of those files?
I'm not sure of the reason you'd want that information.

-Steve

Frederik Bosch

2018-08-22 14:49:20 UTC

Hi Steve,

That was really helpful, again. My aureport looks much healthier now! I
have one remaining question. When running auditctl -s I still have a lost
value of 51 after boot.

enabled 2
failure 1
pid 779
rate_limit 0
backlog_limit 8192
lost 51
backlog 0
backlog_wait_time 0
loginuid_immutable 0 unlocked

What could be the cause? My aureport now looks like this.

sudo aureport --start boot --key --summary

Key Summary Report
===========================
total key
===========================
289 auditlog
120 specialfiles
73 docker
69 privileged
29 access
19 perm_mod
17 delete
12 actions
11 audit_rules_networkconfig_modification
10 cron
10 modules
10 login
6 apparmor_tools
6 audit_time_rules
5 systemd_tools
5 audit_rules_usergroup_modification
5 pam
4 power
3 audittools
3 group_modification
3 user_modification
3 init
3 modprobe
3 sshd
2 apparmor
2 systemd
2 export
2 auditconfig
2 mail
2 admin_user_home
1 audispconfig
1 MAC-policy
1 passwd_modification
1 logins
1 libpath
1 localtime
1 audit_time_ruleszone
1 sysctl

If I understand things correctly with failure set to 1, I should find a
message in dmesg due to printk, but I have not found any that is
related. My auditd.conf is as follows.

local_events = yes
write_logs = yes
log_file = /var/log/audit/audit.log
log_group = adm
log_format = RAW
flush = INCREMENTAL_ASYNC
freq = 50
max_log_file = 8
num_logs = 5
priority_boost = 4
disp_qos = lossy
dispatcher = /sbin/audispd
name_format = NONE
##name = mydomain
max_log_file_action = keep_logs
space_left = 75
space_left_action = email
verify_email = yes
action_mail_acct = root
admin_space_left = 50
admin_space_left_action = halt
disk_full_action = SUSPEND
disk_error_action = SUSPEND
use_libwrap = yes
##tcp_listen_port = 60
tcp_listen_queue = 5
tcp_max_per_addr = 1
##tcp_client_ports = 1024-65535
tcp_client_max_idle = 0
enable_krb5 = no
krb5_principal = auditd
##krb5_key_file = /etc/audit/audit.key
distribute_network = no

Or is it something I should not be worried about?

Regards,
Frederik

Steve Grubb

2018-08-23 14:18:53 UTC

Post by Frederik Bosch
Hi Steve,
That was really helpful, again. My aureport looks much healthier now! I
have one remaining question. When running auditctl -s I still have a lost
value of 51 after boot.
enabled 2
failure 1
pid 779
rate_limit 0
backlog_limit 8192
lost 51
backlog 0
backlog_wait_time 0
loginuid_immutable 0 unlocked
What could be the cause?

By default, the audit subsystem reserves 64 slots for audit events. Systemd
can easily overrun this before auditd starts. So, you need to boot with the
following kernel boot options:

audit=1 audit_backlog_limit=8192

Do you have these in your boot options?
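
One way to check is to look at the running kernel's command line. The sketch below does that with a small helper; the function name has_boot_option is made up for illustration.

```shell
# Return success if the kernel command line string $2 contains the exact
# option $1 as a whole word.
has_boot_option() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *)        return 1 ;;
  esac
}

# /proc/cmdline holds the options the running kernel was booted with.
# On Ubuntu they are usually set via GRUB_CMDLINE_LINUX in /etc/default/grub,
# followed by update-grub and a reboot (assumption about the bootloader
# setup; the thread does not say how the options were added).
cmdline="$(cat /proc/cmdline 2>/dev/null || true)"
for opt in audit=1 audit_backlog_limit=8192; do
  if has_boot_option "$opt" "$cmdline"; then
    echo "$opt: present"
  else
    echo "$opt: missing"
  fi
done
```

The helper matches the option together with its value, so a different audit_backlog_limit value is reported as missing.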

Post by Frederik Bosch
If I understand things correctly with failure set to 1, I should find a
message in dmesg due to printk, but I have not found any that is
related.

There may be a chance that these were lost before auditd rules were loaded.

Post by Frederik Bosch
My auditd.conf is as follows.
local_events = yes
write_logs = yes
log_file = /var/log/audit/audit.log
log_group = adm
log_format = RAW
flush = INCREMENTAL_ASYNC
freq = 50
max_log_file = 8
num_logs = 5

Btw, these two settings (max_log_file and num_logs) only allow 40 MB of
logs (8 MB x 5 files). Typically, if you really need auditing, you need
more than this.
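
The arithmetic behind that figure can be checked against any auditd.conf with a small helper. This is a sketch: the function name audit_log_capacity_mb is made up, and it assumes the plain "name = value" layout shown in the quoted file. The estimate also assumes rotation that honours num_logs; the auditd.conf man page describes keep_logs as not using that setting, so treat the result as the budget under rotate.

```shell
# Print max_log_file * num_logs (in megabytes) for the given auditd.conf.
# Field names are matched exactly, so max_log_file_action is not confused
# with max_log_file.
audit_log_capacity_mb() {
  awk -F'[ \t]*=[ \t]*' '
    $1 == "max_log_file" { size = $2 }
    $1 == "num_logs"     { count = $2 }
    END { print size * count }
  ' "$1"
}

# Example using the values quoted in the thread, written to a temp file.
conf="$(mktemp)"
printf 'max_log_file = 8\nnum_logs = 5\nmax_log_file_action = keep_logs\n' > "$conf"
audit_log_capacity_mb "$conf"
rm -f "$conf"
```

On a real host the function would be pointed at /etc/audit/auditd.conf, which normally requires root to read.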

Post by Frederik Bosch
Or is it something I should not be worried about?

Maybe. Let's see what the boot options are. Also, what kernel version are you
using?

-Steve

Frederik Bosch

2018-08-23 16:01:59 UTC

Hi Steve,

That did the trick: adding audit_backlog_limit=8192. Thanks a lot for
all your answers, things are much clearer for me now!

Regards,
Frederik
