2 processes listening on livestatus socket #117

@pvdputte

I've previously commented on this livestatus issue but probably should have opened a new one here instead. Sorry.

The problem: even on a fresh install with no custom configuration other than the TCP livestatus socket, two processes are listening on that socket after a systemctl reload naemon:

vagrant@bookworm:~$ sudo netstat -tupan | grep -e Recv -e naemon
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:6557          0.0.0.0:*               LISTEN      4067/naemon         
tcp        3      0 0.0.0.0:6557          0.0.0.0:*               LISTEN      4072/naemon   

One of them is not responding (is it waiting to be reaped? top does not report it as a zombie, though).
As a result, Thruk sometimes behaves erratically, for example reporting the backend as down.
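
To check for this condition after a reload, one option is to count the distinct PIDs that hold a LISTEN socket on the livestatus port. The following is a minimal sketch and not part of naemon: count_listeners is a hypothetical helper name, and it assumes the column layout of the netstat -tupan output shown above (local address in column 4, state in column 6, PID/program name in column 7).

```shell
# Hypothetical helper: read `netstat -tupan` style output on stdin and print
# the number of distinct PIDs in LISTEN state on the given TCP port.
count_listeners() {
  awk -v port="$1" '
    # Column 4 is the local address, column 6 the state, column 7 "PID/name".
    $6 == "LISTEN" && $4 ~ (":" port "$") {
      split($7, parts, "/")
      pids[parts[1]] = 1
    }
    END {
      n = 0
      for (p in pids) n++
      print n
    }
  '
}

# Example (needs root to see PIDs):
#   sudo netstat -tupan | count_listeners 6557
```

Any result greater than 1 would indicate the situation described here.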

This is the config:

vagrant@bookworm:~$ cat /etc/naemon/module-conf.d/livestatus.cfg 
# Naemon config
broker_module=/usr/lib/naemon/naemon-livestatus/livestatus.so inet_addr=0.0.0.0:6557 debug=1
event_broker_options=-1

This can be easily reproduced in Vagrant.

$ vagrant init debian/bookworm64
$ vagrant up
$ vagrant ssh

Next, download the reproduce script and execute it:

vagrant@bookworm:~$ wget -O reproduce https://github.com/naemon/naemon-livestatus/files/13950328/reproduce.txt
vagrant@bookworm:~$ chmod +x reproduce
vagrant@bookworm:~$ ./reproduce

This should result in something like:

<installation>


----------
Restarting

tcp        0      0 0.0.0.0:6557            0.0.0.0:*               LISTEN      5408/naemon         

naemon,5408 --daemon /etc/naemon/naemon.cfg
  ├─naemon,5409 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5410 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5411 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5412 --worker /var/lib/naemon/naemon.qh
  └─naemon,5413 --daemon /etc/naemon/naemon.cfg

systemd,5367 --user
  └─(sd-pam),5368


----------------------
Reloading until broken

Success.
tcp        0      0 0.0.0.0:6557            0.0.0.0:*               LISTEN      5408/naemon         
tcp        0      0 0.0.0.0:6557            0.0.0.0:*               LISTEN      5413/naemon         

naemon,5408 --daemon /etc/naemon/naemon.cfg
  ├─naemon,5413 --daemon /etc/naemon/naemon.cfg
  ├─naemon,5434 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5435 --worker /var/lib/naemon/naemon.qh
  ├─naemon,5436 --worker /var/lib/naemon/naemon.qh
  └─naemon,5437 --worker /var/lib/naemon/naemon.qh

systemd,5367 --user
  └─(sd-pam),5368

---------------------------------
Running "GET status" every second, response size 0 is not good:
2024-01-16T13:08:05+00:00 976
2024-01-16T13:08:06+00:00 976
2024-01-16T13:08:07+00:00 0
2024-01-16T13:08:10+00:00 977
2024-01-16T13:08:11+00:00 0
^C

Note that process 5413 already exists when naemon is first started, but it only starts listening on the socket after the reload.
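
For reference, the once-per-second "GET status" check can be approximated as follows. This is only a sketch under assumptions: the probe used by the reproduce script is not shown here and may differ, probe_once and summarize_probes are hypothetical names, the probe relies on bash's /dev/tcp and the coreutils timeout command, and the 3 second timeout is an arbitrary choice.

```shell
# Hypothetical helper: send one Livestatus query to HOST PORT and print the
# response size in bytes. A hung or refused connection yields 0.
probe_once() {
  timeout 3 bash -c '
    exec 3<>"/dev/tcp/$0/$1" || exit 1
    # A Livestatus query is terminated by an empty line.
    printf "GET status\n\n" >&3
    cat <&3
  ' "$1" "$2" 2>/dev/null | wc -c
}

# Hypothetical helper: read "timestamp size" lines on stdin and print
# "<empty responses>/<total probes>".
summarize_probes() {
  awk '
    { total++; if ($2 == 0) empty++ }
    END { printf "%d/%d\n", empty, total }
  '
}

# Example loop (interrupt with ^C):
#   while sleep 1; do
#     printf "%s %s\n" "$(date -Is)" "$(probe_once 127.0.0.1 6557)"
#   done
```

Feeding the five sample lines above into summarize_probes would report two empty responses out of five probes.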

My current workaround is to restart instead of reload after each config change, but with a rather large config this takes a lot longer than a reload. The alternative would be to go back to xinetd.
