Current Merlin releases appear to trigger instability in Naemon #158

@eschoeller

There is a release available here on GitHub: 2022.06.02. I started off using that, then migrated to packages hosted on this mirror:
https://download.opensuse.org/repositories/home:/itrs-op5/CentOS_8_Stream/

In both cases, naemon would crash (and dump a core) and merlind would eventually peg itself at 99% CPU usage. I went in circles for a while trying to determine what was going on. I eventually narrowed it down to service and host checks that returned a CRITICAL state: naemon would crash when it attempted to generate a notification (even though I had notifications disabled globally). During my initial load testing I was using mostly ping checks that all returned OK, so I rarely hit this condition. But the moment checks started returning CRITICAL, things would break.
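For anyone who wants to reproduce the condition, a check that always returns CRITICAL should be enough to exercise the notification path. The following is only a minimal sketch, not the check I actually used: the function names and the message text are hypothetical, and the only thing it relies on is the standard plugin convention that exit code 2 means CRITICAL.

```python
#!/usr/bin/env python3
"""Hypothetical dummy check plugin that always reports CRITICAL."""
import sys

# Standard Nagios/Naemon plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def build_result(message="forced failure for crash reproduction"):
    """Return (exit_code, output_line) for an always-CRITICAL check."""
    return CRITICAL, f"CRITICAL - {message}"


def main():
    """Print the plugin output line and return the exit code."""
    code, line = build_result()
    print(line)
    return code


# When installing this as a plugin, end the script with:
#     sys.exit(main())
```

Pointing a test host or service at a script like this (instead of a ping check that returns OK) should put it into a CRITICAL state on the first check.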

Anyway, long story short: I built merlin from source and everything is fine now. But given the run-around I went through, I figured I'd report this here for anyone else who may encounter this problem, or merely as a suggestion that it might be an appropriate time to package a new release.

I did see in the GitHub issues (#146) that there was a 2022.06.30 release, but I never actually found it.

I can re-configure these systems to trigger the issue again pretty easily if you need more info, but since the issue is fixed in the current source code I doubt any further troubleshooting is needed.

Some additional information about the systems where I encountered these problems:
CentOS Stream release 8
4.18.0-527.el8.x86_64 #1 SMP Thu Nov 23 14:16:19 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
libnaemon-1.4.1-18.1.x86_64
naemon-thruk-1.4.1-13.1.noarch
naemon-livestatus-1.4.1-14.1.x86_64
naemon-1.4.1-13.1.noarch
naemon-devel-1.4.1-18.1.x86_64
naemon-core-1.4.1-18.1.x86_64
naemon-vimvault-1.4.0-3.2.x86_64

NAME="Red Hat Enterprise Linux"
VERSION="8.9 (Ootpa)"
4.18.0-513.5.1.el8_9.x86_64 #1 SMP Fri Sep 29 05:21:10 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
naemon-livestatus-1.4.1-14.1.x86_64
naemon-1.4.1-13.1.noarch
libnaemon-1.4.1-18.1.x86_64
naemon-vimvault-1.4.0-3.2.x86_64
naemon-core-1.4.1-18.1.x86_64
naemon-thruk-1.4.1-13.1.noarch

(Both CentOS and RHEL systems were originally fetching naemon from https://labs.consol.de/repo/stable/rhel8/x86_64/ but switched to https://download.opensuse.org/repositories/home:/naemon/CentOS_7/)

It also seems like I may have had this exact same problem on a set of Debian machines on 3/6/2023 after naemon got upgraded there. I suppose the fix slipped my mind!
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
4.19.0-25-amd64 #1 SMP Debian 4.19.289-2 (2023-08-08) x86_64 GNU/Linux
ii libnaemon:amd64 1.4.1-1 amd64
ii naemon 1.4.1-1 amd64
ii naemon-core 1.4.1-1 amd64
ii naemon-dev 1.4.1-1 amd64
ii naemon-livestatus 1.4.1-1 amd64
ii naemon-thruk 1.4.1-1 amd64
ii naemon-vimvault 1.4.0-1 amd64
ii thruk 3.10-1 amd64
