bug: singletons fail to relocate on leader leaving a cluster

**Bug description**
I am trying to run a singleton on a cluster. I am creating a persistent scheduler (backed by a database) to allow grains to schedule messages for a later date, and to allow the cluster to survive node outages, including full cluster outages.

While experimenting with it, I often notice that my cluster is left without the singleton running. It may well be that I am doing something wrong when initializing the singleton.

This seems to be a combination of several bugs stacked on top of each other, so I am just going to write out what I tried and what I observed. Hopefully you can make something out of that :D If not, let me know, and I will see if I can write small test applications :) But first, the main issue I was running into.

**How to reproduce it?**
Run two nodes in a cluster (with a quorum of 2 and relocation enabled), and execute the following snippet on startup:

```
n.actorSystem.SpawnSingleton(ctx, "scheduler", scheduler.New())
```

Now when I shut down the node running the `scheduler`, I get this message:

`{"level":"error","ts":"2025-12-27T10:00:14.898023+0100","caller":"actor/relocator.go:106","msg":"cluster rebalancing failed: spawn error: singleton already exists"}`

And the cluster is left without a `scheduler` running.

**Expected behavior**
Always have the singleton running somewhere on the cluster.

**Library Version:**
 - Go-Akt version: 1b11cfd16e5711cfce2cac58a53ea90a691fe3ce
 - Go version: 1.25.3

**Additional context**
Initially I was running my cluster with `WithoutRelocation`. In my use case, I don't need grains to relocate when a node is shut down. They will just spin up a new instance on the next invocation. Migration is unnecessary and would only waste CPU.
However, for singletons, if the leader goes away, the singleton is not restarted on the new leader either. That surprised me somewhat: even though I use `WithoutRelocation`, I expected singletons to be the exception. But this was an assumption on my side :D
So now the question becomes: how do I keep a singleton running on a cluster with `WithoutRelocation`? For grains, this option is exactly what I want. Singletons, however, are the exception in my use case. I also can't specify, when using `client.TellGrain`, that the grain shouldn't be relocated on node issues.

Another thing I ran into, but this really is a "me" problem: I had a hard time getting singletons working. Initially I just called `n.actorSystem.SpawnSingleton(ctx, "scheduler", scheduler.New())` when each node starts. But this errors on the second node (correctly, if you ask me). The error, however, was difficult to deal with: `internal: actor=(scheduler) actor already exists`. This is not `singleton already exists`, so I couldn't differentiate between "there is another error" and "this singleton has already started".
In the end I just added an `IsLeader` check in front of the `SpawnSingleton`, once I figured out that "oldest node" means "is-leader" :) So my code became:

```
	if isLeader, _ := n.actorSystem.IsLeader(ctx); isLeader {
		n.actorSystem.SpawnSingleton(ctx, "scheduler", scheduler.New())
	}
```
But there might be better ways to ensure a singleton is running on start of a cluster?

bug: singletons fail to relocate on leader leaving a cluster #1036

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

bug: singletons fail to relocate on leader leaving a cluster #1036

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions