Description
Environment
- Elixir version (elixir -v): any
- Absinthe version (mix deps | grep absinthe): 1.7.8
- Client Framework and version (Relay, Apollo, etc): latest apollo client with react
Expected behavior
When pushing several updates, one after another, using Absinthe.Subscription.publish, they should arrive in the same order they were sent.
Actual behavior
They might arrive in a different order.
Relevant Schema/Middleware Code
Given a subscription like this:
    subscription {
      someEntityUpdates {
        id
        version
      }
    }

and code like this:
    Absinthe.Subscription.publish(MyAppWeb.Endpoint, %{id: 1, version: 1}, entity_updated: "entity:1")
    # some work, possibly very fast because of cache
    Absinthe.Subscription.publish(MyAppWeb.Endpoint, %{id: 1, version: 2}, entity_updated: "entity:1")
    # some more work, possibly very fast because of cache
    Absinthe.Subscription.publish(MyAppWeb.Endpoint, %{id: 1, version: 3}, entity_updated: "entity:1")

If consecutive calls to publish happen in quick succession, updates to entity 1 might be delivered in the wrong order.
I observe this in a multi-node Phoenix setup using phoenix_pubsub_redis to relay messages between instances. The Phoenix Redis adapter guarantees the ordering of messages.
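The bug originates in how Absinthe.Subscription routes mutation_result: the shard is chosen by hashing the payload itself, so successive updates with different payloads can be routed to different shards and processed concurrently.
absinthe/lib/absinthe/subscription.ex
Line 220 in 3d0823b

    shard = :erlang.phash2(mutation_result, pool_size)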
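To make the routing concern concrete, here is a minimal standalone sketch. `ShardDemo`, its function names, and `pool_size = 8` are assumptions made up for illustration; only the `:erlang.phash2/2` call mirrors the quoted line. The second function hashes a value that stays constant across updates, purely as a contrast and not as a proposed patch.

```elixir
defmodule ShardDemo do
  @moduledoc false
  # Hypothetical helper module for illustration; not part of Absinthe.

  # Mirrors the quoted line: the shard is derived from the payload itself,
  # so a changed field (e.g. :version) can change the shard.
  def shard_for_payload(mutation_result, pool_size) do
    :erlang.phash2(mutation_result, pool_size)
  end

  # Contrast only: hashing a value that is identical for every update
  # always yields the same shard.
  def shard_for_key(key, pool_size) do
    :erlang.phash2(key, pool_size)
  end
end

# Assumed pool size for the demo.
pool_size = 8

payloads = for version <- 1..3, do: %{id: 1, version: version}

payload_shards = Enum.map(payloads, &ShardDemo.shard_for_payload(&1, pool_size))
key_shards = Enum.map(payloads, fn _ -> ShardDemo.shard_for_key("entity:1", pool_size) end)

IO.inspect(payload_shards, label: "shards by payload")
IO.inspect(key_shards, label: "shards by constant key")
```

Nothing forces the three payload-derived shards to be equal, whereas the key-derived shards are identical by construction.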
There is a second place in the code that adds to this problem:

    Task.Supervisor.start_child(state.task_super, Subscription.Local, :publish_mutation, [
Here, the actual channel publish happens asynchronously via Task.Supervisor, which introduces another race condition: even if the payloads are routed to the same shard, once the Tasks are started we can't be sure in which order the BEAM schedules them. However, this one can be resolved by flipping the async flag that was recently merged.
The only workaround I can think of for now is to use pool_size: 1, async: false.
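For reference, a sketch of how that workaround might look in the supervision tree. It assumes the options are passed as a keyword list in the `Absinthe.Subscription` child spec and that the installed Absinthe version already includes the `async` option; the surrounding `children` list is illustrative.

```elixir
# In the application's supervision tree, e.g. lib/my_app/application.ex
children = [
  MyAppWeb.Endpoint,
  # pool_size: 1  -> a single shard, so payload hashing cannot split updates
  # async: false  -> publish inline rather than in a separately scheduled Task
  {Absinthe.Subscription, pubsub: MyAppWeb.Endpoint, pool_size: 1, async: false}
]
```

The trade-off is that all publishes on a node are then handled by a single shard, which will likely limit throughput under load.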
I'd love to help prepare a fix if this is indeed considered a bug and we can agree on a desired solution.