ProducerPool - easier remote publishing #311

jehiah · 2020-11-20T19:08:33Z

go-nsq does not provide good primitives for handling failures when publishing to nsq instances; handling publish errors is left as an exercise for the developer, even when that logic is important for reliable message creation. Common guidance has been to prefer publishing to a colocated nsqd instance, but that configuration is less desirable in many cloud environments.

I'm proposing we add a ProducerPool which provides a Publish interface that can be configured with multiple nsqd instances and which will retry publishing to another instance on error. This will provide a simple way for applications publishing messages to handle write failures in a fault tolerant way.
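To make the proposal concrete, here is a minimal sketch of the retry-on-error behavior described above. It is not the code from this PR: `FailoverPool`, `PublisherFunc`, and `ErrNoPublishers` are hypothetical names, and the `Publisher` interface is an assumption that mirrors the `Publish(topic string, body []byte) error` method on go-nsq's `*nsq.Producer`.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// Publisher is the subset of a producer this sketch needs.
// go-nsq's *nsq.Producer has a Publish method with this signature.
type Publisher interface {
	Publish(topic string, body []byte) error
}

// PublisherFunc adapts a plain function to Publisher (hypothetical helper).
type PublisherFunc func(topic string, body []byte) error

func (f PublisherFunc) Publish(topic string, body []byte) error { return f(topic, body) }

// ErrNoPublishers is returned when the pool was built with an empty list.
var ErrNoPublishers = errors.New("failover pool: no publishers configured")

// FailoverPool is a hypothetical stand-in for the proposed ProducerPool.
type FailoverPool struct {
	next       uint64 // round-robin cursor, updated atomically
	publishers []Publisher
}

func NewFailoverPool(p ...Publisher) *FailoverPool {
	return &FailoverPool{publishers: p}
}

// Publish tries each publisher at most once, starting at the round-robin
// cursor, and returns nil as soon as one of them accepts the message.
func (fp *FailoverPool) Publish(topic string, body []byte) error {
	n := len(fp.publishers)
	if n == 0 {
		return ErrNoPublishers
	}
	start := int(atomic.AddUint64(&fp.next, 1) % uint64(n))
	var lastErr error
	for i := 0; i < n; i++ {
		lastErr = fp.publishers[(start+i)%n].Publish(topic, body)
		if lastErr == nil {
			return nil
		}
	}
	return fmt.Errorf("all %d publishers failed, last error: %w", n, lastErr)
}
```

In a sketch like this, a publish that reached nsqd but whose response was lost would be retried on another instance, so consumers should expect occasional duplicates.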

Based on the outcome of nsqio/nsq#1300, this producer may also lazily organize nsqd instances into groupings of node local, zone local, region local, or global, and provide prioritization based on topology.
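One way the topology grouping could look is sketched below. Everything here is an assumption for illustration: the `Endpoint` fields, the `Locality` tiers, and the `Classify` and `Prioritize` helpers are hypothetical, and how nsqd would actually expose host, zone, or region metadata depends on nsqio/nsq#1300.

```go
package main

import "sort"

// Locality ranks how close an nsqd instance is to the publisher.
// Lower values are preferred. The tiers follow the groupings named above.
type Locality int

const (
	NodeLocal Locality = iota
	ZoneLocal
	RegionLocal
	Global
)

// Endpoint is a hypothetical description of one nsqd instance.
type Endpoint struct {
	Addr   string
	Host   string
	Zone   string
	Region string
}

// Classify compares an endpoint against the publisher's own location.
// Empty metadata never matches, so unknown endpoints fall back to Global.
func Classify(self, e Endpoint) Locality {
	switch {
	case e.Host != "" && e.Host == self.Host:
		return NodeLocal
	case e.Zone != "" && e.Zone == self.Zone && e.Region == self.Region:
		return ZoneLocal
	case e.Region != "" && e.Region == self.Region:
		return RegionLocal
	default:
		return Global
	}
}

// Prioritize returns a copy of eps ordered from closest to farthest.
// The sort is stable, so endpoints in the same tier keep their input order.
func Prioritize(self Endpoint, eps []Endpoint) []Endpoint {
	out := append([]Endpoint(nil), eps...)
	sort.SliceStable(out, func(i, j int) bool {
		return Classify(self, out[i]) < Classify(self, out[j])
	})
	return out
}
```

A pool could then try endpoints in the returned order and only fall through to a farther tier when the closer ones fail.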

Functionally, this is similar to the reliable publishing in nsq_to_nsq, where publishing happens in round-robin or host-pool mode across a pool of nsqd instances (but without the same backpressure).

cc: nsqio/nsq#1254 which tracks facilitating use of nsq in a cloud environment.

mreiferson · 2020-11-25T03:36:16Z

Sounds good 👍

jehiah · 2020-12-11T05:16:46Z

Thanks to @jharshman for contributing a PublishAsync implementation. I still want to work on making this transparently pick the best nsqd to write to.

producer_pool.go

ploxiln · 2020-12-15T21:45:44Z

another possibly related idea: "publish to N different nsqd" (e.g. 2) for users that want stronger guarantees that no messages are ever lost
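A rough sketch of the "publish to N different nsqd" idea follows. `PublishN`, `PublisherFunc`, and `ErrNotEnoughAcks` are hypothetical names, and `Publisher` is an assumed interface matching the `Publish` method of go-nsq's `*nsq.Producer`. Nothing here is part of the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Publisher is the subset of a producer this sketch needs.
type Publisher interface {
	Publish(topic string, body []byte) error
}

// PublisherFunc adapts a plain function to Publisher (hypothetical helper).
type PublisherFunc func(topic string, body []byte) error

func (f PublisherFunc) Publish(topic string, body []byte) error { return f(topic, body) }

// ErrNotEnoughAcks reports that fewer instances accepted the message
// than the caller asked for.
var ErrNotEnoughAcks = errors.New("message accepted by fewer instances than required")

// PublishN walks pubs in order and stops once `copies` distinct publishers
// have accepted the message. It returns how many accepted it.
func PublishN(pubs []Publisher, copies int, topic string, body []byte) (int, error) {
	if copies < 1 {
		return 0, errors.New("copies must be at least 1")
	}
	acked := 0
	for _, p := range pubs {
		if acked == copies {
			break
		}
		if err := p.Publish(topic, body); err == nil {
			acked++
		}
	}
	if acked < copies {
		return acked, fmt.Errorf("%w: got %d, want %d", ErrNotEnoughAcks, acked, copies)
	}
	return acked, nil
}
```

The trade-off is that every message exists on several nsqd instances, so a consumer connected to all of them receives each copy and needs to tolerate or deduplicate repeats.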

mreiferson · 2020-12-28T23:11:48Z

Not sure if this is ready for review, but a few things as is:

It looks like the target nsqd instances are static in a ProducerPool, meaning that if the list needs to change, a user is expected to discard the ProducerPool instance and create a new one with a new list (and new connections). If so, ProducerPool needs some sort of Close() func to clean up.
PublishAsync shouldn't be managing its own goroutine directly; perhaps a ProducerPool instance should manage the lifecycle of a goroutine that handles retries?
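The two review points above could be combined roughly as follows: the pool owns a single worker goroutine that performs retries, and `Close()` stops it. This is only a sketch under assumptions. `AsyncPool`, `PublisherFunc`, `ErrClosed`, and `ErrNoPublishers` are hypothetical names, `Publisher` is an assumed interface matching the `Publish` method of go-nsq's `*nsq.Producer`, and the `PublishAsync` signature here is simplified and differs from go-nsq's.

```go
package main

import (
	"errors"
	"sync"
)

// Publisher is the subset of a producer this sketch needs.
type Publisher interface {
	Publish(topic string, body []byte) error
}

// PublisherFunc adapts a plain function to Publisher (hypothetical helper).
type PublisherFunc func(topic string, body []byte) error

func (f PublisherFunc) Publish(topic string, body []byte) error { return f(topic, body) }

var (
	ErrClosed       = errors.New("async pool: closed")
	ErrNoPublishers = errors.New("async pool: no publishers configured")
)

type job struct {
	topic string
	body  []byte
	done  chan<- error
}

// AsyncPool owns one worker goroutine for its whole lifetime.
type AsyncPool struct {
	publishers []Publisher
	jobs       chan job
	mu         sync.RWMutex // guards closed and the close of jobs
	closed     bool
	wg         sync.WaitGroup
}

func NewAsyncPool(queueSize int, p ...Publisher) *AsyncPool {
	ap := &AsyncPool{publishers: p, jobs: make(chan job, queueSize)}
	ap.wg.Add(1)
	go ap.loop()
	return ap
}

// loop publishes each queued message, falling through to the next
// publisher on error, and reports the final result on the job's channel.
func (ap *AsyncPool) loop() {
	defer ap.wg.Done()
	for j := range ap.jobs {
		err := ErrNoPublishers
		for _, p := range ap.publishers {
			if err = p.Publish(j.topic, j.body); err == nil {
				break
			}
		}
		j.done <- err // never blocks: done has capacity 1
	}
}

// PublishAsync queues a message and returns a channel that receives
// exactly one result. It blocks only while the queue is full.
func (ap *AsyncPool) PublishAsync(topic string, body []byte) (<-chan error, error) {
	ap.mu.RLock()
	defer ap.mu.RUnlock()
	if ap.closed {
		return nil, ErrClosed
	}
	done := make(chan error, 1)
	ap.jobs <- job{topic: topic, body: body, done: done}
	return done, nil
}

// Close stops accepting messages, lets the worker drain what is already
// queued, and waits for it to exit. It is safe to call more than once.
func (ap *AsyncPool) Close() {
	ap.mu.Lock()
	if !ap.closed {
		ap.closed = true
		close(ap.jobs)
	}
	ap.mu.Unlock()
	ap.wg.Wait()
}
```

A real implementation would presumably also stop the underlying producers in `Close()`; that step is left out here because the sketch does not own real connections.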

jehiah · 2020-12-29T01:27:51Z

> Not sure if this is ready for review, but a few things as is:

No, not RFR (see below), but I think it has some useful contents, and figuring out the right way to expose initialization needs some thought, so it is definitely RFC; comments are good.

> Based on the outcome of nsqio/nsq#1300, this producer may also lazily organize nsqd instances into groupings of node local, zone local, region local, or global, and provide prioritization based on topology.

> PublishAsync shouldn't be managing its own goroutine directly; perhaps a ProducerPool instance should manage the lifecycle of a goroutine that handles retries?

I don't like how this implementation spawns goroutines, but I don't really like needing to manage a long-lived goroutine either. This implementation should be "correct", so that's a start. I'd love for this to target an interface (I have a version in use that does), which would make it easier to land testing, but this package needs some rework to get to that point, which is sort of out of scope for this PR.

JensRantil · 2025-05-23T21:57:18Z

Implementing a ProducerPool is easier said than done. How retries should be done, whether publishing a message multiple times is acceptable, and what timeouts should be used before retrying are some of the details, and none of them is straightforward. I am not entirely convinced this should live in the client library. My two cents.

jehiah · 2025-05-24T01:32:47Z

> Implementing a ProducerPool is easier said than done.

@JensRantil I agree, but there are two sides to that coin: because it isn't easy, go-nsq should help.

> whether publishing a message multiple times is acceptable is not straightforward

It's worth clarifying that nsq already makes a design choice to prefer "at least once" message delivery. There are many edge cases where you could end up with multiple messages or a message delivered more than once, and that's the outcome nsq prefers over losing a message. ProducerPool helps facilitate that.

As ProducerPool is entirely optional, I think the end user can make implementation choices appropriately if they prefer a different approach.

Timeouts are tricky, so this is unlikely to be merged before #365.

jehiah added the enhancement label Nov 20, 2020

ProducerPool - initial version

912be8b

jehiah mentioned this pull request Nov 23, 2020

nsq: DRAINING mode nsqio/nsq#1302

Open

ProducerPool.PublishAsync

4574426

ploxiln reviewed Dec 15, 2020

producer_pool.go

ploxiln mentioned this pull request Jan 8, 2021

About creating a new producer #280

Closed

jehiah mentioned this pull request May 17, 2021

JUST WONDER WHY PRODUCER dose not support lookupd ! is there any official explain for this? #322

Closed

jehiah mentioned this pull request Oct 18, 2023

nsq: support region/zone aware msg consumption [RFC] nsqio/nsq#1300

Closed

jehiah mentioned this pull request Jan 17, 2024

*: context.Context and timeouts #360

Open

jehiah added the chore label Jul 28, 2025


5 participants
