googlecloudpubsubreceiver: growing unacked message in subscription #38164
Comments
Thank you for filing this issue. I am pinging the code owners of the googlecloudpubsubreceiver.
Pinging code owners for receiver/googlecloudpubsub: @alexvanboxel. See Adding Labels via Comments if you do not have permissions to add labels yourself. For example, comment '/label priority:p2 -needs-triaged' to set the priority and remove the needs-triaged label.
I'm going to keep this open with priority:p3, as it could be that even with the marshaller support we will need an option for what to do when marshalling fails.
Hi @alexvanboxel, thank you for your comment, but I am not completely clear on what it means exactly. Are these unacknowledged messages actually lost and not delivered to the backend? If yes, will adding this encoder help to process them? If not, what would be the best option for me to set up reliable log delivery from GCP?
The problem is the encoding: probably the Protobuf encoding fails, so the message is un-acked. The new encoder will default to JSON, so you can switch to a more stable parsing (personally I don't like the Protobuf parsing; it was my mistake to have allowed it in). As a workaround you could try raw_text; if you still see un-acked messages then something else is wrong.
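For reference, a minimal sketch of the raw_text workaround mentioned above; the project and subscription names are placeholders:

```yaml
receivers:
  googlecloudpubsub:
    # Placeholder names; replace with your own project and subscription.
    project: my-project
    subscription: projects/my-project/subscriptions/my-log-subscription
    # Workaround: hand each Pub/Sub message body over as a plain-text log
    # record instead of parsing it as a Cloud Logging protobuf payload.
    encoding: raw_text
```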
@alexvanboxel it appears that the receiver is struggling to parse cloudaudit.googleapis.com logs (which indeed have the protoPayload type). After filtering out this source in the sink I don't see the backlog growing anymore. I am wondering if the new encoder is planned to be capable of parsing it (or at least passing it unparsed to the next processor in the chain to deal with it). It would also be great to have some warn-level output if the receiver fails to parse a record. Greatly appreciate your help and your work on this receiver, thank you, Alex.
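For anyone else hitting the same symptom, a hedged sketch of the kind of sink exclusion described above; the exact expression depends on your sink's existing filter, and the clause below is only an illustration in the Cloud Logging query language:

```text
<existing sink filter> AND NOT logName:"cloudaudit.googleapis.com"
```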
The encoder I pointed to in the previous comment will default to JSON, which is less risky; you will have to force Proto if you want that. This will probably land in a few releases; I'm not happy with the tests yet, but the code is ready. Good points on what to do when parsing fails, I'll think about it, so I'd like to keep this ticket open for those issues.
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners, add a component label (see Adding Labels via Comments), or if you are unsure which component this issue relates to, please ping the code owners. See Adding Labels via Comments if you do not have permissions to add labels yourself.
The PR will enable ignoring encoding errors (with the drawback of losing the message). A metric is added to be able to monitor the issue: #39839
…39839)

#### Description
Introduce a setting to ignore errors from the configured encoder. It's advised to set this to `true` when using a custom encoder, and to use the `receiver.googlecloudpubsub.encoding_error` metric to monitor the number of errors. Ignoring the error will cause the receiver to drop the message.

#### Link to tracking issue
#38164

#### Testing
Tested with a custom encoder and introducing bogus messages to the topic.

#### Documentation
Added the configuration setting to the README.
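A hedged sketch of how this option might be wired up once released; the key name `ignore_encoding_error` is assumed here and should be verified against the receiver README:

```yaml
receivers:
  googlecloudpubsub:
    # Placeholder names; replace with your own project and subscription.
    project: my-project
    subscription: projects/my-project/subscriptions/my-log-subscription
    # Assumed setting name from #39839: drop messages the configured encoder
    # cannot decode instead of leaving them un-acked, and monitor the
    # receiver.googlecloudpubsub.encoding_error metric for the drop count.
    ignore_encoding_error: true
```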
Component(s)
receiver/googlecloudpubsub
What happened?
Description
I am using OpenTelemetry Collectors deployed in GKE. This is a dedicated deployment that only works with one subscription, and there is an organization sink sending all relevant logs to that subscription. Throughput is conservative: after applying a filter in the sink it's only around ~200k messages per day. The collector deployment consists of 5 pods with 500Mi/1Gi limits (not even close to hitting them anyway). Shortly after deployment I purge messages and almost immediately observe slow but steady growth of unacked messages. There are no errors in the collector output (except channel re-establishing), no failed or refused records in the collector's self-telemetry, and no messages in the DLQ. The subscription is set with a 600-second ack deadline and exactly-once delivery enabled.
Steps to Reproduce
Set up an organization sink, set up a subscription, and start collecting logs from it.
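A minimal sketch of the kind of collector pipeline involved in these steps, with placeholder names; this is not the reporter's actual configuration, and the `cloud_logging` encoding is an assumption consistent with the protoPayload parsing failures discussed in the comments:

```yaml
receivers:
  googlecloudpubsub:
    # Placeholder names; replace with your own project and subscription.
    project: my-project
    subscription: projects/my-project/subscriptions/org-log-sink-sub
    # Assumption: parse Pub/Sub messages as Cloud Logging LogEntry payloads.
    encoding: cloud_logging

exporters:
  debug:

service:
  pipelines:
    logs:
      receivers: [googlecloudpubsub]
      exporters: [debug]
```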
Expected Result
All sent messages are acknowledged
Actual Result
Some messages are not acknowledged, and their number grows over time.
Collector version
v0.120.1
Environment information
Environment
OS: cos
OpenTelemetry Collector configuration
Log output
Additional context