@Sneagan Sneagan commented Sep 19, 2025

We hit an issue where the service gets a Prometheus registration error after it encounters and recovers from a panic. The current theory is that some state or instance-uniqueness constraint on the Prometheus side is not cleaned up or regenerated by the time the instance recovers from the panic. To handle this error, we check whether the collector is already registered and, if so, reuse the existing one.
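
For readers who want to see the "reuse the existing collector" idea in isolation, here is a minimal, standard-library-only sketch. It is not the code from this PR: `Collector`, `Registry`, and `RegisterOrReuse` are hypothetical stand-ins, and the toy `AlreadyRegisteredError` only imitates the shape of the error of the same name in `client_golang`, which exposes the previously registered collector through its `ExistingCollector` field.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Collector is a hypothetical stand-in for a metrics collector.
type Collector struct {
	Name string
}

// AlreadyRegisteredError is a toy error that carries the collector that was
// registered earlier under the same name (an assumption modelled on the
// client_golang error of the same name).
type AlreadyRegisteredError struct {
	Existing *Collector
}

func (e AlreadyRegisteredError) Error() string {
	return "collector already registered: " + e.Existing.Name
}

// Registry is a toy registry that rejects duplicate collector names.
type Registry struct {
	mu         sync.Mutex
	collectors map[string]*Collector
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{collectors: make(map[string]*Collector)}
}

// Register stores c, or returns AlreadyRegisteredError if the name is taken.
func (r *Registry) Register(c *Collector) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if existing, ok := r.collectors[c.Name]; ok {
		return AlreadyRegisteredError{Existing: existing}
	}
	r.collectors[c.Name] = c
	return nil
}

// RegisterOrReuse registers c. If a collector with the same name is already
// registered, it returns that existing collector instead of failing.
func RegisterOrReuse(r *Registry, c *Collector) (*Collector, error) {
	err := r.Register(c)
	if err == nil {
		return c, nil
	}
	var are AlreadyRegisteredError
	if errors.As(err, &are) {
		return are.Existing, nil
	}
	return nil, err
}

func main() {
	reg := NewRegistry()
	first, _ := RegisterOrReuse(reg, &Collector{Name: "requests_total"})
	// Simulate re-initialization after a recovered panic: the registry still
	// holds the first collector, so the second call reuses it.
	second, _ := RegisterOrReuse(reg, &Collector{Name: "requests_total"})
	fmt.Println(first == second) // prints "true"
}
```

The point of the pattern is that initialization becomes idempotent: running the setup path a second time in the same process yields the collector that is already collecting, rather than an error.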

@Sneagan Sneagan requested review from clD11 and hspencer77 September 19, 2025 14:24

@clD11 clD11 left a comment

LGTM Approved

@Sneagan Sneagan merged commit a56a235 into master Sep 19, 2025
8 checks passed
@Sneagan Sneagan deleted the hotfix/metrics-registration branch September 19, 2025 14:40
Sneagan added a commit that referenced this pull request Sep 19, 2025
* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

* Fix metrics registration failure after panic (#830)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>
Sneagan added a commit that referenced this pull request Sep 19, 2025
* Release Simple Dependency Removal (#828)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>

* Release 2025-09-19 (#834)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

* Fix metrics registration failure after panic (#830)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>

---------

Co-authored-by: husobee <[email protected]>
Co-authored-by: eV <[email protected]>
Co-authored-by: Ian Krieger <[email protected]>
Co-authored-by: Harold Spencer Jr. <[email protected]>
Sneagan added a commit that referenced this pull request Sep 19, 2025
* Release Simple Dependency Removal (#828)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)



* Fix prod branch naming (#829)

---------



* Release 2025-09-19 (#834)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)



* Fix prod branch naming (#829)

* Fix metrics registration failure after panic (#830)

---------



---------

Co-authored-by: husobee <[email protected]>
Co-authored-by: eV <[email protected]>
Co-authored-by: Ian Krieger <[email protected]>
Co-authored-by: Harold Spencer Jr. <[email protected]>
Sneagan added a commit that referenced this pull request Sep 19, 2025
* Align Prod and Master (#836)

* Release Simple Dependency Removal (#828)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>

* Release 2025-09-19 (#834)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

* Fix metrics registration failure after panic (#830)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>

---------

Co-authored-by: husobee <[email protected]>
Co-authored-by: eV <[email protected]>
Co-authored-by: Ian Krieger <[email protected]>
Co-authored-by: Harold Spencer Jr. <[email protected]>

* Remove extra UUID package from dependencies (#821)

---------

Co-authored-by: husobee <[email protected]>
Co-authored-by: eV <[email protected]>
Co-authored-by: Ian Krieger <[email protected]>
Co-authored-by: Harold Spencer Jr. <[email protected]>
Sneagan added a commit that referenced this pull request Sep 19, 2025
* Fix metrics registration failure after panic (#830)

* Align Prod and Master (#836)

* Release Simple Dependency Removal (#828)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>

* Release 2025-09-19 (#834)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

* Fix metrics registration failure after panic (#830)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>

---------

Co-authored-by: husobee <[email protected]>
Co-authored-by: eV <[email protected]>
Co-authored-by: Ian Krieger <[email protected]>
Co-authored-by: Harold Spencer Jr. <[email protected]>

* Remove extra UUID package from dependencies (#821)

* Add alert implementation and simple test on init

---------

Co-authored-by: husobee <[email protected]>
Co-authored-by: eV <[email protected]>
Co-authored-by: Ian Krieger <[email protected]>
Co-authored-by: Harold Spencer Jr. <[email protected]>
Sneagan added a commit that referenced this pull request Sep 25, 2025
* Fix metrics registration failure after panic (#830)

* Align Prod and Master (#836)

* Release Simple Dependency Removal (#828)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>

* Release 2025-09-19 (#834)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

* Fix metrics registration failure after panic (#830)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>

---------

Co-authored-by: husobee <[email protected]>
Co-authored-by: eV <[email protected]>
Co-authored-by: Ian Krieger <[email protected]>
Co-authored-by: Harold Spencer Jr. <[email protected]>

* Remove extra UUID package from dependencies (#821)

* Improve pool management (#840)

* Improve pool management

* Update comments and skip lint defer expectation in loop

* Adjust error handling

---------

Co-authored-by: husobee <[email protected]>
Co-authored-by: eV <[email protected]>
Co-authored-by: Ian Krieger <[email protected]>
Co-authored-by: Harold Spencer Jr. <[email protected]>
Sneagan added a commit that referenced this pull request Sep 26, 2025
* Align Prod and Master (#836)

* Release Simple Dependency Removal (#828)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>

* Release 2025-09-19 (#834)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

* Fix metrics registration failure after panic (#830)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>

---------

Co-authored-by: husobee <[email protected]>
Co-authored-by: eV <[email protected]>
Co-authored-by: Ian Krieger <[email protected]>
Co-authored-by: Harold Spencer Jr. <[email protected]>

* Remove extra UUID package from dependencies (#821)

* Improve pool management (#840)

* Improve pool management

* Update comments and skip lint defer expectation in loop

* Adjust error handling

---------

Co-authored-by: husobee <[email protected]>
Co-authored-by: eV <[email protected]>
Co-authored-by: Ian Krieger <[email protected]>
Co-authored-by: Harold Spencer Jr. <[email protected]>
Sneagan added a commit that referenced this pull request Oct 2, 2025
* Fix metrics registration failure after panic (#830)

* Align Prod and Master (#836)

* Release Simple Dependency Removal (#828)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>

* Release 2025-09-19 (#834)

* Remove bat-go dialer

* Fix dependency naming

* Make log level configurable

* Add Integration Tests (#816)

* Add Integration Tests

- Kafka token issuance
- Kafka token redemption
- V1 HTTP Issuer GET endpoint
- V1 HTTP Token Redemption POST endpoint
- V3 HTTP Token Redemption POST endpoint

This checks the entire token redemption flow for all known production
use cases as well as checks that duplicate redemption fails. There are
some cases with time limited issuers where a token is issued for a
future issuer whose redemption fails. These tokens are tested and
verified to fail as expected.

* Tidy

* Correct test name

* Correct README

* Rename integration test file

* Address simple PR feedback

* Remove unneeded type

* Integration testing pr feedback (#827)

Address PR comments regarding testing structure.

* Enable specifying an integration test

* Address feedback

* Adjust import order

* Alter error handling for dynamo change

* Add suggested change

* Remove  dependency (#819)

* Add MSK TLS v1.2 to Kafka dialer (#826)

Co-authored-by: Jackson <[email protected]>

* Fix prod branch naming (#829)

* Fix metrics registration failure after panic (#830)

---------

Co-authored-by: Harold Spencer Jr. <[email protected]>

---------

Co-authored-by: husobee <[email protected]>
Co-authored-by: eV <[email protected]>
Co-authored-by: Ian Krieger <[email protected]>
Co-authored-by: Harold Spencer Jr. <[email protected]>

* Remove extra UUID package from dependencies (#821)

* Add alert implementation and simple test on init

---------

Co-authored-by: husobee <[email protected]>
Co-authored-by: eV <[email protected]>
Co-authored-by: Ian Krieger <[email protected]>
Co-authored-by: Harold Spencer Jr. <[email protected]>