Commit 71e343a

Scrap the connection test in the client metadata update
Previously, if the cluster metadata was giving us back a broker which we suspected was unavailable (since it was already in our 'dead' set) then we would wait for the connection, and mark it as unavailable if the connection failed (otherwise, we simply do what the cluster tells us and let the producers/consumers deal with the connection errors). This was handy since it let us back off nicely if a broker crashed and came back, retrying metadata until the cluster had caught up and moved the leader to a broker that was up.

I'm now of the opinion this was more trouble than it's worth, so scrap it. Among other things:
- it does network IO while holding an important mutex, which is a bad pattern to begin with (#263)
- it can mask real network errors behind "LeaderNotAvailable" (#272)

The unfortunate side-effect of scrapping it is that in the producer and consumer we are more likely to fail if we don't wait long enough for the cluster to fail over leadership. The real solution if that occurs is to wait longer in the correct spot (`RetryBackoff` in the producer, currently hard-coded to 10 seconds in the consumer) instead of this hack.
1 parent f441e78 commit 71e343a

1 file changed: +1 -10 lines changed

client.go

Lines changed: 1 addition & 10 deletions
@@ -544,16 +544,7 @@ func (client *Client) update(data *MetadataResponse) ([]string, error) {
 		client.metadata[topic.Name] = make(map[int32]*PartitionMetadata, len(topic.Partitions))
 		for _, partition := range topic.Partitions {
 			client.metadata[topic.Name][partition.ID] = partition
-			switch partition.Err {
-			case NoError:
-				broker := client.brokers[partition.Leader]
-				if _, present := client.deadBrokerAddrs[broker.Addr()]; present {
-					if connected, _ := broker.Connected(); !connected {
-						partition.Err = LeaderNotAvailable
-						toRetry[topic.Name] = true
-					}
-				}
-			case LeaderNotAvailable:
+			if partition.Err == LeaderNotAvailable {
 				toRetry[topic.Name] = true
 			}
 		}
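
The behaviour left after this change can be sketched in isolation as follows: a topic is queued for a metadata retry only when the cluster itself reports `LeaderNotAvailable` for one of its partitions, with no connection probing involved. The types below are simplified stand-ins written for this example (the error values are arbitrary, not the real protocol codes), and `topicsToRetry` is a hypothetical helper, not a function in `client.go`.

```go
package main

import "fmt"

// KError is a simplified stand-in for the library's error type.
// The numeric values here are arbitrary, not the real protocol codes.
type KError int

const (
	NoError KError = iota
	LeaderNotAvailable
)

// PartitionMetadata and TopicMetadata are trimmed-down stand-ins that keep
// only the fields touched by the diff above.
type PartitionMetadata struct {
	ID     int32
	Leader int32
	Err    KError
}

type TopicMetadata struct {
	Name       string
	Partitions []*PartitionMetadata
}

// topicsToRetry (hypothetical helper) mirrors the post-change loop: it trusts
// the error reported by the cluster and performs no network IO of its own.
func topicsToRetry(topics []*TopicMetadata) map[string]bool {
	toRetry := make(map[string]bool)
	for _, topic := range topics {
		for _, partition := range topic.Partitions {
			if partition.Err == LeaderNotAvailable {
				toRetry[topic.Name] = true
			}
		}
	}
	return toRetry
}

func main() {
	topics := []*TopicMetadata{
		{Name: "orders", Partitions: []*PartitionMetadata{
			{ID: 0, Leader: 1, Err: NoError},
			{ID: 1, Leader: -1, Err: LeaderNotAvailable},
		}},
		{Name: "events", Partitions: []*PartitionMetadata{
			{ID: 0, Leader: 2, Err: NoError},
		}},
	}
	retry := topicsToRetry(topics)
	fmt.Println(retry["orders"], retry["events"])
}
```

Because the loop only reads fields from the response, it is safe to run while holding a mutex, which was not true of the removed `broker.Connected()` call.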
