Jay Taylor's notes


Getting: In the middle of a leadership election, there is currently no leader for this partition and hence it is unavailable for writes. · Issue #272 · Shopify/sarama

Original source (github.com)
Tags: troubleshooting kafka github.com
Clipped on: 2016-04-27


Closed
luck02 opened this Issue on Feb 14, 2015 · 6 comments

Labels

bug

Milestone

No milestone

Assignee

No one assigned


3 participants

luck02 · wvanbergen · eapache

Sha:  888a760 
Kafka Version: kafka_2.9.2-0.8.2.0

I've set up a Kafka cluster with 3 nodes, following the quickstart in the Kafka documentation exactly.

Everything works fine with the quickstart tutorial: I can create topics, list them, publish and consume, including the topics listed in the code below. I.e., the CLI works for 'NewTopic' but the code errors out.

However, when I attempt to run the producer / simple producer code locally, I get this failure:

➜  LegacyProducer  go run main.go
> connected
panic: kafka server: In the middle of a leadership election, there is currently no leader for this partition and hence it is unavailable for writes.

goroutine 16 [running]:
runtime.panic(0x563240, 0x5)
        /usr/lib/golang/src/pkg/runtime/panic.c:279 +0xf5
main.main()
        /home/glucas/dev/goproj/src/github.com/luck02/LegacyProducer/main.go:70 +0x36f

goroutine 19 [finalizer wait]:
runtime.park(0x4133b0, 0x699808, 0x697ac9)
        /usr/lib/golang/src/pkg/runtime/proc.c:1369 +0x89
runtime.parkunlock(0x699808, 0x697ac9)
        /usr/lib/golang/src/pkg/runtime/proc.c:1385 +0x3b
runfinq()
        /usr/lib/golang/src/pkg/runtime/mgc0.c:2644 +0xcf
runtime.goexit()
        /usr/lib/golang/src/pkg/runtime/proc.c:1445

goroutine 23 [chan receive]:
github.com/shopify/sarama.(*Broker).responseReceiver(0xc208004180)
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/broker.go:359 +0xdd
github.com/shopify/sarama.*Broker.(github.com/shopify/sarama.responseReceiver)·fm()
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/broker.go:113 +0x26
github.com/shopify/sarama.withRecover(0xc208000510)
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/utils.go:42 +0x37
created by github.com/shopify/sarama.func·001
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/broker.go:113 +0x381

goroutine 27 [runnable]:
github.com/shopify/sarama.(*Client).backgroundMetadataUpdater(0xc208048090)
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/client.go:494 +0x1e0
github.com/shopify/sarama.*Client.(github.com/shopify/sarama.backgroundMetadataUpdater)·fm()
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/client.go:80 +0x26
github.com/shopify/sarama.withRecover(0xc208000680)
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/utils.go:42 +0x37
created by github.com/shopify/sarama.NewClient
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/client.go:80 +0x4b3

goroutine 44 [runnable]:
github.com/shopify/sarama.withRecover(0xc208001b70)
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/utils.go:32
created by github.com/shopify/sarama.safeAsyncClose
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/utils.go:51 +0x6c

goroutine 45 [runnable]:
github.com/shopify/sarama.withRecover(0xc208001b80)
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/utils.go:32
created by github.com/shopify/sarama.safeAsyncClose
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/utils.go:51 +0x6c

goroutine 34 [sleep]:
time.Sleep(0x12a05f200)
        /usr/lib/golang/src/pkg/runtime/time.goc:39 +0x31
net.func·019()
        /usr/lib/golang/src/pkg/net/dnsclient_unix.go:183 +0x56
created by net.loadConfig
        /usr/lib/golang/src/pkg/net/dnsclient_unix.go:212 +0x153

goroutine 47 [runnable]:
github.com/shopify/sarama.withRecover(0xc208001ba0)
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/utils.go:32
created by github.com/shopify/sarama.safeAsyncClose
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/utils.go:51 +0x6c

goroutine 46 [runnable]:
github.com/shopify/sarama.withRecover(0xc208001b90)
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/utils.go:32
created by github.com/shopify/sarama.safeAsyncClose
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/utils.go:51 +0x6c

goroutine 43 [runnable]:
github.com/shopify/sarama.func·003()
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/producer.go:235 +0x53
github.com/shopify/sarama.withRecover(0xc208001230)
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/utils.go:42 +0x37
created by github.com/shopify/sarama.(*Producer).Close
        /home/glucas/dev/goproj/src/github.com/shopify/sarama/producer.go:237 +0xae
exit status 2


Code:

func main() {
        client, err := sarama.NewClient("UniqueClientID", []string{"192.168.1.76:9092"}, sarama.NewClientConfig())
        if err != nil {
                panic(err)
        } else {
                fmt.Println("> connected")
        }
        defer client.Close()

        producer, err := sarama.NewSimpleProducer(client, nil)
        if err != nil {
                panic(err)
        }
        defer producer.Close()

        for {
                err = producer.SendMessage("my_topic", nil, sarama.StringEncoder("testing 123"))
                if err != nil {
                        panic(err)
                } else {
                        fmt.Println("> message sent")
                }
        }
}

wvanbergen (Shopify member):

Can you try:  kafka/bin/kafka-topics.sh --describe --topic my_topic --zookeeper localhost , and paste the output here?

luck02:

Hi, @wvanbergen

➜  kafka_2.9.2-0.8.2.0  bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic my_topic --from-beginning
herp
derp
terp
^CConsumed 3 messages
➜  kafka_2.9.2-0.8.2.0  bin/kafka-topics.sh --describe --topic my_topic --zookeeper localhost:2181            
Topic:my_topic  PartitionCount:1        ReplicationFactor:1     Configs:
        Topic: my_topic Partition: 0    Leader: 0       Replicas: 0     Isr: 0

code:

func main() {
        client, err := sarama.NewClient("herp23432", []string{"192.168.1.76:9092"}, sarama.NewClientConfig())
        if err != nil {
                panic(err)
        } else {
                fmt.Println("> connected")
        }
        defer client.Close()

        producer, err := sarama.NewSimpleProducer(client, nil)
        if err != nil {
                panic(err)
        }
        defer producer.Close()

        for {
                err = producer.SendMessage("my_topic", nil, sarama.StringEncoder("testing 123"))
                if err != nil {
                        panic(err)
                } else {
                        fmt.Println("> message sent")
                }
        }
}

code execution:

➜  LegacyProducer  go run main.go
> connected
panic: kafka server: In the middle of a leadership election, there is currently no leader for this partition and hence it is unavailable for writes.

goroutine 16 [running]:
runtime.panic(0x563240, 0x5)
        /usr/lib/golang/src/pkg/runtime/panic.c:279 +0xf5
main.main()
        /home/glucas/dev/goproj/src/github.com/luck02/LegacyProducer/main.go:70 +0x36f

etc.

Thanks :)
G

eapache (Shopify member):

Please check that the broker is actually reachable from where you are running your Go program (since it doesn't look like you're running it on the same machine) — could there be a firewall or routing problem in the way? I suspect we are simply reporting the wrong error, and the real error is a lower-level networking problem.

If that doesn't help, please set  sarama.Logger  to a useful value and provide that output as well.

eapache added the bug label on Feb 15, 2015
luck02:

It's not a simple network issue: 192.168.1.76 == localhost; it's all on my laptop, running more or less stock Fedora 21.

output of sarama debug:

➜  LegacyProducer  cat sarama.debug.log
PREFIX: 2015/02/15 19:59:38 client.go:42: Initializing new client
PREFIX: 2015/02/15 19:59:38 client.go:353: Fetching metadata for [] from broker localhost:9092
PREFIX: 2015/02/15 19:59:38 broker.go:112: Connected to broker localhost:9092
PREFIX: 2015/02/15 19:59:38 client.go:522: Registered new broker #0 at gclaptop:9092
PREFIX: 2015/02/15 19:59:38 client.go:82: Successfully initialized new client
PREFIX: 2015/02/15 19:59:38 producer.go:508: producer/flusher/0 starting up
PREFIX: 2015/02/15 19:59:38 broker.go:104: Failed to connect to broker gclaptop:9092: dial tcp: lookup gclaptop: no such host
PREFIX: 2015/02/15 19:59:38 client.go:310: Disconnecting Broker 0
PREFIX: 2015/02/15 19:59:38 producer.go:555: producer/flusher/0 state change to [closing] because dial tcp: lookup gclaptop: no such host
PREFIX: 2015/02/15 19:59:38 broker.go:135: Failed to close connection to broker gclaptop:9092: kafka: broker: not connected
PREFIX: 2015/02/15 19:59:38 utils.go:49: Error closing broker 0 : kafka: broker: not connected
PREFIX: 2015/02/15 19:59:38 producer.go:397: producer/leader state change to [retrying] on my_topic/0
PREFIX: 2015/02/15 19:59:38 producer.go:398: producer/leader abandoning broker 0 on my_topic/0
PREFIX: 2015/02/15 19:59:38 producer.go:603: producer/flusher/0 shut down
PREFIX: 2015/02/15 19:59:38 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:38 client.go:522: Registered new broker #0 at gclaptop:9092
PREFIX: 2015/02/15 19:59:39 broker.go:104: Failed to connect to broker gclaptop:9092: dial tcp: lookup gclaptop: no such host
PREFIX: 2015/02/15 19:59:39 client.go:366: Some partitions are leaderless, waiting 250ms for election... (3 retries remaining)
PREFIX: 2015/02/15 19:59:39 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:39 client.go:366: Some partitions are leaderless, waiting 250ms for election... (2 retries remaining)
PREFIX: 2015/02/15 19:59:39 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:39 client.go:366: Some partitions are leaderless, waiting 250ms for election... (1 retries remaining)
PREFIX: 2015/02/15 19:59:39 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:39 client.go:363: Some partitions are leaderless, but we're out of retries
PREFIX: 2015/02/15 19:59:39 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:39 client.go:366: Some partitions are leaderless, waiting 250ms for election... (3 retries remaining)
PREFIX: 2015/02/15 19:59:40 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:40 client.go:366: Some partitions are leaderless, waiting 250ms for election... (2 retries remaining)
PREFIX: 2015/02/15 19:59:40 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:40 client.go:366: Some partitions are leaderless, waiting 250ms for election... (1 retries remaining)
PREFIX: 2015/02/15 19:59:40 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:40 client.go:363: Some partitions are leaderless, but we're out of retries
PREFIX: 2015/02/15 19:59:40 producer.go:407: producer/leader state change to [flushing] on my_topic/0
PREFIX: 2015/02/15 19:59:40 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:40 producer.go:280: Producer shutting down.
PREFIX: 2015/02/15 19:59:40 client.go:366: Some partitions are leaderless, waiting 250ms for election... (3 retries remaining)
PREFIX: 2015/02/15 19:59:40 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:40 client.go:366: Some partitions are leaderless, waiting 250ms for election... (2 retries remaining)
PREFIX: 2015/02/15 19:59:41 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:41 client.go:366: Some partitions are leaderless, waiting 250ms for election... (1 retries remaining)
PREFIX: 2015/02/15 19:59:41 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:41 client.go:363: Some partitions are leaderless, but we're out of retries
PREFIX: 2015/02/15 19:59:41 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:41 client.go:366: Some partitions are leaderless, waiting 250ms for election... (3 retries remaining)
PREFIX: 2015/02/15 19:59:41 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:41 client.go:366: Some partitions are leaderless, waiting 250ms for election... (2 retries remaining)
PREFIX: 2015/02/15 19:59:41 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:41 client.go:366: Some partitions are leaderless, waiting 250ms for election... (1 retries remaining)
PREFIX: 2015/02/15 19:59:42 client.go:353: Fetching metadata for [my_topic] from broker localhost:9092
PREFIX: 2015/02/15 19:59:42 client.go:363: Some partitions are leaderless, but we're out of retries
PREFIX: 2015/02/15 19:59:42 producer.go:410: producer/leader state change to [normal] after "kafka server: In the middle of a leadership election, there is currently no leader for this partition and hence it is unavailable for writes." on my_topic/0
PREFIX: 2015/02/15 19:59:42 client.go:101: Closing Client

And now it looks like the client is doing some sort of host lookup to complete the connection:
"PREFIX: 2015/02/15 19:59:38 broker.go:104: Failed to connect to broker gclaptop:9092: dial tcp: lookup gclaptop: no such host"

gclaptop is my machine name. Going to add that to hosts...
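(For reference, the workaround described here amounts to one hosts entry mapping the advertised hostname to the broker's address; the hostname and IP below are the ones from this report.)

```shell
# Let the client resolve the name the broker advertises via ZooKeeper:
echo "192.168.1.76 gclaptop" | sudo tee -a /etc/hosts
```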

And that solved the problem, thanks @eapache and @wvanbergen

luck02 closed this on Feb 15, 2015

eapache:

Kafka advertises itself by hostname through ZooKeeper, which can lead to some funny behaviours like this. Sarama can do a better job here by returning the right error in the first place, though; I will try and fix that.
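(A cleaner broker-side fix on Kafka 0.8.x is to make the broker advertise a resolvable address rather than the bare machine hostname, via `advertised.host.name` in `server.properties`; the IP below is the one from this report. Config fragment only:)

```properties
# config/server.properties (Kafka 0.8.x)
# Advertise a resolvable address instead of the machine hostname:
advertised.host.name=192.168.1.76
advertised.port=9092
```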

Oh interesting, so Kafka registers itself via hostname; that makes some sense. I wonder how the console commands work?

In any case, thanks a lot.
