The data a consumer receives for a topic may contain both compressed and uncompressed messages. The consumer iterator transparently decompresses compressed data and returns only uncompressed messages. Offset maintenance in the consumer is where things get a little tricky. In the ZooKeeper consumer, the consumed offset is updated each time a message is returned, and this consumed offset must be a valid fetch offset for failure recovery to work correctly. Since data is stored in compressed format on the broker, the only valid fetch offsets are compressed message boundaries. Hence, for compressed data, the consumed offset is advanced one compressed message at a time. This has the side effect of possible duplicates if a consumer fails: after a restart, fetching resumes at the last compressed message boundary, so any messages already returned from a partially consumed compressed message are delivered again. For uncompressed data, the consumed offset is advanced one message at a time.
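The offset behaviour described above can be illustrated with a small simulation. This is only a toy model, not Kafka's consumer code: `LogEntry`, `build_log`, `consume`, and `run_with_crash` are hypothetical names, and offsets are simplified to one slot per message rather than real broker offsets. An entry holding several messages stands in for one compressed message.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class LogEntry:
    """One fetchable unit in the toy log (hypothetical model, not Kafka's format)."""
    offset: int          # valid fetch offset where this entry starts
    next_offset: int     # valid fetch offset of the entry that follows
    messages: List[str]  # several messages model one compressed message


def build_log(batches: List[List[str]]) -> List[LogEntry]:
    """Lay batches out back to back, using one offset slot per message."""
    log, offset = [], 0
    for batch in batches:
        log.append(LogEntry(offset, offset + len(batch), batch))
        offset += len(batch)
    return log


def consume(log: List[LogEntry], fetch_offset: int) -> Iterator[Tuple[str, int]]:
    """Yield (message, consumed_offset) pairs, starting at a valid fetch offset."""
    consumed = fetch_offset
    for entry in log:
        if entry.offset < fetch_offset:
            continue  # entry lies entirely before the fetch offset
        for i, msg in enumerate(entry.messages):
            # The consumed offset only moves once the whole entry has been
            # returned, so it always sits on an entry boundary.
            if i == len(entry.messages) - 1:
                consumed = entry.next_offset
            yield msg, consumed


def run_with_crash(log: List[LogEntry], crash_after: int):
    """Consume crash_after messages, then restart from the last consumed offset."""
    delivered, checkpoint = [], 0
    for n, (msg, consumed) in enumerate(consume(log, 0), start=1):
        delivered.append(msg)
        checkpoint = consumed
        if n == crash_after:
            break  # simulated consumer failure
    redelivered = [msg for msg, _ in consume(log, checkpoint)]
    return delivered, checkpoint, redelivered


if __name__ == "__main__":
    # "a" and "e" are uncompressed; ["b", "c", "d"] is one compressed message.
    log = build_log([["a"], ["b", "c", "d"], ["e"]])
    print(run_with_crash(log, crash_after=3))
```

In this model, failing after `a`, `b`, and `c` have been returned leaves the consumed offset at the start of the compressed entry, so the restarted consumer sees `b` and `c` a second time. Failing exactly on an entry boundary produces no duplicates.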