Kafka Test Questions
Which 3 properties are required when creating a producer? - answer bootstrap.servers,
key.serializer,
value.serializer
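A minimal sketch (Java client) using only these three required settings; the broker address localhost:9092 and the topic name demo-topic are placeholder assumptions:
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class MinimalProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");               // assumed broker address
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // topic name is illustrative
                producer.send(new ProducerRecord<>("demo-topic", "key1", "hello"));
            }
        }
    }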
Name All the Kafka Ports - answer Kafka default ports:
9092, can be changed in server.properties;
zookeeper default ports:
2181 for client connections;
2888 for follower (other ZooKeeper nodes) connections to the leader;
3888 for inter-node (leader election) connections;
KSQL default ports:
8088 server listener;
Name Internal Kafka Cluster Topics - answerThere are several types of internal Kafka
topics:
__consumer_offsets is used to store offset commits per topic/partition.
__transaction_state is used to keep state for Kafka producers and consumers using
transactional semantics.
_schemas is used by Schema Registry to store all the schemas, metadata and
compatibility configuration.
Name Internal Kafka Streams Topics - answerThe following three topics are examples of internal topics used by Kafka Streams. The first two are changelogs for join state stores; the third is the changelog of a custom RocksDB-backed persistent StateStore:
{consumer-group}--KSTREAM-JOINOTHER-000000000X-store-changelog
{consumer-group}--KSTREAM-JOINTHIS-000000000X-store-changelog
{consumer-group}--incompleteMessageStore-changelog
Name Internal Kafka Connect Topics - answerconnect-configs stores connector and task configurations,
connect-status stores the current status of connectors and tasks, and
connect-offsets stores source offsets for source connectors.
What are the functions of ZooKeeper? - answer1) Store dynamic topic configurations
2) ACL information
3) Controller registration
4) Broker registration
How can you change the producer's batching configuration? - answerProducers can adjust batching configuration parameters:
◦ batch.size: maximum batch size in bytes (default: 16 KB)
◦ linger.ms: time to wait for messages to batch together (default: 0, i.e., send immediately)
▪ High throughput: large batch.size and linger.ms, or flush manually
▪ Low latency: small batch.size and linger.ms
The internal sender thread which pushes data from the producer to the brokers is triggered by two thresholds: batch.size and linger.ms. Batching provides higher throughput due to larger data transfers and fewer RPCs, but at the cost of higher latency due to the time it takes to accumulate the batches.
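A hedged sketch of a throughput-oriented producer configuration; the values 64 KB and 20 ms are illustrative, not recommendations, and the broker address is assumed:
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class BatchingTunedProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");               // assumed broker
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());
            // Throughput-oriented: wait up to 20 ms to accumulate batches of up to 64 KB per partition.
            props.put("batch.size", "65536");   // default 16384 bytes
            props.put("linger.ms", "20");       // default 0 (send immediately)
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // ... send records; the sender thread ships a batch when either threshold is hit
            }
        }
    }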
What does stateless message processing mean? - answerStateless means processing
of each message depends only on the message.
For example, converting from JSON to Avro or filtering a stream are both stateless
operations.
Give some examples of stateless transformation operations. - answerExamples include (see the sketch below):
filter,
map,
mapValues,
flatMap,
flatMapValues,
groupBy,
branch,
foreach
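A minimal Kafka Streams sketch showing two of these stateless operations (filter and mapValues); the application id, broker address, and topic names are assumptions for illustration:
    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.KStream;

    public class StatelessOpsExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("application.id", "stateless-demo");                  // illustrative app id
            props.put("bootstrap.servers", "localhost:9092");               // assumed broker
            props.put("default.key.serde", Serdes.String().getClass().getName());
            props.put("default.value.serde", Serdes.String().getClass().getName());

            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> input = builder.stream("input-topic");  // topic names are illustrative

            input.filter((key, value) -> value != null && !value.isEmpty()) // stateless: drop empty values
                 .mapValues(value -> value.toUpperCase())                   // stateless: transform value only
                 .to("output-topic");

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
        }
    }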
Describe the acks settings. - answerThe acks setting is a producer setting. It represents
the number of acknowledgments the Producer requires the leader to have before
considering the request complete. This controls the durability of records.
• acks=0: Producer will not wait for any acknowledgment from the server
• acks=1: Producer will wait until the leader has written the record to its local log
• acks=all: Producer will wait until all in-sync replicas have acknowledged
receipt of the record.
The acks config parameter determines behavior of Producer when sending messages.
Use this to configure the durability of messages being sent.
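A small sketch showing how acks would be set on the producer; the broker address and serializers are assumed, and only the acks line is the point here:
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class AcksConfigExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");               // assumed broker
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());
            // Strongest durability: wait for all in-sync replicas to acknowledge each record.
            props.put("acks", "all");
            // Alternatives: "0" (no acknowledgment) or "1" (leader only).
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // ... produce as usual
            }
        }
    }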
Define max.in.flight.requests.per.connection. What is the default? - answermax.in.flight.requests.per.connection is the maximum number of unacknowledged ("in-flight") produce requests the producer will send on a single connection before blocking. An in-flight request is a produce request that has not yet been acknowledged by the broker. The default is 5.
What is the risk of increasing max.in.flight.requests.per.connection while also enabling retries in the producer? - answerIf retries > 0 and max.in.flight.requests.per.connection > 1, then out-of-order messages may happen.
For example, assume you send message batches m1, m2, and m3 with max.in.flight.requests.per.connection=3. If all three message batches are in-flight but only m2 fails, then only m2 is retried. However, since m1 and m3 are already in-flight they will likely arrive first. The message batches will arrive in the order m1, m3, m2, i.e., out of order.
Note that retries > 0 is "at least once" delivery and may introduce duplicates, especially if used with acks=all.
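A minimal sketch of one way to keep ordering while retrying, by capping in-flight requests at 1; only the ordering-related settings are shown, and the retries value is illustrative:
    import java.util.Properties;

    public class OrderingSafeRetries {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");               // assumed broker
            props.put("retries", "2147483647");                             // retry aggressively (illustrative)
            // Only one unacknowledged request at a time, so a retried batch cannot overtake a later one.
            props.put("max.in.flight.requests.per.connection", "1");
            // Alternative on current clients: enable.idempotence=true preserves ordering
            // even with up to 5 in-flight requests.
        }
    }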
How do you turn on exactly-once semantics? - answerFor a single partition, idempotent producer sends remove the possibility of duplicate messages due to producer or broker errors.
To turn on this feature and get exactly-once semantics per partition, meaning no duplicates, no data loss, and in-order semantics, configure your producer with enable.idempotence=true.
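A hedged sketch of enabling the idempotent producer; the broker address is assumed:
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class IdempotentProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");               // assumed broker
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());
            // Exactly-once per partition: no duplicates or reordering from producer/broker retries.
            // enable.idempotence=true implies acks=all and retries > 0 in the Java client.
            props.put("enable.idempotence", "true");
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // ... produce as usual; the broker de-duplicates retried batches via sequence numbers
            }
        }
    }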
Which consumer setting do you use to associate consumers of the same topic into a consumer group? - answergroup.id
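A minimal consumer sketch showing group.id; the group name orders-service, the topic orders, and the broker address are illustrative assumptions:
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class GroupedConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");                  // assumed broker
            props.put("key.deserializer", StringDeserializer.class.getName());
            props.put("value.deserializer", StringDeserializer.class.getName());
            // All consumers sharing this group.id split the topic's partitions among themselves.
            props.put("group.id", "orders-service");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("orders"));
                // ... poll in a loop as usual
            }
        }
    }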
Describe the min.insync.replicas configuration parameter - answerThe min.insync.replicas parameter is a topic parameter.
Use min.insync.replicas with acks=all.
min.insync.replicas combined with acks=all defines the minimum number of in-sync replicas (ISR) that must acknowledge a write for a produce request to succeed.
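A hedged AdminClient sketch that creates a topic with min.insync.replicas=2; the topic name, partition count, and replication factor are illustrative:
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.NewTopic;

    public class MinIsrTopic {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");                  // assumed broker
            try (AdminClient admin = AdminClient.create(props)) {
                // replication.factor=3, and at least 2 replicas must be in sync
                // for acks=all writes to this topic to succeed.
                NewTopic topic = new NewTopic("payments", 6, (short) 3)        // illustrative name/partitions
                        .configs(Map.of("min.insync.replicas", "2"));
                admin.createTopics(List.of(topic)).all().get();
            }
        }
    }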
What is the Confluent Schema Registry? - answerThe Confluent Schema Registry is your safeguard against incompatible schema changes and is the component that ensures no breaking schema evolution is possible.
Kafka brokers do not look at your payload or your payload schema, and therefore will not reject data.
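A hedged sketch of a producer that routes values through Schema Registry via the Confluent Avro serializer; it assumes the kafka-avro-serializer dependency is on the classpath and a registry at http://localhost:8081:
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class SchemaRegistryProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");                  // assumed broker
            props.put("schema.registry.url", "http://localhost:8081");         // assumed registry address
            props.put("key.serializer", StringSerializer.class.getName());
            // The Avro serializer registers/validates schemas against Schema Registry on produce,
            // so an incompatible schema change fails in the client rather than polluting the topic.
            props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
            try (KafkaProducer<String, Object> producer = new KafkaProducer<>(props)) {
                // ... send org.apache.avro.generic.GenericRecord values here
            }
        }
    }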
Producing messages with a key does what? - answerThe key will help determine which
topic partition the message will go into.
Keys are necessary if you require strong ordering or grouping for messages that share
the same key.
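A minimal sketch showing keyed sends; with the default partitioner, records sharing the key customer-42 land on the same partition of the illustrative orders topic, so they keep their relative order:
    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class KeyedProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");               // assumed broker
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Same key => same partition (the default partitioner hashes the key).
                producer.send(new ProducerRecord<>("orders", "customer-42", "order-created"));
                producer.send(new ProducerRecord<>("orders", "customer-42", "order-shipped"));
            }
        }
    }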