Kafka is a streaming platform (it can also be used for messaging and message storage) that provides the best of both traditional messaging models: message queues and publish-subscribe. At the same time it serves as durable, distributed storage, guarantees the order of messages within a partition, and above all it is scalable.

Kafka APIs: Producer, Consumer, Streams (for creating data processing pipelines), and Connect. For more details see Kafka Introduction.

Kafka clusters can span multiple data centers (durable, distributed storage). Think of each topic as a folder in a file system and each event as a file in that folder. Because each Kafka topic (a stream of records) is stored as a partitioned log (the number of partitions is configurable) for a configurable retention period, multiple clients can read it in whatever way they want. Hence Kafka topics are multi-subscriber. There is also very little overhead per consumer: the only metadata that needs to be kept is the consumer's offset in each partition. T...
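To make the partition and offset idea concrete, here is a minimal in-memory sketch in Python. It is a toy model, not Kafka's client API: the names `ToyTopic` and `ToyConsumer` are hypothetical, and the CRC32 key hash is only a stand-in for a real partitioner. It shows why per-consumer overhead is small: each reader holds one integer per partition, and reading never removes records from the log.

```python
from collections import defaultdict
from zlib import crc32


class ToyTopic:
    """Toy model of a topic: a fixed number of append-only partition logs."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Records with the same key always land in the same partition,
        # so their relative order is preserved within that partition.
        # (CRC32 is an assumption made for this sketch.)
        p = crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def read(self, partition, offset, max_records=10):
        # Reading is non-destructive: the log keeps every record.
        return self.partitions[partition][offset:offset + max_records]


class ToyConsumer:
    """Each consumer keeps only its next offset for every partition it reads."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = defaultdict(int)

    def poll(self, partition, max_records=10):
        records = self.topic.read(partition, self.offsets[partition], max_records)
        self.offsets[partition] += len(records)
        return records


topic = ToyTopic(num_partitions=3)
partition, _ = topic.append("user-1", "login")
topic.append("user-1", "click")
topic.append("user-1", "logout")

# Two independent subscribers read the same partition at their own pace.
fast = ToyConsumer(topic)
slow = ToyConsumer(topic)

fast_values = [value for _, value in fast.poll(partition)]
slow_values = [value for _, value in slow.poll(partition, max_records=1)]
```

After this runs, `fast` has read all three events in order and `slow` has read only the first. Neither affects the other, because the only state that differs between them is the stored offset.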