Apache Kafka

Apache Kafka is Publish-Subscribe messaging rethought as a distributed commit log.

http://kafka.apache.org/

Clients in Perl, Python, Node.js, C, C++, Scala, ...: https://cwiki.apache.org/confluence/display/KAFKA/Clients

=First steps with Kafka=

See the official Quickstart guide.

cd kafka

Launch Zookeeper: ./bin/zookeeper-server-start.sh ./config/zookeeper.properties

Launch the Kafka server: ./bin/kafka-server-start.sh ./config/server.properties

Create a topic: ./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

Launch the Kafka console producer (note: --broker-list takes the broker address, port 9092, not Zookeeper): ./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

Launch the Kafka console consumer: ./bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test

Info on the topic: ./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test

Replicated topics
 * 1) config/server-1.properties:
      broker.id=1
      port=9093
      log.dir=/tmp/kafka-logs-1
 * 2) config/server-2.properties:
      broker.id=2
      port=9094
      log.dir=/tmp/kafka-logs-2

Launch extra servers: ./bin/kafka-server-start.sh config/server-1.properties & ./bin/kafka-server-start.sh config/server-2.properties &

Create a replicated topic: ./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic

Info on the topic: ./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic

Launch the Kafka console producer (again, --broker-list targets a broker, e.g. port 9092): ./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic

Launch the Kafka console consumer: ./bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic my-replicated-topic

Produce and Consume messages with Node.js
TODO

Download master zipfile from https://github.com/SOHU-Co/kafka-node/ (or git clone https://github.com/SOHU-Co/kafka-node.git)

cd kafka-node-master
npm install

cd example
node topics.js
node producer.js

node consumer.js

See InfluxDB for archiving the received messages in an InfluxDB database.

Produce and Consume messages with Python
TODO
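Until this section is filled in, here is a minimal sketch using the third-party kafka-python package (pip install kafka-python). The broker address and topic name are the quickstart defaults above; the JSON message shape is a made-up example.

```python
# Hypothetical sketch using the third-party kafka-python client
# (pip install kafka-python). Broker address and topic are the
# quickstart defaults; the message shape is an example.
import json


def encode(message):
    """Serialize a message dict to the JSON bytes sent on the wire."""
    return json.dumps(message).encode("utf-8")


def produce(messages, bootstrap="localhost:9092", topic="test"):
    """Send a list of dicts to the topic (requires a running broker)."""
    from kafka import KafkaProducer  # imported lazily: needs the package
    producer = KafkaProducer(bootstrap_servers=bootstrap,
                             value_serializer=encode)
    for message in messages:
        producer.send(topic, message)
    producer.flush()  # block until all messages are acknowledged


def consume(bootstrap="localhost:9092", topic="test"):
    """Print every message on the topic, starting from the earliest offset."""
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap,
        auto_offset_reset="earliest",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")))
    for record in consumer:
        print(record.offset, record.value)


# Usage (with the quickstart broker running):
#   produce([{"sensor": "t1", "value": 21.5}])
#   consume()
```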

Produce and Consume messages with Node-RED
TODO

https://www.npmjs.com/package/node-red-contrib-kafka-node

npm install -g node-red-contrib-kafka-node

Stop all
./bin/kafka-server-stop.sh
./bin/zookeeper-server-stop.sh

Extra
Launch the Zookeeper shell: ./bin/zookeeper-shell.sh localhost:2181

Quickstart with Docker
TODO

=Going further=

Book
O'Reilly book “Kafka: The Definitive Guide”: https://www.confluent.io/resources/kafka-definitive-guide-preview-edition/

Kafka UI
http://docs.datamountaineer.com/en/latest/ui.html#install

Kafka REST Proxy
Provides a RESTful interface to a Kafka cluster. The API is not documented with Swagger (i.e. OpenAPI).
 * https://github.com/confluentinc/kafka-rest
 * https://hub.docker.com/r/confluentinc/cp-kafka-rest/
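As a sketch of what the proxy's v2 API looks like, the snippet below builds and sends a POST to /topics/{topic} with JSON-embedded records. It assumes the proxy is running on its default port 8082 and that the "test" topic exists.

```python
# Hypothetical sketch: producing JSON messages through the Kafka REST
# Proxy (v2 API). Assumes the proxy listens on its default port 8082
# and the "test" topic from the quickstart exists.
import json
import urllib.request


def build_request(topic, values, proxy="http://localhost:8082"):
    """Build the urllib Request for POST /topics/{topic}."""
    body = json.dumps({"records": [{"value": v} for v in values]})
    return urllib.request.Request(
        url="%s/topics/%s" % (proxy, topic),
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
        method="POST")


def produce(topic, values):
    """Send the records and return the proxy's JSON response."""
    with urllib.request.urlopen(build_request(topic, values)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (with the REST proxy running):
#   produce("test", [{"sensor": "t1", "value": 21.5}])
```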

Kafka Connect
https://drive.google.com/file/d/0B_0n2CoDWpWQbkVsSUZ2SC1aQkk/view?usp=sharing

Connects Kafka to data sources and sinks: File, MySQL, ELK, HDFS, … Custom connectors can be implemented via the Connector and Task interfaces. It provides out-of-the-box features like configuration management, offset storage, parallelization, error handling, support for different data types, and standard management REST APIs. (Chapter 7 of the O'Reilly book “Kafka: The Definitive Guide”)
 * https://github.com/confluentinc/kafka-connect-jdbc
 * https://github.com/confluentinc/kafka-connect-storage-cloud
 * https://github.com/confluentinc/kafka-connect-hdfs
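As a concrete example, the stock Kafka distribution ships a standalone file-source connector; the fragment below mirrors the config/connect-file-source.properties file shipped with Kafka (the file and topic names are the stock defaults):

```properties
# Stream the lines of test.txt into the connect-test topic
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test
```

Run it in standalone mode with: ./bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties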

Confluent maintains a list of open-source and commercial connectors: https://www.confluent.io/product/connectors/

Stream Reactor: a large open-source project of connectors (MQTT, InfluxDB, Azure DocumentDB, MongoDB, Blockchain, …) written in Scala. Uses the KCQL query DSL.
 * https://github.com/datamountaineer/stream-reactor
 * https://github.com/datamountaineer/kafka-connect-tools

Cluster Replication with Kafka

 * Kafka MirrorMaker
 * Confluent Replicator
 * Uber uReplicator https://github.com/uber/uReplicator

Kafka Streams
An Event Stream Processing framework built on Kafka and Kafka Connect.

Schema Registry
http://docs.confluent.io/current/schema-registry/docs/intro.html

Apache Avro is the serializer of choice in the Kafka ecosystem.

Confluent provides a registry for the Apache Avro schemas in use.

The Schema Registry is not part of the Apache Kafka project itself.
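For reference, an Avro schema is itself a JSON document. A minimal, made-up schema for a sensor reading might look like:

```json
{
  "type": "record",
  "name": "SensorReading",
  "fields": [
    {"name": "sensor", "type": "string"},
    {"name": "value", "type": "double"}
  ]
}
```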