Apache Kafka
Apache Kafka is Publish-Subscribe messaging rethought as a distributed commit log.
Clients in Perl, Python, Node.js, C, C++, Scala, ...: https://cwiki.apache.org/confluence/display/KAFKA/Clients
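To make the "distributed commit log" idea concrete, here is a minimal in-memory sketch in Python. It is not how Kafka is implemented: the CommitLog class and its method names are hypothetical, and distribution, partitioning and persistence are left out. It only illustrates that messages are appended with increasing offsets and that each consumer keeps its own read position, so several subscribers can read the same log independently.

```python
from typing import Dict, List


class CommitLog:
    """Toy append-only log (hypothetical; not Kafka's actual implementation)."""

    def __init__(self) -> None:
        self._records: List[str] = []
        # consumer name -> next offset to read
        self._positions: Dict[str, int] = {}

    def append(self, message: str) -> int:
        """Append a message and return its offset."""
        self._records.append(message)
        return len(self._records) - 1

    def poll(self, consumer: str, max_records: int = 10) -> List[str]:
        """Return unread messages for this consumer and advance its position."""
        start = self._positions.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._positions[consumer] = start + len(batch)
        return batch


log = CommitLog()
log.append("hello")
log.append("world")
# Reading does not remove records, so both consumers receive every message.
print(log.poll("consumer-a"))
print(log.poll("consumer-b"))
```

Keeping the read position per consumer, rather than deleting messages once they are delivered, is what lets a new subscriber replay the log from the start.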
First steps with Kafka
see Quickstart
cd kafka
Launch Zookeeper
./bin/zookeeper-server-start.sh ./config/zookeeper.properties
Launch Kafka server
./bin/kafka-server-start.sh ./config/server.properties
Create a topic
./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
Launch Kafka console producer
./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
Launch Kafka console consumer
./bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test
Info on topic
./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
Replicated topics
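Create one config file per extra broker (copies of config/server.properties with these settings changed)
# config/server-1.properties:
broker.id=1
port=9093
log.dir=/tmp/kafka-logs-1
# config/server-2.properties:
broker.id=2
port=9094
log.dir=/tmp/kafka-logs-2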
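The per-broker settings above follow a simple pattern, which the following Python sketch reproduces. The helper names broker_overrides and to_properties are hypothetical, and the base port 9092 is an assumption (it is the usual default for the first broker, so broker 1 gets 9093 and broker 2 gets 9094 as in the settings above).

```python
from typing import Dict


def broker_overrides(broker_id: int, base_port: int = 9092) -> Dict[str, str]:
    """Build the three per-broker settings for a given broker id."""
    return {
        "broker.id": str(broker_id),
        # assumption: ports are allocated as base_port + broker id
        "port": str(base_port + broker_id),
        "log.dir": f"/tmp/kafka-logs-{broker_id}",
    }


def to_properties(settings: Dict[str, str]) -> str:
    """Render settings as key=value lines, the format of a .properties file."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


for broker_id in (1, 2):
    print(f"# config/server-{broker_id}.properties")
    print(to_properties(broker_overrides(broker_id)))
```

Each broker needs a unique id, its own port and its own log directory because all three brokers run on the same machine here.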
Launch extra servers
./bin/kafka-server-start.sh config/server-1.properties &
./bin/kafka-server-start.sh config/server-2.properties &
Create a replicated topic
./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic
./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Launch Kafka console producer
./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic
Launch Kafka console consumer
./bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic my-replicated-topic
Produce and Consume messages with Node.js
TODO
Download master zipfile from https://github.com/SOHU-Co/kafka-node/ (or git clone https://github.com/SOHU-Co/kafka-node.git)
npm install kafka-node-master
cd kafka-node-master
cd example
node topics.js
node producer.js
node consumer.js
See InfluxDB#Apache_Kafka_to_InfluxDB to archive the received messages in an InfluxDB database.
Produce and Consume messages with Python
TODO
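Pending a fuller write-up, here is a minimal sketch of what producing and consuming could look like from Python. It makes several assumptions that are not in this page: it uses the third-party kafka-python package (pip install kafka-python), a broker listening on localhost:9092, and JSON-encoded message values. The function names produce, consume, encode and decode are hypothetical. The kafka imports are placed inside the functions so that the JSON helpers can be used without the package installed.

```python
import json
from typing import Any, List


def encode(value: Any) -> bytes:
    """Serialize a Python value to UTF-8 JSON bytes (Kafka stores raw bytes)."""
    return json.dumps(value).encode("utf-8")


def decode(raw: bytes) -> Any:
    """Inverse of encode()."""
    return json.loads(raw.decode("utf-8"))


def produce(topic: str, values: List[Any], servers: str = "localhost:9092") -> None:
    """Send each value to the topic, then wait until all sends have completed."""
    from kafka import KafkaProducer  # assumption: kafka-python is installed

    producer = KafkaProducer(bootstrap_servers=servers, value_serializer=encode)
    for value in values:
        producer.send(topic, value)
    producer.flush()
    producer.close()


def consume(topic: str, servers: str = "localhost:9092", timeout_ms: int = 5000) -> List[Any]:
    """Read the topic from the beginning; stop after timeout_ms without new messages."""
    from kafka import KafkaConsumer  # assumption: kafka-python is installed

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=servers,
        auto_offset_reset="earliest",
        consumer_timeout_ms=timeout_ms,
        value_deserializer=decode,
    )
    try:
        return [record.value for record in consumer]
    finally:
        consumer.close()
```

With the broker from "First steps with Kafka" running, calling produce("test", [{"msg": "hello"}]) and then consume("test") should return the messages stored in the topic test.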
Produce and Consume messages with Node-RED
TODO
https://www.npmjs.com/package/node-red-contrib-kafka-node
npm install -g node-red-contrib-kafka-node
Stop all
./bin/kafka-server-stop.sh
./bin/zookeeper-server-stop.sh
Extra
Launch Zookeeper shell
./bin/zookeeper-shell.sh localhost:2181
Quickstart with Docker
TODO
A bit more
Book
O'Reilly book “Kafka: The Definitive Guide”: https://www.confluent.io/resources/kafka-definitive-guide-preview-edition/
Kafka UI
http://docs.datamountaineer.com/en/latest/ui.html#install
Kafka REST Proxy
Provides a RESTful interface to a Kafka cluster. The API is not documented with Swagger (i.e. OpenAPI).
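As an illustration of the RESTful interface, the sketch below builds the HTTP request that would publish JSON messages to a topic. It assumes the Confluent REST Proxy v2 API: the address http://localhost:8082, the /topics/<topic> path and the application/vnd.kafka.json.v2+json content type are assumptions to check against the version you run, and build_produce_request is a hypothetical helper name. The request is only constructed, not sent.

```python
import json
import urllib.request
from typing import Any, List


def build_produce_request(base_url: str, topic: str, values: List[Any]) -> urllib.request.Request:
    """Build (but do not send) a POST that publishes JSON values to a topic."""
    body = json.dumps({"records": [{"value": v} for v in values]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/topics/{topic}",
        data=body,
        # assumption: v2 content type for JSON-embedded records
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
        method="POST",
    )


request = build_produce_request("http://localhost:8082", "test", [{"msg": "hello"}])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it (requires a running REST Proxy):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```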
Kafka Connect
https://drive.google.com/file/d/0B_0n2CoDWpWQbkVsSUZ2SC1aQkk/view?usp=sharing
Connects Kafka to data sources and sinks: File, MySQL, ELK, HDFS, … Connectors can be implemented using the Connector and Task interfaces.
It provides out-of-the-box features like configuration management, offset storage, parallelization, error handling, support for different data types, and standard management REST APIs. (Chapter 7 of the O'Reilly book “Kafka: The Definitive Guide”)
- https://github.com/confluentinc/kafka-connect-jdbc
- https://github.com/confluentinc/kafka-connect-storage-cloud
- https://github.com/confluentinc/kafka-connect-hdfs
Confluent maintains a list of open-source and commercial connectors: https://www.confluent.io/product/connectors/
Stream Reactor: a large open-source project of connectors (MQTT, InfluxDB, Azure DocumentDB, MongoDB, Blockchain…) written in Scala. It uses a query DSL called KCQL.
- https://github.com/datamountaineer/stream-reactor
- https://github.com/datamountaineer/kafka-connect-tools
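To show what the configuration management and management REST API of Kafka Connect look like in practice, here is a sketch that describes a file source connector and builds the request to register it. The assumptions: the FileStreamSource connector class bundled with Apache Kafka, a Connect worker at http://localhost:8083, and a hypothetical input file /tmp/input.txt. The helper names are hypothetical too, and the request is built but not sent.

```python
import json
import urllib.request
from typing import Dict


def file_source_connector(name: str, path: str, topic: str) -> Dict[str, object]:
    """Describe a file-to-topic connector as the JSON document Connect expects."""
    return {
        "name": name,
        "config": {
            "connector.class": "FileStreamSource",
            "tasks.max": "1",
            "file": path,    # hypothetical input file
            "topic": topic,  # topic that receives the lines of the file
        },
    }


def build_create_request(connect_url: str, connector: Dict[str, object]) -> urllib.request.Request:
    """Build (but do not send) the POST that registers the connector."""
    return urllib.request.Request(
        url=f"{connect_url}/connectors",
        data=json.dumps(connector).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


connector = file_source_connector("file-source-demo", "/tmp/input.txt", "test")
request = build_create_request("http://localhost:8083", connector)
print(request.get_method(), request.full_url)
print(json.dumps(connector, indent=2))
```

The connectors listed above (JDBC, HDFS, Stream Reactor, ...) are configured the same way, with a different connector.class and their own settings.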
Cluster Replication with Kafka
- Kafka MirrorMaker
- Confluent Replicator
- Uber uReplicator https://github.com/uber/uReplicator
Kafka Streams
Event stream processing framework based on Kafka and Kafka Connect
Schema Registry
http://docs.confluent.io/current/schema-registry/docs/intro.html
Apache Avro is the serializer of choice in the Kafka ecosystem.
Confluent provides a registry of the Apache Avro schemas in use.
The Schema Registry is not part of the Apache Kafka project.
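As a sketch of how an Avro schema might be registered, the example below defines a small record schema and builds the corresponding HTTP request. The assumptions: the Confluent Schema Registry REST API at http://localhost:8081, the /subjects/<subject>/versions path, the application/vnd.schemaregistry.v1+json content type, and the subject name test-value (topic name plus "-value"). The Reading schema and the helper name are hypothetical. The request is built but not sent.

```python
import json
import urllib.request
from typing import Dict

# Hypothetical Avro record schema for a sensor reading.
READING_SCHEMA: Dict[str, object] = {
    "type": "record",
    "name": "Reading",
    "namespace": "example",
    "fields": [
        {"name": "sensor", "type": "string"},
        {"name": "value", "type": "double"},
    ],
}


def build_register_request(registry_url: str, subject: str,
                           schema: Dict[str, object]) -> urllib.request.Request:
    """Build (but do not send) the POST that registers a schema under a subject."""
    # The schema travels as a JSON string nested inside the JSON payload.
    payload = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{registry_url}/subjects/{subject}/versions",
        data=payload,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )


request = build_register_request("http://localhost:8081", "test-value", READING_SCHEMA)
print(request.get_method(), request.full_url)
```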