Apache Kafka


Apache Kafka is "publish-subscribe messaging rethought as a distributed commit log".

http://kafka.apache.org/

Clients in Perl, Python, Node.js, C, C++, Scala, etc.: https://cwiki.apache.org/confluence/display/KAFKA/Clients

First steps with Kafka

see Quickstart

cd kafka

Launch Zookeeper

./bin/zookeeper-server-start.sh ./config/zookeeper.properties

Launch Kafka server

./bin/kafka-server-start.sh ./config/server.properties

Create a topic

./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

Launch Kafka console producer

./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

Launch Kafka console consumer

./bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test

Info on topic

./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test

Replicated topics

# config/server-1.properties:
    broker.id=1
    port=9093
    log.dir=/tmp/kafka-logs-1
# config/server-2.properties:
    broker.id=2
    port=9094
    log.dir=/tmp/kafka-logs-2

Launch extra servers

./bin/kafka-server-start.sh config/server-1.properties &
./bin/kafka-server-start.sh config/server-2.properties &

Create a replicated topic

./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic
./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic


Launch Kafka console producer

./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic

Launch Kafka console consumer

./bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic my-replicated-topic

Produce and Consume messages with Node.js

TODO

Download master zipfile from https://github.com/SOHU-Co/kafka-node/ (or git clone https://github.com/SOHU-Co/kafka-node.git)

cd kafka-node-master
npm install

cd example
node topics.js
node producer.js
node consumer.js

See InfluxDB#Apache_Kafka_to_InfluxDB to archive received messages in an InfluxDB database.

Produce and Consume messages with Python

TODO
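As a starting point, a minimal sketch using the third-party kafka-python package (an assumption here: install it with `pip install kafka-python`; the broker address and topic name match the quickstart above):

```python
import json

def serialize(value):
    """Encode a Python object as UTF-8 JSON bytes for the Kafka wire format."""
    return json.dumps(value).encode("utf-8")

def produce_one(topic, value, bootstrap="localhost:9092"):
    # Requires a running broker and the kafka-python package.
    from kafka import KafkaProducer
    producer = KafkaProducer(bootstrap_servers=bootstrap, value_serializer=serialize)
    producer.send(topic, value)
    producer.flush()
    producer.close()

def consume_all(topic, bootstrap="localhost:9092"):
    # Reads the topic from the beginning and prints each message;
    # stops after 5 s without new messages.
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(topic, bootstrap_servers=bootstrap,
                             auto_offset_reset="earliest",
                             consumer_timeout_ms=5000)
    for message in consumer:
        print(message.value.decode("utf-8"))

# Usage, with a broker running:
#   produce_one("test", {"greeting": "hello from Python"})
#   consume_all("test")
```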

Produce and Consume messages with Node-RED

TODO

https://www.npmjs.com/package/node-red-contrib-kafka-node

npm install -g node-red-contrib-kafka-node

Stop all

./bin/kafka-server-stop.sh
./bin/zookeeper-server-stop.sh

Extra

Launch Zookeeper shell

./bin/zookeeper-shell.sh localhost:2181

Quickstart with Docker

TODO

Going further

Book

O'Reilly book "Kafka: The Definitive Guide": https://www.confluent.io/resources/kafka-definitive-guide-preview-edition/

Kafka UI

http://docs.datamountaineer.com/en/latest/ui.html#install


Kafka REST Proxy

The Kafka REST Proxy provides a RESTful interface to a Kafka cluster. The API is not documented with Swagger (i.e. OpenAPI).
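For illustration, a hedged sketch of producing JSON messages through the REST Proxy's v2 API, using only the standard library (the proxy URL assumes its default port 8082; the topic name is the quickstart's):

```python
import json
import urllib.request

def build_records(values):
    """Wrap plain values in the v2 envelope: {"records": [{"value": ...}, ...]}."""
    return json.dumps({"records": [{"value": v} for v in values]}).encode("utf-8")

def post_to_topic(topic, values, base_url="http://localhost:8082"):
    # Assumes a REST Proxy listening on its default port 8082.
    req = urllib.request.Request(
        url="%s/topics/%s" % (base_url, topic),
        data=build_records(values),
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage, with a proxy running:
#   post_to_topic("test", [{"n": 1}, {"n": 2}])
```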

Kafka Connect

https://drive.google.com/file/d/0B_0n2CoDWpWQbkVsSUZ2SC1aQkk/view?usp=sharing

Lets you connect Kafka to data sources and sinks: File, MySQL, ELK, HDFS, … Connectors can be implemented through the Connector and Task interfaces.

It provides out-of-the-box features like configuration management, offset storage, parallelization, error handling, support for different data types, and standard management REST APIs (chapter 7 of the O'Reilly book "Kafka: The Definitive Guide").
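As a sketch of how a connector is registered through that management REST API (default worker port 8083), the example below targets the FileStreamSourceConnector demo connector that ships with Kafka; the worker URL, connector name, and file path are assumptions:

```python
import json
import urllib.request

def file_source_config(name, path, topic):
    """Build the JSON body that Connect's REST API expects for a new connector."""
    return {
        "name": name,
        "config": {
            # FileStreamSourceConnector ships with Kafka as a demo connector.
            "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
            "tasks.max": "1",
            "file": path,
            "topic": topic,
        },
    }

def register(body, base_url="http://localhost:8083"):
    # Assumes a Connect worker on its default port 8083.
    req = urllib.request.Request(
        url=base_url + "/connectors",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage, with a Connect worker running:
#   register(file_source_config("file-source", "/tmp/input.txt", "test"))
```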

Confluent curates a list of open-source and commercial connectors: https://www.confluent.io/product/connectors/

Stream Reactor: a large open-source collection of connectors (MQTT, InfluxDB, Azure DocumentDB, MongoDB, Blockchain, …) written in Scala. It uses the KCQL query DSL.

Cluster Replication with Kafka

Kafka Streams

An event stream processing framework built on Kafka and Kafka Connect.

Schema Registry

http://docs.confluent.io/current/schema-registry/docs/intro.html

Apache Avro is the serializer of choice in the Kafka ecosystem.

Confluent provides a registry for the Apache Avro schemas in use.

The Schema Registry lives outside the Apache Kafka project.
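As an illustration, registering an Avro schema through the registry's REST API (the URL assumes the default port 8081; the `User` schema and `test-value` subject are made-up examples following the topic-name + "-value" subject convention):

```python
import json
import urllib.request

# Example Avro record schema (hypothetical).
USER_SCHEMA = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int"},
    ],
}

def registration_body(schema):
    """The registry expects the Avro schema as an escaped JSON string under "schema"."""
    return json.dumps({"schema": json.dumps(schema)}).encode("utf-8")

def register_schema(subject, schema, base_url="http://localhost:8081"):
    # Assumes a Schema Registry on its default port 8081; returns the schema id.
    req = urllib.request.Request(
        url="%s/subjects/%s/versions" % (base_url, subject),
        data=registration_body(schema),
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["id"]

# Usage, with a Schema Registry running:
#   register_schema("test-value", USER_SCHEMA)
```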