Apache Kafka
Latest revision as of 09:47, 1 August 2017
Apache Kafka is Publish-Subscribe messaging rethought as a distributed commit log.
Clients in Perl, Python, Node.js, C, C++, Scala, …: https://cwiki.apache.org/confluence/display/KAFKA/Clients
=First steps with Kafka=

see [https://kafka.apache.org/documentation.html#quickstart Quickstart]

<pre>
cd kafka
</pre>

Launch [[Zookeeper]]

<pre>
./bin/zookeeper-server-start.sh ./config/zookeeper.properties
</pre>

Launch Kafka server

<pre>
./bin/kafka-server-start.sh ./config/server.properties
</pre>

Create a topic

<pre>
./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
</pre>

Launch Kafka console producer (note that <code>--broker-list</code> takes the broker address, port 9092 by default, not the Zookeeper port)

<pre>
./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
</pre>

Launch Kafka console consumer

<pre>
./bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test
</pre>

Info on topic

<pre>
./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
</pre>
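Messages with the same key always land in the same partition, which is what preserves per-key ordering. A rough sketch of that key-based routing (hash of the key modulo partition count — a simplification: Kafka's real default partitioner uses murmur2, not md5):

```python
import hashlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Pick a partition from a message key (simplified sketch).

    Kafka's actual default partitioner uses murmur2; md5 is used here
    only to get a deterministic hash from the standard library.
    """
    h = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    return h % num_partitions

# Same key -> same partition, so per-key ordering is preserved.
print(partition_for(b"user-42", 4))
```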
==Replicated topics==

<pre>
# config/server-1.properties:
broker.id=1
port=9093
log.dir=/tmp/kafka-logs-1
</pre>

<pre>
# config/server-2.properties:
broker.id=2
port=9094
log.dir=/tmp/kafka-logs-2
</pre>

Launch extra servers

<pre>
./bin/kafka-server-start.sh config/server-1.properties &
./bin/kafka-server-start.sh config/server-2.properties &
</pre>

Create a replicated topic

<pre>
./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic
./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
</pre>

Launch Kafka console producer

<pre>
./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic
</pre>

Launch Kafka console consumer

<pre>
./bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic my-replicated-topic
</pre>
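The way the replicas of each partition are spread over brokers can be sketched as a round-robin assignment (a simplification: Kafka randomizes the starting broker, but the spreading pattern is the same idea):

```python
def assign_replicas(num_brokers: int, num_partitions: int,
                    replication_factor: int) -> dict:
    """Round-robin replica placement sketch: partition p's leader is
    broker p % num_brokers, its followers are the next brokers in order."""
    assignment = {}
    for p in range(num_partitions):
        assignment[p] = [(p + i) % num_brokers
                         for i in range(replication_factor)]
    return assignment

# 3 brokers, a topic like my-replicated-topic: 1 partition, RF 3
print(assign_replicas(3, 1, 3))  # {0: [0, 1, 2]}
```

The first broker in each list plays the leader role; `kafka-topics.sh --describe` shows the real assignment under `Leader`, `Replicas` and `Isr`.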
==Produce and Consume messages with [[Node.js]]==

TODO

Download the master zipfile from https://github.com/SOHU-Co/kafka-node/ (or git clone https://github.com/SOHU-Co/kafka-node.git)

<pre>
npm install kafka-node-master
cd kafka-node-master
cd example
node topics.js
node producer.js
</pre>

<pre>
node consumer.js
</pre>

See [[InfluxDB#Apache_Kafka_to_InfluxDB]] to archive the received messages in an [[InfluxDB]] database.
==Produce and Consume messages with [[Python]]==

TODO
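A minimal sketch using the third-party kafka-python package (assumes `pip install kafka-python` and a broker on localhost:9092; the topic name is just an example). The JSON helpers at the top only build the byte payloads and work with any client:

```python
import json

def encode_value(value) -> bytes:
    """Serialize a message value to UTF-8 JSON bytes."""
    return json.dumps(value).encode("utf-8")

def decode_value(raw: bytes):
    """Deserialize UTF-8 JSON bytes back into a Python object."""
    return json.loads(raw.decode("utf-8"))

def produce_one(topic: str, value) -> None:
    # Requires kafka-python and a running broker on localhost:9092.
    from kafka import KafkaProducer
    producer = KafkaProducer(bootstrap_servers="localhost:9092",
                             value_serializer=encode_value)
    producer.send(topic, value)
    producer.flush()

def consume_forever(topic: str) -> None:
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(topic,
                             bootstrap_servers="localhost:9092",
                             auto_offset_reset="earliest",
                             value_deserializer=decode_value)
    for message in consumer:
        print(message.offset, message.value)
```

Usage would be `produce_one("test", {"hello": "world"})` in one process and `consume_forever("test")` in another.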
==Produce and Consume messages with [[Node-RED]]==

TODO

https://www.npmjs.com/package/node-red-contrib-kafka-node

<pre>
npm install -g node-red-contrib-kafka-node
</pre>
==Stop all==

<pre>
./bin/kafka-server-stop.sh
./bin/zookeeper-server-stop.sh
</pre>

===Extra===

Launch Zookeeper shell

<pre>
./bin/zookeeper-shell.sh localhost:2181
</pre>
==Quickstart with [[Docker]]==

TODO
=Going further=

==Book==

O'Reilly book “Kafka: The Definitive Guide”, https://www.confluent.io/resources/kafka-definitive-guide-preview-edition/
==Kafka UI==

http://docs.datamountaineer.com/en/latest/ui.html#install
==[[Kafka REST Proxy]]==

It provides a RESTful interface to a Kafka cluster. The API is not documented with [[Swagger]] (i.e. [[OpenAPI]]).

* https://github.com/confluentinc/kafka-rest
* https://hub.docker.com/r/confluentinc/cp-kafka-rest/
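As a sketch, producing JSON messages through the REST Proxy v2 API is a `POST` to `/topics/<topic>` with a `records` array. The helper below only builds the request pieces (the proxy, assumed on its default port 8082, is not contacted here):

```python
import json

def build_rest_proxy_request(topic: str, values,
                             base_url="http://localhost:8082"):
    """Build URL, headers and body for the Kafka REST Proxy v2
    produce endpoint. The proxy itself is not called."""
    url = f"{base_url}/topics/{topic}"
    headers = {"Content-Type": "application/vnd.kafka.json.v2+json"}
    body = json.dumps({"records": [{"value": v} for v in values]})
    return url, headers, body

url, headers, body = build_rest_proxy_request("test", [{"hello": "world"}])
print(url)  # http://localhost:8082/topics/test
```

The resulting pieces can be sent with any HTTP client, e.g. `curl -X POST -H "Content-Type: …" --data "$body" "$url"`.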
==[[Kafka Connect]]==

https://drive.google.com/file/d/0B_0n2CoDWpWQbkVsSUZ2SC1aQkk/view?usp=sharing

It connects Kafka to data sources and sinks: File, MySQL, ELK, HDFS, … Connectors can be implemented through the Connector and Task interfaces.

''it provides out-of-the-box features like configuration management, offset storage, parallelization, error handling, support for different data types, and standard management REST APIs.'' (Chapter 7 of the O'Reilly book “Kafka: The Definitive Guide”)

* https://github.com/confluentinc/kafka-connect-jdbc
* https://github.com/confluentinc/kafka-connect-storage-cloud
* https://github.com/confluentinc/kafka-connect-hdfs

Confluent curates a list of open-source and commercial connectors: https://www.confluent.io/product/connectors/

Stream Reactor: a large open-source project of connectors ([[MQTT]], [[InfluxDB]], [[Azure DocumentDB]], [[MongoDB]], Blockchain…) written in [[Scala]]. It uses the [http://docs.datamountaineer.com/en/latest/kcql.html#kcql KCQL] query DSL.

* https://github.com/datamountaineer/stream-reactor
* https://github.com/datamountaineer/kafka-connect-tools
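As an illustration, the stock FileStreamSource connector that ships with Kafka can be declared with a small JSON config and submitted to the Connect REST API (`POST /connectors`, port 8083 by default). The connector name, file path, and topic below are made up for the example:

```json
{
  "name": "local-file-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/test.txt",
    "topic": "connect-test"
  }
}
```

Each line appended to `/tmp/test.txt` then becomes a message on the `connect-test` topic.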
===Cluster Replication with Kafka===

* Kafka MirrorMaker
* Confluent Replicator
* Uber uReplicator https://github.com/uber/uReplicator
==[[Kafka Streams]]==

An [[Event Stream Processing]] framework based on [[Apache Kafka|Kafka]] and [[Kafka Connect]].
==Schema Registry==

http://docs.confluent.io/current/schema-registry/docs/intro.html

[[Apache Avro]] is the preferred serializer of the Kafka ecosystem.

Confluent provides a registry of the [[Avro|Apache Avro]] schemas in use.

The Schema Registry sits outside the Apache Kafka project.
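When producing through the Confluent serializers, each message is framed with the registry's schema ID so consumers can fetch the matching schema. That wire format (one magic byte 0, a 4-byte big-endian schema ID, then the Avro-encoded payload) can be sketched as:

```python
import struct

MAGIC_BYTE = 0  # version marker of the Confluent wire format

def frame_message(schema_id: int, avro_payload: bytes) -> bytes:
    """Prefix an Avro payload with the Confluent wire-format header:
    magic byte (1 byte) + schema ID (4 bytes, big-endian)."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + avro_payload

def unframe_message(raw: bytes):
    """Split a framed message back into (schema_id, payload)."""
    magic, schema_id = struct.unpack(">bI", raw[:5])
    assert magic == MAGIC_BYTE, "not a Confluent-framed message"
    return schema_id, raw[5:]

framed = frame_message(42, b"avro-bytes")
print(unframe_message(framed))  # (42, b'avro-bytes')
```

The payload itself would be Avro-encoded with the schema registered under that ID; plain bytes stand in for it here.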