Difference between revisions of "Apache Flink"
Revision as of 09:19, 16 September 2016
Apache Flink® is an open source platform for distributed stream and batch data processing. Flink’s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams.
Getting started
Installation
wget http://www.apache.org/dyn/closer.lua/flink/flink-1.1.2/flink-1.1.2-bin-hadoop27-scala_2.11.tgz
tar xf flink-1.1.2-bin-hadoop27-scala_2.11.tgz
FLINK_HOME=~/flink-1.1.2
cd $FLINK_HOME
ls bin
ls examples
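The steps below use several terminals, and a variable assigned in one shell is not visible in the others, so FLINK_HOME has to be set again in each of them. A minimal sketch of a sanity check follows; the function name check_flink_home is hypothetical and not part of Flink.

```shell
# Hypothetical helper (not shipped with Flink): verify that a directory
# looks like a Flink installation before running any bin/ script.
check_flink_home() {
  # Use the first argument if given, otherwise fall back to $FLINK_HOME.
  dir="${1:-${FLINK_HOME:-}}"
  if [ -z "$dir" ]; then
    echo "FLINK_HOME is not set in this shell" >&2
    return 1
  fi
  if [ ! -d "$dir/bin" ]; then
    echo "no bin/ directory under $dir" >&2
    return 1
  fi
  echo "using Flink at $dir"
}
```

For example, in a fresh terminal: FLINK_HOME=~/flink-1.1.2, then check_flink_home && cd "$FLINK_HOME".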
Local Execution
Terminal 1: start Flink
cd $FLINK_HOME
bin/start-local.sh
Open the UI http://localhost:8081/#/overview
Run the SocketWindowWordCount example (source).
Terminal 2: Start netcat
nc -l 9000
Terminal 3: Submit the Flink program:
cd $FLINK_HOME
bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000
Terminal 2: Add words in netcat input
lorem ipsum ipsum ipsum ipsum bye
Terminal 4: Watch the job output
cd $FLINK_HOME
tail -f log/flink-*-jobmanager-*.out
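To get a feel for the totals to look for in the log, the same words can be counted offline with standard shell tools. This is only an illustration and not what Flink runs: the real job counts per time window, so its numbers depend on when the words arrive. The function name count_words and the "word : count" output format are choices made for this sketch.

```shell
# Offline word count over standard input (illustration only, not Flink).
count_words() {
  tr -s '[:space:]' '\n' |      # one word per line
    grep -v '^$' |              # drop empty lines
    sort |                      # group identical words together
    uniq -c |                   # prefix each word with its count
    awk '{print $2 " : " $1}'   # print as "word : count"
}

# Same words as typed into netcat above.
printf 'lorem ipsum ipsum ipsum ipsum bye\n' | count_words
# bye : 1
# ipsum : 4
# lorem : 1
```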
Terminal 1: stop Flink
cd $FLINK_HOME
bin/stop-local.sh
Shell
cd $FLINK_HOME
bin/start-scala-shell.sh local
TBC
Cluster execution
Amazon AWS EMR
Install AWS CLI
sudo apt-get install awscli
aws help
Configure the CLI with your AWS credentials (see http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html)
aws configure
NB: the credentials file is ~/.aws/credentials and the config file is ~/.aws/config
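The credentials file is an INI-style file in which each profile starts with a bracketed name such as [default]. As a quick check that the configuration step wrote something usable, the sketch below lists the profile names found in that file; list_aws_profiles is a hypothetical helper, not an AWS CLI command.

```shell
# Hypothetical helper: print the profile names defined in an AWS
# credentials file (default location: ~/.aws/credentials).
list_aws_profiles() {
  file="${1:-$HOME/.aws/credentials}"
  if [ ! -r "$file" ]; then
    echo "cannot read $file" >&2
    return 1
  fi
  # Keep only lines of the form [name] and strip the brackets.
  sed -n 's/^\[\(.*\)\]$/\1/p' "$file"
}
```

Calling list_aws_profiles with no argument after running aws configure would be expected to print at least default.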
Create a cluster on AWS EMR (Elastic MapReduce) in your AWS console (link).
File:AWS EMR Dashboard.png
The nodes of the EMR cluster are listed in the AWS EC2 panel of your AWS console.
Connect to the master node
ssh -i ~/.ssh/awskey.pem hadoop@ec2-52-12-35-67.eu-west-1.compute.amazonaws.com
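ssh refuses a private key whose file permissions are too open, which is a common cause of a failed first connection. A minimal sketch of a pre-flight step is shown below; prepare_key is a hypothetical helper, and the key path is the one used in the command above.

```shell
# Hypothetical helper: make sure the private key exists and is readable
# only by its owner, as ssh requires.
prepare_key() {
  key="$1"
  if [ ! -f "$key" ]; then
    echo "key not found: $key" >&2
    return 1
  fi
  chmod 400 "$key"
}

# Usage (host name as in the example above):
#   prepare_key ~/.ssh/awskey.pem && \
#     ssh -i ~/.ssh/awskey.pem hadoop@ec2-52-12-35-67.eu-west-1.compute.amazonaws.com
```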