Spark: Difference between revisions
Revision as of 14:49, 3 October 2015
http://spark.apache.org/
Apache Spark™ is a fast and general engine for large-scale data processing. Write applications quickly in Java, Scala, Python, R. Spark offers over 80 high-level operators that make it easy to build parallel apps. And you can use it interactively from the Scala, Python and R shells. Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application.
Links
- http://spark-packages.org/
Books
- Advanced Analytics with Spark: Patterns for Learning from Data at Scale, http://shop.oreilly.com/product/0636920035091.do
  Code: https://github.com/sryza/aas
Installation (from a Mac)
    wget http://apache.crihan.fr/dist/spark/spark-1.5.1/spark-1.5.1.tgz
    tar xvf spark-1.5.1.tgz
    cd spark-1.5.1
    more README.md
    build/mvn -DskipTests clean package
Interactive programming in Scala
    ./bin/spark-shell
    scala> sc.parallelize(1 to 1000).count()
Interactive programming in Python
    ./bin/pyspark
    >>> sc.parallelize(range(1000)).count()
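The one-liner above hides two steps: parallelize splits a local collection into partitions, and count adds up the number of elements found in each partition. The following is a minimal plain-Python sketch of that idea, not Spark code and not Spark's implementation. The names split_into_partitions and distributed_count are hypothetical, and a thread pool stands in for the cluster workers.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(data, num_partitions):
    """Split a sequence into contiguous slices (hypothetical helper, not a Spark API)."""
    items = list(data)
    n = len(items)
    return [
        items[i * n // num_partitions:(i + 1) * n // num_partitions]
        for i in range(num_partitions)
    ]


def distributed_count(data, num_partitions=4):
    """Count elements by counting each partition separately, then summing."""
    partitions = split_into_partitions(data, num_partitions)
    # Each partition is counted independently; threads play the role of workers here.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_counts = list(pool.map(len, partitions))
    # The driver only has to combine one small number per partition.
    return sum(partial_counts)


if __name__ == "__main__":
    print(distributed_count(range(1000)))
```

The point of the sketch is that only the per-partition counts travel back to the caller, never the data itself, which is why such an operation scales to collections far larger than this one.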
Running the bundled examples and the tests
    ./bin/run-example SparkPi
    MASTER=spark://host:7077 ./bin/run-example SparkPi
    ./dev/run-tests
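For readers wondering what the SparkPi example computes: it estimates pi with a Monte Carlo method, sampling random points in a square and counting how many fall inside the inscribed circle. Below is a plain-Python sketch of that estimate, under the assumption that a single process is enough for illustration. It does not use Spark, and the function name estimate_pi and its seed parameter are hypothetical.

```python
import random


def estimate_pi(num_samples, seed=None):
    """Monte Carlo estimate of pi (hypothetical helper, single process, no Spark)."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        # Draw a point uniformly in the square [-1, 1] x [-1, 1].
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        # Keep it if it lies inside the unit circle.
        if x * x + y * y <= 1.0:
            inside += 1
    # Area of circle / area of square = pi / 4.
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    print(estimate_pi(200_000, seed=42))
```

In the Spark version the sampling loop is spread across partitions and the per-partition hit counts are summed, so more samples can be drawn in the same wall-clock time.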