Difference between revisions of "Avro"
Jump to navigation
Jump to search
FAURE.ADRIEN (talk | contribs) |
FAURE.ADRIEN (talk | contribs) |
||
Line 1: | Line 1: | ||
− | = Apache Avro = |
+ | == Apache Avro == |
Avro is a remote procedure call and data serialization framework developed within Apache's Hadoop project. It uses JSON for defining data types and protocols, and serializes data in a compact binary format. Its primary use is in Apache Hadoop, where it can provide both a serialization format for persistent data, and a wire format for communication between Hadoop nodes, and from client programs to the Hadoop services. |
Avro is a remote procedure call and data serialization framework developed within Apache's Hadoop project. It uses JSON for defining data types and protocols, and serializes data in a compact binary format. Its primary use is in Apache Hadoop, where it can provide both a serialization format for persistent data, and a wire format for communication between Hadoop nodes, and from client programs to the Hadoop services. |
||
Line 9: | Line 9: | ||
* Scala |
* Scala |
||
* C Sharp |
* C Sharp |
||
− | * C |
+ | * C |
* C++ |
* C++ |
||
* Python |
* Python |
Revision as of 14:54, 6 April 2016
Apache Avro
Avro is a remote procedure call and data serialization framework developed within Apache's Hadoop project. It uses JSON for defining data types and protocols, and serializes data in a compact binary format. Its primary use is in Apache Hadoop, where it can provide both a serialization format for persistent data, and a wire format for communication between Hadoop nodes, and from client programs to the Hadoop services. It is similar to Thrift, but does not require running a code-generation program when a schema changes (unless desired for statically-typed languages).
Languages with APIs
Though theoretically any language could use Avro, the following languages have APIs written for them.
- Java
- Scala
- C Sharp
- C
- C++
- Python
- Ruby
Features
Avro provides:
- Rich data structures.
- A compact, fast, binary data format.
- A container file, to store persistent data.
- Remote procedure call (RPC).
- Simple integration with dynamic languages. Code generation is not required to read or write data files nor to use or implement RPC protocols. Code generation as an optional optimization, only worth implementing for statically typed languages.