Avro

Apache Avro
Avro is a remote procedure call and data serialization framework developed within Apache's Hadoop project. It uses JSON for defining data types and protocols, and serializes data in a compact binary format. Its primary use is in Apache Hadoop, where it can provide both a serialization format for persistent data, and a wire format for communication between Hadoop nodes, and from client programs to the Hadoop services. It is similar to Thrift, but does not require running a code-generation program when a schema changes (unless desired for statically-typed languages).

Languages with APIs
Though theoretically any language could use Avro, the following languages have APIs written for them.
 * Java
 * Scala
 * C Sharp
 * C
 * C++
 * Python
 * Ruby

Features
Avro provides:


 * Rich data structures.
 * A compact, fast, binary data format.
 * A container file, to store persistent data.
 * Remote procedure call (RPC).
 * Simple integration with dynamic languages. Code generation is not required to read or write data files nor to use or implement RPC protocols. Code generation as an optional optimization, only worth implementing for statically typed languages.

Schema Declaration
A Schema is represented in JSON by one of:

{"type": "typeName" ...attributes...} where typeName is either a primitive or derived type name, as defined below. Attributes not defined in this document are permitted as metadata, but must not affect the format of serialized data.
 * A JSON string, naming a defined type.
 * A JSON object, of the form:
 * A JSON array, representing a union of embedded types.

Primitive Types
The set of primitive type names is:


 * null: no value
 * boolean: a binary value
 * int: 32-bit signed integer
 * long: 64-bit signed integer
 * float: single precision (32-bit) IEEE 754 floating-point number
 * double: double precision (64-bit) IEEE 754 floating-point number
 * bytes: sequence of 8-bit unsigned bytes
 * string: unicode character sequence
 * Primitive types have no specified attributes.

Primitive type names are also defined type names. Thus, for example, the schema "string" is equivalent to:

{"type": "string"}