Codec for any data type

When building a system that communicates with users or other services, we inevitably rely on data transformations to move data into and out of it. Most of the time these transformations are handled by the libraries and SDKs we use for external communication (for example, the Jason library to convert to and from JSON, or Ecto to map our data to a form that SQL databases understand). Although these tools cover the majority of our needs, sometimes we need to send arbitrary data: small functions, deeply nested structures containing tuples (which JSON cannot represent), or even process identifiers (PIDs).

In short: The goal of this pill is to demonstrate how you can transform any data structure or type into a binary format that can be sent over the network.

Our codec module in Elixir could look like:

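defmodule Codec do
  # Turns any Erlang/Elixir term into a Base64 string.
  def encode(term) do
    term
    # Serialize the term into Erlang's binary representation.
    |> :erlang.term_to_binary()
    # Losslessly compress the binary.
    |> :zlib.gzip()
    # Produce a plain ASCII string.
    |> Base.encode64()
  end

  # Reverses encode/1. Raises if the input is not valid Base64.
  def decode(encoded_term) when is_binary(encoded_term) do
    encoded_term
    |> Base.decode64!()
    |> :zlib.gunzip()
    |> :erlang.binary_to_term()
  end
end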

As we can see, encode imposes no restriction on the term it receives, and neither does Erlang, so it can be any data structure that Erlang understands. To encode, we first transform the term into a binary, then apply lossless compression to potentially reduce its size, and finally Base64-encode the result so that we end up with a plain string that any transport can carry. The decode function simply applies the inverse operations in reverse order, nothing fancy.
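To make the three steps concrete, here is a minimal sketch that runs them one at a time and then reverses them. The sample term and the variable names are made up for illustration. For very small terms the gzip header and trailer can make the compressed output larger than the input, which is why the sample deliberately contains repetitive data.

```elixir
# Illustrative term: a tuple, a long repetitive list, and the current PID.
term = %{user: {:id, 42}, rows: List.duplicate({:ok, "payload"}, 1_000), pid: self()}

# Step 1: serialize the term into a binary.
binary = :erlang.term_to_binary(term)

# Step 2: losslessly compress it.
compressed = :zlib.gzip(binary)

# Step 3: Base64-encode it into a plain string.
encoded = Base.encode64(compressed)

IO.puts("serialized: #{byte_size(binary)} bytes")
IO.puts("compressed: #{byte_size(compressed)} bytes")
IO.puts("encoded:    #{byte_size(encoded)} bytes")

# Decoding applies the inverse operations in reverse order.
decoded =
  encoded
  |> Base.decode64!()
  |> :zlib.gunzip()
  |> :erlang.binary_to_term()

# On the same node, the decoded term equals the original.
IO.inspect(decoded == term, label: "round trip ok?")
```

Note that Base64 adds roughly a third to the size of the compressed binary, so the final string is not necessarily smaller than the serialized term.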

As mentioned above, the resulting string can be sent over any network, which means we can send arbitrary data to other systems and let them interpret it, use these strings to query data from our system, or even store data this way.
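If the string is meant to travel inside a URL, for instance as a query parameter, the standard Base64 alphabet is a little awkward because it contains `+`, `/` and `=`. One possible variation, sketched below, swaps in the URL-safe alphabet. The `UrlCodec` module name and the `filter` parameter are hypothetical and only serve the example; this is an assumption about how one might adapt the Codec module above, not part of it.

```elixir
defmodule UrlCodec do
  # Hypothetical variant of Codec that uses the URL-safe Base64 alphabet.
  def encode(term) do
    term
    |> :erlang.term_to_binary()
    |> :zlib.gzip()
    |> Base.url_encode64(padding: false)
  end

  def decode(encoded_term) when is_binary(encoded_term) do
    encoded_term
    |> Base.url_decode64!(padding: false)
    |> :zlib.gunzip()
    |> :erlang.binary_to_term()
  end
end

# Encode a tuple (not representable in JSON) as a query parameter.
token = UrlCodec.encode({:age, :gt, 18})
query = URI.encode_query(%{"filter" => token})

# The receiving side parses the query string and decodes the term.
%{"filter" => received} = URI.decode_query(query)
IO.inspect(UrlCodec.decode(received))
```

The same string could just as well be written to a text column or a file and decoded later.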

Caveat: to encode or decode data this way, one needs an Erlang system, or at minimum an implementation of the serialization format Erlang uses (the External Term Format) in whatever other technology is involved. This is to be expected, as we are talking about Erlang structures here.
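To see what a non-Erlang consumer would be dealing with, the short sketch below peeks at the first bytes of each layer. Base64 and gzip are standard formats that most languages can undo with their own libraries; the innermost layer is the Erlang-specific one and needs a dedicated parser. The sample term is arbitrary.

```elixir
binary = :erlang.term_to_binary({:ok, [1, 2, 3]})

# The serialized term starts with the version tag 131 of Erlang's
# External Term Format. The match fails if that is not the case.
<<131, _rest::binary>> = binary

# The compressed binary starts with the standard gzip magic bytes.
gzipped = :zlib.gzip(binary)
<<0x1F, 0x8B, _::binary>> = gzipped

# The outermost layer is plain Base64 text.
encoded = Base.encode64(gzipped)

IO.inspect(binary, binaries: :as_binaries, label: "serialized term")
IO.inspect(encoded, label: "encoded string")
```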

As a finishing note I am leaving you with a Kangaroo family generated by DALL-E.