Custom PostgreSQL driver and adapter for Ecto

Pavel Tsiukhtsiayeu
9 min readApr 1, 2019

--

Ever wonder how does Ecto work with a database? What Postgrex, DBConnection, and EctoSQL are for? “Tear apart and put it back together” — that’s how I learn. Follow me along as I build DBConnection compliant PostgreSQL driver, write an adapter for Ecto, and put it all together in a fancy way. The goal of this exercise is to build a naive implementation of Postgrex and Ecto.Adapters.Postgres and to understand how everything is connected under the hood. Let’s start right away.

I. The driver

A database driver gives a higher level of abstraction to work with a database over the wire. It can talk to a database in a binary format and know how to handle the connection (UNIX socket/TCP/SSL), authentication, binary data encoding/decoding, data transfer, and who knows what else. That’s what Postgrex is for. It utilizes Erlang’s gen_tcp module, which is an interface for TCP/IP socket. Well, that’s a low level, isn’t it? At least for me. Googling around for not too long, and here it is — epgsql Erlang PostgreSQL client library. I will utilize it to communicate with the database.

Reading thru the docs I’ve found the very basic functions that I need for the start: connect, squery, and close. That should be sufficient. I will create a new Elixir project with a supervision tree and call it EpgsqlEx. Here we go:

mix new epgsql_ex

Add Erlang’s epgsql library as a dependency.

mix.exs

Get dependencies and compile the project:

mix do deps.get, compile

My PostgreSQL database is listening on port 5432 and has a user postgres with a password postgres. Also, I’ve prepared a PostgreSQL database named epgsql with a single table users in it: CREATE TABLE users (id SERIAL PRIMARY KEY, name VARCHAR(255))

Now I’m eager to see the epgsql library in action. I will fire up interactive elixir console via iex -S mix, start epsql process and ask it to run a query for me.

epgsql in action

Notice that connect function accepts keyword list values as charlist, not a string. That’s a common practice in Erlang.

Great! Let me wrap that Erlang library in Elixir’s GenServer behaviour, so I can refer to it in a way I used to in Elixir. I will do start_link/1 and stop/1 to start the linked process and stop it by a pid. Also, a client interface will have query/2 and query/3 to execute an arbitrary query given a pid, query as a String, and optional params for a query. epgsql can run extended queries with params. The extended query protocol combines parse, bind, and execute using the unnamed prepared statement and portal. That’s what query/3 does.

This is what EpgsqlEx.Proxy GenServer wrapper looks like:

lib/epgsl_ex/proxy.ex

Alright, recompile the code and give it a try:

EpgsqlEx.Proxy in action

Not bad. We have EpgsqlEx.Proxy running with a linked epgsql process handling the connection to PostgreSQL. Time to move on to DBConnection!

II. The DBConnection

Developers can implement their own SQL adapters by using Ecto.Adapters.SQL and by implementing the callbacks required by Ecto.Adapters.SQL.Connection for handling connections and performing queries. The connection handling and pooling for SQL adapters should be built using the DBConnection library.

As of now, EpgsqlEx.Proxy can establish a connection to a PostgreSQL database and interact with a server using a binary protocol over TCP. But there is much more to do: connection pool, detect faulty connections and reconnect with exponential backoff, keep the connection alive, transactions, prepare/execute, cursors, and quite a few more. That’s where DBConnection comes at a rescue. Besides all that, it also a requirement to implement DBConnection behaviour in order to use EpgsqlEx as a driver for Ecto adapter.

In short, DBConnection is a behaviour module for implementing efficient database connection client processes, pools and transactions.

Alright, let’s get to work. First I will add :db_connection dependency, get dependencies and compile the project:

mix.exs
mix do deps.get, compile

Next, I need to define a module that will implement DBConnection callbacks. I’ve looked at Postgrex and Mariaex drivers source code and figured out that EpgsqlEx.Protocol is the best name for the module. Also, there are quite a few callbacks to implement, but I plan to implement the minimum to get my code up and running, leaving the rest with a raise "Not implemented" implementation.

Using my magic gift of foresight, I plan to implement 8 callbacks:

  • connect/1
  • disconnect/2
  • handle_execute/4
  • ping/1
  • checkin/1
  • checkout/1
  • handle_prepare/3
  • handle_close/3

I was using IEx helper function b/1 to print callback info. Hex docs or source code should work as well.

b DBConnection.connect

Going thru all callbacks one at a time, finally, I came up with this implementation for my EpgsqlEx.Protocol

lib/epgsql_ex/protocol.ex

Fancy isn’t it. Let me try it out.

DBConnection.Query protocol not implemented for “SELECT * FROM users”

Hmm… Looks like query string that I pass to DBConnection.execute/4 and EpgsqlEx.Protocol.handle_execute/4 is not expected. Let me take another look at the docs:

OK. Let’s see what each of those types are:

Great! conn is a pid, as I’ve expected. params and result are any terms. query is the only arg that DBConnection behaviour cares about — it must implement DBConnection.Query protocol. Given all that, I plan to define a module EpgsqlEx.Query and a protocol implementation in a file lib/epgsql_ex/query.ex

defmodule EpgsqlEx.Query do
defstruct [:statement]
defimpl DBConnection.Query do
def parse(query, _opts), do: query
def describe(query, _opts), do: query def encode(_query, params, _opts), do: params def decode(_query, result, _opts), do: result
end
defimpl String.Chars do
alias EpgsqlEx.Query
def to_string(%{statement: sttm}) do
case sttm do
sttm when is_binary(sttm) -> IO.iodata_to_binary(sttm)
%{statement: %Query{} = q} -> String.Chars.to_string(q)
end
end
end
end

As you can see, I’ve also implemented String.Chars protocol using my magic gift of foresight again. I will need it later, when implementing SQL Adapter.

Also, I need to update my EpgsqlEx.Protocol.handle_execute/4 to accept new struct:

lib/epgsql_ex/protocol.ex

Can’t wait to see if it works. Back to IEx:

Hooray! It’s working. Now I have a bare minimum code to implement a DBConnection compliant driver for PostgreSQL, that in turn will be used by Ecto Adapter I will build in the next episode. Bear with me.

III. The adapter

EctoSQL provides building blocks for writing SQL adapters for Ecto.

It features:

  • The Ecto.Adapters.SQL module as an entry point for all SQL-based adapters
  • Default implementations for Postgres (Ecto.Adapters.Postgres) and MySQL (Ecto.Adapters.MySQL)
  • A test sandbox (Ecto.Adapters.SQL.Sandbox) that concurrently run database tests inside transactions
  • Support for database migrations via Mix tasks

First thing I did, was to read Ecto.Adapters.SQL module documentation once again. As I quoted at the very beginning of chapter two:

Developers can implement their own SQL adapters by using Ecto.Adapters.SQL and by implementing the callbacks required by Ecto.Adapters.SQL.Connection for handling connections and performing queries. The connection handling and pooling for SQL adapters should be built using the DBConnection library.

Not wasting time thinking, I’m adding {:ecto_sql, “~> 3.0.5”} dependency and compiling my project:

mix.exs
mix do deps.get, compile

When using Ecto.Adapters.SQL, the following options are required:

:driver (required) — the database driver library. For example: :postgrex

:migration_lock — the lock to use on migration locks. For example: “FOR UPDATE”. It may also be nil (for no lock). The user can still override this by setting :migration_lock in the repository configuration

On a second attempt, I ended up with a module EpgsqlEx.EctoAdapter and a hand full of warnings module EpgsqlEx.EctoAdapter.Connection is not available

lib/epgsql_ex/ecto_adapater.ex

Obviously, the next step I take is to define EpgsqlEx.EctoAdapter.Connection as required by Ecto.Adapters.SQL. It must implement Ecto.Adapters.SQL.Connection with 14 callbacks! I will use the not implemented stub for most of the callbacks, implementing only the bare minimum. Let`s give it a try.

lib/epgsql_ex/ecto_adapter/connection.ex

All OTP related callbacks are delegated to EpgsqlEx driver, which in turn utilizes DBConnection functionality. To keep everything at its’ place, I need to update lib/epgsql_ex.ex as follow:

lib/epgsql_ex.ex

To make sure no typos, I will compile a project at this stage with mix compile

All good. Let`s try it out! I will create another project Demo in a side folder with a supervision tree by running mix new --sup demo. Right from the start, I will add Ecto and EpgsqlEx as dependencies and compile the project with mix do deps.get, compile

mix.exs (demo project)

Now I will define User Ecto schema and a Repo

lib/user.ex and lib/repo.ex (demo project)

Newly defined Repo has to be added to a supervision tree of the Demo application. Head to lib/demo/application.ex and update children list:

lib/demo/application.ex (demo project)

Important part not to forget to update config/config.exs with a Repo config:

config/config.exs (demo project)

Those options will be passed to my EpgsqlEx driver via DBConnection hopefully. Let me see if the Demo app will start at this point.

That’s impressive. The application has successfully started. It failed to query the DB for User , but that was expected. :erlang.iolist_to_binary(:TODO) is a good sign — that is my stub in EpgsqlEx.EctoAdapter.Connection.all/1

I wonder what my supervision tree looks like at this point. :observer.start() will give me a good picture.

:observer.start()

You can see DBConnection in action here. That’s my connection pool created by DBConnection! Take a look for example at linked pids 226, 237, 249:

  • pid(0,226,0) — is a process for EpgsqlEx.Protocol
  • pid(0,237,0) — is a process for Epgsql.Proxy, which is a wrapper for :epgsql Erlang module
  • pid(0,249,0) — is a process for :epgsql TCP socket

You can kill any of those processes, and see DBConnection restart lost processes with new ones. Impressive!

The next step I take is to implement EpgsqlEx.EctoAdapter.Connection.all/1 under the EpgsqlEx project with a dummy result to satisfy Ecto expectations.

lib/epgsql_ex/ecto_adapter/connection.ex (epgsql_ex project)

Here we go — real data from PostgreSQL server. Fantastic! But Ecto expects different data, not the one :epgsql returns. Quick look at lib/ecto/adapters/sql.ex:556 and reading function docs in this module gave me a clear understanding that it expects Map like this: %{num_rows: integer(), rows: list([])}

After a few attempts, I’ve figured out EpgsqlEx.Query.decode/3 is in charge of decoding driver results in a more readable format. I will put a stub there for now:

lib/epgsql_ex/query.ex (epgsql_ex project)

Fingers crossed. Let see what I get.

Not bad at all. With all the stubs I’m so close to success. I need real data from DB. In order to do that, I need two things:

  1. Transform :epgsql results to %{num_rows: integer(), rows: list([])} at lib/epgsql_ex/query.ex and cast value types
  2. Traverse AST of Ecto.Query passed to EpgsqlEx.EctoAdapter.Connection at lib/epgsql_ex/ect_adapater/connection.ex and build a query to be consumed by EpgsqlEx.execute/4

The first one seems to be simpler, so I will start with it.

Basically, I need to convert a tuple of 3 elements {:ok, <columns>, <rows>} to %{num_rows: integer(), rows: list([])} That should be easy with recursion. I will take columns information and apply it to each row with a type cast function. Notice 3rd element in a column tuple{:column, "id", :int4, 23, 4, -1, 0} That’s a type I need to cast a value to.

Here what EpgsqlEx.Query.decode/3 turned out to be:

lib/epgsql_ex/query.ex (epgsql_ex project)
Real data from PostgreSQL

Now the hard part:

Traverse AST of Ecto.Query passed to EpgsqlEx.EctoAdapter.Connection and build a query to be consumed by EpgsqlEx.execute/4

Honestly, that’s the hardest part of this whole exercise. I’ve spent a couple of hours reading Ecto.Adapters.Postgres.Connection source code. All 1,135 lines of code went thru my brain, leaving me completely exhausted. Absolutely empty and depressed. Whoever wrote it (according to git blame José Valim it is) is a real smart ass.

This is my oversimplified version of a Connection behaviour implementation. No comments. Sorry…

lib/epgsql_ex/ecto_adapter/connection.ex (epgsql_ex project)

Final check in iex console within Demo project:

All done

That’s it. A naive implementation of PostgreSQL adapter and Ecto adapter is complete. All wired up together, and capable to perform a simple action of selecting all users from a table. That was fun!

Conclusion

It’s damn hard to write a database driver considering all edge cases. It’s even harder to write a reusable and pluggable adapter. Thank you Elixir community for sharing all that hard work for free.

--

--

Responses (2)