Custom PostgreSQL driver and adapter for Ecto
Ever wonder how does Ecto work with a database? What Postgrex, DBConnection, and EctoSQL are for? “Tear apart and put it back together” — that’s how I learn. Follow me along as I build DBConnection compliant PostgreSQL driver, write an adapter for Ecto, and put it all together in a fancy way. The goal of this exercise is to build a naive implementation of Postgrex and Ecto.Adapters.Postgres and to understand how everything is connected under the hood. Let’s start right away.
I. The driver
A database driver gives a higher level of abstraction to work with a database over the wire. It can talk to a database in a binary format and know how to handle the connection (UNIX socket/TCP/SSL), authentication, binary data encoding/decoding, data transfer, and who knows what else. That’s what Postgrex is for. It utilizes Erlang’s gen_tcp module, which is an interface for TCP/IP socket. Well, that’s a low level, isn’t it? At least for me. Googling around for not too long, and here it is — epgsql Erlang PostgreSQL client library. I will utilize it to communicate with the database.
Reading thru the docs I’ve found the very basic functions that I need for the start: connect, squery, and close. That should be sufficient. I will create a new Elixir project with a supervision tree and call it EpgsqlEx. Here we go:
Add Erlang’s epgsql library as a dependency.
Get dependencies and compile the project:
My PostgreSQL database is listening on port 5432 and has a user postgres
with a password postgres
. Also, I’ve prepared a PostgreSQL database named epgsql
with a single table users
in it: CREATE TABLE users (id SERIAL PRIMARY KEY, name VARCHAR(255))
Now I’m eager to see the epgsql library in action. I will fire up interactive elixir console via iex -S mix
, start epsql
process and ask it to run a query for me.
Notice that connect
function accepts keyword list values as charlist, not a string. That’s a common practice in Erlang.
Great! Let me wrap that Erlang library in Elixir’s GenServer behaviour, so I can refer to it in a way I used to in Elixir. I will do start_link/1
and stop/1
to start the linked process and stop it by a pid. Also, a client interface will have query/2
and query/3
to execute an arbitrary query given a pid, query as a String, and optional params for a query. epgsql can run extended queries with params. The extended query protocol combines parse, bind, and execute using the unnamed prepared statement and portal. That’s what query/3
does.
This is what EpgsqlEx.Proxy
GenServer wrapper looks like:
Alright, recompile the code and give it a try:
Not bad. We have EpgsqlEx.Proxy
running with a linked epgsql
process handling the connection to PostgreSQL. Time to move on to DBConnection!
II. The DBConnection
Developers can implement their own SQL adapters by using
Ecto.Adapters.SQL
and by implementing the callbacks required byEcto.Adapters.SQL.Connection
for handling connections and performing queries. The connection handling and pooling for SQL adapters should be built using the DBConnection library.
As of now, EpgsqlEx.Proxy
can establish a connection to a PostgreSQL database and interact with a server using a binary protocol over TCP. But there is much more to do: connection pool, detect faulty connections and reconnect with exponential backoff, keep the connection alive, transactions, prepare/execute, cursors, and quite a few more. That’s where DBConnection comes at a rescue. Besides all that, it also a requirement to implement DBConnection behaviour in order to use EpgsqlEx as a driver for Ecto adapter.
In short, DBConnection is a behaviour module for implementing efficient database connection client processes, pools and transactions.
Alright, let’s get to work. First I will add :db_connection
dependency, get dependencies and compile the project:
Next, I need to define a module that will implement DBConnection callbacks. I’ve looked at Postgrex and Mariaex drivers source code and figured out that EpgsqlEx.Protocol
is the best name for the module. Also, there are quite a few callbacks to implement, but I plan to implement the minimum to get my code up and running, leaving the rest with a raise "Not implemented"
implementation.
Using my magic gift of foresight, I plan to implement 8 callbacks:
connect/1
disconnect/2
handle_execute/4
ping/1
checkin/1
checkout/1
handle_prepare/3
handle_close/3
I was using IEx helper function b/1
to print callback info. Hex docs or source code should work as well.
Going thru all callbacks one at a time, finally, I came up with this implementation for my EpgsqlEx.Protocol
Fancy isn’t it. Let me try it out.
Hmm… Looks like query
string that I pass to DBConnection.execute/4
and EpgsqlEx.Protocol.handle_execute/4
is not expected. Let me take another look at the docs:
OK. Let’s see what each of those types are:
Great! conn
is a pid
, as I’ve expected. params
and result
are any terms. query
is the only arg that DBConnection behaviour cares about — it must implement DBConnection.Query
protocol. Given all that, I plan to define a module EpgsqlEx.Query
and a protocol implementation in a file lib/epgsql_ex/query.ex
defmodule EpgsqlEx.Query do
defstruct [:statement] defimpl DBConnection.Query do
def parse(query, _opts), do: query def describe(query, _opts), do: query def encode(_query, params, _opts), do: params def decode(_query, result, _opts), do: result
end defimpl String.Chars do
alias EpgsqlEx.Query def to_string(%{statement: sttm}) do
case sttm do
sttm when is_binary(sttm) -> IO.iodata_to_binary(sttm)
%{statement: %Query{} = q} -> String.Chars.to_string(q)
end
end
end
end
As you can see, I’ve also implemented String.Chars
protocol using my magic gift of foresight again. I will need it later, when implementing SQL Adapter.
Also, I need to update my EpgsqlEx.Protocol.handle_execute/4
to accept new struct:
Can’t wait to see if it works. Back to IEx:
Hooray! It’s working. Now I have a bare minimum code to implement a DBConnection compliant driver for PostgreSQL, that in turn will be used by Ecto Adapter I will build in the next episode. Bear with me.
III. The adapter
EctoSQL provides building blocks for writing SQL adapters for Ecto.
It features:
- The
Ecto.Adapters.SQL
module as an entry point for all SQL-based adapters - Default implementations for Postgres (Ecto.Adapters.Postgres) and MySQL (Ecto.Adapters.MySQL)
- A test sandbox (Ecto.Adapters.SQL.Sandbox) that concurrently run database tests inside transactions
- Support for database migrations via Mix tasks
First thing I did, was to read Ecto.Adapters.SQL
module documentation once again. As I quoted at the very beginning of chapter two:
Developers can implement their own SQL adapters by using
Ecto.Adapters.SQL
and by implementing the callbacks required byEcto.Adapters.SQL.Connection
for handling connections and performing queries. The connection handling and pooling for SQL adapters should be built using the DBConnection library.
Not wasting time thinking, I’m adding {:ecto_sql, “~> 3.0.5”}
dependency and compiling my project:
When using Ecto.Adapters.SQL, the following options are required:
:driver (required) — the database driver library. For example: :postgrex
:migration_lock — the lock to use on migration locks. For example: “FOR UPDATE”. It may also be nil (for no lock). The user can still override this by setting :migration_lock in the repository configuration
On a second attempt, I ended up with a module EpgsqlEx.EctoAdapter
and a hand full of warnings module EpgsqlEx.EctoAdapter.Connection is not available
Obviously, the next step I take is to define EpgsqlEx.EctoAdapter.Connection
as required by Ecto.Adapters.SQL. It must implement Ecto.Adapters.SQL.Connection
with 14 callbacks! I will use the not implemented stub for most of the callbacks, implementing only the bare minimum. Let`s give it a try.
All OTP related callbacks are delegated to EpgsqlEx
driver, which in turn utilizes DBConnection
functionality. To keep everything at its’ place, I need to update lib/epgsql_ex.ex
as follow:
To make sure no typos, I will compile a project at this stage with mix compile
All good. Let`s try it out! I will create another project Demo
in a side folder with a supervision tree by running mix new --sup demo
. Right from the start, I will add Ecto and EpgsqlEx as dependencies and compile the project with mix do deps.get, compile
Now I will define User
Ecto schema and a Repo
Newly defined Repo
has to be added to a supervision tree of the Demo application. Head to lib/demo/application.ex
and update children list:
Important part not to forget to update config/config.exs
with a Repo config:
Those options will be passed to my EpgsqlEx
driver via DBConnection
hopefully. Let me see if the Demo app will start at this point.
That’s impressive. The application has successfully started. It failed to query the DB for User
, but that was expected. :erlang.iolist_to_binary(:TODO)
is a good sign — that is my stub in EpgsqlEx.EctoAdapter.Connection.all/1
I wonder what my supervision tree looks like at this point. :observer.start()
will give me a good picture.
You can see DBConnection in action here. That’s my connection pool created by DBConnection! Take a look for example at linked pids 226, 237, 249:
pid(0,226,0)
— is a process for EpgsqlEx.Protocolpid(0,237,0)
— is a process for Epgsql.Proxy, which is a wrapper for:epgsql
Erlang modulepid(0,249,0)
— is a process for:epgsql
TCP socket
You can kill any of those processes, and see DBConnection restart lost processes with new ones. Impressive!
The next step I take is to implement EpgsqlEx.EctoAdapter.Connection.all/1
under the EpgsqlEx project with a dummy result to satisfy Ecto expectations.
Here we go — real data from PostgreSQL server. Fantastic! But Ecto expects different data, not the one :epgsql
returns. Quick look at lib/ecto/adapters/sql.ex:556
and reading function docs in this module gave me a clear understanding that it expects Map like this: %{num_rows: integer(), rows: list([])}
After a few attempts, I’ve figured out EpgsqlEx.Query.decode/3
is in charge of decoding driver results in a more readable format. I will put a stub there for now:
Fingers crossed. Let see what I get.
Not bad at all. With all the stubs I’m so close to success. I need real data from DB. In order to do that, I need two things:
- Transform
:epgsql
results to%{num_rows: integer(), rows: list([])}
atlib/epgsql_ex/query.ex
and cast value types - Traverse AST of
Ecto.Query
passed toEpgsqlEx.EctoAdapter.Connection
atlib/epgsql_ex/ect_adapater/connection.ex
and build a query to be consumed byEpgsqlEx.execute/4
The first one seems to be simpler, so I will start with it.
Basically, I need to convert a tuple of 3 elements {:ok, <columns>, <rows>}
to %{num_rows: integer(), rows: list([])}
That should be easy with recursion. I will take columns information and apply it to each row with a type cast function. Notice 3rd element in a column tuple{:column, "id", :int4, 23, 4, -1, 0}
That’s a type I need to cast a value to.
Here what EpgsqlEx.Query.decode/3
turned out to be:
Now the hard part:
Traverse AST of
Ecto.Query
passed toEpgsqlEx.EctoAdapter.Connection
and build a query to be consumed byEpgsqlEx.execute/4
Honestly, that’s the hardest part of this whole exercise. I’ve spent a couple of hours reading Ecto.Adapters.Postgres.Connection
source code. All 1,135 lines of code went thru my brain, leaving me completely exhausted. Absolutely empty and depressed. Whoever wrote it (according to git blame
José Valim it is) is a real smart ass.
This is my oversimplified version of a Connection
behaviour implementation. No comments. Sorry…
Final check in iex console within Demo project:
That’s it. A naive implementation of PostgreSQL adapter and Ecto adapter is complete. All wired up together, and capable to perform a simple action of selecting all users from a table. That was fun!
Conclusion
It’s damn hard to write a database driver considering all edge cases. It’s even harder to write a reusable and pluggable adapter. Thank you Elixir community for sharing all that hard work for free.
Source code: https://github.com/paveltyk/epgsql_ex