Store git objects and refs in PostgreSQL tables. Standard git push/clone work against the database through a libgit2-based backend.
The extension gives you everything: tables, PL/pgSQL functions, materialized views, plus a native git_oid type with fast C implementations of SHA1 hashing and tree parsing. A separate libgit2-based backend handles git push/clone through libpq.
For more on why you'd want git data in a database, see Git in Postgres.
Docker is the fastest way to try gitgres. The image builds everything and starts Postgres with the extension loaded:
docker build -t gitgres .
docker run --rm -it gitgres
From inside the container:
./backend/gitgres-backend init "dbname=gitgres user=postgres" myrepo
./backend/gitgres-backend push "dbname=gitgres user=postgres" myrepo /path/to/repo
psql -U postgres -d gitgres
To run the tests inside the container:
docker exec <container> bash -c "cd /gitgres && make test"
Requires PostgreSQL with pgcrypto, libgit2, libpq, and OpenSSL.
brew install libgit2
Build and install the extension:
make ext
make -C ext install
Create a database and enable the extension:
CREATE EXTENSION gitgres CASCADE;
This creates all tables (repositories, objects, refs, reflog), functions, and materialized views. The CASCADE option pulls in pgcrypto automatically.
Build the libgit2 backend (for push/clone support):
make backend
Initialize a repository in the database:
./backend/gitgres-backend init "dbname=gitgres" myrepo
Push a local git repo into Postgres:
./backend/gitgres-backend push "dbname=gitgres" myrepo /path/to/repo
Clone from Postgres back to disk:
./backend/gitgres-backend clone "dbname=gitgres" myrepo /path/to/dest
List refs stored in the database:
./backend/gitgres-backend ls-refs "dbname=gitgres" myrepo
Import a repo using the shell script (no compilation needed):
./import/gitgres-import.sh /path/to/repo "dbname=gitgres" myrepo
After importing or pushing a repo, refresh the materialized views:
REFRESH MATERIALIZED VIEW commits_view;
REFRESH MATERIALIZED VIEW tree_entries_view;
Then query commits like a regular table:
SELECT sha, author_name, authored_at, message
FROM commits_view
ORDER BY authored_at DESC;
Walk a tree:
SELECT path, mode, encode(oid, 'hex')
FROM git_ls_tree_r(1, decode('abc123...', 'hex'));
make test
Runs 30 Minitest tests against a gitgres_test database. Each test runs in a transaction that rolls back on teardown. Tests cover object hashing (verified against git hash-object), object store CRUD, tree and commit parsing, ref compare-and-swap updates, and a full push/clone roundtrip.
Git objects (commits, trees, blobs, tags) are stored in an objects table with their raw content and a SHA1 OID computed the same way git does: SHA1("<type> <size>\0<content>"). Refs live in a refs table with compare-and-swap updates for safe concurrent access.
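The hashing rule above can be reproduced outside the database. This is a minimal shell sketch, assuming a POSIX shell with `sha1sum` available (GNU coreutils; on macOS, `shasum -a 1` is the usual substitute). The function name `git_object_id` is made up for illustration and is not part of gitgres:

```shell
# git_object_id TYPE FILE
# Prints the SHA1 of "<type> <size>\0<content>" for FILE, i.e. the id git
# assigns to that content as an object of TYPE (blob, tree, commit, tag).
# Hypothetical helper for illustration only; not part of gitgres.
git_object_id() {
    obj_type=$1
    obj_file=$2
    # Size of the raw content in bytes; tr strips padding some wc versions add.
    obj_size=$(wc -c < "$obj_file" | tr -d '[:space:]')
    # Header, a single NUL byte, then the content, hashed as one stream.
    { printf '%s %s\000' "$obj_type" "$obj_size"; cat "$obj_file"; } \
        | sha1sum | cut -d' ' -f1
}
```

For any file, `git_object_id blob <file>` should print the same id as `git hash-object <file>`, which is the same comparison the test suite relies on for object hashing.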
The libgit2 backend implements git_odb_backend and git_refdb_backend, the two interfaces libgit2 needs to treat any storage system as a git repository. The backend reads and writes objects and refs through libpq. When receiving a push, it uses libgit2's packfile indexer to extract individual objects from the incoming pack, then stores each one in Postgres.
The extension provides a proper git_oid type (20-byte fixed binary with hex I/O and btree/hash indexing), C implementations of SHA1 hashing and tree parsing, and the full SQL layer: tables, PL/pgSQL functions for object I/O, tree walking, commit parsing, and ref management, plus materialized views for querying commits and tree entries. omni_git builds on this to add HTTP transport and deploy-on-push.
With git data in Postgres, a git forge doesn't need filesystem storage at all. Forgejo already keeps everything except git repos in the database. Its entire git interaction goes through a single Go package (modules/git) that shells out to the git binary. Replace that package with SQL queries against the gitgres schema and the filesystem dependency disappears. One Postgres instance, one backup, one replication stream.
See forgejo.md for a detailed analysis of Forgejo's git layer and what replacing it would involve.
libgit2-backends -- the official collection of pluggable ODB backends for libgit2. Includes MySQL, SQLite, Redis, and Memcached implementations. No Postgres backend, which is what prompted this project.
gitbase -- a MySQL-compatible SQL interface for querying git repositories, built on go-git. Reads from on-disk repos rather than storing objects in the database. Its goal is the SQL query layer, not the storage layer.
JGit DFS -- JGit's abstract distributed file system storage. Defines the interface for storing git pack files on arbitrary backends. Google's internal git infrastructure builds on this. An earlier attempt (JGit DHT) tried storing objects directly in databases but was abandoned because no database could keep up with the access patterns.
Gitaly -- GitLab's git storage service. An RPC server that wraps git operations rather than replacing the storage layer. Still uses the filesystem for actual object storage.
Dolt -- a SQL database with git-style versioning (branch, merge, diff) built on prolly trees. Comes at the problem from the opposite direction: it's a database that borrowed git's semantics, not git storage backed by a database.