Crafting an Automatic ERD Generator: A Journey from DDL to Diagram
The Challenge: Converting DDL to ERD
Recently, I embarked on a quest to create an ERD from our database automatically. I needed to represent tables, with their columns and foreign keys constraints explicitly. The main challenge was to find a tool or a library that can handle this conversion while providing detailed representation of primary and foreign keys relationships, not just between tables, but specifically between columns.
The Initial Players: MermaidJS and Graphviz
My first instinct was to use MermaidJS, a markdownish syntax I often utilize for generating flowcharts and sequence diagrams. However, it quickly became apparent that MermaidJS fell short when it came to supporting ‘port’ arrows, thus leaving me unable to pinpoint the exact columns functioning as foreign and primary keys. (here is a github issue explaining that fix that isn’t trivial)
Next, I considered Graphviz, a powerful graph visualization software. While it allowed a more complex edge routing and it was possible to represent each node as an HTML table, I found the syntax rather cumbersome and the user experience less than satisfactory. At this point, it seemed my tool of choice was yet to be found.
digraph ERD {
rankdir=LR;
node [shape=plaintext];
Student [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
<TR><TD COLSPAN="2"><B>Student</B></TD></TR>
<TR><TD PORT="studentid">StudentId</TD><TD>PK</TD></TR>
<TR><TD>StudentName</TD><TD></TD></TR>
<TR><TD>StudentAge</TD><TD></TD></TR>
</TABLE>
>];
Course [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
<TR><TD COLSPAN="2"><B>Course</B></TD></TR>
<TR><TD PORT="courseid">CourseId</TD><TD>PK</TD></TR>
<TR><TD>CourseName</TD><TD></TD></TR>
</TABLE>
>];
Enrollment [label=<
<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">
<TR><TD COLSPAN="2"><B>Enrollment</B></TD></TR>
<TR><TD PORT="enrollmentid">EnrollmentId</TD><TD>PK</TD></TR>
<TR><TD PORT="studentidfk">StudentId</TD><TD>FK</TD></TR>
<TR><TD PORT="courseidfk">CourseId</TD><TD>FK</TD></TR>
</TABLE>
>];
Student:studentid -> Enrollment:studentidfk [label="1..N"];
Course:courseid -> Enrollment:courseidfk [label="1..N"];
Enrollment:studentidfk -> Student:studentid [label="N..1"];
Enrollment:courseidfk -> Course:courseid [label="N..1"];
}
the above graphviz would be rendered as
The Potential Solutions: Terrastruct/d2 and dbdiagrams.io
As I continued my search, I stumbled upon terrastruct/d2. Although it showed promise, I was met with the drawback of needing a paid license for the layout engine to enable column-to-column arrows.
Following that, I explored https://dbdiagram.io/, which looked quite impressive but again, it required a log in to save and share diagrams.
The silver lining, however, was that it led me to discover the Database Markup Language (DBML) which is open source by the authors of https://dbdiagram.io/.
DBML attempts to provide a standard way to represent DB tables/columns/constraints as a markup. The syntax is easy to read and there are variety of tools for that standard (VS code extensions, language bindings etc).
See example DBML below.
Table users {
id integer
username varchar
role varchar
created_at timestamp
}
Table posts {
id integer [primary key]
title varchar
body text [note: 'Content of the post']
user_id integer
status post_status
created_at timestamp
}
Enum post_status {
draft
published
private [note: 'visible via URL only']
}
Ref: posts.user_id > users.id // many-to-one
As I dug deeper, I found that someone had already created a DBML renderer using Graphviz (https://github.com/softwaretechnik-berlin/dbml-renderer/) , providing an impressive markup with HTML tables. The output was aesthetically pleasing, but unfortunately, this wasn’t released as a web version.
The author implemented pretty clever yet simple drawing of HTML tables in Graphviz. Yes, final DOT/Graphviz markup is verbose and ugly, but given it’s produced and consumed by a program and not by humans, it doesn’t matter.
so the input is DBML and the output is a pretty diagram
Combining it all together
I already had a website that had graphviz playground. I was able to connect it all together
- You paste your DB DDL.
- DDL gets converted into DBML using DBML open source JS library.
- DBML gets converted into Graphviz with custom code inspired by dbml-renderer
- Graphviz is rendered online using viz.js.
Result can be shared with DDL just serialized into the query string, so no data is stored on the server (there is no server, it’s just hosted as static site on Netlify).
You can also see DBML on the same page, or get redirected to another page to DBML editor.
I admit, that the result is not as pretty as most of the commercial products out there like terrastruct or dbdiagram.io, but
- It helps me to go DDL -> DBML -> picture in one tool.
- It allows sharing using just URL string.
I invite you to try it and share your feedback.