In this post, we will explore our open-source project to understand how to establish a database connection in SQLAlchemy, and how to leverage strong types to design safe and maintainable methods.

In our previous post, we discussed how to encode data models in SQLAlchemy and Python. Today, we will see how create a database connection in SQLAlchemy, and how to use strong types to design more maintainable and safer methods.
In sem
, database operations are performed by the CRUDHandler
class, a wrapper around SQLAlchemy functions defined in the modules/crud_handler.py
module. CRUD stands for “create, record, update, delete“, and is the typical set of operations to be performed on a database.
crud_handler.py
also contains the definition of the CRUDHandlerError
exception, used by CRUDHandler
to signal errors encountered in database and data access.
# modules/crud_handler.py
1 """Class mediating CRUD operations on the DB.
2
3 Classes
4 -----------------------
5 CRUDHandlerError
6 Exception raised for errors in DB operations.
7 CRUDHandler
8 Class mediating CRUD operations.
9
10 Functions
11 -----------------------
12 CRUDHandlerContext()
13 Manage context for CRUDHandler.
14 """
The CRUDHandlerError
Exception
CRUDHandlerError
is a custom exception, which will be raised (together with an explicative message) when the CRUDHandler
encounters an error. Creating such custom exceptions is good to distinguish them from exceptions which other parts of the program may raise.
# modules/crud_handler.py
40 from contextlib import contextmanager
41
42 from pydantic import ValidationError
43
44 from sqlalchemy import select
45 from sqlalchemy import func
46 from sqlalchemy import and_
47 from sqlalchemy.sql import text
48
49 from modules.models import Expense
50 from modules.session import init_session
51 from modules.schemas import ExpenseAdd
52 from modules.schemas import ExpenseUpdate
53 from modules.schemas import QueryParameters
<---------------------------snip--------------------------->
56 class CRUDHandlerError(Exception):
57 """Exception raised for errors in DB operations."""
The standard procedure is to define a derived type of the builtin Exception
class. Even without adding method and attributes, the exception will operate like the default ones (e.g., accepting an error message as a string on construction, as we will see below).
Database Connection
Initialization details
We can now examine the CRUDHandler
class itself, beginning from its initialization, which needs to set up three elements to have an operational database connection.
At first, a connection string needs to be set up. This will be a string containing the necessary information for SQLAlchemy to connect to or create the database.
Its first component is usually a driver section, carrying information on the underlying database service and on components used to connect to it. The string will then carry database details, in a pre-defined format. These will be user credentials, database service location (i.e., HTTP host and port to connect to for data exchanges) and database name.
The second step is creating an SQLAlchemy engine
, a low-level object connected to the database and used to perform low-level, non-CRUD operations. These include, for instance, creating the database and the associated tables.
Finally, the initialization method will create an SQLAlchemy Session
, a high-level construct used to execute CRUD operations. This object is based on an engine
, and an instance of this type will be stored as member in CRUDHandler
.
Initialization function
I placed these operations in the init_session()
function of the modules/session.py
module. This will allow more customizability of the connection parameters (as we will see when we will operate with containers).1
# modules/session.py
32 from sqlalchemy import create_engine
33 from sqlalchemy.orm import Session
34 from sqlalchemy_utils import database_exists
35 from sqlalchemy_utils import create_database
36
37 from modules.models import Base
38
39
40 def init_session(database: str) -> Session:
41 """Init connection to specified DB and write schema.
42
43 Parameters
44 -----------------------
45 database : str
46 Database name to connect to.
47
48 Returns
49 -----------------------
50 sqlalchemy.orm.Session
51 The initialized Session.
52 """
53 DRIVER = "postgresql+psycopg"
54 USER = "postgres"
55 PASSWORD = ""
56 HOST = "localhost"
57 PORT = "5432"
58
59 DB = f"{DRIVER}://{USER}:{PASSWORD}@{HOST}:{PORT}/{database}"
60 if not database_exists(DB):
61 create_database(DB)
62
63 engine = create_engine(DB)
64 # Building schema
65 Base.metadata.create_all(engine)
66
67 return Session(bind=engine)
psycopg
is one of the possible interfaces to connect to a PostgreSQL database, and uses under the hood the psycopg3
driver. The rest of the connection string sets up a database for the user postgres
(the default username in most PostgreSQL installations) without password, and instructs SQLAlchemy to send data to the HTTP address localhost
on port 5432
(the default location where the PostgreSQL service can be contacted).
The database_exists()
and create_database()
SQLAlchemy functions allow to test the existence of a database and to create it following the connection string. The initialization function then creates an engine
object based on the connection string.
This object is immediately used to create the tables required to host data following our specified model. This is done by calling the create_all()
method of the SQLAlchemy
object contained in MetaData
as the Base
metadata
attribute. All models inherit access to this attribute (through
), allowing it to create tables to host all the created models.Base
Finally, the engine
is passed as argument to create a Session
object, which will be used to actually perform CRUD operations.
The CRUDHandler
will use init_session()
to initialize its internal session
attribute. It will also need a function to close the session (which can be done calling the
method), to be invoked when database operations are completed.
.close()Session
# modules/crud_handler.py
60 class CRUDHandler:
61 """Class mediating CRUD operations.
62
63 Attributes
64 -----------------------
65 session : sqlalchemy.Orm.Session
66 Session connected to the specified DB.
67
68 Public methods
69 -----------------------
70 __init__()
71 Initialize class instance.
72 close()
73 Close DB connection.
74 add()
75 Add expense to the DB.
76 query()
77 Return expenses matching specified filters.
78 summarize()
79 Summarize expenses matching specified filters.
80 update()
81 Update expense selected by ID.
82 remove()
83 Remove selected expenses from the DB.
84 erase()
85 Remove all expenses from the DB.
86 load()
87 Append the contents of a CSV file to the database.
88 save()
89 Save the current contents of the DB to a CSV file.
90 """
91
92 def __init__(self, database: str):
93 """Initialize class instance.
94
95 Parameters
96 -----------------------
97 database : str
98 Database name to connect to.
99 """
100 self.session = init_session(database)
101
102 def close(self):
103 """Close DB connection."""
104 self.session.close()
Context management
In classes such as
, which manage resources, it is useful to define a context manager, which will create an access to the resource and perform cleanup when execution finishes, either normally or if an exception is raised. I implemented this functionality in the CRUDHandler
CRUDHandlerContext()
function (which is not a method of
), defined at the end of CRUDHandler
modules/crud_handler.py
.
# modules/crud_handler.py
334 @contextmanager
335 def CRUDHandlerContext(database: str) -> CRUDHandler:
336 """Manage context for CRUDHandler.
337
338 Parameters
339 -----------------------
340 database : str
341 Database name.
342
343 Yields
344 -----------------------
345 CRUDHandler
346 The context-managed CRUDHandler.
347 """
348 ch = CRUDHandler(database)
349 try:
350 yield ch
351 finally:
352 ch.close()
The contextlib.contextmanager
decorator allows to create a context manager from a generator-like function, which yields the created object, without having to write a generator class with nontrivial __enter__()
and __exit__()
methods. The yield
instruction should be included in a try
block, with the finally
block being responsible for the cleanup.
In practice, this context manager will be used as
with CRUDHandlerContext(<database name>) as <handle>:
<operations using the CRUDHandler object <handle>>
with cleanup ensured when exiting the context. We will see concrete application examples in the next installments.
Strong Types
We are almost ready to begin delving into the CRUD operations. Before proceeding, however, we need to understand the concept of strong type, which is a powerful programming paradigm for the design of functions, and is heavily used in sem
.
The key idea is to create types to aggregate the parameters accepted by a function in a single object (similarly to what is done in C, where struct
s are used to return multiple values). This idiom allows increased security, by decreasing the possibility of parameter mismatch and making parameter validation easier.
In sem
, I defined the types used by the CRUD functions in modules/schemas.py
.
# modules/schemas.py
1 """Types for CRUD operations.
2
3 Classes
4 -----------------------
5 ExpenseBase
6 Base expense class.
7 ExpenseAdd
8 Derived expense class for insertion operations.
9 QueryParameters
10 Strong type for query parameters.
11 ExpenseRead
12 Derived expense class for query operations.
13 ExpenseUpdate
14 Container for data to update existing expenses with.
15 """
We will at first examine ExpenseAdd
, the type used in “create” operations in CRUDHandler
. I designed it so that the “create” method in CRUDHandler
could have the signature
add(self, data: ExpenseAdd)
with no need for additional arguments.
ExpenseAdd
must then hold all the information necessary to create an Expense
object, which means all the fields except the ID (which should be automatically assigned by the database). Since these fields will also be used by other strong types, I encapsulated them in a separate ExpenseBase
base class.
# modules/schemas.py
39 import datetime
40 from typing import Optional
41
42 from pydantic import BaseModel
43 from pydantic import Field
44
45
46 class ExpenseBase(BaseModel):
47 """Base expense class.
48
49 Attributes
50 -----------------------
51 date : datetime.date
52 Date of the expense.
53 type : str
54 Low-level group of the expense.
55 category : Optional[str]
56 High-level group of the expense. Default is `None` -> "".
57 amount : float
58 Amount of the expense.
59 description : str
60 Description of the expense.
61 """
62
63 date: datetime.date = Field(description="Date of the expense.")
64 type: str = Field(description="Low-level group of the expense.")
65 category: str = Field(
66 default="",
67 description="High-level group of the expense. Default is
''.",
68 )
69 amount: float = Field(description="Amount of the expense.")
70 description: str = Field(description="Description of the
expense.")
71
72
73 class ExpenseAdd(ExpenseBase):
74 """Type for insertion operations."""
ExpenseBase
inherits from BaseModel
, a type defined in the pydantic
module. This is essentially a dataclass with builtin validation for field values, supporting all Python builtin types: this gives strong security at initialization at zero cost for the developer.
BaseModel
also come with a pre-defined keyword argument constructor
instance = ExpenseAdd(date=..., type=..., ...)
and serialization functions, and is strongly integrated with FastAPI, the API framework of choice for sem
.
In pydantic
BaseModel
s, the Field
construct is used to provide advanced type annotations, which provide information beyond the simple data type. Here, we are specifying descriptions for the fields, which will appear in the documentation, as well as default values, where applicable.
Why Two Sets of Types?
As we have seen,
shares many similarities with the original ExpenseAdd
type that we saw in our previous post, and which defines the data model. If you wonder why I re-defined an almost identical copy of an existing type, there are a few reasons.Expense
Firstly, in SQLAlchemy instances of the model type remain connected to their associated data in the database, and changes applied to the former can be exported to the latter. While we have not seen this pattern yet, it will play a crucial role in the “update” CRUD method.
Due to this feature, working with model instances outside of CRUD operations may lead to unexpected behavior and changes in the database, making a good case for independent types which insulate the contained data.
Furthermore, data models (inheriting from SQLAlchemy.orm.
) do not share the same properties as our strong types (which inherit from DeclarativeBase
pydantic.
). This makes them miss out on, e.g., validation at construction, serialization functions…BaseModel
Finally, both our model and our strong types use type annotations to provide additional information on the fields. However, the annotations of the former are based on what SQLAlchemy requires to create the database, and do not mirror the (arguably more helpful for documentation and the developer) annotations of the strong types.
As such, even though a little duplication is required, having an independent set of validated strong types is by and large a better idea.
What’s Next?
We have learned how to establish an SQLAlchemy database connection and how to use strong types for the design of safe and maintainable methods.
In the next installment in this series, we will explore the
class, to understand how to add, query, update, and remove data.CRUDHandler
Author: Adriano Angelone
After obtaining his master in Physics at University of Pisa in 2013, he received his Ph. D. in Physics at Strasbourg University in 2017. He worked as a post-doctoral researcher at Strasbourg University, SISSA (Trieste) and Sorbonne University (Paris), before joining eXact-lab as Scientific Software Developer in 2023.
In eXact-lab, he works on the optimization of computational codes, and on the development of data engineering software.
- The file
session.py
has since been modified for the purpose of containerization. The version displayed here can be accessed by checking out to thelegacy
branch of thesem
repository. ↩︎