In this post, we will continue our exploration of our open-source project, to understand how to perform CRUD (Create, Read, Update, Delete) operations on database records.
In our previous post, we started exploring the structure of our
class, showing how to connect to a database and how strong types help us design safe and reliable methods. Today, we will understand how CRUD (Create, Read, Update, Delete) operations are implemented in practice on database records.CRUDHandler
Creating Records
The first of the CRUD operations is the creation of new records, handled by the
method. This function which will receive as argument an instance of CRUDHandler
.add()
(check out our previous post for its definition) containing the new record data.ExpenseAdd
# modules/crud_handler.py
106 def add(self, data: ExpenseAdd):
107 """Add expense to the DB.
108
109 Parameters
110 -----------------------
111 data : ExpenseAdd
112 Expense data.
113 """
114 # Primary key added automatically
115 self.session.add(Expense(**data.model_dump()))
116 self.session.commit()
In line 115, I transform the passed
in a dictionary having the fields as keys, using the ExpenseAdd
method. The dictionary is then unpacked, and its key-value pairs are passed as constructor arguments to an
.model_dump()BaseModel
instance. Here I do not need to specify the missing Expense
id
field, since it has been set as a primary key (and therefore will be auto-inserted by the database).
The created
is then marked for addition to the database using the Expense
Session
.add()
method (which accepts a data record as argument). The Session
.commit()
method is finally called to permanently commit the change to the database.
Read (Query) Records
Designing a query function requires in the first place a decision on what criteria will be used to query the database: based on my experience with programs similar to sem
in the past, I decided for the following.
- The user should be able to select a date window which the returned expenses should be within. One of or both the dates should be optional: specifying only a start or end date will return all expenses after or before it, respectively, and no window at all will make all expenses match.
- The user should be able to filter by type and category. These arguments should be passed as lists, in case the user wants more than one allowed value. They should also be optional: if no values are passed, no filtering should be done based on these criteria.
Strong Types for Query Functions
Once the query criteria have been decided, the first step is to encode them in a strong type QueryParameters
, which will be passed as argument to the query functions.
# modules/schemas.py
77 class QueryParameters(BaseModel):
78 """Strong type for query parameters.
79
80 Attributes
81 -----------------------
82 start : Optional[datetime.date]
83 Start date (included).
Not filtered on if `None`. Default is `None`.
84 end : Optional[datetime.date]
85 End date (included).
Not filtered on if `None`. Default is `None`.
86 types : Optional[list[str]]
87 Types to filter the query.
If `None`, all types. Default is `None`.
88 categories : Optional[list[str]]
89 Categories to filter the query. If `None`, all types. Default
is
90 `None`.
91 """
92
93 start: Optional[datetime.date] = Field(
94 default=None,
95 description="""Start date (included).
Not filtered on if `None`.
96 Default is `None`.""",
97 )
98 end: Optional[datetime.date] = Field(
99 default=None,
100 description="""End date (included).
Not filtered on if `None`. Default
101 is `None`.""",
102 )
103 types: Optional[list[str]] = Field(
104 default=None,
105 description="""Types to filter the query.
If `None`, all types. Default
106 is `None`.""",
107 )
108 categories: Optional[list[str]] = Field(
109 default=None,
110 description="""Categories to filter the query.
If `None`, all types.
111 Default is `None`.""",
112 )
I implemented QueryParameters
as a BaseModel
, to inherit its field validation and serialization functionality. Validation on nullable fields is ensured using the
type (which can receive an instance of its contained type or Optional
None
) defined in the typing
module. When one of the fields is nulled, I do not apply the relative query criteria.
I also added an
type, which contains the data associated to an expense when queried from the database (i.e., all fields, including ExpenseRead
id
, which was excluded in
). As we will see, this type will not be directly used in querying operations, but it is good to have around for testing and documentation purposes.ExpenseAdd
# modules/schemas.py
115 class ExpenseRead(ExpenseBase):
116 """Type for selection operations.
117
118 Attributes
119 -----------------------
120 id : int
121 ID of the expense, primary key field.
122 """
123
124 id: int = Field(description="ID of the expense, primary key
field.")
Query
The stage is now set to explore the query functions in CRUDHandler
. When creating the class, I decided to have two separate querying functions: one should return a list of expenses to browse through, while the other was supposed to summarize expenses (summing the amounts grouped by category and type, for end-of-the-month accounting and such).
Both of these function share the same query criteria, which turned out to be slightly involved to construct. To avoid repetitions, I wrote the private method
, which accepts a CRUDHandler
._build_query_conditions()
object and returns an SQLAlchemy query condition.QueryParameters
# modules/crud_handler.py
118 def _build_query_conditions(self, params: QueryParameters):
119 if params.start is None:
120 start_condition = True
121 else:
122 start_condition = Expense.date >= params.start
123
124 if params.end is None:
125 end_condition = True
126 else:
127 end_condition = Expense.date <= params.end
128
129 if params.types is None:
130 type_condition = True
131 else:
132 type_condition = Expense.type.in_(params.types)
133
134 if params.categories is None:
135 cat_condition = True
136 else:
137 cat_condition = Expense.category.in_(params.categories)
138
139 return and_(
140 start_condition, end_condition, type_condition,
cat_condition
141 )
SQLAlchemy will accept conditions written as expressions in terms of the fields of the model (using the syntax <model>.<field>
). These expressions can use SQL operators, some of which are defined in Python as well (e.g., <=
and >=
). SQLAlchemy also offers methods operating on model fields which can reproduce more complex SQL operators (such as IN
, via in_()
).
In the code above, we create for each field of
a condition, which does not actually give a constraint unless the associated field is non-null. The obtained conditions are then joined with the QueryParameters
and_()
SQLAlchemy function, which reproduces the SQL AND
operator.
The first query function is
, which returns a list of
.query()CRUDHandler
objects matching the passed filters, and orders them by their Expense
date
and id
fields. This second ordering is imposed to make sure expenses performed in the same day are always ordered in the same way.
# modules/crud_handler.py
143 def query(self, params: QueryParameters) -> list[Expense]:
144 """Return expenses matching specified filters.
145
146 Parameters
147 -----------------------
148 params : QueryParameters
149 Parameters of the query.
150
151 Returns
152 -----------------------
153 list[Expense]
154 List of expenses matching the criteria.
155 """
156 return self.session.scalars(
157 select(Expense)
158 .where(self._build_query_conditions(params))
159 .order_by(Expense.date, Expense.id)
160 ).all()
Here, we are using the Session.scalars().all()
method chain, which performs a query and returns the results as a list of instances of the model datatype (other return types are also possible).
The argument of this method is an SQLAlchemy Select
object, created using as constructor arguments the fields to return (or the whole model, to request all fields). The equivalent of SQL clauses such as WHERE
, GROUP BY
(see below) or ORDER BY
can be invoked as methods of the Select
object, receiving boolean-valued conditions and fields (as <model>.<field>
) like in SQL.
Note that the method returns a list of Expense
, rather than a list of ExpenseRead
(which would require an explicit conversion). This will not be a problem, since this function is meant to only be called from the API, where the result will be immediately converted in JSON format. The created Expense
instances are immediately discarded, and there is no possibility for unintended changes to the database.
Summarize
The second query method of CRUDHandler
is summarize()
, which receives the same criteria as query()
but returns sums of the amount
field, grouped by category
and type
. The returned result is a dictionary of dictionaries, where the outer and inner keys are category and type, respectively:
{
"category 1": {"type 1": <amount 1>, "type 2": <amount 2>, ...},
"category 2": {...},
...
}
The function implementation is
# modules/crud_handler.py
162 def summarize(
163 self, params: QueryParameters
164 ) -> dict[str, dict[str, float]]:
165 """Summarize expenses matching specified filters.
166
167 Parameters
168 -----------------------
169 params : QueryParameters
170 Parameters of the query.
171
172 Returns
173 -----------------------
174 dict[dict[str, float]]
175 Dictionary containing inner {type: amount} dictionaries,
176 grouped by category.
177 """
178 sums = self.session.execute(
179 select(
180 Expense.category,
181 Expense.type,
182 (func.sum(Expense.amount)).label("sum"),
183 )
184 .where(self._build_query_conditions(params))
185 .group_by(Expense.category, Expense.type)
186 .order_by(Expense.category, Expense.type)
187 ).all()
188
189 res = {}
190
191 # Creating outer dictionaries
192 for s in sums:
193 res[s.category] = {}
194
195 # Creating inner dictionaries
196 for s in sums:
197 res[s.category][s.type] = s.sum
198
199 return res
Here, I am using the sum()
function, which reproduces the SQL aggregation function SUM()
, and is contained in the SQLAlchemy func
submodule, together with other functions applicable to model fields. I am also assigning an alias to the sum field via the label()
method, which fields and expressions involving fields admit. The final step is setting up the result dictionary (with some repetition in key creation, which I deemed negligible).
Update Records
Included in the CRUD operations is the updating of existing records, which I chose to select based on their ID. The relative function should accept the new values for the fields to modify (without the need to give the ones which do not change). I encoded these requirements in the
strong type, which the update function will receive as argument (together with the ID of the expense to modify).ExpenseUpdate
# modules/schemas.py
127 class ExpenseUpdate(BaseModel):
128 """Container for data to update existing expenses with.
129
130 Attributes set to `None` will not be changed.
131
132 Attributes
133 -----------------------
134 date : Optional[datetime.date]
135 Date of the expense. Default is `None`.
136 type : Optional[str]
137 Low-level group of the expense. Default is `None`.
138 category : Optional[str]
139 High-level group of the expense. Default is `None`.
140 amount : Optional[float]
141 Amount of the expense. Default is `None`.
142 description : Optional[str]
143 Description of the expense. Default is `None`.
144 """
145
146 date: Optional[datetime.date] = Field(
147 default=None,
148 description="Date of the expense.",
149 )
150 type: Optional[str] = Field(
151 default=None,
152 description="Low-level group of the expense.",
153 )
154 category: Optional[str] = Field(
155 default=None,
156 description="High-level group of the expense.",
157 )
158 amount: Optional[float] = Field(
159 default=None,
160 description="Amount of the expense.",
161 )
162 description: Optional[str] = Field(
163 default=None,
164 description="Description of the expense.",
165 )
While this type is in appearance similar to
, the fields here are all ExpenseAdd
, allowing to only specify the ones to change, as promised.Optional
Instances of this type will be passed to
, which will update expenses selected based on their ID. Here, I modified the record with the selected ID through its mapped model instance. The function will also raise an error if the requested ID is not found in the database.
.update()CRUDHandler
# modules/crud_handler.py
201 def update(self, ID: int, data: ExpenseUpdate):
202 """Update expense selected by ID.
203
204 Parameters
205 -----------------------
206 ID : int
207 ID of the expense to update.
208 data : ExpenseUpdate
209 New data to patch the expense with. Unset fields will not
change.
210
211 Raises
212 -----------------------
213 CRUDHandlerError
214 If specified ID is not found.
215 """
216 exp = self.session.get(Expense, ID)
217 if exp is None:
218 raise CRUDHandlerError(f"ID {ID} not found")
219
220 for k, v in data.model_dump().items():
221 if v is not None:
222 setattr(exp, k, v)
223 self.session.commit()
The Session.get()
method allows to perform queries for fields (or the entire model, such as here) based on a primary key identifier (here, the ID).
The result will be None
if the query produced no result, in which case the function raises a
. Otherwise, the returned CRUDHandlerError
object is modified with the new values, unless they are set to Expense
None
(in which case the old values are maintained). The changes at this point are not yet applied to the database, requiring a final call to Expense.commit()
.
Delete Records
The last among the CRUD operations is the removal of existing expenses, performed in CRUDHandler.remove()
based on a list of IDs. “Missing ID” errors will have to be considered here as well.
I implemented the error checks with the additional requirement to rollback any changes performed in a request where an ID was not found (even if others were successfully found and removed). This allows the entire operation to be requested again once the errors are fixed.
# modules/crud_handler.py
225 def remove(self, ids: list[int]):
226 """Remove selected expenses from the DB.
227
228 Parameters
229 -----------------------
230 ids : list[int]
231 IDs of the removed expenses.
232
233 Raises
234 -----------------------
235 CRUDHandlerError
236 If a specified ID is not found.
237 """
238 for ID in ids:
239 exp = self.session.get(Expense, ID)
240 if exp is None:
241 self.session.rollback()
242 raise CRUDHandlerError(f"ID {ID} not found")
243
244 self.session.delete(exp)
245
246 self.session.commit()
Error management is a bit more involved here. Expense
instances matching the passed IDs are extracted and marked for deletion (the last step is performed calling the
method). If one of the IDs is not found, all changes are rolled back and an exception is raised. If no errors have been raised, changes are committed.Session
.delete()
I also implemented an erase()
method, which removes all expenses. This is especially useful for testing, and I thought it best to separate it from remove()
, to avoid accidental erasures. As previously, Expense
instances (in this case, all of them) will be queried and marked for deletion. The changes are then committed.
# modules/crud_handler.py
248 def erase(self):
249 """Remove all expenses from the DB and reset ID field."""
250 for exp in self.query(QueryParameters()):
251 self.session.delete(exp)
252
253 # Resetting primary key
254 self.session.execute(text("ALTER SEQUENCE expenses_id_seq
RESTART"))
255 self.session.commit()
The Session.execute()
command is used here to execute an SQL command (wrapped in the SQLAlchemy text()
converter): this SQL command resets the primary key, which would otherwise continue from the last assigned value before erasure.
What’s Next?
In this post, we understood how to implement CRUD (Create, Read, Update, Delete) operations on a database using SQLAlchemy methods and constructs.
In the next installment in this series, we will learn how to write an API which will access these functions and communicate with the user.
Author: Adriano Angelone
After obtaining his master in Physics at University of Pisa in 2013, he received his Ph. D. in Physics at Strasbourg University in 2017. He worked as a post-doctoral researcher at Strasbourg University, SISSA (Trieste) and Sorbonne University (Paris), before joining eXact-lab as Scientific Software Developer in 2023.
In eXact-lab, he works on the optimization of computational codes, and on the development of data engineering software.