Moving forward with the development of the ParagonMeasure web application, it’s come time to set up the database and configure the test suite to transact/rollback test data.
Setting up a Postgres database was surprisingly easy; there’s a simple tutorial which worked exactly as expected.
Setting up the database to work with my tests, however, was a different story. I tried creating ‘setUp’ and ‘tearDown’ functions to create and drop tables according to my schema, but found that Postgres would flat-out lock whenever I tried tried to call
db.drop_all(). No error, just a complete freeze until I shut down the entire Postgres server.
Some investigation turned up some evidence suggesting that this is not uncommon behavior with Postgres – attempting to drop tables can cause Postgres to lock, if there happen to be any outstanding connections.
Of course, coming from the Candyland of Rails, I’d never had to work with databases at this level before (
db.drop_tables was a common and casual activity). This was a new challenge, but one I was determined to solve.
Engines, Connections, DBAPIs
First steps were getting a better handle on the under-the-hood workings of both SQLAlchemy and SQLAlchemy-Flask, the libraries I was using to mediate between Flask and my database. The SQLAlchemy docs on sessions and engine configuration were very helpful, providing useful diagrams like this:
Reading through the docs, I learned some useful terms:
DBAPI: a low-level protocol for interacting with a database; not something that I would use, but something that an ORM library would use at the lower levels.
Engine: an object which manages connections to the database. Given that a database is designed to be accessed by many, many computers at once, engines are used to control and manage the database’s resources and DBAPI connections. Engines are able to give connections to the database to other objects. In general, one engine is created once when an application is initialized, and stays alive for the duration of the application.
Session: the abstraction layer most suitable for my purposes. A session represents a series of transactions with a database, as well as holds all objects that have not yet been written to the database. A session then represents a kind of “staging area” where my application can add and manipulate objects before “flushing” the entire thing to the database in a series of optimized SQL statements. I believe that sessions are generally created and destroyed during the life of an application, as needed, requesting a connection from the engine, which persists.
Sessionmaker: A factory object for creating sessions. Similar to an engine, this object is created once when an application is first instantiated (part of the application’s configuration), and persists in memory, creating new sessions as needed. In SQLAlchemy, a sessionmaker is instantiated by passing an engine as an argument, which should suggest the connection between those two objects:
Session is a factory class created by calling
sessionmaker bound to
session is an individual session object, created by instantiating the
It seemed that, for my purposes of controlling the database, I would be fine sticking with the default engine, and focusing on learning about sessions.
Flask and SQLAlchemy
Another challenge was figuring out how to balance between SQLAlchemy and Flask-SQLAlchemy. The former is the robust ORM library suitable for any Python project, while the latter is a Flask extension designed to facilitate using SQLAlchemy in Flask. I wanted to find an elegant solution that took advantage of both APIs and used the Flask-SQLAlchemy API whenever possible, so I wanted to figure out how they were connected.
Popping into the terminal, I did some investigation.
Here’s a snippet from my
And so in my terminal:
db in this case is part of Flask-SQLAlchemy, not SQLAlchemy proper. Good to know; this is a good sign. Digging further:
Alright, it seems that I’ve crossed over into SQLAlchemy land. My hunch would be that Flask-SQLAlchemy subclassed the
db object and added some Flask-specific features, which is why it contains methods and attributes from regular SQLAlchemy. Open source is fun.
I knew my testing problem was something to do with transactions – I was starting them, but for whatever reason not finishing them up properly.
Reading the SQLAlchemy session docs, I arrived at the idea of ‘object states’. There are four possible states an object can be in, within a session:
- Transient – created, within the session, but not yet saved to the database.
- Pending – an object added to the session using the
- Persistent – an object both in the session and saved to the database.
- Detached – an object in the database, but not a part of any session.
Understanding these distinctions was super helpful in understanding the role and behavior of sessions. Consider the following terminal session:
Well, at least something makes sense. Moving on.
Well, wait a second – might we be able to use
webapp.db.session.rollback() to restore our database after each test? Let’s try:
Darn. I was hoping that I wouldn’t show up in those results.
Alright, so it seems that
webapp.db.session.rollback() can reset the session, but can’t do much once we’ve flushed the session to the database. Also good to know.
webapp.db.session object seems to have a lot of goodies. Let’s see what else it can do:
Neat. And look at that –
.close_all(). Those look like they might be able to solve our Postgres locking problem. Let’s see what they do, through a little example:
Very cool. I saved myself to the database, and retrieved myself using the
.query.all() method that my
Participant model inherited from
Now let’s try dropping the table (fingers crossed):
Alright, time to restart Postgres… let’s try this again.
SUCCESS!!! It seems like all that was missing was a call to the session telling it to close the connection.
Back to the Point: Unit Tests
Let’s make some changes to our test file and see what we’ve got:
Now, I’m sure there are better ways to write this test file – I struggle with writing test files (although not the tests themselves), because I’m always struggling to balance the desire to keep the code DRY, to keep the tests fast, and to ensure that they’re adequately rigorous. It seems like one of those “pick two of three” situations. Anyway, the subject for another post.
I’m very sure, though, that my database is getting wiped between each test. I know this because the last line of the test prints
1, even if I run the test over and over again.
And there you have it. I’ve gotten my test suite to wipe the database between each test. Rigorous testing, here we come.
As an aside, I wonder if this “Postgres locking” situation is worth making a pull request… open source contributor, here I come.
– Update – Database troubles aren’t over – if you find yourself struggling with an error like this:
There’s a great blog post addressing the issue here.
– Update 2 –
session.close() wasn’t my only option. Compare with