Common Advice – Correct?
Decent developers usually know that they have to use try/catch/finally to ensure they clean up connections, file handles, or any number of other things. But then, for Java, you hear “Just use JdbcTemplate! It does all this boilerplate for you!”.
Uncommon Scenario
Normally, when you’re writing an average app, you want lots of queries to run in parallel, efficiently, using the same user and password. In that case, you can just use a connection pool and “not worry about it”. A Spring JdbcTemplate simply borrows connections from your data source, and any pooling is done by the data source itself. You don’t have to worry about whether they are opened, closed, or whatever.
I ran into a scenario today where that was not true though. I have an app where each user connects to each back-end data-source using their own personal account which is managed by the application itself. So, each user needs his or her own connection. So… pooling would not make much sense unless each user had to do parallel operations (which they don’t).
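To make the "one connection per user" idea concrete, here is a minimal sketch that uses only plain JDBC types. The names PerUserConnections and ConnectionFactory are made up for illustration and are not part of Spring or my app. The assumption is that each user needs at most one open connection at a time and that the application closes it explicitly (for example on logout).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical factory, so the sketch does not depend on a specific driver. */
interface ConnectionFactory {
    Connection open(String user, String password) throws SQLException;
}

/** Hypothetical registry: at most one open connection per user, closed explicitly. */
class PerUserConnections implements AutoCloseable {
    private final ConnectionFactory factory;
    private final Map<String, Connection> byUser = new HashMap<>();

    PerUserConnections(ConnectionFactory factory) {
        this.factory = factory;
    }

    /** A factory backed by DriverManager; the URL is whatever your driver expects. */
    static ConnectionFactory driverManager(String url) {
        return (user, password) -> DriverManager.getConnection(url, user, password);
    }

    /** Returns the user's connection, opening one on first use or after it was closed. */
    synchronized Connection get(String user, String password) throws SQLException {
        Connection existing = byUser.get(user);
        if (existing != null && !existing.isClosed()) {
            return existing;
        }
        Connection fresh = factory.open(user, password);
        byUser.put(user, fresh);
        return fresh;
    }

    /** Closes and forgets the user's connection, e.g. on logout. */
    synchronized void release(String user) throws SQLException {
        Connection connection = byUser.remove(user);
        if (connection != null) {
            connection.close();
        }
    }

    /** Number of connections currently tracked by the registry. */
    synchronized int openCount() {
        return byUser.size();
    }

    /** Closes every tracked connection, e.g. on application shutdown. */
    @Override
    public synchronized void close() throws SQLException {
        for (String user : byUser.keySet().toArray(new String[0])) {
            release(user);
        }
    }
}
```

The point of the design is that the lifetime of each connection is owned by the application rather than left to a template or to garbage collection.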
What Happens to the Connections?
So, here’s the fun part. I had, for the longest time, assumed that JdbcTemplates would clean up connections in addition to result sets. In fact, you’ll see this claimed online a lot. But be careful! This does not appear to be the case, or if it is, it is at least data source dependent… and that actually makes sense if you think about their purpose.
Here is how I verified this. I created a JdbcTemplate which is based on a new data source each time (which is needed as the user/password change).
private NamedParameterJdbcTemplate getJdbcTemplate(String email, String password) {
    // A new data source per call, because the user/password differ per user.
    SimpleDriverDataSource ds = new SimpleDriverDataSource();
    ds.setDriverClass(HiveDriver.class);
    ds.setUrl(url);
    ds.setUsername(email);
    ds.setPassword(password);
    return new NamedParameterJdbcTemplate(ds);
}
Then I used the template for a number of queries in a normal manner (like this):
getJdbcTemplate(email, password)
    .queryForList("describe extended `mytable`.`mytable`", new MapSqlParameterSource())
Then I took a heap dump of the process with this command (run it from the bin folder of your JDK install; on Linux, drop the .exe and adjust the paths):
jmap.exe -F -dump:format=b,file=C:\temp\dump.bin your-pid
You can get the PID easily by looking at your running process from JVisualVM (which is also in the bin directory).
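If you would rather get the PID, or even the dump itself, from inside the process, something like the following sketch may work. It assumes Java 9 or later (for ProcessHandle) and a HotSpot-based JVM (for HotSpotDiagnosticMXBean). The class name HeapDumpSketch is made up for illustration, and this is not what I actually ran above.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

/** Hypothetical helper for inspecting the current JVM from the inside. */
class HeapDumpSketch {

    /** PID of the current JVM (ProcessHandle needs Java 9 or later). */
    static long currentPid() {
        return ProcessHandle.current().pid();
    }

    /**
     * Writes a heap dump of the current JVM to the given path.
     * liveOnly = true asks the JVM to dump only reachable objects.
     * HotSpot-specific; the target file must not already exist, and some
     * JDK versions require the file name to end in ".hprof".
     */
    static void dump(String path, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, liveOnly);
    }

    public static void main(String[] args) throws IOException {
        System.out.println("PID: " + currentPid());
        if (args.length == 1) {
            dump(args[0], true); // e.g. pass a path such as dump.hprof
        }
    }
}
```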
Once the dump is complete, load the file into JVisualVM (you need to pick the 3rd option in the file-type list to make it load; I think its pattern is *.* or something).
Finally, go to the Classes tab, go to the very bottom of the screen, and search for the class of interest (in my case HiveConnection). I see as many instances as queries I have run, since each query made a new connection from a new data source. They are definitely not being cleaned up.
This surprised me because, even though creating a new template/data source each time is not normal, I expected the connections to be cleaned up when they were garbage collected or as part of normal operations. After thinking about it more, I realize operations in my case would not be “normal”, but the lack of cleanup once everything is out of scope is still definitely a surprise to me.
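A heap dump shows how many instances exist on the heap, but it does not directly show whether close() was ever called on them. One way to cross-check, sketched here with plain JDK dynamic proxies, is to wrap the data source so that every getConnection() and every first close() is counted. CountingDataSource is a hypothetical helper written for this post, not a Spring or Hive class.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import javax.sql.DataSource;

/** Hypothetical helper: wraps a DataSource and counts connections opened and closed. */
class CountingDataSource {
    final AtomicInteger opened = new AtomicInteger();
    final AtomicInteger closed = new AtomicInteger();

    /** Returns a DataSource proxy that delegates to target and tracks its connections. */
    DataSource wrap(DataSource target) {
        return (DataSource) Proxy.newProxyInstance(
                DataSource.class.getClassLoader(),
                new Class<?>[] { DataSource.class },
                (proxy, method, args) -> {
                    Object result = invoke(target, method, args);
                    if (method.getName().equals("getConnection")) {
                        opened.incrementAndGet();
                        return track((Connection) result);
                    }
                    return result;
                });
    }

    /** Wraps a connection so that its first close() call is counted. */
    private Connection track(Connection target) {
        AtomicBoolean alreadyClosed = new AtomicBoolean();
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, args) -> {
                    if (method.getName().equals("close")
                            && alreadyClosed.compareAndSet(false, true)) {
                        closed.incrementAndGet();
                    }
                    return invoke(target, method, args);
                });
    }

    /** Delegates the call and rethrows the original exception, not the reflective wrapper. */
    private static Object invoke(Object target, Method method, Object[] args) throws Throwable {
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause();
        }
    }

    /** Connections handed out that have not been closed yet. */
    int stillOpen() {
        return opened.get() - closed.get();
    }
}
```

In my case the wrapped data source would be passed to the template instead of the raw one, i.e. new NamedParameterJdbcTemplate(counter.wrap(ds)), and stillOpen() would be checked after running a few queries.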