Last August (2013) I ran into an issue where I wanted to get the latest 10 created entities from a database table but I was getting the first 10 instead. The relevant components of the stack involved were PostgreSQL 9.3, with Hibernate 4.2 used as the EntityManager provider for Spring Data JPA 1.4. Turns out that the cause is an old, obscure (but known) issue in Hibernate: HHH-5574.
In my case I was mapping the creation date of the entity, of type Joda-Time‘s DateTime using Jadira‘s User Type PersistentDateTimeAndZoneWithOffset, which mapped the DateTime to two columns, which in Postgres are of types timestamp without time zone
and character varying(255)
for the timestamp and time zone id, respectively. The reason the order was incorrect was because when Hibernate transformed the high-level Spring Data JPA “query” into actual SQL, it was creating a query similar to:
select * from entity order by created_datetime, created_datetime_tz desc
instead of:
select * from entity order by created_datetime desc, created_datetime_tz
Before I eventually found the bug on Hibernate’s Jira I actually tracked this all the way to the generated Antlr code for parsing Hibernate’s own HQL, at which point I gave up.. and found out that this was an already reported issue, but still open, with no available generic workarounds.
Changing tactics I realised that ideally the solution was to find a way to configure my system to map DateTime to a single timestamp with time zone
, but I remember trying a few different ways of forcing this and, with an extensive unit test suite to verify, I didn’t find a way down this path that ensured that the time zone specified in the DateTime would deserialise back as expected (often the time zone would be converted to the offset from UTC).
Another potential (more of a quick-fix) solution would involve fetching all the entities and sorting in memory. There are two reasons I tried to avoid this: (1) the table was destined to hold lots and lots of entries and (2) I also had paging mechanics in place (via URL params), so I would either have to drop them by, for instance, hard-coding them or port over the functionality to my own hacky workaround (and maintain it all).
Eventually I started thinking about the ISO 8601 standard, which (in my mind at the time) would work because it was lexicographically sortable, and it included the time zone information. The first issue I found was the ISO 8601’s string format isn’t sortable because the date and time portions are in its timezone, which would make (sorting from most recent to less recent) 12:25 Europe/Rome
come before 12:10 Europe/London
which is incorrect. Secondly, ISO 8601 doesn’t specify time zone id but just offset, which would mean to me that serialising and deserialising would be lossy- storing 11:00 Europe/London
(during BST) would return as 11:00 +01:00
.
Eventually I created a solution, which wasn’t ideal but worked, which involved saving the DateTime as a single varchar column, which contained all the information, and could be lexicographically sorted. The string’s format consisted of three parts:
- the DateTime transformed into UTC time, formatted as such: yyyy-MM-dd’T’HH:mm:ss.SSS, ie. ISO 8601, sans time zone
- the underscore symbol (_)
- the id of the DateTime’s original time zone (for example Europe/London or UTC)
I did so by creating my own User Type, which I called PersistentDateTimeAsUtcString
, which I eventually contributed back to Jadira, and shipped in their 3.1.0.GA release, and now lives in their javadoc. For the technical detail of how the deserialisation & serialisation works see the source.
This is why I love open-source.
P.S. In addition to solving this issue reasonably nicely, this whole endeavour had the side-effect of leading me to finally understand the basics of timezones, and I now know the difference between UTC, GMT & BST, and which way round they relate to one another. 🙂