EclipseLink joins without mapped attributes

EDIT April 2016: Originally I believed I described pure JPA 2.1 solution, much later I found it is merely an EclipseLink extension to the JPQL that is not supported in standard.

OK, whoever knows it is possible, just stop reading right away (before my JPA rant starts). Note, that this post builds on JPA 2.1 syntax for ON and EclipseLink extension that allows to join root entities (JPA specification allows only association paths).

JPA is hard

Java Persistence API is not easy – let’s face it. Please. Because it is easy only for two kinds of programmers – those that know everything (again, congratulation) and those who think they know everything. Or something similar. While I’m all for minimizing complexity to make things easy, it is known fact that making things simple (that is non-complex) is hard. Everytime I hear “no problem, that’s easy” I know what I’ll get. Complex ball of mud that does not correspond to “easy” at all.

I’ve read parts of Hibernate reference, I’ve read Pro JPA 2/2.1 – and I’d love to read it all finally, but there is always something else to read too. And so while I was aching for solution how to get rid of some my @Many/OneToOne mappings from entities (because these are not guaranteed to be lazy and can trigger too many selects), I thought I couldn’t because I’ll be unable to join in the direction of that association. I was wrong, but first…

Can’t you avoid triggered selects for *ToOne somehow?

Yes, you can. You can eagerly fetch that association with JOIN, some smart ORMs do that for you to some depth. Let’s talk about this example:

A Dog points to a Breed. How can I populate a table on the user interface with selected attributes from the dog and its breed as well?

  • The worst case (and still traditionally used!) – select Dogs and don’t care. This will likely result in N+1 select (officially name does not reflect the cause and effect – as it is rather 1+N). This plain sucks. I really don’t know why some mainstream ORM solutions still execute this as N+1 even when it’s eager (*ToOne is by default), but that just underlines… well, the state of ORM, obviously.
  • Fetch the relationship! Of course. It is bound to happen anyway, so let’s deal with it in one select. Does it mean we’re done in single select? If you have more *ToOne relations on your Dog – or Breed – then you’re not. Any fetch may trigger additional selects for further to-one relation.
  • Fetch and select columns explicitly. This works, but you have to deal with Lists of arrays of Objects, or – in better case – Querydsl’s handy Tuple. Or you may use projection to DTO. When you need to do this with 60 columns you’ll know the pain.

But yes, it is possible. The question is if it’s worth to use JPA if 80% of time you go around these problems. Depending on relationship optionality you need to use LEFT JOIN of course. “Use LEFT JOIN, Luke!” is my answer when another developer asks me questions like “when I want to display fields from breed, some rows just disappear from my results!” or similar with a bit of fun in it “count select returns different count than is the size of final result!” – because you don’t need to join for count unless you perform WHERE part on joined columns.

Summed up, it’s hard. Gone is the premise that you’ll enjoy working with some mapped objects. No, no, deal with Lists of Object[] or DTOs! It’s hard to say what is better – DTO in theory, but if the list of columns is changing a lot then it’s just another place you’ll have to change. If you still use vanilla JPA, consider Querydsl, seriously. Tuple may be lightweight DTO for you and on the other side you can get the stuff out using expression paths that are compile time safe – without the need to mess with accessors. (Old enough to remember this article? :-))

Dropping @ManyToOne annotation

(The same applies for @OneToOne where relevant.)

To write left join from Dog to Breed you can find virtually exclusively examples like this (Querydsl, but the same goes for JPQL):

QBreed breed = new QBreed(“breed”); // alias
List<Tuple> result = new JPAQuery().from(
  .leftJoin(, breed)

That is, in left join you first state the path how to get to the breed from the dog and then you assign an alias to it (breed). Sometimes you can live without aliases, but in general you need them for any deeper joins. Notice also how the leftJoin implies ON clause. This is logical and expected, that’s why we have the mapping there.

But thing I never realized (and decided to try it the first time today) is that you can just leftJoin to alias and add your ON explicitly – just like you would in SQL! (This is not JPA standard, but EclipseLink QL extension. Querydsl does not have any problem to generate it for you.)

QBreed breed = new QBreed(“breed”); // alias
List<Tuple> result = new JPAQuery().from(

Obviously you have to have breedId attribute mapped on Dog – but that is probably some plain type like Integer. This will not trigger any additional select. If you really want, you can have dual mapping for the same column like this:

@Column(name = "breed_id", insertable = false, updatable = false)
private Integer breedId;

@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "breed_id")
private Breed breed;

Notice the insertable/updatable=false – this has to be present on one of those mappings. The question is on which one. If you save/update the entity a lot and breed’s ID is good enough then move it to @JoinColumn annotation. But if you don’t need Breed in most cases at all, remove it completely. Also notice how the code is lying to you about LAZY fetch. It is not. In theory it can be with some byte-code modified Breed (Hibernate does it out of the box), but that is not portable anyway. If you absolutely need that attribute in Dog class, you can still make it @Transient, fetch it and list it explicitly (here I show it for singleResult for brevity):

QBreed breed = new QBreed(“breed”); // alias
Tuple result = new JPAQuery().from(
  .singleResult(, breed);
Dog dog = result.get(;
dog.setBreed(result.get(breed)); // sets transient field

No problem at all. Full control. Couple of lines longer. Probably way more performant on scale.

What if JPA allowed us to ignore relations on demand?

If you use only id value that copies exactly FK in your entities then you’ll lose some of the ORM benefits – but as said you’ll also dump a lot of its burden that is in many cases unconsidered. Imagine you could specify mappings like this:

@Column(name = "breed_id")
@ManyToOne(targetEntity = Breed.class) // exists, but invalid when types don’t match
private Integer breedId;

This way you’d say what the target entity is, generated metamodels could offer you joins, etc. I’m not going to evolve this idea at all because there are some obvious disadvantages. Your JOIN path would be dog.breedId, not breed, so if you followed it to Breed’s attributes it would read confusingly (assuming that when used as a path it would act as Breed of course). Then maybe this:

@ManyToOne(fetch = FetchType.JOIN_ONLY) // does not exist
@JoinColumn(name = "breed_id")
private Breed breed;

This would mean that unless you require JOIN explicitly in your select this field will NOT be populated. Actually, this would be best to combine with dual mapping as shown sooner. ID is easy to get without JOIN – and referenced entity would be insertable/updatable false. Otherwise semantics of save would be questionable. Does null mean that entity is removed and FK should be nulled? Or does it mean to ignore it? Should JPA ignore it you need that explicit breedId field. If not ignored even trivial find/change single attribute/save (or let it happen automatically at the transaction commit) would delete your relation unexpectedly. You can add some dirty checking magic of course (“if the attribute changes then do that, otherwise don’t do anything”), but we’d just add to existing mess. So this definitely breaks “least surprise” principle. But with dual mapping that is already possible, I’d love to have this really lazy fetch type.

In case you wonder why LAZY is not guaranteed (without byte-code modification magic) – the answer is that breed attribute should be null only if NULL in database. So if the FK is there, you need to instantiate some Breed and if that supposes to be lazy, it has to be some kind of proxy to a real one that is lazily loaded when you actually touch (get something from) the breed object. This is very easy with collections (*ToMany) and ORM’s can do that (although Eclipselink’s Vector is spectacularly wrong way to do it) – that’s why I’m so possessed with *ToOne problem.

There is also another alternative solution – we would have Breed instantiated as a wrapper for the FK value. That is would not be null, but other attributes of dog.breed would not be initialized. Programmer would have to know – after all he requested it this way. That would allow the mapping to be still object based, but we would have more low-level control over the fetches.

Is it still ORM?

With dual mapping or with FK wrapper it would still be ORM as we know it. You’d have bigger control, single em.find would not trigger 47 selects (real story, seriously, and all traversing *ToOne relationships) and there would be minimal implications to JPA (although there are more competent to say so or otherwise).

Currently I plan to drop quite a lot of *ToOne mappings because:

  • There is no portable way to avoid the fetch when I know I don’t want it.
  • When I want it, I can join it myself anyway.
  • If I want to pick specific relation I can just find it by ID that is available on the entity (and then store it to @Transient field for example).

Am I still using it as ORM then? I’m dropping a lot of “mappings” of my “relations(hips)” from my “objects”. But I can still utilize a lot of JPA (and Querydsl). There still is persistence context, you can use L2 cache (if you want), and maybe you don’t have uber-cool mapping to your domain… but let’s face it. How often we really map it? How often we just mirror our databases?

Do you map @ManyToMany or do you map the association table with two IDs instead? With explicit mapping you can get list of IDs of related object by ID of owning object. In SQL it is a single table query. Eclipselink joins Master, MasterDetail and Detail to get information that is clearly available on MasterDetail table alone (not mapped explicitly). Is it so difficult to treat IDs properly? It probably is. How does ORM lower the complexity here?

Execution postponed

JPA adds tons of accidental complexity every time. Not first, but later. And virtually all teams working with it have no real JPA expert. I study JPA more than most people I know and it still keeps surprising me. Rarely in a good way. Left join using unrelated alias (at least from mapping point of view) and explicit ON is the good one. I don’t even know why I never tried it when I officially longed for it! I ranted: “Why don’t we have free joins on any numbers (PK/FK) just like SQL can do?” And nobody ever contradicted me. Most examples always start with path going from FROM entity, using alias only as their second argument (Querydsl, but JPQL examples are the same). We are slaves of our *ToOne relations and can’t get rid of undesired selects – and there is a simple way how to do it all the time. Partial reason for this is that ON for JOIN requires JPA 2.1 and didn’t work before.

If you don’t care about *ToOne problem, no problem. To be fair, if you have lazy set up properly you may be happy with it. If you do want to get rid of to-one relations the treatment is:

  • Replace your *ToOne mapping with simple ID as a value attribute.
  • Change anyJoin(x.y, alias) to anyJoin(alias).on(x.yId.eq( – this is Querydsl example but it transforms to JPQL directly. List both entities in projection if you need it (maybe you just WHERE on it).
  • If you really want +1 select, do it with em.find yourself. That may actually use L2 cache (if it’s not in the persistence context already).

With this tiny speck of knowledge I’m backing off from my plans to get rid of ORM in our project (you can imagine what pain that would be). Now it gives me so-so enough control to contain the select explosions to required levels.

EDIT 2016: Just to repeat it – this is based on EclipseLink support for JOIN for root entities. Maybe it will get into JPA eventually (I’d love it), but as of 2.1 it is not there. ON condition is there already, but without root entity JOIN it is not enough.


JPA – is it worth it? Horror stories with EclipseLink and Hibernate

One friend of mine brought Hibernate to our dev-team back in 2004 or so. But now he uses something much simpler and avoids ORM/JPA whenever possible. He can, because he is mostly master of his projects.

I had to get more familiar with JPA on my path. There is always more and more to learn about it. When you discover something like orphanRemoval = true on @OneToMany, it may bring you to the brink of crying. Because of happiness of course. (Stockholm Syndrome probably.) But then there are other days, when you just suffer. Days when you find bugs in JPA implementations – or something close to them. And you actually can’t choose from so many providers, can you? There are just two mainstream players.

Right now we are using EclipseLink and there are two bugs (or missing features or what) that kinda provoked our big effort to switch to Hibernate. What were our problems and what was the result of our switch? Read on…

EclipseLink and Java 8

You can avoid this problem, although you have to be extra careful. EclipseLink’s lazy list – called IndirectList – cannot be used with streams. Actually – if it didn’t compile, it would be good. Worse is it works but very badly.

Guys creating Java do their best to make streams work effortlessly, there are those default methods in interfaces, etc. But no, no luck here. IndirectList returns empty stream. Why? Because of this bug. (It works fine for IndirectSet.)

What is the cause of this problem? Well… someone decided to extend Vector, but instead of sticking with it, they also decided that the extended Vector will not be the actual backing collection, but they added their Vector delegate too. This works fine for interfaces like List, but not so for extended things. Now add “ingenious” design of Vector – full of protected fields you have to take care of… here IndirectList clearly collided with Vector badly.

Why do you need to take care of protected fields? Because creating the stream on Vector uses its special java.util.Vector.VectorSpliterator, that uses these protected values. So if you delegate to other Vector, you either delegate java.util.Vector#spliterator too (but then it won’t compile with earlier Java) or – preferably – don’t delegate to other vector and use that extended one as your backing collection. Of course, guys at Java might have used size() and get(int) instead of accessing Vector’s protected elementCount, elementData, etc. – but that would not be as efficient.

BTW: Anybody here likes Vector? I personally hate Java’s wannabe compatibility that really hurts its progress as we need to drag more and more burden with us.

JPA and counts on joins with distinct

The other trouble was related to a specific select where we used join with distinct instead of subquery with exists. We have an entity called Client that can be bound to any number of Domains, @ManyToMany mapping is used on Client side. Domain does not know about Client – hence the arrow.


We often need to obtain Clients where any of their domains is in a provided list (actual input parameter contains IDs of those domains). Using Querydsl, it looks like this:

QDomain cd = new QDomain("cd");
JPAQuery query = new JPAQuery(em).from(QClient.client)
    .join(, cd)

I’m sure you can easily transform it mentally to Criteria or JPQL – Querydsl produces JPQL, BTW. Distinct here has a special meaning that is used by JPA specification. Now you can call query.list(QClient.client) – it produces nice join and returns only distinct Clients. You try query.count(), and it works as expected as well.

But imagine that Client has composite primary key. We actually have exactly the same case, an entity bound again to domains, everything looks the same – just that entity has composite PK. List works alright, but if you try query.count() you’ll get completely different illogical query using exists and the join to Domains is lost completely. Results are wrong, definitely not the size of the list result.

There are some bugs filed for this – for instance this one. Because there was a discussion about this behavior we decided to find out how Hibernate treats this.


The trouble is that it is not the bug after all – at least not according to JPA 2.1specification which reads in section 4.8.5: The use of DISTINCT with COUNT is not supported for arguments of embeddable types or map entry types.

I found this today when I was gathering proofs for this post. Our story however would unfold in different direction. Are we using JPA? Yes. Are there other options? Yes!

Spoiler 2:

We used subselect with exists in the end, but that’s not the point. 😉

So we switch to Hibernate, right?

We got kinda carried away by some bug reports resolved as FIXED – like this one or the ones you can see in its links. But switching JPA provider isn’t that easy as promised. In the end we found out that count distinct for entities with composite PKs doesn’t work with SQL Server anyway. But we learned a lot of interesting things about how far both implementations are apart when it comes to translating JPQL to SQL. Mind you, I’m not actually sure whether everything we wrote in Querydsl (that generates JPQL) is 100% correct, but then it should say something. When it works I expect it to work after changing a JPA implementation.

Hibernate screwing deletes with “joins”

We have a lot of delete clauses that worked just fine in EclipseLink:

new JPADeleteClause(em, QGuiRolePermission.guiRolePermission)

This essentially means “delete all role permission assignments for role with specified description”. Piece-a-cake, right? We’re using implicit join there again, but it is the same story all over again – and it worked in EclipseLink just fine. EclipseLink creates proper SQL with exists:

  SELECT t1.gui_role_id FROM GUI_Roles t0, GUI_RolePermissions t1
    WHERE ((t0.description = ?) AND ( = t1.gui_role_id)) AND t1.gui_role_id = GUI_RolePermissions.gui_role_id
      AND t1.domain_id = GUI_RolePermissions.domain_id AND t1.permission_id = GUI_RolePermissions.permission_id)

It is not perfect – and we get back to it in section Random Query Generator – but it works. Let’s just compare it to Hibernate now. This is JPQL (actually this is the same for both providers as it’s produced by Querydsl):

delete from GuiRolePermission guiRolePermission
  where guiRolePermission.guiRole.description = ?1

This does not seem alarming – but the SQL is completely off:

delete from GUI_RolePermissions cross join GUI_Roles guirole1_ where description=?

This does not work on our currently used SQL Server, whatever dialect we choose. Why not go with exists? We have helper method that takes entity (Querydsl base path) and its where condition and performs delete. Now instead of:


We have to write this:

    new JPASubQuery().from(QGuiRolePermission.guiRolePermission)

Not terrible… but why? EDIT 2016-01-07: There is an issue filed for this, so far not fixed.

Another Hibernate’s twist

Things may get complicated when more relationships are involved. Take this simple JPQL for instance:

delete from Security security
  where security.issuer.priority = ?1

We want to remove all Securities with issuer (Client) having specific priority value:


There is another implicit join there, but one subquery with exists should cover it. Security class contains @ManyToMany relationships to Domain class through intermediate table Securities_Domains. That’s why we need two deletes here – and this is what EclipseLink generates (issuer is of class Client):

  SELECT FROM "Clients" t0, Securities t1
    WHERE ((t0."priority" = ?) AND (t0."id" = t1."issuer_id"))
      AND = Securities_Domains.security_id);
  SELECT FROM "Clients" t0, Securities t1
    WHERE ((t0."priority" = ?) AND (t0."id" = t1."issuer_id"))
      AND =;

It works just fine. But Hibernate shows its muscles really strong here!

delete from Securities_Domains where (security_id) in (
   select id from Securities where priority=?)

Obviously right the first delete is missing join to the Client entity in that subselect – and fails spectacularly. And we’re actually lucky we don’t have any other column called priority on Securities as well. 🙂 That could hide the error for ages.

With or without id?

Love that song… I mean the U2 one, not the Hibernate’s one. When you see JPQL with equality test on two entities you assume it is performed on their ids. So if you actually specify those ids, it should be still the same, right? Maybe it’s just some JPA myth after all. Consider this JPQL:

delete from MetaObjectSetup metaObjectSetup
  where not exists (select 1 from Permission permission
    where permission.metaObjectSetup = metaObjectSetup)

This produces query that does not work properly, probably taking last id from permission1_.

delete from META_Object_Setups where  not (exists (
  select 1 from Permissions permission1_
    where permission1_.meta_object_setup_id=id))

Version working with Hibernate must perform eq on id fields explicitly (EclipseLink doesn’t mind either):

   new JPASubQuery().from(QPermission.permission)

SQL result:

delete from META_Object_Setups where  not (exists (
  select 1 from Permissions permission1_

Experiences like these were the last drop for us and we reverted all the efforts to start using Hibernate.

Happy ending?

We switched back to EclipseLink after this. I don’t remember so much resistance from Hibernate like… ever. Maybe our JPQLs were too loose for it, joins were not explicit, aliases were missing, etc. But in the end it did not solve our problem.

It is really shame that count and distinct are not possible in the way making using query.list and query.count operations consistent in libraries like Querydsl. It is also shame that when it is not supported (as officially stated in the specification) it does not throw exception and does something fishy instead – silently.

You can do the same in SQL wrapping the select into another one with count – but JPQL does not support queries in the FROM clause. Pitty. However this is one of those cases when you can’t go wrong with correlated subquery. You just have to remember that subquery does not imply any equality implied by JPA joins (it can’t actually, there are cases when this would be an obstacle) and you have to do it yourselves – see the last examples from previous part (With Or Without Id).

This is all really crazy. Remotely, it reminds me horror stories about JSF and bugs of its various implementations (always different). Sure, things are really complicated, but then maybe the trouble is they are. Maybe it can be simpler. Maybe it’s wrong that I have to define @ManyToOne to be able to express joins. (Edit: Not true for JPA 2.1 as fixed in this post.) Any @XToOne has consequences often unseen even by experienced JPA users. Maybe some jOOQ or Querydsl over plain SQL is better than this stuff. I just don’t know…

Random Query Generator in EclipseLink

Let’s look at something else while we stick with JPA. Here we are back to Eclipse and there is one very easy thing I want to do in SQL:

  SELECT FROM Clients t0 WHERE ((t0.priority = ?) AND =

In plain wording – I want to remove all Client’s Contacts when this Client has some specific priority. If you ask for mapping… I hope there is nothing screwed here:

@Table(name = "Clients")
public class Client {
    @OneToMany(mappedBy = "client", cascade = CascadeType.ALL)
    private List<ClientContact> contacts;

@Table(name = "Contacts")
public class ClientContact {
    @Column(name = "client_id", insertable = false, updatable = false)
    private Integer clientId;

    @JoinColumn(name = "client_id")
    private Client client;

Back reference to client is mapped both ways, but one mapping is read-only. Both primary keys are simple Integers. No miracles here. And now three ways we tried, always with Querydsl, JPQL and SQL:

new JPADeleteClause(em, QClientContact.clientContact).where(
    new JPASubQuery().from(QClient.client)

// JPQL - logically exactly what we want
delete from ClientContact clientContact
  where exists (select client from Client client
    where client.priority = ?1 and client = clientContact.client)

// SQL contains double EXISTS and two unnecessary Clients more
    SELECT ? FROM Clients t2, Clients t1
      WHERE ((t2.priority = ?) AND ( = t0.client_id)))  AND =
bind => [1, 47]

Ok, it works, right? Can we help it when we mention id equality explicitly (…

new JPADeleteClause(em, QClientContact.clientContact).where(
    new JPASubQuery().from(QClient.client)

// JPQL looks promising again, I bet this must generate proper query - or at least the same
delete from ClientContact clientContact
  where exists (select client from Client client
    where client.priority = ?1 and =

// Three clients?!
    SELECT ? FROM Clients t3, Clients t2, Clients t1
      WHERE (((t2.priority = ?) AND ( = AND ( = t0.client_id)))
        AND =
bind => [1, 47]

You gotta be kidding me, right? What for?! I always believed that if id is entitie’s @Id, than it can do virtually the same with or without it. I even dreamt about using foreign key directly – but that’s way beyond capabilities of current JPA providers, so it seems.

Ok, let’s try something lame. Something I’d write only in JPA, not in real SQL, of course. Something with a lot of implicit identity equality hidden between the lines:

new JPADeleteClause(em, QClientContact.clientContact)

// not like SQL at all, but pretty logical if you know the mapping
delete from ClientContact clientContact
  where clientContact.client.priority = ?1

/* Surprise! Still one unnecessary Contacts in inner join that could go through FK,
 * but definitely the best result so far.
 * Mind though - this refused to work with Hibernate, at least with SQL Server dialect (any). */
  SELECT FROM Clients t0, Contacts t1
    WHERE ((t0.priority = ?) AND ( = t1.client_id)) AND =
bind => [47]

I hope this demonstrates the impotence of this field of human activity after more than a decade of effort. I’m not saying it all sucks – but after seeing this I’m not far from that.

With JPA doing too many things you don’t want or expect (unless you’re an expert) and many limitations compared to SQL I’d expect at least semi-good SQL. This is not close really.

Ramble On

So that’s another day with JPA. With proper query (or best approximation when it’s not critical) there is just one last thing we have to pay attention to… those IndirectLists that can’t be streamed. Just one last thing I said? Ouch… of course, we have to check our SQLs still and watch for any bugs or gray zones.

Yeah, we’re staying with JPA for a while on this project. But the next one I’ll start with metamodel directly over DB. I hope I can’t suffer more that way. 🙂 And maybe I’ll be able to do ad-hoc joins without @XToY annotations that always bring in stuff you don’t want. That’s actually one of my biggest gripes with JPA.

Many people avoid as much relation mappings as possible while they still can express their joins. Many people avoid @ManyToMany and map the association table explicitly, so they can reach all entities A for B’s id (or list of ids) – with a single join. Otherwise EclipseLink stubbornly joins all three tables as it’s understanding of PKs and FKs is obviously lacking.

Lessons learned? Not that much actually. But we’re definitely not switching provider for another couple of months at least!