JPA – modularity denied

Here we are for another story about JPA (Java Persistence API) and its problems. But before we start splitting our persistence unit into multiple JARs, let’s see – or rather hear – how much it is denied. I can’t even think about this word normally anymore after I’ve heard the sound. 🙂

Second thing to know before we go on – we will use Spring. If your application is pure Java EE, this post will not work “out of the box” for you. It can still help by summarizing the problem and showing some options.

Everybody knows layers

I decided to modularize our application because, ever since reading the Java Application Architecture book, I couldn’t sleep that well with typical mega-JARs containing all the classes at a particular architecture level. Yeah, we modularize, but only by levels. That is, we typically have one JAR with all entity classes; some call it “domain”, some just “jpa-entities”, whatever… (not that I promote ad-lib names).

In our case we have @Service classes in a different JAR (but yeah, all of them in one) – each typically using a DAO for a group of entities. The complete stack goes from a REST resource class (again in a separate JAR, using Jersey or Spring MVC), which uses a @Service, which in turn talks to DAOs (marked as @Repository, although it’s not a repository in the pure sense). Complex logic is pushed to specific @Components somewhere under the service layer. It’s not DDD, but at least the dependencies flow nicely from the top down.

Components based on features

But how about dependencies between parts of the system at the same level? Our system has a lot of entity classes, some are pure business (Clients, their Transactions, financial Instruments), some are rather infrastructural (meta model, localization, audit trail). Why can’t these be separated? Most of the stuff depends on meta model, localization is quite independent in our case, audit trail needs meta model and permission module (containing Users)… It all clicks when one thinks about it – and it is more or less in line with modularity based on features, not on technology or layers. Sure we can still use layer separation and have permission-persistence and permission-service as well.

Actually, this is a recurring question: should we organize packages by feature or by layer/technology/pattern? From the sources I’ve read (though I might have read what I wanted to :-)) it seems that a consensus has been reached – start by feature, which can be part of your business domain. If a feature gets big, you can split it into layers too.

Multiple JARs with JPA entities

So I carefully tried to put different entity classes into different JAR modules. Sure, I could just repackage them in the same JAR and check how tangled they are with Sonar, but it is recommended (not only in the Java Application Architecture book) to enforce the separation and to make dependencies explicit. My experience is not as rich as that of the experts writing books, but it is much richer compared to people not reading any books at all – and, gosh, how many of them are there (both books and people not reading them)! And this experience confirms it quite clearly – things must be enforced.

And here comes the problem when your persistence is based on JPA. Because JPA clearly wasn’t designed to have a single persistence unit across multiple persistence.xml files. So what are these problems actually?

  1. How to distribute persistence.xml across these JARs? Do we even have to?
  2. What classes need to be mentioned where? E.g., we need to mention classes from upstream JARs in persistence.xml if they are used in relations (breaks DRY principle).
  3. When we have multiple persistence.xml files, how to merge them in our persistence unit configuration?
  4. What about configuration in persistence XML? What properties are used from what file? (Little spoiler, you just don’t know reliably!) Where to put them so you don’t have to repeat yourself again?
  5. We use EclipseLink – how to use static weaving for all these modules? How about a module with only abstract mapped superclass (some dao-common module)?

That’s quite a lot of problems for an age when modularity is so often mentioned, and for a technology from the “enterprise” stack. They are mostly phrased as questions because the answers are not readily available.

Distributing persistence.xml – do I need it?

This one is difficult and may depend on the provider you use and the way you use it. We use EclipseLink and its static weaving, which requires persistence.xml. Sure, we may try to keep it in some neutral module (or at any path, as the location can be configured for the weaving plugin), but that kind of goes against the modularity quest. What options do we have?

  • I can create a union persistence.xml in a module that depends on all the needed JARs. This would be OK if I had just one such module – typically some downstream module like a WAR or a runnable JAR. But we have many. If I made a persistence.xml for each, they would contain a lot of repetition. And I’d reference a downstream resource, which is ugly!
  • We can have a dummy upstream module, or a path outside any module, with the union persistence.xml. This would keep things simple, but it would be more difficult to develop modules independently, maybe even with different teams.
  • Keep persistence.xml in the JAR with the related classes. This seems best from the modularity point of view, but it means we need to merge multiple persistence.xml files when the persistence unit starts.
  • Or can we have a different persistence unit for each persistence.xml? This is OK if they truly are in different databases (different JDBC URLs), otherwise it doesn’t make sense. In our case we have a rich DB and any module can see the part it is interested in – that is, entities from the JARs it has on the classpath. If you have data in different databases already, you’re probably sporting microservices anyway. 🙂

I went for the third option – especially because EclipseLink’s weaving plugin likes it and I didn’t want to redirect it to a non-standard path to persistence.xml – but it also seems to be the logically right way. However, there is no notion of a dependency between persistence.xml files. So if you have b.jar that uses a.jar, and there is an entity class B in b.jar that contains a @ManyToOne to entity A from a.jar, you have to mention class A in the persistence.xml in b.jar. Yes, the class is already mentioned in a.jar, of course. Here, clearly, the engineers of JPA didn’t even think about the possibility of using multiple JARs in a really modular way.
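To make the repetition concrete, here is a minimal sketch of what the persistence.xml inside b.jar might look like. The package names com.example.a and com.example.b are hypothetical, and the JPA 2.1 schema header is an assumption; use whatever version your project targets. The unit name has to be the same in every module that should end up in the same persistence unit.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- b.jar: META-INF/persistence.xml (package names are hypothetical) -->
<persistence xmlns="http://xmlns.jcp.org/xml/ns/persistence" version="2.1">
    <persistence-unit name="you-choose">
        <!-- entity that lives in this module -->
        <class>com.example.b.B</class>
        <!-- repeated from a.jar only because B has a @ManyToOne to A -->
        <class>com.example.a.A</class>
        <exclude-unlisted-classes>true</exclude-unlisted-classes>
    </persistence-unit>
</persistence>
```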

In any case – this works, compiles, weaves your classes during build – and more or less answers questions 1 and 2 from our problem list. And now…

It doesn’t start anyway

When you have a single persistence.xml, it is found as a unique resource, typically in META-INF/persistence.xml – in any JAR, actually. But when you have more of them, they don’t all get picked up and merged magically – and the application fails during startup. We need to merge all those persistence.xml files during the initialization of our persistence unit. Now we’re tackling questions 3 and 4 at once, for they are linked.
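Before merging anything, it helps to see how many persistence.xml files are actually visible on the classpath. The following is a small diagnostic sketch using only the standard library; the class name PersistenceXmlLocator is made up for illustration and is not part of JPA or Spring.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Lists every resource with a given name that a class loader can see. */
final class PersistenceXmlLocator {

    static final String DEFAULT_LOCATION = "META-INF/persistence.xml";

    /** Returns the URLs of all matching resources, one per JAR or directory. */
    static List<URL> findAll(ClassLoader classLoader, String resourceName) {
        try {
            // getResources (plural) returns every match, not just the first one
            return Collections.list(classLoader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader classLoader = Thread.currentThread().getContextClassLoader();
        for (URL url : findAll(classLoader, DEFAULT_LOCATION)) {
            System.out.println(url);
        }
    }
}
```

Run it with the same classpath as the application: each printed URL is one file that has to end up in the merged unit.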

To merge all the configuration XMLs into one unit, you can use this configuration for PersistenceUnitManager in Spring (clearly, using MergingPersistenceUnitManager is the key):

@Bean
public PersistenceUnitManager persistenceUnitManager(DataSource dataSource) {
    MergingPersistenceUnitManager persistenceUnitManager =
        new MergingPersistenceUnitManager();
    persistenceUnitManager.setDefaultDataSource(dataSource);
    persistenceUnitManager.setDefaultPersistenceUnitName("you-choose");
    // default persistence.xml location is OK, goes through all classpath*
    return persistenceUnitManager;
}

But before I unveil the whole configuration we should talk about the configuration that was in the original singleton persistence.xml – which looked something like this:

<exclude-unlisted-classes>true</exclude-unlisted-classes>
<shared-cache-mode>ENABLE_SELECTIVE</shared-cache-mode>
<properties>
    <property name="eclipselink.weaving" value="static"/>
    <property name="eclipselink.allow-zero-id" value="true"/>
<!--
Without this there were other corner cases when field change was ignored. This can be worked-around calling setter, but that sucks.
-->
    <property name="eclipselink.weaving.changetracking" value="false"/>
</properties>

The biggest question here is: What is used during build (e.g. by static weaving) and what can be put into runtime configuration somewhere else? Why somewhere else? Because we don’t want to repeat these properties in all XMLs.

But before finishing the programmatic configuration we should take a little detour to shared-cache-mode that showed the problem with merging persistence.xml files in the most bizarre way.

Shared cache mode

Firstly, if you are serious about JPA, I cannot recommend enough one excellent and comprehensive book that answered tons of my questions – often before I even asked them. I’m talking about Pro JPA 2, of course. Like, seriously, go and read it unless you are super-solid in JPA already.

We wanted to enable caching of entities selectively (so that @Cacheable annotations have an effect). But I made a big mistake when I created another persistence.xml file – I forgot to mention shared-cache-mode there. My persistence unit picked up both XMLs (using MergingPersistenceUnitManager), but my caching went completely nuts. It cached more than expected and I was totally confused. The trouble here is that persistence.xml files don’t really get merged. The lists of classes in them do, but the configurations do not. Somehow my second persistence XML became dominant (one always does!) and because there was no shared-cache-mode specified, it used the default – which is whatever EclipseLink thinks is best. No blame there, just another manifestation that JPA people didn’t even think about these setup scenarios.

It’s actually the other way around – you can have multiple persistence units in one XML, that’s a piece of cake.
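A forgotten shared-cache-mode like the one described above can be caught with a quick check that parses a persistence.xml and reports what each unit declares. This is only a sketch built on the JDK’s XML parser; the class SharedCacheModeCheck and the sample package name are hypothetical, and it does not reproduce how Spring or EclipseLink read the file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Reports the shared-cache-mode declared by each unit in one persistence.xml. */
final class SharedCacheModeCheck {

    /** Returns unit name -> shared-cache-mode; the value is null when the element is missing. */
    static Map<String, String> sharedCacheModes(InputStream persistenceXml) {
        try {
            // The default factory is not namespace-aware, so plain tag names match.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(persistenceXml);
            Map<String, String> modes = new LinkedHashMap<>();
            NodeList units = doc.getElementsByTagName("persistence-unit");
            for (int i = 0; i < units.getLength(); i++) {
                Element unit = (Element) units.item(i);
                NodeList declared = unit.getElementsByTagName("shared-cache-mode");
                String mode = declared.getLength() == 0
                    ? null
                    : declared.item(0).getTextContent().trim();
                modes.put(unit.getAttribute("name"), mode);
            }
            return modes;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse persistence.xml", e);
        }
    }

    /** Convenience overload for XML held in a string. */
    static Map<String, String> sharedCacheModes(String persistenceXml) {
        return sharedCacheModes(
            new ByteArrayInputStream(persistenceXml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        String forgotten = "<persistence><persistence-unit name=\"you-choose\">"
            + "<class>com.example.b.B</class></persistence-unit></persistence>";
        // Prints {you-choose=null}: this file leaves the cache mode to the provider default.
        System.out.println(sharedCacheModes(forgotten));
    }
}
```

A null value for any file is the warning sign: if that file happens to dominate, the provider default applies to the whole unit.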

If you really want hard evidence of how things are in your setup, put a breakpoint somewhere where you can reach your EntityManagerFactory, and when it stops there, dig deeper to find out what your cache mode is. Or anything else – you can check the list of known entity classes, JPA properties, … anything really. And it’s much faster than messing around and just guessing.

[Image: jpa-emf-debugged]

In the picture above you can see that now I can be sure what shared cache mode I use. You can also see which XML file was used; in our case it was the one from the meta-model module (JAR), so this one would dominate. Luckily, I don’t rely on this anymore, not for runtime configuration at least…

Putting the Spring configuration together

Now we’re ready to wrap up our configuration and move some stuff from persistence.xml into Spring configuration – in my case it’s Java-based configuration (XML works too, of course).

Most of our properties were related to EclipseLink. I read their weaving manual, but I still didn’t understand what works when and how. I had to debug some of the stuff to be really sure.

It seems that eclipselink.weaving is the crucial property namespace that should stay in your persistence.xml, because it gets used by the plugin performing the static weaving. I debugged the Maven build and the plugin definitely uses the eclipselink.weaving.changetracking property value (we set it to false, which is not the default). Funnily enough, it doesn’t need eclipselink.weaving itself, because running the plugin implies you wish for static weaving. That property does get picked up during startup, though, so that EclipseLink knows it can treat classes as statically woven – which means it can be pushed into programmatic configuration too.

The rest of the properties (and shared cache mode) are clearly used at the startup time. Spring configuration may then look like this:

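@Autowired
private Environment env; // lets property files or -D arguments override the defaults below

@Bean public DataSource dataSource(...) { /* as usual */ }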

@Bean
public JpaVendorAdapter jpaVendorAdapter() {
    EclipseLinkJpaVendorAdapter jpaVendorAdapter = new EclipseLinkJpaVendorAdapter();
    jpaVendorAdapter.setShowSql(true);
    jpaVendorAdapter.setDatabase(
        Database.valueOf(env.getProperty("jpa.dbPlatform", "SQL_SERVER")));
    return jpaVendorAdapter;
}

@Bean
public PersistenceUnitManager persistenceUnitManager(DataSource dataSource) {
    MergingPersistenceUnitManager persistenceUnitManager =
        new MergingPersistenceUnitManager();
    persistenceUnitManager.setDefaultDataSource(dataSource);
    persistenceUnitManager.setDefaultPersistenceUnitName("you-choose");
    persistenceUnitManager.setSharedCacheMode(
        SharedCacheMode.ENABLE_SELECTIVE);
    // default persistence.xml location is OK, goes through all classpath*

    return persistenceUnitManager;
}

@Bean
public FactoryBean<EntityManagerFactory> entityManagerFactory(
    PersistenceUnitManager persistenceUnitManager, JpaVendorAdapter jpaVendorAdapter)
{
    LocalContainerEntityManagerFactoryBean emfFactoryBean =
        new LocalContainerEntityManagerFactoryBean();
    emfFactoryBean.setJpaVendorAdapter(jpaVendorAdapter);
    emfFactoryBean.setPersistenceUnitManager(persistenceUnitManager);

    Properties jpaProperties = new Properties();
    jpaProperties.setProperty("eclipselink.weaving", "static");
    jpaProperties.setProperty("eclipselink.allow-zero-id",
        env.getProperty("eclipselink.allow-zero-id", "true"));
    jpaProperties.setProperty("eclipselink.logging.parameters",
        env.getProperty("eclipselink.logging.parameters", "true"));
    emfFactoryBean.setJpaProperties(jpaProperties);
    return emfFactoryBean;
}

Clearly, we can set the database platform, shared cache mode and all runtime-relevant properties programmatically – and we can do it just once. This is not a problem for a single persistence.xml, but in any case it offers better control. You can now use Spring’s @Autowired private Environment env; and override whatever you want with property files or even -D JVM arguments – and still fall back to default values – just as we do for the database property of the JpaVendorAdapter. Or you can use SpEL. This is flexibility persistence.xml simply cannot provide.
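The “override with -D, otherwise fall back to a default” idea does not depend on Spring. As a rough illustration with the standard library only, the sketch below builds the same three properties from JVM system properties; the class name OverridableJpaProperties is hypothetical, and in the real configuration Spring’s Environment plays this role (and can also read property files).

```java
import java.util.Properties;

/** Builds JPA properties whose defaults can be overridden with -D JVM arguments. */
final class OverridableJpaProperties {

    /** Returns the system property value if it is set, otherwise the default. */
    static String overridable(String name, String defaultValue) {
        return System.getProperty(name, defaultValue);
    }

    static Properties jpaProperties() {
        Properties props = new Properties();
        // fixed value, matches the static weaving done at build time
        props.setProperty("eclipselink.weaving", "static");
        props.setProperty("eclipselink.allow-zero-id",
            overridable("eclipselink.allow-zero-id", "true"));
        props.setProperty("eclipselink.logging.parameters",
            overridable("eclipselink.logging.parameters", "true"));
        return props;
    }

    public static void main(String[] args) {
        // Try: java -Declipselink.logging.parameters=false OverridableJpaProperties.java
        System.out.println(jpaProperties());
    }
}
```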

And of course, all the things mentioned in the configuration can now be removed from all your persistence.xml files.

I’d love to get rid of eclipselink.weaving.changetracking in the XML too, but I don’t see any way to provide it as a Maven plugin configuration option, which we have neatly unified in our parent POM. That would also eliminate some repetition.

Common DAO classes in separate JAR

This one (question 5 from our list) is no problem after all the previous ones, just a nuisance. In a module that contains only mapped superclasses, EclipseLink refuses to weave your base class regardless of @MappedSuperclass usage. But as mentioned in one SO question/answer, you can just add a dummy concrete @Entity class and you’re done. You never use it, so it causes no problem at all. And you can vote for this bug.

This is probably not a problem for load-time weaving (I haven’t tried it), or for Hibernate. I never had to solve any weaving problem with Hibernate, but on the other hand the current project pushed my JPA limits further, so maybe I would have learned something about Hibernate too (if it had been willing to work for us in the first place).

Any Querydsl problems?

Ah, I forgot to mention my favourite over-JPA solution! Were there any Querydsl-related problems? Well, not really. The only hiccup I got was a NullPointerException when I moved some entity base classes and my subclasses no longer compiled. Before javac could print a reasonable error, Querydsl stepped in and gave up without a good diagnostic for this most unexpected case. 🙂 I filed an issue for this, but after I fixed my import statements for the superclasses, everything was OK again.

Conclusion

Let’s do it in bullets, shall we?

  • JPA clearly wasn’t designed with modularity in mind – especially not when modules form a single persistence unit, which is perfectly legitimate usage.
  • It is possible to distribute persistence classes into multiple JARs, and then:
    • You can go either with a single union persistence.xml, which can be downstream or upstream – which one depends on whether you need it only at runtime or during the build too.
    • I believe it is more proper to pack partial persistence.xml in each JAR, especially if you need it during build. Unfortunately, there is no escape from repeating some upstream classes in the module again, just because they are referenced in relations (typical culprit when “I don’t understand how this is not entity, when it clearly is!”).
  • If you have multiple persistence.xml files, it is possible to merge them using Spring’s MergingPersistenceUnitManager. I don’t know if you can use it for non-Spring applications, but I saw this idea reimplemented and it wasn’t that hard. (If I had to reimplement it, I’d try to merge the configuration part too!)
  • When you’re merging persistence.xml files, it is recommended to minimize configuration in them, so it doesn’t have to be repeated. E.g., for EclipseLink we leave only the stuff necessary for build-time static weaving; the rest is set programmatically in our Spring @Configuration class.
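As for merging the configuration part too: one possible approach, sketched below with plain java.util.Properties, is to combine the property sets of all files and fail loudly when two files disagree, instead of letting one file silently dominate. This is purely a hypothetical illustration; PersistencePropertyMerger is an invented name and nothing here is what MergingPersistenceUnitManager actually does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Merges per-file property sets and rejects contradictory values. */
final class PersistencePropertyMerger {

    static Properties merge(List<Properties> perFile) {
        Properties merged = new Properties();
        for (Properties props : perFile) {
            for (String name : props.stringPropertyNames()) {
                String value = props.getProperty(name);
                String previous = merged.getProperty(name);
                if (previous != null && !previous.equals(value)) {
                    // two files disagree: refuse to guess which one should win
                    throw new IllegalStateException(
                        "Conflicting values for " + name + ": " + previous + " vs " + value);
                }
                merged.setProperty(name, value);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Properties metaModel = new Properties();
        metaModel.setProperty("eclipselink.weaving.changetracking", "false");
        Properties audit = new Properties();
        audit.setProperty("eclipselink.allow-zero-id", "true");
        // Both properties survive because the two files do not contradict each other.
        System.out.println(merge(Arrays.asList(metaModel, audit)));
    }
}
```

Failing on conflicts is a deliberate choice in this sketch: the caching surprise described earlier came precisely from one file winning without any warning.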

There are still some open questions, but I think they lead nowhere. Can I use multiple persistence units with a single data source? This way I could have each persistence.xml as a separate unit. But I doubt relationships would work across these, and the same goes for transactions (without XA, that is). If you think multiple units are a relevant solution, let me know, please.

I hope this helps if you’re struggling with the noble quest of modularity. Don’t be shy to share your experiences in the comments too! No registration required. 😉

About virgo47
Java Developer by profession in the first place. Gamer and amateur musician. And father too. Naive believer in brighter future. Step by step.
