Classpath too long… with Spring Boot and Gradle

Java applications get more and more complex and we rely on more libraries than ever before. But command lines have length limits, and eventually you can get into trouble if your classpath grows too long. There are ways to dodge the problem for a while – like keeping your libraries on shorter paths; neither Gradle nor Maven helps here with their repository formats. But this is still just a pseudo-solution.

When you suddenly can’t run the application

On Windows 10 we hit the wall when the command line of our Spring Boot application went over 32 KB. On Linux this limit is configurable and in general much higher, but there is often some hard limit for a single argument – and the classpath is just that, a single argument. The question is whether we really want to bloat our command line with all those JARs or whether we can do better.

Before we get there, though, let’s propose some other solutions:

  • Shorten the common path (as mentioned before). E.g. copy all your dependencies into something like c:\lib and make those JARs your classpath.
  • With all the JARs in a single place, you can actually use the Java 6+ wildcard classpath: -cp "lib/*". That is, a wildcard classpath using * (not *.jar!) and quotes. This is not a shell wildcard (that would just expand into a long command line again) but an actual feature of the java command (here are the docs from version 7). This is quite usable and it also scales – but you have to copy the JARs (see the sketch right after this list).
  • Perhaps you want to use the CLASSPATH environment variable instead? This does not work – its limit is 32 KB as well. So, a non-solution.
  • You can also extract all the JARs into a single tree and then repackage it as a single JAR. This also scales well, but involves a lot of disk operations. Also, because the first appearance of a class wins, you have to extract the JARs in classpath order without overwriting (or in reverse order with overwriting).
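
A minimal sketch of the second option (the main class name is just an example):

# all dependency JARs copied into a single lib directory;
# the quotes matter – the shell must not expand the asterisk itself
java -cp "lib/*" com.example.Main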

From all these options I like the second one best. But there must be a better one, right?

JAR with Class-Path in its manifest

I’m sure you know the old guy META-INF/MANIFEST.MF that contains meta-information about the JAR. It can also contain a classpath, which is added to the initial one from the command line. Let’s say the MANIFEST.MF in some my-cp.jar contains a line like this:

Class-Path: other1.jar other2.jar

If you run the application with java -cp my-cp.jar MainClass, it will search for that MainClass (and other needed classes) in both “other” JARs mentioned in the manifest. Now I recommend you experiment with this feature a bit and perhaps Google around it, because it seems easy but has a couple of catches:

  • The paths can be relative. Typically, you have your app.jar with the classpath declared in its manifest and deliver some ZIP with all the dependencies at known relative paths from app.jar. You can still run your application with java -cp app.jar MainClass, or, even better, java -jar app.jar with Main-Class declared in the manifest as well.
  • The paths can also be absolute, but then they need to start with a slash (natural on Linux, not so on Windows). On Windows it can actually be either kind of slash; I guess it works the same on Linux (compile once, run anywhere?).
  • If the path is a directory (like an exploded JAR), it has to end with a slash too.
  • And with spaces you get into some escaping troubles… but by that time you’d probably have figured out that the paths are not paths (as in the -classpath argument) but in fact URLs.
  • Now throw in the specifics of the MANIFEST.MF format, like the maximum line length of 72 characters, continuation lines with a leading space, CRLF, … Oh, and if you write your manifest manually, don’t forget to add one empty line – or, in other words, don’t forget to terminate the last line with CRLF as well. (Talking about line separators and line terminators can get very confusing.) An example follows right below.
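
For illustration, a hand-written manifest honoring those rules might look like this (all names are made up; the long line wraps at 72 characters, the continuation line starts with a space, and the last line must be terminated with CRLF too):

Manifest-Version: 1.0
Main-Class: com.example.Main
Class-Path: lib/other1.jar lib/other2.jar lib/some-dependency-with-a-lon
 g-name-1.0.jar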

Quickly you wish you had a tool that does this all for you. And luckily you do.

Gradle to the rescue

We actually had specific needs for our classpath, too. We ran the bootRun task with the test classes on the classpath for development reasons. After all, bootRun is not used for anything but development, right?

Adding the test runtime classpath to the total classpath “helped” us over that command-line limit too. But we still needed it. So instead of just having classpath = sourceSets.test.runtimeClasspath in the bootRun section, we needed to prepare the classpath JAR first. For that I created a classpathJar task like so:

task classpathJar(type: Jar) {
  inputs.files sourceSets.test.runtimeClasspath

  archiveName = "runboot-classpath.jar"
  doFirst {
    // If run in the configuration phase, some artifacts may not exist yet (after clean)
    // and File.toURI can’t tell which paths are directories needing the critical trailing slash.
    manifest {
      def classpath = sourceSets.test.runtimeClasspath.files
      attributes "Class-Path": classpath.collect {f -> f.toURI().toString()}.join(" ")
    }
  }
}

This code requires a couple of notes, although some of this is already in the comments:

  • We need to treat the files (the components of the classpath) as URLs and join them with spaces.
  • To do that properly, all the components of the classpath must exist at the time of processing.
  • Because after clean, at the time of task configuration (see Gradle’s Build Lifecycle), some components don’t exist yet, we need to set the classpath in the task execution phase. What may not exist yet? JARs of other projects/modules of our application, or the classes directories of the current project. Important stuff, obviously. (If you run into seemingly illogical class-not-found problems, this may be the culprit.)
  • Another reason why these artifacts may not exist is missing proper dependencies. That’s why I mention the whole classpath (all its concatenated components) in the inputs.files declaration.

EDIT: For the first day after publishing, this post had dependsOn instead of inputs.files. It was a mistake causing unreliable task execution when something upstream changed. Sorry for that. (I am, I suffered.)

And that’s it

Now we just need to mention this JAR in the bootRun section:

bootRun {
  classpath = classpathJar.outputs.files
  //…other settings, like...
  main = appMainClass // used to specify alternative "devel" main from test classpath
}

I’m pretty sure this can be done in other build tools, and a plugin could be made for it too. It would probably also be possible with some doFirst directly in bootRun, but I didn’t want to mix it in there.

But again, this nicely shows that Gradle lets you do what you need to do without much fuss. It constantly shouts: “Yes, you can!” And I like that.

 


Self-extracting install shell script with Gradle

My road to Gradle was much longer than I wanted. Now I use it on a project at the company and I definitely don’t want to go back. Sure, I got used to Maven and we became familiar – although I never loved it (often quite the contrary). I’m not sure I love Gradle (yet), but I definitely feel empowered with it. It’s my responsibility to factor my builds properly, but I can always rely on the fact that I CAN do the stuff. And that’s very refreshing.

Learning

Gradle is not easier to learn than Maven, I guess. I read Building and Testing with Gradle when it was easily downloadable from Gradle’s books page (not sure what happened to it, but you probably still can get it somehow). The trouble with Gradle is that sometimes the DSL changes a bit – and your best bet is to know how the DSL relates to the API. The core concepts are more important and more stable than the DSL and some superficial idioms.

Does it mean you have to invest heavily in Gradle? Well, not heavily, but you don’t want to merely scratch the surface if you want to crack why some StackOverflow solutions from 2012 don’t work out-of-the-box anymore. I’m reading Gradle in Action now, nearly finished, and I can just say – it was another good time investment.

My problem

I wanted to put together a couple of binary artefacts and make a self-extracting shell script from them. This, basically, is just a zip and a cat command with some install.sh head script: zip puts all the binary artefacts together and cat joins the install.sh with this ZIP file – both separated by some clear separator. I used this “technology” back in the noughties, and even then it was old already.
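
Outside of any build tool, the whole trick fits on a few lines (file names here are just examples):

# pack the artefacts, then glue the head script and the archive together
zip -r binary.zip content/
cat install.sh binary.zip > installer.sh
chmod +x installer.sh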

What can such an install.sh head script look like? That depends on many things. Do you need to unzip into a temporary directory and run some installer from there? Is unzipping itself the installation? Let’s just focus on the “stage separation”, because the rest clearly depends on your specific needs. (This time I used this article for my “head” script, but there are probably many ways to unzip the “tail” of the file. Also, the article used TGZ; I went for ZIP, as people around me are more familiar with it.)

#!/bin/bash
set -eu

# temporary dir? target installation dir?
EXTRACT_TO=out

echo "Extracting (AKA installing)..."
ARCHIVE=`awk '/^__ARCHIVE_BELOW__/ {print NR + 1; exit 0; }' $0`
tail -n+$ARCHIVE $0 > tmp.zip
unzip -q tmp.zip -d $EXTRACT_TO
rm -f tmp.zip

# the rest is custom, but exit 0 is needed before the separator
exit 0

__ARCHIVE_BELOW__

Now, depending on how you join the two files, you may or may not need an empty line under the separator.

To be more precise and more complicated at the same time: there may or may not need to be an LF (\n, ASCII 10) at the end of the separator line. Beware of the different meanings of the “last empty line” in various editors (Windows vs Linux) – e.g. vim by default expects a line terminator at the end of the file but does not show an empty line, while Windows editors typically do (see the explanation).

Concatenation… with Gradle (or Ant?)

Using the cat command is easy (and that one requires a line-feed after the separator). But I don’t want to script it this time. I want to write it in Gradle. And Gradle gives me multiple superpowers. One is called Groovy (or Kotlin, if you like, but I’m not there yet). The other is called Ant. (Ant? Seriously?! Yes, seriously.)

Now I’m not claiming that the built-in Ant is the best solution for this particular problem (as we will see), but Ant already has a task called concat. BTW: Ant’s task is just a step or action you can execute – where you think Gradle tasks, think Ant targets – and Ant’s targets are not of our concern here.

Ant provides many actions out of the box, and all you need to do is use the “ant.” prefix to get to Gradle’s AntBuilder. But before we try that, let’s try something straightforward, because if you can access a file, you can access its content too. One option is to use File’s text property, like in this answer. The Groovy script looks like this:

apply plugin: 'base' // to support clean task

task createArchive(type: Zip) {
    archiveName 'binary.zip'
    from 'src/content'
}

task createInstallerRaw(dependsOn: createArchive) {
  doLast {
    file("${buildDir}/install.sh").text =
      file('src/install.sh').text + createArchive.outputs.files.singleFile.text
  }
}

OK, so let’s try it:

./gradlew clean createInstallerRaw
# it does its stuff
build/install.sh
diff build/distributions/binary.zip tmp.zip
# this prints out:
Extracting (AKA installing)...
Binary files build/distributions/binary.zip and tmp.zip differ

Eh, that last line is definitely something we don’t want to see. I used a couple of empty files in src/content, but with realistic content you’d also see something like:

  error:  invalid compressed data to inflate out/Hot Space/01 - Staying Power.mp3
out/Hot Space/01 - Staying Power.mp3  bad CRC 00000000  (should be 9faa50ed)

Let’s get binary

File.text is for strings, not for grown-ups. Let’s do better. We may try the bytes property, perhaps joining the byte arrays – eventually I ended up with something like this:

task createInstallerRawBinary(dependsOn: createArchive) {
  doLast {
    file("${buildDir}/install.sh").withOutputStream {
      it.write file('src/install.sh').bytes
      it.write createArchive.outputs.files.singleFile.bytes
    }
  }
}

Now this looks better:

./gradlew clean createInstallerRawBinary
# it does its stuff
build/install.sh
diff build/distributions/binary.zip tmp.zip

And the diff says nothing. Even the Hot Space mp3 files play back flawlessly (well, it’s not FLAC, I know). But wait – let’s try a no-op build:

./gradlew createInstallerRawBinary
#...
BUILD SUCCESSFUL in 1s
2 actionable tasks: 1 executed, 1 up-to-date

See that 1 executed in the output? This build zips the stuff again and again. It works, but it definitely isn’t right. It’s not Gradlish enough.

Inputs/outputs please!

Gradle tasks have inputs and outputs properties that declaratively specify what a task needs and what it produces. Nothing prevents you from using more than you declare, but then you break your own contract. This mechanism is very flexible and it allows Gradle to decide what needs to be run and what can be skipped. Let’s use it:

task createInstallerWithInOuts {
  inputs.files 'src/install.sh', createArchive
  outputs.file "${buildDir}/install.sh"

  doLast {
    outputs.files.singleFile.withOutputStream { outStream ->
      inputs.files.each {
        outStream.write it.bytes
      }
    }
  }
}

A couple of points here:

  • It’s clear what code configures the task (the first lines declaring inputs/outputs) and what is the task action (the closure after doLast). You should know the basics of Gradle’s build lifecycle.
  • With both inputs and outputs declared, we can use them without any need to duplicate the file names. We foreshadowed this in the previous task already when we used createArchive.outputs.files.singleFile… instead of “${buildDir}/distributions/binary.zip”. This works its magic when you change the archiveName in the createArchive task – you don’t have to change anything in downstream tasks.
  • No dependsOn is necessary here – just mentioning the createArchive task as an input (Gradle reads it as “the outputs of the createArchive task”, of course) adds the implicit, but quite clear, dependency.
  • With inputs.files we can also simply iterate over the files. Here I chose the default it for the inner closure, so I had to name the outer closure’s parameter outStream.

Does it fix our no-op build? Sure it does – just try to run it twice yourself (without clean of course).

Where is that Ant?

No, I didn’t forget Ant, but I wanted to use some Groovy before we got to it. I actually didn’t measure which is faster; for archives in tens of megabytes it doesn’t really matter. What does matter is that Ant clearly says “concat”:

task createInstallerAntConcat {
  inputs.files 'src/install.sh', createArchive
  outputs.file "${buildDir}/install.sh"

  doLast {
    // You definitely want binary=true if you append a ZIP, otherwise expect corruption
    ant.concat(destfile: outputs.files.singleFile, binary: true) {
      // ...multiple single-file filesets, one per input, to keep the order
      inputs.files.each { file ->
        fileset(file: relativePath(file))
      }
    }
  }
}

This uses the Ant task concat – it concatenates the files mentioned in the nested filesets. This is the equivalent Ant snippet:

<concat destfile="${build.dir}/install.sh" binary="yes">
  <fileset file="${src.dir}/install.sh"/>
  <fileset file="${build.dir}/distributions/binary.zip"/>
</concat>

It’s imperative to set the binary flag to true (the default is false), as we work with binary content (ZIP). Using single-file filesets assures the order of concatenation. If we used something like this instead (in the doLast block)…

ant.concat(destfile: outputs.files.singleFile, binary: true) {
  fileset(dir: projectDir.getPath()) {
    inputs.files.each { file ->
      include(name: relativePath(file))
    }
  }
}

…we might get lucky and get the right result, but just as likely the ZIP will come first. The point is, a fileset does not represent files in the order of its nested includes.

We may try filelist instead. Instead of include elements it uses file elements. So let’s do it:

ant.concat(destfile: outputs.files.singleFile, binary: true) {
  filelist(dir: projectDir.getPath()) {
    inputs.files.each { file ->
      file(name: relativePath(file))
    }
  }
}

If we run this task, the build fails with a runtime error (during the execution phase):

* What went wrong:
Execution failed for task ':createInstallerAntConcatFilelistBadFile'.
> No signature of method: java.io.File.call() is applicable for argument types: (java.util.LinkedHashMap) values: [[name:src\install.sh]]
  Possible solutions: wait(), any(), wait(long), each(groovy.lang.Closure), any(groovy.lang.Closure), list()

Hm, file(…) tried to create a new java.io.File, not Ant’s file element. In other words, it did the same thing as anywhere else in the Gradle script – we’ve already used the file(…) construct before. But it doesn’t like maps and, most importantly, it is not what we want here.

What worked for include previously – although the build didn’t do what we wanted for other reasons – does not work here. We need to tell Gradle explicitly that we want Ant – and all we need to do is use ant.file(…).
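
For completeness, a sketch of the fixed doLast body – the ant. prefix routes the nested call to the AntBuilder instead of Gradle’s own file(…):

ant.concat(destfile: outputs.files.singleFile, binary: true) {
  filelist(dir: projectDir.getPath()) {
    inputs.files.each { f ->
      ant.file(name: relativePath(f))
    }
  }
}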

Wrapping it up

Now that I’ve tried it all, I have to say I’m glad I learned more about the Gradle–Ant integration, but I’ll just use one of the non-Ant solutions. It seems that ant.concat is considerably slower.

In any case, it’s good to understand Gradle’s build phases (the lifecycle) and to know how to specify task inputs/outputs.

When working with files, it’s always important to realize whether you work with text or binary content, whether it matters, how it’s supported, etc. It’s also important to know if/how your solution preserves the order of the files when it matters.

Lastly – when working with shell scripts it’s also important to ensure they use the right kind of line terminators. With Git’s typically automatic line-ending conversion you can’t just pack a shell script with CRLF and run it on Linux – this typically results in a rather confusing error that /bin/bash is not the right interpreter. Using an editor in some binary mode helps to discover the problem (e.g. vi -b myscript.sh). But that is not a Gradle topic anymore.

I like the flexibility Gradle provides. It pays off to learn its basics and to know how to work with its documentation and API. With that, the rewards mostly come by themselves.

Opinionated JPA with Querydsl book finished!

I’m not sure I’ve ever had such a long pause in blogging – not that I blog that often, but still. I either didn’t want to blog about how not to do things (the current project I’m working on), or I made various notes for myself in GitHub markdown – or, slowly but surely, worked on my book. And this post is about the book.

The book is complete

I finally finished the first edition (and perhaps the last, but not necessarily) of my first technology book, named Opinionated JPA with Querydsl. I started the book on December 16th, 2015. I planned to finish it sometime during 2016. But September 2017 isn’t that late after all – especially with so little happening around JPA nowadays.

When I started, I imagined a book of around 100 pages, but the thing grew over 200 in the end. Sure, I made the font a bit bigger so it’s easy to read on ebook readers even in the PDF format, which still seems superior in presentation although less flexible on small readers. And even in this volume I didn’t cover all the things that traditional JPA/ORM books leave uncovered.

Don’t mess with JPA, will ye?

I have to admit that I still don’t know JPA inside out, although I can navigate the specification pretty well when I need to find something. There are features I simply refuse to use, but for most of these I know they don’t solve the problems I typically have. If I had to put it into a single point, it would be better control over the generated SQL.

Now I can hear those ORM purists, and I believe I understand this topic reasonably well. I’ve heard about ORM being a leaky abstraction, heard why it’s bad and when it’s actually good; I’ve read many articles on the topic and worked many hours using Java ORM solutions. If you want something extensive, there is always Ted Neward’s The Vietnam of Computer Science, which was written in 2006 and hardly anything in it is out of date.

But I don’t care about academic ideas here; ORM is real, it’s used, and I actually like many of its features. What I like least, though, is its effort to hide SQL from us. I like its type conversion when compared to the very poor low-level JDBC. I can live with unit-of-work as well, but there are cases when it’s simply not suitable. And then you’re left on your own.

Streaming a long query straight to a file or socket? Expect out-of-memory if you’re querying entities that eventually fill up your persistence context, even though you don’t need them there at all. Even without a persistence context, it simply tries to create the whole list first before you can work with it. No cursor, nothing. Is this really such an unexpected and rare need?
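
The problematic pattern in code looks roughly like this (a minimal sketch – the LogEntry entity and the writer are made up for illustration):

// the whole result set is materialized into one list up front – no cursor, no streaming
List<LogEntry> all = em.createQuery("select l from LogEntry l", LogEntry.class)
    .getResultList();
for (LogEntry entry : all) {
    writer.write(entry.toString()); // memory may be exhausted before we ever get here
}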

Not compliant, not knowing it

I always firmly believed that if you work with an SQL database, you should know SQL. Whatever blanket you put over it, ignorance is hardly ever a good thing. I wrote quite a lot of articles on JPA. I saw first-hand what happens when you consider open-session-in-view a pattern instead of what it really is (an antipattern). I tackled the N+1 problem in the context of pagination and thought about the recurring problem of mapping enums to arbitrary database values. I realized that all the theory about the specification crumbles in practice when you get into the crossfire of various bugs in various JPA providers. I tried to modularize a single entity model (persistence unit).

However, I also liked the improvements in JPA 2.1, and ORM still made my life easier in most situations. When I discovered that I can actually join on an arbitrary value – e.g. map a foreign key as a plain value and then join with an explicit ON clause – I was blown away. That’s when I asked myself: “Why don’t other people try it too? Why do we keep fighting the ins and outs of relation mappings? Why do we rely on convoluted configurations or particular providers to give us lazy to-one mapping?”

And then I decided to write a book about it. There was more to it – I wanted to lump together more of my rogue ideas about JPA/ORM, while still staying a more or less concerned user of JPA, not a hater. I also wanted to see whether I could pull it off, all the way. I wanted to see what it is like to self-publish a book on something like Leanpub. I didn’t expect much of a profit though; I realized this is no Perennial Seller, as it’s too technology-related and really niche, destined to be out-of-date rather soon.

But then, while writing the book and testing my ideas both with Hibernate and EclipseLink, I found out that Hibernate does not support JOIN on a root entity (e.g. join Dog d on …), only on entity paths (e.g. join person.dog). What the… how could they miss this?! And then it dawned on me… this is not part of the specification. My book more or less stopped for a couple of months, but eventually went on, openly admitting that I’m not JPA compliant anymore. The good thing is that Hibernate eventually joined the club – since 5.1 they support these so-called “ad hoc joins”.

Here we are

Before I end this post, I’d just like to return to the abstractions we talked about previously. Right now I’m reading Patterns of Software by Richard P. Gabriel, written in 1996. We can argue that some of the problems are solved already, but I wouldn’t be so sure. There’s a chapter called Abstraction Descant. I found that it really speaks to me. Abstractions are important tools in our arsenal, but not everything can be solved by abstraction.

After reading this I realized I care even less whether ORM is a leaky abstraction or not – it should be practical; how clean or perfect an abstraction it is matters much less. Especially with ORM being quite a big beast. It’s not the low level where abstractions shine best – like data structures, etc. I’m not going to say more; read that part of the book and make up your own mind.

So – I’ve finished the book, hopefully not in vain. Kind of a longer blog post, if you will. If you’re interested but not sure about it, you can grab it for free (and you can eventually pay later if you like it and feel it helped – I’m sure it’s possible :-)).

How I unknowingly deviated from JPA

In a post from January 2015 I wrote about the possibility of using plain foreign key values instead of @ManyToOne and @OneToOne mappings in order to avoid eager fetch. It built on JPA 2.1, as it needed the ON clause not available before, and on EclipseLink, which is the reference implementation of the specification.

To be fair, there are ways to make to-one lazy, sure, but they are not portable and JPA does not assure them. They rely on bytecode magic and a properly configured ORM. Otherwise lazy to-one mapping wouldn’t have spawned so many questions around the internet. And that’s why we decided to try it without them.

Total success

We applied this style on our project and we liked it. We didn’t have to worry about random fetch cascades – in complex domain models these often trigger many dozens of fetches. Sure, it can be “fixed” with a second-level cache, but that’s another thing – we could stop worrying about the cache too. Now we could think about caching the things we wanted, not caching everything possibly reachable even if we don’t need it. A second-level cache should not exist for the sole reason of making flawed eager fetching bearable.

When we needed a Breed for a Dog we could simply do:

Breed breed = em.find(Breed.class, dog.getBreedId());

Yes, it is noisier than dog.getBreed(), but explicit solutions come with a price. We can still implement the method on the entity, but it must somehow access the entityManager – directly or indirectly – and that adds some infrastructure dependency and makes it more active-record-ish. We did it, no problem.

Now this can be done in any JPA version and probably with any ORM. The trouble is with queries. They require an explicit join condition, and for that we need ON. For inner joins WHERE is sufficient, but any outer join obviously needs the ON clause. We don’t have a dog.breed path to join; we need to join breed ON dog.breedId = breed.id. But this is no problem really – see the sketch below.
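
To make the style concrete, here is a hedged sketch of such a mapping plus a query (entity and column names are illustrative, matching this post’s example):

import javax.persistence.*;

@Entity
public class Dog {
    @Id
    private Integer id;

    // plain FK value instead of @ManyToOne Breed – nothing to fetch eagerly
    @Column(name = "breed_id")
    private Integer breedId;

    public Integer getBreedId() { return breedId; }
}

// outer join via an explicit ON, no association path needed:
// select d from Dog d left join Breed b on d.breedId = b.id where b.name = :name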

We really enjoyed this style while still benefiting from many perks of JPA, like convenient and customizable type conversion, the unit-of-work pattern, transaction support, etc.

I’ll write a book!

Having enough experience, and not knowing I was already outside the scope of the JPA specification, I decided to conjure up a neat little book called Opinionated JPA. The name says it all. It should have been a book that adds a bit to the discussion about how to use and tweak JPA in case it really backfires on you with these eager fetches and you don’t mind tuning it down a bit. It should have been a book about fighting with JPA less.

Alas, it backfired on me in the most ironic way. I wrote a lot of material around the topic before I got to the core part. Sure, I felt I should not postpone it too long, but I wanted to build an argument, do the research and so on. What never occurred to me was that I should have tested it with some other JPA provider too. And that’s what is so ironic.

In recent years I have learned a lot about JPA. I have the JPA specification open every other day to check something, I cross-reference bugs between EclipseLink and Hibernate – always trying to find the final argument in the specification – and I really felt good at all this. But I never checked whether a query with left join breed ON dog.breedId = breed.id works in anything other than EclipseLink (the reference implementation, mind you!).

Shattered dreams

It does not. Today I can even add “obviously”. The JPA 2.1 specification defines joins in section 4.4.5 as (selected important grammar rules):

join ::= join_spec join_association_path_expression [AS] identification_variable [join_condition]
join_association_path_expression ::=
  join_collection_valued_path_expression |
  join_single_valued_path_expression |
  TREAT(join_collection_valued_path_expression AS subtype) |
  TREAT(join_single_valued_path_expression AS subtype)
join_spec ::= [ LEFT [OUTER] | INNER ] JOIN
join_condition ::= ON conditional_expression

The trouble here is that breed in left join breed does not conform to any alternative of the join_association_path_expression.
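
Side by side, with the verdict per the grammar in parentheses (illustrative JPQL, entity names assumed):

select d from Person p left join p.dog d                     (association path – conforms)
select b from Dog d left join Breed b on d.breedId = b.id    (root entity – does not conform)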

Of course my life goes on, I’ve got a family to feed, I’ll ask my colleagues for forgiveness and try to build up my professional credit again. I can even say: “I told myself so!” Because JPA surprising me again and again is kind of a repeating theme of my story.

Opinionated JPA revisited

What does it mean for my opinionated approach? Well, it works with EclipseLink! I’ll just drop JPA from the equation. I tried to be pure JPA for many years, but even during those years I never ruled out proprietary ORM features as “evil”. I don’t believe in an easy JPA provider switch anyway. You can use the most basic JPA elements and be able to switch, but I’d rather utilize the chosen library better.

If you switch from Hibernate, where to-one seems to work lazily when you ask for it, to EclipseLink, you will need some non-trivial tweaking to get there. If the JPA spec mandated lazy support instead of defining it as a mere hint, I wouldn’t mess around this topic at all. But I understand that the topic runs deeper, as Java language features don’t allow it easily. With an explicit proxy wrapping the relation it is possible, but then we’re spoiling the domain. Still, with bytecode manipulation being rather ubiquitous now, I think they could have done it and removed this vague point once and for all.

Not to mention a very primitive alternative – let the user explicitly choose not to cascade eager fetches at the moment of usage. He’d get a Breed object when he calls dog.getBreed(), but this object would not be managed and would contain only the breed’s ID – exactly what the user asked for. There is no room for confusion here, and it at least gives us the option to break the deadly fetch cascade.

And the book?

Well, the main argument is now limited to EclipseLink, not JPA. Maybe I should rename the book to Opinionated ORM with EclipseLink (and Querydsl). I wouldn’t like to leave it on the plane of an essay about JPA and various “horror stories”, although even that may help people decide for or against it. If you don’t need ORM after all, use something different – like Querydsl over SQL, or alternatives like JOOQ.

I’ll probably still describe this strategy, but not as the main point anymore. The main point now is that JPA is a very strict ORM, limited in the options to control its behavior when it comes to fetching. These options are delegated to the JPA providers, and relying on them may lock you in nearly as much as not being JPA compliant at all.

Final concerns

But even when I accept that I’m stuck with an EclipseLink feature… is it a feature? Wouldn’t it be better if the reference implementation strictly complained about invalid JPQL just like Hibernate does? And put aside the thought that Hibernate is the perfect JPA 2.1 implementation – it does not implement other things and is not strict in different areas.

What if EclipseLink reconsiders and removes this extension? I doubt the next JPA will support this type of path after JOIN, although that would save my butt (which is not so important, after all). I honestly believed I was still on the standard motorway, just a little bit on the shoulder perhaps. Now I know I’m away from any mainstream… and the only way back is to re-introduce all the to-one relations into our entities, which first kills the performance; then we turn on the cache for everything, which hopefully does not kill the memory, but definitely does not help. Not to mention we actually need a distributed cache across multiple applications over the same database.

In the most honest attempt to get out of the quagmire before getting stuck deep in it, I inadvertently found myself neck-deep already. ORM indeed is The Vietnam of Computer Science.

Last three years with software

A long time ago I decided to blog about my technology struggles – mostly with software, but also with consumer devices. I don’t know why it happened on Christmas Eve, though. Two years later I repeated the format. And here we are, three years after that. So the next post can be expected in four years, I guess. Actually, I split this one into two – one for software, mostly based on professional experience, and another for consumer technology.

Without further ado, let’s dive into this… well… “dive” – it will obviously be pretty shallow. Let’s skim the stuff I worked with, stuff I like and some I don’t.

Java case – Java 8 (verdict: 5/5)

This time I’m adding my personal rating right into the header – a little change from the previous post, where it was at the end.

I love Java 8. Sure, it’s not Scala or anything even more progressive, but in the context of the Java philosophy it was a huge leap, and lambdas especially really changed my life. BTW: Check this interesting Erik Meijer talk about category theory and (among other things) how it relates to Java 8 and its method references. Quite fun.

Having worked with Java 8 for 17 months now, I can’t imagine going back. Not only because of lambdas and streams and related details like Map.computeIfAbsent, but also because of the date and time API, default methods on interfaces – and the list could go on.
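
A tiny illustration of the mentioned conveniences (generic code, nothing project-specific):

import java.time.LocalDate;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Java8Taste {
    public static void main(String[] args) {
        // Map.computeIfAbsent replaces the old get-check-put dance
        Map<String, List<String>> tagsByPost = new HashMap<>();
        tagsByPost.computeIfAbsent("post-1", k -> new ArrayList<>()).add("java");

        // the new date and time API
        LocalDate inAMonth = LocalDate.now().plusMonths(1);
        System.out.println(tagsByPost + " / " + inAMonth);
    }
}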

JPA 2.1 (no verdict)

ORM is an interesting idea, and I can claim around 10 years of experience with it, although the term itself is not always important. I also read books about it in my quest to understand it (many programmers don’t bother). The idea is kinda simple, but it has many tweaks – mainly when it comes to relationships. JPA 2.1 as an upgrade is good, I like where things are going, but I like the concept itself less and less over time.

My biggest gripe is the little control over “to-one” loading, which is difficult to make lazy (more like impossible without some nasty tricks) and can result in chain loading even if you are not interested in the related entity at all. I think there is a reason why things like JOOQ cropped up (although I personally don’t use it). There are some tricks to get rid of these problems, but they come at a cost. Typically – don’t map these to-one relationships, keep them as foreign key values. You can always fetch the related stuff with a query.

That leads to the bottom line – be explicit, it pays off. Sure, it doesn’t work universally, but anytime I leaned towards the explicit solution I felt a lot of relief from the struggles I had gone through before.

I don’t rank JPA, because I try to rely on fewer and fewer ORM features. JPA is not a bad effort, but it is so Java EE-ish: it does not support modularity, and the providers are not easy to change anyway.

Querydsl (5/5)

And when you work with JPA queries a lot, get some help – I can only recommend Querydsl. I’ve been recommending this library for three years now – it never failed me, it never let me down, and it often amazed me. This is how the criteria API should have looked.

It has a strong metamodel allowing you to do crazy things. We based a kinda universal filtering layer on it, whatever the query is. We even filter queries with joins, even on joined fields. But again – we can do that because our queries and their joins are not ad-hoc, they are explicit. 🙂 Because you should know your queries, right?

Sure, Querydsl is not perfect, but it is as powerful as JPQL (or as limited, for that matter) and more expressive than the JPA criteria API. Bugs are fixed quickly (personal experience), the developers care… what more to ask?
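
For a taste of the API, a minimal query in the Querydsl 4 style (QDog is a generated metamodel class; all names here are illustrative, not from our codebase):

QDog dog = QDog.dog;
List<Dog> dogs = new JPAQuery<Dog>(entityManager)
    .select(dog)
    .from(dog)
    .where(dog.name.startsWith("Re"))
    .fetch();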

Docker (5/5)

Docker stormed into our lives – practically for some, at least through the media for others. We don’t use it that much, because lately I’m bound to Microsoft Windows and SQL Server. But I experimented with it a couple of times for development support – we ran Jenkins in a container, for instance. And I’m watching it closely, because it rocks and will rock. Not sure what I’m talking about? Just watch the DockerCon 2015 keynote by Solomon Hykes and friends!

Sure – their new Docker Toolbox accidentally screwed up my Git installation, so I’ll rather install Linux on VirtualBox and test Docker inside it without polluting my Windows even further. But these are just minor problems in this (r)evolutionary tidal wave. And one just must love the idea of immutable infrastructure – especially when demonstrated by someone like Jérôme Petazzoni (for the merit itself, not that he’s my idol beyond the professional scope :-)).

Spring 4 and on (4/5)

I have been aware of Spring since the dawn of microcontainers – and Spring emerged victorious (sort of). A friend of mine once mentioned how much he was impressed by Rod Johnson’s presentation about Spring many years ago. How structured his talk and speech was – the story about how he disliked all those logs pouring out of your EE application server… and that’s how Spring was born (sort of).

However, my real exposure to Spring started in 2011 – and it was very intense. And again, I read more about it than most of my colleagues. And just like with JPA – the more I read, the less I know, or so it seems. Spring is big. Start some typical application and read the logs – and you can see the EE of the 2010s (sort of).

It’s not that I don’t like Spring, but I guess its authors (and however many of them there are now) simply can’t see anymore what beast they have created over the years. Sure, there is Spring Boot, which reflects all the current trends – like don’t deploy into a container, but start the container from within – plus all its automagic features, monitoring, clever defaults and so on. But that’s it. You don’t do more, but you’d better know about it. Or not? Recently I got to one of the newer Uncle Bob articles – called Make the Magic go away. And there is undeniably much to it.

Spring developers do their best, but the truth is that many developers adopt Spring just because “it just works”, while they don’t know how – and very often it doesn’t (sort of). You actually should know more about it – at least some basics, for that matter – to be really effective. Of course, this magic problem is not only about Spring (or JPA), but these are the leaders of the “it simply works” movement.

But however you look at it, it’s still “enterprise” – and that means complexity. Sometimes essential, but mostly accidental. Well, that’s also part of the Java landscape.

Google Talk (RIP)

And now for this post’s biggest letdown. Google stopped supporting their beautifully simple chat client without any reasonable replacement. The Chrome application just doesn’t seem right to me – and it genuinely annoys me with its chat icon that hangs on the desktop, sometimes over my focused application, and I can’t relocate it easily… simply put, it does not behave like a normal application. That means it behaves badly.

I switched to Pidgin, but there are issues. Pidgin sometimes misses a message in the middle of a conversation – that was the biggest surprise. I double-checked: when someone reportedly asked me something again, I went to my Gmail account and really saw the message in the Chat archive, but not in my client. And if I get messages while offline, nothing notifies me.

I activated the chat in my Gmail after all (against my wishes, though), merely to be able to see any missed messages. But sadly, the situation with Google Talk/chat (or Hangouts, I don’t care) is dire if you expect a normal desktop client. 😦

My Windows toolset

Well – now away from Java, let’s hop onto my typical developer’s Windows desktop. I have mentioned some of my favourite tools before, some of them a couple of times, I guess. So let’s do it quickly – bullet style:

  • Just after some “real browser” (my first download on a fresh Windows) I actually download Rapid Environment Editor. Setting Windows environment variables suddenly feels normal again.
  • Git for Windows – even if I didn’t use Git itself, it would be worth it just for its bash…
  • …but I still complement the bash with GnuWin32 packages for whatever is missing…
  • …and run it in a better console emulator – recently it’s ConEmu.
  • Notepad2 binary.
  • And the rest, like putty, WinSCP, …
  • Also, on Windows 8 and 10 I can’t imagine living without Classic Shell. Windows 10 is a bit better, but its Start menu is simply unusable for me; the classic Start menu was so much faster with the keyboard!

As a developer I also sport some other languages and tools, mostly JVM-based:

  • Ant, Maven, Gradle… obviously.
  • Groovy, of course, probably the most popular alternative JVM language. Not to mention that groovysh is a good REPL until Java 9 arrives (recently delayed beyond 2016).
  • VirtualBox, recently joined by Vagrant, and hopefully soon something like Chef/Puppet/Ansible. And this leads us to my plans.

Things I want to try

I was always a friend of automation. I’ve been using Windows for many years now, but my preference for UNIX tools is obvious. Try to download and spin up a virtual machine for Windows and for Linux and you’ll see the difference. Linux just works, and tools like Vagrant know where to download the images, etc.

With Windows, people are not even sure how/whether they can publish prepared images (talking about development only, of course), because nobody can really understand the licenses. Microsoft started to offer prepared Windows virtual machines – primarily for web development though, no server-class OS (not that I appreciate Windows Server anyway). They even offer Vagrant boxes, but try to download one and run it as-is. For me, Vagrant refused to connect to the started VirtualBox machine, any reasonable instructions are missing (nothing specific to Vagrant is in the linked instructions), no Vagrantfile is provided… honestly, quite a lame attempt at making my life easier. I still appreciate the virtual machines, though.

But then there are those expiration periods… I just can’t imagine preferring any Microsoft product/platform for development (let alone production, obviously). The whole culture of automation on Windows is just completely different – read anything from “nonexistent for many” through “very difficult” to “made artificially restricted”. No wonder many Linux people can script and too few Windows guys can. Licensing terms are to blame as well. And virtual machine sizes for Windows are also ridiculous – although Microsoft is reportedly trying to do something in this field and offer a reasonably small base image for containerization.

Anyway, back to the topic. Automation is what I want to improve. I’m still doing it anyway, but recently the progress has not been as good as I wished. I fell behind with Gradle, I didn’t use Docker as much as I’d like to, etc. Well – but life is not only work, is it? 😉

Conclusion

The good thing is that there are many tools available for Windows that make a developer’s (and former Linux user’s) life so much easier. And if you look at Java and its whole ecosystem, it seems to be alive and kicking – so everything seems good on this front as well.

Maybe you ask: “What does 5/5 mean anyway? Is it perfect?” Well, probably not, but at least it means I’m satisfied – happy even! Without happiness it’s not a 5, right?

Expression evaluation in Java (4)

Previously we looked at various options for evaluating expressions, then we implemented our own evaluator in ANTLR v4, and then we complicated it a bit with more types and more operations. But it still doesn’t make sense without variables. What good would any expression-based rule engine be if we couldn’t change the input parameters? 🙂

But before that…

All the code related to this blog post is here on GitHub; the package is called expr3 this time, as it is our third iteration of the ANTLR solution (even though we are in the 4th post). Tests are here. We will focus on variables, but there are some changes made in expr3 beyond variables themselves.

  • The grammar supports a NULL literal and identifiers (for variables) – but we will get to this.
  • ExpressionCalculatorVisitor now accepts even more types, and the method convertToSupportedType takes care of this. This is important to support a reasonable palette of object types for variables (and it will work the same way for return values from functions later).
  • Numbers can be represented as Integer or BigDecimal. If the number fits into the Integer range (and is not a decimal number, of course), it is represented as Integer; otherwise BigDecimal is used. This does complicate arithmetic and relational (comparison) operations – we need some promotions here and there, etc. As of now it is coded in a bit of a crude way; had the implicit conversion rules been more complicated, some more sophisticated solution would be better.
  • Java Date and Time API types can be used as variable values, but they are converted to ISO extended strings (which still allows comparison!). LocalDate, LocalDateTime and Instant are supported in this demo.

As I said, I’ll not focus on these changes; they are in the code, and while they affect how we treat variables, they are not inherently related to introducing them. I’ll also not talk about the related tests (like literal resolution into Integer vs BigDecimal) – again, it’s all in the repo.

Identifiers and null

When we’re working with variables, we need to write them into the expression somehow – and that’s where identifiers come in. As you’d expect, identifiers represent the variables (or rather their values) at their respective places in the expression, so they are one kind of elemental expression, just like the various literals. The second thing we may need is the NULL value. This is not strictly necessary in all contexts, but null is so common in our Java world that I decided to support it too. Our expr node in its fullness looks like this:

expr: STRING_LITERAL # stringLiteral
    | BOOLEAN_LITERAL # booleanLiteral
    | NUMERIC_LITERAL # numericLiteral
    | NULL_LITERAL # nullLiteral
    | op=('-' | '+') expr # unarySign
    | expr op=(OP_MUL | OP_DIV | OP_MOD) expr # arithmeticOp
    | expr op=(OP_ADD | OP_SUB) expr # arithmeticOp
    | expr op=(OP_LT | OP_GT | OP_EQ | OP_NE | OP_LE | OP_GE) expr # comparisonOp
    | OP_NOT expr # logicNot
    | expr op=(OP_AND | OP_OR) expr # logicOp
    | ID # variable
    | '(' expr ')' # parens
    ;

Null literal is predictably primitive:

NULL_LITERAL : N U L L;

Identifiers are not very complicated either, and I guess they are pretty similar to Java syntax:

ID: [a-zA-Z$_][a-zA-Z0-9$_.]*;

Various tests for null in an expression (without variables first) may look like this:

    @Test
    public void nullComparison() {
        assertEquals(expr("null == null"), true);
        assertEquals(expr("null != null"), false);
        assertEquals(expr("5 != null"), true);
        assertEquals(expr("5 == null"), false);
        assertEquals(expr("null != 5"), true);
        assertEquals(expr("null == 5"), false);
        assertEquals(expr("null > null"), false);
        assertEquals(expr("null < null"), false);
        assertEquals(expr("null <= null"), false);
        assertEquals(expr("null >= null"), false);
    }

We don’t need much for this – resolving the NULL literal is particularly simple:

@Override
public Object visitNullLiteral(NullLiteralContext ctx) {
    return null;
}

We also modified visitComparisonOp – now it starts like this:

@Override
public Boolean visitComparisonOp(ExprParser.ComparisonOpContext ctx) {
    Comparable left = (Comparable) visit(ctx.expr(0));
    Comparable right = (Comparable) visit(ctx.expr(1));
    int operator = ctx.op.getType();
    if (left == null || right == null) {
        return left == null && right == null && operator == OP_EQ
            || (left != null || right != null) && operator == OP_NE;
    }
...

The rest deals with non-null values, etc. We might also let this method return null when null is involved anywhere except for EQ/NE; now it returns false. It depends on what logic we want.

Variable resolver

Variable resolving inside the calculator class is also quite simple. We need something that resolves variables – that’s the variableResolver field initialized in the constructor and used in visitVariable:

private final ExpressionVariableResolver variableResolver;

public ExpressionCalculatorVisitor(ExpressionVariableResolver variableResolver) {
    if (variableResolver == null) {
        throw new IllegalArgumentException("Variable resolver must be provided");
    }
    this.variableResolver = variableResolver;
}

@Override
public Object visitVariable(VariableContext ctx) {
    Object value = variableResolver.resolve(ctx.ID().getText());
    return convertToSupportedType(value);
}

Anything this resolver returns is converted to the supported types, as mentioned in the introduction. ExpressionVariableResolver is again very simple:

public interface ExpressionVariableResolver {
    Object resolve(String variableName);
}

And how can we implement this? With Java 8 you just have to love it – here is a piece of a test:

private ExpressionVariableResolver variableResolver;

@BeforeMethod
public void init() {
    variableResolver = var -> null;
}

@Test
public void primitiveVariableResolverReturnsTheSameValueForAnyVarName() {
    variableResolver = var -> 5;
    assertEquals(expr("var"), 5);
    assertEquals(expr("anyvarworksnow"), 5);
}

I use a field that is set in the init method to a default “implementation” that always returns null. In another test method I change it so that it always returns 5, regardless of the actual variable name (parameter var), as this test clearly demonstrates. The next test is more useful, because its resolver returns a value only for a specific variable name:

@Test
public void variableResolverReturnsValueForOneVarName() {
    variableResolver = var -> var.equals("var") ? 5 : null;
    assertEquals(expr("var"), 5);
    assertEquals(expr("var != null"), true);
    assertEquals(expr("var == null"), false);

    assertEquals(expr("anyvarworksnow"), null);
    assertEquals(expr("anyvarworksnow == null"), true);
}

Now the actual name of the variable (identifier) must be “var”, otherwise the resolver returns null again. You might have heard that lambdas can work as super-short test implementations – and yes, they can.

You may wonder why I use the field instead of a local variable in the test method itself. A local would be better contained, preventing any accidental leaks (although @BeforeMethod covers me on this). The trouble is that variableResolver is used deeper in the expr(…) method, and I didn’t want to add it as a parameter everywhere – hence the field:

private Object expr(String expression) {
    ParseTree parseTree = ExpressionUtils.createParseTree(expression);
    return new ExpressionCalculatorVisitor(variableResolver)
        .visit(parseTree);
}

Any real-life implementation?

The variable resolvers in the tests were obviously very primitive, so let’s try something more realistic. The first try is still very simple, but realistic indeed. Remember Bindings in Java’s ScriptEngine? It actually extends Map – so how about a resolver that wraps an existing Map<String, Object> (mapping variable names to their values)? OK, it may – again – be too primitive:

ExpressionVariableResolver resolver = var -> map.get(var);

Bah, we need a bigger challenge! Let’s say I have a Java Bean or any POJO and I want to explicitly specify my variable names and how they should be resolved against that object. This may be a method call, like a getter, so this time we don’t have the values readily available in a collection (or a map).

The important thing to realize here is that the resolver will differ from object to object, because for different objects it needs to provide different values. However, the way it obtains the values will be the same. We will wrap this “way” into a VariableMapper that knows how to get values from an object of a specific type (using generics) – and it will also help us resolve the value against a specific instance. The tests show how I intend to use it:

private VariableMapper<SomeBean> variableMapper;
private ParseTree myNameExpression;
private ParseTree myCountExpression;

@BeforeClass
public void init() {
    variableMapper = new VariableMapper<SomeBean>()
        .set("myName", o -> o.name)
        .set("myCount", SomeBean::getCount);
    myNameExpression = ExpressionUtils.createParseTree("myName <= 'Virgo'");
    myCountExpression = ExpressionUtils.createParseTree("myCount * 3");
}

@Test
public void myNameExpressionTest() {
    SomeBean bean = new SomeBean();
    ExpressionCalculatorVisitor visitor = new ExpressionCalculatorVisitor(
        var -> variableMapper.resolveVariable(var, bean));

    assertEquals(visitor.visit(myNameExpression), false); // null comparison is false
    bean.name = "Virgo";
    assertEquals(visitor.visit(myNameExpression), true);
    bean.name = "ABBA";
    assertEquals(visitor.visit(myNameExpression), true);
    bean.name = "Virgo47";
    assertEquals(visitor.visit(myNameExpression), false);
}

@Test
public void myCountExpressionTest() {
    SomeBean bean = new SomeBean();
    ExpressionCalculatorVisitor visitor = new ExpressionCalculatorVisitor(
        var -> variableMapper.resolveVariable(var, bean));

// assertEquals(visitor.visit(myCountExpression), false); // NPE!
    bean.setCount(3f);
    assertEquals(visitor.visit(myCountExpression), new BigDecimal("9"));
    bean.setCount(-1.1f);
    assertEquals(visitor.visit(myCountExpression), new BigDecimal("-3.3"));
}

public static class SomeBean {
    public String name;
    private Float count;

    public Float getCount() {
        return count;
    }

    public void setCount(Float count) {
        this.count = count;
    }
}

A VariableMapper can live longer; you set it up and then reuse it. Its configuration is its state (the set methods), and the concrete object is merely an input parameter. The variable resolver itself works like a closure around the concrete instance. Keep in mind that instantiating the calculator visitor is cheap, and the visiting itself is something you have to do anyway. Creating the parse tree is expensive, but we don’t repeat that between tests – and that’s probably how you want to use it in your application too. Cache the parse trees, create visitors – even with state specific to a single calculation – and then throw them away. This is also safest from a threading perspective. You don’t want to use a calculator that “closes over” another target object in another thread in the middle of your visiting business. 🙂

Ok, what does that VariableMapper look like?

public class VariableMapper<T> {
    private Map<String, Function<T, Object>> variableValueFunctions = new HashMap<>();

    public VariableMapper<T> set(String variableName, Function<T, Object> valueFunction) {
        variableValueFunctions.put(variableName, valueFunction);
        return this;
    }

    public Object resolveVariable(String variableName, T object) {
        Function<T, Object> valueFunction = variableValueFunctions.get(variableName);
        if (valueFunction == null) {
            throw new ExpressionException("Unknown variable " + variableName);
        }
        return valueFunction.apply(object);
    }
}

As said, it keeps the configuration, but not the state of the object used in a concrete calculation – that’s what the variable resolver does (and again, using a lambda – one simply can’t resist in this case). Sure, you could combine the VariableResolver with the mapping configuration too, but that would either 1) work in a single-threaded environment only, or 2) force you to reconfigure the mapping for each resolver in each thread. It simply doesn’t make sense. The mapper (long-lived) keeps the “way” to get stuff from an object of some type in a particular computation context, while the variable resolver (short-lived) merely closes over the concrete instance.

Of course, our mapper could stand some improvements. It would be good if one could “seal” the configuration so that no more set calls are allowed after that (probably throwing IllegalStateException) – sketched below.
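
One possible shape of such sealing (an assumption, not something from the repo):

public VariableMapper<T> seal() {
    sealed = true; // a new private boolean field
    return this;
}

public VariableMapper<T> set(String variableName, Function<T, Object> valueFunction) {
    if (sealed) {
        throw new IllegalStateException("VariableMapper is already sealed");
    }
    variableValueFunctions.put(variableName, valueFunction);
    return this;
}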

Conclusion

So here we are, supporting even more types (Integer/BigDecimal), but – most importantly – variables! As you can see, every computation can now bring a different result. That’s why it’s advisable to rethink how you instantiate your visitors, especially in a multi-threaded environment.

Our ExpressionVariableResolver interface is very simple – it gets only the variable name – so if you want to resolve against something stateful (and probably mutable), it’s important to wrap around it somehow. The variable resolver doesn’t know how to get stuff from an object, because there is no such input parameter. That’s why we introduced the VariableMapper, which supports getting values from an object of some (generic) type. And we “implement” the variable resolver as a lambda that closes over the configured variable mapper and the object that is then fed to its resolveVariable method. This method, in contrast to the variable resolver’s resolve, takes the object as a parameter.

It doesn’t have to be an object – you may implement other ways to get variable values in different contexts; you just have to wrap around that context (in our case an object) somehow. I dare say that Java 8 functional programming capabilities make it so much easier…

Still, the main hero here is ANTLR v4, of course. Now our expression evaluator truly makes sense. I’m not promising any continuation of this series, but maybe I’ll cover functions too. Although I guess you can easily implement them yourself by now.

Exploring the cloud with AWS Free Tier (2)

In the first part of this “diary” I found a cloud provider for my developer testing needs – Amazon’s AWS. This time we will mention some hiccups one may encounter when doing basic operations around an EC2 instance. Finally, we will prepare a Docker image for ourselves, although this is not really AWS-specific – at least not in our basic scenario.

Shut it down!

When you shut down your desktop computer, you see what it does. I’ve been running Windows for some years, although I was a Linux guy before (blame gaming and music home recording). On servers, no doubt, I prefer Linux every time. But I honestly don’t remember what happens if I enter the shutdown now command without further options.

If I see the computer going on and on although my OS is already down, I just turn it off and remember to use the -h switch next time. But when “my computer” runs far away and only some dashboard shows what is happening, you simply don’t know for sure. There is no room for “mechanical sympathy”.

Long story short – always use shutdown now -h on your AMI instance if you really want to stop it. Of course, check the instance’s Shutdown Behavior setting – by default it’s Stop, and that’s probably what you want (Terminate would delete the instance altogether). With the magical -h you’ll soon see the state of the instance go through stopping to stopped – without it, the instance just hangs there running, but not really reachable.

Watch those volumes

When you shut down your EC2 instances, they stop consuming “instance-hours”. On the other hand, if you spin up 100 t2.micro instances and run them for an hour, you’ll spend 100 hours of your 750-hour monthly limit. It’s easy to understand this way of “spending”.

However, volumes (the disk space for your EC2 instances) work a bit differently. They are reserved for you and they are billed for the whole time you have them available – whether the instance runs or not. Also, how much of the space you really use is NOT important; your reserved space (typically 8 GiB for a t2.micro instance if you use the defaults) is what counts. Two sleeping instances for the whole month would not hit the limit, but three would – and the 4 GiB above the 20 GiB/month would be billed to you (depending on the time you are above the limit as well).
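
Spelled out with the default 8 GiB volumes against the 20 GiB/month allowance mentioned above:

1 instance  × 8 GiB =  8 GiB … within the allowance
2 instances × 8 GiB = 16 GiB … still within
3 instances × 8 GiB = 24 GiB … 4 GiB over, billed for the time the volumes exist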

In any case, the Billing Management Console is your friend here, and AWS definitely provides you with all the data necessary to see where you are with your usage.

Back to Docker

I wanted to play with Docker before I decided to couple it with the cloud exploration. AWS provides the so-called EC2 Container Service (ECS) to give you more power when managing containers, but today we will not go there. We will create a Docker image manually, right on our EC2 instance. I’d rather take baby steps than skip some “maturity levels” without understanding the basics.

When I want to “deploy” a Java application in a container, I first want to create some Java base image for it. So let’s connect to our EC2 instance and do it.

Java 32-bit base image

Let’s create our base image for Java applications first. Create a directory (any name will do, but something like java-base sounds reasonable) and this Dockerfile in it:

FROM ubuntu:14.04
MAINTAINER virgo47

# We want WGET in any case
RUN apt-get -qqy install wget

# For 32-bit Java we need to enable 32-bit binaries
RUN dpkg --add-architecture i386
RUN apt-get -qqy update
RUN apt-get -qqy install libc6:i386 libncurses5:i386 libstdc++6:i386

ENV HOME /root

# Install 32-bit JAVA
WORKDIR $HOME
RUN wget -q --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u60-b27/jdk-8u60-linux-i586.tar.gz
RUN tar xzf jdk-8u60-linux-i586.tar.gz
ENV JAVA_HOME $HOME/jdk1.8.0_60
ENV PATH $JAVA_HOME/bin:$PATH

Then to build it (you must be in the directory with Dockerfile):

$ docker build -t virgo47/jaba .

Jaba stands for “java base”. And to test it:

$ docker run -ti virgo47/jaba
root@46d1b8156c7c:~# java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) Client VM (build 25.60-b23, mixed mode)
root@46d1b8156c7c:~# exit

My application image

Now I want to run my HelloWorld application on top of that base image. That means creating another image based on virgo47/jaba. Create another directory (myapp) and the following Dockerfile:

FROM virgo47/jaba
MAINTAINER virgo47

WORKDIR /root/
COPY HelloWorld.java ./
RUN javac HelloWorld.java
CMD java HelloWorld

Easy enough, but before we can build it we need that HelloWorld.java too. I guess anybody can write it, but for the sake of completeness:

public class HelloWorld {
        public static void main(String... args) {
                System.out.println("Hello, world!");
        }
}

Now let’s build it:

$ docker build -t virgo47/myapp .

And to test it:

$ docker run -ti virgo47/myapp
Hello, world!

So it actually works! But we should probably deliver a JAR file directly into the image build instead of compiling it during the build. Can we automate it? Sure we can, but maybe in another post. A possible direction is sketched below.
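
Just to indicate that direction, the Dockerfile might then look like this (myapp.jar being the artifact produced by the regular build; a sketch, not what I actually ran):

FROM virgo47/jaba
MAINTAINER virgo47

WORKDIR /root/
# the JAR is built elsewhere (e.g. by Gradle) and only copied in
COPY myapp.jar ./
CMD java -jar myapp.jar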

To wrap up…

I hope I’ll get to Amazon’s ECS later, because the things above work and are decent Docker(file) practice, but they definitely are not for the real world. You may at least run it all from your local machine as a combination of scp/ssh, instead of creating Dockerfiles and other sources on the remote machine – because that, of course, doesn’t make sense. We need to build the Docker image as part of our build process, publish it somewhere and just download it to the target environment. But let’s step away from Docker and back to AWS.

In the meantime one big AWS event occurred – AWS re:Invent 2015. I have to admit I wasn’t aware of it at all until now; I just got email notifications about the event and the keynotes as an AWS user. I am aware of other conferences – I was happy enough to attend some European Sun TechDays (how I miss those :-)), TheServerSide Java Symposiums (miss those too) and one DEVOXX – but just judging from the videos, re:Invent was really mind-blowing.

I don’t know what more to say, so I’m over and out for now. It will probably take me another couple of weeks to form more concrete impressions of AWS, but I plan to add a third part – hopefully again loosely coupled with Docker.