Self-extracting install shell script with Gradle

My road to Gradle was much longer than I wanted. Now I use it on the project at the company and I definitely don’t want to go back. Sure, I got used to Maven and we got familiar – although I never loved it (often on the contrary). I’m not sure I love Gradle (yet), but I definitely feel empowered with it. It’s my responsibility to factor my builds properly, but I can always rely on the fact that I CAN do the stuff. And that’s very refreshing.

Learning

Gradle is not easier to learn that Maven I guess. I read Building and Testing with Gradle when it was easily downloadable from Gradle’s books page (not sure what happened to it but you probably still can get it somehow). The trouble with Gradle is that sometimes the DSL changes a bit – and your best bet is to know how the DSL relates to the API. The core concepts are more important and more stable than DLS and some superficial idioms.

Does it mean you have to invest heavily in Gradle? Well, not heavily, but you don’t want to merely scratch the surface if you want to crack why some StackOverflow solutions from 2012 don’t work anymore out-of-the-box. I’m reading Gradle in Action now, nearly finished, and I can just say – it was another good time investment.

My problem

I wanted to put together couple of binary artefacts and make self-extracting shell script from them. This, basically, is just a zip and a cat command with some install.sh head script. Zip gets all the binary artefacts together and cat joins the install.sh with this ZIP file – both separated by some clear separator. I used this “technology” back in the noughties, and even then it was old already.

How can such an install.sh head script look like? Depends on many things. Do you need to unzip into a temporary directory and run some installer from there? Is unzipping itself the installation? Let’s just focus on the “stage separation”, because the rest clearly depends on your specific needs. (This time I used this article for my “head” script, but there are probably many ways how to unzip the “tail” of the file. Also, in the article TGZ was used, I went for ZIP as people around me are more familiar with that one.)

#!/bin/bash
set -eu

# temporary dir? target installation dir?
EXTRACT_TO=out

echo "Extracting (AKA installing)..."
ARCHIVE=`awk '/^__ARCHIVE_BELOW__/ {print NR + 1; exit 0; }' $0`
tail -n+$ARCHIVE $0 > tmp.zip
unzip -q tmp.zip -d $EXTRACT_TO
rm -f tmp.zip

# the rest is custom, but exit 0 is needed before the separator
exit 0

__ARCHIVE_BELOW__

Now, depending on the way how you connect both files you may or may not need empty line under the separator.

To be more precise and complicated at the same time, there may need to be LF (\n or ASCII 10) at the end of the separator line or not. Beware of differences in “last empty line” meaning in various editors (Windows vs Linux), e.g. vim by defaults expects line terminator at the end of the file, but does not show empty line, while Windows editors typically do (see the explanation).

Concatenation… with Gradle (or Ant?)

Using cat command is easy (and that one requires line-feed after separator). But I don’t want to script it this time. I want to write it in Gradle. And Gradle gives me multiple superpowers. One is called Groovy (or Kotlin, if you like, but I’m not there yet). The other is called Ant. (Ant? Seriously?! Yes, seriously.)

Now I’m not claiming that built-in Ant is the best solution for this particular problem (as we will see), but Ant already has a task called concat. BTW: Ant’s task is just a step or action you can execute, if you thing Gradle-tasks think Ant-targets, and Ant’s targets are not of our concern here.

Ant provides many actions out of the box and all you need to do is use “ant.” to get to Gradle’s AntBuilder. But before we try that, let’s try something straightforward, because if you can access file, you can access its content too. One option is to use File’s text property and something like in this answer. Groovy script looks like this:

apply plugin: 'base' // to support clean task

task createArchive(type: Zip) {
    archiveName 'binary.zip'
    from 'src/content'
}

task createInstallerRaw(dependsOn: createArchive) {
  doLast {
    file("${buildDir}/install.sh").text =
      file('src/install.sh').text + createArchive.outputs.files.singleFile.text
  }
}

OK, so let’s try it:

./gradlew clean createInstallerRaw
# it does it’s stuff
build/install.sh
diff build/distributions/binary.zip tmp.zip
# this prints out:
Exctracting (AKA installing)...
Binary files build/distributions/binary.zip and tmp.zip differ

Eh, the last line is definitely something we don’t want to see. I used couple of empty files in src/content, but with realistic content you’d also see something like:

  error:  invalid compressed data to inflate out/Hot Space/01 - Staying Power.mp3
out/Hot Space/01 - Staying Power.mp3  bad CRC 00000000  (should be 9faa50ed)

Let’s get binary

File.text is for strings, not for grown-ups. Let’s do it better. We may try the bytes property, perhaps joining the byte arrays and eventually I ended up with something like:

task createInstallerRawBinary(dependsOn: createArchive) {
  doLast {
    file("${buildDir}/install.sh").withOutputStream {
      it.write file('src/install.sh').bytes
      it.write createArchive.outputs.files.singleFile.bytes
    }
  }
}

Now this looks better:

./gradlew clean createInstallerRawBinary
# it does it’s stuff
build/install.sh
diff build/distributions/binary.zip tmp.zip

And the diff says nothing. And even Hot Space mp3 files play back flawlessly (well, it’s not FLAC, I know). But wait – let’s try no-op build:

./gradlew createInstallerRawBinary
#...
BUILD SUCCESSFUL in 1s
2 actionable tasks: 1 executed, 1 up-to-date

See that 1 executed in the output? This build zips the stuff again and again. It works, but it definitely is not right. It’s not Gradlish enough.

Inputs/outputs please!

Gradle tasks have inputs and outputs properties that declaratively specify what the task needs and what it produces. There is nothing that prevents you from using more than you declare, but then you break your own contract. This mechanism is very flexible as it allows Gradle to check what needs to be run and what can be skipped. Let’s use it:

task createInstallerWithInOuts {
  inputs.files 'src/install.sh', createArchive
  outputs.file "${buildDir}/install.sh"

  doLast {
    outputs.files.singleFile.withOutputStream { outStream ->
      inputs.files.each {
        outStream.write it.bytes
      }
    }
  }
}

Couple of points here:

  • It’s clear what code configures the task (first lines declaring inputs/outputs) and what is task action (closure after doLast). You should know basics about Gradle’s build lifecycle.
  • With both inputs and outputs declared, we can use them without any need to duplicate the file names. We foreshadowed this in the previous task already when we used createArchive.outputs.files.singleFile… instead of “${buildDir}/distributions/binary.zip”. This works its magic when you change the archiveName in the createArchive task – you don’t have to do anything in downstream tasks.
  • No dependsOn is necessary here, just mentioning the createArchive task as input (Gradle reads it as “outputs from createArchive task”, of course) adds the implicit, but quite clear, dependency.
  • With inputs.files we can as well try to iterate over them. Here I chose default it for the inner closure and had to name the parameter outStream.

Does it fix our no-op build? Sure it does – just try to run it twice yourself (without clean of course).

Where is that Ant?

No, I didn’t forget Ant, but I wanted to use some Groovy before we get to it. I actually didn’t measure which is better, for archives in tens of megabytes it doesn’t really matter. What does matter is that Ant clearly says “concat”:

task createInstallerAntConcat {
  inputs.files 'src/install.sh', createArchive
  outputs.file "${buildDir}/install.sh"

  doLast {
    // You definitely want binary=true if you append ZIP, otherwise expect corruptions
    ant.concat(destfile: outputs.files.singleFile, binary: true) {
      // ...or mulitple single-file filesets
      inputs.files.each { file ->
        fileset(file: relativePath(file))
      }
    }
  }
}

This uses Ant task concat – it concatenates files mentioned in nested fileset. This is equivalent Ant snippet:

<concat destfile="${build.dir}/install.sh" binary="yes">
  <fileset file="${src.dir}/install.sh"/>
  <fileset file="${build.dir}/distributions/binary.zip"/>
</concat>

It’s imperative to set the binary flag to true (default false), as we work with binary content (ZIP). Using single-file filesets assures the order of concatenation, if we used something like (in the doLast block)…

ant.concat(destfile: outputs.files.singleFile, binary: true) {
  fileset(dir: projectDir.getPath()) {
    inputs.files.each { file ->
      include(name: relativePath(file))
    }
  }
}

…we may get lucky and get the right result, but just as likely the ZIP will be first. The point is, fileset does not represent files in the order of nested includes.

We may try filelist instead. Instead of include elements it uses file elements. So let’s do it:

ant.concat(destfile: outputs.files.singleFile, binary: true) {
  filelist(dir: projectDir.getPath()) {
    inputs.files.each { file ->
      file(name: relativePath(file))
    }
  }
}

If we run this task the build fails on runtime error (during the execution phase):

* What went wrong:
Execution failed for task ':createInstallerAntConcatFilelistBadFile'.
> No signature of method: java.io.File.call() is applicable for argument types: (java.util.LinkedHashMap) values: [[name:src\install.sh]]
  Possible solutions: wait(), any(), wait(long), each(groovy.lang.Closure), any(groovy.lang.Closure), list()

Hm, file(…) tried to create new java.io.File, not Ant’s file element. In other words, it did the same thing like anywhere else in the Gradle script, we’ve already used file(…) construct. But it doesn’t like maps and, most importantly, is not what we want here.

What worked for include previously -although the build didn’t do what we wanted for other reasons – does not work here. We need to tell the Gradle explicitly we want to use Ant – and all we need to do is to use ant.files(…).

Wrapping it up

Now when I’m trying it I have to say I’m glad I learned more about Gradle-Ant integration, but I’ll just use one of the non-ant solutions. It seems that ant.concat is considerably slower.

In any case it’s good when you understand Gradle build phases (lifecycle) and you know how to specify task inputs/outputs.

When working with files, it’s always important to realize whether you work with texts or binaries, whether it matters, how it’s supported, etc. It’s important to know if/how your solution supports order of the files when it matters.

Lastly – when working with shell scripts it’s also important to assure they use the right kind of line terminators. With Git’s typically automatic line breaks you can’t just pack a shell script with CRLF and run it on Linux – this typically results in a rather confusing error that /bin/bash is not the right interpreter. Using editor in some binary mode helps to discover the problem (e.g. vi -b myscript.sh). But that is not Gradle topic anymore.

I like the flexibility Gradle provides to me. It pays off to learn its basics and to know how to work with its documentation and API. But with that you mostly get your rewards.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s