Kubernetes AppOps Security Series

In 2019 I wrote a series of articles comprising six articles on Kubernetes AppOps Security (Network Policies, Security Context and PodSecurityPolicies) that has successively been published on German magazine JavaSPEKTRUM, starting in it’s 05/2019 issue.

I’m pleased to announce that the series is now completely published and available in English and German on the Cloudogu Blog:

  1. Network Policies – Part 1 – Good Practices | 🖺 original article PDF (German)
  2. Network Policies – Part 2 – Advanced Topics and Tips | 🖺 original article PDF (German)
  3. Security Context – Part 1: Good Practices | 🖺 original article PDF (German)
  4. Security Context – Part 2: Background | 🖺 original article PDF (German)
  5. Pod Security Policies – Part 1: Good Practices | 🖺 original article PDF (German)
  6. Pod Security Policies – Part 2: Exceptions and Troubleshooting | 🖺 original article PDF (German)

Going along with the articles went some open source demos showcasing the appOps security features: cloudogu/k8s-security-demos.

In addition, I had the honor of presenting the topic on several conferences.

Finally, we created a “Cloud Native Appli­cation Security” training at Cloudogu where you can get your hands on these topics, among others.

It has been a most intersting journey on which I learned a lot and experienced lots of support from my dear colleages at Cloudogu. Thank you so much!

Advertisement

Automatic Let’s Encrypt Certificates with Apache Tomcat / Spring Boot

TLDR;

I implemented a solution for fetching and renewing TLS certs without restart via Let’s Encrypt that works with standalone and embedded Tomcat as well as Spring Boot. It’s packaged into a Docker image, allowing for easy reuse. The certs are fetched and renewed using dehydrated, a bash script with few dependencies. This makes the TLS requesting agnostic to the application or framework in use,  as long as the application is shipped as a Docker image. The downside is, that it’s not a plain Java solution, i.e. it requires an additional process (within the container) and a Linux environment.


Having used cloud-native products like Caddy, Traefik, CertManager or Micronaut that naturally obtain and renew Let’s Encrypt certificates for HTTPS communication, I wondered if there is an easy way to use a feature such as this in a web server as long-established as Apache Tomcat. You might say: Tomcat usually  is used behind reverse proxies that do all the TLS offloading. But then there’s something like the Apache Portable Runtime (APR) based Native library for Tomcat that brings the heart of the Apache Web Server to Tomcat as a system library and with it it’s high TLS performance, that might even outperform NGINX. So obviously, this should be possible.

Surprisingly, I couldn’t find a solution with batteries included. So, let’s quickly implement our own. How hard can it be?

Dehydrated tomcat

Presenting: letsencrypt-tomcat – it packs everything needed for automatic TLS with tomcat into a single docker image:

  • dehydrated to manage certs via Let’s Encrypt,
  • tomcat-reloading-connector for hot reloading certs at runtime after renewal,
  • an init system (dumb-init) for properly handling tomcat and dehydrated processes,
  • an entrypoint script that starts up tomcat and dehydrated as well as
  • a pre-compiled version of Apache Portable Runtime (APR) and JNI wrappers for APR used by Tomcat (libtcnative), so tomcat delivers the best TLS performance possible.

This sounds like a lot of complicated stuff, but as the examples bellow will show, it’s rather easy to use. If you care about the details, check the “Challenges” bellow.

From the image everything can be copied as needed in the actual application’s Dockerfile. It’s not intended to as a base image (“inheritance”), but as a mere file archive to copy everything from a single source.

Usage

The way to use depends on the tomcat “kind” in use: standalone, embedded Tomcat orSpring Boot (which also uses embedded tomcat under the hood).

There’s one thing the setup for all kinds have in common: you pass the FQDN as environment variable DOMAIN to the container at startup, e.g. with docker run -e DOMAIN=example.com <image>

If necessary more parameters (optional) are available.

The remaining setup depends on the kind of tomcat used.

Sprint Boot

  • pom.xml or build.gradle: Add tomcat-reloading-connector to your dependencies (example)
  • Dockefile:
    • Copy /letsencrypt and /lib folder to the root (/) of your application’s docker image (example),
    • Set ENTRYPOINT or CMD to execute /meta-entrypoint.sh and pass the usual command for starting your spring boot app as parameter, for example:
      ENTRYPOINT [ "/meta-entrypoint.sh", "java", "-jar", "/app/app.jar" ]
  • application.properties or .yaml (example):
    • Configure Tomcat to serve Let’s Encrypt challenges: spring.resources.staticLocations=file:/static
    • Configure paths for certificate files: server.ssl.certificateKeyFile, etc.
    • Serve both HTTP and HTTPS ports
  • In the code that creates your Spring Boot app: Setup tomcat-reloading-connector: @Import(ReloadingTomcatServletWebServerFactory.class) (complete example)

Embedded tomcat

  • pom.xml or build.gradle: Add tomcat-reloading-connector to your dependencies (example)
  • Dockefile:
    • Copy /letsencrypt and /lib folder to the root (/) of your application’s docker image (example),
    • Set ENTRYPOINT or CMD to execute /meta-entrypoint.sh and pass the usual command for starting your app as parameter, for example:
      ENTRYPOINT [ "/meta-entrypoint.sh", "java", "-jar", "-Dcatalina.home=/tmp", "/app/app.jar" ]
  • In the code that creates embedded tomcat:
    • Configure Tomcat to serve Let’s Encrypt challenges (example)
    • Use tomcat-reloading-connector and configure paths for certificate files: ReloadingTomcatConnectorFactory.addHttpsConnector(tomcat, HTTPS_PORT, PK, CRT, CA); (complete example)

Standalone tomcat

  • Dockefile:
    • copy the following to the root (/) of your application’s docker image (example):
      • /letsencryptfolder,
      • tomcat-reloading-connector and
      • (if necessary) /lib folder (already included in bitnami-tomcat used in our example) ,
    • Make standalone tomcat serve Let’s Encrypt challenges (example)
  • server.xml:
    • Use tomcat-reloading-connector (example)
    • Configure paths for certificate files (example)

Real-world example

My use case for using this whole thing is my git-based wiki smeagol-galore, which relies on letsencrypt-tomcat. It already had a quite complex startup process, so I decided here to not use the meta-entrypoint but to copy the parts needed into the existing entrypoint.

Challenges

Now that we have seen working examples, it’s a good time to sum up technical challenges and decisions encountered within the process. Turns out the whole indeavor had more challenges than expected.

Choosing an ACME implementation

As of 09/2020 there doesn’t seem to be a drop-in web app for tomcat or a spring boot annotation that just handles the ACME protocol used by Let’s encrypt for us.

There are Java libraries that promise to help achieving this, most prominently acme4j. But looking at the docs or this tutorial the whole process seems rather complicated. There’s the spring-boot-starter-acme which seems be able to get a cert. To use the cert, we have to restart the server, though. It also does not renew, is still a SNAPSHOT and hasn’t been updates for some years.

A completely different alternative would be to use the certbot client. But then it brings it’s own server, which would require our application to stop and start again, once certbot has finished renewing the cert.

This is where dehydrated comes in handy: it implements the whole ACME protocol used by Let’s encrypt, in a single bash script with no other dependencies than bash, curl and OpenSSL. It’s also easy to use and does not bring it’s own webserver. That’s good, because we already have a webserver: Tomcat.

How to best integrate a bash script and a tomcat process into a single artifact? Why not use an OCI (a.k.a. Docker) image? Two processes in one container? But doesn’t this violate the “one process per container” mantra?
I’m convinced that if there are good reasons for it, why not use two coherent processes in one container? We might even start chaning the mantra to “one thing per container“. The fact that there a multiple popular init systems for Docker images makes me think that I’m not the only one holding this opinion.

So let’s combine tomcat with dehydrated within a Docker container.

Integrating Tomcat and Dehydrated

The basic idea is:

  • Tomcat starts,
  • Dehydrated starts the certificate signing process via Let’s Encrypt,
  • serves the challenge via Tomcat, then
  • downloads the certificate, so that finally
  • Tomcat can server the certificate.
  • Once a day, Dehydrates checks again if the certs must be renewed.

This results in two “interfaces” between Tomcat and Dehydrated:

  • the folder where Dehydrated places the challenges must be served by Tomcat at http://$DOMAIN/.well-known/acme-challenge/and
  • the cert files, written by Dehydrated and read by Tomcat.

Depending on the kind of Tomcat, configuring the cert files and making Tomcat serve the the challenge folder on the HTTP port requires a different setup (see “Usage” above).

Self-signed cert

Once this is done, let’s start Tomcat. Only it does not start without certificate. But we need it started, in order to answer Let’s Encrypt’s challenge.
Who was first, the chicken or the egg?

One solution: create a self-signed cert at startup (see meta-entrypoint.sh). With a validity of 30 days or less, dehydrated will try to renew the cert right away.

Synchronize startup of Tomcat and Dehydrated

Tomcat might get its HTTP ports ready to serve traffic in a matter of seconds but, depending on the application, it could als take minutes. The latter resulting in Dehydrated failing to get the the certificate from Let’s Encrypt. Solution: Poll the HTTP port, waiting for a succesfull HTTP status code.

PEM vs KeyStore

With Tomcat’s default Java implementation of HTTPS (Http11NioProtocol) we would also have issues with the certificate file formats, because it requires a Java Keystore, whereas Let’s Encrypt provides private key, certificate and CA as PEM. However, we are using APR (Http11AprProtocol), which also requires PEM. So it just works™️.

Reloading the certificate

Believe it or now: At this point we have a valid certificate. Only that Tomcat still serves the old self-signed one.

Next stop, how to make Tomcat reload the cert? Tomcat 9.x (might even have been backported to 8.5) provides an AbstractHttp11Protocol.reloadSslHostConfigs() method. But how to call it?
Dehydrated offers a hook for calling “something” after success.  But how to call this method inside Tomcat from a shell script? JMX? Or via an URL request on Tomcat Manager web app? These solution would all require configuration, increase complexity and might even impair security.

The other option is to call the method from within Tomcat, by implementing our own protocol class. It could regularly check and reload the certificate on change. But then polling is just a waste of resources and it will always take longer than something that is called on change.
Other idea: Set a file watch. For better reuse, maintainance and separation of concerns, I implemented this in a separate project, tomcat-reloading-connector (too late I figured out that “tomcat-reloading-protocol” would be a better fit 🙈 ) and published it on maven central. While implementing, I came to the realization that reloading “the certificate” should be an atomic operation, containing all certificate-related files. After all, what I refer to as “the certificate” above, actually consists of a number of files, like private key, cert and CA. Reloading only one file might result in an inconsistent state. In order to avoid this, I added a reload delay of a few seconds. A changing file within the folder of the certificate results in all SSLHostConfigs being reloaded a couple of seconds later. Still not perfect, as the delay is a compromise between a fast reload and an inconsistent SSLHostConfig. For starters, let’s stick with it. For Dehydrated 3s seemed fine from my empirical studies. If it does not work for you, just define what ever suits your needs via the TOMCAT_DELAY_RELOAD_CERTIFICATES_MILLIS environment variable.

Conclusion

Creating a dehydrated Tomcat wasn’t straightforward. The most laborious task being the reload of the certificate, resulting in a separate project: tomcat-reloading-connector.

With letsencrypt-tomcat most of the work has been done, allowing for rather easy reuse. However, for each project, there still is some bootstrapping to be done. A 3rd party solution such as this can never compete with built-in support for Let’s Encrypt as offered by other products. On the other hand there now is a reliable solution that can run on all kinds of Tomcats be it standalone or embedded, even with Spring Boot.

Future work

That’s all for now. As always, there some work is still to be done for the curios reader.

Getting letsencrypt-tomcat to work wit Let’s Encrypt’s TLS-ALPN-01 challenge would be useful to get rid of the HTTP port. It generally is supported by Dehydrated but might be challengin to realize using Tomcat.

One issue with relying on a custom Dockerfile for creating the artifact is that it does not work intuitively with Cloud Native Build Packs. Spring Boot 2.3 comes with built-in support for building OCI images without Dockerfile. How can we integrate this with letsencrypt-tomcat? It should be possible to just add another layer, and adapt the entrypoint.

Continuous Delivery to Maven Central with Travis CI

Having used continuous delivery in my professional life for years, I would never want to miss it, even with my private projects. Recently, I had the challenge of deploying an artifact to maven central. I implemented this before using Jenkins. However, there’s no way (I know of) to get a Jenkins As A Service for free for OSS projects. So it was convenient to use TravisCI.

Surprisingly, I didn’t find a turnkey solution online. So I had to come up with my own solution. Here’s how it goes:

language: java
jdk:
- openjdk8

script:
  - |
    cat << EOF > settings.xml
    <settings>
      <servers>
        <server>
          <id>ossrh</id>
          <username>\${env.OSSRH_USER}</username>
          <password>\${env.OSSRH_TOKEN}</password>
        </server>
      </servers>
    </settings>
    EOF

- TMP_KEY="$(mktemp)"
- echo "${PK_BASE64}" | base64 -d > "${TMP_KEY}"
- export PGP_SECRETKEY="keyfile:${TMP_KEY}"

- ./mvnw clean package

deploy:
skip_cleanup: true
provider: script
script: bash ./mvnw -Prelease -s settings.xml deploy -DskipTests
on:
branch: master

cache:
directories:
- '$HOME/.m2/repository'
- '$HOME/.sonar/cache'

This requires four environment variables defined in your travis-ci project and a profile in your pom.xml:

  • OSSRH_USER – Username to Nexus
  • OSSRH_TOKEN – Access Token to Nexus
  • PGP_PASSPHRASE in the format literal:PASSPHRASE
  • PK_BASE64 – Content of your PK, base64 encoded, create with e.g. base64 -w0 pk.asc

Details will be described in the following paragraphs.

Preface

If you haven read a guide on how to deploy to maven central, do it now. The requirements, keys, signing, etc. are not within the scope of this article.

Repository credentials / settings.xml

  • The OSSRH_USER and OSSRH_TOKEN are needed for authenticating against the nexus repository where the artifacts are deployed. While you could use your username and password, I recommend to create an access token, though. It could be revoked more easily.
  • Credentials are passed to maven via settings.xml. In order to make this process secure, we create a settings.xml in the travis.yaml and tell it to read the actual credentials from the environment. We need to escape the $ because it is not supposed to be interpreted by the shell that writes the settings.xml, but by maven when reading the settings.xml
  • When calling maven, we tell it to use the local settings.xml via the -s parameter.

Signing artifacts

For signing the artifacts, kohsuke’s pgp-maven-plugin (from 2014 😲) still seems the easiest way for headless use such as in CI environments.
First, we have to specify the plugin in the pom.xml. My example pom.xml shows how.

We then can easily pass the passphrase for the private key to the plugins using the env var: PGP_PASSPHRASE. Note that we can’t just copy the plain passphrase, we have to tell the plugin, that’s its a literal, by using the format literal:PASSPHRASE.

For the private key itself, there are two challenges:

  1. How do we get the key into travis?
  2. How do we get the key into maven?

How do we get the key into travis?

Unfortunately, other than Jenkins, travis does not seem to provide a way of uploading secret files. If I get the docs right, I could encrypt the private key and push it into my repo. Has anything ever felt so wrong as pushing a private key into a public repository? So I tried loading it into an env var. Here the docs tell me to “escape any Bash special characters”, which is quite a challenge for a private key. Workaround: Base64 encode the whole thing before pasting it into travis.

How do we get the key into maven?

Now that the private key is ready in an env var, it turns out that the plugin’s sole open issue (opened 2018) is that it can’t read a private key from an env var.
So, while base64 decoding the private key, it is pasted into a temporary file, which is then passed to the plugin via an env var.

Epilogue

Only 20 lines of yaml to continuously deliver artifacts to maven central, including signing and all. Not so bad, is it?

Please keep in mind that the version within your pom.xml decides where the artifact is deployed to:

  • If it ends in -SNAPSHOT it is deployed to the SNAPSHOT repository
  • Otherwise, it’s deployed to the releases repository and promoted to maven central.

Querying docker image sizes via the command line

TLDR;
Querying and comparing the size of different docker image tags from the command line can be as easy as follows:

docker run --rm schnatterer/docker-image-size <docker image name> [<extended grep regex on docker tag>]

# e.g.
docker run --rm schnatterer/docker-image-size adoptopenjdk '^11.0.4' \
   | grep 'amd64 linux'

# A single image size can be queried faster like so
docker run --rm --entrypoint docker-image-size-curl.sh schnatterer/docker-image-size adoptopenjdk:11.0.4_11-jre-hotspot

Unfortunately, it’s takes minutes.


When starting a new project, building a CI/CD pipeline or when deploying an app off the shelf we often have the challenge of selecting not only a suitable docker / OCI image but also which tag of the image to use.

The trouble at Docker Hub

The first stop is usually on hub.docker.com where it is confusing to select among the different versions, variants and architectures, e.g. JDK, node, ruby or .net (which does not even have tags on docker hub).

reg to the rescue

In situations such as these genuinetools/reg is very helpful to get a textual overview of all tags of a repo on the command line which can easily be grepped for what is needed. For example:

reg tags adoptopenjdk | grep 11.0.4
# We can apply more complex regex
reg tags adoptopenjdk | grep -P '^(?!.*windows)11.0.4'

# Actually we don't even need to install reg, we can just user docker for distribution
docker run r.j3ss.co/reg tags openjdk | grep -e '^11.0.4-'

# BTW I use:
alias reg='docker run r.j3ss.co/reg'
# Every now and then I update like so
docker pull r.j3ss.co/reg

Image size

One thing that is missing though, is the image size. It’s not that important but it gives a hint which tags point to the same image and also might be another fact to choose or not to choose one image over the other.

While preparing one of my docker training courses, I answered a question on stackoverflow on how Docker Hub sums up the image size. After that I found myself returning to my own answer every now and then because I wanted to reuse the script for finding out image sizes via the command line.

docker-images-size

Having done this too often, I decided to implement a script that does this more convenient: schnatterer/docker-image-size.

It turned out to become four scripts, actually:

  • Three for the different ways of querying docker manifests (reg manifest, docker manifest and via curl)
  • and another one that uses the above one and combines it with the reg tags command for comparing image sizes.

They also come neatly packed into a docker image, and can be used like so:

docker run --rm schnatterer/docker-image-size  <docker image name> [<extended grep regex on docker tag>]

# If used more often, an alias provides more convenience:
alias docker-image-sizes='docker run --rm -e DIS_IMPL schnatterer/docker-image-size'

docker-image-sizes <docker image name> [<extended grep regex on docker tag>]

Different Implementations

While implementing, I completely underestimated the peculiarities of the docker distribution protocol (standardized as OCI distribution spec), with all those different implementations in registries (Docker Hub, MCR, GCR, quay.io, etc.), multiple architectures, repo digests, manifest versions (V1 vs V2) and docker login. This lead to the different implementations providing different features:

Docker Image Size features

Retrospectively, using docker manifest seems like the most versatile solution, being compatible with most repos. Unfortunately, querying a single manifest might take more than 10 seconds.

Still, we can now easily compare different tags of docker image using the following commands, for example:


# Match all tags containing '11' (a whole lot. Will take ages!)
docker-image-sizes adoptopenjdk 11
# More accurate (and faster) output
docker-image-sizes adoptopenjdk '^11.0.4'
# Multi arg results can be filtered using grep
docker-image-sizes adoptopenjdk '^11.0.4' | grep 'amd64 linux'

The results will take some minutes (as mentioned above) and look like this:

adoptopenjdk:11.0.4_11-jdk-hotspot amd64 linux : 235 MB
adoptopenjdk:11.0.4_11-jdk-hotspot-bionic amd64 linux : 235 MB
adoptopenjdk:11.0.4_11-jdk-openj9-0.15.1 amd64 linux : 236 MB
adoptopenjdk:11.0.4_11-jdk-openj9-0.15.1-bionic amd64 linux : 236 MB
adoptopenjdk:11.0.4_11-jre-hotspot amd64 linux : 80 MB
adoptopenjdk:11.0.4_11-jre-hotspot-bionic amd64 linux : 80 MB
adoptopenjdk:11.0.4_11-jre-openj9-0.15.1 amd64 linux : 79 MB
adoptopenjdk:11.0.4_11-jre-openj9-0.15.1-bionic amd64 linux : 79 MB

Faster responses

If we don’t care about multi-arch anyway, we could use the curl implementation like so:

DIS_IMPL=curl docker-image-sizes adoptopenjdk '^11.0.4' 2>/dev/null

This results in a much faster response (about a minute), but fails for the windows variants (which we generously pipe to /dev/null).

Querying the size of a single image

If you care about size of one image only you can use the implementations directly, which for the curl implementation responds in about a second. Note that it takes only one parameter, similar to docker run:

$ docker-image-size-curl.sh adoptopenjdk:11.0.4_11-jre-hotspot
# Or using the docker image
$ docker run --rm --entrypoint docker-image-size-curl.sh schnatterer/docker-image-size adoptopenjdk:11.0.4_11-jre-hotspot
adoptopenjdk:11.0.4_11-jre-hotspot: 80 MB
# If used more often, an alias provides more convenience:
$ alias docker-image-size='docker run --rm --entrypoint docker-image-size-curl.sh schnatterer/docker-image-size'

The largest image ever

Fun fact at the end: Using docker-images-sizes, I found the largest image I ever (?):

adoptopenjdk:11.0.4_11-jdk-hotspot amd64 windows 10.0.14393.3204: 6100 MB

It still makes me smirk that a “leightweight” docker image for Windows is 67x the size of a whole Linux VM:

$ docker-image-sizes weaveworks/ignite-kernel 4.19.47
weaveworks/ignite-kernel:4.19.47: 14 MB

$ docker-image-sizes weaveworks/ignite-ubuntu 19.04-v0.5.2
weaveworks/ignite-ubuntu:19.04-v0.5.2: 77 MB

Generating a hard-coded build number/version name in your Java app

TLDR;

It’s now possible to easily generate a version number for Java apps during the build as a static final field without any runtime dependencies using the annotation processor of the cloudogu/versionName library.


While developing and operating apps with Java for more than a decade, I found one of the recurring problems when working with (or debugging) an app is the matching of the app to the corresponding source code revision/commit/version. So it’s usually a good idea to add a version name, commit id, or build number to your app. I wrote about this topic multiple times and authored a library (now cloudogu/versionName) to automate this task. I thought I’d be through with this topic. Thouroughly. Honestly.

However, while getting started with GraalVM native images, I struck me that the cloudogu/versionName library has shortcomings: It reads the version name from a file (MANIFEST.MF or properties file) within the jar. GraalVM native images are not packaged as jar, though. While we could of course copy the file manually, it made me wonder if a reading a read-only value from a text files isn’t the proper solution. Why don’t we hard-code this into a Java constant? After all, this feels more read-only than a text file and is much easier to read. Also, there’s no more risk of IOExceptions, different behaviours in JDKs, or environments (jar/war). But obviously the writing of the version name field (i.e. generating code) has to be automated.

I sat and thought how this could be realized. The official way of generating code with Java are annotation processors.  From a user perspective an implementation that uses byte code manipulation could be easier to use. OTOH this would require lots of vodoo. If you’re interested in the amount of voodoo, I recommend reading this classic: Project Lombok – Trick Explained.

So, annotations processors it is! Now, I’m happy to announce that the new version 2.1.0 of cloudogu/versionName contains an annotation processor, that generates a Version.NAME constant. This has another neat side effect: at runtime, there is no longer a need to include the library itself. All is done at compile time, zero dependencies at runtime!

All we have to do is add the dependency during compilation (e.g. using Maven’s provided scope) and set the version we want to create as a compiler argument. When using a build tool, it’s easy to generate the version name or build number as described in one the previous articles, and then pass them on to the compiler. For example with Maven:

<dependency>
    <groupId>com.cloudogu.versionName</groupId>
    <artifactId>processor</artifactId>
       <version>2.1.0</version>
   </dependency>
    <!-- This dependency is only needed during compile time and should not be packaged into final JAR -->
    <scope>provided</scope>
</dependency>
 
<!--- .. -->
 
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-compiler-plugin</artifactId>
    <version>3.8.0</version>
    <configuration>
        <compilerArgs>
            <arg>-AversionName=${versionName}</arg>
        </compilerArgs>
    </configuration>
</plugin>

As it is an annotation processor, we finally have to add an annotation @VersionName to a package or class for the annotation processor to hook in the compilation process. This will cause the processor to create a class Version containing a static final field NAME, during compilation. In the application, this can be read like any other field without the risk of any failures at runtime. Why didn’t I think of this before!
Here’s an example for switiching from the regular versionName library to the annotation processor.

One last side note: The annotation processor’s features are verified using unit tests, which is more complex as it might seem with the compilation process and all. Fortunately, I could reuse the approach of the Shiro Static Permission library (by my colleage @cloudogu, Sebastian Sdorra) that uses Google’s compile-testing library. Unit tests for annotation processors, how awesome is that?!

GraalVM (a bit) beyond Hello World

TLDR;

GraalVM native images are a real innovation for Java apps, providing much smaller Images and faster startup, especially useful for apps that are horizontally autoscaled. However, this causes a higher development effort, especially for existing apps. If included in new apps from the start, this effort should be smaller, but on the other hand we’re pinned to JDK8 with native images.


After working as developer with Java and Docker for many years (and more recently also as trainer) it’s obvious to me that Docker and Java are not a perfect match. As for all interpreted languages, the interpreter (the JVM, in case of Java) imposes a burden in both size and startup time on our Docker image/container.

Java shares this fate with other platforms such as .net, node.js and ruby. To make a point, the follwoing table shows a brief overview of the current “alpine” variants (i.e. the smallest) of those platform’s base images:

Image Size(MB)
 adoptopenjdk/openjdk8:jre8u212-b04-alpine 48
mcr.microsoft.com/dotnet/core/runtime:2.2.4-alpine 36
node:12.0.0-alpine 26
ruby:2.6.3-alpine3.9 26

(The sizes are the compresses size of all layers of the images within DockerHub. The image sizes is calculated as described here).

In contrast, nativly compiled apps written in Go or C/C++ can be built from scratch (if statically compiled), starting of at 0 MB.

GraalVM to the rescue

About a year ago, Oracle announced the first release candiadate of GraalVM offering better interoperability for Java Programms with non-JVM languages and precompiled native images with instant start up and low memory footprint (using ahead-of-time (AOT) compilation). In the announcement they stated that Twitter already uses GraalVM in production to efficiently run their Scala workloads.

From the start, this sounded like a innovation to me. As I’m always a bit low on free time to play around with technologies, I started by following the topic by reading articles that crossed my way, such as

Great introductions, though they all have one thing in common: They compile a single “Hello World” Java file into a native image.

This made me wonder if it’s also that easy for more real-life projects and what pitfalls there are. Especially regarding the known limitations for native image generation, like reflection.

Getting the hands dirty

My first approach was to naively adopt the Dockerfiles for two existing projects to produce native Images. Containing the compilation in Dockerfiles has the advantage of not requiring anything installed locally and is also a kind of “infrastructure as code”, providing deterministic results and allowing for version control.

The projects I chose were

BTW – As they both failed in the beginning (with GraalVM 1.0.0-rc14), I had a look on Java frameworks that offically support GraalVM native images. See my article: Short comparison: Building Graal Native Images with Quarkus, Micronaut and Helidon.

When I started, the latest version was GraalVM 1.0.0-rc14 (the 14th release candidate for version 19.0.0 🤔). For both projects I stumbled upon a number of confusing (to me) errors (like NullPointerExceptions) that all magically vanished once I updated to the “ready for production use” version GraalVM 19.0.0 (as soon as it was available).
Nice work by the GraalVM team 👍  (
if anyone is interested in the errors, I carved them into the Dockerfile or Git Commit messages, see fernflower and colander).

Here are my most interesting findings (I will elaborate on them bellow):

  • the fernflower CLI app works with a Docker image of only 5.3 MB in size! This is really revolutionary for a Java app 🎉
    Note that it only compiles statically (which is what I wanted anyway, to be able to use a scratch Docker image)
  • the colander native image also compiles (it’s also only 5 MB). However, the app does not work, because of it’s dependencies. These were the kind of real-world problem findinds I was looking for.

Findings in detail

Happy days

As said, the GraalVM native image build for fernflower works like charm. In fact, I was so charmed I created an automated build at DockerHub, that regularly builds Docker images for the latest version of fernflower, using different base images: schnatterer/fernflower-docker.

So if you’d ever need a Java decompiler just do a docker run --rm -v $(pwd):/src schnatterer/fernflower and 5MB later you’ll have you’re decompiler ready.
These images also allow for a nice comparison of image sizes for different Java base images.

Image Size(MB)
native/scratch image
adoptopenjdk/openjdk8:jre
adoptopenjdk/openjdk8:alpine-jre
gcr.io/distroless/java:8

5 vs 45MB comparing native image to regular JRE. Stunning, isn’t it?

BTW – this is the final Dockerfile to build the native image.

Facing real world challenges

Of course, I love it when a plan comes together. But on the other hand I suspected this wouldn’t always be the case for a such complex a thing as GraalVM native images. Here are the issues I encountered for the colander app (as said before, these issues are well documented known limitations of GraalVM):

  • When starting the native image, there’s not much output to the console.
    Reason: Not surprisingly, unlike the jar, the native image does not contain a logback.xml. In order to fix this, we would have to copy the logback.xml manually into the final docker image during the build.
  • The command line option --help does not show any options.
    Reason: These options are defined in annotations, and are read at runtime using reflection via the JCommander framework. How to fix?
    Configure the reflection for native image generation. Fortunately, some Java CLI frameworks like picocli support generating the config files out of the box. So migrating would also be an option.
  • Colander can’t show its own version name.
    Reason: The version name is read from the MANIFEST.mf file using the cloudogu/versionName library. The file is (again, not surprisingly) is not contained in the native image. This made me wonder if it wouldn’t be much simpler to read the version name from a Java constant. A hard-coded value just feels more read-only than a text file such as the MANIFEST.mf. I added this feature to the library using annotation processors.

Conclusion

The issues encountered definitely proof my hypothesis, that GraalVM native images, while a technological innovation that provides a lot of potential, causes extra efforts during development. So we have to decide if these extra efforts are justified by the advantages.

I presume that, if planned from the beginning, the efforts are a lot less, because we can take GraalVM support into account when making our technical decisions. There are frameworks like Quarkus, Micronaut and Helidon that have native support for GraalVM, minimizing the extra effort. For large existing apps, though this does not help. The colander example is really small (with only about 1k of net LOC) and already causes a lot of effort.

So would I use GraalVM native images in production?

I probably wouldn’t migrate existing applications, except they run at a massive scale and the potential savings (faster horizontal scaling, smaller memory footprint) justify the effort for building the native image in the first place. For new projects, I would assess building GraalVM native images from the start, if sticking with JDK8 is OK.  Native images only support JDK8, as of GraalVM 19.0.0.
In the long run most popular Java frameworks, not only Quarkus, Micronaut and Helidon are likely to support native image generation. For now, Spring is still a WIP, also mentioned in the GraalVM 19.0 release announcement. If I had a teams with profound Spring experience, I would only switch to another framework for a good reason.
On the other hand, if developing using the microservices architecture pattern, first experiences with GraalVM native images could be gained by implementing small services using GraalVM.

Short comparison: Building Graal Native Images with Quarkus, Micronaut and Helidon

The technological innovations of the last years such as the adoption of containers, cloud-native technologies, the microservice architectural style, the inception of GraalVM and the end of JavaEE (as we know it) has energized the Java framework market.

As of May 2019, there are at least three frameworks supporting GraalVM native images out of the box, targeting cloud-native microservices:

  • Quarkus,
  • Micronaut and
  • Helidon.

As building GraalVM native images is a bit challenging, I was curious to find out how these three frameworks keep up with their promises. I worked through the respective getting started guides and wrote down some similarities and differences resulting in this short (and surely incomplete) comparison of the three frameworks. See the following table for an overview.

General comparison

Quarkus Micronaut Helidon
Core Project Source quarkusio/quarkus micronaut-projects/micronaut-core oracle/helidon
Website quarkus.io micronaut.io helidon.io
Started/Backed By RedHat objectcomputing Oracle
First Commit 2018-06-22 2017-03-06 2018-08-28
GitHub Stars (05/2019) 1693 2283 1371
GitHub Contributers (05/2019) 88 120 26
# Commits (05/2019) 3970 5907 617
Supported languages Java, Kotlin Java, Groovy, Kotlin Java
Supported build tools mvn, Gradle mvn, Gradle mvn
Supported APIs for graal native Microprofile, vert.x,

 

Micronaut, ReactiveX/RxJava Helidon SE (Microprofile, without native image)
Programming paradigms for graal native Imperative, reactive Reactive, imperative? Reactive (imperative, without native image)
Code generation via mvn plugin CLI (mn) mvn archetype
Getting Started (Graal native) Guide Docs Blog
Resulting src of Getting Started schnatterer/quarkus-getting-started schnatterer/micronaut-getting-started schnatterer/helidon-getting-started
Size of getting started docker image
Getting started base docker image fedora-minimal alpine-glibc scratch
Getting started uses native image? N N Y
Size of getting started in scratch docker image

All three projects are rather young (grandpa Micronaut is about 2 years old as of 05/2019) but have what looks like extensive documentation at first glance. The only thing that made my stumble a bit was that Helidon’s docs don’t return a result for “graal”. I later found a brand new getting started with graal on oracle’s developers blog. Hopefully, this will be added to the docs soon.

There are a couple of notable differences between the three frameworks:

    • Programming style (reactive vs. imperative)
      • Quarkus explicitly supports both (reactive as an extension),
      • Helidon claims to support both, but only reactive in conjunction with native images right now
      • Micronaut is reactive only From the docs it seems that micronaut focuses on reactive, but blocking approaches are supported (see Graeme Rocher’s comment).
    • Language
      • Micronaut and Quarkus both support Java and Kotlin.
        Micronaut also supports Groovy 🎉 (having Graeme Rocher, the creator of Grails, on board it’s probably a must)
      • Helidon only supports Java
    • Build tool / code generation
      • Micronaut and Quarkus support Maven and Gradle.
        • Quarkus uses a Maven plugin for code generation (bad luck for Gradle users) whereas
        • Micronaut brings its own CLI tool that thankfully can easily be installed using sdkman.
      • Helidon supports only Maven and has only initial code generation support via a Maven archetype.
    • Kubernetes
    • Community
      Hard to tell. The amount of discussions on my tweet about Quarkus makes me think they’re the ones that are most interested in feedback and people getting involved.

GraalVM native Image / Docker Image

  • The Dockerfiles provided by the getting started of Quarkus and Micronaut each require an external Maven build.
    The images base on fedora-minimal (resulting in a 44MB compressed image) or alpine-glibc (resulting in a 32MB compressed image) respectively.
    A base image containing a libc is required because the native image is linked dynamically.
  • Helidon provides a proper self-contained Dockerfile that can be built by simply calling docker build, not requiring anything locally (except Docker, of course).
    Here, the native image is linked statically. Therefore the binary can run in an empty scratch image (resulting in an 8MB compressed image).

Bearing in mind that a Java 8 JRE Image requires about 100MB (debian) or 50 MB (alpine), 44MB or even 32MB for a small webapp is not so bad. OTOH the 8 MB for the statically linked image are a real revelation, leaving me stunned.

The fact that Helidon plays well with GraalVM shouldn’t be too surprising, as they both are official Oracle products.

Beyond getting started

As Quarkus was the first framework I tried, I wondered why they rely on fedora and not just compile a static binary (later, I learned about some of their reasons on twitter). So I tried a couple of other images, eventually setting the switch for creating a static binary and using a scratch image. Voilà: It results in a 7MB image, even a wee bit smaller than the Helidon one. See the table bellow for an overview of images and their features and sizes (taken from the README of my getting started repo).

Base Image Size Shell Package Manager libc Basic Linux Folders Static Binary Dockerfile
fedora 📄
debian 📄
alpine-glibc 📄
distroless-base 📄
busybox 📄
distroless-static 📄
scratch 📄

I applied more or less the same on Micronaut. Here, the scratch image is only 5 MB smaller than the alpine one – 27 MB. This is not too surprising, because the plain alpine-glibc image is only about 6MB. It also felt like the native image generation took longer and needed more memory (observed with docker stats).

As for Helidon’s self-contained, scratch image containing only a static binary, there was not much to be done. I only extend the Dockerfile by a maven cache stage for faster Docker builds.

There’s one last thing I changed in all Dockerfiles: Don’t run as root. I used the USER statement in the Dockerfile. docker run -u ... would also be fine. This way, it’s much more unlikely that possible vulnerabilities (such as CVE-2019-5736 in runc) are exploited.

So summing up: Quarkus and Helidon can be used to create really small docker images, Micronauts are “only” small 😉. It’s worth mentioning that I didn’t look what features are included in those images, so maybe it’s a bit naive to just compare the minimal sizes resulting from the individual getting started guides.

Going even further

If I were to continue my comparison at this point (which I won’t because it’s only a short comparison) I would look into the following features of each framework:

  • integration and unit testing,
  • extensions (e.g. Cloud Native features, Tracing, Monitoring, etc.)

Summary

So, for a new green field project, which one of e frameworks would I use?

As far as I can tell after completing the getting started, all three look promising. As for all architectural decisions, I’d definitely try to build a walking skeleton (technical roundtrip) before finally deciding, in order to gain more field experience and find out what’s beyond getting started.

I’d base this decision on the experience or preferences of the team

  • reactive vs. imperative
  • Maven vs. Gradle
  • Java vs. Kotlin (or even groovy)
  • APIs – Microprofile, vert.x, RxJava

Personally, I like the fact that Quarkus builds on existing APIs such as Microprofile, so existing experience can be reused for faster results. It also seems to me the most flexible of the three, supporting Java, Kotlin, Maven, Gradle, reactive and imperative.

As for native images, I’d definitely either try it from the beginning or stick to a regular JRE. I suppose switching from plain JRE-based to native could be complicated for an existing app, due to the native image limitations. If the app under development does not have the requirement to be scaled horizontally, this could be an argument for skipping the native image part. But this is beyond the scope of this article.

As for the docker image – it’s obviously not only the size that matters. An image without shell and package manager is always more secure but harder to debug.

 

Edits:

  • 2019/05/17: John Clingan pointed out that Quarkus supports Kubernetes resource generation an multiple reactive extensions
  • 2019/05/19: As commentef by Graeme Rocher’s, Micronaut also supports blocking workloads

Coding Continuous Delivery with Jenkins Pipelines

Starting in their 01/2018 issue, Java aktuell published my four-part articles series Coding Continuous Delivery in German. I’m happy to announce that all parts are now available in English, courtesy of Cloudogu.

The series takes you from zero to continuously delivering your software through a sophisticated Jenkins pipeline. It starts with the fundamentals, heading on to advanced topics such as nightly builds, parallel execution, docker, shared libraries, unit testing, static code analysis with SonarQube and deployment to Kubernetes. All of the topics are described hands-on with examples comparing the scripted with the declarative syntax provided by the Jenkins Pipeline Plugin.

  1. Jenkins pipeline plugin basics | 🖺 original article PDF (German)
  2. Performance optimization for the Jenkins Pipeline | 🖺 original article PDF (German)
  3. Helpful Tools for the Jenkins Pipeline | 🖺 original article PDF (German)
  4. Static Code Analysis with SonarQube and Deployment on Kubernetes et al. with the Jenkins Pipeline Plugin | 🖺 original article PDF (German)

The examples to all articles are contained in this GitHub repository: triologygmbh/jenkinsfile and the builds can be seen in action on this Jenkins server: opensource.triology.de.

My awesome colleagues at Cloudogu GmbH and Triology GmbH – thank you so much for your support. Especially my co-author from the first article, Daniel Behrwind, who got this whole thing started.

The pragmatic migration to JUnit 5

This article shows how get from JUnit 3.x / 4.x to JUnit 5.x as fast as possible.

Just a short clarification of the term “JUnit 5” (from the user guide) before we take off:

JUnit 5 = JUnit Platform + JUnit Jupiter + JUnit Vintage

where

  • Platform provides the Maven and Gradle Plugins and is the extension point for IDE integration,
  • Vintage contains legacy JUnit 4 API and engine,
  • Jupiter contains the new JUnit 5 API and engine.

Step 1 – Run existing tests with JUnit 5 vintage

The first thing we do is to replace the existing junit:junit depedency with the following

<dependency>
	<groupId>org.junit.vintage</groupId>
	<artifactId>junit-vintage-engine</artifactId>
	<version>5.1.0</version>
</dependency>

For Gradle see this article.

For a real world example see this commit.

Note: After upgrading fromjunit:junit:4.12 to org.junit.vintage:junit-vintage-engine:5.1.0 the execution order of @Rule seems to have changed: They seem to be now executed sequentially (from top to bottom, as defined in the test class).

Step 2 – Getting started with JUnit Jupiter and the Platform

Now lets go from vintage to the fancy new stuff. Just add the Jupiter dependency and empower surefire to use the JUnit platform:

<dependency>
	<groupId>org.junit.jupiter</groupId>
	<artifactId>junit-jupiter-engine</artifactId>
	<version>5.1.0</version>
</dependency>

<plugin>
	<groupId>org.apache.maven.plugins</groupId>
	<artifactId>maven-failsafe-plugin</artifactId>
	<!-- ... -->
	<dependencies>
		<dependency>
			<groupId>org.junit.platform</groupId>
			<artifactId>junit-platform-surefire-provider</artifactId>
			<version>1.1.0</version>
		</dependency>
	</dependencies>
</plugin>

Make sure to juse either surefire 2.19.1 or 2.21.0+, as there seem to be a bug in the versions in between.

As above, for Gradle see this article.

For a real world example see this commit.

As of now, we’re ready to write new tests with JUnit Jupiter.

Here’s a pragmatic aproach how to introduce JUnit 5 from here:

  • Use the new API and all the new features for new test classes.
  • Don’t try to migrate all existing tests. It causes a lot of effort with no direct business value.
  • Instead, apply the boyscout rule by gradually migrating existing tests before they need to be changed.

When getting started wiht JUnit Jupiter you will recognize that some familiar features of JUnit now have a new API or can be achieved using different concepts. After that, there are a some new features to explore.

One way to get accustomed to the new API and concrepts is to migrate some (not all) existing tests, preferably the most complex ones. This way, you will find out how to use the new concepts and which limitations there still might be about JUnit Jupiter (e.g JUnit 4 rules that have not been ported to extensions).

Step 3 – Get accustomted to the API changes in JUnit Jupiter

There are some simple API changes but also two major concept changes: Rules and Runners are gone.

Simple API changes

  • public modifier can be removed (class and methods)
  • org.junit.Test➡️ org.junit.jupiter.api.Test
  • org.junit.Assert.assertX ➡️ org.junit.jupiter.api.Assertions.assertX
    (except assertThat)
  • Order of parameters changed in assert methods. The message parameter is now after expected and actual parameters! This can be a pitfall when migrating, because the message (strings) might silently turn to expected, if you just change the import.
  • assertThat is no longer part of the JUnit API. Instead, just use your favorite assertion library as AssertJ, Google truth or even hamcrest.
  • @Before ➡️ @BeforeEach
  • @After ➡️ @AfterEach
  • @BeforeClass ➡️ @BeforeAll
  • @AfterClass ➡️ @AfterAll
  • @Ignore ➡️ @Disabled
  • @Category ➡️ @Tag

For a real world example see this commit.

Make sure to not mix the APIs, because the tests are either run by the Jupiter or the vintage Engine, which will ignore unknown annotations.

Note that IntellI has a quick fix for migrating JUnit 4 to JUnit Jupiter. However, as of version 2018.1 this seems to only affect @Test, no asserts, exceptions, rules or runners.

Advanced API changes

Basically, Runners and Rules are replaced by Extensions, where one test class can have more than one extension.
However, some Runners have not been ported to Extensions, yet. For those you can try to use@EnableRuleMigrationSupport (see Temporary Folders). If this does not work, you will have to stick with the JUnit 4 API and vintage Engine for now.

Exceptions & Timeouts

Exceptions no longer need a Rule or the expected param in @Test. Instead, the API provides an assert mechanism now.

ExpectedException and @Test(expected = Exception.class) ➡️ assertThrows(Exception.class,() -> method());

For a real world example see this commit.

The same applies to timeouts:

@Test(timeout = 1) ➡️ assertTimeout(Duration.ofMillis(1), () ->method());

Mockito

Instead of the mockito runner, we use the new extension, which comes in a separate module.

<dependency>
    <groupId>org.mockito</groupId>
    <artifactId>mockito-junit-jupiter</artifactId>
    <version>${mockito.version}</version>
</dependency>

@RunWith(MockitoJUnitRunner.class) ➡️ @ExtendWith(MockitoExtension.class)

For a real world example see this commit.

Temporary Folders

Until there is an Extension, we can use @EnableRuleMigrationSupport from this module:

<dependency>
	<groupId>org.junit.jupiter</groupId>
	<artifactId>junit-jupiter-migrationsupport</artifactId>
	<version>${junit5.version}</version>
</dependency>

With this we can use the new API (org.junit.jupiter.api.Test). Howerver, rules and classes must stay public. ClassRules seem not to work.

For a real world example see this commit.

Other Rules

Here are some more rules and their equivalent in JUnit Jupiter.

  • @RunWith(SpringJUnit4ClassRunner.class) ➡️ @ExtendWith(SpringExtension.class)
  • stefanbirkner/system-rules, such as ExpectedSystemExit
    Work in progress! That is, these tests will have remain on the JUnit 4 APIs for now.
  • TestLoggerFactoryResetRule from slf4j-test
    No progress to be seen.
    Could be replaced by logback-spike. For a real world example see this commit.
  • Of course this list is non-exhaustive, there are a lot more runners I have not stumbled upon, yet.

Step 4 – Make use of new features in JUnit Jupiter

Just using the same features with different API is boring, right?
JUnit Jupiter offers some long-awaited features that we should make use of!
Here are some examples:

Optional: Further Reading

More sutainable Android Software with Project Treble and 6-y LTS Kernels on Android O?

Recently, we could read and hear about Google putting some effort in facilitating easier updates for android devices:

So, even though three years are not the promised four years, we can still see a trend of increasing years of guaranteed software support on Android phones.

That is a positive trend! I hope that Google takes the gloves off soon and guarantees four upgrades to a new android version, forcing other vendors to at least start guaranteeing something for their phones. As I said before, this would really be a unique selling point. Also this would really mean a big leap forward regarding sustainability of Android phones! Let’s see if any of the Oreo Phones being release now will support Android S 😉