Hello async my old friend

Let me present vertx-jooq-async: the world’s first fully async and type-safe SQL code generation and execution tool for VertX™. Async? Wasn’t it asynchronous before? It was, but the code merely wrapped JDBC calls in Vertx’s executeBlocking-method, which just shifts the blocking code from the event loop to a worker thread.


With this release however, things have changed. jOOQ is still used for generating code and creating type-safe queries, but code is executed utilizing vertx-mysql-postgresql-client. That library in turn is based on a non-blocking driver for MySQL and Postgres databases, so this time it is really non-blocking.

Some things are still missing, but you should go and check out the github-page. Now.


vertx-jooq goes rx

After releasing a new version of your pet-project – what do you expect as a first reaction? Cheer? Love? Joy? No. People will ask for more. Welcome to the internet:

Clement Escoffier, vertx hero and avenger of unresolved github-issues, however, jumped in and added RX-Java support for vertx-jooq! Thanks to him, there is a new VertxDao that exposes various RX-like CRUD-methods:

This is cool! Also don’t forget to check the github-page for more details.

vertx-jooq 2.2.0 released

Today I’ve released version 2.2.0 of vertx-jooq with the following fixes/changes:

  • Fix: Codegeneration puts files in wrong folders #5
  • Fix: Move jooq-codegen dependency to vertx-jooq-generate #4
  • Enhancement: Create deleteAsync(Condition), fetchOneAsync(Condition) and fetchAsync(Condition)-methods #3

Especially #3 is a nice feature and increases the productivity when working with the auto-generated DAOs. Let’s have a look at the unit-test which tests the new methods:

You can add any org.jooq.Condition to fetch (row 6) or delete (row 8) records without having to write the boilerplate SQL on your own – just let vertx-jooq do the work for you.

Locking Gone Wrong

Sometimes you need to learn the hard way. Imagine the following method in your code that performs a synchronized operation like this:
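A minimal sketch of such a method, assuming plain threads and System.out in place of the logger. The class name, the two counters and the peakConcurrency helper are illustrative additions and not the original listing; the counters only record how many threads were inside the synchronized block at the same time.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LockTest {
    // Threads currently inside the synchronized block, and the highest count seen.
    static final AtomicInteger inside = new AtomicInteger();
    static final AtomicInteger peak = new AtomicInteger();

    // Locks on the boxed argument; the sleep stands in for the guarded calculation.
    public static void mayLock(Integer lock) throws InterruptedException {
        System.out.println("entering method " + lock);
        synchronized (lock) {
            System.out.println("entering lock " + Thread.currentThread().getName());
            peak.accumulateAndGet(inside.incrementAndGet(), Math::max);
            Thread.sleep(200);
            inside.decrementAndGet();
        }
    }

    // Calls mayLock from two threads and returns how many of them
    // were inside the synchronized block at the same time.
    public static int peakConcurrency(int value) {
        inside.set(0);
        peak.set(0);
        Runnable task = () -> {
            try {
                mayLock(value); // the int is autoboxed here, once per call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak for 127: " + peakConcurrency(127));
    }
}
```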

The intention of the mayLock-method is to do some calculations (represented by the Thread.sleep method) guarded by the input parameter, e.g. some UserID. When you execute the code on a multicore CPU, the output should be something like:

17:14:39.905 [ForkJoinPool.commonPool-worker-2] INFO LockTest – entering method 127
17:14:39.905 [ForkJoinPool.commonPool-worker-1] INFO LockTest – entering method 127
17:14:39.913 [ForkJoinPool.commonPool-worker-2] INFO LockTest – entering lock ForkJoinPool.commonPool-worker-2
17:14:41.915 [ForkJoinPool.commonPool-worker-1] INFO LockTest – entering lock ForkJoinPool.commonPool-worker-1

Based on the timestamps, you can see that the locking mechanism based on the argument works: about two seconds after the first thread, the second thread enters the synchronized section. Changing the input parameter to 128, however, results in the following output:

17:20:33.849 [ForkJoinPool.commonPool-worker-1] INFO LockTest – entering method 128
17:20:33.849 [ForkJoinPool.commonPool-worker-2] INFO LockTest – entering method 128
17:20:33.858 [ForkJoinPool.commonPool-worker-1] INFO LockTest – entering lock ForkJoinPool.commonPool-worker-1
17:20:33.858 [ForkJoinPool.commonPool-worker-2] INFO LockTest – entering lock ForkJoinPool.commonPool-worker-2

That’s right. The method is executed right away without waiting for the lock. The reason behind this is Java’s autoboxing and the IntegerCache. Autoboxing is a convenient way of converting between primitive ints and Integer-objects, so you can write code like this:

When a primitive int is autoboxed into an Integer-object, Integer.valueOf first checks whether the value is within the range of the IntegerCache, which by default is -128 to 127 (inclusive). Values in that range are served from the cache instead of being newly allocated. That means the following code is valid:

Because 127 is within the bounds of the cache, i refers to the same object as j. Let’s change to 128 now:
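Both cases can be checked with a few lines. This is a sketch that assumes a JVM started with the default cache size; the class and method names are made up for illustration.

```java
public class AutoboxIdentity {
    // Boxes the same int twice and reports whether both boxes are the same object.
    public static boolean sameObject(int value) {
        Integer i = value; // the compiler turns this into Integer.valueOf(value)
        Integer j = value;
        return i == j;     // reference comparison, not equals()
    }

    public static void main(String[] args) {
        System.out.println("127 boxed twice, same object: " + sameObject(127));
        System.out.println("128 boxed twice, same object: " + sameObject(128));
        // equals() compares the values and is unaffected by the cache.
        System.out.println("128 equals 128: " + Integer.valueOf(128).equals(128));
    }
}
```

Comparing with == is exactly what a synchronized block does implicitly: it cares about the object, not about the number stored in it.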

Because 128 is outside of the cache bounds, a new Integer object is created each time you do autoboxing (the same goes for all values greater than 127 and lower than -128), so such a value is no good candidate for acquiring locks. Although you can increase the cache size as the Javadoc states, this solution does not scale very well.

 * Cache to support the object identity semantics of autoboxing for values between
 * -128 and 127 (inclusive) as required by JLS.
 * The cache is initialized on first usage.  The size of the cache
 * may be controlled by the {@code -XX:AutoBoxCacheMax=} option.
 * During VM initialization, java.lang.Integer.IntegerCache.high property
 * may be set and saved in the private system properties in the
 * sun.misc.VM class.

In addition, you don’t know what parts in your code are locking on that Integer too, especially when you are working with third party libraries.

You might be tempted to switch to Strings, but something similar applies for Strings and the internal StringPool. Locking may work if the String is inside the StringPool but is not guaranteed to. So how to circumvent this?

Following the rule “first make it work – then make it fast“, we could synchronize the whole mayLock-method. Actually, this would work in many environments, but when you’re looking for performance, locking on class-level is not an option. So my suggestion here is to create a Map<Integer,Object> which stores an Object you can synchronize on, or a Map<Integer,java.util.concurrent.locks.Lock> if you need finer control over the locking process.

Since we’re using a Map with a hashing algorithm, Integer can safely be used as a key. The computeIfAbsent-method of the ConcurrentHashMap guarantees that it returns the same object even if multiple threads are accessing the method at the same time. When we run the test again, we see that the lock is correctly used.
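A minimal sketch of the Map-based variant, assuming plain Object monitors. The class name KeyedLocks and its two methods are illustrative; removing entries again is deliberately left out here.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class KeyedLocks {
    private final ConcurrentMap<Integer, Object> locks = new ConcurrentHashMap<>();

    // Returns the one monitor registered for this key, creating it on first use.
    // The lookup uses equals()/hashCode(), so the identity of the Integer is irrelevant.
    public Object lockFor(Integer key) {
        return locks.computeIfAbsent(key, k -> new Object());
    }

    // Runs the action while holding the monitor that belongs to the key.
    public void runExclusive(Integer key, Runnable action) {
        synchronized (lockFor(key)) {
            action.run();
        }
    }

    public static void main(String[] args) {
        KeyedLocks locks = new KeyedLocks();
        Integer a = Integer.valueOf(128);
        Integer b = Integer.valueOf(128);
        System.out.println("same monitor: " + (locks.lockFor(a) == locks.lockFor(b)));
        locks.runExclusive(128, () -> System.out.println("guarded work for 128"));
    }
}
```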

However, the Map needs to be maintained, e.g. we have to manually remove the key-value pairs once we’re done. Switching to a WeakHashMap, which holds its keys with weak references (so an entry is dropped once its key is no longer referenced elsewhere), is not an option for two reasons: a) this class is not thread-safe and b) Integers within the range of the IntegerCache will never be removed because they’re “hard” referenced.

As you can see, getting multithreading right is hard. Or, simply put:

99% of multithreaded Java code is wrong. The other 1% is written by Doug Lea and Brian Goetz.

Configuring A Staged Java Application On AWS ElasticBeanstalk – The Maven Way

In the previous post I described how to configure a maven project that takes environment parameters from ElasticBeanstalk to boot the application. However, there was one thing that bothered me: I used a shell-script to assemble everything into a deployable zip-file. In this post I show you how to do it with maven plugins only (heavily inspired by this stackoverflow answer).

Here is the new pom:

As you can see, this time I’ve used the maven shade plugin to create a runnable jar. The assembly plugin is used here to pack everything into a zip-file. The configuration happens through an external descriptor in src/assembly/assembly.xml which looks like this:

Procfile and run.sh are located in the project root. To sum it up: each time you invoke mvn clean install or mvn package, maven will create a runnable jar and then zip it together with the Procfile and run.sh. If you want to alter the name of the deployment artifact, just change the finalName tag of the assembly plugin’s configuration.

The project is uploaded to Github.

Happy coding!

Configuring A Staged Java Application On AWS ElasticBeanstalk


I’ve just blogged about a maven only solution.

In this blog post I’ll describe how to deploy and set up a maven driven Java application on AWS ElasticBeanstalk with different staging profiles. To understand what’s going on, you should have knowledge about AWS and how to set up an ElasticBeanstalk environment.

I have set up this github-repository which contains a very basic maven project. Let’s have a look at its pom:

No big surprises here: we set the compiler level to Java 8 and tell maven to build an executable fat-jar called app.jar during the package-phase. The main itself just prints the passed arguments to the console as shown below:
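A main of that kind could look roughly like the following sketch. The class name App and the output format are assumptions, not taken from the repository.

```java
public class App {
    // Joins the passed arguments into the single line that gets printed.
    public static String describe(String[] args) {
        return "arguments: " + String.join(", ", args);
    }

    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```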

Executing mvn clean install should generate the class files and a runnable jar file in the target-folder. One could take this file and upload it to ElasticBeanstalk. We’re done.

Just kidding.

Two problems may arise:

  1. ElasticBeanstalk will reject the file the next time you upload it, because it already has a file called app.jar.
  2. The application wouldn’t print any arguments, because you haven’t passed any to it.

Solving the first problem is quite easy: you can just append the current timestamp (or project version) to the jar. Either manually, or by changing the finalName-tag to something like




However this doesn’t solve the second problem. To tackle this, one must understand how the application is started after it was deployed. On the EC2-instance, next to the uploaded jar-file there is a file called Procfile, which is used to start the jar. By default, it consists of a one-liner:

web: java -jar app.jar

The only thing you need to know regarding the first word web is:

The command that runs the main JAR in your application must be called web, and it must be the first command listed in your Procfile.

The good news is that users can upload their own Procfile, which leads us to the question: where to put the arguments, especially if you have different arguments for each staging profile? At first sight, maven-profiles look like a good fit, but that solution has two major drawbacks. First, secret information like passwords should not be stored in your pom. Second, each time you change the arguments, you have to redeploy the whole application.

A much better approach is to store that kind of data on the instance itself using ElasticBeanstalk’s configuration. To do so, go to your environment, select the application you are using (or create a new one based on Java SE-Platform), click on Configuration, Software Configuration and create two new environment variables: JAVA_OPTS and JAVA_ARGS. Set the value of JAVA_OPTS to something like -Dfile.encoding=utf8 and JAVA_ARGS to my_staging_property. The values are now available on the instance – time to create our own Procfile. To do so, I’ve created a bash-script (Windows users look here) like this:

  • Lines 3 and 4: I decided to append the timestamp to the build artifact to fix the redeploy issue.
  • Line 6, 7 and 8: Our Procfile just executes another shell script, called run.sh. That script uses the environment variables defined above to start the application.
  • Line 9: Zip everything into one file.

So instead of executing maven directly, run ./build.sh from the terminal. This will create a file with a name like app-2017-05-03_14-50-07.zip in the target/ folder of your maven project, which can then be deployed to any staging environment. Configuration is done solely by ElasticBeanstalk. E.g. set JAVA_OPTS to -Xmx256M on your dev-environment and -Xmx16G on production. Changing some parameters (e.g. tuning some GC-settings during a load test) only requires updating your EB-configuration instead of redeploying everything.

The Future Is Here

Getting asynchronous operations right is hard. And when you have to pass the result of one function to the next, it can only get worse. Until today, vertx-jooq’s API only allowed asynchronous operations with a callback handler, which leads to code fragments like this:

What annoys me about this is that I have to check in each handler whether it succeeded and define what to do if an exception occurred on the database layer. Especially when you have to nest three or more operations, this looks ugly. Of course this problem is not new, and there is an alternative approach: using and composing Java 8’s java.util.concurrent.CompletableFuture. By doing so, the same code becomes simpler and more readable:
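The idea can be sketched with plain JDK classes. The two lookup methods below are hypothetical stand-ins for DAO calls and complete immediately; they are not part of the vertx-jooq API.

```java
import java.util.concurrent.CompletableFuture;

public class FutureChain {
    // Hypothetical first lookup: fails for an empty name, otherwise yields an id.
    public static CompletableFuture<Integer> findUserId(String name) {
        if (name.isEmpty()) {
            CompletableFuture<Integer> failed = new CompletableFuture<>();
            failed.completeExceptionally(new IllegalArgumentException("no such user"));
            return failed;
        }
        return CompletableFuture.completedFuture(name.length());
    }

    // Hypothetical second lookup that depends on the result of the first one.
    public static CompletableFuture<String> findEmail(int userId) {
        return CompletableFuture.completedFuture("user" + userId + "@example.org");
    }

    // Chains both calls; a failure in either step ends up in one single handler.
    public static CompletableFuture<String> emailOf(String name) {
        return findUserId(name)
                .thenCompose(FutureChain::findEmail)
                .exceptionally(error -> "unknown");
    }

    public static void main(String[] args) {
        System.out.println(emailOf("alice").join());
        System.out.println(emailOf("").join());
    }
}
```

The success path reads top to bottom, and the error handling is written once at the end of the chain instead of once per callback.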

But using CompletableFuture within the Vertx world leads to a problem: Vertx has its own threading model, which is what makes its performance possible. On the other hand, some methods of CompletableFuture, e.g. CompletableFuture.supplyAsync(Supplier), run tasks on the common ForkJoinPool, which would break the Vertx contract. Open-source software to the rescue, there is a solution to this problem: VertxCompletableFuture. This special implementation guarantees that async operations run on a Vertx context unless you explicitly specify an Executor in one of the overloaded xyzAsync-methods*.
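How the Executor overloads decide where a task runs can be seen with the plain JDK class. This sketch does not use VertxCompletableFuture; the class name, the helper method and the thread name are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExecutorChoice {
    // Runs a supplier on an explicitly passed single-thread executor
    // and returns the name of the thread that executed it.
    public static String threadUsedWithExecutor(String threadName) {
        ExecutorService executor =
                Executors.newSingleThreadExecutor(r -> new Thread(r, threadName));
        try {
            return CompletableFuture
                    .supplyAsync(() -> Thread.currentThread().getName(), executor)
                    .join();
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        // Without an Executor the task goes to the JDK's default async pool,
        // usually ForkJoinPool.commonPool().
        System.out.println(
                CompletableFuture.supplyAsync(() -> Thread.currentThread().getName()).join());
        // With an Executor the task runs wherever the caller says.
        System.out.println(threadUsedWithExecutor("my-worker"));
    }
}
```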

And here comes even better news: starting from version 2, vertx-jooq also supports this way of dealing with asynchronous database operations by utilizing VertxCompletableFuture. Check out the new vertx-jooq-future module and the corresponding code generator to create your CompletableFuture-based DAOs.

* There have been discussions in the Vertx developer group about a CompletableFuture based API, e.g. here and especially here. The current status is that they do not provide such an API officially, mostly because VertxCompletableFuture breaks the contract of the supplyAsync-methods, since it runs within the Vertx context and not the ForkJoinPool. Also, when you pass this CompletableFuture subclass to code that expects a regular CompletableFuture, it breaks the Liskov substitution principle and OOP (thanks for pointing that out in the comments, Julien). My opinion is that if you are using Vertx, you are aware of the special threading model and can tolerate that behavior. But, of course, it’s up to you.