Matthieu Vergne's Homepage

Last update: 19/11/2017 11:16:50

When to Instantiate/Throw an Exception in Java?

An exception in Java is, as defined by Oracle, an event, which occurs during the execution of a program, that disrupts the normal flow of the program's instructions. As such, exceptions are used to provide relevant information to the developer and to recover or crash nicely after an unwanted state has been reached. Practically, exceptions are objects which are instantiated, like new NullPointerException(), and thrown when the unexpected event occurs, after which the following instructions are skipped and the exception-related information is propagated, usually leading to a rich display in the console for debugging. In this post, we will see that throwing an exception involves two fundamental actions, instantiation and throwing, which relate to different requirements and, consequently, should not systematically be done together.

Illustrative Example: Supplier-based Framework

In this post, we will work on an example inspired by the jMetal framework, which provides various algorithms to solve various kinds of problems. In this framework, built initially in a research environment, one may want to experiment with several algorithms on several problems, to compare the algorithms and see which is more efficient in which context. Following this idea, programming an experiment may go as follows:

  1. Provide the algorithms and problems to combine
  2. Run each algorithm on each problem to produce data
  3. Format the data into exploitable results

In this context, a standardised procedure can be implemented to help the experimenter focus on the relevant parts only. In particular, such a framework can provide an experiment builder which takes from the experimenter \(N\) algorithms and \(M\) problems, and combines them automatically into \(N \times M\) pairs to run. However, we cannot naively reuse algorithm and problem instances in each pair, like running an algorithm on one problem and then on another, until we have run it on each problem. Such a constraint can occur because algorithms should be runnable in parallel, or because algorithms may not be required to be reusable. This means that we need to create several instances of each, like 3 instances of the same algorithm to solve 3 different problems. Such an experiment builder might look like this:


// Create the experiment builder
ExperimentBuilder builder = new ExperimentBuilder();

// Tell which problem to solve
builder.addProblem(() -> new Problem1());
builder.addProblem(() -> new Problem2());
builder.addProblem(() -> new Problem3());

// Tell which algorithm to use for that
builder.addAlgorithm(() -> new Algorithm1());
builder.addAlgorithm(() -> new Algorithm2());
builder.addAlgorithm(() -> new Algorithm3());

// Configure various actions to do with the data
...

// Instantiate the experiment and run it
Experiment experiment = builder.build();
experiment.run();

This code is significantly simplified for the sake of this post, but the important point here is that, in order to let the builder manage the instantiation of the (algorithm, problem) pairs, we do not add algorithm and problem instances directly but suppliers (through lambda expressions), which are pieces of code providing instances on demand. Since Java 8, this can be implemented through the functional interface java.util.function.Supplier, although it could already be done before through custom interfaces, minus the convenient format of lambda expressions. This Supplier interface looks like this:


public interface Supplier<T> {
    T get();
}

As such, it is rather simple: it gives access to a resource of a given type. In our code, each supplier generates a new instance on every call, so we provide one supplier for each algorithm and problem (\(N + M\)) and let the builder produce all the needed pairs (\(N \times M\)) without reusing the same instances. Depending on the implementation of the builder, these instantiations could be done when calling builder.build() or experiment.run(), but for the sake of simplicity, we will say that it is done in the former.
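To make the "new instance on demand" behaviour concrete, here is a minimal sketch. The SupplierDemo class, its nested Algorithm type and the suppliesFreshInstances() helper are hypothetical names introduced only for this illustration; they are not part of jMetal or of the builder.

```java
import java.util.function.Supplier;

class SupplierDemo {
    // Hypothetical minimal algorithm type, for illustration only.
    static class Algorithm {
    }

    // Calls the supplier twice and tells whether two distinct instances came out.
    static boolean suppliesFreshInstances(Supplier<Algorithm> supplier) {
        Algorithm first = supplier.get();
        Algorithm second = supplier.get();
        return first != second;
    }

    public static void main(String[] args) {
        // Instantiates on demand: each call creates a new object.
        Supplier<Algorithm> fresh = () -> new Algorithm();

        // Always hands out the same shared object.
        Algorithm shared = new Algorithm();
        Supplier<Algorithm> reused = () -> shared;

        System.out.println(suppliesFreshInstances(fresh));  // prints "true"
        System.out.println(suppliesFreshInstances(reused)); // prints "false"
    }
}
```

Only the first kind of supplier gives the builder what it needs to create independent pairs; the second one compiles just as well, which is why the builder cannot assume much about what it receives.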

Then, we take the perspective of the framework, which implements the builder. This builder receives suppliers from the experimenter and should use them to generate the (algorithm, problem) pairs, which can be done with the following code:


Collection<Supplier<Problem>> problemSuppliers = new LinkedList<>();
Collection<Supplier<Algorithm>> algorithmSuppliers = new LinkedList<>();

public void addProblem(Supplier<Problem> supplier) {
	problemSuppliers.add(supplier);
}

public void addAlgorithm(Supplier<Algorithm> supplier) {
	algorithmSuppliers.add(supplier);
}

public Experiment build() {
	// Create independent (algorithm, problem) pairs
	Collection<Pair<Algorithm, Problem>> pairs = new LinkedList<>();
	for(Supplier<Problem> supP : problemSuppliers) {
		Problem problem = supP.get();
		for(Supplier<Algorithm> supA : algorithmSuppliers) {
			Algorithm algorithm = supA.get();
			pairs.add(new Pair<>(algorithm, problem));
		}
	}
	
	// Create the experiment based on them
	Experiment experiment = ...
	
	return experiment;
}

The code above is minimal, and the goal now is to identify unwanted cases to manage through exceptions. We will look at them separately in the next sections.

Exception 1: A Supplier Should Not Be Null

One potential issue is receiving null suppliers for algorithms or problems. This is a usual case, which can be solved by throwing a NullPointerException in the corresponding method. In the case of the algorithms, we would replace this:


public void addAlgorithm(Supplier<Algorithm> supplier) {
	algorithmSuppliers.add(supplier);
}

by this:


public void addAlgorithm(Supplier<Algorithm> supplier) {
	if (supplier == null) {
		throw new NullPointerException("No supplier provided");
	} else {
		algorithmSuppliers.add(supplier);
	}
}

Since Java 7, we can also use some helpers to do the same in a shorter way:


public void addAlgorithm(Supplier<Algorithm> supplier) {
	Objects.requireNonNull(supplier, "No supplier provided");
	algorithmSuppliers.add(supplier);
}

In the following, we won't use this facility, in order to keep the exception explicit. The important point here, however, is to notice that we instantiate our exception on the fly, and only if it is required. We will see later that it is not the only relevant use, though.

Exception 2: A Supplier Should Not Return Null

A second unwanted case is when the supplier itself returns null. We need a supplier, but when we call it we also need to obtain a proper problem or algorithm instance. In the case of a problem, we may argue that such a guarantee is not required in some cases, so we will ignore it and focus on the algorithms. Indeed, an algorithm must be a runnable thing, and thus cannot be null by definition. To check this, we can proceed in several ways.

The first solution is, like the first exception, to check it immediately, like this:


public void addAlgorithm(Supplier<Algorithm> supplier) {
	if (supplier.get() == null) {
		throw new NullPointerException("The supplier cannot return null");
	} else {
		algorithmSuppliers.add(supplier);
	}
}

The problem with this solution is that it is over-constrained. Indeed, nothing forbids the experimenter from providing a supplier which takes its instances from another source, not yet initialised at the time we use the builder. In this case, the supplier won't be able to provide an instance yet, and will thus throw an exception or return null. The point is that we force the experimenter to guarantee that the suppliers can be used immediately, although we only need them to be usable when instantiating the pairs, that is when we build or run the experiment. So we need to find a better choice.
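The following sketch illustrates such a late-initialised source. Everything in it is an assumption made for the illustration: the LateSourceDemo class, its nested Algorithm type, the passesEagerCheck() helper (which isolates the immediate check discussed above) and the use of an AtomicReference as the "other source".

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

class LateSourceDemo {
    // Hypothetical minimal algorithm type, for illustration only.
    static class Algorithm {
    }

    // The immediate check, isolated: true if the supplier can deliver right now.
    static boolean passesEagerCheck(Supplier<Algorithm> supplier) {
        return supplier.get() != null;
    }

    public static void main(String[] args) {
        // Hypothetical source, filled in only after the builder is configured.
        AtomicReference<Algorithm> source = new AtomicReference<>();
        Supplier<Algorithm> supplier = () -> source.get();

        // When the supplier is added, the source is still empty.
        System.out.println(passesEagerCheck(supplier)); // prints "false"

        // Later, before building the experiment, the source gets initialised.
        source.set(new Algorithm());
        System.out.println(passesEagerCheck(supplier)); // prints "true"
    }
}
```

The supplier is perfectly usable at the time the pairs are instantiated, yet an immediate check would have rejected it.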

A second choice is, instead of checking the source, checking the usage, which moves us to the build() method. We may replace this:


public Experiment build() {
	// Create independent (algorithm, problem) pairs
	Collection<Pair<Algorithm, Problem>> pairs = new LinkedList<>();
	for(Supplier<Problem> supP : problemSuppliers) {
		Problem problem = supP.get();
		for(Supplier<Algorithm> supA : algorithmSuppliers) {
			Algorithm algorithm = supA.get();
			pairs.add(new Pair<>(algorithm, problem));
		}
	}
	
	// Create the experiment based on them
	Experiment experiment = ...
	
	return experiment;
}

by this:


public Experiment build() {
	// Create independent (algorithm, problem) pairs
	Collection<Pair<Algorithm, Problem>> pairs = new LinkedList<>();
	for(Supplier<Problem> supP : problemSuppliers) {
		Problem problem = supP.get();
		for(Supplier<Algorithm> supA : algorithmSuppliers) {
			Algorithm algorithm = supA.get();
			if (algorithm == null) {
				throw new NullPointerException("The supplier cannot return null");
			} else {
				pairs.add(new Pair<>(algorithm, problem));
			}
		}
	}
	
	// Create the experiment based on them
	Experiment experiment = ...
	
	return experiment;
}

Now, we really check it at the right time, but several criticisms can be made. We may criticise the elegance of this code, which becomes heavier, especially if we have to do the same for problems, but this check has to be done anyway, so this criticism is not relevant. A more practical criticism is a matter of responsibility: it is normally addAlgorithm() which should ensure that we obtain a valid supplier, but here we delegate this responsibility to build(), which is not recommended. The difference might seem negligible here, because we only move a piece of code, but this choice means that, if we need to use these suppliers in other places of the builder, we need to repeat the same check there, and thus write redundant code. It would be preferable to have it in one place, and the most relevant place to factor it is where we receive the supplier. A second criticism is that the help provided by the exception is diminished, but to understand that we need to see what the consequences of such an implementation would be.

Exception = Message + Stack Trace

An exception is not just a way to interrupt the program cleanly, it is also an important source of information for debugging. This information is composed of 2 fundamental pieces. The first one is the message of the exception, which tells what went wrong, like "No supplier provided" or "The supplier cannot return null" in the code we wrote previously. If we know that something wrong occurs in addAlgorithm(), descriptions like these allow us to understand clearly what we did wrong with it, and to fix it properly. But again, only if we know that it went wrong in this method, which is what the second piece of information provides. This one is the stack trace, which tells which method has generated the exception, from which method it was called, which in turn was called by which method, and so on down to the root main() method (or thread).

While the message can be easily adapted in the code, and thus is mainly a matter of linguistic skill, the stack trace depends on where the exception has been instantiated. In particular, instantiating the exception in addAlgorithm() would have led to this kind of stack trace:


java.lang.NullPointerException: The supplier cannot return null
	at ExperimentBuilder.addAlgorithm(ExperimentBuilder.java:157)
	at ExperimentBuilder.main(ExperimentBuilder.java:322)

while moving it to build() leads to something more like this:


java.lang.NullPointerException: The supplier cannot return null
	at ExperimentBuilder.build(ExperimentBuilder.java:183)
	at ExperimentBuilder.main(ExperimentBuilder.java:322)

Knowing that a supplier returns null in build() is, however, not as helpful: the issue could come from a faulty supplier given to addAlgorithm(), or from something we did wrong with other methods used before build(). The faulty one is actually the call to addAlgorithm(), which received an invalid supplier, and thus the best stack trace is the one leading to this method.
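The fact that the stack trace depends on where the exception is instantiated, not on where it is thrown, can be checked with a small sketch. The StackTraceDemo class and its create(), throwLater() and topFrameMethod() methods are hypothetical names chosen for this illustration; the sketch also assumes a JVM with stack traces enabled, which is the default.

```java
class StackTraceDemo {
    // The exception is instantiated here...
    static RuntimeException create() {
        return new RuntimeException("created in create()");
    }

    // ...but thrown from here.
    static void throwLater(RuntimeException exception) {
        throw exception;
    }

    // Returns the name of the method shown on the first line of the stack trace.
    static String topFrameMethod() {
        RuntimeException exception = create();
        try {
            throwLater(exception);
        } catch (RuntimeException caught) {
            return caught.getStackTrace()[0].getMethodName();
        }
        throw new IllegalStateException("throwLater() should have thrown");
    }

    public static void main(String[] args) {
        System.out.println(topFrameMethod()); // prints "create"
    }
}
```

The trace is filled in by the constructor of the exception, so it names create() even though the throw statement sits in throwLater().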

Solution: Split Instantiation and Throwing

As opposed to the usual case (if condition, throw new exception), this check on the value returned by the supplier requires us to properly decompose our exception management. Indeed, if we keep the current structure, either we do it in the right method (addAlgorithm()) but with an over-constrained parameter, or we move it elsewhere but with a stack trace which does not help to spot the actual source of the mistake and a potential need for redundant code. In order to have both advantages, we can adopt a hybrid strategy, where we choose one of the two to get its advantage, and enforce the other through some tricks.

If we choose to put it in build(), we need to force the stack trace to lead to addAlgorithm() in its first line, but we also need to adapt the rest of the stack trace to spot the right call of the method. Indeed, an experimenter usually adds several algorithms, and thus calls the method several times, so we need to find which call is faulty in order to best help the debugging, which seems clearly cumbersome, with the need for manual traces and so on. Adding the potentially redundant code, which we should try to factor into a dedicated reusable method, such a solution seems to be a no go for managing a single kind of exception.

Rather, we may choose to instantiate the exception in addAlgorithm() to have the right stack trace from the start, the challenge being to check the returned value and throw the exception in build(). Instantiation is straightforward: we just need to have this line somewhere in addAlgorithm():


NullPointerException exception = new NullPointerException("The supplier cannot return null");

In order to check and throw in build(), we need to store this exception in a field of the class and use it in build() at check time. However, as we said earlier, addAlgorithm() needs to be called several times, and each call needs its own exception to have the right stack trace, leading precisely to this call. Instead of a single field, we may use a Map<Supplier<Algorithm>, NullPointerException> to assign to each supplier (and thus each call) its own exception. Such a solution works, but an issue remains: the check is still done in build(), so the potential need to reproduce the check elsewhere through redundant code occurs again.
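A minimal sketch of this map-based variant could look as follows. It is only an illustration under assumptions: the MapBasedBuilder class and its nested Algorithm type are hypothetical, and build() returns the list of instances instead of an Experiment to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

class MapBasedBuilder {
    // Hypothetical minimal algorithm type, for illustration only.
    static class Algorithm {
    }

    // One pre-instantiated exception per supplier, kept in insertion order.
    private final Map<Supplier<Algorithm>, NullPointerException> exceptions = new LinkedHashMap<>();

    public void addAlgorithm(Supplier<Algorithm> supplier) {
        // Instantiated here, so the stack trace leads to this very call.
        exceptions.put(supplier, new NullPointerException("The supplier cannot return null"));
    }

    // Stand-in for build(): returns the instances instead of an Experiment.
    public List<Algorithm> build() {
        List<Algorithm> algorithms = new ArrayList<>();
        for (Map.Entry<Supplier<Algorithm>, NullPointerException> entry : exceptions.entrySet()) {
            Algorithm algorithm = entry.getKey().get();
            if (algorithm == null) {
                // Thrown here, but created in addAlgorithm().
                throw entry.getValue();
            }
            algorithms.add(algorithm);
        }
        return algorithms;
    }
}
```

Note that, in this sketch, adding the very same supplier object twice would keep only the exception of the last call, since the supplier is used as the key of the map.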

One way to factor it is to create a dedicated method to reuse, which is only partially satisfying because we still need to remember to call this method whenever we add a piece of code needing it. As we said earlier, it is addAlgorithm() which is responsible for obtaining a valid supplier, and as such the best solution is to implement the check in this method, in such a way that the check is actually done wherever it might be useful, automatically. Is it even possible? Yes, it is, and here we do it by using a decorator, which consists in wrapping the supplier we get into another supplier which does something more than just returning the instance. Here is the code:


public void addAlgorithm(Supplier<Algorithm> supplier) {
	NullPointerException exception = new NullPointerException("The supplier cannot return null");
	Supplier<Algorithm> decoratedSupplier = new Supplier<Algorithm>() {
		@Override
		public Algorithm get() {
			Algorithm algorithm = supplier.get();
			if (algorithm == null) {
				throw exception;
			} else {
				return algorithm;
			}
		}
	};
	algorithmSuppliers.add(decoratedSupplier);
}

Here, we simply create an auto-checking supplier which builds on the supplier provided as argument. In other words, we create a supplier which calls the one provided, checks whether or not the value is valid, and returns it or throws an exception depending on the answer. This way, every time it is called, it checks its own value by itself and throws our exception if required, or returns the instance as usual if all is fine. Because we replace the provided supplier by the decorated one, any other piece of code using the supplier will necessarily pass through the auto-check, without even knowing about it. An important point, however, is to keep the exception instantiation directly in addAlgorithm(), outside of the decorator definition, otherwise it would capture the stack trace at the time the decorator is called, which means the stack trace of the user, like build(), which is not what we want. In other words, even here the instantiation of the exception is separated from throwing it.
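To see the effect end to end, here is a self-contained sketch which reuses the decorator above, written as a lambda for brevity. The DecoratedBuilder class and its nested Algorithm type are hypothetical, and build() is reduced to a stand-in returning the instances rather than an Experiment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class DecoratedBuilder {
    // Hypothetical minimal algorithm type, for illustration only.
    static class Algorithm {
    }

    private final List<Supplier<Algorithm>> algorithmSuppliers = new ArrayList<>();

    public void addAlgorithm(Supplier<Algorithm> supplier) {
        // Instantiated eagerly, so that the stack trace leads to this call.
        NullPointerException exception = new NullPointerException("The supplier cannot return null");
        // Decorated supplier: checks the value each time it is requested.
        algorithmSuppliers.add(() -> {
            Algorithm algorithm = supplier.get();
            if (algorithm == null) {
                throw exception;
            }
            return algorithm;
        });
    }

    // Stand-in for build(): no check needed here, the suppliers check themselves.
    public List<Algorithm> build() {
        List<Algorithm> algorithms = new ArrayList<>();
        for (Supplier<Algorithm> supplier : algorithmSuppliers) {
            algorithms.add(supplier.get());
        }
        return algorithms;
    }

    public static void main(String[] args) {
        DecoratedBuilder builder = new DecoratedBuilder();
        builder.addAlgorithm(() -> new Algorithm());
        builder.addAlgorithm(() -> null); // the faulty call
        try {
            builder.build();
        } catch (NullPointerException e) {
            // First line of the trace: addAlgorithm(); second line: the faulty call in main().
            e.printStackTrace();
        }
    }
}
```

The exception surfaces while build() is running, but the printed trace starts at addAlgorithm() and then points to the line of main() holding the faulty call, which is the information the experimenter needs.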

With this solution, we keep all the code in addAlgorithm(), so the responsibility is fully managed by this method, and we satisfy both our requirements of having a relevant stack trace and a just-in-time check. However, this solution comes with a cost too: instead of instantiating the exception only when it is required, we need to do it in advance. Here, the exception is created anyway, as soon as we call addAlgorithm(), and it is done every time we call it, even if the supplier is valid. This means that we take additional space in memory to improve debugging, making this solution worthwhile for a few instances, but costly with a growing number of calls to manage.

Conclusion: The Checker Instantiates, The Receiver Throws

We can conclude from this example that instantiating the exception and throwing it are 2 different responsibilities. The instantiation should be done by the method which is in charge of validating the data, independently of when this data will be checked. Here, the data we provide is a supplier, and the method which should ensure that it is valid is addAlgorithm(), so we instantiate the exception there. This method should then validate the supplier in 2 ways: ensuring that it is not null, and ensuring that the value it returns is not null. Whether we use a single exception or two different exceptions is a matter of design choice, but in any case they should all be instantiated by addAlgorithm() to drive the programmer to the actual faulty method.

Throwing the exception, however, should be done as soon as the data to check is available, and thus by the one receiving it. The one receiving the supplier is addAlgorithm(), thus it should be the one throwing an exception if the supplier is null. The one receiving the value of the supplier, however, is build(), which requests the instances to create the pairs. This is the one which should throw the exception if the supplier returns a null value. Here, we were able to factor this code in addAlgorithm(), but whether or not such factoring is always possible is a different topic.