Tuesday, February 27, 2018

Testing your code with Spock

Spock is a testing and specification framework for Java and Groovy applications.  Spock is:
  • extremely expressive 
  • facilitates the Given / When / Then syntax for your tests 
  • compatible with most IDEs and CI Servers.
Sounds interesting? Well you can start playing with Spock very quickly by paying a quick visit to the Spock web console.  When you have a little test you like, you can publish it like I did for this little Hello World test.

This Hello World test serves as a gentle introduction to some of the features of Spock.

Firstly, Spock tests are written in Groovy.  That means, some boiler plate code that you have with Java goes away.  There is
  • No need to indicate the class is Public as it is by default.
  • No need to declare firstWord and lastWord as Strings 
  • No need to hurt your little finger with a ; at the end every line
  • No need to explicitly invoke assert, as every line of code in the expect block gets that automatically.  Just make sure the lines in the then: block evaluate to a boolean expression.  If it is true the test passes otherwise it fails.  So in this case, it is just an equality expression which will either be true or false. You can have as many expressions as you want.
So less boiler plate code what next?  Well you know those really long test names you get with JUnit tests, well instead of having to call this test, helloWorldIntroductionToSpockTest() which is difficult to read, you can just use a String with spaces to name the test: Hello World introduction to Spock test. This makes things much more readable.

Thirdly, the Given: When: Then: syntax,  enforces test structure.  No random asserts all the test.  They are in a designated place.   More  complex tests, can use this structure to achieve BDD and ATDD.

Fourthly, if I were to make a small change to the test and change the assertion to also include Tony,  the test will of course fail. But when I get a failure in Spock, I get the full context of the expression that is tested.  I see the value of everything in the expression.  This makes it much quicker to diagnose problems when tests fail.

Not bad for an introduction.  Let's now have a look at more features. 

Mocking and Stubbing

Mocking and Stubbing are much more powerful than what is possible with JUnit (and various add on's). But, it is not only super powerful in Spock, it is also very terse, keeping your test code very neat and easy to read.

Suppose we want to Stub a class called PaymentCalculator in our test, more specifically one of its method, calculate(Product product, Integer count).   In the stubbed version we want to return the count multiplied by 10 irrespective of the value of product.   In Spock we achieve this by:
PaymentCalculator paymentCalculator = Stub(PaymentCalculator)
paymentCalculator.calculate(_, _) >> {p, c -> c * 10}
If you haven't realised how short and neat this is, well then get yourself a coffee.  If you have realised well you can still have a coffer but consider these points:
  1. The underscores in the calculate mean for all values 
  2. On the right hand side, of the second line, we see a Groovy Closure. For now, think of this as an anonymous method with two inputs. p for the product, c for count. We don't have to type them. That's just more boiler plate code gone. 
  3. The closure will always return the count time 10.  We don't need a return statement.  The value of the last expression is always returned. Again, this means less boiler plate code.  When stubbing becomes this easy and neat, it means you can really focus on the test - cool. 

Parameterised Tests

The best way to explain this is by example.
def "Check that the rugby player #player who has Irish status #isIrish plays for Ireland"(String player, Boolean isIrish) {
    given:"An instance of Rugby player validator"
    RugbyPlayerValidator rugbyPlayerValidator = new RugbyPlayerValidator()

    rugbyPlayerValidator.isIrish(player)  == isIrish

    player               ||  isIrish
    "Johny Sexton"       ||  true
    "Stuart Hogg"        ||  false
    "Conor Murray"       ||  true
    "George North"       ||  false
    "Jack Nowell"        ||  true

In this parameterised test we see the following:
  1. The test is parameterised. The test signature having parameters tells use this, as do the where block.  
  2. There is one input parameter player and one output parameter - which corresponds to an expected value. 
  3. The test is parameterised five times.  The input parameters are on the left, output on the right. It is, of course, possible to have more of either, in this test we just have one of each. 
  4. The @Unroll annotation will mean that if the test fails, the values of all parameters will be outputted. The message will substitute the details of player into #player and the details of the Irish status substituted into #isIrish. So for example, "Checks that the rugby player Jack Nowell who has Irish status true plays for Ireland"
Again, this makes it much quicker to narrow in on bugs. Is the test wrong or is the code wrong? That becomes a question that can be answered faster.  In this case, of course it is the test that is wrong.

All the benefits of Groovy

What else? Well another major benefit is all the benefits of Groovy.  For example, if you are testing an API that returns JSON or XML, Groovy is brilliant for parsing XML and JSON. Suppose we have an API that returns information about sports players in XML format. The format varies, but only slightly, depending on the sport they play:
Joey Carberry Teddy Thomas
Lionel Messi Cristiano Ronaldo
We want to just invoke this API and then parse out the players irrespective of the sport. We can parse this polymorphically very simply in Groovy.
def rootNode = new XmlSlurper().parseText(xml)
def players = rootNode.'*'.Players.Player*.text()

Some key points:
  1. The power of dynamic typing is immediate. The expression can be dynamically invoked on the rootNode. No verbose, complex XPath expression needed.
  2. The '*', is like a wildcard. That will cover both RugbySummaryCategory and FootballSummaryCategory.
  3. The Player*, means for all Player elements. So no silly verbose for loop needed here 
  4. The text() expression just pulls out the values of the text between the respective Player elements. So why now have a list all players and can simple do:players.size() == 4. Remember, there is no need for the assert. 
Suppose we want to check the players names. Well in this case we don't care about order, so make more sense to convert the list to a Set and then check. Simple.
players as Set == ["Joey Carberry", "Teddy Thomas", "Lionel Messi", Cristiano Ranaldo"] as Set

This will convert both list to a Set which means then order checking is gone and it is just a Set comparison. There's a tonne more Groovy features we can take advantage of. But the beauty is, we don't actually have to. All Java code is also valid in a Groovy class. The same hold trues for Spock. This means there is no steep learner curve for anyone from a Java background. They can code pure Java and then get some Groovy tips from code reviews etc.

Powerful annotations

Spock also has a range of powerful annotations for your tests. Again, we see the power of Groovy here as we can pass a closure to these annotations. For example:
def "I'll run anywhere except windows"() {...}
Or just make your test fail if they take too long to execute
@Timeout(value = 100, unit=TimeUnit.MILLISECONDS)
def "I better be quick"() {...}
So in summary Spock versus vanilla JUnit has the following advantages:
  1. Test Structure enforced. No more random asserts. Assertions can only be in designated parts of the code. 
  2. Test code is much more readable. 
  3. Much more information on the context of the failed test
  4. Can mock and stub with much less code
  5. Can leverage a pile of Groovy features to make code much less verbose
  6. Very powerful test parameterisation which can be done very neatly
  7. A range of powerful annotations. 
And one of the often forgotten points is that your project doesn't have to be written in Groovy. You can keep it all in Java and leverage the static typing of Java for your production code and use the power and speed of Groovy for your test code.

Until the next time take care of yourselves.

Friday, January 12, 2018

When a REST Resource should get its own Address?


Author's note

In a purist REST approach, all endpoints (except the starting endpoint) are opaque and their various details shouldn't need to be published.  Even if this approach is being used, the points in this article are relevant as Server logic will have to determine when something requires a end point or not. 


In a REST architecture an entity or a resource (for the rest of the article the term entity will be used) may or may not have its own address.  For example, suppose we have an inventory application merchants use to sell their products. Immediately it is possible to see a Product entity.  It's URL will look something like: /product/{id}

Now, it is possible for the merchant selling the Products to add his / her own comments to the Products.  For example, "Sells very well on Fridays" or "Consider changing price if product doesn't start selling".   A Product can have 0..* Comments.  As stated, the Product has its own address: /product/{id} for example /product/1231233

and a response payload like this



    "comments": [{


             "comment":"Sells very well on Fridays"                 

     }, {


             "comment":"Consider changing price if product doesn't start selling"  


As can be seen, the payload returns a collection of Comment Objects. Should the individual comments each have their own address as well or is it okay that they are just embedded into the Product response? To help the answer this question the following should be considered.

Does the Entity have any meaning outside the Containing Entity Context? 

If an Entity (for example Comment) has meaning outside their containing Entity (for example Product) then they should have their own address.  For example, suppose the Entity was Student and the Student returned a list of Universities he / she had studied.   These Universities have their own meaning outside the Student. So obviously the University should have its own address. In the Activity / Comments scenario, the Comments only exist for the activity.  No other Entity will ever reference them or need to reference them. Therefore further aspects needs to be considered.

Is it desirable to perform actions on the individual entities? 

Should the client be allowed to create, read, update or delete the individual entity? These have to be considered separately.

Writes: Create, Update, Delete 

In the Product / Comments scenario, a Comment would never be created outside or without an Product. It is essentially added to an Product. This could be considered as a partial update to the Product.  However, an update or delete to an existing Comment could also be considered a partial update to the Product.   This creates complexity on how to differentiate between Create / Updates and Deletes of a Comment using a partial update on the Product.  If this is required, it would be much simpler to create a contextual address for the Comment (which indicates the hierarchical nature of the Product / Comment) then allow the Client sent POST, PUT, PATCH, DELETES to that.

Example URL: /product/1231233/comment/1


In some scenarios the parent containing Entity may not return all the information about the child Entities. For example, again consider the Product --> Comment scenario. Suppose the comment was very large. This would mean the payload for the Product was also very large.  In such cases, it might be more prudent for the Product to just return a summary of the Comment and if the client wants the full Entity to make an individual request.  Similarly, if there's a big performance cost to get an individual Entity (for example a 3rd party API has to be invoked to get all the information about the comment), it can make more sense to just send a URL link to the Entity rather the than the actual entity contents.

N+1 Problem 

If individual Reads are required, be careful that the N+1 problem doesn't then get introduced. For example, suppose a Product could have 100 Comments. the Product API will only return a summary of the Comment and a link to each individual comment if the client wants all the information. However, if the client wants every single comment, this means there will now be 100 HTTP requests. If this is a potential scenario, then a secondary endpoint which aggregates all the comments into the Product should be considered. This is similar to the API Gateway pattern.

Surface Area of Endpoints

In any architecture when contracts are published, if there are too many it can become very unwieldy for developers to understand. Most well known APIs (e.g. PayPal, Amazon, Twitter, Google) usually only have about 20 - 30 addresses. This is a good aim to have. If there are 5,000 different addresses it can become way too large and difficult to control etc.

In summary, the decision diagram provides guidance on what you should do.

Sunday, January 7, 2018

Are you forgetting your Agile values?

A while back I wrote why sometimes Agile will fail.  In this post, I will focus on the specific misunderstandings of Agile values.   When people ask if you're Agile, they basically think:
  • Do you have stand ups?
  • Do you have retrospectives?
  • Do you have stories?  
  • Do you use yellow post its?
  • etc
Such ideas belong to an Agile process called Scrum which is the most popular Agile process but isn't the only one.   There are other processes: Kanban, RUP, XP, etc...  You don't necessarily have to be Scrum.

For me the most important thing that came from Agile was the actual manifesto.  It should be read by everyone working in software at the beginning of every year and at the beginning of every project. Then it should be discussed with team members and people should be encouraged to give specific examples they have of the values and principles detailed in the manifesto.    In this blog post,  we'll have a look at the values.

8 Values with emphasis on 4

There are four points which detail 8 values in the manifesto
  1. Individuals and interactions over processes and tools
  2. Working software over comprehensive documentation
  3. Customer collaboration over contract negotiation
  4. Responding to change over following a plan
Then there is the very important sentence: "That is, while there is value on items on the right, we value items on the left more."  It is worth saying that sentence twice.  Maybe three times... four times... whatever it takes so you never forget it.  Why? Well let's take them one by one.

Individuals and interactions over processes and tools

So does this mean if you are Agile do you stop worrying about process?  No.  No project can ever be successful without process.   From an Agile perspective, it means, you value process. But, you value Individuals and interactions more.   So for example,  say you spend lots of man hours in a process that isn't really adding value to the project or customer. When you look at this critically it is because developers and testers don't talk to each other.  Instead they have a convoluted process so they think they can blame each other when a production defect happens.  Lots of man hours goes into this.  But, it would be much more efficient if they talked regularly to each other which then negates the aspects of the process which are making it convolutedSo the challenge is two fold:
  • Ensure people talk to each other regularly
  • Ensure your processes are efficient 
So for example, it makes more sense to have a regular show and tell then no interaction whatsoever and a massive delivery at the end where you are hoping you will get customer satisfaction because of some long complex process.   Aim to make individuals, interactions in the heart of your processes.

Working software over comprehensive documentation

This is a classic misunderstanding.  Somebody wants comprehensive documentation for some complex functionality and a developer retorts: "Hey Dinasour, this isn't waterfall, there is no need for detailed documentation".   Wrong.  Documentation is still required. The point here is that with Agile you shouldn't be spending massive amounts of man hours on documentation when your software is riddled with bugs.   It is more important that your software works.  This means more time should be spend on excellent automated tests that have sufficient functional coverage.  This isn't always easy to do but you should ensure this is done. 

Customer collaboration over contract negotiation

Thirdly, do you stop doing contracts agreements with customers on your projects.  No. The point here is you spend more man hours with meetings collaborating with customers than you do teasing out a contract with lawyers.  Instead of spending a massive amount of time to get sign off on a massive project, you should collaborate thru the life cycle of the project, breaking it up into small increments, take on board feedback and work together towards a common goal: project success. 

Responding to change over following a plan

Lastly, do you stop planning?  Of course not.  But again, what is the point planning down to the nitty gritty if you cannot even respond to change? Could you imagine a customer asked for a tiny change to how a UI was displaying data and the developer team respond with: "Sorry that wasn't in our 6 month plan"? It is paramount that architecture facilitates reasonable change and if it can't the project will struggle to really embrace an agile philosophy. 


Agile is actually about putting more constraints on your software methodology.    As an analogy, the REST architecture style puts constraints on your architecture for example the Uniform Interface, Statelessness and Caching capabilities.  The idea is then by sticking to these constraints (and that's a technical challenge) you get benefits.  In the case of REST, your architecture will be more scalable, extensible and lead to much higher developer productivity for API consumers.  In the case of Agile, by sticking to the constraints of valuing 8 key concepts but putting even more emphasis on 4, your project will have greater chance of success. 

Wednesday, November 15, 2017

More Fail early - Java 8

Fail fast or Fail early is a software engineering concept that tries to prevent complex problems happening by stopping execution as soon as something that shouldn't happen, happens.   In a previous blog post and presentation I go more into detail about the merits of this approach, in this blog post I will just detail another use of this idea in Java 8.
In Java, Iterators returned by Collection classes e.g. ArrayList, HashSet, Vector etc are fail fast. This means, if you try to add() or remove() from the underlying data structure while iterating it you get a ConcurrentModificationException. Let's see:
import static java.util.Arrays.asList;
List ints = new ArrayList<>(asList(1,2,3,4,5,6,9,15,67,23,22,3,1,4,2));
for (Integer i: ints) {
    // some code
    ints.add(57);  // throws java.util.ConcurrentModificationException
In Java 8u20, the Collections.sort() API is also fail fast. This means you can't invoke it inside an iteration either. For example:
import static java.util.Arrays.asList;
List ints = new ArrayList<>(asList(1,2,3,4,5,6,9,15,67,23,22,3,1,4,2));

for (Integer i: ints) {
    // some code
    Collections.sort(ints); // throws java.util.ConcurrentModificationException
This makes sense. Iterating over a data structure and sorting it during the iteration is not only counter intuitive but something likely to lead to unpredictable results.  Now, you can get away with this and not get the exception if you have break immediately after the sort invocation.
import static java.util.Arrays.asList;
List ints = new ArrayList<>(asList(1,2,3,4,5,6,9,15,67,23,22,3,1,4,2));

for (Integer i: ints) {
    // some code
    Collections.sort(ints); // throws java.util.ConcurrentModificationException
But, that's hardly great code. Try to avoid old skool iterations and you use Lambdas when you can. But, if you are stuck, just do the sort when outside the iteration
import static java.util.Arrays.asList;
List ints = new ArrayList<>(asList(1,2,3,4,5,6,9,15,67,23,22,3,1,4,2));
for (Integer i: ints) {
    // some code
or use a data structure which sorts when you add.

This new behaviour of the Collections.sort() API came in Java 8 release 20.   It is worth having a look at the specific section that details the change in the API:
Area: core-libs/java.util.collections
Synopsis: Collection.sort defers now defers to List.sort
Previously Collection.sort copied the elements of the list to sort into an array, sorted that array, then updated list, in place, with those elements in the array, and the default method List.sort deferred to Collection.sort. This was a non-optimal arrangement.
From 8u20 release onwards Collection.sort defers to List.sort. This means, for example, existing code that calls Collection.sort with an instance of ArrayList will now use the optimal sort implemented by ArrayList.

I think it would have helped if Oracle were a little more explicit here on how this change could cause runtime  problems.   Considering everybody uses the Collections framework if an API that previously didn't throw a exception now can for the same situation (bad code and all that it is), it is better if the release notes made it easier for developers to find that information out.