Wednesday, June 8, 2016

Triggering a Client Cache Refresh

More and more web sites are using a single-page style architecture.  This means there is a bunch of static resources (aka assets), including:
  • JS 
  • CSS
  • HTML templates
  • Images
  • ...
all residing in the client's browser.  For performance reasons, the static content is usually cached. And like all caches, eventually the data gets stale and the cache must be refreshed.  One excellent technique for achieving this in the web tier is cache busting (see: here and here).
However, even using this excellent pattern, there still needs to be some sort of trigger to indicate that things have gone out of date.  Some options:

HTML5 WebSockets

In this case the server can push a message to any connected client telling it to go and get the new stuff.

Comet style polling

The client constantly polls the server in the background, asking if there is a new version available. When the server says there is, the client is triggered to start cache busting.

Both of these are good approaches but in some edge cases may not be available:
  1. The architecture may not allow HTML5 WebSockets - it might be some secure banking web site that just doesn't like them.
  2. The Comet style polling might be too intensive, and may leave open edge cases where a critical update is needed but the polling thread is still waiting to fire.

Client refresh notification pattern

Recently, on a project, neither of the above approaches was available, so I needed something else, which I shall now detail.  The solution was based on some existing concepts already in the architecture (which are useful for other things) and some new ones that needed to be introduced:
  1. The UI always knows the current version it is on.  This was already burnt into the UI as part of the build process.
  2. Every UI was already sending up the version it is on in a custom header, on every request.  Note: this is very useful as it can make diagnosing problems easier. If someone is seeing weird problems, it's nice to be able to see straight away that they are on an old version.
The following new concepts were then introduced:
  1. The server would now always store the latest client version it has received.  This is easy to do and, again, a handy thing to have anyway.
  2. The server would then always add a custom header to every response to indicate the latest client version it has received.  Again, a useful thing to have for anyone doing a bit of debugging with Firebug or the Chrome dev tools.
  3. Some simple logic would then be in the client, so that when it saw a different version in the response to the one it sent up, it knows there's a later version out there and that's the trigger to start cache busting!   This should be done in a central place, for example an Angular filter (if you are using Angular).
  4. Then, as part of every release, just hit the server with the latest client during smoke testing.
  5. As soon as any other client makes a request, it will be told that there is a later client version out there and it should start cache busting.
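To make the client-side check concrete, here is a minimal sketch in TypeScript. The header name X-Client-Version, the baked-in CURRENT_VERSION constant and the shouldBustCache function are all illustrative names I've made up, not from the actual project; in Angular this logic would live in the central filter mentioned above.

```typescript
// Hypothetical header name; the real project used its own custom header.
const VERSION_HEADER = "X-Client-Version";

// Burnt into the UI bundle at build time (concept 1 above).
const CURRENT_VERSION = "1.5.0";

// Central check run against every response: returns true when the server
// has seen a newer client than us, i.e. the trigger to start cache busting.
function shouldBustCache(responseHeaders: Map<string, string>): boolean {
  const latestSeen = responseHeaders.get(VERSION_HEADER);
  // No header, or same version as ours: nothing to do.
  return latestSeen !== undefined && latestSeen !== CURRENT_VERSION;
}
```

So a response carrying "1.5.1" would return true and kick off cache busting, while a response carrying our own "1.5.0" (or no header at all) would not.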

So the clients effectively tell each other about the latest version, but without ever talking to each other; it's all via the server.  It's like the mediator pattern. But it's still probably confusing, so let's take a look at a diagram.



With respect to the diagram above:
  • UI 1 has a later version of static assets (for example 1.5.1) than UI 2 (for example 1.5.0)
  • The server thinks the latest version of static assets is 1.5.0
  • UI 1 then makes a request to the server and sends up the version of static assets it has, i.e. 1.5.1
  • The server sees 1.5.1 is newer than 1.5.0 and updates its latest client version variable to 1.5.1
  • UI 2 makes a request to the server and sends up the version of static assets it has, which is 1.5.0
  • The server sends a response to UI 2 with a response header saying that the latest client version is 1.5.1
  • UI 2 checks, sees that the response header client version is different to the version it sent up, and starts busting the cache
Before the observant amongst you start saying this will never work in a real enterprise environment because you never have just one server (as in one JVM) but several - true.  But then you just store the latest client version in a distributed cache (e.g. Infinispan) that they can all access.
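For illustration, the server-side bookkeeping could be sketched as below (TypeScript used here purely for brevity; the names and the dot-separated numeric version scheme are my assumptions). In the multi-server case, latestClientVersion would be read from and written to the distributed cache rather than a local variable.

```typescript
// Latest client version seen so far. In a multi-JVM setup this would
// live in a distributed cache (e.g. Infinispan) instead of a variable.
let latestClientVersion = "0.0.0";

// True when version a is later than version b, assuming dot-separated
// numeric versions like "1.5.1".
function isNewer(a: string, b: string): boolean {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const x = pa[i] ?? 0;
    const y = pb[i] ?? 0;
    if (x !== y) return x > y;
  }
  return false;
}

// Called on every request: record the version the client sent up, and
// return the latest version seen so far for the response header.
function handleClientVersion(sentUp: string): string {
  if (isNewer(sentUp, latestClientVersion)) {
    latestClientVersion = sentUp;
  }
  return latestClientVersion;
}
```

Walking through the diagram scenario: after UI 1 sends up 1.5.1, any subsequent request from UI 2 sending 1.5.0 gets 1.5.1 back in the response header.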

Note: In this example I am using two web clients and one back end server.  But the exact same pattern could be used for back end micro-services that communicate with each other.  Basically, it could be used anywhere there is a range of distributed clients (they don't have to be web browsers) and caching of static resources is required.

Until the next time, take care of yourselves. 



Tuesday, June 7, 2016

Why Agile can fail

Most development teams now claim they are doing Agile and are usually doing some variant of Scrum.  In this blog post I propose three reasons why teams can struggle with Scrum or any form of Agile development. And conveniently, they all begin with R.

Requirements

In the pre-Agile days, requirements were usually very detailed and well thought out.  For example, when I worked in Ericsson, before we moved to the Agile methodology RUP, the projects were based on a classical Waterfall model of a very high standard (usually around CMM level 3).  The requirements came from system experts who had lots of experience in the field, were very technically skilled and had incredible domain knowledge. Their full-time job was to tease things out, knowing that they effectively had only one chance to get it right.

In the Agile world, with shorter iteration cycles and many more releases, there are chances to make something right when you get it wrong.  Functionality can be changed around much more easily.   This is good.   It makes customer collaboration easier and thus enables more opportunities to tweak, fine tune and get all stakeholders working together, sculpting the best solution.

However, because the penalty for getting requirements wrong is not as great as it is in the Waterfall model, the level of detail and clarity in requirements can start becoming insufficient, and before you know it, development time in the sprint gets wasted trying to figure out what is actually required. The key is to get the balance right: enough detail so that there can be clear agreement across the dev team, customer and any other stakeholders about what is coming next, but not so much detail that people are just getting bogged down in analysis paralysis and forgetting that they are supposed to be shipping regularly.

I suggest one process for helping get the balance between speed and detail for your requirements in this blog post.

Releases

In the Agile methodology Scrum, you are supposed to release at the end of every sprint.  This means that instead of doing 1 - 4 releases a year you will be doing closer to 20, if not more.  If your release is painful, your team will struggle with any form of Agile.  For example, say a full regression test, QA, bug fixing, build, deploy etc. takes 10 days (including fixing any bugs found during smoke testing): that means 20 * 10 = 200 man days are going on releases. Whereas in the old world, with 4 releases, it would just be 4 * 10 = 40 days. In case it's not obvious, that's a little bit regressive.

Now, the simple maths tells us that a team with a long release cycle (for whatever reason) will struggle to release regularly and will thus struggle with any Agile methodology.

To mitigate this risk, it is vital that the team has superb CI with very high code coverage and works with a very strict CI culture.  This includes:
  • Ensuring developers run full regression tests on feature branches before merging into dev branches, to minimise broken builds on dev branches
  • Fixing any broken build as a priority
  • No checking into a broken build
  • Tests are written to a high quality - they need to be as maintainable as your source code
  • A coding culture where code is written in a style that makes it easily testable (I'll cover this in a separate blog post)
  • CI needs to run fast.  There's no point having 10,000 tests if they take 18 hours to run. Run tests in parallel if you have to.  If your project is so massive that it really does take 18 hours to run automated tests, you need to consider some decomposition: for example, a micro-service architecture where components are split into smaller, more manageable parts that can be individually released and deployed in a lot less than 18 hours.
For more information on how to achieve great CI, see here and here.

By mastering automated testing, the release cycle will be greatly shortened.  The dev team should be aiming towards a continuous delivery model where, in theory, any check-in could be released if the CI says it is green.  Now, this all sounds simple, but it is not.  In practice you need skilled developers to write good code and good tests.  But the better you are at it, the easier it will be to release, and the easier it will be to be truly agile.

Note: One of the reasons the micro-services architectural style has become popular is that it offers an opportunity to avoid big bang releases and instead release only what needs to be released. That's true.  However, most projects are not big enough or complex enough to need this approach.  Don't just jump on a bandwagon; justify every major decision.

Roles 

The most popular Agile methodology, Scrum, only defines 3 roles:
  • Product Owner
  • Scrum Master
  • Dev Team
That's it.  But wait a sec! No Tech Lead, no Architect, no QA, no Dev Manager - surely you need some of these on a project.
Of course you do. But this can often be forgotten.  Scrum is a mechanism to help manage a project; it is not a mechanism to drive quality engineering, quality architecture and minimise technical debt.  Using Scrum is only one aspect of your process.  While your Scrum Master might govern process, they don't have to govern architecture or engineering.   They may not have a clue about such matters.  If all the techies are just doing their own stories, trying to complete them before the next show and tell, and no-one is looking at the big picture, the project will quickly turn into an unmanageable ball of spaghetti.

This is a classical mistake at the beginning of a project. Because at the beginning there is no tech debt. There are also no features and no bugs, and of course all of this is because there is no code! But the key point is that there is no tech debt. Everyone starts firing away at stories and there is an illusion of rapid progress, but if no-one is looking at the overall big picture (the architecture, the tech debt, the application of patterns or lack thereof), after a few happy sprints the software entropy will very quickly explode.  Meaning that all that super high productivity in the first few weeks of the project will quickly disappear.

To mitigate this, I think someone technical has to back away from the coding (especially the critical path) and focus on the architecture, the technical leadership and enforcing code quality. This will help ensure good design and architecture decisions are made and that the non-functional targets of the system are not just well defined but are met.  It is not always a glamorous job, but if it ain't done, complexity will creep in and soon make everything, from simple bug fixing to giving estimates to delivering new features, much harder than it should be.

In the Scrum world it is a fallacy to think every technical person must be doing a story and nothing else.  It is the equivalent of saying that everyone building a house has to be laying a brick, and then wondering why the house is never finished because the bricks never seem to line up.



Someone has to stand back and ensure that people's individual work is all coming together, that the tech debt is kept at acceptable levels and that any technical risks are quickly mitigated.

Until the next time, take care of yourselves.





Thursday, June 2, 2016

Agile Databases

Any project following an Agile methodology will usually find itself releasing to production at least 15 - 20 times per year. Even if only half of these releases involve database changes, that's still 10 changes to production databases per year. So you need a good, lean process that gives you a proper paper trail but at the same time doesn't unduly slow you down. Some tips in this regard:

Tip 1: Introduce a DB Log table

Use a DB log table to capture every script run: who ran it, when it was run, what ticket it was associated with, etc. Here is an example DDL for such a table for Postgres:
create sequence db_log_id_seq;
create table db_log (
    id int8 not null default nextval('db_log_id_seq'),
    created timestamp not null,
    db_owner varchar(255),
    db_user varchar(255),
    project_version varchar(255),
    script_link varchar(255),
    jira varchar(255)
);
Regarding the table columns:
  • id - primary key for the table.
  • created - the time the script was run. This is useful.  Believe me.
  • db_owner - the user who executed the script.
  • db_user - the user who wrote the script.
  • project_version - the version of your application / project the script was generated in.
  • script_link - a URL link to a source-controlled version of the script.
  • jira - a URL to the ticket associated with the script.

Tip 2: All Scripts should be Transactional

For every script, make sure it runs within a transaction, and within the transaction make sure there is an appropriate entry into the db_log table. For example, here is a script which removes a column:
BEGIN;
ALTER TABLE security.platform_session DROP COLUMN IF EXISTS ttl;
INSERT INTO db_log (
       db_owner, db_user, project_version, script_link, jira, created)
VALUES (
       current_user,
       'alexstaveley',
       '1.1.4',
       'http://ldntools/labs/cp/blob/master/platform/scripts/db/updates/1.1.4/CP-643.sql',
       'CP-643',
       current_timestamp
);
COMMIT;

Tip 3: Scripts should be Idempotent

Try to make the scripts idempotent. If you have 10 developers on a team, every now and again someone will run a script twice by accident. Your db_log will tell you this, but try to ensure that when accidents happen there is no serious damage. This gives you a simple fail-safe, rather than some newbie freaking out.   In the above script, if it is run twice the schema ends up in the exact same state (the DROP COLUMN IF EXISTS simply does nothing the second time).

Tip 4: Source Control your Schema

Source control a master DDL for the entire project, updated any time the schema changes. This means you have both the update scripts and a complete master script containing the DDL for the entire project. The master script is run at the beginning of every CI run, meaning that:
  • Your CI always starts with a clean database 
  • If a developer forgets to update the master script, the CI will fail and your team will quickly know it needs to be updated.
  • Having a master script gives you two clear advantages:
    • New developers get up and running with a clean database very quickly
    • It becomes very easy to provision new environments. Just run the master script! 

Tip 5: Be Dev Friendly

Make it easy for developers to generate the master script. Otherwise when the heat is on, it won't get done.

Tip 6: Upgrade and Revert

For every upgrade script, write a corresponding revert script. If something unexpected happens in production, you've got to be able to reverse the truck back out!
BEGIN;

ALTER TABLE security.platform_session ADD COLUMN hard_ttl INT4;
UPDATE security.platform_session  SET hard_ttl = -1 WHERE hard_ttl IS NULL;
ALTER TABLE security.platform_session ALTER COLUMN hard_ttl SET NOT NULL;

ALTER TABLE security.platform_session ADD COLUMN ttl INT4;
UPDATE security.platform_session  SET ttl = -1 WHERE ttl IS NULL;
ALTER TABLE security.platform_session ALTER COLUMN ttl SET NOT NULL;


INSERT INTO db_log (
       db_owner, db_user, project_version, script_link, jira, created)
VALUES (
       current_user,
       'alexstaveley',
       '1.1.4',
       'http://ldntools/labs/cp/blob/master/platform/scripts/db/reverts/1.1.4/revert-CP-463.sql',
       'CP-463',
       current_timestamp
);

COMMIT;

Until the next time take care of yourselves.


Thursday, May 19, 2016

Immutable pointers - a pattern for modular design to help achieve microservices

Modular design has a lot of benefits, including:
  • making it easier to predict the impacts from a change
  • helping developers work in parallel
But it is much, much harder than people think.  For a start, the abstractions have to be at the right level.  Too high and they become meaningless; too low and they cease to be abstractions, as they will end up having way too many dependencies (efferent and afferent).

In a recent greenfield project, in the area of digital commerce, I was intent on achieving good modular design in the back end, for the reasons outlined above and because the project had the potential to grow into a platform (more than likely microservices) that would be used by multiple teams. To achieve the modular design, after some whiteboarding the back end was split up into a bunch of conceptual components that together delivered the shopping functionality.  The core components are listed below.


  • Shopping component - Core shopping functionality e.g. Shopping Cart management
  • User component - user management, names, addresses etc
  • Purchases - details about previous transactions
  • Merchant - Gateway to merchant API to get information about merchant products etc. 
  • Favourites - Similar to an Amazon wishlist
Now, every software project uses the word "component" differently. In this project, a component was strictly defined as something that did something useful and contained:
  • domain classes 
  • services
  • a database schema (could be its own database, but the idea was to isolate it from any other component's persistence)
  • its own configuration
  • its own dedicated tests
  • its own exception codes domain 
The outside world could access a component via a bunch of ReSTful endpoints. 

Any component could be individually packaged, deployed etc.  Now, the astute out there will be thinking, "this sounds like microservices" - well, they almost were. For this project, some of them were co-located, but they were architected so that splitting them out into individually deployed artefacts (and hence a microservices approach) would be easy.

Ok, to reiterate, the goal was to achieve a very clean modular design.  This meant that I didn't want any dependencies from one component's database schema to another, and for this blog post we are only going to focus on how that aspect of the modularity was achieved.

Now, looking at the above components, it doesn't take long to see that this isn't going to be so easy.  For example:
  • A shopping cart (in the Shopping component) will have a reference to a User (in the User component)
  • A shopping cart item (Shopping component) will have a reference to a Product (Merchant component)
  • A Shopping cart (Shopping component) will have a reference to a shipping Address (User component)

So the challenge of achieving modularity in the persistence tier should now be becoming clearer. References of some sort across components need to be persisted.  Immediately, any developer will ask, "Wait a sec, if we just use foreign keys for these inter-schema references we get ACID and referential integrity for free!"  True. But then you are losing modularity and introducing coupling.  Say you want to move your products (in the Merchant component) away from a relational database and use something like Elasticsearch or MongoDB instead, to leverage their searching capabilities.  That foreign key isn't so useful now, is it?

Ok, so first of all, in looking for a solution here, I thought about all the references that crossed components, to see if they had anything in common.  One thing that was obvious was that they were generally all immutable in nature.  For example:
  • When a Cart Item (Shopping component) points to a Product (Merchant component) it points to that product only.  It never changes to point to another product.  
  • When a Shopping Cart (Shopping component) points to a User (User component), that reference is also immutable.  My shopping cart is always mine; it never changes to be someone else's.
So I was now starting to think in terms of preferences:
  1. Avoid cross component dependencies if you can (this should be kinda obvious).
  2. If you have to have them, strive for immutable references.
Next up was a name for this type of relationship, which I was calling an "immutable pointer" in design and architecture documents. But for actual code I needed something more succinct. The database schema was already using "id" for primary keys and "{name_other_relationship}_id" for foreign keys. So I decided all cross component relationships would be named as the name of the entity being pointed to plus "Ref".

So, some concrete examples:
  • userRef   (ShoppingCart pointing to the user)
  • productRef (CartItem pointing to the product)
  • shippingAddressRef (ShoppingCart pointing to the ShippingAddress)
This meant that any time anyone saw something like "xyzRef" in code, schema or logfiles, they knew it was a cross component reference. In case it wasn't obvious, "Ref" was an abbreviation for Reference.

Next up was to decide on the format of the actual refs.  I took a bit of inspiration from that thing called the internet, which of course has a similar concept: abstractions (web sites made up of web pages) contain immutable pointers to other abstractions, i.e. hyperlinks to web pages in other web sites.


So, similar to hyperlinks and URLs, the refs would follow a hierarchical, namespaced format.
Some good team input from a senior technical member suggested continuing the inspiration from the Web and starting the hierarchical names with cp:// - CP for Commerce Platform, the name of the project. This was analogous to http://.  I thought this was a good idea, as it indicates that our platform generated the reference.  Again, this meant the refs stood out in logfiles etc. and could be differentiated from any downstream components also using hierarchical-type references, but in a different context.

The key point of a ref was that, when generated, it should of course be unique.  To achieve this, a mixture of database primary keys and things unique about the data (e.g. product SKUs) was used.

So some examples: 
  • userRef -> cp://user/{uuid of user}
  • productRef -> cp://merchant/{uuid of merchant}/product/{sku of product}
  • cardRef -> cp://user/{uuid of user}/card/{uuid of card}
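As a hedged sketch, small helpers for building and parsing such refs might look like the following; buildRef and parseRef are made-up names, and the real project's implementation isn't shown in this post.

```typescript
// The scheme prefix marking a reference generated by our platform.
const SCHEME = "cp://";

// Build a ref from path segments, e.g. buildRef("user", "42").
function buildRef(...segments: string[]): string {
  return SCHEME + segments.join("/");
}

// Split a ref back into its segments; rejects foreign schemes so refs
// from downstream systems are never silently accepted as our own.
function parseRef(ref: string): string[] {
  if (!ref.startsWith(SCHEME)) {
    throw new Error("not a cp:// ref: " + ref);
  }
  return ref.slice(SCHEME.length).split("/");
}
```

For example, buildRef("user", "42") gives "cp://user/42", and parseRef("cp://merchant/m1/product/sku-9") gives back the segments ["merchant", "m1", "product", "sku-9"].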

Always immutable?

As stated, the first preference was to avoid the cross component reference altogether. The second preference was to use the immutable pointer (ref) pattern. However, what about an edge case where the cross component reference has to be mutable?  Could this happen?  Well, it could.  It's easiest to explain with an example.

Every shopping cart doesn't just have a User; it also has a selected shipping address where the purchased contents will be shipped to. In the domain model, the user's address lived in the User component.  But unlike the other cross component references, the shipping address could change. Consider your Amazon shopping cart.  Imagine you are on the checkout screen with your selected card and your selected address, but before you proceed to checkout you go into your user preferences and delete your card and address.  This project had to facilitate similar scenarios.  So if the user deletes the address that a shopping cart is pointing to, what should happen?

For inspiration for a solution here, we can look to the various NoSQL patterns, and in particular one of the most popular: eventual consistency. What this says is that, unlike ACID, you don't always need consistency straight away, all the time.  In certain cases it is okay to allow inconsistency, on the basis that the system is able to reconcile itself.

So in this case:
  1. The shopping cart is pointing to a specific shipping address using an addressRef.
  2. The user deletes that address by hitting a ReST endpoint in the User component.
  3. This means the shopping cart now points to an address that doesn't exist. The system is inconsistent.
  4. The next time the user reads the shopping cart, as part of the request handling the Shopping component asks the User component if the address with this addressRef still exists, and if it doesn't, removes the pointer.
  5. The system is now consistent.
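Step 4 (the read-time reconciliation) could look something like this sketch, with hypothetical types standing in for the Shopping component's cart and the call to the User component's endpoint (the post doesn't show the real APIs):

```typescript
// Illustrative shape of a cart in the Shopping component.
interface ShoppingCart {
  userRef: string;
  shippingAddressRef?: string;
}

// Stand-in for the call to the User component's ReST endpoint that
// checks whether an address ref still resolves.
type AddressExists = (addressRef: string) => boolean;

// On every cart read: if the address the cart points to has been
// deleted, drop the dangling ref. The system reconciles itself.
function reconcileCart(cart: ShoppingCart, addressExists: AddressExists): ShoppingCart {
  if (cart.shippingAddressRef && !addressExists(cart.shippingAddressRef)) {
    return { ...cart, shippingAddressRef: undefined };
  }
  return cart;
}
```

Note the immutable userRef is never touched here; only the one mutable cross component reference needs this reconciliation path, which is exactly why the immutable refs were preferred in the first place.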
So, with the architect hat on, it is really important we get all of this right. Otherwise the goal of modular design falls apart.

In this case, it is worth reiterating the strategy one more time:
  1. Avoid cross component references.  It doesn't matter how great your pattern is: if you have a lot of cross component references, it is more than likely you have got the component abstractions at too low a level.
  2. Favour immutability.  In general, immutability means fewer code paths, fewer edge cases and less complexity in the code.
  3. Use eventual consistency for the rare cases where a cross component reference must be mutable.
Software architecture is about trade-offs and finding the right balance.  In this case, it was the balance between a clean modular design and not going overboard with it so that it becomes impossible to achieve.

For anyone trying to do microservices, I would strongly recommend mastering modular design in your architecture first.  If you can't do this, then when you add in the complexity of the network, things will get very complicated.