Sunday, April 23, 2017

Primer on 12-factor app



The 12-factor app guideline was published by Heroku, the platform-as-a-service (PaaS) provider (now part of Salesforce.com). It is a set of principles/best practices that one needed to follow to build applications on the Heroku PaaS. Over time, it has become the de facto standard for building web applications on the cloud, or more precisely, cloud-native applications. The original 12-factor app guideline is available at 12factor.net and was not really meant for microservices. In this post, I will go over the 12-factor app guidelines and check how they fit into a microservices architecture.

1 – Codebase
One codebase tracked in revision control, many deploys

The application codebase should be maintained in a version control system (VCS). All developers should have easy access to the VCS. There is only one codebase per application in the VCS, but there can be multiple deployments of this codebase. For example, an application can be deployed in different environments in a typical CI pipeline - pre-production, user acceptance, production, etc. These environments run the codebase of the same application, but possibly in different states or versions. Pre-production can be a few commits ahead of the code that is currently running in production.

2 – Dependencies
Explicitly declare and isolate dependencies

All web applications rely on external libraries (notably the framework libraries/jars) to run. There is a high chance that the target deployment/server environment (say, your web server) may not have the dependent libraries/jars. Hence the web application must declare all dependencies with the correct versions. These dependencies can then be included in the deployable unit shipped to the web server.
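For a Java application, this typically means declaring every dependency with an explicit version in the build manifest, for example a Maven pom.xml. The artifact below is only an illustration, not from any specific project:

```xml
<dependencies>
    <!-- Every library the app needs is declared with an exact version;
         nothing is assumed to pre-exist on the target server. -->
    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-webmvc</artifactId>
        <version>4.3.7.RELEASE</version>
    </dependency>
</dependencies>
```

The build tool then resolves and bundles these jars into the deployable unit, so the application never depends on libraries lying around on the server.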

3 – Configuration
Store config in the environment

Configuration must be kept separate from the source code. This may seem obvious, but we are often guilty of leaving critical configuration parameters scattered in the code. Instead, applications should have environment-specific configuration files. Sensitive information such as database passwords or API keys should be stored in these files in encrypted format.
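One common approach is to read settings from environment variables, with a sensible default for local development. This is a minimal sketch; the variable name DB_URL and the fallback URL are illustrative, not from the original post:

```java
public class Config {
    // Read a setting from the environment, falling back to a default
    // so the same code runs unchanged in every environment.
    static String get(String name, String fallback) {
        String value = System.getenv(name);
        return (value == null || value.isEmpty()) ? fallback : value;
    }

    public static void main(String[] args) {
        // In production, DB_URL is injected by the platform, never hard-coded.
        System.out.println("Using database: "
                + get("DB_URL", "jdbc:mysql://localhost:3306/app"));
    }
}
```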

4 – Backing Services
Treat backing services as attached resources

An application connects to backing services over the network. A backing service can be a database like MySQL or MariaDB, a distributed cache based on Redis or Hazelcast, or a NoSQL store like MongoDB. Applications typically use connection strings/URLs to connect to these systems. If such a service is moved to a different node, or a new node comes up, its connection details will change. The application should be able to handle such changes without code changes, rebuilds, or redeployments, so these connection settings should be part of the configuration. Another example is moving the application from pre-production to production; the connection settings for the database server will certainly differ. The code should detect the environment profile and function as expected without any change.
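The idea can be sketched as follows: the backing service's entire identity lives in a connection URL taken from configuration, so moving it to a new host is a configuration change only. The URL below is a made-up example:

```java
import java.net.URI;

public class AttachedResource {
    // The backing service is identified entirely by its URL; swapping
    // hosts or environments changes only this configured string.
    static String hostOf(String serviceUrl) {
        return URI.create(serviceUrl).getHost();
    }

    public static void main(String[] args) {
        String dbUrl = "mysql://db.prod.example.com:3306/insurance";
        System.out.println("Connecting to " + hostOf(dbUrl));
    }
}
```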

5 – Build, release, run
Strictly separate build and run stages

This principle states that build, release and run stages should be treated separately.
The build stage is where the developer is in charge. This is where feature/capability branches are created, development is done, tests are run, and the code is finally merged to the integration/develop branch.
In the release stage, the software is prepared for a possible release to production - maybe a release candidate or a general availability version. Regression and other tests are run to verify that the software behaves as defined in the specification and can be deployed/pushed to production. The actual production release is also tagged.
Finally, in the run stage, the application is actually deployed and pushed to production. It should then run without any intervention or modification.

If a bug is detected in production, or a new feature comes along, it has to be addressed all the way back in the build stage after detailed analysis. This disciplined approach minimizes risk, creates traceability, and establishes a well-oiled process. It is evident that automation around an agile CI/CD process is key to implementing this guideline.

6 – Processes
Execute the app as one or more stateless processes

This guideline suggests building stateless web applications. Such applications are easy to scale and upgrade. The application state is stored only in backing stores like databases. This is also a kind of warning that holding too much session state and sharing it across a cluster is not a best practice.


7 – Port Binding
Export services via port binding

The application is self-contained and exports its services by binding to a port. The services are then accessible (securely, to the external world) via URLs exposed over the HTTP protocol.
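In other words, the app brings its own web server rather than being deployed into one. A rough sketch using the JDK's built-in HTTP server follows; the PORT variable and the handler are illustrative assumptions:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

public class SelfContainedApp {
    // Start an embedded HTTP server; port 0 asks the OS for any free port.
    static HttpServer start(int port) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
            server.createContext("/", exchange -> {
                byte[] body = "ok".getBytes();
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });
            server.start();
            return server;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // The platform tells the app which port to bind, not the other way round.
        int port = Integer.parseInt(System.getenv().getOrDefault("PORT", "8080"));
        start(port);
    }
}
```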


8 – Concurrency
Scale out via the process model

The application should run as one process on the OS. That process can then be scaled out horizontally across multiple nodes.


9 – Disposability
Maximize robustness with fast startup and graceful shutdown

The application must start up fast. In the case of a new deployment or an upgrade, this should not take hours. In the case of a crash, spawning a new node and bringing the application up on it should not take more than a few minutes. The application instance must be capable of shutting down gracefully without impacting overall scalability (#8) or jeopardizing the backing stores (#4). Also, while starting up and shutting down, the application should notify the monitoring tools on the network.
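In the JVM, graceful shutdown is commonly wired up with a shutdown hook that runs when the process receives a termination signal, giving the app a chance to drain in-flight work and close connections. A simplified sketch, with an illustrative cleanup action:

```java
public class GracefulApp {
    static volatile boolean stopping = false;

    // Register cleanup to run when the JVM receives SIGTERM or exits
    // normally: stop accepting work, finish in-flight requests, and
    // close connections to backing services.
    static void onShutdown(Runnable cleanup) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            stopping = true;
            cleanup.run();
        }));
    }

    public static void main(String[] args) {
        onShutdown(() ->
                System.out.println("Closing connections, notifying monitoring..."));
    }
}
```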

It is evident that this principle indirectly points to cloud deployments backed by a continuous deployment/CI/CD pipeline.

10 – Dev/Prod parity
Keep development, staging, and production as similar as possible

I would put it differently. I do not think your development environment needs to mimic staging and production, provided that the application is packaged with a good build tool and robust dependency management is in place.

Instead, your build and staging environments should resemble production as closely as possible.
That implies having exactly the same versions of the operating system, virtualization, databases, and any other relevant services. This shortens the cycles needed to unearth infrastructure/environment-related issues. Such bugs can be nasty, are discovered late, and can cause a lot of frustration and consume time to fix. Also, there is a lot more confidence going to production, as you know you have already run the application in exactly the same kind of environment in staging.

This guideline again has deep consequences for classical applications. I will explain this in the next post.

11 – Logs
Treat logs as event streams

Logging is critical for debugging failures in a running application. The application must generate logs, but it should not bother about the storage, management, or analysis of log entries. That is a separate concern for specialized tools. The application generates the log event stream and routes it to these specialized services for analysis, emitting critical alarms, and archival for future reference (for example, in some cases logs are retained for a certain duration due to regulations or laws).
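In practice, this means writing one event per line to stdout and letting the platform capture and route the stream. The format below is just an illustration:

```java
import java.time.Instant;

public class Log {
    // One log event per line on stdout; storage, routing, and analysis
    // are the platform's concern, not the application's.
    static String event(String level, String message) {
        return Instant.now() + " " + level + " " + message;
    }

    public static void main(String[] args) {
        System.out.println(event("INFO", "policy created"));
    }
}
```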

12 – Admin Processes
Run admin/management tasks as one-off processes

After an application is deployed and goes live in production, it needs to be monitored and managed. You may want to change certain internal parameters (via JMX, for example), restart a backing server instance, delete some useless files, etc. In production, such tasks should be run separately, in a process distinct from the actual application.

Saturday, April 22, 2017

So what's wrong with Classic Applications a.k.a. Monoliths?

A real life story

In my previous post introducing microservices, I wrote about an insurance product I was working on at the beginning of my IT career. It started off very simply as a layered application. Well, it was not simple; the learning curve of the J2EE of that time was indeed steep. Just like microservices today, it was the shiny new thing. The application started with just a couple of us. Then we started adding features. The product was sold. I was happy; I went to install and implement it at customer sites. It was fun.

Then customers demanded more and more features. New features were also required to address new business challenges and compete with established players. So we kept adding more bricks and mortar to the same structure. Two-plus years flew by, and I realized that we had about 20-odd modules, 500+ tables, and a team of nearly 100. We had several customers in the US and elsewhere using this product. It was a big success in a short span of time.

But I also realized that I no longer fully understood the system. I had no clue, for example, what the product module was doing, nor did I understand the re-insurance module. I could also observe that new team members struggled to get on board. If learning EJB was not overwhelming enough, the size of the code base scared them. Changes were now a difficult and time-consuming affair. When making a change in the product module, for example, you also needed to ensure nothing broke in the other modules. This meant several long meetings. Even if we ensured everything was all right and put several features together for the next release, something or other would break. This led to slow and lengthy deployments. I have already described the challenges of scaling this application in my previous post.

In short, an application built on the classic architecture becomes extremely difficult to maintain, change, and operate as it grows big over a period of time.

I have many such stories and experiences of large classical applications becoming obese and fatigued over time.

A short summary of the issues with large classic applications:

Figure 1 - Monolith Challenge Balls


Comprehension paralysis

It is extremely difficult to comprehend each and every aspect of a large application codebase. This is true even if you have been working on the application from Day 0. As the application grows bigger and bigger, it becomes next to impossible for one developer/designer to know everything.

High Ramp up time

This is a fallout of comprehension paralysis. New developers take months, instead of a couple of days or weeks, to come up to speed. Even when they do, they never feel assured, barring a few exceptions.

Design Decay

New developers joining the team find it extremely difficult to understand the design, so they resort to quick-and-dirty changes. The result is degradation of the overall design and code quality. I have seen developers find it so difficult that they at times wrote JDBC code in JSPs (amazing stuff!).

Tight coupling

Since all the modules sit together in one code base, they are very tightly coupled. Some concern or other would sneak in across module boundaries.

Snail paced changes

Due to the tight coupling, the team would spend hours figuring out how to ensure that a change in one module would not break another. This can be a sign of collaboration, but it is a big impediment to progress. Team members, designers, and architects would argue for hours and days to conclude on the best way to implement a change.

Continuous "SLOW" delivery 

CI/CD did not exist in 2001, but assume for a minute that it did. Even then, the slow pace of change and the gigantic code base would have made it extremely difficult. Development was slow (the Visual Age and BEA IDEs ran slowly with the large code base, and it was the era of costly memory; we had only 512 MB/1 GB on developer workstations), and there were so many tests that it was not possible to get quick feedback. This also meant a loss of developer productivity.

Long downtime

Since changes were difficult, several features would be clubbed into one release instead of short, focused releases. As a result, a lot of testing was required before and after pushing to production to ensure everything worked fine. Also, you had to follow a long manual to deploy and could not afford to miss a line. This meant long maintenance windows with downtime, which did not make the customers/end users happy. Deployments were done after working long hours late into the night, as the system was mainly accessed during the day and batch processes ran after 6 pm. Since the whole code base was redeployed, with so many EJB components, the application itself took a long time to come up. I remember people going for a coffee and cigarette break after restarting the server 😁

Scaling challenges

It is an absolute nightmare to do capacity planning for such large systems. As I explained before, even if you know which parts need to scale more than others, the tight coupling means you cannot do anything about it, i.e., scale selectively with optimal hardware.

Technology lockdown

In such tightly coupled applications, or "monoliths", you could be locked forever into a specific technology, server, or vendor. Over a period of time, new and better technologies and languages come up. Even if you know they are better, and even if you know you can or should make the switch, two things will prevent you from doing so. First is inertia, from both management and the development group: it is working and running, so let it run. Second, the cost of such a switch can be very expensive, and there is no sponsor for it.


Non-technical challenges

Some people would know the system and code better than others. They would hold management at gunpoint when it came to appraisals, raises, and bonuses. I am not joking; it's true.


Wednesday, April 19, 2017

Microservices – Vertical Slicing

The whole idea of microservices revolves around building a large application from a set of small modules that are composed around business functions. Let me share one concrete example to understand this better. Figure 1 shows the classic layered architecture that I was introduced to back in 2001, when I was building a large product for insurance providers.
Figure 1 - Classic Architecture of the Insurance Product

Only three key modules from the product are shown here for simplicity; this was a very large product with several other modules. The UI was developed using JSPs (HTML, stylesheets, JavaScript) and packed in a war file. The business logic was written using stateless session beans. The data access layer was all entity beans. The EJBs and the war file were then packaged inside an ear file and finally deployed in a Java application server.

This application was also modular, but horizontally. That ear file contained everything - the UI, business logic, and data access logic for all the insurance business functions: underwriting, claims, product, accounting, party, agents, reinsurance, etc. We had one big "earball" that would be built and thrown at the application server.

Now, in India people tend to buy a lot of life insurance towards the end of the financial year (in the month of March), as it allows them to save taxes. So, for the last few weeks of March there would be a lot of stress on the underwriters. This, in turn, means the underwriting module (and a few related ones like product, accounting, and party) is heavily used in those weeks, while the volume of claims stays at its normal level. To handle the high load, the application had to be scaled, mostly horizontally. That means new nodes would be procured in the data center, the application server would be installed, and the same "earball" would be thrown at them. In summary, we would scale the entire application, not just the underwriting components.

Given that this product's codebase was very large and complex, a significant amount of hardware would be required across the server nodes to scale. For the sake of simplicity, let's call this hardware size 'X'. The scaling demand on the system would reduce after 31st March, so the additional servers and nodes would become redundant. However, it is not so easy to get rid of this hardware. Some companies would try to re-use the boxes for some other purpose, but that is also a time-consuming affair. In most cases, the company bleeds financially as the hardware sits in the data center without any use.

To ensure that only the underwriting component is scaled, it must be deployed separately from the rest of the application components. Now, what if an unforeseen situation arises and there is increased activity in the claims department? OK, let us break apart the claims module too. But the product management module is common to both underwriting and claims. What would you do? Package the product module with both the underwriting and claims modules? That would not be very prudent. Instead, split it out as well and let the other two modules talk to it via some mechanism. In other words, we now have three loosely coupled modules to deal with. If you manage to deploy them separately, you can scale them separately, and only then will you truly leverage cloud resources. The hardware now required to scale, say, the underwriting module is Y, which is much, much less than the X required earlier. That also means a smaller bill from the cloud provider at the end of the month.

The exercise described above is called "vertical slicing". This makes microservices autonomous - i.e., they can be changed, deployed, and scaled independently of each other. Let us now see how Figure 1 gets transformed by all this.

Figure 2 - Microservices emerging from Vertical slicing.


In Figure 2, vertical columns appear instead of the horizontal layers of Figure 1. Each vertical column is a microservice, as it is aligned to a business context/capability.

It is interesting to note that even with microservices the layered cake is present, but it is very lean now. Each microservice has its own UI, business logic, data access logic, and database, arranged in layers. Microservices can be thought of as vertically sliced, lean, layered-cake components that work together to compose a bigger application. This makes me wonder whether "microservices" is a misnomer; a better name would have been "mini-apps", because each vertical pillar is a self-contained application in itself.

Now, the underwriting and claims microservices would definitely need to query the product microservice. Even the claims module needs to query the underwriting module. The underwriting module, in turn, needs to communicate with the party microservice (not shown in the image for simplicity) whenever a new policy is created. In short, microservices cannot function in silos; they must communicate. Since microservices widely expose REST endpoints over HTTP, that can be used as the communication protocol. Microservices can also communicate over asynchronous channels like a messaging system.
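For example, a synchronous call from the claims service to the product service could be built with the JDK's HTTP client. The service URL and resource path below are hypothetical, purely to illustrate the shape of such a call:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ProductClient {
    // Build a GET request against the product microservice's REST endpoint.
    static HttpRequest requestFor(String baseUrl, String productId) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/products/" + productId))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // In a real deployment, the base URL would come from configuration
        // or service discovery, never hard-coded like this.
        HttpRequest request = requestFor("http://product-service:8080", "42");
        System.out.println(request.uri());
        // HttpClient.newHttpClient().send(request, ...) would execute it.
    }
}
```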

So, if deploying microservices as individual components in separate processes was not challenging enough, internal communication between microservices over the network escalates the challenge even further. You need solid business alignment and DevOps support on the cloud (public/private) to successfully build applications based on the microservices architecture. Luckily, in 2017 we have exceptional cloud providers - AWS, Azure, Google, and DigitalOcean, to name a few. Add to that the excellent plethora of continuous delivery tools and containerization options. These complement each other to help implement applications based on microservices. You might think the stars have somehow aligned to build the loosely coupled applications we have wanted for years.

It's not just scaling needs or the support of cloud infrastructure and DevOps tools that is fueling microservices-based applications. There are other factors as well. Computers have exploded in many forms and shapes in the recent past. If laptops and desktops were not enough, you have tablets and mobiles. We are already into the age of wearables (smartphones will soon be a thing of the past, in my opinion), and the best is yet to come with IoT fuelled by 5G, possibly ushering in a new industrial revolution. This means applications will not just be accessed by laptops, desktops, or phones; there will be a plethora of other clients, including robots, drones, and maybe even your teeth and hair. This will put tremendous demand and strain on software applications in the future. Hence, it is imperative for software applications to switch to a loosely coupled microservices architecture and gain extreme agility and flexibility.

Tuesday, April 18, 2017

What are Microservices?

Microservices is still a new and evolving subject, and hence there is a lot of confusion around the term and the concepts behind it. A lot of clarity is starting to emerge of late, as teams try to embrace this new architectural style. People tend to think, and preach, that because the application they have developed recently exposes a few REST web services, they are doing microservices. Some have even installed an API Gateway in front of the REST web services layer. Trust me, this is still the layered-cake classic architecture, which is just the opposite of microservices. The naysayers call this a monolith. I have some reservations about that word; I will explain my reasons later.
Figure 1 - Classic Architecture
Microservices is an architectural style. In this style, an application (generally a large and complex one) is built from a small set of loosely coupled services that implement specific business capabilities. Well, we have been doing this for years, haven't we? What is the difference? The difference is that the services are grouped by clear business functions/capabilities. Each group of services runs in a separate process and can be deployed and scaled independently. Because microservices are scaled independently, they are best complemented by the cloud and DevOps (or at least an efficient continuous delivery pipeline). What we have done thus far is modularize the application based on the domain, but then club the modules together into one deployable unit and throw it at the server. Microservices, in contrast, recommend breaking down the application along clear business goals and then deploying those mini applications separately.

Since the services are decoupled and physically separate, they need a mechanism to talk to each other to form the combined application. This is usually done using lightweight HTTP-based protocols. However, there is no restriction or specification on this, so other protocols can be selected as per requirement. Some teams have used messaging systems like RabbitMQ or Apache Kafka to share data across microservices.

Microservices also need to integrate with external systems and user interfaces (viz. modern JavaScript front ends), for which they typically expose HTTP-based REST endpoints. This also means that the services need some form of centralized management and discovery.

Needless to say, a microservices implementation is complex. Adopting or embracing microservices is not easy, nor is it a silver bullet. It is suitable for some applications, some teams, and some environments; it is not applicable to all scenarios. The classic architectural style (or layered-cake style) is still very much relevant and also has suitable use cases. That is why I don't like applications built using the classic layered architecture being termed 'monoliths', as if they were some form of demons. I bet people instantly liked the fancy term when Martin Fowler wrote his inspiring and insightful article on microservices. My experience says the software developer/architect community loves fancy terms and jargon. Once they get hold of a new term, they make sure to break Twitter and WhatsApp with it.

Now, coming to why I have so much of a problem with the term 'monolith'. Well, because it sounds like an insult. A monolith is 'a large single upright block of stone, especially one shaped into or serving as a pillar or monument'. Generally, it refers to a massive, immovable, lifeless rock structure that wears away with each passing day due to erosion. That sounds similar to the dinosaurs: they were massive creatures and are now extinct. The so-called monoliths are not. They have a life; they respond to requests from the external world by processing business rules and manipulating data. They also evolve over time to cater to changing business needs.

Also, we need to go back in time to realize how the monoliths were created. When I started my job as a programmer back in 2001, I got a chance to work on the bleeding-edge technologies of the time - EJB 1.x stateful and stateless session beans, entity beans, and message-driven beans. EJBs and application servers were the talk of the town. They were touted to help build the "ultimate" distributed applications. I did not realize back then that I was also becoming one of the many creators of the monoliths of the future.

It did not take us long to figure out the challenges of building distributed applications with EJBs. I also read the first law of distributed object design by Martin Fowler, which said: "Don't distribute your objects". I got the point, changed path to simplicity with the Spring framework around 2004, and never looked back. I thoroughly read Expert One-on-One J2EE Development without EJB and realized it would be a better idea to do logical, not physical, separation of layers/tiers. Hence I started churning out more "monoliths".

So it is developers like me who created the monoliths. Interestingly, the same gurus who advocated monoliths a decade or so back are now advocating microservices and creating hype. In my opinion (we will discuss this more in future posts), a microservices-based application can be a daunting challenge, even bigger than EJBs. However, it is not without virtues either.

The crux of the matter is that there is no need to be ashamed of the fact that we created "monoliths". Instead, let's be proud. "Monolith" applications have solved lots of business problems in the past and still do today. They provided the best solution at the time. There is no need to follow the Pied Piper of Hamelin and resort to microservices for every solution. The classic architectural style is equally relevant and will continue to be so.

In the next post, I will try to explain vertical slicing of classical applications to understand microservices-based architecture. In subsequent posts, I will uncover microservices in more detail and look at scenarios where they are possibly the best fit.

Sunday, April 16, 2017

Spring 4: Unit Testing Classic Controller

Spring Boot is a great gift for all Spring developers. The productivity boost is enormous. It also makes it extremely easy to unit test slices of your layered application; this feature is available from Spring Boot 1.4.x onwards. In this post, I am going to focus only on unit testing a classic (i.e., non-REST) Spring MVC controller.

Spring Boot test slice support for an MVC controller is enabled by the @WebMvcTest annotation - in this case, @WebMvcTest(EmployeeController.class). This tells the Spring MVC test framework that we only intend to test EmployeeController, so it will only make the web MVC components (controllers, interceptors, etc.) available, with mocking support. For more details on this annotation, please refer to its Javadoc.

The test simulates the request to copy an Employee. Mockito is used to create and inject the mock objects, and Spring takes care of wiring the mocks and dependencies for us. This makes it extremely easy to write unit tests for classic Spring MVC controllers. Note that, for quick feedback, unit tests must be cheap, i.e., quick to develop, set up, and execute. This goal is achieved very easily with this setup.
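The original code listing did not survive, so here is a minimal sketch of what such a slice test looks like in Spring Boot 1.4. The controller class, mocked service, and request URL are illustrative guesses, not the exact code from the project:

```java
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.servlet.WebMvcTest;
import org.springframework.boot.test.mock.mockito.MockBean;
import org.springframework.test.context.junit4.SpringRunner;
import org.springframework.test.web.servlet.MockMvc;
import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.get;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.status;

// Only the web layer around EmployeeController is loaded; the
// service layer is replaced by a Mockito mock.
@RunWith(SpringRunner.class)
@WebMvcTest(EmployeeController.class)
public class EmployeeControllerTest {

    @Autowired
    private MockMvc mockMvc;

    @MockBean
    private EmployeeService employeeService; // hypothetical collaborator

    @Test
    public void copyEmployee() throws Exception {
        // Simulate the copy request and assert on the HTTP response only.
        mockMvc.perform(get("/employee/copy/1"))
               .andExpect(status().isOk());
    }
}
```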

Beware

However, the Spring MVC controller test has one drawback. In this scenario, you want to test just the functionality of the controller method by simulating the request/response cycle, and this is done perfectly by the Spring MVC test framework. However, it also loads/generates the view. In other words, it takes the logical view name from the ModelAndView object and generates the HTML content. This can slow down unit test execution. It is also redundant, as you are not interested in testing the content of the HTML when testing a controller (that can be done with the Thymeleaf test framework). To circumvent this limitation, I have added a blank test Thymeleaf template. Given this limitation, it would be very nice if Spring provided a switch/configuration to turn off the view generation.

GitHub repo:

https://github.com/kdhrubo/playground