CDN for API Orchestration.


A Content Delivery Network (CDN) is a globally deployed, distributed network of servers that serves content with acceleration, high performance, and high availability. Traditionally, CDNs have mostly been used to serve static assets such as images, videos, and JS/CSS. With the recent trend toward API-based services, CDNs are increasingly being adopted to serve API responses as well.

An API content delivery network (API CDN) is a variant of a CDN that focuses specifically on API caching and related benefits. It maintains no business logic; it only serves copies of content for requests it has seen before and has previously pulled from the origin. Every request offloaded to an API CDN is one less request hitting the application servers. Serving requests from an API CDN is usually cheaper, in terms of bandwidth cost, than serving the same bandwidth from your own servers. An API CDN can also reduce latency by caching content in data centers across the country or world and delivering responses from the node closest to the user.

While an API CDN is similar to a regular CDN, the key differentiator is that it provides critical API-specific features (see below) in addition to static content delivery.

API CDN Features.

  • SSL support, including the ability to use your own SSL certificate.
  • Geographical distribution.
  • Token authentication and management.
  • DDoS mitigation. An API CDN should be able to recover from attacks without significant service degradation.
  • Web Application Firewall. An API CDN provides support for rules that automatically mitigate XSS or injection attacks.
  • Support for SPDY and/or HTTP/2.
  • API for URL invalidation and other cache management functionality.
  • SLAs and a support process.
  • Easy to get up and running.
  • Support for both fixed and per-request/bandwidth pricing.

There are plenty of commercial players in the CDN market, but only a few specialize in serving APIs over a CDN.

Fastly

Fastly specializes in API caching. It serves a significant number of large-scale, tech-savvy clients including GitHub, Twitter, Yelp, Etsy, and more. It supports SSL, geographical distribution, token authentication, and easy maintenance, and it also supports SAN certificates. From a pricing perspective it seems very competitive, and you can be online instantly with a trial account.

The only thing missing from Fastly is support for a Web application firewall.

Mashery/Intel

One of the original players in the API CDN space. It is a very mature product with a large number of enterprise clients, and it supports most of the features listed above, including a Web application firewall. One downside of Mashery compared to Fastly is its unclear pricing structure and slow setup.

Akamai

The biggest CDN vendor in the world. Akamai is known primarily for hosting static web traffic and does not have a dedicated API solution. While it may be possible to make Akamai work as an API CDN, a deep analysis for this evaluation would be time-consuming compared to the other platforms under consideration, given Akamai's limited functionality in this area.

DIY Varnish

The self-hosted option. Varnish is the most popular open source cache server. It would be possible to deploy Varnish instances across availability zones. This option will likely cost a significant amount in terms of additional DevOps hours and hosting and will not be as full-featured as fully-managed products.

Cost Estimation.

Consider a site with 500,000,000 requests and a 15,000 GB bandwidth requirement for a given month. Since pricing is generally usage-based, we can estimate the price from this average usage.

Based on this estimated traffic, Fastly would cost about $1,975/month (pricing based on $0.0075 per 10,000 requests, $0.12 per GB, and $100/month for certificate maintenance). Mashery pricing is only available over the phone, but based on past records it would easily cost more than $5,000/month.

Using the AWS pricing calculator, we can estimate the usage and bandwidth cost of serving the site uncached from application servers. To serve 15,000 GB of outbound traffic in a month, we would need roughly 10 large servers (in terms of CPU, RAM, etc.), at a cost of around $2,500/month to Amazon. If we are able to cache most API requests, we can reduce this to 4 medium servers and 2,000 GB, at a monthly cost of $750, a reduction of $1,750.
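Spelled out, the saving above is just the difference between the two server bills; the figures below are the illustrative estimates from this post, not vendor quotes:

# Illustrative monthly figures from the estimate above, not vendor quotes
uncached_servers=2500   # 10 large app servers serving ~15,000 GB/month
cached_servers=750      # 4 medium servers serving ~2,000 GB/month
echo "Monthly infrastructure saving: $((uncached_servers - cached_servers)) USD"   # prints 1750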

This infrastructure saving of $1,750 largely offsets the cost of the Fastly API CDN, while gaining a number of critical features at the same time.

Summary

An API CDN provides significant features, including speed, DDoS mitigation, and token authentication, at a very modest price. Most of that cost can be covered by the bandwidth saved on application servers. There are a couple of players in the API CDN space with a proven track record with large enterprise clients, who are upfront about capabilities, service levels, and pricing, and who have excellent documentation.

Docker Cache Insights.


Over time I have gained various insights into making better use of Docker's build cache in a couple of scenarios. I have tried to detail a few of them here, which might help engineering teams be more productive and efficient.

Docker lets you package an application so that you can "build once and run anywhere". During development and continuous deployment, you end up dealing with dozens of Docker builds. To enable faster turnaround, Docker caches the individual steps of the build process so that subsequent builds are almost instantaneous. We all know Docker has this built-in caching; here are some insights into using it well.

Dealing with source control operations.

To get a copy of the source code into a Docker container, developers typically use git commands directly, or wrap them in a script, inside Docker's 'RUN' step.

# Clone the git repo directly
RUN git clone git@github.com:<account>/<repo>.git
# Or: git repo operations wrapped in a script
RUN /bin/sh /git-clone.sh

Docker clones the repository the first time, then caches the RUN command so it completes instantaneously on subsequent builds. Great! But this causes an issue: if the repository is updated after the step is cached and Docker builds a new container, it uses the previously cached output, resulting in a stale codebase. The expectation here is that the output of the RUN command should be refreshed on every new container build. As it stands, unless the RUN command itself changes (and thus invalidates Docker's on-host cache), Docker will reuse the previous result from the cache. In this case, once 'RUN git clone ...' has executed once, subsequent builds use the same copy of the code and will not execute it again to fetch the latest changes.

One quick way around this is to disable the cache by building with the '--no-cache' flag. But that invalidates all the build steps, not just the offending RUN command, so every step executes again and the purpose of the cache is completely defeated. A better way to deal with this is to generate a unique RUN command each time, ensuring that step runs on every build:

  1. We can wrap the Docker build in another script that generates a uniquely named mini script for the clone operation. The wrapper inserts the invocation of that script into a Dockerfile generated on the fly just prior to build time, so that the RUN statement for the operation that must run every time, the clone, is indeed unique.

    i.e. RUN /bin/sh /git-clone-123abc.sh

    where ‘-123abc’ is uniquely generated and appended for each build (a subsequent build would produce something like ‘git-clone-124abc.sh’), and the script contains the git clone operation.

  2. Place the source control operation in the last RUN listed in the Dockerfile. This guarantees that Docker runs the clone during each build, keeps the process fully automated, and ensures the cache is used right up to that last, unique RUN. A sketch combining both ideas follows below.
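This minimal sketch assumes a Dockerfile.template that contains every cache-friendly step except the clone; the file names, image tag, and repository URL are illustrative placeholders only.

#!/bin/bash
# Sketch of a build wrapper: generate a uniquely named clone script and append
# its ADD/RUN pair as the very last steps of a Dockerfile generated on the fly.

BUILD_ID=$(date +%s)
CLONE_SCRIPT="git-clone-${BUILD_ID}.sh"

# 1. Uniquely named mini script holding the git clone operation
cat > "${CLONE_SCRIPT}" <<'EOF'
#!/bin/sh
git clone git@github.com:<account>/<repo>.git /src
EOF

# 2. Generated Dockerfile: all cached steps first, the unique clone step last
cp Dockerfile.template Dockerfile
cat >> Dockerfile <<EOF
ADD ${CLONE_SCRIPT} /
RUN /bin/sh /${CLONE_SCRIPT}
EOF

docker build -t myapp .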

Appropriate usage of the ‘ADD’ step in a Dockerfile.

The ‘ADD’ step is one of the most commonly used in Dockerfiles. But if it is not used properly, it becomes a common cause of cache busting during builds.

Let me give an example using the Docker build steps of a Node.js MEAN stack application:

# 1. Add the application folder. It has package.json
        ADD . /src
# 2. Go to the package.json directory and build the app using npm package manager.
        RUN cd /src && npm install

The first step adds the current folder to the container's /src folder. The second step is very time-consuming, as it installs dependencies. But since package.json doesn't change often, this step should be very fast on subsequent builds.

But here is the glitch. In the first step, when Docker decides whether to use the cache, it compares the folder (the '.') against the previously built folder, and if any file in the current folder has changed in the meantime, the cache gets busted and the second step, 'npm install', executes even though package.json didn't change.

This behavior is often overlooked. It can be avoided by reordering the steps:

# 1. Add package.json
    ADD package.json /src/package.json
# 2. Run the build with package.json
    RUN cd /src && npm install
# 3. Add the application folder.
    ADD . /src

In this case, even if files in the application folder change, the first two steps are unaffected unless package.json itself changes. Docker caches the first two steps, and the cache is only busted from the third command onwards.

mtime (modified time) timestamp

Docker determines whether or not to use the cached version of a file by comparing several attributes of the older and newer versions, including mtime. For example, consider this MEAN stack build step:

# 1. Add package.json
    ADD package.json /src/package.json

After the first docker build, Docker uses the cached copy of 'package.json' on subsequent builds, as long as the build is run on the same project from the same directory several times in a row.

But most of the time, engineering teams use continuous delivery environments like Jenkins. There, each build trigger starts with a fresh clone of the application's repository. When this happens, the mtime of the new copy of 'package.json' differs from the previously cached 'package.json', and this is a problem for Docker. Since Docker uses the file's mtime in its comparison, and the mtime of package.json changes on every single clone, the cached version can never be used even though the file content never changed.

Fortunately, there is a solution! As part of the build process, after cloning the Git repository, the mtime of 'package.json' needs to be reset to the time the file was last changed within Git. This means that on subsequent clones, if the file has not been modified according to Git, the mtime will be consistent and Docker can use the cached version. The sample script below is one way to deal with the mtime timestamp.

#!/bin/bash
# 1. Get the git revision for package.json
    REV=$(git rev-list -n 1 HEAD 'package.json');
# 2. Get the timestamp of the last commit
    STAMP=$(git show --pretty=format:%ai --abbrev-commit "$REV" | head -n 1);
# 3. Update the file with the last commit timestamp
    touch -d "$STAMP" package.json;

Step 3 ensures the mtime stays the same across git clones until an updated version of package.json is checked into source control.
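Saved as a small helper script (the name fix-mtime.sh below is purely illustrative), it can be wired into the CI job right after the clone and before the build:

# In the CI job, right after cloning the repository and before building
./fix-mtime.sh           # resets package.json's mtime to its last Git commit time
docker build -t myapp .  # 'myapp' is an illustrative image tag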

This solution is just a workaround. There are proposed enhancements to Docker's caching strategy to consider other attributes such as sha1sum or md5sum along with the timestamp: https://github.com/docker/docker/issues/4351. If this feature gets added, the workaround mentioned here will no longer be needed.

Avoid installing unnecessary packages

Every Dockerfile starts with system installation instructions like:

RUN apt-get update && apt-get install wget -yy
RUN wget -q -O - http://pkg.jenkins-ci.org/debian/jenkins-ci.org.key | apt-key add -
RUN echo "deb http://pkg.jenkins-ci.org/debian binary/" >> /etc/apt/sources.list
RUN apt-get update && apt-get install -y -qq --no-install-recommends git jenkins curl telnet unzip openssh-client && apt-get clean

Throughout development, more and more packages get added. At the same time, developers generally don't try to refactor this portion of the Dockerfile, considering it a core part of the Docker build where removing packages might cause unnecessary issues down the line. In my experience working with a big team, this section becomes a hub of unnecessary packages.

The solution is simple: continuously refactor the setup steps and avoid installing unnecessary packages. Breaking lengthy steps into multiple lines and sorting the arguments also helps get rid of duplicates.

RUN apt-get update && apt-get install -y -qq --no-install-recommends \
    curl \
    git \
    jenkins \
    openssh-client \
    telnet \
    unzip \
    && apt-get clean

Measurement of Individual Velocity in Agile teams.


Adopting the Agile methodology is a hard, painful process, and many organizations pave their way with their own customized flavor. I am always curious to hear stories from professionals in various organizations about how they made Agile work as a work culture.

Of course, Agile or Scrum is just a specification, an abstract one, and teams adopt it according to their understanding, circumstances, clients, and nature of work. Along the way, you may encounter cases where a team overrides a couple of very core and important Agile principles and falls back on the old waterfall mindset.

Measuring individual velocity is one of them. It may not be new to many of us, but the question is whether individual velocity should be considered at all in Agile burn-down charts.

On one project, we received more and more requests to add individual velocity measurement to the Jira tracker. Senior management wanted to see this information and use it for iteration planning. As suggested, we tried it for a week, but then the question arose: is it really necessary to measure individual velocity? It was not clear initially, but there are some serious problems with individual velocity that work against the Agile philosophy. What caveats does such a calculation have?

Individual velocity is a measure of how much effort a person can complete in a defined time frame. In Agile, each task is estimated within an iteration in hours or points. For example, if Tom burns more sprint points across several tasks within a week and Jerry burns fewer in the same week, we would say Tom had the better velocity in the last iteration. Does that mean Tom is a better and faster developer? Answering yes is not straightforward at all. There are hundreds of reasons why Jerry completed less: as part of collective ownership he helped other developers with their tasks and mentored them, one of his tasks was underestimated due to an unexpected performance problem with a third-party component, or he was feeling low for a day or two and made almost no progress. We can bring many more reasons to the table. Performance in the last iteration says nothing and is the worst thing to count. Then another idea comes up: what about average velocity? Will average velocity help us make a correct iteration plan? If we sum up the individual velocities of all developers, will it help us create a better plan? No, since we already have the iteration velocity metric and the sum will be exactly the same. Why should we care about individual velocity in that case? To make better assignments for each person? Is that helpful? Maybe, a bit.

Individual velocity measurement puts the focus in the wrong place: on individual performance. Agile recommends focusing on team performance; individual performance is not what matters. If a person knows his velocity is being measured, he probably will not help other team members as much; he will focus on his own performance as an individual developer. The worst thing a company can do is bind bonuses to individual performance. That nips teams in the bud. Individual velocity measurement forces work assignment, while agile teams are all about work commitments. In the agile methodology, any velocity is a team velocity, by definition.

The right approach is to measure team performance; that metric is simple, helpful, and more than enough for iteration planning. Individual velocity will only create unhealthy competition and backstabbing, and introduce a cat-and-mouse race with a direct impact on quality. It can lead to individuals trying to ensure they look good at the expense of the team's overall goal, which is to deliver to the customer.

The team should commit to completing as many user stories as they feel can be completed during the next iteration. Through every iteration planning meeting, the team learns its own strengths and weaknesses as they work together sprint after sprint, and finds a pace that supports more robust commitments overall. The team soon gets to know who is pulling their weight and who is not through the daily meetings, and then they can do something about it. If you measure individual velocity, you tend to assign stories based on the numbers in hand rather than team commitments, and team members will fight each other to point out who is smart and who isn't: "Hey, are you kidding? You did only x points in the last sprint, how are you committing to more?" Individual velocity can demotivate people, and many managers who have it in hand will use it incorrectly. It is very common to revert to the muscle memory of waterfall days and make assignments instead of commitments.

Then, "how about recognizing talent for rewards?" Management, and even individual employees, expect some factor by which to measure performance. If we only measure team velocity, how do we get visibility into the productivity of each engineer on the team? If team velocity is the only basis for productivity, you are flying blind when it comes to understanding whether each engineer is being productive and pulling their weight. This is a big question where I see most of us confused. I don't know the right answer, but in my experience it should be addressed by measuring each developer's contribution to the team's productivity, automation, competence, and speed. Each person will vary here, and so will the measurement.

You have to care for each team member to have a great team. What if the team's wins come from one great individual and ten players who are just OK, and then you lose that one great player? All of a sudden you have a losing team. So the team should be careful about that, and complete transparency is the best practice in my experience.

In the end, I think individual velocity is anti-Agile: it harms team commitment and collaboration and breaks the rule of collective ownership. Rather than individual velocity, I would recommend that leadership measure each individual's contribution to the team's productivity, automation, competence, and speed.

Setting up Jenkins Master/Slave with Kubernetes and Docker.

Kubernetes is an open source system for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications. I had a requirement to set up a Docker-based Jenkins continuous integration environment that could scale on the fly using Kubernetes container services. After digging through the open source solutions available, I decided to come up with a draft solution of my own. It works pretty well against the initial assumptions and requirements.

Architecture


The architecture workflow mainly consists of three major components:

  1. Jenkins Cluster Manager

  2. Jenkins Kubernetes Cluster

  3. Private Docker Registry

1. Jenkins Cluster Manager

The Cluster Manager is a processor instance running the Python Flask and Celery frameworks. It manages the Jenkins cluster through the Kubernetes API. It is responsible for communicating with the Jenkins Master and acting on incoming instructions to ramp slave nodes up and down.

It manages the cluster by resizing the Jenkins slave replication controller and, if needed, ramping minion nodes up and down to accommodate increases in Jenkins job load. It returns the newly created slave instances (minions) to the Jenkins Master so the Master can ssh into the slaves and execute jobs.

2. Jenkins Kubernetes Cluster

Kubernetes is an open source implementation of container cluster management. The Jenkins Kubernetes cluster consists of one master node and a group of multiple, expendable minion nodes. One minion hosts the Jenkins Master instance, and the remaining minions host Jenkins Slave instances. The Jenkins slave nodes in the Kubernetes cluster are elastic and can easily be ramped up and down through the Kubernetes API via the replication controller.

The Jenkins Master by itself is a basic installation of Jenkins, and in this configuration the master handles all tasks for the build system. It controls the jobs available to process, and if the number of jobs in the queue grows beyond a specified limit, it connects over sshd and starts a slave agent. It also decides whether to ramp slave agent nodes up or down on the fly. To make this happen, it sets up two-way communication with the Jenkins Cluster Manager, an outside processor that manages cluster behavior by resizing replication controllers and ramping minion nodes up and down.

Replication Controllers

There are two main replication controllers, which manage the Jenkins Master and Jenkins Slave pods respectively.

  1. Jenkins Master replication controller :

    The Master replication controller manages the execution of the Jenkins Master Docker image in the cluster. It configures one minion with the Jenkins Master instance. Throughout the lifecycle, it works with a size of one most of the time.

  2. Jenkins Slave replication controller:

    The Slave replication controller manages the execution of the Jenkins Slave Docker image in the cluster. It configures one to many minions with slave instances. Throughout the lifecycle, it resizes the number of running slave instances as instructed by the Jenkins cluster manager (Stormy), as sketched below.
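    For illustration, the resize itself is a single call against the Kubernetes API. With the kubectl CLI (the local setup notes below still use the older kubecfg.sh, so the exact command depends on the Kubernetes version) and a hypothetical controller named jenkins-slave, it would look roughly like this:

    # Ramp the slave pool up to 5 pods when the build queue grows (controller name is hypothetical)
    kubectl scale rc jenkins-slave --replicas=5
    # Ramp back down to a single pod when the queue drains
    kubectl scale rc jenkins-slave --replicas=1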

3. Private Docker Registry

The private Docker registry is a GCE instance that manages Docker images as a repository. It is exposed only to the internal nodes of the Jenkins cluster. All nodes can pull and push Docker images.

Setting up a Kubernetes cluster locally

Instructions for setting up Kubernetes with GCE on your local machine.

Step-by-step guide

  • Download and install boot2docker https://docs.docker.com/installation/mac/
  • Clone the Kubernetes repo:
  • git clone https://github.com/GoogleCloudPlatform/kubernetes.git
  • Run cluster/kube-up.sh from the kubernetes directory. This will fail since the cluster already exists, but will create the necessary local files.
  • Copy keys from kubernetes-master to home folder:
  • ssh into kubernetes-master to confirm that you can connect

    gcloud compute ssh --zone us-central1-b kubernetes-master

  • Go to /usr/share/nginx and execute chmod +r * if this is your first time setting this up
  • Copy all certificates from the master to your host. Run these commands from the host:
      gcloud compute copy-files kubernetes-master:/usr/share/nginx/ca.crt ~ --zone <zone>
      gcloud compute copy-files kubernetes-master:/usr/share/nginx/kubecfg.crt ~ --zone <zone>
      gcloud compute copy-files kubernetes-master:/usr/share/nginx/kubecfg.key ~ --zone <zone>
    
  • Rename the keys:

    mv ca.crt .kubernetes.ca.crt
    mv kubecfg.crt .kubecfg.crt
    mv kubecfg.key .kubecfg.key
    
  • Run cluster/kubecfg.sh list pods to confirm that everything is connected

Build & Deployment.

All the instructions for build and deployment have been added to the README in the git repository. Feel free to check it out and update it where necessary; I am sure plenty of things are still rough.

Improvements:

This project certainly has a lot of room for refactoring and rearchitecture. A couple of things I feel would be good candidates:

  • Develop Stormy as a Jenkins plugin: currently Stormy runs on a separate node and manages the Kubernetes master and slaves. I think Stormy could be redesigned as a Jenkins plugin that can be installed on any Jenkins Master server. Stormy would then only control slaves through the Kubernetes REST API, based on the interim status of the Jenkins Master.

The 10 most inspirational personalities of 2014.

The world is full of hardworking and aspirational people doing wonderful and extraordinary things. I feel fortunate to have known them, read about them, and learned a lot from them in the year 2014. It's great to use my blog to highlight a few of them. I'd like to shine a special spotlight on the 10 influential people (or groups of people) on my chart who inspired me in 2014.

Mark Zuckerberg


He rejected Yahoo's $1 billion offer in 2006, when Facebook was nothing. This man gets most of the credit for driving Facebook's successful journey, with the company worth $200 billion by September 2014. Holding the strong belief that "if a lot of people have access to more information and are more connected, it will make the world better," Facebook not only helped people connect socially but helped drive one of the biggest Egyptian revolutions of modern history and changed the political equations of the Indian elections. He inspired me with his attitude of building something for the long term (anything else is a distraction) and finding the things you are super passionate about.

Jeff Bezos


Jeff Bezos is the founder of Amazon, which is not only one of the greatest tech stories of the 1990s but also one of the few companies of its era still run by its founder. He watched Amazon's share price fall from $100 to $6 in front of his eyes, yet remained calm and stayed positive. That is one of the qualities that makes me feel great about Bezos. He casts every challenge as an opportunity, and he is one of those personalities who takes every decision on data, not judgment or instinct. He also has the ability to handle many minute details while keeping the longer-term goals in mind.

Amazon is immensely valuable today. It has successfully entered domains no one ever expected. Nobody else reinvests almost every cent of profit into growth the way Bezos still does.

Jack Ma


The more I read about Jack Ma, the more interesting I find his personality. Alibaba, the e-commerce business he started in an apartment in 1999, has now surpassed Walmart in market value. I am impressed by his vision of using technology to alter existing standards and the user experience, which has a social, economic, and political impact on society beyond just an e-commerce platform. The 21st century belongs to Asia, and, of course, such visionary leaders will have an even greater impact through their technical arm.

Tony Fadell


Working in a design shop, I may be a little biased toward him, but regardless, I am impressed with Tony Fadell. Working at HUGE, one of our goals is to combine the strengths of design and technology for the better, and when I think about Tony Fadell, a passionate, brilliant engineer, I have to consider him a role model. Combining design, engineering, and entrepreneurship is one of the toughest tasks there is. Tony did it twice: as the father of the iPod, and with the smart thermostat at Nest Labs, recently sold to Google.

Edward Snowden


Freedom is an individual right, and the world can be more creative and beautiful if every individual truly puts this founding principle into practice. For the sake of this basic right, Edward Snowden, a computer genius, chose to do what is right rather than what would enrich him, and chose to do what is right rather than what is lawful.

At the cost of his own life, career, and future, he gave us a window of opportunity in which to make an informed, self-determined choice about the global system of surveillance.

Malala Yousafzai


Malala's drive, passion, and perseverance in standing up for education and girls' rights in Pakistan and around the world are extremely moving. She is a true human rights champion and a worthy recipient of this year's Nobel Peace Prize.

The Ebola Fighters


"Not the glittering weapon fights the fight, says the proverb, but rather the hero’s heart."

These brave hearts traveled from wealthy and safe countries like Germany, the USA, and Spain to slum areas in Liberia, Guinea, and Sierra Leone. They took risks and persisted, some sacrificing their lives, and saved millions who could have been the next victims had these fighters not acted hard, fast, and selflessly. The world is safer from Ebola now, and their sacrifice counts more than anything else.

Narendra Modi


India, the world's largest democracy, has two dozen national parties and more than a hundred regional parties. Dealing with coalitions, corruption, spineless decisions, caste politics, and widespread lack of education, most Indians believed their country had lost its way as its growth rate almost halved while inflation soared. Now India has Modi, the person who is reversing every one of those traits with his charismatic, intense, utterly decisive personality. Formerly head of Gujarat, one of India's fastest-growing states, he was sworn in as Prime Minister of India on 26 May 2014.

His leadership skills and his use of the right technology for the right task are unmatched in the world's political circles.

Shinzo Abe


It has rightly been said that if Japan wanted to tell the world it was going to stage an economic comeback, Abenomics would be the way to tell it. Shinzo Abe is one of the strong political figures I will remember from this year: the designer of a bold set of strategies to defeat deflation, ignite consumer spending, and restore economic dynamism. And all of this happened within just 20 months. His strong commitment to the big task, and his ability to make things work within political circles, is awesome. I am amazed how a single person can forge strong relations within Asia to counter China. If you look at his Twitter handle and the people he follows, you will understand the vision to restore the pride and strength of Japan. Hats off.

Angela Merkel


Angela Merkel is a leader who really paved the way for Germany's transformation out of the dark shadow of its history. Her leadership around the 2006 World Cup, which Germany hosted, helped the country embrace a young, hungry, and decidedly "un-German" team, and her firm support for her convictions meant Germans ultimately accepted the new approach. After the World Cup, during the financial crisis that began in 2007, she once again had to convince skeptical Germans that change was needed to rescue Europe's economy. The quality of her leadership, firm, measured, and agreeable, helped return Germany to a place of respect in the global arena.

Jenkins Plugins Worth a Read.


The Jenkins CI tool is backed by a very strong, active open source community that has developed hundreds of handy and useful plugins. While going through existing pipelines in a couple of teams, I have seen developers try to reinvent the wheel in many cases, adding all sorts of fancy, half-tested logic to accomplish their tasks, and I was one of them. Looking back, there were tons of handy plugins that could have solved the use case with minimal pain and promoted reusability across projects.

So I thought it would be good to note down the plugins I have found very handy in our day-to-day work. I understand things change rapidly in this domain, so this post may be stale in the very near future.

Build Flow Plugin

Managing a pipeline in Jenkins requires a moderate amount of configuration across all the jobs participating in the flow. The configuration can combine different types, including parameterized builds, parallel builds, joins, or downstream waits. As more jobs are added, further updates get quite complicated. The build process ends up scattered across all those jobs and becomes complex to maintain. Technically, individual jobs are meant to carry the responsibility for their own unique task, but with this kind of pipeline configuration they also carry the join and connection details, which pollutes their purpose. In this scenario, the Build Flow Plugin comes to the rescue. It is designed to handle complex build workflows (aka build pipelines) as a dedicated entity in Jenkins. Build Flow enables you to define an upper-level Flow item to manage the job orchestration and linking rules, using a dedicated DSL. This DSL makes the flow definition very concise and readable. The main jobs don't know anything about the pipeline, as everything is externalized to the flow.

The only thing missing from this plugin is good support for graphical visualization on monitors. There is a Build Graph View Plugin, but it doesn't deliver what one would expect. If a graph view doesn't matter much to you, this can be the best plugin to try out; otherwise, go with the regular, scattered configuration.

Build Monitor Plugin

Visibility is a core aspect of the Jenkins CI tool. Since CI is part of collective ownership, everyone should know when something breaks or when some part is consistently not doing well. To make that a reality, projecting the status on a monitor visible to all is a welcome move. The Build Monitor Plugin provides a highly visible view of the status of selected Jenkins jobs. It easily accommodates different screen sizes and is ideal as an Extreme Feedback Device displayed on a screen on your office wall.

Build Name Setter Plugin

Every build comes with a build number, but most of the time it is hard to identify a particular build by build number alone. Many builds are triggered by a source code update with a particular revision number, so having the revision number as a unique identifier is a great help. To visualize the actual revision numbers within Jenkins, the Build Name Setter Plugin can be used. This makes it easier to identify builds by revision number instead of by build number.

Build Pipeline Plugin

As said earlier, visibility is core to a Jenkins pipeline delivering its value. This plugin renders the upstream and downstream connected jobs that typically form a build pipeline very nicely on a monitor, especially big monitors. In addition, it offers the ability to define manual triggers for jobs that require intervention prior to execution, e.g. an approval process outside of Jenkins. This helps other decision-making teams (non-developers such as QA, BAs, or another authority) continue the pipeline.

Build Timeout Plugin

The Build Timeout Plugin is simple but necessary: it aborts a build that runs longer than a configured timeout, so a hung build does not occupy executors indefinitely.

Clone Workspace SCM Plug-in

This plugin makes it possible to archive the workspace from builds of one project and reuse it as the SCM source for another project. This is useful for avoiding space issues and for skipping a redundant checkout of the same source code where it is not absolutely needed, resulting in faster execution and feedback.

Cobertura Plugin

This plugin integrates Cobertura coverage reports into Jenkins. If the testing framework in your build job produces coverage reports in the Cobertura format, this is a handy plugin to present the progress in nice charts and a detailed format.

Cucumber Test Result Plugin

This plugin allows you to show the results of Cucumber tests within Jenkins. It really helped the team accommodate the BDD development style. The Cucumber plugin formats the raw reports into nice, visible charts that can easily be understood by non-technical folks such as BAs and product owners.

Delivery Pipeline Plugin

This plugin visualizes delivery pipelines (jobs with upstream/downstream dependencies). It is somewhat similar to the Build Pipeline Plugin but has a leaner design. It is a favorite plugin across the teams at my company.

Email Extension Plugin

This plugin is a replacement for Jenkins's email publisher. It is a more advanced plugin with better features for sending build status messages. Make sure you set up a granular and effective way of notifying the responsible people rather than the whole team.

Git Plugin

This plugin integrates Git with Jenkins. It also adds the ability to notify builds about Git codebase updates using a REST API. This is a very useful feature: Git hooks can notify builds immediately rather than waiting for the Git polling interval, as in the example below.
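For example, a post-receive hook on the Git server can simply hit the plugin's notifyCommit endpoint with curl; the Jenkins URL and repository URL below are placeholders, and as far as I recall the affected jobs need that repository configured with SCM polling enabled for the notification to trigger them.

#!/bin/sh
# post-receive hook: tell Jenkins this repository has new commits (URLs are placeholders)
curl -fsS "http://jenkins.example.com/git/notifyCommit?url=git@github.com:<account>/<repo>.git"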

Heavy Job Plugin

During concurrent execution of jobs, the Heavy Job Plugin can be used to allocate all available executors on a node in order to ensure exclusive access to the local repositories.

HipChat Plugin

This plugin is a HipChat notifier that can publish build status to HipChat rooms. It adds more visibility and awareness of build status within the team, resulting in a greater sense of collective ownership.

JaCoCo Plugin

This plugin allows you to capture code coverage reports from JaCoCo; Jenkins generates the coverage trend report. The plugin is a fork of the [Emma Plugin]; a big part of the code structure comes from it, but it has been completely refactored. It also includes functionality similar to the [Emma Coverage Column], which lets you include a column in dashboards that displays the latest overall coverage numbers and links to the coverage report.

Join plugin

If you are setting up parallel jobs in a pipeline, you may end up in a scenario commonly known as a 'diamond'-shaped project dependency: a single parent job starts several downstream jobs, and once those jobs are finished, a single aggregation job runs. This plugin allows a job to run after all the immediate downstream jobs have completed. In this way, the execution can branch out and perform many steps in parallel, then run a final aggregation step just once after all the parallel work is finished. More complex interactions are not possible with this plugin.

NodeLabel Parameter Plugin

The NodeLabel Parameter Plugin can be used to assign cleanup jobs to specific nodes. It adds two new parameter types to the job configuration, node and label, which allow you to dynamically select the node where a job/project should be executed.

Parameterized Trigger plugin

This plugin lets you trigger new builds when your build has completed, with various ways of specifying parameters for the new build. You can add multiple configurations: each has a list of projects to trigger, a condition for when to trigger them (based on the result of the current build), and a parameters section.

Performance plugin

This is a very handy plugin for performance testing. It integrates JMeter reports, JUnit reports, work output, and Iago reports into Jenkins.

Priority Sorter Plugin

In scenarios where multiple pipelines execute in parallel, build steps of different pipelines often must not execute in a random order. For example, deploying one pipeline's artifacts to a live server must not be followed by deploying another pipeline's artifacts before a smoke test has run against the already deployed artifacts. This can be guaranteed by assigning a higher priority to the smoke test job using the Priority Sorter Plugin.

Rebuilder Plugin

Sometimes a step in the pipeline fails because of a technical error unrelated to the associated revision, such as a Jenkins restart or an out-of-memory issue. To trigger a rebuild of the failed downstream job, the pipeline parameters (revision number and build number) would need to be specified manually, which is awkward. Here the Rebuilder Plugin comes in handy: it facilitates rebuilding a job with the same parameters as the failed build.

Shared Workspace

This plugin allows Jenkins jobs with the same SCM repos to share workspaces. It saves some disk space and repetitive steps if you have different jobs with identical repos. The importance of this plugin is not well understood by developers, who tend to consider disk space cheap, but following standard practices and promoting reusability can pay off in the future.

SSH Agent Plugin

This plugin allows you to provide SSH credentials to builds via an ssh-agent in Jenkins.

SSH plugin

You can use the SSH Plugin to run shell commands on a remote machine via ssh.

SSH Slaves plugin

This plugin allows you to manage slaves running on Unix machines over SSH. It adds a new type of slave launch method, which opens an SSH connection to the specified host as the specified username, checks for a suitable version of Java, copies the latest slave.jar via SFTP, and starts the slave process.

Subversion Plug-in

This plugin adds the Subversion support (via SVNKit) to Jenkins.

Thinbackup plugin

This plugin simply backs up the global and job-specific configurations (not the archive or the workspace). One of its main features is automated backups. This is far better than the Backup Plugin.

Throttle Concurrent Builds Plugin

The Throttle Concurrent Builds Plugin can be used to define throttle categories and restrict concurrent execution of jobs by assigning them to the same throttle category.

Wall Display Plugin

A wall display that shows job build progress in a way suitable for public wall displays. Rendering is performed using Ajax based on REST API calls, so it requires no page refreshes. It's one of the plugins you should give a try.

Workspace Cleanup Plugin

One of the commonly recommended tasks for a Jenkins job is to clean up after itself once the build is finished, to keep the next build isolated. Most of the time, I have seen teams use 'rm -rf *' somewhere in a script, which works but doesn't cover the bad scenarios. This plugin is meant to save us from exactly that chore: it deletes the project workspace after a build is finished.

xUnit Plugin

This plugin makes it possible to record xUnit test reports.

Jenkins Best Practices.


Setting up a Jenkins pipeline is one of the key tasks in adopting continuous delivery properly. While going through this phase, we are picking up a lot of new learnings across different teams' development cycles, technologies, platform needs, and client expectations. This is a good place to document best practices for setting up Jenkins jobs.

Practices

  • Break jobs down to a granular level: a single Jenkins job performing multiple tasks is not ideal. Jenkins is just a build tool, and it is not smart enough to know which step failed. The essence of creating a pipeline is breaking a single build process into smaller steps, each with its own responsibility. In this way, faster and more specific feedback can be returned.
  • The most reliable builds are the ones that build clean and build fully from source code control.
  • All Jenkins builds must follow the packaging principle: build once, deploy anywhere.
  • If you are running your own instance of Jenkins, make sure it is secured with user credentials.
  • Use plugins appropriately. There are tons of plugins available for free in the Jenkins marketplace. Quite useful ones are:
    • Delivery pipeline plugin
    • Join plugin
    • Mailer
    • Parameterized Trigger Plugin
    • Clone Workspace SCM Plug-in
    • Build Monitor Plugin

Rollback

If anything goes wrong in a critical deployment, the first and foremost thing that comes to mind is rollback. Despite its importance, the rollback strategy is still the most undervalued concept in Jenkins job and pipeline setup.

  • Revision number: every Jenkins pipeline should use a pipeline revision number across its execution.
  • Define a pipeline that is strictly associated with a single revision within the version control system.
  • The code base should always be tagged, labeled, or baselined after a successful build (see the sketch after this list).
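One simple way to do the tagging is a post-build shell step on the successful build. BUILD_NUMBER is the standard variable Jenkins exposes to every build; the tag name format here is only an example.

# Tag the exact revision that produced a successful build (tag name format is an example)
git tag -a "build-${BUILD_NUMBER}" -m "Successful Jenkins build ${BUILD_NUMBER}"
git push origin "build-${BUILD_NUMBER}"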

Configuration

  • Always bootstrap the Jenkins workspace from scratch with a fresh working copy prior to running the build goal/target.
  • Always configure testing jobs to run automated tests and generate trend reports.
  • Use public key authentication: by setting up the public key mechanism, one system can log in to another without ever typing a password. This is a real timesaver for Jenkins (a minimal sketch follows after this list).
  • Use labels: sometimes it is good to have diversity in the build cluster, and one way to manage that diversity is by applying labels to particular nodes. The right use of labels helps the team, in the long run, to identify what is unique about each node and use it for the right purpose.
  • Parallel execution is a great way to see results and feedback quickly. Every pipeline should identify the jobs that can be executed in parallel.
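A minimal sketch of the public key setup between the Jenkins master and a slave account (the key path, user, and host names are placeholders):

# On the Jenkins master: generate a key pair once (no passphrase, for unattended use)
ssh-keygen -t rsa -b 4096 -f ~/.ssh/jenkins_ci -N ""
# Install the public key on each slave account (user and host are placeholders)
ssh-copy-id -i ~/.ssh/jenkins_ci.pub jenkins@build-slave-01
# Verify password-less login works
ssh -i ~/.ssh/jenkins_ci jenkins@build-slave-01 "echo connected"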

Maintenance

  • Every job should include a relevant script to clean up after completion, to maintain cleanliness.
  • Jenkins should also include maintenance jobs, such as cleanup operations, to avoid full-disk problems.
  • Always say no to build record sprawl: discard old builds using the corresponding configuration option.
  • The team should periodically archive unused jobs and eventually remove them.
  • Almost every pipeline eventually outgrows the ability to run builds on just one machine. Take advantage of distributed builds; in larger systems, make sure all jobs run on slaves.

Notifications

  • Set up email notifications mapped to the developers on the project, so that everyone on the team has a pulse on the project's current status.
  • Use of the HipChat plugin and a logging system is recommended.
  • Take steps to ensure failures are reported as soon as possible.