Software engineering for speed, quality and more innovation
Have you read Geoffrey Moore's 2015 book, “Zone to Win”, yet? If not, I really recommend reading it, whatever business you are working in. The book discusses the importance of catching the right waves and not being caught by surprise by one. It also covers the concept of zone management and how large enterprises can protect established parts of their business while still creating room for more disruptive innovation. Among the many smart concepts in this book, I really like the following saying:
“What makes modern business different? Simply put, speed plus disruption.”
We could probably write a whole set of new articles about those two words, speed and disruption. This blog post, however, is mainly about speed in software engineering, because if you achieve sufficient speed you can eventually create more room for innovation and disruption as well.
In my own organisation, we have at least 9 words that we constantly emphasise towards our great software engineering teams:
- Speed, speed and speed
- Quality, quality and quality
- Automate, automate and automate
Well, these are actually only 3 words, but I like to repeat them to my peers because they are so important for us. The first two, speed and quality, are vital for a short time to market and for delivering at the agreed time. The last word, automate, is essential because it is the only way to achieve speed and quality at the same time. If you automate, you can leave the more repetitive and “simple” tasks to the machines, so that software engineers can spend their time on more valuable tasks. Does it sound simple? Well, it is not. It takes some time to get there, and it certainly needs visions and roadmaps on the architecture, tooling and programming-technology level in addition to the functional roadmaps.
What did we achieve in terms of speed and quality over the past 4-5 years?
Within our own software products/solutions, we have achieved the following over the past years by focusing a lot on speed and quality:
- We have increased the release rate by ~40% within 5 years, with far fewer developers than we had some years ago
- We have reduced the average monthly number of customer-reported defects by 70%, partly by turning testers into “software test engineers” (who program the tests) and partly through automation
- We are planning to deliver bi-monthly releases in some of our main products from early 2018
- We are targeting a continuous delivery deployment model on our upcoming Software-as-a-service deliverables within the Sesam-portfolio
The planning, testing, release, deployment and feedback phases of software deliverables
If you look around the software community, there are different levels of maturity when it comes to speed. Some projects spend months or years producing a release, while others deploy new features or fixes several times a day. Perhaps you are using a movie streaming service like Netflix? Well, they are deploying at least a hundred times a day. That's quite impressive when you take into account that the service normally works perfectly.
Anyhow, a traditional software release cycle will look something like this:
We have two important metrics in the figure above: the cycle time and the lead time. Our users would normally measure us on lead time, that is, how long it takes from when they propose an idea (or log a defect) until it is implemented and deployed into the product/solution.
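To make the difference between the two metrics concrete, here is a minimal Python sketch. The `WorkItem` record and its field names are purely illustrative, and the sketch assumes the common convention that cycle time starts when the team begins work on an item; the post itself only spells out what lead time means.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class WorkItem:
    """Hypothetical record of one idea or defect; the field names are illustrative."""
    proposed: datetime  # the user proposes the idea or logs the defect
    started: datetime   # the team starts working on it (assumed start of cycle time)
    deployed: datetime  # the change is deployed into the product/solution


def lead_time(item: WorkItem) -> timedelta:
    """Time seen by the user: from proposal until deployment."""
    return item.deployed - item.proposed


def cycle_time(item: WorkItem) -> timedelta:
    """Time seen by the team: from start of work until deployment (assumption)."""
    return item.deployed - item.started


# Made-up dates, only to show the calculation.
item = WorkItem(
    proposed=datetime(2017, 1, 9),
    started=datetime(2017, 3, 6),
    deployed=datetime(2017, 4, 3),
)
print("lead time (days):", lead_time(item).days)
print("cycle time (days):", cycle_time(item).days)
```

In this made-up example most of the lead time passes before anyone starts working on the item, which is why shortening the planning phase matters as much as shortening development.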
Traditionally, we start with a vision and high-level release scope as input to the planning phase. Then we implement the software, run through some user acceptance testing before the release candidate is handed over to our Support & QA department. Once the release candidate has passed the User Acceptance Testing, we can deploy it to the market.
(This is the short version of a development cycle. In reality, it is far more complex).
However, to achieve speed and quality, you normally need to make some serious changes to this process. We are doing that at the moment, as it becomes even more important when you enter the world of Software-as-a-service or Platform-as-a-service, where time to market is crucial.
Speed in the vision, planning and feedback phases
Is it possible to “automate” parts of the vision and planning phase? Yes, partly. At least you can leave a lot more of it to the end-users and listen more directly to their voice. We are experimenting with software solutions such as Aha! or Uservoice, which both have good interfaces to our more detailed issue-tracking and planning systems. With such solutions, we can get prompt feedback directly from our user community. We can let users propose ideas or report issues, and then vote on what they want. This direct type of feedback might add a lot of value and efficiency to the existing product management and sales organisations most companies have. You might get a qualified and prioritised wish-list for the next version of the software solution, as well as immediate feedback on the previous release. And you will create an even more dynamic user community.
The whole idea is to streamline the thing we refer to as the planning-onion. You might have silos or organizational barriers between product management teams (taking care of visions and roadmaps) and developer teams (planning releases, iterations and daily activities). There are actually more levels here, because end-users both request new features and give feedback on existing features.
A concrete example of this is Microsoft's Uservoice-site for Word. You can go in and log whatever idea you have and then see whether the rest of the community responds to it by “voting”. The software vendor will normally not guarantee any specific features or timelines, but will read all suggestions and respond to every suggestion that goes above a certain threshold (for instance 20 votes).
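The voting rule can be sketched in a few lines of Python. This is only an illustration of the idea: the function name, the idea titles and the vote counts are made up, and it is not how Uservoice or any vendor actually implements it.

```python
def suggestions_needing_response(votes_by_idea, threshold=20):
    """Return (idea, votes) pairs that went above the threshold, most popular first."""
    above = [(idea, votes) for idea, votes in votes_by_idea.items() if votes > threshold]
    return sorted(above, key=lambda pair: pair[1], reverse=True)


# Hypothetical ideas and vote counts.
votes = {"Dark mode": 57, "Faster start-up": 20, "Export to PDF": 34}
for idea, count in suggestions_needing_response(votes):
    print(f"{idea}: {count} votes")
```

The sorted result doubles as the prioritised wish-list: the ideas at the top are the ones the community cares most about.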
Speed and quality in development and testing phases
As developers, we do not want to discover issues in the upcoming release too late. And as a user, it is annoying to get a new release where you experience crashes, user experience issues or performance issues. Thus, we have established a very advanced “build and test” infrastructure to deal with this. In software engineering terminology, we call these continuous integration (CI) and continuous delivery (CD) systems.
Each time a developer makes a change in the code-base, it automatically triggers a lot of actions in our CI/CD-systems. First of all, a mandatory “peer code-review” must be done before you are even allowed to commit changes to the main development branches of the software product/solution. A lot of static code-analysis is also done automatically to make sure we are not introducing any “code smells” or security/performance flaws into our software. The software is compiled and run through hundreds or thousands of unit tests. After this, the incremental version of the software is pushed to our regression test system, which in turn runs it through hundreds of regression tests. Each test creates a result set, which is compared with a set of baseline data that has already been verified against independent sources or 3rd party tools (measurements, Excel-sheets, Mathcad, Vis-Sim, FEA-tools or similar).
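The baseline comparison at the end of that pipeline could look something like the Python sketch below. It is a simplified illustration and not our actual regression test system: the function name, the quantity names, the numbers and the relative tolerance are all assumptions.

```python
import math


def compare_with_baseline(results, baseline, rel_tol=1e-6):
    """Return the names of quantities that are missing or deviate from the baseline."""
    failures = []
    for name, expected in baseline.items():
        actual = results.get(name)
        # A missing value or one outside the relative tolerance counts as a failure.
        if actual is None or not math.isclose(actual, expected, rel_tol=rel_tol):
            failures.append(name)
    return failures


# Hypothetical verified baseline and the result set from a new build.
baseline = {"max_stress": 212.5, "tip_deflection": 0.0184}
results = {"max_stress": 212.5000001, "tip_deflection": 0.0191}

failed = compare_with_baseline(results, baseline)
print("regression failures:", failed)
```

Comparing with a tolerance rather than for exact equality is a deliberate choice in this sketch, since floating-point results can differ slightly between builds and hardware without indicating a real defect.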
The metrics above are an example from just one out of 500 regression tests on one of our high-performance compute solutions in Sesam. They show that a test, in this case test “X9026”, has been run on 5 different (virtualized) hardware configurations (OSLWP….and so on). This is one of the biggest tests in this product, so the average test time is in the range of 45-90 seconds. However, we can see some “peaks” there, i.e. the runs that took around 2 minutes. If there is a worrying trend here, it will trigger alarms in the CI-system, but if the peaks are only one-offs, we would normally see them as the result of a highly stressed infrastructure at that particular time.
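One simple way to tell a trend from a one-off peak is to compare medians, since a median is hardly affected by a single slow run. The Python sketch below illustrates that idea only; the window size, the factor and the durations are assumptions, and it is not the rule our CI-system actually uses.

```python
from statistics import median


def has_worrying_trend(durations, window=5, factor=1.3):
    """True if the median of the last `window` runs exceeds the earlier median by `factor`."""
    if len(durations) < 2 * window:
        return False  # not enough history to judge
    earlier = median(durations[:-window])
    recent = median(durations[-window:])
    return recent > factor * earlier


# Hypothetical test durations in seconds, oldest run first.
one_off = [60, 58, 62, 61, 59, 60, 125, 61, 60, 62]      # a single 2-minute peak
drifting = [60, 58, 62, 61, 59, 95, 102, 110, 118, 125]  # steadily getting slower

print("one-off peak raises alarm:", has_worrying_trend(one_off))
print("drifting series raises alarm:", has_worrying_trend(drifting))
```

With these made-up numbers, the single peak is ignored while the steadily slowing series raises the alarm.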
Speed in deployment phase
Several years ago, we had a semi-manual “create setup and deploy to customer portal” solution. Today, we are working with both Windows desktop solutions and Software-as-a-service (solutions “born in the cloud”). For the latter, we have some advantages, since we as the software vendor would normally be in control of the cloud runtime environment (Azure, Amazon or similar). At the moment, our 3D Asset Viewer project is running on a number of developer, test, staging and production environments. These environments are updated on a daily/weekly basis, and we aim to move to continuous delivery once we go into full production in early 2018.
Below, you see an example of a 3D-viewer within Sesam-as-a-service. In this model, we are already applying the concept of feature-toggling. This is a technique in software development that attempts to provide an alternative to maintaining multiple source-code branches (known as feature branches), such that a feature can be tested even before it is completed and ready for full deployment to all customers.
Example: We have been asked by our user-community to implement a new feature in the 3D-model. That could, for instance, be to show something like marine growth, or possibly to have some kind of text annotation directly on the model. As usual, there will be a number of ways to provide a good user experience around this specific feature. So rather than spending too much time writing requirement specifications and documents, we would prefer to create an interactive mockup and gradually test it out in the production environment of the software solution. Let's imagine we have customer “A”, who is eager to get this new feature and is willing to test it, while customer “B” does not care that much about it. In such situations, we can feature-toggle it for customer “A” and make sure it works correctly prior to deploying it to the whole user community.
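The customer “A” and “B” scenario can be sketched as a tiny feature-toggle registry in Python. This is a minimal illustration under stated assumptions, not the toggling mechanism used in Sesam: the `FeatureToggles` class, its method names and the feature name `text_annotation` are all hypothetical.

```python
class FeatureToggles:
    """Hypothetical in-memory toggle registry; a real one would live in configuration."""

    def __init__(self):
        self._enabled_for = {}   # feature name -> set of customers testing it
        self._released = set()   # features switched on for everyone

    def enable_for(self, feature, customer):
        """Switch a feature on for a single customer only."""
        self._enabled_for.setdefault(feature, set()).add(customer)

    def release(self, feature):
        """Switch a feature on for the whole user community."""
        self._released.add(feature)

    def is_enabled(self, feature, customer):
        return feature in self._released or customer in self._enabled_for.get(feature, set())


toggles = FeatureToggles()
toggles.enable_for("text_annotation", "A")

# The same deployed code serves both customers; only the toggle differs.
for customer in ("A", "B"):
    print(customer, "sees text annotation:", toggles.is_enabled("text_annotation", customer))

# Once customer A has confirmed that it works, release it to everyone.
toggles.release("text_annotation")
```

The point of the design is that the unfinished feature ships in the same code-base as everything else, so there is no long-lived feature branch to merge later.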