1 Introduction
2 Background
2.1 Release Engineering
2.2 Organizational Context
2.3 Empirical Studies on Release Engineering
3 Method
3.1 Case Organizations
Theme | Case BigCorp | Case SmallOrg |
---|---|---|
Organizational context | Product development unit in a large mature company | Startup company |
Company size | > 20,000 | 50 |
Case organization size | 180 | 50 |
Distribution of the organization | Two sites in Europe, one site in Asia | One site in North America |
3.1.1 Case BigCorp
3.1.2 Case SmallOrg
3.2 Research Goal and Questions
- RQ1. What release engineering practices had the case organizations implemented?
- RQ2. What outcomes did the implemented release engineering practices have for the case organizations?
- RQ3. What were the reasons for differences in release engineering practices in the case organizations?
3.3 Case Selection Rationale
3.4 Data Collection
Theme | Topics |
---|---|
Background info | Interviewee role before the acquisition, place in the organization, responsibilities, relations to others in the organization |
Organizational differences | Differences between SmallOrg and BigCorp, explanations for the differences, consequences of the differences |
Software development process | Customer requests, planning and prioritization, quality assurance, release |
Continuous integration | Practices, test automation, tools, problems, future plans |
Organizational structure | Teams, responsibilities, division of labor, communication, decision making |
Product | Features, architecture, technologies |
Metrics | Release cycle, CI cycle, defect counts |
Other topics | Explanations for SmallOrg’s market success |
Interviewers | Interviewees and experience | Interviewee’s role (at the time of the interviews) | Length (min) |
---|---|---|---|
3 | 2 BigCorp managers on Site A (*) | Business improvement manager. 23 years at BigCorp in different R&D management positions. (For the second manager see the next row.) | 83 |
3 | BigCorp manager on Site A (*) | Head of business line transformation. Manager at BigCorp for 21 years. Led the agile transformation and agile coaching. Was involved in the assessment of the SmallOrg before the acquisition. | 110 |
2 | BigCorp manager on Site A | R&D leader. Assessed SmallOrg methods and tools in the acquisition. 22 years in various roles at BigCorp, e.g., testing manager, program manager and agile and lean coach. | 50 |
3 | BigCorp manager on Site C | Manager and coach. Was involved in the evaluation of SmallOrg before the acquisition and, after the acquisition, became part of the joint integration team in the areas of IT and HR. Started at BigCorp 12 years ago as a software engineer. Various positions in R&D and R&D management. | 74
1 | BigCorp manager on Site D | Business PO. Responsible for the studied business area at BigCorp. One year in the project. 21 years at BigCorp, starting as a developer. Various positions such as project management, program management, product customization and pre-sales. | 74
1 | BigCorp manager on Site D | Program manager. 2.5 years in the program management of the studied product. Joined BigCorp 17 years ago as a developer, then architect, line manager and program manager. | 57
2 | BigCorp architect on Site B | Head architect. Head technical assessor in the acquisition of SmallOrg. 21 years at BigCorp, 17 years in related products. | 84 |
2 | BigCorp architect on Site B | System architect. Was involved in the technical evaluation of SmallOrg’s product and has worked on the integration of the products since the acquisition. Joined BigCorp 11 years ago as a developer. Had already worked on the predecessor product. | 74
2 | BigCorp architect on Site B | Technical Product Owner. Has been involved in planning the integration of the two products and companies since the beginning of the acquisition, concentrating especially on continuous integration and continuous delivery aspects. | 107
1 | SmallOrg manager | Head of Product Strategy at BigCorp. CEO of SmallOrg before the acquisition and one of its founders. Over 20 years in this business area. | 81
1 | SmallOrg architect | Chief architect. One of the first employees at SmallOrg, started as a senior developer. 15 years in this business area. | 72
1 | SmallOrg domain expert | Head of systems engineering. One of the original founders of SmallOrg. Has worked in this business area for over 20 years. Technical background, started in hardware design. | 77
1 | SmallOrg domain expert | Systems engineer. Has worked in this business area for more than 10 years in several companies, first on the customer side and then in design. | 62 |
1 | SmallOrg development team member | Technical lead. Leads an integration team. | 79 |
1 | SmallOrg development team member | Technical lead and line manager. 9 years in software development, started 8 years ago in the predecessor of SmallOrg and 3 years ago in this product as a developer. | 84 |
1 | SmallOrg development team member | Developer. Has worked at SmallOrg since graduating from university, i.e., 1.5 years. | 51
1 | SmallOrg service team lead | Head of engineering services. 18 years of experience in this business area. Previously a field support engineer meeting customers. One of the founders of SmallOrg. | 67
1 | SmallOrg service team member | Customer support and services manager. 3 years at SmallOrg, first as a developer; moved two years ago to the customer support team to perform customer trials. | 62
Method | Description |
---|---|
Background interview | An interview with a person responsible for improving release engineering practices in BigCorp. The length of the interview was 102 min. The interview covered the following themes: case background information, software development methods used, testing strategy, use of DevOps, development tools used, and challenges and benefits of CD in the case organization. |
CI data collection | CI data was collected for the analysis of broken builds. |
Two group interviews | Two group interviews were performed with managers, testers and developers on one site of the case organization. The interviews lasted 104 and 160 min and both of them had 10 interviewees. The group interviews discussed the adoption problems of modern release engineering practices. |
Case | First data point | Time period | # of builds |
---|---|---|---|
BigCorp | Year before acquisition | 22 weeks | 1889 |
SmallOrg | Half a year before acquisition | 24 weeks | 760 |
3.5 Data Analysis
Category | Codes |
---|---|
Product context | Customer-specific customization required, customers traditionally do not want frequent releases, need to react fast to customer requests, some customers ready for frequent releases, complex product, early market, customers want to test the product in their environment |
Organizational context – Customers | Single customer, friendly customer, multiple customers, strict service level agreement |
Organizational context – Organizational structure | Co-location, collaborative organization, functional organization, large organization, multi-site, rigid organization, small organization |
Organizational context – Product development process | Flexible process, plan-driven process, risk-avoiding organization, high system quality requirements, daily prioritization, product portfolio, overtime work on weekends |
Organizational context – Resources | Amazon AWS, no production-like test environment, getting test environment takes minutes, getting production-like test environment takes months |
Release engineering practices – Code review | Code review gate, build in branches, no code review |
Release engineering practices – Continuous integration discipline | Fast build, build kept green, build in branches, build only changed components, failing build, slow build |
Release engineering practices – Internal verification scope | Low test automation, manual system tests, integration tests in CI, simulators with customer data, low verification, smoke tests, no definition of done, low performance verification, dedicated testers, definition of done, production-like test environment, unit tests |
Release engineering practices – Domain expert testing | Domain expert testing |
Release engineering practices – Testing with customers | Customer acceptance testing, closed-loop testing, open-loop testing, strict service level agreement, manual progressive deployments |
Release engineering outcomes – Feedback from customers | Fast reacting to customer needs, two week release capability, direct feedback from production, fast development cycle, no feedback from production, slow development cycle, slow reacting to customer needs, slow release capability |
Release engineering outcomes – Number of defect reports | Many defect reports |
Release engineering outcomes – Quality issues with new customers | Quality problems, product not ready for general availability, technical debt, elementary programming mistakes, scalability problems |
3.6 Mitigating Threats to Validity
- Construct validity: by using inductive data collection and analysis methods, we ensured that the constructs we used were the same as those used by the case organization members. That is, we did not investigate constructs external to the case organizations, but instead let the case organization members themselves decide which words and meanings they used when describing their release engineering practices.
- Internal validity: we used only evidence from the interviews that was logically consistent with other interviewees’ experiences. This mitigated the risk that the evidence would reflect only the opinion of a single person and not the actual situation. We also interviewed persons in different roles in the case organizations, so the results represent a holistic viewpoint. In addition, we collected quantitative data from the actual CI systems (see Table 5) to triangulate the qualitative evidence. Multiple researchers took part in the interviews and the analysis, so the consistency does not depend on a single researcher either. Finally, we validated our results with two case organization members after the analysis.
4 Results
4.1 Overview of the Development Processes
4.1.1 Case BigCorp
4.1.2 Case SmallOrg
4.2 Product Context
Due to the requirement ambiguity and production environment differences, the product has to be customized for each customer. Thus, the business is a combination of product and service business, where a general product is customized for each customer’s needs. Some customer-specific customizations can be productized later on to provide the customizations to other customers too. In addition, customers want to have a proof of concept (PoC) before buying the product. In a PoC, the product provider demonstrates the benefits of the product by running it in the customer’s environment. During PoCs, customers often have customization requests specific to their own needs, and being able to address those requests is a good way to assure the customer of the benefits of the product.
In Asia, the customer needs are different than in USA. [...] And then, Korea and Japan have their own quality and other requirements.—BigCorp Manager
In the end, [the business] is closer to service business than product business. You cannot make [the product] in a factory that works adequately well, but instead you have to go to the customer’s [environment], or at least to a test [environment], and do a little testing and fix things there.—BigCorp Manager
Usually you have to implement some things [...] that are gonna be [customer] dependent.— SmallOrg Dev Team
The product is installed on the customer premises, and thus each customer has its own instance of the product running. After buying the product, customers traditionally do not want frequent updates to the system, because a malfunctioning product can cause serious harm to the customer. A typical release cycle to customers, including new features and defect fixes, can be from six months to one year. The actual product release cycle can be faster, as it was in both BigCorp and SmallOrg, but individual customers are typically not willing to accept more frequent updates. This puts pressure on the release engineering practices, since there is no possibility to easily fix defects after a release.
When you do a proof of concept for a customer, then there will be things that the customer wants to have really quickly. Like, “if you can put this [feature] there, then I will buy it.”—BigCorp Manager
If a product gets a good reputation, that typically it does not cause any hiccups, then maybe three months could be the fastest release cycle. And even then you would need to have a very willing [customer]. Half a year is the most probable release cycle, in my opinion. Because it is not a trivial thing for the [customers], ever. [...] If you release an immature [product], you can get the whole [production environment] down suddenly, totally.—BigCorp Architect
However, since the customer environments differ a lot from each other, it is difficult or practically impossible to verify the release quality in a laboratory setting, and often some defects are only discovered in customer environments. Luckily, some customers are more risk-tolerant and allow monthly or even more frequent releases.
I know there has been some customer requirements or requests to say, we want a product [version] to be supported for six months, and only be installed once a year or something like that.—SmallOrg Dev Team
We did work with the [customer] [...] a lot to actually go ahead and, verify, what the impact of the changes will be in the [production environment]. That also uncovered [many other issues], because now you’re connecting to an actual [environment], the information in the actual [environment] can be different, and, data can be missing sometimes and so, the hardening of [the product] takes a lot of time.—SmallOrg Domain Expert
That’s kind of like the biggest source of our, defects and bugs. Is just running on new sets of data that we didn’t run before. That exposes new, assumptions in our system that were not correct.—SmallOrg Dev Team
4.3 Organizational Context
Theme | Case BigCorp | Case SmallOrg |
---|---|---|
Customers | Tens | One, Friendly |
Organizational structure | Large, Distributed, Functional | Small, Co-located, Collaborative |
Product development process | Long-term, Multiple stakeholders and products | Short-term, Few stakeholders, One product |
Resources | Sufficient workforce and Test equipment | Limited workforce and Test equipment |
4.3.1 Customers
If you have five, ten or fifty customers things will change completely. Every customer will do their own requests and they are typically different from each other.—BigCorp Architect
Second, different releases to different customers need to be tracked, in order to be able to provide customer support and defect fixes to different customers. In addition, updating to newer releases has to support many previous versions, because different customers might have many different versions running.
You have to have a global market view of what the most important things are. Which leads to unsatisfied customers because you cannot fulfill every customer’s priority requests.—BigCorp Architect
Finally, you have to be more careful with the quality of the product, because getting defect reports from many customers simultaneously would congest the development pipeline quickly, as different kinds of defects can be found by different customers.
When you think globally, you have to think how do you release to different customers, how do you track what each customer has, what patches have gone where.—BigCorp Architect
With many customers, you have to pay more attention to product quality. [...] If you have released the product to fifty customers and you find a defect, find a second and a third defect, then it will get harder to fix the defects in time.—BigCorp Architect
We already were building a product and [the SmallOrg customer] said oh, why don’t you put this and that, and we changed the whole product to match what they were asking for.—SmallOrg Service Team
[SmallOrg] set up this kind of, ways of working, that they have this weekly, project management meeting with [SmallOrg customer]. Over there basically they have access to a quite large [environment] and, kind of the daily issues with that [environment]. That is a powerful addition, for the development of the product. What we did with [BigCorp product], I think [...] we haven’t been that close, to the customer.—BigCorp Manager
SmallOrg could prioritize and release new content quickly compared to BigCorp, because there was only a single customer and production instance. In addition, SmallOrg had made a strict service-level agreement (SLA) with the customer so that if there were any defects in the released product, the defects would be fixed fast and the customer would still be satisfied. The SLA demanded that SmallOrg respond to critical requests within 5 min, although actually fixing the issues could take longer.
They had a very good relationship with [SmallOrg customer], where they could develop the software, in the [SmallOrg customer environment] so you could, take a software which is half-ready, to the [SmallOrg customer environment], test, improve, test, improve, and then deploy it, throughout.—BigCorp Manager
At [SmallOrg] they managed to do somehow that, the customer was quite satisfied and, the main reason for that was this really really short, feedback, cycle that, they had. So when the customer reported the issue, [...] they were, kind of, replying in the five-minute time frame. And then also the, correction package arrived [...] in a very short time.—BigCorp Manager
4.3.2 Organizational Structure
I think this is quite a typical structure what we have in BigCorp. We have a business line where the product development is. Then we have sales in another organization and delivery in a third organization. And there comes the challenge of how to work across the different functions that work with different logic, incentives and success criteria. How to get the whole to work together.—BigCorp Manager
From the viewpoint of the studied product, the functional organization was a major challenge, because the product differed from the other products of the BigCorp company. Being an early-market small software product, it had lower priority than other mature, large products. The prioritization prevented other organizations, such as sales, support and delivery, from adapting their ways of working to match how software could be developed efficiently from sales to delivery. In addition, as the product development was a separate organization with little interaction with the customers, it had limited possibility to gain fast feedback from the changes done to the product.
The power from a BigCorp point of view is to also see, we have a huge portfolio. And how do you really interplay across this portfolio in order to really get value to the customer.—BigCorp Manager
However, when you have a big team which is distributed to many sites, and you have many customers, so you’re not so, let’s say flexible. So that’s why, actually, I think, one reason why [BigCorp product], development was, became very slow, is that the [BigCorp product] team was growing too fast.—BigCorp Architect
But in case of [BigCorp product] [...] lead designers or even architects [are] on different sites. And if we need to agree on something, it takes much more time so, because of time zone difference and, anyway, even if we are in the same time zone, it’s a different thing when you discuss it over the phone or, just, in the same room.—BigCorp Architect
[SmallOrg] was still a small company it was a bit easier because of course they were kind of co-located in the same room. So if there were some, say, problems or some synchronization was needed, so they just talked to each other. [...] In my opinion this is a very big advantage, when the team is co-located.—BigCorp Architect
The software development teams did not have direct contact with the customers, but the distance was nevertheless short, because the domain experts and service teams, who were in contact with the customers, communicated directly with the developers. The domain experts and customer service team members could either contact the developers face-to-face in the SmallOrg office or use a chat application.
The fact that the [SmallOrg] was in one place and that, let’s say, the most, difficult parts, they were done by a few lead developers. [...] So, if you want to do some changes, or implement some urgent customer requests, it was quite easy to do because these three guys are, in the same room, and they can agree with each other how to do it quickly.—BigCorp Architect
In addition to the tight collaboration, the domain experts were responsible for the customer requests from specification to delivery. Thus, there were no hand-overs between different functions, which are typical in a larger functional organization and often result in delays and lower quality.
I: So the connection between the [proof of concept] and software engineering is very tight?
R: Super. Very, like direct communication.—SmallOrg Service Team
[SmallOrg] had domain experts and all the requirements to the developers came through the domain experts. And when a new feature goes to a customer, the experts are responsible for it end-to-end. Their responsibility does not end when the requirement is specified but only when it is used by a customer and the customer verifies that it is good.—BigCorp Manager
4.3.3 Product Development Process
Since there was a long organizational distance from sales and marketing to the development organization, it was important that the plans be realized completely, because otherwise there would be implications for the other organizations too. Failure to fulfill commitments would be punished with a bureaucratic process in which one would need to explain what went wrong. Therefore, the development organization tried to make only commitments they were totally sure they could deliver, and plans would not be changed after initiation.
The [BigCorp] system is much more rigid and slower. We say that we can put [a customer request] on the roadmap and it will come after 18 months.—BigCorp Manager
In practice, however, software development is highly complex and unpredictable work. Even when the development organization tried to minimize the risk of missing the planned targets, it became clear that there was too much work in the plans. The plan-driven process placed a burden on the product development organization, because the execution of the long-term plans was not feasible in practice. The burden was visible in development being late and in having to do overtime work.
In [BigCorp] the work is very Waterfall-like. You have to certainly commit to things and just those things have to be delivered. Even if only a little bit is not delivered, you will be punished enormously.—BigCorp Architect
During the development process, there were quality gates whose intent was to ensure that certain criteria were met and the release was proceeding as planned. In practice, however, those quality criteria did not prevent low-quality software from entering the last stage of the process, general availability. Still, the quality gates generated lots of bureaucracy for the development organization to manage.
For [BigCorp product], we have had to work over the weekends for a year now.—BigCorp Manager
We have lots of [quality gates], long lists. Now, if you think of [BigCorp product], we have a customer who uses a release which passed general availability a long time ago. The quality of that release is very low. This illustrates what our extremely heavy process can do.—BigCorp Manager
There were two downsides of the flexible process: the accumulation of technical debt and constant reprioritization. Because changes were made frequently, the technical debt kept building up. SmallOrg was aware of the issue, but having limited resources and a need to satisfy customers, it made a conscious decision to let the technical debt accumulate.
I think it’s the, start-up, let’s say, mode of working that they, try to do as fast as possible what the customer requests.—BigCorp Architect
But I think that, [SmallOrg was] investing more in feature development rather than, kind of, trying to reduce this technical debt, they were building up.—BigCorp Manager
Software developers were under constant interference due to reprioritization. It was good for the business to react quickly to customer requirements, but some developers thought that there could have been some protection measures that would have allowed better concentration and productivity for the developers.
[SmallOrg]’s focus has been, building a product that has usefulness. Sometimes.. we take a trade-off between a highly scalable product. Versus a product that you can do things, well. And do useful things.—SmallOrg Architect
It seems like we just have to switch what we’re working on a lot. [...] For example, the support team will just come over to developers and ask them things or email them directly. [...] The developers are being pulled in different directions a lot.—SmallOrg Dev Team
4.3.4 Resources
Despite the fact that there were lots of resources, the processes of the large organization were slow to provide, for example, new hardware for test environments. Thus, having resources did not necessarily mean that they could be used quickly.
As we are [BigCorp], we have to have certain system-level features, such as high availability, scalability etc. in place.—BigCorp Architect
It takes two to three months to get laboratory equipment with dedicated hardware.—BigCorp Manager
Despite the overall lack of resources, SmallOrg had invested in elastic cloud capacity from Amazon, which it used for testing purposes and for running its CI system. Given that there was a need for something in the organization, investments could be made rather easily if they would bring value to the product development.
[SmallOrg] was lacking the resources of, testing the software in a, close-to-customer environment.—BigCorp Manager
We looked into using Amazon Web Service to help us to easily, elastically scale the hardware that we need.—SmallOrg Architect
4.4 Release Engineering Practices
Practice | Case BigCorp | Case SmallOrg |
---|---|---|
Code review | No systematic code review | All code is reviewed through pull requests |
CI discipline | Slow CI, broken most of the time | Fast CI, unbroken most of the time |
Internal verification scope | Strict definition of done, dedicated testers, integration and non-functional testing in test labs | No definition of done, no dedicated testers, low unit test coverage, lack of integration and non-functional testing |
Domain expert testing | No domain expert testing in the development organization | Domain experts test every feature |
Testing with customers | Pilots with existing customers performed by other organization than development, low feedback from production use | Extensive testing in customer labs and production environment by domain experts and service team members, fast feedback to development |
4.4.1 Code Review
In practice, I think we do not do any code review. There has been much talk about it, but it is so laborious, so no.—BigCorp Manager
In [BigCorp product] for instance, there was, maybe another extreme that we have too many developers [...] and a lot of them were quite, fresh. And, I think that partly because of that we have quite a lot of, bad code, which was not reviewed, by more experienced guys.—BigCorp Manager
The code review process had multiple benefits for SmallOrg. First, all code changes were reviewed by experienced lead developers, which helped onboard new developers and prevented low-quality code from entering the product. Second, the changes were not integrated into the product until the code review was completed and the CI build passed for the feature branch. Thus, the product build was green almost all the time, as build-breaking changes would not be accepted for integration.
Pull request is to merge from the current branch to the main branch. At that time, the code reviewers, watch it and they comment and they disapprove if they find something wrong. Until it gets fixed they do not approve it, and there are some merge privileges only for a few people here.—SmallOrg Dev Team
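The merge gating described above (a completed review, a green branch build, and merge rights restricted to a few people) can be sketched as a simple predicate. This is an illustrative sketch only; the `PullRequest` fields and the `MERGERS` set are hypothetical, not SmallOrg's actual tooling.

```python
from dataclasses import dataclass

@dataclass
class PullRequest:
    approved: bool             # a lead developer approved the review
    branch_build_green: bool   # CI passed on the feature branch
    merger: str                # who is attempting the merge

# Hypothetical: only a few people hold merge privileges
MERGERS = {"lead_a", "lead_b", "lead_c"}

def can_merge(pr: PullRequest) -> bool:
    """A change reaches the main branch only when all three gates hold."""
    return pr.approved and pr.branch_build_green and pr.merger in MERGERS

print(can_merge(PullRequest(True, True, "lead_a")))   # all gates pass
print(can_merge(PullRequest(True, False, "lead_a")))  # branch build is red
```

Because the branch build must already be green before the merge, build-breaking changes never reach the main branch, which is why the product build stayed green almost all the time.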
4.4.2 Continuous Integration Discipline
In addition to failing, the build took multiple hours to execute (see Table 9). Thus, fixing a failing build was hard, because one had to wait multiple hours to see if the build would be functional again after the fixes. The slow build also delayed the feedback on the changes made by the developers.
Everybody says okay my module is working and I am green. [...] But [when] you put them together, it’s not green.—BigCorp Architect
Metric (mean) | Case BigCorp | Case SmallOrg |
---|---|---|
Green builds / week | 1.6 (min–max: 0–10) | 24.3 (min–max: 8–44) |
Build success rate | 1.9% (weekly min–max: 0–14%) | 76.7% (weekly min–max: 38–97%) |
CI execution time | 2.7 h (weekly min–max: 0.23–5.0 h) | 22 min (weekly min–max: 16–41 min) |
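The build metrics in the table can be derived from raw CI records along the following lines. The record format here is a hypothetical stand-in for the kind of data collected from the case organizations' CI systems.

```python
from collections import defaultdict

# Hypothetical CI records: (ISO week number, build passed?)
builds = [(1, True), (1, False), (1, True), (2, False), (2, True)]

green_per_week = defaultdict(int)
total_per_week = defaultdict(int)
for week, passed in builds:
    total_per_week[week] += 1
    if passed:
        green_per_week[week] += 1

weeks = sorted(total_per_week)
# Mean green builds per week (first metric in the table)
mean_green = sum(green_per_week[w] for w in weeks) / len(weeks)
# Overall build success rate (second metric in the table)
success_rate = sum(green_per_week.values()) / len(builds)

print(mean_green)    # 1.5 for this toy data
print(success_rate)  # 0.6 for this toy data
```

The weekly min–max figures reported in the table are simply the extremes of the per-week counts and rates over the observation period.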
Other reasons for the failing builds were fragile third-party components in the product architecture and the organizational distribution. The third-party components were claimed to make the build fail randomly. Distribution of the organization made communicating changes more difficult, and conflicting changes were made because of that.
In [BigCorp product] it was a problem with how it builds that if somebody breaks the, build and the whole team suffers.—BigCorp Architect
The CI was run for all the feature branches in addition to the main branch. Running the CI in branches allowed developers to see the CI results before integrating the branch into the product. Thus, build failures rarely affected other developers’ work.
[Note: the quotation describes the situation during the interviews, not before the acquisition, although the situation is similar based on the CI data.] We’ve actually been pretty good about keeping [the main branch] working. So we got like last 30 days.. 84% successful so it’s pretty good. I just made a change around there that reduced the duration of the build because it was getting up to like 29–30 min, so I made a change that got it back down to 17 or 18 min, which is nice.—SmallOrg Dev Team
Our [CI] system will build every single branch and run all of the checked-in tests. And deploy the system. Then it runs some very light integration tests with new data. Then you can see, if I merge this in, I’m not gonna break the system. Which would happen sometimes.—SmallOrg Dev Team
4.4.3 Internal Verification Scope
In [BigCorp product] I think our [definition of done] was quite strict. That you should [create] all the unit tests or the automated tests if you have [made UI changes].—BigCorp Architect
However, after certain production deployments, it became clear that even after putting a lot of resources into internal verification, there would be issues that would reveal themselves only in the production system. Thus, even a rigorous internal verification can be insufficient if the production environment cannot be fully simulated.
So in [BigCorp] we have these labs, which we call [environment] verification or system verification. But that requires quite serious.. upfront investment.—BigCorp Manager
We have a customer that is really taking the product into use. And it shows that we have slipped in the test system verification a lot. That we have not tried to test the system against the load that is present in a customer environment.—BigCorp Architect
I was quite amazed that how low [SmallOrg’s] test automation is. They are having, some automated tests but, by far not the level we are requiring at [BigCorp].—BigCorp Architect
I: How much do you rely on manual testing?
R: Fairly heavily.—SmallOrg Dev Team
Second, the CI included some light integration tests, but there was no integration with real production-like environments. The reason for this was that such environments are expensive and SmallOrg lacked the resources to acquire them. Nevertheless, SmallOrg had acquired test data from customer systems, so at least that data could be used for partially simulating the production environments. However, that data was used only in the manual release tests, not in the CI integration tests.
The smoke tests essentially just run most modules and verify that there’s no errors and, that does actually catch quite a bit. Surprisingly.—SmallOrg Dev Team
Yeah the integration tests so we.. start a virtual machine on AWS and then we install our software. And then we run tests on that software. So it’s.. you can test things like database access and things like that but there isn’t usually [production-like data].—SmallOrg Dev Team
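The smoke-test idea in the quotations, running most modules once and checking only that nothing errors out, can be illustrated with a minimal sketch. The module names here are stdlib stand-ins, not SmallOrg’s modules.

```python
# Hypothetical smoke test: load each module once and report the ones that
# fail outright, without asserting any detailed behavior.
import importlib

def smoke_test(module_names):
    """Return the names of modules that cannot even be imported."""
    failures = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception:
            failures.append(name)
    return failures

# Stdlib modules stand in for the product's modules; a green smoke run
# reports no failures.
MODULES = ["json", "csv", "math"]
```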
Third, there were no performance, high availability, or scalability tests. These would again require investments that SmallOrg did not have resources for. Instead, when performance issues were discovered during other verification activities, they were fixed reactively.
Since the earliest days, ever since we connected into the, first time into the [SmallOrg customer environment], we have a snapshot of actual configuration, actual data from the [SmallOrg customer environment].—SmallOrg Domain Expert
Fourth, SmallOrg did not have resources for dedicated testing personnel and relied on developers and domain experts for manual release testing that took a few days. Some parts of the manual testing were still semi-automated; they were executed automatically but required a human to verify the results. Finally, SmallOrg did not have any formal definition of done in place. Thus, developers created automated test cases when they considered it useful.
Then another big, problem in my opinion is that they did not perform at all the performance testing, performance and capacity testing. When you asked them for instance, how big [environment] the [SmallOrg product] [...] can handle. So they [don’t even know] the answer.—BigCorp Architect
[SmallOrg’s] definition of done, that I mentioned earlier that they were, kind of it was not so strict so, they were able to cut some corners, here and there. [...] So, you implement the code, the code is reviewed, it is done. And test automation I think, used to be more or less, voluntary. So if you want to do it, you do it. If not, you don’t do it, in [SmallOrg product].—BigCorp Architect
4.4.4 Domain Expert Testing
In [BigCorp product] we have a set of people who only do test. They don’t do development at all.—BigCorp Architect
A software engineer develops a prototype that they then hand off to the [domain expert who] will do their own validation, their own testing, [to] see if the expected functionality is there. [...] A lot of times [...] software developer [does not understand] fully the requirements or [...] what needs to be implemented.—SmallOrg Domain Expert
4.4.5 Testing with Customers
Without production-like test environments of its own, SmallOrg had to verify the integration in the customer’s laboratory together with the customer’s domain experts. Thus, integration testing was fully completed only after the product had passed internal verification and moved on to customer testing.
Another thing is, another way that we do testing, it’s the monitoring of the production server. So it’s, that becomes a field testing. Sometimes, because the product nature itself, it is very costly to build a test lab to mimic the production [environment]. Some of the test cases you really cannot cover in the test lab. What you can do, it’s monitoring production [environment] and trying to detect issues, as it goes. That’s another way to compromise, [complement] our testing effort.—SmallOrg Architect
Open-loop verification means running new functionality in a production environment: the functionality receives data from the production environment, but all the effects it would have on that environment are turned off. Thus, its behavior can be safely observed in a real environment. In addition, the service team always had a rollback plan in case something went wrong. After verifying that the product worked in open-loop mode, it was switched to closed-loop mode, where the product had an actual effect on the production system.
It was more, the integration at the beginning was the most challenging part. Once it’s integrated with one system, because it’s the same [integration point] everywhere it’s just a matter of tweaking a little pieces, when it’s a different software on a lease. [...] But once those interfaces were developed then it was very straightforward.—SmallOrg Service Team
[After internal verification we] run the module [in the production environment] first and open loop, then when they’re happy with the output, the recommendations from the module then to proceed with closed loop.—SmallOrg Domain Expert
The, open-loop, again, so, depends on the scenario. If it’s a new customer, we’re doing a [proof of concept], we always do open-loop and so we can review the changes with the [customer]. That gives them a better sense for what the tool is doing, helps to understand it, and also more confidence that, OK, it’s making the right types of decisions. And then, closed-loop obviously, as we get more, field experience with modules then, we’ll become a lot more comfortable [...] that there’s no, unexpected behavior or surprises.—SmallOrg Domain Expert
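The open-loop/closed-loop distinction can be sketched as follows. This is a minimal illustration with hypothetical names, not SmallOrg’s implementation: in open-loop mode the module consumes production data and only logs the actions it would take; in closed-loop mode those actions are actually applied.

```python
# Hypothetical open-/closed-loop runner: recommendations are always logged
# for review, but side effects are applied only in closed-loop mode.

def run_module(recommend, data, apply_action, closed_loop=False):
    """Run a module over production data; apply effects only in closed loop."""
    log = []
    for item in data:
        action = recommend(item)
        log.append(action)        # observable in both modes
        if closed_loop:
            apply_action(action)  # side effects only after open-loop review
    return log

# Open-loop run: recommendations are logged, nothing touches production.
applied = []
open_run = run_module(lambda x: x * 2, [1, 2, 3], applied.append,
                      closed_loop=False)
```

After reviewing the logged recommendations with the customer, the same module can be rerun with `closed_loop=True` to take effect on the production system.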
Progressive deployments meant that a release was not applied straight away to the customer’s entire production environment. Instead, the release was first verified in a small part of the environment, and after successful verification the release scope was progressively increased to cover the whole production environment.
We would [be] monitoring very very very closely any changes being made, making sure that, there’s no, issues being caused by, [...] errors in the software. [...] So that’s a process that can take some time to complete.—SmallOrg Domain Expert
Due to the friendly relationship and the SLA with the customer, SmallOrg could perform and monitor the releases directly. Thus, any problems encountered in the production system could be fixed immediately. However, the cost of this was the strict SLA (see Section 4.3.1).
We didn’t deploy to even all the servers at the same time. So, normally we would, have two servers, [one for different technologies], where we did the first, upgrade, and we would test it. Always we find some kind of issue that gets solved and a new build is made before we start deploying [to] the other servers. So it’s like a soaking period of two three days. Once it’s stable enough, now we go ahead and upgrade all of them. So, it’s pretty time consuming.—SmallOrg Service Team
And they had managed to develop, a product which, was allowed to be kind of tested, inside the [SmallOrg customer], environment. But the price for that was that the, level of service, they, kind of agreed with [SmallOrg customer] was a really crazy one. [...] If a critical, issue came up, related to [SmallOrg product], the, response time, from, [SmallOrg] was 5 min.—BigCorp Manager
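The progressive deployment with a soaking period described above can be sketched as a simple rollout loop. All names are hypothetical, and the pilot batch size and health check are assumptions; the interviews report two pilot servers and a soak of two to three days.

```python
# Hypothetical progressive rollout: upgrade a small pilot batch first, let it
# soak, and continue to the remaining servers only if the pilot stays healthy.

def progressive_deploy(servers, deploy, healthy, first_batch=2):
    """Deploy to a pilot batch, verify it, then roll out to the rest.

    Returns (upgraded_servers, completed): completed is False when the
    pilot batch fails and the rollout stops before the remaining servers.
    """
    upgraded = []
    for server in servers[:first_batch]:       # pilot servers first
        deploy(server)
        upgraded.append(server)
    if not all(healthy(s) for s in upgraded):  # soak-period check
        return upgraded, False                 # stop; fix, rebuild, retry
    for server in servers[first_batch:]:       # remaining servers
        deploy(server)
        upgraded.append(server)
    return upgraded, True
```

A failed health check halts the rollout after the pilot batch, which matches the interviewees’ account of finding an issue, producing a new build, and only then upgrading the rest.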
4.5 Outcomes of the Release Engineering Practices
Outcome | Case BigCorp | Case SmallOrg |
---|---|---|
Feedback from the customers | Little feedback from production use | Frequent feedback from production use |
Number of defect reports | Low number of defect reports | High number of defect reports |
Quality issues with new customers | Elementary programming mistakes revealed | Scalability and other issues |
4.5.1 Feedback from the Customers
Because we have a long distance to the customers, the developers do not have information on what modules the customers really use. The only way to deduce that is to see which modules generate defect reports from the customers.—BigCorp Manager
The customer support team, they’re in charge of testing the module making sure it runs properly, before doing closed-loop execution. Sometimes we’ll find problems there. We document all the issues and pass it to R&D.—SmallOrg Service Team
We basically manage the servers ourselves. So we manage the release. And Ithink in [SmallOrg customer], [...] we’re still the ones that do the deployment and monitor the system.—SmallOrg Dev Team
4.5.2 The Number of Defect Reports
[SmallOrg] gets over thirty defects per week from the customer. Which is a lot... We cannot deliver to fifty customers, with that kind of quality.—BigCorp Manager
4.5.3 Quality Issues with New Customers
And we have revealed, from the software, so many shocking things [...] Due to, maybe inexperienced developers, it is not understood that, when there is a small error, some temporary problem in customer environment, the software just crashes. [...] And this is the problem we have with [BigCorp customer] at the moment.—BigCorp Architect
It was quite a large [environment] and, basically the [SmallOrg product] was, not working as expected, and it was not scaling, as it was, promised to. Then, the, normal ways of, kind of normal ways of working at [SmallOrg] was that, they were, sending back this information to the R&D and the R&D was fixing it really fast, and sending it back to the customer.—BigCorp Architect
4.6 Synthesis
Arrow | Explanation |
---|---|
Multiple customers \(\rightarrow \) High quality standards | |
High quality standards \(\rightarrow \) Complex CI | High quality standards meant that the system had to be tested more rigorously in the CI system. See Section 4.4.2. |
High quality standards \(\rightarrow \) Slow verification | High quality standards also required additional testing after the CI pipeline. See Section 4.4.3. |
Complex CI \(\rightarrow \) Undisciplined CI | The complexity of the CI system made the CI practice more undisciplined. See Section 4.4.2. |
Large distributed organization \(\rightarrow \) Undisciplined CI | In addition, the distribution of the organization made the CI practice more undisciplined. See Section 4.4.2. |
Undisciplined CI \(\rightarrow \) Slow verification | |
Slow verification \(\rightarrow \) Slow release capability | Due to slow verification, BigCorp had slow release capability. See Fig. 4. |
Arrow | Explanation |
---|---|
Disciplined CI \(\rightarrow \) Fast internal verification | SmallOrg’s disciplined CI kept the software continuously in good condition and allowed fast internal verification. See Section 4.4.2. |
Fast internal verification \(\rightarrow \) Fast release capability | Fast internal verification allowed fast release capability. See Fig. 5. |
Fast release capability \(\rightarrow \) Feedback from production | Frequent releases enabled getting feedback from production use. See Section 4.5.1. |
Lack of internal verification \(\rightarrow \) Fast internal verification | Due to SmallOrg’s small internal verification scope, the internal verification was faster. See Section 4.4.3. |
Lack of internal verification \(\rightarrow \) More defect reports | Lack of internal verification caused more defect reports from customers. See Section 4.5.2. |
Lack of internal verification \(\rightarrow \) Customer testing | Due to the lack of internal verification, more testing had to be done with customers. See Section 4.4.5. |
Customer testing enables lack of internal verification | Being able to test with customers reduced the need for internal verification. See Section 4.4.5. |
Customer testing \(\rightarrow \) Feedback from production | Through customer testing, SmallOrg gained feedback from the production use. See Section 4.4.5. |
Friendly customer \(\rightarrow \) Customer testing | |
Lack of resources \(\rightarrow \) Lack of internal verification | |
Lack of resources \(\rightarrow \) Lack of test automation | |
Lack of test automation \(\rightarrow \) Lack of internal verification | The small amount of test automation meant that less testing was performed internally at SmallOrg. See Section 4.4.3. |
Lack of test automation mitigated by code review | SmallOrg’s functional code review practice mitigated the lack of test automation, because errors were caught by more experienced developers. See Section 4.4.1. |
Arrow | Explanation |
---|---|
Number of production environments \(+\rightarrow \) Internal quality standards | |
Number of customers \(+\rightarrow \) Internal quality standards | |
Control over production \(-\rightarrow \) Internal quality standards | |
Available resources \(+\rightarrow \) Internal quality standards | |
Internal quality standards \(-\rightarrow \) Defect reports | |
Internal quality standards \(+\rightarrow \) CI complexity | |
Internal quality standards \(-\rightarrow \) Release capability | |
CI complexity \(-\rightarrow \) CI discipline | BigCorp had more complex CI, which made it more difficult to keep CI usage disciplined. SmallOrg had a simple CI system and also good CI discipline. See Section 4.4.2. |
Organization size and distribution \(-\rightarrow \) CI discipline | |
CI discipline \(+\rightarrow \) Release capability |
5 Discussion
5.1 What Release Engineering Practices Had the Case Organizations Implemented?
5.2 What Outcomes Did the Implemented Release Engineering Practices Have for the Case Organizations?
5.3 What Were the Reasons for Differences in Release Engineering Practices in the Case Organizations?
5.4 Implications for Practitioners
Case BigCorp | Case SmallOrg | |
---|---|---|
Pros | Internal verification, capable of serving multiple customers | Customer verification, release velocity, capable of serving a single customer well |
Cons | Release velocity, new customer satisfaction | Internal verification |
5.5 Threats to Validity
6 Conclusions
-
How do organizational size and distribution affect continuous integration discipline?
-
How does continuous integration discipline affect release capability?
-
How could internal quality standards be measured?