How an international energy and services company took control of 1,400 test environments
An international energy and services company’s IT team is responsible for delivering business innovation and keeping applications stable for more than 25 million customers. The team strives for agility and DevOps efficiency, but the enterprise IT portfolio has thousands of moving parts, which makes the environment extremely complex.
Software projects involve tightly coupled system architectures, geographically dispersed team members, customized third-party projects and strict regulatory requirements. Complex tests of new applications and projects require very specific environments. The team identified the management of these test environments as a key source of inefficiency. Every month they struggled to answer a basic question: “Does this test environment have sufficient resources to support the change plan?”
They have since developed a number of best practices for managing test environments. These practices allow them to bring more value to the market and also increase application quality. They were nominated for a Computing DevOps Excellence Award for their efforts in implementing DevOps transformations for their core system, field service applications, and websites serving millions of customers.
Foundation for shift-left testing
Given the importance and scale of their applications, they could not compromise on quality to speed up delivery. Faster releases had to be balanced with fewer production incidents, because defects and outages discouraged end-user adoption of new tools and services. When they analyzed their end-to-end testing processes, they discovered that they were finding defects late in the cycle, which was affecting application quality.
The QA team sought to address this problem by creating a foundation for shift-left testing. They identified two elements that were essential in the environment planning of each project.
- Define the test stages and criteria that will allow code to be passed onto the next environment.
- To ensure configuration accuracy, capture all requirements for test environments during project kickoff.
You need to track not only the version of each component and application, but also the data requirements, whether the test environment can be shared with other projects, and any third-party code that may be needed. Once captured, these requirements can be replicated at each stage of the pipeline to ensure that each environment is fit for its purpose.
This allowed them to reduce the complexity of triage scenarios as the code progressed into more complex test environments. That was especially important because QA timelines often got compressed towards the end of a project.
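The kickoff requirements described above can be pictured as a single record that travels with the project through the pipeline. The following is a minimal sketch, not the company's actual tooling: the class name, field names, and the `version_mismatches` helper are all hypothetical and only illustrate how captured requirements could be compared against what is deployed in an environment.

```python
from dataclasses import dataclass


@dataclass
class EnvironmentRequirements:
    """Requirements captured once at project kickoff (all field names are hypothetical)."""
    project: str
    component_versions: dict[str, str]      # component or application -> required version
    data_requirements: dict[str, int]       # record type -> number of records needed
    shareable: bool = False                 # may other projects share this environment?
    third_party_code: tuple[str, ...] = ()  # third-party code the tests depend on


def version_mismatches(req, deployed_versions):
    """Return components whose deployed version differs from the captured requirement.

    The result maps component name -> (required version, deployed version or None).
    """
    return {
        name: (wanted, deployed_versions.get(name))
        for name, wanted in req.component_versions.items()
        if deployed_versions.get(name) != wanted
    }
```

Running the same check at every stage is one way to confirm that an environment still matches what was agreed at kickoff before tests start.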
Multi-stage testing strategy
The team developed a multi-stage testing strategy that became the foundation of their delivery pipeline.
Stage 1: Functional Testing: These tests help to identify defects in new code early in the delivery process. They are performed in a localised environment and do not need to be coordinated with any other projects.
Stage 2: Regression Testing: The production baseline is updated with the new code. At this stage, test teams can share environments and test against one another's changes.
Stage 3: Performance Testing: After regression issues have been resolved, the new code and the whole system are stress tested for performance.
Stage 4: Final inspection
Stage 5: Live.
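A staged pipeline like this is often modelled as a sequence of gates, where a build moves forward only when the exit criteria of its current stage are met. The sketch below assumes one criterion per stage; the stage names come from the list above, but the criteria names and the `next_stage` function are hypothetical, since the source does not state the actual criteria.

```python
from enum import IntEnum


class Stage(IntEnum):
    FUNCTIONAL = 1
    REGRESSION = 2
    PERFORMANCE = 3
    FINAL_INSPECTION = 4
    LIVE = 5


# Hypothetical exit criteria per stage; the source names the stages but not their criteria.
EXIT_CRITERIA = {
    Stage.FUNCTIONAL: {"functional_tests_passed"},
    Stage.REGRESSION: {"regression_tests_passed"},
    Stage.PERFORMANCE: {"stress_tests_passed"},
    Stage.FINAL_INSPECTION: {"inspection_signed_off"},
}


def next_stage(current, met_criteria):
    """Promote a build one stage only if every exit criterion for its current stage is met."""
    if current is Stage.LIVE:
        return current  # nothing comes after Live
    missing = EXIT_CRITERIA[current] - set(met_criteria)
    if missing:
        raise ValueError(f"cannot leave {current.name}: missing {sorted(missing)}")
    return Stage(current + 1)
```

Raising an error on unmet criteria, rather than silently staying put, makes a blocked promotion visible to whoever requested it.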
Managing heterogeneous test environments
The core of their tightly coupled architecture is SAP. The non-core applications include remote services for field technicians, such as installing meters or inspecting home meters. There are also website apps for business and home customers.
SAP is the only app that has been moved to the cloud. Other apps are still on-prem, in VMs or on bare metal instances. The golden copy for SAP includes all code that has been released to production. Production changes are then fed back into all testing environments to ensure they match production. Their Environment Data Services group makes sure that test environments have enough database records. The SAP environment defaults to 5,000 accounts. If a project requires more or less data, this group creates the necessary volume and types of records.
With over 1,400 test environments and nearly 2,000 components, cost containment is a constant concern. The historical usage and consumption rates of SAP instances are used to decide when to spin them up and down. The team saves money by identifying projects that can efficiently share an environment and environments that can be spun down to avoid extra charges. For the cloud-based SAP environment, they can review previous usage reports, validate consumption against invoices from the cloud vendor, and then cross-charge the appropriate internal project.
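The validate-then-cross-charge step can be illustrated with a small calculation. This is a sketch under stated assumptions: usage is tracked in hours per project, the invoice reports total hours and a total amount, and costs are split in proportion to usage. The function names, the tolerance, and the proportional split are all hypothetical; the source does not describe the actual allocation rule.

```python
def reconcile(usage_hours_by_project, invoiced_hours, tolerance_hours=1.0):
    """Check that the hours in internal usage reports match the hours the vendor invoiced."""
    reported = sum(usage_hours_by_project.values())
    return abs(reported - invoiced_hours) <= tolerance_hours


def cross_charge(usage_hours_by_project, invoice_total):
    """Split an invoice across projects in proportion to their reported usage hours."""
    total_hours = sum(usage_hours_by_project.values())
    if total_hours <= 0:
        raise ValueError("no usage recorded for this invoice")
    return {
        project: round(invoice_total * hours / total_hours, 2)
        for project, hours in usage_hours_by_project.items()
    }
```

For example, with 300 hours reported for one project and 100 for another, a 1,000 invoice would be split 750 and 250. Because each share is rounded to two decimals, the shares may differ from the invoice total by a small rounding remainder in the general case.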
Streamlining bookings and change requests
Each month, the Environment Delivery Assurance team handles over 50 change and booking requests for test environments. Spreadsheets were not practical for scheduling, conflict resolution, or tracking complex configuration items and cross-project dependencies. The team consolidated the management and tracking of all environments so that fast-moving teams did not have to wait for an environment to become available or run tests on an incompatible configuration. The Environment Delivery Assurance team now has a centralised view of all available environments, the correct application sets, and the relevant configurations. This saves time and allows teams to concentrate on application testing rather than on fixing test environment problems.
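Conflict resolution for bookings comes down to detecting overlapping date ranges on the same environment, which is tedious in a spreadsheet and simple in code. The following is a minimal sketch, assuming each booking has inclusive start and end dates; the `Booking` class, its fields, and the environment names used in examples are hypothetical and do not describe the team's actual tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Booking:
    environment: str
    project: str
    start: date  # first day of the booking, inclusive
    end: date    # last day of the booking, inclusive


def conflicts(new, existing):
    """Return existing bookings of the same environment whose dates overlap the new one."""
    return [
        b for b in existing
        if b.environment == new.environment
        and b.start <= new.end
        and new.start <= b.end
    ]
```

This sketch treats every overlap as a conflict. A fuller version would let projects that are allowed to share an environment overlap without being flagged.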
Hardware lab for testing end-user journeys and…time travel
Having established solid practices, they turned their attention to the hardware labs used to test end-user journeys. Every gas and electricity customer has a government-regulated smart meter. Before any firmware update, customer usage patterns are tested. New firmware versions must be tested on real end devices, as they generate data for R&D.
The comprehensive test suites can simulate any customer scenario for end-user journey testing. Examples: Can the communication hub still receive the meter reading after the firmware has been updated? Can a new customer still be acquired online after the firmware is updated? To validate future user journeys, the team even created what they call “time travel” scenarios. Examples: If a business customer changes their contract, does billing still work correctly? Can a customer's departure still be processed on the new firmware? Given projected changes in system load, how long will a payment mode change take to be implemented?
The IT team also manages the smart meter equipment in the hardware lab: electricity meters, gas meters, in-home displays, communication hubs, and firmware. There are over 1,200 artifacts to be tracked, allocated, and scheduled for testing. Managers now have a centralised view of configuration metadata to ensure timely and accurate provisioning of all components. The hardware lab also has the same audit trail and change history as IT releases.
Faster delivery of quality code
Managing 1,400 test environments and a large end-user hardware lab was no easy task. Thanks to the right tools and processes, the environment management team is now a centre of excellence and a model of organisation and efficiency. They are confident in delivering correctly configured test environments on schedule, which results in better-quality code releases every day.