Walmart Global Technology Center
Transforming enterprise software from a legacy on-premise monolithic architecture to a modern cloud-based microservice architecture is a challenging and fulfilling journey. While our learnings on this journey are derived in a retail supply chain context, the concepts are usable across domains. To explore the best practices, we categorize them into three Ps: Product & Technology, People, and Process.
Product and Technology
- Required Usable Replacement vs Minimum Viable Product (MVP): The MVP approach builds the product incrementally, starting from a minimal set of new features for first use. When replacing an existing system, however, the “Required Usable” replacement strategy starts from the existing features and identifies the minimal set needed to replace the current system.
- Platform thinking: Platform thinking is required to allow extensibility across use cases and reusable components while adhering to the domain-based context of microservices. It is important to allow extensions within a microservice to logically support related use cases (a geographical extension or a logic extension) while keeping the base functionality common. In addition, filing patents as teams build out the new platform protects the intellectual property and unique capabilities while giving the team the pride of being inventors.
- What is “better”: Adoption of any product depends on how each stakeholder persona measures success. For an engineer it may be newer technology; for a user, a better user experience or greater efficiency; for a decision maker, the financials; and for some stakeholders, agility.
- Cloud cost: The sunk and fixed costs of on-prem hardware and its associated maintenance are replaced with the usage-based variable cost of the cloud. Governance, tooling, and rigor around cost elements, be it CPU (Central Processing Unit), memory, machine type, or vendor diversity, are key as systems scale.
- Multi-tenancy and impact radius: A cloud microservice-based architecture is typically multi-tenant. Using diverse types of tenants (e.g., by size) together with a canary mode of deployment or distribution of cloud instances is useful. This allows control of the impact radius if problems occur.
- Component Coupling: Building several microservices on top of one shared database can cause strong static coupling. Similarly, appropriate decisions on async vs sync communication, the use of orchestrators, and consistency become key to managing the dynamic coupling between components.
- Microservice proliferation and architecture fitness function: As engineers transition off the monolith to microservices and the cloud, there is often a tendency to make smaller and smaller bounded contexts, which leads to a significantly higher number of microservices than is ideal. In addition, Conway’s law may come into play: the product design starts to resemble the organization’s structure. Thus, it is important to define architecture fitness functions and to build in governance and validations that ensure adherence to those fitness functions and to an appropriate number of microservices.
- Test data and automation: It is important to ensure that the monolith’s production test data is suitably recreated for the new distributed world. In addition, “in sprint” automation, built as part of development, is necessary for rapid development.
People
- Balancing “project” vs “product” vs “platform” mindset: As the monolith moves to microservices and the cloud, the team should embrace a mindset of building an extensible platform. The team may get pulled into a project mindset, which, while important for transactional outcomes, can lead to short-term decisions. Hence a mindset of longevity, platform, and product, balanced with project success, is ideal.
- Builders, Users, Enablers, and decision makers: Every product has multiple roles: builders who create the product, enablers who might sell, resell, or market it, users who use it, and, most important, the decision makers. When you replace a product, it is extremely important to pay attention to the “decision maker + user” category and to understand the motivation for use of each persona.
- Celebrating small wins and reiterating the why: The journey of a monolith to a microservice may last months and several versions of the product. It is important to celebrate incremental wins. Also, since teams keep changing it is important to reiterate the “why” periodically.
- ONE TEAM mindset: Products have stakeholders who build, use, sell, and support them. Naming the team, bringing leaders together, and ensuring appropriate ownership all help bring in a One Team mindset. A powerful reinforcement is to embed it in the name of the program, for example by including “One” in the product or program name. Framing every challenge in an “our” context rather than a functional one helps.
- Upskilling and reskilling: Monoliths and microservices have different technology stacks. Upskilling, reskilling, and hiring experts with prior expertise on the newer tech stack is key for fungibility.
Process
- Structured planning with common backlog: The monolith will support the existing business until it is replaced, while the microservices go through rapid and active development. During this phase of co-existence, having a common prioritized backlog with one unified team is key. This leads to transparent decision making and re-prioritization while continuing to deliver on business goals.
- Change management with incremental deliveries: Microservices delivered early on bring in awareness of change (“I understand why.”). Usage will lead to desire for the change (“I have decided to…”). The early adopters become advocates of the microservice and impart knowledge (“I know how to…”). By the time it comes to replacing the monolith in its entirety, most of the business users can use the microservices (“I am able to …”). Continued large scale deployment of the microservices and retiring of the monolith will lead to reinforcement of the change (“I will continue to…”).
- Left Shifting Quality, Demos and User Acceptance Testing: As with any agile development, incremental demos to ensure alignment and User Acceptance Testing are key. These let the product reach users in smaller increments and gather feedback while building alignment among users. Additionally, unit test automation, integration tests, and a nightly run of end-to-end flows are needed. Because microservices come up independently, automation is needed both at the microservice level and across defined flows.
- Managing external dependencies: Any large system interacts with other in-house and third-party systems. These may be on-premise, cloud, or hybrid, and in some cases may be legacy systems that depend on the monolith. An inventory of dependent systems and subsystems, down the pipeline to the eventual end system, is needed. In addition, clear contracts and parallel development support interface evolution.
- Replace one persona and microservice at a time: Replacement using the incremental approach of the Strangler pattern is useful. The new product is rolled out to a subset of personas and microservices in phases. Integrations between the current and new systems need to be built, in addition to cut-over and fall-back mechanisms. To encourage users to adopt the new product offering, it is also important to make newer innovations available only in the new microservices and to slow down and then stop innovation on the current product. This creates an incentive to move to the newer product.
- Learning from Behaviour and Automating: First enable and empower users to make decisions, based on insights, about actions that may initially be manual; this lets users express their preferences. The system can then learn from these choices and use the persona behaviour to automate them.
- Measure and Continually Improve: Defining and measuring metrics is critical to continually improving and evolving the product and platform. Metrics could come from the persona perspective, such as Net Promoter Score, productivity, or click metrics. Business metrics such as bookings, revenue, churn, Customer Lifetime Value, or other metrics suited to the domain also help. One should not forget technology metrics such as error rates, retry rates, lags, and cloud cost, along with microservice and platform stability counts such as defects or customer service requests. In addition, suitable alerting and monitoring help teams rapidly respond to and prevent problems.
- Tuning Ways of Working: Microservices deploy faster than monoliths, so ways of working will need to be tuned during the period of co-existence. For example, monolith infrastructure provisioning has longer SLAs (Service Level Agreements) than microservice provisioning. Hence awareness, understanding, and tuning of the ways of working are needed.
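To make the incremental replacement idea more concrete, here is a minimal Python sketch of how Strangler-style routing, a canary group of tenants, and fall-back to the monolith might fit together. It is an illustration under stated assumptions, not a description of any actual Walmart system: the class `StranglerRouter`, the handler functions, and the capability and tenant names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set

# A handler takes a request payload and returns a response payload.
Handler = Callable[[dict], dict]


@dataclass
class StranglerRouter:
    """Routes each request to the legacy monolith or to a new microservice."""

    legacy: Handler
    # Capability name -> new microservice handler (names are hypothetical).
    migrated: Dict[str, Handler] = field(default_factory=dict)
    # Tenants allowed onto the new path; keeping this set small limits
    # the impact radius if the new service misbehaves.
    canary_tenants: Set[str] = field(default_factory=set)

    def handle(self, capability: str, tenant: str, request: dict) -> dict:
        new_handler = self.migrated.get(capability)
        # Not yet migrated, or tenant not in the canary group: use the monolith.
        if new_handler is None or tenant not in self.canary_tenants:
            return self.legacy(request)
        try:
            return new_handler(request)
        except Exception:
            # Fall-back: a failing microservice degrades to the monolith.
            return self.legacy(request)


# Hypothetical handlers standing in for real systems.
def legacy_monolith(request: dict) -> dict:
    return {"served_by": "monolith", "order": request["order"]}


def new_pricing_service(request: dict) -> dict:
    return {"served_by": "pricing-microservice", "order": request["order"]}


def broken_inventory_service(request: dict) -> dict:
    raise RuntimeError("simulated outage")


router = StranglerRouter(
    legacy=legacy_monolith,
    migrated={"pricing": new_pricing_service, "inventory": broken_inventory_service},
    canary_tenants={"small-tenant"},
)

if __name__ == "__main__":
    print(router.handle("pricing", "small-tenant", {"order": 1}))
    print(router.handle("pricing", "large-tenant", {"order": 2}))
    print(router.handle("inventory", "small-tenant", {"order": 3}))
```

In this sketch, widening the rollout means adding tenants to `canary_tenants` and capabilities to `migrated`; once every capability is migrated and stable for all tenants, the legacy handler can be retired. A production system would more likely implement this at an API gateway or service mesh and would log each fall-back rather than silently swallowing the error.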
In summary, applying these practices, individually or in combination, supports the transformational journey of moving the monolith to a cloud- and microservice-based product.
Few best practices for the on-premise monolith to cloud microservice based product journey was originally published in Walmart Global Tech Blog on Medium.