Introduction
Let's be honest: microservice architectures are one of the biggest "over-promise, under-deliver" trends to hit our industry in the last 10 years. It’s rare to see organizations actually find their way to the promised land of small service footprints, manageable maintenance, and independent scalability. This is usually more of a people problem than an inherent flaw in the microservice design pattern: poor coordination can turn what should be a tightly integrated ecosystem into an increasingly fragmented and difficult-to-manage collection of semi-connected services. Throughout this blog series we are going to explore how, when a “Big Three” automaker identified this problem, we were able to help them get more out of their transition to microservices.
The Context
When Focused engaged with one of the “Big Three” automakers, they were in the midst of transitioning away from a monolithic service that had long supported the management of extended product and service offerings targeted at customers with fleets of vehicles. This monolith had become an unwieldy, difficult-to-debug, do-it-all service, extended and tweaked ad hoc by many different engineers and teams over the years, and it was struggling to keep up with the ever-increasing load put upon it by the expanding suite of product services. Microservices seemed like the perfect fix - until they weren't.
The Challenge
Before long, the microservices they were transitioning to began to face the same problems the monolith did: isolated teams across products and product lines developed specialized but often overlapping or redundant services that sprawled quickly and proved to be just as difficult to maintain as the monolith.
Take, for example, vehicle data - the most fundamental, core dataset for the company, upon which so many of these bespoke services relied. When you're a company whose entire business is vehicles, this is data that must be accurate and reliable 100% of the time. That kind of data integrity is difficult to achieve, however, when any given product group might have multiple services, managed by different product teams within the group, each designed to serve some subset of the same underlying data to its customers - and when the standard approach for doing so involves various flavors of convoluted data pipelining, cache tables, and duplication from the source of truth. Scale that model across many different product groups and it’s easy to see how things become problematic. When we came on board, pieces of this core dataset could be fetched from over 70 different endpoints in the ecosystem, each developed independently to satisfy a unique use case.
If you're Bob from Bob's Plumbing and you have a fleet of vans serving your own customers, it’s critical to be able to accurately manage that fleet and the services that support it and your business operations. When you navigate to one product page and see different information than what is shown on another, you're going to wonder what is going on. This is not the customer experience the company was striving for.
The organization recognized this problem and began an initiative to unify the customer experience across their product offerings. A key part of this initiative was addressing the data problem that had developed over time. Our team, a mix of Focused folks and internal engineers, was tasked with solving it. One of the veteran internal engineers on our team proposed an idea, and we decided to run with it.
Unifying The Data
To address the challenges outlined above, streamline product development, and enable a cohesive customer experience, we wanted to unify the data model without sacrificing the flexibility and team autonomy that come with the distributed ownership of a microservice model.
Our solution was to introduce a federated GraphQL API architecture. By federating the core source-of-truth services we could give data consumers a single point of access to all core data, leave ownership and control of those core services with the individual teams responsible for them, and ensure that requested data would always be fetched directly from its source of truth rather than from some secondary service that could still be waiting for its cache table to update after a change.
Additionally, related data served from distinct underlying services could be composed into a single logical domain entity. This made data fetching easier for consumers and allowed a lot of flexibility on the backend. As the organization reorganized and shuffled teams around, ownership of a given field or dataset could follow the reorganization, and the change would remain completely invisible and seamless to the consumer. It didn’t matter where the data came from - as long as the federation routing layer knew the source of truth for a given field or dataset, it could fetch that information and serve it as part of the overall entity. With this model, changing where data lives becomes as easy as updating the configuration of the routing layer to tell it where to fetch that particular piece of the data.
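To give a feel for the routing idea, here is a deliberately tiny sketch in Python. It is not how a real federation router (such as Apollo Router) works internally - a real router derives its query plan from registered subgraph schemas - and the service names and fields are hypothetical, but it shows how field-level ownership reduces to a mapping that can be reconfigured without touching consumers:

```python
# Illustrative sketch only: a toy "router" mapping each field of a
# federated entity to the service that owns it. Service names and
# fields are hypothetical; a real federation router derives this
# mapping from subgraph schemas rather than a hand-written dict.

FIELD_OWNERS = {
    "vin": "core-vehicle-service",
    "registrationExpiration": "core-vehicle-service",
}

def route(fields):
    """Group requested fields by the service that is their source of truth."""
    plan = {}
    for field in fields:
        owner = FIELD_OWNERS[field]
        plan.setdefault(owner, []).append(field)
    return plan

# Moving a field to a new owner is purely a configuration change;
# consumers keep sending the same query to the same single endpoint.
FIELD_OWNERS["registrationExpiration"] = "registration-service"

print(route(["vin", "registrationExpiration"]))
# {'core-vehicle-service': ['vin'], 'registration-service': ['registrationExpiration']}
```

The point of the sketch is the last three lines: shifting ownership of `registrationExpiration` changes only the routing configuration, not the query the consumer writes.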
Let’s take a look at a concrete example. Consider the following scenario:
- Team A is responsible for core vehicle data, including VIN and registrationExpiration fields.
- Team B is created to manage title and registration renewals for all vehicles. It’s logical that Team B should now become the source of truth for the registrationExpiration field.
Using this federated API approach, this change can be achieved as a seamless transition:
- Team A continues to serve the full dataset that makes up the overall Vehicle entity while Team B develops a service to expose the registrationExpiration field.
- Team B completes development and registers a schema with the router indicating that its service is the primary provider of the registrationExpiration field and that this field contributes to the federated Vehicle entity.
- When a client sends a request to the router for a vehicle’s VIN and registrationExpiration, the router fetches each field from the appropriate service in parallel, composes the fields into a single Vehicle object containing all of the requested data, and returns it to the consumer.
As long as users have the appropriate permissions to fetch each field, they can receive a complete view of vehicle data, regardless of which service manages each field. When ownership of a field shifts between teams, only the schema and router configuration need to be updated - no need for complex migrations or client disruptions.
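To make the fetch-and-compose step concrete, here is a minimal Python simulation of the flow described above. It is purely illustrative - the real system resolves GraphQL queries through a federation router, not hand-rolled functions - and the two subgraph functions, the sample VIN, and the dates are all made up for the example:

```python
# Illustrative sketch: simulate the router resolving a query for a
# vehicle's vin and registrationExpiration by fetching each field
# group from its owning subgraph in parallel, then composing one
# Vehicle object. The functions stand in for real subgraph calls.
from concurrent.futures import ThreadPoolExecutor

def team_a_subgraph(vehicle_id, fields):
    data = {"vin": "1FTBR3X85NKA00001"}              # Team A: core vehicle data
    return {f: data[f] for f in fields}

def team_b_subgraph(vehicle_id, fields):
    data = {"registrationExpiration": "2026-03-31"}  # Team B: registrations
    return {f: data[f] for f in fields}

# Query plan as it might look after Team B registers its schema
QUERY_PLAN = {
    team_a_subgraph: ["vin"],
    team_b_subgraph: ["registrationExpiration"],
}

def resolve_vehicle(vehicle_id):
    """Fetch each field group from its source of truth in parallel and merge."""
    vehicle = {"id": vehicle_id}
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(subgraph, vehicle_id, fields)
                   for subgraph, fields in QUERY_PLAN.items()]
        for future in futures:
            vehicle.update(future.result())
    return vehicle

print(resolve_vehicle("veh-123"))
```

The consumer only ever sees the composed Vehicle object; which subgraph supplied which field is entirely the router's concern.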
Conclusion
We had landed on a solution that, at a high level, promised to offer the following key benefits:
- Reliable and Accurate Data Access: Data is served directly from a single source of truth, configurable down to the field level, eliminating the need for cache tables, redundant logic, and data duplication
- Improved Customer Experience: Customers would no longer encounter data inaccuracies depending on which page they opened and how that page sourced its data
- Improved Developer Experience: A single entry point to access all data meant that customer-facing product and frontend teams no longer had to coordinate across multiple APIs to gather necessary data, accelerating development of new products
- Clear ownership boundaries: Each team responsible for managing data controlled their data operations, access permissions, and exposure rules, again down to the field level
- Flexibility: Backend changes, such as shifts in data ownership, could be implemented without disrupting existing consumers. Schema and router configurations were updated transparently
What Comes Next
This overview is just the start. Throughout the rest of this blog series, we will dive deeper into the specifics of how our team implemented this federated API model, the architectural decisions we made, and the challenges we encountered along the way. Part two will focus on the implementation of the core graph and a deep dive into the technologies we used to deliver our solution (hint: big shoutout to Netflix and Apollo for paving the way for us).
Stay tuned.