> draft

In Defense of Service Locator

Zhong Yu, 2015-12-20

The consensus is that Service Locator (SL) is really bad. It's forever associated with the title anti-pattern, in order to warn off naive young minds. And the alternative, Dependency Injection (DI), is superior in every way.

But in this article, we'll take a closer look at DI and SL, in an attempt to offer some defense for SL.

We assume the programming language is object-oriented and imperative, but not necessarily statically typed.

Configurable Dependency

Let's say our application has compile-time dependencies A → B → C → ..., that is, the source code of A references type B, and so on. Some of these dependencies are hardcoded, meaning, constructors and static methods are invoked on depended classes. For example, dependency B → C is hardcoded, if B directly instantiates C objects.

Some dependencies need to be configurable. For example, while C depends on D, we want different kinds of D in different environments. There are various solutions to that problem, among which are SL and DI. In both SL and DI, the solution is framed as such -- C will interact with an object of type D; C does not directly create the object; a configuration phase decides which D is created and how. The difference between SL and DI is -- how does C get the D object?

We call D a service to C. A service does not have to be something heavy; it can be as simple as a boolean flag.

Service Locator

How does C get the D object? -- In SL, C looks up D from a "static location".

The "static location" must be configured first, such that it yields a proper object upon each lookup. We call this binding, which is typically done at application startup. The rest of the application should only read from the location, but not bind.

The "static location" can be a static variable, a global hash map, a config file, environment variables, etc. However, it is not necessarily a one-per-process concept; the location could be one-per-thread for example. It is "static" in the sense that it is directly accessible anywhere in the code. In this article, we assume it's accessed through a static factory API.
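A minimal sketch of such a static factory API, assuming a process-wide registry keyed by service type. The names Services, bind, get, and ProdD are hypothetical, chosen only for illustration; they are not taken from any particular library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical minimal locator: a global map from service type to a supplier.
final class Services {
    private static final Map<Class<?>, Supplier<?>> BINDINGS = new ConcurrentHashMap<>();

    private Services() {}

    // Binding: done in the configuration phase, typically at application startup.
    static <T> void bind(Class<T> type, Supplier<? extends T> supplier) {
        BINDINGS.put(type, supplier);
    }

    // Lookup: directly accessible anywhere in the code.
    static <T> T get(Class<T> type) {
        Supplier<?> supplier = BINDINGS.get(type);
        if (supplier == null) {
            throw new IllegalStateException("no binding for " + type.getName());
        }
        return type.cast(supplier.get());
    }
}

interface D {
    String name();
}

// One possible D; another environment could bind a different implementation.
final class ProdD implements D {
    public String name() { return "prod"; }
}

final class C {
    // C does not create D; it looks D up from the static location.
    String describe() {
        return "C uses " + Services.get(D.class).name();
    }
}
```

At startup, the configuration code would call Services.bind(D.class, ProdD::new); afterwards, the rest of the application only calls Services.get.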

Dependency Injection

How does C get the D object? -- In DI, a D object is passed (injected) to every C object (let's say, through the constructor); the D object is saved in C as an instance variable, to be used later.

Now the question becomes, who injects D to C? If it's B, we are just shifting the problem to B to get D. Even worse, that would introduce a new dependency B → D. Ideally, B should only care about C's interface, not the fact that C depends on D. Therefore, B should not invoke C's constructor. Then, how does B get a C object?

The genius of DI solution is that C is injected to B. Similarly, B is injected to A. It's turtles all the way down.

Of course, eventually, someone has to actually instantiate and inject all these objects. This is done in a phase we call assembly, typically at application startup. A D object is instantiated; then a C object is instantiated with the D; and so on; finally, an A object is created. After assembly, we kick start "actual work" by invoking methods on the A object.
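The assembly just described can be sketched by hand as follows. The class bodies and the Assembly helper are hypothetical; only the chain A → B → C → D comes from the text.

```java
interface D {
    String name();
}

final class ProdD implements D {
    public String name() { return "prod"; }
}

final class C {
    private final D d;              // injected, saved for later use
    C(D d) { this.d = d; }
    String describe() { return "C uses " + d.name(); }
}

final class B {
    private final C c;              // B never calls C's constructor
    B(C c) { this.c = c; }
    String describe() { return "B -> " + c.describe(); }
}

final class A {
    private final B b;
    A(B b) { this.b = b; }
    String run() { return "A -> " + b.describe(); }
}

// Manual assembly: the only place where constructors of services are invoked.
final class Assembly {
    static A assemble() {
        D d = new ProdD();          // a D object is instantiated
        C c = new C(d);             // then a C object with the D
        B b = new B(c);
        return new A(b);            // finally, the root object
    }
}
```

"Actual work" then starts with Assembly.assemble().run(). Note that B compiles against C only; the choice of D is visible in Assembly alone.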

We started out by trying to make C → D configurable; we ended up with the solution where B → C is also configurable because we can choose any C for B during assembly. In DI, all service dependencies can be configured, whether or not they need to be. This is celebrated as the soul of DI by evangelists. Let's describe DI again from that perspective.

Dependency Injection (2)

Dependency Injection is a design pattern concerning how service dependencies are expressed and established.

Injection

Without loss of generality, in this article we only consider constructor injection. We say type Y is injected to class X, if the constructor of X accepts a Y argument; X saves the Y in an instance variable, to be used later.

In DI, a service shall not be directly instantiated in "normal" application code. Instead, if class X wants to use service type Y, Y must be injected to X. (More generally, a factory of Y is injected; we'll discuss that shortly.) X → Y is said to be loosely coupled, in the sense that X is agnostic of how Y is created. DI enforces loose coupling among all service dependencies.

If a class depends on any services, the class itself is also a service; and so on.

DI works at the instance level. A static method cannot have internal dependencies on services (that are not passed through method parameters). A static method that needs services should be refactored into an instance method.

Assembly

There is a special assembly phase, typically at application startup, where all services are instantiated and injected. The goal of the assembly is to create a root service, as the entry point of "actual work". The root service depends on other services, and so on, forming a service graph. The service graph is fixed at assembly; application logic and data flow through this rigid graph. In a sense, it's like assembling a circuit board by wiring prebuilt components; the circuit functions by signals traveling through wires and processed by components. Since the service graph is fixed and known at assembly, it's easier to apply analysis and transformation on the graph, for various reasons.

The assembly phase doesn't have to be at application startup; it could also be done on a sub-application level. For example, in a web server, we may want to assemble a new handler for every incoming request.

A DI application tends to have many services. Assembly can be done manually, i.e. by programmers explicitly invoking constructors of services. However, this can be very tedious if there are many services, and hard to maintain when dependencies evolve during development. Most programmers would rather use a DI framework to perform assembly; the central feature of DI frameworks is to provide a more succinct way to define service graphs.

Factory

The service graph is fixed after assembly, which raises an obvious concern: isn't that too rigid and not dynamic enough? For example, X depends on Y, but injecting a fixed Y object may not be enough -- maybe because different kinds of Y are required based on user inputs; or maybe because the nature of Y dictates that a new instance of Y is needed per use. This concern of dynamism is easily addressed by injecting a factory of Y into X; X will query the factory whenever needed, and the assembly ensures that the factory returns a proper Y. Query may take parameters; parameter values are typically limited to some compile-time constants.

Conceptually, even with factories, the service graph is still fixed; the X node can be seen as wired to multiple (possibly infinite) Y nodes that are nonetheless pre-determined at the assembly phase.

Note that injecting an instance is a special case of injecting a factory, because an instance can be seen as a factory of itself.
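As an illustration of factory injection, here is a sketch in which X receives a factory of Y and queries it with a parameter. The render method, the format names, and FactoryAssembly are assumptions made up for this example.

```java
import java.util.Locale;
import java.util.function.Function;

interface Y {
    String render(String text);
}

final class X {
    // A factory of Y is injected instead of a single Y instance.
    private final Function<String, Y> yFactory;

    X(Function<String, Y> yFactory) { this.yFactory = yFactory; }

    // X queries the factory whenever it needs a Y.
    String process(String format, String text) {
        return yFactory.apply(format).render(text);
    }
}

final class FactoryAssembly {
    static X assemble() {
        // The assembly decides which Y each query yields; X does not know how.
        Function<String, Y> factory = format -> {
            switch (format) {
                case "upper": return text -> text.toUpperCase(Locale.ROOT);
                case "plain": return text -> text;
                default: throw new IllegalArgumentException("unknown format: " + format);
            }
        };
        return new X(factory);
    }
}
```

Injecting a fixed instance y corresponds to the constant factory format -> y.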

Service vs Data

Not all types are service types. The application must have types that can be instantiated at will, which we can call data types. DI enforces a strict separation between service types and data types. A data type cannot have internal dependencies on services. For example, we cannot have a saveToDb() method on a Person data type, because that would require a database service injected to Person, forcing Person into a service type.
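A sketch of that separation, under the assumption that saving moves out of Person into a separate service. PersonStore and the Database interface shown here are hypothetical.

```java
// Data type: can be instantiated at will, has no dependency on services.
final class Person {
    final String name;
    Person(String name) { this.name = name; }
}

// Service type (hypothetical interface for illustration).
interface Database {
    void insert(String table, String row);
}

// The saving logic lives in a service, which gets the database injected.
final class PersonStore {
    private final Database db;
    PersonStore(Database db) { this.db = db; }

    void save(Person person) {
        db.insert("person", person.name);
    }
}
```

Application code may write new Person("Ann") anywhere, but it obtains PersonStore only through injection.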

How do we determine at design time whether a type is a service type or a data type? It is subjective, but usually intuitively obvious. For example, LinkedList is obviously a data type, and Database is obviously a service type. A rule of thumb is that, if a type needs to be mocked in unit tests, it is most likely a service type.

Strictly speaking, whether a type is a service type can vary on use sites. The type could be injected as service in one place, and instantiated as data in another. For the purpose of this article, it can be seen as two types in two usages.

Indirections

As the saying goes, every problem can be solved by another level of indirection... except the problem of too many indirections. Both SL and DI introduce indirections between service consumers and providers; however, DI does it more indirectly than SL.

SL contains minimum indirections -- a static factory is used for services that need to be configurable. SL is a simple solution to a specific problem. Of course, it might be too simple, not covering enough concerns.

DI is more sophisticated. It also uses factories: C → D is resolved by C querying a factory of D (note that we consider an instance of D a special case of a factory of D). But DI involves more indirections than SL, because:

More indirections solve more problems, even problems that don't exist. But indirections do not come free of cost.

DI Complexity

The DI pattern is harder to understand than SL. Nevertheless, the concept of DI is pretty simple once explained; we are more concerned about whether there are complexities in practicing DI.

The amount of indirection in DI may make an application hard to understand. Indirection means code is split and scattered across different places; more indirection makes it more difficult to follow the application's structure and behavior.

The form of indirections in DI may impose constraints on application designs --

Additionally, DI frameworks can often become a source of frustration too, due to their magic, complexity, and flaws.

SL Simplicity

SL is trivial to understand and easy to practice.

SL hardly imposes any constraints on API structures. Services are easily accessible in any code.

SL encourages hardcoded dependencies

In DI, it is compulsory that all service dependencies are resolved indirectly. But in SL, indirection is only required for dependencies that need to be configurable. Other dependencies are often left hardcoded.

We should stress upfront that SL encourages hardcoded dependencies, a point probably missed by SL foes.

In most discussions that compare DI and SL, a single class is studied for the effects of applying DI or SL. But this is not exactly fair to DI. We should compare at the application level. A DI class in a DI application often corresponds to a class in a comparable SL application that has the dependencies hardcoded! Therefore to compare DI and SL in fairness, we also need to compare DI against Hardcoded Dependency.

Unit Testing

The most recognized benefit of extensive indirection in DI is probably for unit testing.

In the popular school of thought on unit testing, when testing a class C, only the code of C should be executed and tested; if there's a service dependency C → D, all interactions between C and D should be intercepted by the unit test. This school of thought is perfectly aligned with DI -- the unit test creates a mock D and injects it to C.

How does SL fare in this area? If C → D is resolved by SL, it's not much different -- the unit test binds a mock D before testing C. This binding should be confined to this particular unit test; in order to do that, SL API should provide localized binding, for example by using a thread-local registry.

But remember that SL also encourages hardcoded dependencies. If B → C is hardcoded, B can invoke C's constructors and static methods. How can the unit test of B mock C if it's hardcoded?

One way is to restructure B such that the majority of its code is testable with mocks; hardcoded references to C are extracted and isolated. Such restructuring may be too complicated and make B hard to understand.
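One possible reading of this restructuring, sketched below, is to isolate the constructor call in a single overridable method. This is only one way to do it; the method name newC and the class bodies are hypothetical.

```java
class C {
    String compute() { return "real"; }
}

class B {
    // The bulk of B's logic works against whatever newC() returns.
    String run() {
        return "B got " + newC().compute();
    }

    // The single hardcoded reference to C, isolated so a test can override it.
    C newC() {
        return new C();
    }
}
```

A unit test can then subclass B, override newC() to return a stand-in, and exercise run() without executing the real C. The cost is visible even at this size: B now has an extra seam that exists only for testing.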

Another way is to link to a mock definition of C when unit testing B. Even though B → C is hardcoded at compile time, we still have the freedom of switching C at link time. In theory this is always possible, but whether it is practical depends on available tooling. In Java for example, there is PowerMock, which can mock constructors and static methods as easily as instance methods. (We may also use PowerMock to mock SL API if it doesn't provide localized binding.)

Some programmers may not even subscribe to the aforementioned school of thought on unit testing. When testing B, maybe it's more beneficial to execute C's code as well; we are more interested in seeing B's overall effect on D, therefore we only need to mock D, but not C. A purist may object that this is no longer a unit test.

Application Assembling

In DI, every service is born configurable, which makes it easy to assemble new applications from existing service classes. Such versatility is more important in a bigger organization with lots of code and coders; preemptive indirections may be wiser.

In SL, whether a service is configurable is a matter of discretion, based on how the service is used in the current fleet of applications, as well as how it might be used in future applications. It's common that a service is hardcoded at first, then later needs to be refactored to become configurable. If the programming language is statically typed, refactoring is typically easy and safe. But in a bigger organization, such refactoring may involve too many applications and developers, and it could be a difficult task.

It's also possible that a hardcoded service has different versions on different software branches. This becomes more practical with modern version control tools and practices. For example, it's common that one team sees a stand-in version of a class that another team is implementing full-fledged on another branch.

Decoration

In DI, since all services are accessed indirectly, we can apply individual decoration on any service, or cross-cutting decorations on multiple services. Decoration is done at the assembly phase, and it's a common feature of DI frameworks.
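A sketch of individual decoration done by hand at assembly, without any framework. LoggingD, RealD, and the fetch method are names assumed for illustration.

```java
import java.util.ArrayList;
import java.util.List;

interface D {
    String fetch(String key);
}

final class RealD implements D {
    public String fetch(String key) { return "value-of-" + key; }
}

// Decorator: implements the same interface, records each call, then delegates.
final class LoggingD implements D {
    private final D delegate;
    final List<String> log = new ArrayList<>();

    LoggingD(D delegate) { this.delegate = delegate; }

    public String fetch(String key) {
        log.add("fetch " + key);
        return delegate.fetch(key);
    }
}

final class C {
    private final D d;              // C cannot tell whether d is decorated
    C(D d) { this.d = d; }
    String lookup(String key) { return d.fetch(key); }
}
```

At assembly, new C(new LoggingD(new RealD())) wires in the decorated service; C's code is unchanged. With an SL, the same wrapping would happen at binding time.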

We can do the same thing in SL for all services that are accessed through SL API. However, what about services that are hardcoded? One possible response is Aspect Oriented Programming; but introducing AOP to an application may bring in more trouble than it's worth. Another response is to consider the need of decoration as a call for being configurable, therefore any service that may require decorations must be accessed through SL API.

Context

Dependency injection is per instance. For example, a service graph can contain two instances of C, each injected with a different kind of D. Therefore, C → D is resolved differently based on context. The context is encoded in the service graph.

Note that while in theory an arbitrarily complex service graph can be created, it may not be an easy thing to do in practice. Using a particular DI framework, it may be impossible or very difficult to define the exact service graph you want.

In SL, how can C → D be resolved differently based on context? Since it's resolved by looking up on a static factory, we need to pass some contextual information as a qualifier of the lookup. The qualifier is determined somewhere at a higher layer where the context is established. The qualifier can be passed down as method parameters, but that may be too ugly or impractical. Another way to pass something down the call stack is through a thread-local variable.

Actually, SL binding can be thread-local too. At a higher layer where the context is established, we can create a thread-local binding of D, and un-bind it when the context is terminated. C does not know or care about how the binding is established; it simply looks up D without any explicit qualifier (while the thread-ID is an implicit qualifier). This solution is more appealing from C's point of view, because now it is agnostic of the context.
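A rough sketch of thread-local binding, assuming a hypothetical LocalServices API whose withBinding method binds a service for the duration of a block and restores the previous state afterwards. The same mechanism could provide the localized binding a unit test needs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class LocalServices {
    // Each thread sees its own map of bindings.
    private static final ThreadLocal<Map<Class<?>, Object>> LOCAL =
            ThreadLocal.withInitial(HashMap::new);

    private LocalServices() {}

    // Lookup without an explicit qualifier; the current thread is the implicit one.
    static <T> T get(Class<T> type) {
        Object service = LOCAL.get().get(type);
        if (service == null) {
            throw new IllegalStateException("no binding for " + type.getName());
        }
        return type.cast(service);
    }

    // Binds `service` on the current thread while `body` runs, then un-binds it
    // (or restores the previous binding, if there was one).
    static <T, R> R withBinding(Class<T> type, T service, Supplier<R> body) {
        Map<Class<?>, Object> bindings = LOCAL.get();
        Object previous = bindings.put(type, service);
        try {
            return body.get();
        } finally {
            if (previous == null) {
                bindings.remove(type);
            } else {
                bindings.put(type, previous);
            }
        }
    }
}

interface D {
    String name();
}

final class C {
    // C is agnostic of the context in which the binding was established.
    String describe() {
        return "C uses " + LocalServices.get(D.class).name();
    }
}
```

The higher layer that establishes the context would call LocalServices.withBinding(D.class, someD, () -> new C().describe()). One caveat of this sketch: work handed off to another thread inside the block does not see the binding.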

Dependency Manifest

In SL, dependencies are "hidden", that is, we have to investigate the code of C to discover that it depends on D. Even worse, if we read the code of B which uses C, it is not obvious that B also depends on D, transitively. It's possible that an application using B forgets to bind D, causing runtime errors.

In DI, dependencies of a class are clearly listed in the constructor signature. Users are forced to supply all dependencies when instantiating the class. That sounds very nice, except... we shall not instantiate services in "normal" application code anyway! Therefore these constructor signatures aren't really useful in application coding.

We do need to invoke service constructors in DI in two places -- unit test, and assembly.

In SL, every service lookup must be matched by a service binding. This seems error prone if there's no mechanism to enforce the matching. In practice though, it doesn't seem to be a real problem. Typically during application development, every time we add a service lookup in application code, we'll simultaneously add a service binding in configuration code, thus maintaining the matching of lookup and binding along the way. Actually, it is probably the same in most DI practices -- adding a new service in application is accompanied by adding a new binding to the DI framework.

If the programming language is statically typed, we could also write a static analyzer for SL that examines service lookups and bindings to make sure they match. The analyzer needs to look into the code instead of just method signatures, which is a little harder to do. When no such static analyzer exists, we could use the IDE to list all usages of the SL API, and manually verify that every lookup is satisfied by a binding.

Life Cycle

In DI, a service can be disposed when the service graph is disposed, at which point some actions can be taken by the service. But this is not specific to DI; it is more about identifying the termination point of a task and taking some actions at that point; we can do the same thing in SL.

Another concept is "scope", which can host some service instances. For example, a service may be "request-scoped", for performance or semantic reasons. When the scope is terminated, the service is disposed. Ideally, use sites of the service do not know or care about its scoping. Once again, this does not seem to be specific to DI or SL; we can scope a service in DI at the assembly phase, and in SL at the binding phase.

Some service types may require timely and explicit disposal. In SL, we can look up a service, invoke it, then explicitly dispose it. In DI, in order to have a lookup step at the beginning, a factory of the service needs to be injected.
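For the DI case, a sketch of injecting a factory so that the consumer can obtain, use, and explicitly dispose the service. Session, FakeSession, and Report are hypothetical names; the disposal uses Java's try-with-resources.

```java
import java.util.function.Supplier;

// A service that requires timely, explicit disposal.
interface Session extends AutoCloseable {
    String query(String sql);

    @Override
    void close();                   // narrowed: no checked exception
}

final class FakeSession implements Session {
    boolean closed = false;

    public String query(String sql) { return "rows for " + sql; }

    public void close() { closed = true; }
}

final class Report {
    // A factory is injected, so each use can begin with its own lookup step.
    private final Supplier<Session> sessions;

    Report(Supplier<Session> sessions) { this.sessions = sessions; }

    String run() {
        // Look up, invoke, then dispose, even if the query throws.
        try (Session session = sessions.get()) {
            return session.query("select 1");
        }
    }
}
```

In SL, the body of run() would look the same, except that the session would come from a static lookup rather than from an injected factory.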

Summary

If DI were perfect, there would be no reason for SL. We listed some problems of DI that don't exist in SL.

Some criticisms of SL do not seem to hold upon closer look, for example, dependency manifest.

DI involves more indirections, which implies more flexibility. We discussed how to achieve similar flexibility in SL.

--

Feedback to this article can be posted to https://groups.google.com/forum/#!forum/od-service-locator.

See also: OD -- a simple Service Locator library for Java.