Out-of-Process SDKs

One of the most dangerous things that you can do as a developer is build a third-party SDK into your app.

Let’s talk about SDKs
#

Often we think of SDKs (“software development kits”) as libraries of code that we use and bundle into our apps, and sometimes we hate them for that. They might not be in the language that we want to use, they might not use the idioms we prefer, they might have unpatched vulnerabilities, or they might even break our builds. Even if the problems with an SDK are below your horizon, they are still there, like an iceberg threatening to sink your big, beautiful app.

The problem with SDKs
#

The problem with SDKs is that the wrong people own them. I’ve worked on teams that produce SDKs, and I can say pretty confidently that these projects are not as high priorities or as well-funded by their companies as other efforts, and they are often quixotic labors of love by under-appreciated and under-rewarded engineers. The truth is that whatever that cloud provider or database company says, their SDKs aren’t their core product, and the interests of that core product will always supercede the interests of their SDKs.

But when you’re using those SDKs, they are as much a part of your product as any other code that you write. You will be shipping that SDK inside your app, and your customers will hold you responsible for whatever happens in that SDK. If your app crashes in that vendor’s SDK, you will be the one with a disappointed customer. If that SDK steals data and you are in an app store, that app store could reject your app or even ban you. If that SDK has untrustworthy dependencies, your app will have untrustworthy dependencies. If that SDK is unmaintained, part of your app is unmaintained. Are these companies really helping you by giving you SDKs?

Why companies make SDKs
#

Companies have plenty of reasons for wanting to publish SDKs. An SDK can make an API easier to use and sometimes even more reliable. SDKs that have retry built into them allow API providers to offer more reliable services without worrying that customer-initiated retries will naively overwhelm their servers. SDKs can also cover a multitude of messes, such as when companies decide to rearchitect their serving architectures to change the endpoints that their API customers call. And of course, the less you know about what’s going on in an SDK, the more you are locked into using it.

It’s Better Out-of-Process
#

It would be a lot easier to manage complex dependencies if they were outside of our applications, running as separate processes from their own binaries that we could easily test and upgrade. Our operating systems do some of this, but operating systems move too slowly and serve audiences that are too general to provide all the capabilities that we need. Instead of building these common critical capabilities into our operating systems or our apps, we can put them in a separate program that we call an out-of-process SDK.

An out-of-process SDK moves common critical capabilities out of your apps and into an external component that you can easily verify, upgrade, and reuse.

One kind of out-of-process SDK is called a “sidecar” because it runs alongside an application, but API gateways can also serve as out-of-process SDKs, and we don’t need to stop with these.

gRPC: the Mothra of SDKs
#

gRPC is a Google-initiated project that provides “a high performance, open source universal RPC framework.” gRPC provides patterns for efficiently building APIs on HTTP/2 (and subsequent protocols) that include streaming and the performance advantages of HTTP/2 connection sharing. But the gRPC protocol isn’t trivial, so using the gRPC protocol almost always requires developers to import support code, which is usually one of the supported gRPC libraries. But like Mothra, the Queen of the Monsters, these supported gRPC libraries are bearing lots of larval features that their creators would like to place in your apps.

Recent presentations at gRPConf showed the broad scope of the gRPC project. Here’s a summary slide from an overview presentation:

That slide shown presents a list of features that are built into the supported gRPC libraries. Here’s the list:

Name Resolution (Service Discovery, Pluggable)
Load Balancer (Manage subchannels, seamless HTTP/2)
Interceptor (Powerful middleware)
Deadlines/Timeouts, Cancellation (Safeguard against network latency or server issues, Optimize resource usage)
Retry (Fault-tolerant and resilient)
Termination (Resource cleanup)

The gRPC project currently provides official support for a dozen languages. Some of these are built on common implementations so the number of independent implementations is a bit less:

The C#/.Net, Objective-C, PHP, Python, and Ruby libraries are built on the C++ implementation.
The Dart implementation is independent.
The Go implementation is independent.
The Kotlin library is build on the Java implementation.
The Node implementation was originally built on the C++ implementation and is now independent.
The Swift implementation is independent.

Even with this sharing, this is a lot of implementations to keep up-to-date, particularly when we look at the feature list above, and even the gRPC project doesn’t have all of these capabilities in all of its libraries. The Dart and Swift implementations are more focused on client-side features (though not exclusively), but in the eyes of the gRPC project, a gRPC implementation isn’t complete until it includes the full list of features above. Another gRPConf presentation reviewed the developing support for Rust, which is based on a third-party gRPC-compatible library called Tonic. There we saw the slide shown below, which points out the missing features in Tonic that keep it from being a full gRPC implementation.

It’s a lot. On the slide we see “Service Config”, “Advanced Name Resolution”, “Configurable Load Balancing Policies”, “Connection Management”, “xDS/Envoy Support”, “Health Checking”, “Observability Integration (OpenTelemetry)”. But is that all? Who knows what else is spawning on other slide decks inside Google?

With so much required to implement a gRPC library, let’s review what we need to do to call gRPC APIs from our apps. gRPC APIs are generally described with Protocol Buffer files that are compiled to generate calling and serving support code. In some languages, like Go, code generation is poorly integrated with the build process, so this generated code winds up checked into GitHub or made available via package managers. But that has hazards – app developers can have build problems if different API client libraries are linked to different versions of the gRPC libraries or include conflicting generated files for common proto dependencies. And the supported gRPC libraries come with some or all of the features listed above and are regularly updated as these features are added, improved, and debugged. So even the most simple API client needs to be updated regularly.

This doesn’t mean that we shouldn’t use gRPC. If you don’t use gRPC, all of these advanced capabilities are ad hoc mix-and-match messes. The gRPC project provides valuable structure and implementations of powerful networking capabilities. Our question here is “where should we put them?”

Do we want all of this in our applications?
#

Feature proliferation in networking libraries forces developers to regularly rebuild and republish the applications that use them. It adds complexity to application code, it increases security profiles and vulnerability exposure, and it creates governance challenges for organizations. If your organization has even just a few applications that use gRPC APIs,

you need to be sure that all of your services are written in languages that have a full-featured gRPC library available,
you need to be sure that all of your services use the correct version of the right gRPC library,
you need to be sure that all of your services can be easily rebuilt and redeployed when your gRPC library needs updating,
and you need to be sure that all of the networking features that you use are correctly configured.

For example, if your API servers are supposed to check API keys, how do you know that they do? gRPC’s library-oriented approach might make sense if you can easily rebuild and redeploy all of your services whenever you need to update your gRPC library. This might be true for Google, which keeps code in a monorepo and builds and deploys with a single integrated system. But even for Google, this problem is solved with a separation of concerns that keeps important common functions out of applications in a service layer that uses proxies to manage communication. Saying that you should include all this complexity in your applications is classic “do what we say, not what we do.”

IO: the Out-of-Process SDK for Networking
#

We can think of IO as an SDK that runs out-of-process, in other words alongside an application, rather than inside of it. IO handles many common tasks of SDKs:

authentication, adding credentials to API requests
retry, allowing clients to robustly handle transient server-side glitches
routing, allowing clients to be written without hard-coded addresses of API service endpoints

In short, IO does the advanced networking things that the gRPC project wants to do in-process. But because IO is out-of-process, it allows these capabilities to be deployed and upgraded without ever touching applications. Also, because communication is managed out-of-process, platform teams can have confidence that network operations are correct and secure without ever looking at application code. Let’s read it one more time:

An out-of-process SDK moves common critical capabilities out of your apps and into an external component that you can easily verify, upgrade, and reuse.

What IO doesn’t do
#

Notably, IO does not do message serialization and deserialization. But this is the easiest thing that SDKs do. Non-coders sometimes think serialization is the most important reason to offer SDKs, especially when they use Protocol Buffers, but capable developers find it easy and can be annoyed when SDKs do this badly – which is often! SDK maintainers are often asked to support multiple languages including ones where they aren’t up on the latest best practices and idioms. On the other hand, application developers can easily hand-write or generate their serialization code, and for Protocol Buffers it’s not difficult to use a tool like protoc or buf to generate serialization code for the APIs that an application needs.

Smart Developers Say No to Vendor SDKs
#

If we are using an out-of-process SDK, what do we want from the API support code that we actually build into our apps?

Simplicity. All the code that we include should be easy to read and understand, and as much as possible, it should only be code that we actually use.
Security. If we’re running with a local proxy, we would like to bind the application and proxy together so that no other processes can intercept or interfere with their communication. One nice way to achieve that is to use Linux abstract sockets.
Commonality. The things that we need are not application- or API-specific. They should either be in a language’s standard library or in widely-used shared frameworks. Nothing needs to come from a service vendor.

When we move complexity out-of-process, thick networking libraries like gRPC and Connect RPC are liabilities! That’s led us to create Sidecar, a Go library that supports gRPC clients and servers that communicate through sidecar proxies.

agentio/sidecar

Baggage-free gRPC for Go.

Out-of-Process Power
#

IO is an out-of-process SDK that provides critical communication capabilities for your apps in an external component that you can easily verify, upgrade, and reuse.

IO handles the complex and common networking capabilities that networked applications need and leaves application-specific and language-idiomatic tasks to application developers. By providing a better distribution of responsibility, IO gives application developers more control of things that matter to them and less need to worry about things that don’t… or won’t, because IO handles them.

Comment with ATProto
#

0 likes | 0 reposts | 0 replies Join the conversation

Let’s talk about SDKs#

The problem with SDKs#

Why companies make SDKs#

It’s Better Out-of-Process#

gRPC: the Mothra of SDKs#

Do we want all of this in our applications?#

IO: the Out-of-Process SDK for Networking#

What IO doesn’t do#

Smart Developers Say No to Vendor SDKs#

Out-of-Process Power#

Comment with ATProto#