Debugging Microservices with Charles ✈︎

At Hipmunk, we have dozens of flight and hotel API integrations with our partners. Each one of those integrations is a microservice, which abstracts away the details of fetching and sanitizing the data. We have been writing even more microservices as our infrastructure gets more developed - things like the Circuit Breaker Processer.

This means that I often find myself in the position of tracking down bugs across a microservice architecture. If you've done this, you know that it can be like herding cats.

I'd like to talk about one of the tools that I've found useful in tracking down problems between microservices - though it can be used far more widely.

What is Charles / Charlesproxy?

Charles is a "Web Debugging Proxy".

From charlesproxy.com:

"Charles is an HTTP proxy / HTTP monitor / Reverse Proxy that enables a developer
to view all of the HTTP and SSL / HTTPS traffic between their machine and the
Internet. This includes requests, responses and the HTTP headers (which
contain the cookies and caching information)."

It's a proxy with a UI. You send your network requests through it, and it displays them for you. You can also meddle with them.

Why is this useful? (Why not just use Chrome Dev Tools?)

Charles can view all traffic that you funnel through it, not just browser traffic. Charles can configure your system proxy settings to pass through itself, which means as soon as you open it you can start seeing ALL the network traffic on your machine - at least all the network traffic done by programs that respect the system proxy. By default, that's most browsers and desktop programs, but potentially not code that you've written: not all Python libraries respect it by default. Requests does; Tornado doesn't unless you ask it to.
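For code that does not pick up the system proxy on its own, the proxy can be passed explicitly. Here is a minimal sketch with Requests, assuming Charles is listening on localhost at port 8888 (its default). The helper names `charles_proxies` and `get_via_charles` are made up for this example and are not part of any library.

```python
def charles_proxies(host="localhost", port=8888):
    """Build a Requests-style proxies mapping that points at Charles.

    Assumption: Charles is listening on localhost:8888 (its default port).
    """
    proxy_url = "http://{}:{}".format(host, port)
    # The same HTTP proxy endpoint is used for both plain and HTTPS traffic.
    return {"http": proxy_url, "https": proxy_url}


def get_via_charles(url, **kwargs):
    """Issue a GET through Charles so the exchange shows up in its UI."""
    import requests  # third-party; imported lazily so charles_proxies() has no dependencies

    return requests.get(url, proxies=charles_proxies(), **kwargs)
```

Calling `get_via_charles("http://mcscope.com/fortune")` with Charles open should make the request appear in the session window, provided the port matches your Charles proxy settings.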

Charles attempting to configure your system proxy settings to point at itself

A brief aside about SSL

Almost all web traffic these days is encrypted with SSL, which means to read the traffic, Charles needs to break your encryption. It does this by inserting itself between you and the server.

Charles is the man in the middle

This means that after you install Charles, if you turn on HTTPS proxying, you'll see this page when you try to go to a website. You can choose to trust Charles's root certificate to get around this.

SSL CA not trusted warning
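Your own code has to make the same trust decision. The sketch below assumes you have exported Charles's root certificate as a PEM file and that Charles is on localhost:8888. The certificate path is a placeholder and `charles_request_kwargs` is a hypothetical helper. Requests accepts a path to a CA bundle through its `verify` argument.

```python
def charles_request_kwargs(ca_bundle_path, proxy_url="http://localhost:8888"):
    """Keyword arguments for requests.get/post that route through Charles
    and verify HTTPS against Charles's root certificate.

    Assumptions: Charles listens on localhost:8888, and ca_bundle_path
    points at the root certificate exported from Charles in PEM format.
    """
    return {
        "proxies": {"http": proxy_url, "https": proxy_url},
        # verify= takes a path to a CA bundle, so certificates signed by
        # Charles's root pass verification while checking stays enabled.
        "verify": ca_bundle_path,
    }


def get_https_via_charles(url, ca_bundle_path):
    """GET an HTTPS URL through Charles, trusting its root certificate."""
    import requests  # third-party; imported lazily

    return requests.get(url, **charles_request_kwargs(ca_bundle_path))
```

Passing the certificate path keeps verification on, which is safer than reaching for `verify=False` just to silence the error.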

Now that we've configured our code to use the proxy, and trusted the certificate, let's use it!

Example Microservice

I'd like to introduce an example microservice architecture, which we can use to demonstrate Charles's capabilities.

It's Unix fortune - as written in 2017.

If you're not familiar with fortune, it's a lovely little program included with most Linux distributions. You just type fortune on the command line and you get a pithy little quote like this:

Christopher@Tamiasiii: ~/Code/pybay> fortune
To err is human, to moo bovine.

But command-line programs that ‘just work’ are 1970s technology. It's 2017, and now we know that to make a really good program you need to use microservices.

I happen to run an API on my frivolous personal website that returns fortune results. That will serve as our "partner API integration". So I present to you, fortune written with microservices:

```

#!/usr/bin/env python
"""Implements Unix Fortune using 3 microservices and an API"""
import sys
from flask import Flask, request, render_template_string
import requests

FETCH_URL = "http://localhost:9001/fetch"
RENDER_URL = "http://localhost:9002/render"
role = sys.argv[1]
app = Flask(role)


@app.route("/")
def main():

    fortunes = []
    for _ in range(0, 3):
        fortunes.append(requests.get(FETCH_URL).content)

    return requests.post(RENDER_URL, data={'fortunes': fortunes}).content


@app.route("/fetch")
def fetch():
    return requests.get("http://mcscope.com/fortune").content


@app.route("/render", methods=["POST"])
def render():
    fortune_template = "<div style='margin:100px;'> {{fortune}} </div>"
    fortune_divs = [render_template_string(fortune_template, fortune=fortune)
                    for fortune in request.form.getlist("fortunes")]

    body_template = """
    <html>
        <body>
            <div style="width:50%; margin:auto; margin-top:100px;">
                {{"".join(fortune_divs)|safe}}
            </div>
        </body>
    </html>
    """
    return render_template_string(body_template, fortune_divs=fortune_divs)

if __name__ == '__main__':
    app.run(port=int(sys.argv[2]), debug=True)
```

So there are three microservices in here, each running just a single endpoint: "/", "/fetch", and "/render". For simplicity, they'll all run the same code, but we'll only hit one path on each.

There's a main app microservice, a backend data sourcing microservice (which hits the public Fortune API that I run) and a server-side rendering microservice, all of which cooperate in order to serve up 3 pithy fortunes.
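To make the layout concrete, here is one way the three processes might be started from the single script above, which reads a role name and a port from the command line. Ports 9001 and 9002 come from `FETCH_URL` and `RENDER_URL`. The filename `fortune.py`, the role names, and port 9000 for the main app are assumptions for illustration.

```python
import subprocess
import sys

# (role, port) pairs. 9001 and 9002 match FETCH_URL and RENDER_URL in the
# script; the role names and port 9000 are assumptions.
SERVICES = [("main", 9000), ("fetch", 9001), ("render", 9002)]


def service_commands(script="fortune.py"):
    """Return one argv list per microservice: python <script> <role> <port>."""
    return [[sys.executable, script, role, str(port)] for role, port in SERVICES]


def start_all(script="fortune.py"):
    """Launch every service as a child process and return the Popen handles."""
    return [subprocess.Popen(cmd) for cmd in service_commands(script)]
```

With all three running, a browser request to the main app's port fans out into three fetch calls and one render call.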

Here's a Charles screenshot showing these all cooperating on my computer.

Cooperating Microservices

You can see that for each of these calls, we’re able to inspect the timing of these requests, with statistics, as well as their content. These APIs serve raw strings, but Charles can understand and display JSON, Protobufs, XML and even SOAP in a smart, nested way.

Cooperating Microservices

Now you can stop writing print(response.body) forever!

Inspection is all well and good, but let’s go further.

Repeat aka "▲ for the web"

Let's say we are working on a patch to change the behavior of just the rendering microservice.

I’m often doing this - working on a patch that just affects one part of one service. In order to test that service, I need to convince some other service to talk to it.

That other service can have its own complications like caching or business logic that can influence when it decides to talk to other services. So now, I gotta figure out how to work around that, maybe change my parameters to it each time so it won't cache. Now I’m doing this snakecharming act with an upstream microservice when I’m trying to work on something completely unrelated.

Enter Charles!

Charles lets you repeat any of these requests, so you can just repeat the one call that activates your service.

Replay

In this example, I’m just repeating the call the main app microservice made to the rendering microservice. I don't have to invoke the entire machinery - my snakecharming days are behind me.
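For comparison, this is roughly what that one repeated call looks like if you rebuild it by hand. The sketch uses only the standard library and assumes the render service is on port 9002, as in `RENDER_URL` above. The helper names `build_render_request` and `replay_render` are made up.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

RENDER_URL = "http://localhost:9002/render"  # same value as in the script above


def build_render_request(fortunes, url=RENDER_URL):
    """Build the form-encoded POST that the main app sends to /render."""
    # doseq=True repeats the key once per list item (fortunes=a&fortunes=b),
    # which is what request.form.getlist("fortunes") reads on the server.
    body = urlencode({"fortunes": fortunes}, doseq=True).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def replay_render(fortunes):
    """Send the request and return the rendered HTML as text."""
    with urlopen(build_render_request(fortunes)) as response:
        return response.read().decode("utf-8")
```

Charles saves you from writing and maintaining this kind of throwaway script, because the captured request already carries the exact headers and body.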

If you've done any command line programming, you've probably run into the concept of "▲", or "Up Arrow". It's one action you can take to quickly test your change.

"Edit. Save. ▲ Enter. Repeat"

Charles lets you do an up-arrow for the web.

After enough repeats, I got it working!

Repeat - fixed!

Bug Reports

Sometimes something breaks and it’s not under your control, or maybe you just don’t have time to fix it at this moment.

When you’re filling out a bug report, you have to write exact reproduction requests. Sometimes that can get complicated.

Instead of that, attach a Charles log showing the bug. When someone goes to address the bug, reproducing is as simple as clicking “repeat”.

Bug Report!

No more page-long JIRA tickets with repro instructions!

I also use the Save Session feature to record traffic between my own microservices that is difficult to recreate - for instance, I have some complicated flight searches saved and whenever I want to test them, I can open up my saved session and start reissuing requests.

API Exploration

It's not just useful for replaying existing calls; you can also use it to explore APIs.

Here I'm exploring the 'partner' fortune API. The documentation says it supports a -o "offensive" flag, but does it really? Using Charles, I can call that API and get a proof of concept that the API responds like the documentation says, without changing a line of code.

Api Exploration!

Load Testing

I'm wondering if my fortune API will hold up under load. I can use 'Repeat Advanced' to issue hundreds of calls to it in a short time.

Load Test!

Looks like the API is slowing down!

Load Test results

(That was a version with a deliberately introduced bug for demonstration... I promise!)
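The same kind of burst can be approximated in code when you want numbers you can script against. This is a rough sketch using only the standard library. The function names and the `iterations` and `concurrency` parameters are my own choices for the example, not a reproduction of what Charles does internally.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_once(url):
    """Fetch url once and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urlopen(url) as response:
        response.read()
    return time.perf_counter() - start


def repeat_concurrently(call, iterations=100, concurrency=10):
    """Run call() `iterations` times across `concurrency` worker threads.

    Returns the list of call() results in submission order.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(call) for _ in range(iterations)]
        return [future.result() for future in futures]


def summarize(durations):
    """Return (min, mean, max) for a non-empty list of durations."""
    return min(durations), sum(durations) / len(durations), max(durations)
```

For example, `summarize(repeat_concurrently(lambda: fetch_once("http://localhost:9001/fetch")))` would time a hundred calls against the local fetch service. Only point it at services you own.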

Breakpoints / Rewriting / Throttling

The other day, I was working on some error-handling code, specifically a circuit breaker. I wanted to test how my code adapted when the partner API had an outage, but I have no control over their system. Oh wait, yes I do! I set up my environment, pointed it at Charles, and wrote a throttling rule that would add 30 seconds of latency to every request.

Timeouts galore!

You could also set up some throttling to test how your page would load on a mobile network, or dial-up, etc.
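To show the kind of code that throttling helps exercise, here is a generic circuit breaker sketch. It is not Hipmunk's implementation. The class, its parameters, and the thresholds are all assumptions chosen for illustration. The idea is that after enough consecutive failures (for instance, timeouts caused by the added latency), calls are skipped until a cool-down period has passed.

```python
import time


class CircuitBreaker:
    """Open after `max_failures` consecutive failures; retry after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=60.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock  # injectable so tests can control time
        self.failures = 0
        self.opened_at = None

    def is_open(self):
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.reset_after:
            # Cool-down elapsed: close the circuit and allow a fresh attempt.
            self.opened_at = None
            self.failures = 0
            return False
        return True

    def call(self, func, *args, **kwargs):
        """Run func unless the circuit is open; count consecutive failures."""
        if self.is_open():
            raise RuntimeError("circuit open: skipping call")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result
```

With a 30-second throttle in Charles and a shorter client timeout, every wrapped call fails, so you can watch the breaker open without touching the partner's system.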

Here's another example, where I have a failing service but I want to keep testing the rest of the system. I use a rewrite rule to make it appear to the rest of the services like it succeeded.

Rewriting the failure to a success!

Turn that frown upside down!

500 Internal Server Error becomes 200 OK !

Rewriting result!

This could be good if you have a flaky API, or even an API that returns variable results, and you would like to test it with a consistent response.

The rewriting tools are extremely powerful, and I have just begun to scratch the surface of them.

The tool "Map Local" lets you serve up content from a local folder as though it came from a server, and the tool "Mirror" helps you build that local folder from browsing.

There is a lot more that it can do; here's a list from charlesproxy.com.

Is Charles Free?

Nah, it's proprietary nagware. However, it's a useful tool. The licences are cheap (~$50) and it's maintained by a single developer. I bought Hipmunk a site license. I wasn't compensated by XK72, the Charles developer, for writing this. I just like the software!

Fin.

I hope you try out Charles and find it useful. If you do, drop me a line (christopher at hipmunk.com) and let me know!

The content from this blog post was adapted from a longer talk I gave at PyBay about debugging tools. The talk was titled "Python Debugging with PUDB, Charles and cProfile". The slides are here and a video will be published soon.