Python 3.6 and the Road to Static Typing

I've become a huge fan of compile-time type safety and I feel foolish for ignoring it for as long as I did.

I've built a lot of tools and backend web services in Python that rely on other internal services, JSON data, and whatever else. Unit tests run fine (once I properly figured out how to use unittest.mock), luring me into a false sense of safety. I happily deploy my app, QA does their check, and I do some spot testing. A few hours later, alerts go off due to unhandled exceptions. The top errors being:

My precious code, breaking at noobish scenarios. Clearly there was an unexpected data format that my code failed to check for when turning JSON into a dictionary object. Does the key value exist? Is it a string? Is it null?
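To make those three questions concrete, here is a minimal sketch using only the standard library json module. The payloads and the shout and try_shout helpers are made up for illustration; they are not taken from any real service.

```python
import json


def shout(raw: str) -> str:
    """Naive handler: assumes 'value' exists and is a string."""
    doc = json.loads(raw)
    return doc['value'].upper()


def try_shout(raw: str) -> str:
    """Run shout() and report the exception type name if it blows up."""
    try:
        return shout(raw)
    except (KeyError, AttributeError) as exc:
        return type(exc).__name__


results = [
    try_shout('{"id": 1, "value": "A String"}'),  # well-formed document
    try_shout('{"id": 1}'),                       # the key is missing
    try_shout('{"id": 1, "value": 42}'),          # the value is not a string
    try_shout('{"id": 1, "value": null}'),        # the value is null
]
print(results)
```

Only the first payload makes it through. The other three parse as perfectly valid JSON and only fail once the code touches the field, which is exactly the kind of thing a unit test with hand-written fixtures tends to miss.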

These errors were actually mitigated by using the fantastic library Cerberus. Cerberus lets you write a schema and then validate a document against that schema. That alone is fine and will catch all (???) cases of bad data ruining your code.

That's all well and good. The root of the issue is that we've all been lulled into complacency, using dictionaries, arrays, and hashes as if they're totally free parts of a language. Take this simple dictionary:

a = {
  'id': 1,
  'value': "A String",
  'nested': {
    'whoa-nested': [1,2,3]
  }
}

I want you to take some time to think about how this dictionary would be written in C++ or heck, even Java. You're looking at a lot of hashmaps and vectors and pushing and whatever else. But it all appears to be free because Python (and Node, Ruby, etc.) have it built into the syntax.

Of course, you simply wouldn't do it this way in C++. You'd have a struct or class that abstracts your object and a function that loads it from JSON using some kind of node traversal. It would all be much more verbose, but what do you get out of it? Type safety. When your code sees that struct, it knows exactly what fields are available and what type they are.

Can we do this in Python? Sure, you can, and I've been trying to do it more and more. I find myself writing a lot of boilerplate like:

class SomeObject:
    def __init__(self, id: int, value: str) -> None:
        self._id = id
        self._value = value

    @property
    def id(self) -> int:
        return self._id

    @property
    def value(self) -> str:
        return self._value

The downsides aren't that hard to enumerate. It's verbose, it's unpythonic, there's room for a myriad of spelling errors, and ultimately it still doesn't protect you against runtime type mismatches. You can run your code through Mypy, and it'll make sure your own code plays nice with itself, but that's all. How can I really make sure that SomeObject is only ever instantiated with validly typed arguments?
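One way to approach that question, as a rough sketch rather than a recommendation, is to put isinstance checks in the constructor and add a loader that turns raw JSON into the object. The from_json classmethod and the error messages below are my own additions for illustration. It makes the class even more verbose, which is sort of the point of the complaint.

```python
import json


class SomeObject:
    def __init__(self, id: int, value: str) -> None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(id, int) or isinstance(id, bool):
            raise TypeError('id must be an int, got %s' % type(id).__name__)
        if not isinstance(value, str):
            raise TypeError('value must be a str, got %s' % type(value).__name__)
        self._id = id
        self._value = value

    @classmethod
    def from_json(cls, raw: str) -> 'SomeObject':
        """Hypothetical loader: parse JSON and validate before constructing."""
        doc = json.loads(raw)
        if not isinstance(doc, dict):
            raise TypeError('expected a JSON object')
        try:
            return cls(doc['id'], doc['value'])
        except KeyError as exc:
            raise ValueError('missing field: %s' % exc.args[0]) from None

    @property
    def id(self) -> int:
        return self._id

    @property
    def value(self) -> str:
        return self._value


obj = SomeObject.from_json('{"id": 1, "value": "A String"}')
print(obj.id, obj.value)
```

With this in place, bad data fails loudly at the boundary where the JSON comes in, instead of somewhere deep in the code that later reads obj.value. Mypy still only checks the callers it can see; the isinstance checks cover everything it can't.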

I guess what I really want out of Python is a clear way to make type-safe struct-alikes like this. And I think it's getting there. Python 3.5 introduced me to the world of the hybrid type checking approach: a dynamic runtime, but check beforehand with Mypy. 3.6 is adding more typing syntax, namely annotations on variables and class attributes (PEP 526). I'm excited to see what comes to future versions of Python, because it really seems like they're starting to appreciate the safety you get out of compile-time typing.
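As a sketch of what a struct-alike can look like with the 3.6 syntax, here is the earlier dictionary written with the class form of typing.NamedTuple. The Nested class name and the whoa_nested field (renamed from 'whoa-nested', since a hyphen isn't valid in an identifier) are my own choices for illustration.

```python
from typing import List, NamedTuple


class Nested(NamedTuple):
    whoa_nested: List[int]


class SomeObject(NamedTuple):
    id: int
    value: str
    nested: Nested


obj = SomeObject(id=1, value='A String', nested=Nested(whoa_nested=[1, 2, 3]))
total = sum(obj.nested.whoa_nested)
print(obj.id, obj.value, total)

# The annotations are not enforced at runtime: this constructs without error.
# It is Mypy, not the interpreter, that would complain about these arguments.
bad = SomeObject(id='oops', value=None, nested=None)  # type: ignore
```

That gets rid of the property boilerplate and the fields are read-only, but it is still the hybrid approach: the types only mean something if Mypy actually runs over the code.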

Rust and Swift are also exciting for this reason. Go, not so much. For some reason they dropped the ball when it came to typing (no nullable types?) so you're still having to do a lot of runtime checks yourself. Rust and Swift have this stuff down pat, and I've been seriously considering building new services in Swift. The future of new languages is really exciting and I can't wait to see what comes of it.