One thing about being a developer on the Palantir Finance product that doesn’t get nearly enough publicity is the fact that we have our own programming language. I’m pretty excited about it so let me repeat, with emphasis: we have our own programming language. Yeah, it’s awesome. All those late hours you spent in the lab working on your final project in compilers: turns out they’re actually good for something other than getting into grad school.
Building this language ourselves (as opposed to, say, using an existing language that already just works) wasn’t an easy decision. In fact, it wasn’t even a single decision. We racked our collective brain dozens of times trying to think of a better approach. But every which way we sliced it, the problems we needed to solve always pointed to building our own language. I still question this decision sometimes, but on the whole I’m very happy with how things have turned out.
The Palantir Finance programming language, Hedgehog as we know it, is an interpreted, statically typed, object-oriented language. With a syntax based loosely on Java, it mixes roughly Java-style semantics with a few idiosyncrasies that make it a really interesting case study in language design. It’s built to be extremely efficient for batch operations on time series, which is the heavy lifting in financial analysis. It also allows you to dynamically add methods to a class from outside the class itself (conceptually similar to Ruby’s mixins): you define the function and its input type, and when you type the dot operator, your new method is auto-completed alongside all the “native” methods. Hedgehog also has a vast number of effectively global constants: all the stocks, bonds, and other financial instruments that are essential to the user experience, but that make for quite a design challenge.
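To make the "add a method from outside the class" idea concrete, here is a rough analogy in Python rather than Hedgehog (whose syntax isn't shown here). Everything in it is an assumption for illustration: the TimeSeries class, the extend decorator, and the spread function are hypothetical names, and Python is dynamically typed where Hedgehog is statically typed, so take it as a sketch of the concept only.

```python
# Illustrative Python analogy, not Hedgehog. All names are hypothetical.

class TimeSeries:
    """A tiny stand-in for a built-in type with one "native" method."""

    def __init__(self, values):
        self.values = list(values)

    def mean(self):
        return sum(self.values) / len(self.values)


def extend(cls):
    """Return a decorator that attaches a function to cls as a method."""
    def register(func):
        setattr(cls, func.__name__, func)
        return func
    return register


# Defined outside the class: the first parameter plays the role of the
# "input type", and the function becomes callable with the dot operator.
@extend(TimeSeries)
def spread(series):
    return max(series.values) - min(series.values)


prices = TimeSeries([10.0, 12.5, 11.0])

# What a completion popup could offer after typing "prices."
method_names = [name for name in dir(prices) if not name.startswith("_")]

print(prices.spread())
print(method_names)
```

The point of the sketch is the last two lines: once registered, the new method is called and discovered exactly like the ones the class shipped with.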
I’m not a language guy myself, so instead of continuing to geek out over the core language features, I want to geek out about an emergent property that’s truly unique to the Hedgehog language. But first I’m going to back up and talk about something else that’s really important to us at Palantir: user experience. (I’ll get back to languages, I promise.)
There’s a UX principle that says your interface should be “low threshold, high ceiling”. That is, it should be easy for the user to get started, but also able to do powerful things. This is actually a corollary of a more general principle: that your interface should strive for the optimal learning curve. My first CS professor explained this with a set of three diagrams, each representing one of the major OS families. I don’t remember exactly how he drew these diagrams at the time, but an updated version of them might look like this:
The x-axis of each curve represents “wizardry,” a measure of the user’s technical sophistication. The y-axis represents the power of the system — how much the user can accomplish at a given level of wizardry.
The best of the three curves, my prof argued, was the third curve. The first learning curve is great for providing incentives to learn. Each unit of effort spent to increase your wizardry yields an appropriate amount of reward or power. The drawback is that it’s hard for new users to do anything useful; its reward threshold is too high. The middle curve has a lower threshold and is better for novice users, but will frustrate an intermediate user because of the great plateau in the middle. (This might represent a place where the GUI isn’t powerful enough for the tasks you want to accomplish but scripting is still too difficult, leaving no way to express your commands.) The third curve, however, is the best of both worlds: a low threshold and a smooth trajectory to the top.
Now let’s apply this back to our topic at hand, programming languages. Specifically, what does the learning curve look like for learning a first language? (Once you’ve learned one, of course, the rest come pretty easily.)
If your experience of learning to program was anything like mine, the first few projects in your first language were painful. You could sense the power further up the curve — it’s what convinced you to stick with CS — but simple tasks took a lot more effort than they should have, at the beginning.
Hedgehog, on the other hand (our little homebrew that will someday have its own Wikipedia page), has the smoothest learning curve I’ve ever seen in a programming language. That’s the emergent property I wanted to talk about, because it’s a thing of beauty. You can get started with Hedgehog right away and accomplish quite a bit without even knowing that you’re “programming,” and the slope of the curve stays relatively constant throughout your trajectory.
We didn’t realize it at the time, but we were probably destined to create a low-threshold, high-ceiling language with a smooth learning curve, due to the nature of our user base. Financial analysts are impatient, and they still need to perform many kinds of complicated analysis. They definitely don’t have the time or inclination to spend a semester learning how to program. The solution to their problem is Hedgehog.
Allow me to illustrate with one of the earliest things a user might type into the expression bar:
And that’s it. The user types a ticker symbol and he gets a chart of IBM’s stock price. At no point did he have to wonder about variables or types or #includes. This experience is so frictionless he probably doesn’t even realize he’s writing code in a programming language. He just starts with what he knows, and the system gives him what he wants.
It starts to get interesting as you move further up the curve. Take this user input:
Of course that innocent dot between “IBM” and “volume” means a method invocation to anyone who’s familiar with C++ or Java. But to a new Palantir Finance user it simply means, “Let me access all the types of data associated with IBM.” Conceptually painless.
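For the curious, here is one way a bare ticker and a dotted expression could both resolve without the user declaring anything. This is a Python sketch, not how Hedgehog is actually implemented: the MARKET_DATA table, its numbers, and the InstrumentNamespace class are all made up for illustration. The idea is that tickers behave like global constants, but each one is only materialized the first time somebody types it.

```python
# Illustrative Python, not Hedgehog. Names and numbers are made up.
from dataclasses import dataclass


@dataclass(frozen=True)
class Stock:
    ticker: str
    price: float
    volume: int


# Hypothetical stand-in for a market-data store.
MARKET_DATA = {
    "IBM": {"price": 125.0, "volume": 4_500_000},
    "MSFT": {"price": 30.0, "volume": 60_000_000},
}


class InstrumentNamespace(dict):
    """Resolves ticker names on first use instead of predefining them all."""

    def __missing__(self, name):
        if name not in MARKET_DATA:
            raise KeyError(name)  # unknown names surface as NameError in eval
        stock = Stock(ticker=name, **MARKET_DATA[name])
        self[name] = stock  # cache so later lookups are plain dict hits
        return stock


def evaluate(expression, namespace):
    """Evaluate what the user typed, with tickers as the only known names."""
    return eval(expression, {"__builtins__": {}}, namespace)


namespace = InstrumentNamespace()
ibm = evaluate("IBM", namespace)                # just the ticker
ibm_volume = evaluate("IBM.volume", namespace)  # the "innocent dot"

print(ibm)
print(ibm_volume)
```

Only IBM ends up in the namespace after these two lookups; MSFT stays untouched until someone asks for it, which is one plausible way to cope with a very large set of global names.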
Or how about this one?
The volume/1000 expression is an anonymous method acting in the scope of a Stock object; it’s syntactic sugar for “return this.volume()/1000;”. But by allowing the user to strip away all the unnecessary syntax, we make learning the language that much easier.
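The desugaring from volume/1000 to return this.volume()/1000; can be mimicked in a few lines of Python. This is a sketch under stated assumptions, not Hedgehog's actual mechanism: the Stock class, the ImplicitThis helper, and the sample volume are hypothetical, and the rule that a bare name means "call the zero-argument method of that name on the current object" is my reading of the example above.

```python
# Illustrative Python, not Hedgehog. Names and numbers are hypothetical.

class Stock:
    def __init__(self, ticker, volume):
        self.ticker = ticker
        self._volume = volume

    def volume(self):
        return self._volume


class ImplicitThis(dict):
    """Name lookup that falls back to the target object, like an implicit 'this'."""

    def __init__(self, target):
        super().__init__()
        self._target = target

    def __missing__(self, name):
        try:
            attr = getattr(self._target, name)
        except AttributeError:
            raise KeyError(name)  # lets eval report an ordinary NameError
        # A bare name stands for a zero-argument method call: volume -> this.volume()
        return attr() if callable(attr) else attr


def evaluate_in_scope(expression, target):
    """Evaluate a bare expression as if it were the body of a method on target."""
    return eval(expression, {"__builtins__": {}}, ImplicitThis(target))


ibm = Stock("IBM", 4_500_000)
sugared = evaluate_in_scope("volume/1000", ibm)
explicit = ibm.volume() / 1000  # the long form the user never has to write

print(sugared, explicit)
```

Both forms give the same number; the difference is only in how much the user has to know about methods, receivers, and return statements before getting an answer.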
I could go on tracing the curve here (I’ve only scratched the surface), but I hope I’ve made my point: we coax new users into writing code by making it look as much as possible like performing operations that they already intuitively understand. This is one of the benefits of creating a domain-specific language — we got the richness of the domain for free, and all the understanding that comes with it — and then we went above and beyond the simplification of a traditional DSL to really pare down the complexity of the language for novice users.
From simple beginnings like the ones I’ve shown here, it doesn’t take our users long at all to cross the threshold to more intermediate-level work, such as chaining function calls together or creating their own methods. As far as the high ceiling goes, we’re still working on it, but the language is currently capable of producing not only a quine, as one of our candidates showed us (yes, we ended up hiring him), but also code that can generate studies like the one below:
So Hedgehog has a low threshold and a smooth learning curve, and the ceiling is high enough that our users can do some really serious information processing with it — tasks that would make their other tools break down and cry. But there’s still a lot of interesting work for us to do, especially in pushing the language’s ceiling higher (developing better interactive debugging; working with large objects efficiently) — and as always, making it faster.