Decision Time
Author’s note
Overly long piece advocating for creating custom programming languages for various Amazon-specific problems, with Erlang as an example of a language that solves a specific problem that's hard in C++.
This led to some teams adopting Erlang at Amazon, interestingly. I think it was abandoned after a few years.
When I went to Google, I was surprised to find that they use custom languages all over the place. Borgcfg, their Turing-complete production configuration language, is the most notorious. But they have others, as we had to support them all in Grok/Kythe. But by and large, Google proved my thesis true. Was nice to be there.
AI Notes
Written April 2005, twelve months into a working year of language-tour blogging at Amazon, and treated as a summing-up. The opening recaps the five most-read essays from the previous twelve months; the middle is a working list of Amazon-specific problems that C++, Java, and Perl don't solve cleanly: N-box distributed systems, time-series, custom wire protocols, configuration at scale, build- and cycle-time. Each item has a recent production scar attached. The body is the proposal: adopt a new language properly, the way Amazon adopted Perl, Java, and C++ before, and put a small dedicated team on the libraries, tools, documentation, and porting work it takes to make the language usable for a thousand engineers. Erlang and Gambit Scheme are the top picks (concurrency story plus the ability to grow abstractions); Ruby, Common Lisp, OCaml, Haskell behind. The argument is organisational, not technical. The languages exist; the library and tool support doesn't, and waiting for the open-source community to deliver it hasn't worked. The silver bullet exists — you have to buy enough of them to arm a whole army.
Read against what Amazon actually did (which is not this), it's one of the clearest documented moments of an engineer asking a large engineering organisation to make a strategic move it wasn't going to make. It also worked as a prediction: Erlang's concurrency story did turn out to matter, and the JVM languages did eventually absorb most of what Steve wanted. And it's a record of a kind of in-house technical writing — careful, fully referenced, addressed at one's own management — that mostly no longer exists on the open web.
Related listings
-
2005
Tin Foil Hats
Six weeks earlier. Tin Foil Hats is Steve picking for himself; Decision Time is Steve asking the company to pick for everyone.
-
2005
Choosing Languages
The methodology behind the pick. Choosing Languages lays out the criteria; Decision Time applies them and names candidates.
-
2005
Duck Season
Same month. Duck Season is the equalizer; Decision Time is the moment Steve points at the slider arrangement Amazon actually needs.
From the peanut gallery
-
Continuous optimization. This is a huge and unexplored area of supply chain that I am on task to work on. I haven't worked on it yet, and while tentatively we've planned on using Java (anything is better than C++!), I am 100% open to leapfrogging my performance.
The issues I have found with non-supported languages at Amazon are as follows:
Manager/Peer pressure. "You want to use WHAT?!" Having a cheering section to at the very least cheer you up helps.
Database interoperability. The DBAs don't allow you to connect to Oracle without using APLOracle? Crap! One solution is to use MySQL, which is what I may consider.
Service and communication interoperability. Rarely any important system stands alone. You need to hit up remote cat, you need to capture and inject pubsub messages, you need to access HTTP/BSF services, etc.
All of these are easily solvable, but not by just one person with a day job.
I want to do this, we need to talk more obviously!
PS: I've been talking about these things for a while, just not on a blog, and also I've been stuck with the 3 problems outlined above. I think the hardest problem (#1) for me won't be a big issue anymore, but I still need help.
-
For what it's worth, C++, Perl and Java were all disallowed at Amazon early on. Each of them succeeded in overcoming popular resistance by accumulating a critical mass of developers who eventually just pressed forward and started getting real work done in these languages.
Initially it was just ANSI C, our in-house web-templating language, and Emacs-Lisp for the original Customer Service app suite (which was quite popular for many years, actually). Oh, and the inevitable handful of scripting languages that are essential in Unix -- shell-script, awk, etc. All other languages were strictly forbidden.
Perl broke through first, and they had the easiest time doing it, because (a) there were many problems for which Perl was arguably the best option in 1995-1998, and (b) Perl is kiiind of soooort of a standard on Unix, so it was hard to keep it out.
I personally think Python would have been a smarter choice; it's the choice Google made, so obviously it doesn't cause grievous ills for highly-scalable problem spaces. But Perl had Python beat in the CGI space for quick-and-dirty internal web apps, for Unix integration, and overall popularity.
C++ came next. The "why can't I use C++?" griping started to ramp up heavily from mid-1998 to end-1998, and then it started appearing in big projects like Customer Master. Of course it *immediately* (as in, on the very same project) began causing portability/compatibility headaches -- we had to do a GCC-to-CXX migration for BEA compatibility, then migrate back later during PARCS. It was months of work and caused no end of bugs. But the C++ fans pressed forward fairly aggressively, once they realized they outnumbered the original Amazon architects and could brazenly flaunt their non-portable template tricks.
Java took much longer. It was used first in SCOS, and their framework was adopted by CS, but Swing didn't play well with the existing xterm-based network architecture, so it was incredibly sluggish through most of 1999, and I believe it was the infrastructure guys who put up the most resistance, saying there was no possible way we could afford the hardware costs to make Java scale. For all I know, they may have been right; I'm just recounting the sequence of events as I remember it.
I don't think Java transitioned to a first-class language at Amazon (where you could use it with impunity) until... mid-2003, maybe? Pretty recently. And again, it was largely just a big mass of competent Java programmers who finally just forced the issue. All they needed to do was prove that it worked. They were carrying the pagers for their own systems, and they were hitting deadlines, and the systems were performing adequately for the most part, depending on who you asked.
Even today, though, it remains to be seen whether someone could sell the idea of running a Java 5 JVM instance on essentially every online and production machine. Many people are still highly skeptical of Java's value-add, and take a "run it on your box, but not MY box" stance. I personally can sympathize with both sides of the argument, but generally speaking, I'd favor replacing C++ with Java.
I should make the important point that I trust Java and the JVM. Regardless of how I feel about Java's long-term viability, I think that being a Java shop (plus Ruby or Python for scripts, of course) would be a huge improvement over being a C++ shop, and could easily carry us for 4 to 6 years. Unfortunately, Java seems to be losing some ground to C++, for a very good reason: having a unified platform for our primary websites is far more important than the particular language choice; just ask anyone who has to deploy to three or more platforms today.
C++, Java and Perl all have a lot going for them: they're well-documented, stable, mature languages with plenty of libraries, big communities, and reasonable interoperability options. At least we're no worse off than the rest of the industry.
But my hope is that a bunch of us can agree to team up, follow the example set by the Perl, C++ and Java early-adopters in years past, and share the pain of driving a higher-quality language to Amazon-level production readiness.
I envision more than just incremental improvement. I think that if we make good choices, and bust our butts on it for a while, we could put ourselves in a position to do the really fancy stuff Amazon always dreams about, like building self-regulating systems with reinforcing feedback loops that need far less human intervention. Right now I just don't see much of that happening. I know it's a minority opinion at the moment, but I hold C++, Perl and Java largely accountable for our lack of progress on this front.
But I can guarantee a fair amount of mundane work up front -- we'll have to solve issues like your #2 and #3 before we can move on to more interesting problem domains. (FWIW, the Ruby folks here have quietly gone and solved at least #3 already. That's one of the side-benefits of using a cool high-level language: you can generally get stuff working really fast, with very little code, and it's more fun too.)
-
Problems #2 and #3 can be fun anyways. I don't dislike working on platform issues, especially when you are working towards a magnificent new future.
So if it is your hope that some people will team up, any suggestions for those of us who have the projects and the time to team up?
-
Well, I'd like to give it a couple days, to give more people time to read this, think it over, and comment. In the meantime, I'm finally forcing myself to learn Erlang and Gambit Scheme, since they have uniquely scalable concurrency support that doesn't seem to exist in any other languages out there.
In both Gambit and Erlang, it's easy to fire up half a million to a million active threads on a single workstation, passing messages, each of them able to block on I/O at will without affecting any of the others. You Just Can't Do That with Java (or Ruby, or Python, or any "normal" multi-threaded or multi-processing language I'm aware of). It makes approaching the problem of thousands of concurrent connections a lot easier when the language has that level of concurrency support built-in.
If we were to choose any language other than Gambit or Erlang, it would be a prerequisite to be able to modify the language runtime to give it Erlang-style concurrency -- and get the language maintainers to accept the patch, so we don't have to maintain it in-house. Depending on the language, that could be nontrivial, to say the least.
Hence my initial predisposition towards Gambit. Although it would need a lot of library work, it has a surprisingly strong core set of language features and tools. It would be a good foundation to build on. But I need to study it for a few days to get a feel for it.
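As a rough illustration of the message-passing style described above (not Erlang or Gambit code, and not part of the original thread), here is a minimal sketch in Python using asyncio, a library that postdates this discussion. asyncio tasks are cooperatively scheduled coroutines on one OS thread rather than Erlang-style processes, so this only approximates the model. The names worker and run_ring and the chain size are assumptions made for the example.

```python
import asyncio


async def worker(inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Wait for one message, add 1, and pass it on to the next worker."""
    value = await inbox.get()    # suspends only this task, not the others
    await outbox.put(value + 1)


async def run_ring(n_workers: int) -> int:
    """Chain n_workers tasks with queues and send one message through them."""
    first = asyncio.Queue()
    inbox = first
    tasks = []
    for _ in range(n_workers):
        outbox = asyncio.Queue()
        tasks.append(asyncio.create_task(worker(inbox, outbox)))
        inbox = outbox           # the next worker reads what this one writes
    await first.put(0)           # inject the message at the head of the chain
    result = await inbox.get()   # read it back from the tail
    await asyncio.gather(*tasks)
    return result


if __name__ == "__main__":
    # 10,000 tasks is an arbitrary demo size, far below the half-million
    # figure quoted in the comment above.
    print(asyncio.run(run_ring(10_000)))
```

Each task here waits on its own queue without stalling the rest, which is the property the comment singles out. Unlike Erlang processes, though, a task that never awaits would block all the others.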
-
I learned Scheme in university. While I wasn't 100% enthusiastic about it at the time, I didn't have the difficulty some other people did with the highly functional and recursive nature of it. I never understood why tail-recursion is better until a few years later.
But one thing we never touched on was macros. Does Scheme have macros? Is that why you are predisposed to Gambit Scheme? From the respective sites, it appears that Erlang is a bit more, ahem, production ready than Gambit Scheme. Maybe I read the wrong site though?
Hmm, we could instead try to hire Matz and dedicate a few engineers to getting Ruby 2.0 out the door :)
— Andrew W · April 29, 2005 08:35 PM
That's not such a bad idea. But I think he's already being paid to work on Ruby 2 and the VM full-time; probably all hiring him here would accomplish is make him lose 2-3 months from his schedule.
— Steve Yegge · April 29, 2005 09:58 PM
Gambit Scheme tracks the R4RS and IEEE Scheme definitions, which don't specify a macro implementation. However, it does implement both Common-Lisp-style macros and a superset of the hygienic macros specified in R5RS.
— Derek U · April 29, 2005 08:21 AM