The Rise and Fall of Programming Language X

A Philosophy of Programming Language Design

Doron Sadeh
11 min read · Aug 17, 2019

This is not a PL/I tutorial.

In the 1950s and early 1960s, business and scientific users programmed for different computer hardware using different programming languages. Business users were moving from Autocoders via COMTRAN to COBOL, while scientific users programmed in Fortran, ALGOL, GEORGE, and others. The IBM System/360 (announced in 1964 and delivered in 1966) was designed as a common machine architecture for both groups of users, superseding all existing IBM architectures. Similarly, IBM wanted a single programming language for all users. It hoped that Fortran could be extended to include the features needed by commercial programmers. In October 1963 a committee was formed, composed originally of three IBMers from New York and three members of SHARE, the IBM scientific users group, to propose these extensions to Fortran. Given the constraints of Fortran, they were unable to do this and embarked on the design of a "new programming language" based loosely on ALGOL labeled "NPL". This acronym conflicted with that of the UK's National Physical Laboratory and was replaced briefly by MPPL (MultiPurpose Programming Language) and, in 1965, with PL/I (with a Roman numeral "I"). The first definition appeared in April 1964.

“A Genealogy of Computer Languages” by James Haddock says: “In 1964, IBM was developing its System/360. Never happy with ALGOL, they wanted to have a dialect of their own as a system implementation language, but with the ability to handle COBOL style applications. The result was PL/1 (they also copyrighted the names PL/2 through PL/100 just in case).”

They (IBM) actually thought they had delivered the ultimate programming language. We all know what happened next.

Ever since the ’50s, people have been creating programming languages. One would expect that such a flurry of creativity would, at some point, converge into a universal formal language. Not so. Very much not so.

The number of languages in existence today is in the hundreds and the list keeps growing. Hey, there’s even a language whose vocabulary is made of Arnold Schwarzenegger one-liners.

Why do people keep inventing new languages?

The Right Tool for the Task

Try to tighten a loose screw with a hammer and you’ll damage something. The same goes for formal languages. There is no one size fits all.

A programming language, much like natural language, allows a wide but not an unlimited spectrum of expressions. Some natural languages, for example, have a rich set of nouns describing in detail the objects relevant to their native speakers’ natural habitat, while others have few or none, as their native speakers never needed (or maybe never even met) the objects so important to the former.

Navajo, for example, has ten different verbs for “carrying”, each relating to the shape and physical properties of the object being carried, while having only a few for expressing mental states (“to think”, “to be angry”). English, on the other hand, is extremely rich in the latter, while poor in the former.

Another interesting feature in the Navajo language is its embedded hierarchy of beings. In Navajo, humans count as higher than large intelligent animals, while those are higher than small animals, which in turn count as higher than plants and inanimate objects.

Why?

The answer is that a language, natural or formal, is a tool, and as such, it needs to be comfortably applicable to the problem at hand. If the task of carrying is so important to the Navajo, it is only reasonable that they would want to describe it in as few words as possible, rather than waste a full sentence ever so frequently. You, being a Navajo, would hate repeating the well-known (to you) fact that humans are higher than animals and plants; it is simply an annoying waste of time.

The same goes for formal languages. A formal (programming) language is a tool that allows thinking about a problem and expressing a solution to that problem. If your vocabulary and language metaphors cannot describe the problem, you may still find creative ways of expressing yourself; however, you’ll waste a lot of time, much like an English speaker trying to order in a French restaurant.

SANScript, A Lesson Learned

Back in 1999, I was given a seemingly simple task. I was asked to design a new scripting language that would allow rapid and easy programming of service-aware networks. For those of you not versed in the world of DPI (Deep Packet Inspection), let me put it this way. It was intended to be a scripting language allowing almost any programmer to write code that analyzes millions of Internet connections concurrently at the application level, tracking their messages and high-level events (such as web browsing or even the content of the page browsed, VoIP calls, DDoS attack detection, etc.), and possibly modifying them in flight, while fed by nothing but the raw packets flowing on the wire.

This “scripting” language quickly became a full-fledged programming language that allowed even the simplest of programmers to write such applications while disregarding almost all complexities of the underlying packet-based jungle.

The language, fondly termed SANscript — Service Aware Networking Scripting Language — later to become SML (now owned by Cisco), was designed as a DSL, a domain-specific language, whose primary objective was to hide the complexity of the packet domain from the programmer without limiting her power of expression in building new and interesting application-level (a.k.a. Layer 7) solutions for real-world problems.

Creating the language, I had to reconcile two seemingly contradictory requirements (hiding complexity while keeping expressiveness), which resulted in a lightweight object-oriented language with additional structures and idioms allowing the programmer to believe she is in a serialized universe while the underlying data was heavily concurrent (both temporally and data-wise).

It was in fact a non-multi-threaded programming paradigm hiding the complexity of the underlying multiple concurrent data flows. And it was efficient and expressive: a freshman programmer, after a short learning curve, could easily write, in a few days, a production-level DDoS detection application, or one that analyzed VoIP calls, running over millions of TCP/IP streams.
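SANscript itself is proprietary and not reproduced here, but the flavor of that illusion can be sketched in plain Go (all names below — Packet, handleFlow, the flow IDs — are mine, for illustration only): a dispatcher demultiplexes the packet stream by flow, and each flow’s handler is written as ordinary sequential code, never aware of the thousands of siblings running beside it.

```go
package main

import "fmt"

// Packet stands in for a raw packet already tagged with its flow ID.
type Packet struct {
	FlowID  string
	Payload string
}

// handleFlow sees only its own packets, in order: a serialized universe.
func handleFlow(id string, in <-chan Packet, done chan<- string) {
	for p := range in {
		// Plain sequential, per-flow logic goes here (protocol analysis, etc.).
		fmt.Printf("flow %s: %s\n", id, p.Payload)
	}
	done <- id
}

func main() {
	// In reality these would arrive off the wire, millions of flows at once.
	packets := []Packet{
		{"flow-A", "SYN"}, {"flow-B", "SYN"},
		{"flow-A", "GET /index.html"}, {"flow-B", "INVITE sip:alice"},
	}

	flows := map[string]chan Packet{}
	done := make(chan string)

	// The dispatcher hides the concurrency: demultiplex packets by flow
	// and feed each flow's private, ordered channel.
	for _, p := range packets {
		ch, ok := flows[p.FlowID]
		if !ok {
			ch = make(chan Packet, 16)
			flows[p.FlowID] = ch
			go handleFlow(p.FlowID, ch, done)
		}
		ch <- p
	}
	for _, ch := range flows {
		close(ch)
	}
	for range flows {
		<-done
	}
}
```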

How was this made possible?

The Elements

Every programming language must tackle the four elements of computing, namely:

  • Memory management
  • Concurrency
  • Safety
  • Performance

To make things even more complicated, those underlying elements are usually intertwined in unpredictable ways. Getting it wrong is like messing around with the four elements of nature: you’ll end up cold, wet, and on the brink of disaster. As we’d like to avoid such cataclysmic events, allow me to concisely explain each of them, and how they interrelate (if you feel you know enough about the above, feel free to skip to the next section).

Every programming language must provide a way for the programmer to reference memory. Different languages differ in the ways they allow expressing memory references. Older languages like C place memory in your hands: grab a chip and hammer on, and if you mess up, it’s your problem. Java, on the other hand, with its state-of-the-art garbage collector, took the opposite approach, taking complete responsibility for managing memory.

But memory management must take into account how the language deals with concurrency. If it doesn’t, you may end up performing concurrent tasks that unintentionally access the same memory area, corrupting program data outside the programmer’s control.
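To see what that corruption looks like, here is a minimal Go sketch (Go is only my choice of illustration here): one hundred goroutines bump a shared counter, first with no coordination, then behind a mutex.

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	const workers = 100
	var wg sync.WaitGroup

	// Unsafe: concurrent tasks touch the same memory with no coordination.
	// `go run -race` flags this as a data race; updates silently get lost.
	unsafeCounter := 0
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			unsafeCounter++ // read-modify-write on shared memory: not atomic
		}()
	}
	wg.Wait()

	// Safe: the language gives you a tool (a mutex) to serialize access.
	safeCounter := 0
	var mu sync.Mutex
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			safeCounter++
			mu.Unlock()
		}()
	}
	wg.Wait()

	// The unsafe count may be anything up to 100; the safe one is always 100.
	fmt.Println(unsafeCounter, safeCounter)
}
```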

To complicate things even further, memory management is also tightly coupled with safety. That is, thread-safety (i.e. not allowing concurrent threads to corrupt each other’s variables), scoping (e.g. keeping a function’s or object’s internal members safe from corruption by external code), object encapsulation, and closures.
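A small sketch of the scoping and closure side of that list, again in Go and with a hypothetical NewAccount of my own making: the balance variable is reachable only through the closures that own it, and a mutex keeps concurrent callers from corrupting it.

```go
package main

import (
	"fmt"
	"sync"
)

// NewAccount hides its state behind closures: external code cannot touch
// `balance` directly (scoping), and the mutex makes updates thread-safe.
func NewAccount() (deposit func(int), read func() int) {
	balance := 0
	var mu sync.Mutex

	deposit = func(amount int) {
		mu.Lock()
		defer mu.Unlock()
		balance += amount
	}
	read = func() int {
		mu.Lock()
		defer mu.Unlock()
		return balance
	}
	return deposit, read
}

func main() {
	deposit, read := NewAccount()

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			deposit(10)
		}()
	}
	wg.Wait()
	fmt.Println(read()) // always 100: no outside code can corrupt the balance
}
```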

This intricate dance between the three is usually where programming languages differ. Designing one, you have to make choices that, like fractals, are highly sensitive to your starting conditions (basic assumptions), your priorities, and most importantly, the goals your language has to achieve.

Those, in turn, would grow your language metaphors, namely the idioms the programmer would have to use (and think with) while applying your language.

And there’s performance.

Nowadays, most programmers take performance to simply mean: “get more cloud CPUs”. But that was not always the case, and in fact it is not the case today either.

More cloud CPU power costs money, so building on the assumption that you’ll just throw money at the problem might earn you an early demise, as your runway ends just before you’ve scaled enough for the next infusion of dollars.

Hence, performance must be taken into account. But only when it matters.

A large scale operation that must provide real-time services to billions of people around the globe, such as Dropbox or Netflix, must care about performance, while a web application which serves a few thousand requests per second may not. It all depends on your use case. Current and future.

However, performance is useless if the code you write is full of bugs, memory corruptions, concurrency havoc, and the like. You may have achieved great speed, but you’re going to spend your weekends debugging a production system under immense pressure, hammered by both management and customers.

Tough choices.

This is why language designers take great care with language specifications until they believe they have the elements in balance. Well, at least balanced according to their original goals.

Language Theocracies

Have you ever been in a “which is the best programming language” discussion (which is the polite way of saying: a squabble)?

If you’ve been in this business long enough, you’ve seen it more than once. Fresh-out-of-school programmers clinging to a programming language just out of the oven, bashing seasoned ones as outdated. Others, born and raised on a specific discipline (say, OOP), refusing to let go of it while the problem screams “use another language”.

Why can’t they all just get along?

Because none of them accepts the basic truth. That different languages were designed with different goals in mind and should be used accordingly.

That a problem may require a combination of languages, as there is no one size fits all. Or simply, that there is no language that snugly fits the problem, and you must choose the closest candidate, accepting the fact that you’ll have to grind your teeth from time to time.

To become a better programmer, one must learn to make such distinctions as early as the design stage. As difficult as it may seem to choose the right language or set of languages for each sub-problem domain, it is much easier to rule out languages which may be cool (or hip, or hipster, or whatever new term they are going to make up for being unique and so far ahead of those programmers using languages a year old).

What Not to Use

First, never use a language because it is new and cool. It’s a hipster’s mistake. If it’s cool, and you love it, spend your free time coding in it till you get the feel of it, the balance of the elements. Only then consider it.

Second, define the problem you are trying to solve in broad strokes. Is it CPU intensive? I/O intensive? Does it need scale? Does it need scale now? Would it benefit from concurrency? Does it rely on concurrency? Is performance important? Is performance important now? I can go on but you get it, right?

Once you have your answers, you can easily rule out those religious arguments usually given by single-language fanatics. You can partition the problem into sub-problems, each requiring its own language. And you can even plan your implementation phases according to the current priority (e.g. no scale) vs. the future one (large scale), while taking the “from here to there” cost into account.

Sounds perfect, right? Yet almost no one bothers.

Common Mistakes

The most common mistake in selecting the right language or set of languages is the “It’s easier to find programmers for X”. It may be easier to get them, but each would have to work harder to bend their mind and the problem into the unfitting language metaphor you enforced on them. It’s like asking a university professor who spent his life writing papers in biology to promptly write a paper in computer science. He’ll get it done, but it will take him twice the time, and it will be poorly written.

The second most common mistake is “this is the language I am familiar with”. Who cares? Learn a new one. If you already know a few, you’ll quickly adapt to the new one. The time spent on its learning curve would be negligible compared to the time you’ll save using the right set of idioms and language metaphors.

The third most common is “It is a cool and hip new language”. It may be, but it would be stupid to waste your time (or your employer’s money) trying to learn it while you solve the problem, when you don’t even know if the language fits(!).

That said, reality has shown that most programmers go for a combination of the above, with catastrophic results in lost time and runway.

Let me give you but a few examples.

JavaScript (and now also TypeScript) NodeJS programmers are a dime a dozen. Everybody wants to code in JS/TS. The major reason for that is that JS/TS is easy to learn, and you can make as many mistakes as you’d like (notably, fewer in TS) and the language will simply, well… do something. It is just about the worst choice for a large-scale, multi-service (or micro-serviced) backend with multiple data flows and multiple endpoints. Yet, many choose to go that way simply due to mistake number one.

What could they have done differently, you ask? Well, it depends on their problem. If they need a quick implementation of web-based endpoints, they are welcome to use the JS/TS NodeJS combination. If they also need a data-science-based backend, they would be wise to use (containerized) Python micro-services.

Should they need a distributed implementation that may benefit from error-free message passing as its underlying orchestration model, they should probably use Go.
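As a hedged illustration of that message-passing style (within a single process; a real distributed system would put a network transport such as gRPC or a message queue between the stages — my assumption, not a claim from the text), Go’s channels let stages own their data and hand it on as messages instead of sharing memory:

```go
package main

import "fmt"

// produce emits work items and closes the channel when done.
func produce(out chan<- int) {
	for i := 1; i <= 5; i++ {
		out <- i
	}
	close(out)
}

// square receives items by message, transforms them, and passes them on.
// No shared mutable state exists between the stages.
func square(in <-chan int, out chan<- int) {
	for n := range in {
		out <- n * n
	}
	close(out)
}

func main() {
	nums := make(chan int)
	squares := make(chan int)

	go produce(nums)
	go square(nums, squares)

	for s := range squares {
		fmt.Println(s)
	}
}
```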

I have even seen MATLAB code being properly encapsulated and used in large-scale production environments where it was the best choice at the time.

Rust may be used, as Dropbox chose to use it, for extremely large-scale systems where the cost of using it (a steep learning curve, extremely unfriendly ownership metaphors, and so on) was lower than the safety benefits it provided.

Even C, yes, that old thing, may come in handy when you need to speed things up (and don’t want to dwell on Rust for a few months till you get everybody up to speed), and, god forbid, assembly (or shaders, as they are nicknamed there), for small kernels such as the ones used in general-purpose GPU programming.

And the list goes on.

Some Final Words to the Wise

If you don’t need performance, use languages that make your life easier (garbage-collected), your bug rate lower (strongly typed), and your code more readable (enforced documentation).

If, on the other hand, blazing performance is your business, you may (if it fits) use GPUs with CUDA, or write shader kernels, and play around with raw memory management until you are blue in the face.

If you develop a simple dynamic website with some DB at the backend and simple logic, use NodeJS and TypeScript (don’t even start with JS). If you scale, you’ll simply add more nodes, and you have your public-facing web app taking more traffic in no time.

For data science, use Python due to its immense library base in the field.

If you build a distributed system with massive data flows, Go may be your choice (depending on the problem, as it is less expressive), or even Rust (or V, its freshly minted competitor).

None of this means you should treat the above as an instruction manual, rather as a set of examples.

As the Buddha said a long time ago, I can only show you the path I walked. You, however, need to walk your own path.

Choose wisely.

The author has spent the last 27 years coding in just about every language and role: from low-level programming of his Sinclair Spectrum 48K in Z80 assembly at fifteen, to founder and startup CTO, designing and implementing large-scale systems, programmable DPI, natural language processing, and more.
