The Tower of Babel -- A Comparison of Programming Languages

By: Eric Suh with large additions by the webmaster, based on an article that originally appeared in Code Journal

Today's computer programmer has many languages to choose from, but what's the difference between them? What are these languages used for? How can we categorize them in useful ways?



These days, programming languages are becoming more and more general and all-purpose, but they still have their specializations, and each language has its disadvantages and advantages.

Languages can generally be divided into a few basic types, though many languages support more than one programming style. The following list isn't all-inclusive or as fine-grained as possible, but it brings out some of the basic design decisions behind languages.

Language Types

  • Procedural
    Procedural programming is probably the style you're used to: a procedural language executes a sequence of statements that leads to a result. In essence, a procedural language expresses the procedure to be followed to solve a problem. Procedural languages typically use many variables and make heavy use of loops and other elements of "state", which distinguishes them from functional programming languages. Functions in procedural languages may modify variables or have other side effects (e.g., printing out information) in addition to returning a value.
  • Functional
    Employing a programming style often contrasted with procedural programming, functional programs typically make little use of stored state, often eschewing loops in favor of recursive functions. The primary focus of functional programming is on the return values of functions, and side effects and other means of storing state are strongly discouraged. For instance, in a pure functional language, a function that is called is expected not to modify any global variables or perform any output. It may, however, make recursive calls and change the arguments passed to those calls. Functional languages are often simpler syntactically and make it easier to work on abstract problems, but they can also be "further from the machine" in that their programming model makes it hard to understand exactly how the code is translated into machine language (which can be problematic for system programming).
  • Object-oriented
    Object-oriented programming views the world as a collection of objects that have internal data and external means of accessing parts of that data. The goal of object-oriented programming is to think about a problem by dividing it into a collection of objects that provide the services needed to solve it. One of the main tenets of object-oriented programming is encapsulation -- that everything an object will need should be inside the object. Object-oriented programming also emphasizes reusability through inheritance, and it uses polymorphism to make it possible to extend current implementations without having to change a great deal of code.
  • Scripting
    Scripting languages are often procedural and may contain elements of object-oriented languages, but they fall into their own category because they are typically not meant to be full-fledged programming languages with support for large system development. For instance, they may lack compile-time type checking and may not require variable declarations. Typically, scripting languages require little syntax to get started but make it very easy to make a mess.
  • Logic
    Logic programming languages allow programmers to make declarative statements (possibly in first-order logic: "grass implies green" for example) and then allow the computer to reason about the consequences of those statements. In a sense, logic programming is not telling the computer how to do something, but placing constraints on what it should consider doing.
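
The contrast between the procedural and functional entries above can be sketched in a few lines. This is only an illustration, written in Python because it permits both styles; the function names are made up for this example and do not come from any library.

```python
# Procedural style: a loop updates a variable ("state") step by step.
def sum_of_squares_procedural(numbers):
    total = 0                      # state that changes as the loop runs
    for n in numbers:
        total += n * n
    return total


# Functional style: nothing is reassigned; recursion replaces the loop,
# and the result depends only on the argument passed in.
def sum_of_squares_functional(numbers):
    if not numbers:                # base case: an empty list sums to 0
        return 0
    return numbers[0] * numbers[0] + sum_of_squares_functional(numbers[1:])


print(sum_of_squares_procedural([1, 2, 3]))   # prints 14
print(sum_of_squares_functional([1, 2, 3]))   # prints 14
```

Both functions return the same value; the difference is whether the answer is built up by changing a variable or by combining the return values of recursive calls.
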
To call these categories "language types" is really a bit misleading. It's possible to program in an object-oriented style in C, or a functional style in a scripting language. In truth, most modern languages incorporate features and ideas from multiple domains, which only serves to increase the richness and usefulness of these languages. Nevertheless, most languages do not excel at all styles of programming.
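
As a rough illustration of how one language can mix styles, the sketch below uses Python classes for encapsulation and polymorphism, then combines the objects with a single functional-style expression. The class and function names (Shape, Circle, Rectangle, total_area) are invented for this example.

```python
import math


class Shape:
    """Base class: callers rely only on area(), never on the internal data."""

    def area(self):
        raise NotImplementedError


class Circle(Shape):
    def __init__(self, radius):
        self._radius = radius          # internal data, private by convention

    def area(self):
        return math.pi * self._radius ** 2


class Rectangle(Shape):
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def area(self):
        return self._width * self._height


def total_area(shapes):
    # Polymorphism: each object supplies its own area() implementation.
    # The sum itself is written as one expression, with no mutated variables.
    return sum(shape.area() for shape in shapes)


print(total_area([Rectangle(2, 3), Rectangle(1, 1)]))   # prints 7
```

Adding a new kind of shape means writing one more subclass; total_area does not need to change.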

The Languages

C++ is well-suited for large projects because it has an object-oriented structure. People can collaborate on one program by breaking it up into parts and having a small group or even one individual work on each part. The object-oriented structure also allows code to be reused a lot, which can cut down development time. C++ is also a fairly efficient language - although many C programmers will disagree.

C is a popular language, especially in game programming, because it doesn't have the extra packaging of the object-oriented C++. Programmers use C because it makes programs slightly faster and smaller than programs written in C++. You might wonder, however, whether it's worth giving up the reusability of C++ to get the small increase in performance with C, especially when C++ can, where necessary, be written in a C programming style.

Pascal is primarily a teaching language. Few industrial programs are written in Pascal. Pascal tends to use keywords instead of C-style braces and symbols, so it is a bit easier for beginners to understand than languages like C++. Still, not everyone thinks Pascal is just for the schools. Borland, the huge compiler software company, has been pushing Delphi as an industrial strength programming language. Delphi is an object-oriented version of Pascal, and currently, only Borland compilers use it.

Fortran is a number-crunching language, and it is still used by scientists because of its strong built-in support for numeric arrays and floating-point arithmetic. Fortran is especially convenient for engineers, who have to mathematically model and compute values to high precision. Fortran, however, isn't nearly as flexible as C or C++. Programming in Fortran can be rigid, and its older fixed-form dialects impose strict rules on whitespace and formatting, which sometimes makes reading Fortran programs difficult.

Java is a multi-platform language that is especially useful in networking. Of course, the most famous usage of Java is on the web, with Java applets, but Java is also used to build cross-platform programs that stand alone. Since it resembles C++ in syntax and structure, learning Java is usually quite easy for most C++ programmers. Java offers the advantages provided by object-oriented programming, such as reusability; on the other hand, it can be difficult to write highly efficient code in Java, and Swing, its primary user interface toolkit, is notoriously slow. Nevertheless, Java has increased in speed in recent years, and version 1.5 offers some new features for making programming easier.

Perl was originally a file management language for Unix, but it has become well known for its use in CGI programming. CGI (Common Gateway Interface) is a term for programs that web servers can execute to give web pages additional capabilities. Perl is great with regular expression pattern matching, which is a method for searching text. Perl can be used for databases and other useful server functions, and it is simple to pick up the basics if you have experience in any imperative language. Web hosting services prefer Perl over C++ as a CGI language because the web hosts can inspect Perl script files, since they're just text files, while C++ is compiled, so it can't be inspected for potentially dangerous code. Perl is, however, notorious for its "write-only" style of code -- it's very easy to write Perl scripts that take advantage of lots of shortcuts and that you later cannot understand.
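
To give a flavor of regular expression pattern matching, here is a small sketch. It is written in Python rather than Perl, using Python's standard re module; the helper name find_ip and the sample log line are invented for illustration.

```python
import re

# Four groups of one to three digits separated by dots. This checks only the
# shape of the text, not whether each number is in the valid 0-255 range.
IP_PATTERN = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")


def find_ip(text):
    """Return the first dotted-quad address found in text, or None."""
    match = IP_PATTERN.search(text)
    return match.group(0) if match else None


print(find_ip("GET /index.html 200 from 192.168.0.7"))   # prints 192.168.0.7
```

The pattern describes what the text should look like, and the regular expression engine does the searching.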

PHP is a common language for webpage design that is sometimes used as a scripting language in *nix. PHP is designed for rapid website development, and as a result contains features that make it easy to link to databases, generate HTTP headers, and so forth. As a scripting language, it contains a relatively simple set of basic components that allow the programmer to quickly get up to speed, though it does have more sophisticated object-oriented features.

LISP is a functional language used mostly in computer science research. LISP is unusual in that it stores (nearly) all data in lists, which are like arrays, but without index numbers. The syntax for lists is very simple, making it easy for programmers to implement complex structures.
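
A list "without index numbers" can be pictured as a chain of pairs, each holding one item and the rest of the list. The sketch below models that idea in Python with nested tuples; it is an illustration of the concept, not LISP code, and the helper names (cons, from_sequence, length, nth) are chosen for this example.

```python
# A list is either None (empty) or a pair: (first item, rest of the list).
def cons(first, rest):
    return (first, rest)


def from_sequence(items):
    """Build a chain of pairs from an ordinary Python sequence."""
    result = None
    for item in reversed(items):
        result = cons(item, result)
    return result


def length(lst):
    # Recursive definition: an empty list has length 0.
    return 0 if lst is None else 1 + length(lst[1])


def nth(lst, n):
    # There is no direct indexing: reaching item n means following n links.
    while n > 0:
        lst = lst[1]
        n -= 1
    return lst[0]


numbers = from_sequence([10, 20, 30])
print(numbers)           # prints (10, (20, (30, None)))
print(length(numbers))   # prints 3
print(nth(numbers, 2))   # prints 30
```

Because every list is just "an item plus another list", nested structures such as trees can be built from the same two-part pairs.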

Scheme, a well-known variant of LISP, has a slightly simpler syntax and not quite as many features. A common joke is that any large project undertaken in Scheme will result in the reimplementation of most of LISP. Nevertheless, Scheme is quite popular in academic circles and is the introductory language of MIT's computer science department (and is taught as part of Harvard's introductory sequence). Scheme's simplicity makes it a good way to get started solving problems instead of worrying about programming language syntax.

Of course, there are still many, many languages not discussed, a few major ones being Prolog, Tcl, Python, COBOL, Smalltalk, and C#. Those are generally related or similar to the programming languages I have described above. The take-home message is that each programming language has its own advantages and disadvantages, and picking the appropriate language for the task is often an important step in the process of developing an application or program.