a2h programming tutorial v1.1, 12/05/2002
--- LOG ENTRY FOR THIS VERSION: $Log$ ---

This text is meant to serve as an introduction to various programming languages, and to try to at least partially answer the question "What's a good language to learn to program in?". As such it is necessarily very brief in its descriptions, with each language being given only a paragraph or two in overview. In this short space I will attempt to explain what a language is used for and how accessible it is, and also to give some references where I have been able to find them. If you are thinking of taking up programming, it is therefore necessary to do your own research. Internet search engines such as http://www.google.com are invaluable for this sort of thing.

Once you have decided on a course of action, it would be worth reading a lot about the language you have chosen. In most cases there will be documentation with the product - whether online or as a manual - but there are books available too. In general O'Reilly's (http://www.oreilly.com) books tend to be fantastic as references, and have cool pictures on the covers ;-). Their books are usually aimed toward users of Unix systems, whereas Sams (http://www.samspublishing.com) have a definite Windows leaning in their "Teach yourself...in 21 days" series (apart from the C for Linux programming volume). Sams' books introduce their subject one concept at a time, and some of the books don't do much to tie the concepts together, so learning from them can be a little jerky, but they do cover all the most useful material.

Note that not all languages are covered here (by a long chalk). If anyone would like to contribute a new section, or provide advice/amendments to existing sections, then I can be contacted at the address in this message's headers. A current version of this document will be maintained at http://users.ox.ac.uk/~wadh1342/, and posted in alt.2600.hackers periodically.

So here we go, in no particular order.

ASSEMBLY: Assembly language consists of a series of mnemonics (memory aids) that map one-for-one onto instructions used internally by a microprocessor. As such, this is the least portable language that exists, as the code is specifically written for one type of computer. Programming in assembly requires a decent knowledge of how the processor in the target machine works, and also of the operating system in use on it. These days the computer world sees the language as a necessary evil when you need to interface with a computer on the lowest possible level, with higher-level languages such as C being preferred wherever possible. For information on x86 assembly development, see "Computer Organization and Assembly Language Programming: For IBM PCs and Compatibles" 2nd Ed., Thorne, 1990 Benjamin Cummings.

BASIC: Short for Beginner's All-purpose Symbolic Instruction Code, this language was the flavour of the decade in the 1980s for providing a user interface on microcomputers. Its syntax is similar to the English language (e.g. if condition then statement else statement), but many flavours don't have any form of structured programming (goto most certainly doesn't count) and are thus frowned upon by modern day programming-style Nazis. The only implementation in widespread use today is Visual Basic (http://www.microsoft.com), used as a RAD (Rapid Application Development) tool and scripting language for Microsoft's Office applications. It is still possible to find implementations of qbasic or asic (it's nearly basic) for MS-DOS, and bwbasic for Linux is bundled with some distributions. BASIC comes in both interpreted (executed line-by-line by an external program) and compiled (converted into executable machine code) varieties.

BRAINFUCK: Yes, you read that correctly. Brainfuck was originally developed for the Amiga in 1993, and consists of just eight commands. There is one pointer, which can be moved to any of (I think) 30000 integers in an array.
The value it points at can then be incremented or decremented. Very simple to learn, bastard impossible to do anything useful with. Go on, prove me wrong...

C: Derived from B, itself derived from BCPL (Basic Combined Programming Language), C is one of the most widespread languages used in development environments. This is because although it is a high-level language, C allows the programmer to interface with the computer at a more fundamental level by accessing memory directly. This can make some of C's concepts tricky for beginners to learn, as they are quickly introduced to the concept of pointers (variables that encode the location in memory of other variables). C is a strongly typed language. This doesn't mean that you need to mash the keyboard to write code; it refers to the fact that you need to define explicitly what type of variable you want to use (integer, float, character, structure, array...) and cannot sneakily treat it as something else (at least, not without an explicit cast). For instance, you can't pass a pointer to a function that expects a float.

C does not take naturally to the concept of Object Oriented Programming (OOP, or O-O), so Bjarne Stroustrup created the C++ language. Its name is an in-joke among developers, as the ++ operator increments its operand by one. Hence C++ is one better than C (strictly it should be written ++C; but who am I to argue?). C++ brings with it the concept of function overloading, whereby a function can be defined more than once with different parameters. When the function is called, the arguments passed determine which version of the function to run.

For more information on C and its family, take a look at newsgroups such as alt.comp.lang.learn.c-c++, comp.lang.c, comp.lang.objective-c and comp.lang.c++. Be warned: these groups are intensely strict on protocol, and will flame anyone who asks a FAQ or off-topic question. Questions on compilers are not welcome in these groups.
The definitive reference on C++ is "The C++ Programming Language" 3rd Ed., Stroustrup, 1997 Addison Wesley (a.k.a. The Bible).

COBOL: The COmmon Business Oriented Language. This is one of the earliest compiled languages (apart from, perhaps, Fortran), designed to run business applications such as database reporting on mainframes. The syntax is very old-looking, and the mainframe heritage leads to some very awkward limitations on - for instance - I/O. Cobol is definitely still supported on Unix and Windows by Fujitsu, and the main providers of development tools for the language are Micro Focus. The language spec is up-to-date; COBOL has bindings to the open standard .NET platform and Object-Oriented versions exist. There are people at comp.lang.cobol, and a book, "Cobol: From Micro to Mainframe" 3rd Ed., Grauer et al, 1998 Prentice Hall, comes with an evaluation version of the Fujitsu compiler.

JAVA: Surprisingly enough, Java does not stand for Just Another Vague Acronym. In terms of syntax, Java is very similar to C++ though its object model is considerably different. As Java was developed at Sun by James Gosling (originally under the name Oak), the Sun version is by far the most prevalent (although GNU and IBM both offer alternatives), and most code will be written for their API. Opinions on whether or not Java is easy to learn vary; I personally place it (on the sliding scale of OO languages) somewhere between C++ at the hard end and ObjC at the easy end, though the sheer size of Sun's API makes doing anything interesting in Java a very daunting task. The object model used in Java is heavily based on Objective-C and Smalltalk rather than on C++. The best source of free Java documentation is http://java.sun.com, and their series of textbooks documents just about everything. An easier read is O'Reilly's Java in a Nutshell.

LISP: Short for List Processing, Lisp isn't in common use any more but is mentioned here because it is the language Richard Stallman elected to give Emacs for its extensibility features. As such many diverse applications have ended up being written in the language, such as web browsers, newsreaders, virtual psychologists and chess games. Fledgling programmers will find Lisp daunting to begin with, as its syntax demands that more time is given to bracket-matching than to actually typing code. The best source for Emacs Lisp is the documentation produced by the Free Software Foundation (http://www.gnu.org); lurk in the newsgroup comp.lang.lisp for information on the language as a whole.

OBJECTIVE-C: Another C-variant, which merely wraps Smalltalk-esque Object-Oriented methodologies around the existing C language. In this respect it is easier for C programmers to pick up than C++. The Objective-C language is used primarily in Mac OS X/NeXTStep, although as it is part of the Free Software Foundation's GCC, ObjC compilers are available for many platforms. There are many good texts on Objective-C, but "The Objective-C Programming Language" is available free from http://developer.apple.com and is as good a place as any to start. Support from other programmers can be obtained from comp.lang.objective-c, but the flame wars in there are lively as Apple's friends and foes duke it out.

PASCAL: This procedural language stands somewhere between C and BASIC, in that its structure resembles C's but the syntax can be easier for beginners to grasp. Pascal has also found its way into the RAD market by way of Borland's (http://www.borland.com) Delphi for Windows and Kylix, Delphi's Linux sister. Pascal is often taught as an introduction to computing for non-computing students. A good book for Pascal under DOS/Windows is "Mastering Pascal and Delphi Programming", Buchanan, 1998 Macmillan Publishing.

PERL: Larry Wall's creation is a bit of a jack-of-all-trades, as it can be used to maintain large databases, simplify batch processing (shell script) tasks, or even maintain a file system. It has seen much use in Common Gateway Interface (CGI) scripts on the Web, due to its powerful string processing facilities. Wall originally created Perl to replace Unix tools such as sed, grep, awk and sh, but it has evolved far beyond this. Perl is an interpreted language and as such tends to only be used for relatively small applications. Object-oriented support arrived with Perl 5, built from packages and blessed references rather than dedicated class syntax. As it is a loosely typed language, beginners find Perl easier to deal with than C, as Perl is (usually) able to tell what kind of value you are trying to use and will act accordingly. Perl comes with its own online documentation - type 'perldoc perl' at a command prompt. The two most highly-acclaimed books on Perl are known colloquially as "The Llama Book" (though strictly it's an alpaca) and "The Camel Book". O'Reilly sell them under the slightly more pedestrian names of "Learning Perl" and "Programming Perl". A beginners' web site (http://learn.perl.org) maintains several newsgroups and mailing lists, including perhaps the least frequent tip of the day ever.

PYTHON: Python's remit is very similar to Perl's, though it understands objects and doesn't go in as heavily for regular expressions (the technical name for Perl-style string matching). Its variable types sound quite daunting (what the hell *is* a tuple?), but again the language is dynamically typed, so you never have to declare them up front. Python's abilities at handling lists give a nod to Lisp, but are a lot easier to use and include extensions on the simple list concept like dictionaries. Some people find Python's syntax easier to understand than Perl's, and as the two languages are reasonably similar for very small scripting tasks this can be useful. Python's documentation can be viewed by running the 'pydoc' command, and O'Reilly do a good series of books on the language.
The book for beginners is "Learning Python: Help for Programmers", Lutz and Ascher, 1999 O'Reilly.
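
As for what a tuple actually is, here is a minimal sketch of the tuple, list and dictionary types mentioned above. It assumes a current Python 3 interpreter (which this tutorial predates), and the names point, languages, books and nickname are made up purely for illustration.

```python
# A tuple is a fixed-size, read-only group of values.
point = (3, 4)
x, y = point                      # unpack the tuple into two names

# A list is an ordered sequence that can grow and shrink.
languages = ["Perl", "Python"]
languages.append("Lisp")          # the list now holds three items

# A dictionary maps keys to values (here, book title -> nickname).
# The entries are illustrative, borrowed from the Perl section above.
books = {
    "Learning Perl": "The Llama Book",
    "Programming Perl": "The Camel Book",
}

def nickname(title):
    """Return the nickname for a book title, or a placeholder if unknown."""
    return books.get(title, "(no nickname)")

# No type declarations: the same name can hold an int, then a string.
value = 42
value = "forty-two"

# Tuples really are read-only: assigning to an element raises TypeError.
try:
    point[0] = 10
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False

print(x, y, languages, nickname("Programming Perl"), value, tuple_is_mutable)
```

In short, reach for a tuple when the group of values should never change, a list when it should, and a dictionary when you want to look things up by name rather than by position.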