I was recently reading the C99 rationale (because why the fuck not) and I was intrigued by some comments that certain features were retired in C89, so I started wondering what other weird features existed in pre-standard C. Naturally, I decided to buy a copy of the first edition of The C Programming Language, which was published in 1978.
Here are the most interesting things I found in the book:
- The original “hello, world” program appears in this book on page 6.
In C, the program to print “hello, world” is
main() { printf("hello, world\n"); }
Note that return 0; was not necessary.
- Exercise 1-8 reads:
“Write a program to replace each tab by the three-column sequence >, backspace, -, which prints as ᗒ, and each backspace by the similar sequence ᗕ. This makes tabs and backspaces visible.”
… Wait, what? This made no sense to me at first glance. I guess the output device is assumed to literally be a teletypewriter, and backspace moves the carriage to the left and then the next character just gets overlaid on top of the one already printed there. Unbelievable!
- Function definitions looked like this:
power(x, n) int x, n; { ... return(p); }
The modern style, in which the argument types are given within the parentheses, didn’t exist in K&R C. The K&R style is still permitted even as of C11, but has been marked obsolescent for many years. (N.B.: It has never been valid in C++, as far as I know.) Also note that the return value has been enclosed in parentheses. This was not required, so the authors must have preferred this style for some reason (which is not given in the text). Nowadays it’s rare for experienced C programmers to use parentheses when returning a simple expression.
- Because of the absence of function prototypes, if, say, you wanted to take the square root of an int, you had to explicitly cast to double, as in: sqrt((double) n) (p. 42). There were no implicit conversions to initialize function parameters; they were just passed as-is. Failing to cast would result in “nonsense”. (In modern C, of course, it’s undefined behaviour.)
- void didn’t exist in the book (although it did exist at some point before C89; see here, section 2.1). If a function had no useful value to return, you just left off the return type (so it would default to int). A bare return; with no value was allowed in any function. The general rule is that it’s okay for a function not to return anything, as long as the calling function doesn’t try to use the return value. A special case of this was main, which is never shown returning a value; instead, section 7.7 (page 154) introduces the exit function and states that its argument’s value is made available to the calling process. So in K&R C it appears you had to call exit to do what is now usually accomplished by returning a value from main (though of course exit is useful for terminating the program when you’re not inside main).
- Naturally, since there was no void, void* didn’t exist, either. Stroustrup’s account (section 2.2) appears to leave it unclear whether C or C++ introduced void* first, although he does say it appeared in both languages at approximately the same time. The original implementation of the C memory allocator, given in section 8.7, returns char*. On page 133 there is a comment:
“The question of the type declaration for alloc is a vexing one for any language that takes its type-checking seriously. In C, the best procedure is to declare that alloc returns a pointer to char, then explicitly coerce the pointer into the desired type with a cast.”
(N.B.: void* behaves differently in ANSI C and C++. In C it may be implicitly converted to any other pointer type, so you can directly do int* p = malloc(sizeof(int)). In C++ an explicit cast is required.)
- It appears that stdio.h was the only header that existed. For example, strcpy and strcmp are said to be part of the standard I/O library (section 5.5). Likewise, on page 154 exit is called in a program that only includes stdio.h.
- Although printf existed, the variable arguments library (varargs.h, later stdarg.h) didn’t exist yet. K&R says that printf “is … non-portable and must be modified for different environments”. (Presumably it peeked directly at the stack to retrieve arguments.)
- The authors seemed to prefer separate declaration and initialization. I quote from page 83:
“In effect, initializations of automatic variables are just shorthand for assignment statements. Which form to prefer is largely a matter of taste. We have generally used explicit assignments, because initializers in declarations are harder to see.”
These days, I’ve always been told it’s good practice to initialize the variable in the declaration, so that there’s no chance you’ll ever forget to initialize it.
- Automatic arrays could not be initialized (p. 83)
- The address-of operator could not be applied to arrays. In fact, when you really think about it, it’s a bit odd that ANSI C allows it. This reflects a deeper underlying difference: arrays are not lvalues in K&R C. I believe in K&R C lvalues were still thought of as expressions that can occur on the left side of an assignment, and of course arrays do not fall into this category. And of course the address-of operator can only be applied to lvalues (although not to bit fields or register variables). In ANSI C, arrays are lvalues so it is legal to take their addresses; the result is of type “pointer to array”. The address-of operator also doesn’t seem to be allowed before a function in K&R C, and the decay to function pointer occurs automatically when necessary. This makes sense because functions aren’t lvalues in either K&R C or ANSI C. (They are, however, lvalues in C++.) ANSI C, though, specifically allows functions to occur as the operand of the address-of operator.
- The standard memory allocator was called alloc, not malloc.
- It appears that it was necessary to dereference function pointers before calling them; this is not required in ANSI C.
- Structure assignment wasn’t yet possible, but the text says “[these] restrictions will be removed in forthcoming versions”. (Section 6.2) (Likewise, you couldn’t pass structures by value.) Indeed, structure assignment is one of the features Stroustrup says existed in pre-standard C despite not appearing in K&R (see here, section 2.1).
- In PDP-11 UNIX, you had to explicitly link in the standard library: cc ... -lS (section 7.1)
- Memory allocated with calloc had to be freed with a function called cfree (p. 157). I guess this is because calloc might have allocated memory from a different pool than alloc, one which is pre-zeroed or something. I don’t know whether such facilities exist on modern systems.
- Amusingly, creat is followed by “[sic]” (p. 162)
- In those days, a directory in UNIX was “a file that contains a list of file names and some indication of where they are located” (p. 169). There was no opendir or readdir; you just opened the directory as a file and read a sequence of struct direct objects directly. An example is given on page 172. You can’t do this in modern Unix-like systems, in case you were wondering.
- There was an unused keyword, entry, said to be “reserved for future use”. No indication is given as to what use that might be. (Appendix A, section 2.3)
- Octal literals were allowed to contain the digits 8 and 9, which had the octal values 10 and 11, respectively, as you might expect. (Appendix A, section 2.4.1)
- All string literals were distinct, even those with exactly the same contents (Appendix A, section 2.5). Note that this guarantee does not exist in ANSI C, nor C++. Also, it seems that modifying string literals was well-defined in K&R C; I didn’t see anything in the book to suggest otherwise. (In both ANSI C and C++ modifying string literals is undefined behaviour, and in C++11 it is not possible without casting away constness anyway.)
- There was no unary + operator (Appendix A, section 7.2). (Note: ANSI C only allows the unary + and - operators to be applied to arithmetic types. In C++, unary + can also be applied to pointers.)
- It appears that unsigned could only be applied to int; there were no unsigned chars, shorts, or longs. Curiously, you could declare a long float; this was equivalent to double. (Appendix A, section 8.2)
- There is no mention of const or volatile; those features did not exist in K&R C. In fact, const was originally introduced in C++ (then known as “C With Classes”); this C++ feature was the inspiration for const in C, which appeared in C89. (More info here, section 2.3.) volatile, on the other hand, originated in C89. Stroustrup says it was introduced in C++ “[to] match ANSI C” (The Design and Evolution of C++, p. 128)
- The preprocessing operators # and ## appear to be absent.
- The text notes (Appendix A, section 17) that earlier versions of C had compound assignment operators with the equal sign at the beginning, e.g., x=-1 to decrement x. (Supposedly you had to insert a space between = and - if you wanted to assign -1 to x instead.) It also notes that the equal sign before the initializer in a declaration was not present, so int x 1; would define x and initialize it with the value 1. Thank goodness that even in 1978 the authors had had the good sense to eliminate these constructs… :P
- A reading of the grammar on page 217 suggests that trailing commas in initializers were allowed only at the top level. I have no idea why. Maybe it was just a typo.
- I saved the biggest WTF of all for the end: the compilers of the day apparently allowed you to write something like foo.bar even when foo does not have the type of a structure that contains a member named bar. Likewise the left operand of -> could be any pointer or integer. In both cases, the compiler supposedly looks up the name on the right to figure out which struct type you intended. So foo->bar if foo is an integer would do something like ((Foo*)foo)->bar where Foo is a struct that contains a member named bar, and foo.bar would be like ((Foo*)&foo)->bar. The text doesn’t say how ambiguity is resolved (i.e., if there are multiple structs that have a member with the given name).
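In modern C the cast that those compilers inferred from the member name has to be written out by hand. A minimal sketch of what that looks like, assuming a made-up struct Foo with members bar and baz (load_baz and baz_offset are hypothetical helper names, not anything from the book):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct standing in for the post's Foo. */
struct Foo {
    int bar;
    int baz;
};

/* Modern spelling of the old "any pointer -> member" access: the cast
   that K&R-era compilers inferred from the member name is explicit. */
int load_baz(const void *base)
{
    assert(base != NULL);
    return ((const struct Foo *) base)->baz;
}

/* Byte offset the compiler adds to the base address to reach baz. */
size_t baz_offset(void)
{
    return offsetof(struct Foo, baz);
}
```

Passing the address of a real struct Foo to load_baz is well-defined; pointing it at arbitrary memory, as the old code often did, is not something modern C promises will work.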
I am not sure if function prototypes were in the _original_ K&R but they definitely existed well before ANSI. They looked like this (quote from a working program):
int zmsndfiles(file_list*);
int zmsndfiles(lst)
file_list *lst;
{
…
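To show what a prototype buys you compared with the explicit sqrt((double) n) cast mentioned in the post, here is a small sketch. The function names (halve, halve_int, halve_int_cast, check_halve) are invented for illustration, and the definitions use the modern parameter syntax rather than the K&R one so that current compilers accept them.

```c
#include <assert.h>

/* Prototype: declares the parameter type before any call is compiled. */
double halve(double x);

/* The caller passes an int; with the prototype in scope the compiler
   converts it to double automatically. */
double halve_int(int n)
{
    return halve(n);
}

/* The same call with the explicit cast that first-edition K&R required. */
double halve_int_cast(int n)
{
    return halve((double) n);
}

double halve(double x)
{
    return x / 2.0;
}

/* Both spellings give the same answer once a prototype is visible. */
void check_halve(int n)
{
    assert(halve_int(n) == halve_int_cast(n));
}
```

Without the prototype, the first form would pass the raw int where the callee expects a double, which is the “nonsense” case the book warns about.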
Not sure why it’s unbelievable that you would print a character, then a backspace, then another character to get overprinting… back in the day, we did that all the time. It didn’t work on what we called “glass TTYs” i.e. “modern” serial terminals, but it worked on teletypewriters, most dot matrix or daisy wheel printers (yes, I remember them too). I also worked with a vector terminal a few times, and overprinting worked on it. In fact, it was a bear to figure out how to erase characters inline with that one.
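For reference, Exercise 1-8 from the post could be solved along these lines. This is only a sketch: it works on a string rather than on standard input, the name make_visible is made up, and it assumes the “similar sequence” for a backspace is <, backspace, -.

```c
#include <assert.h>
#include <stddef.h>

/* Copy src to dst, replacing each tab with '>' BS '-' and each backspace
   with '<' BS '-' so that an overstriking device shows them as arrows.
   dst must have room for 3 * strlen(src) + 1 bytes.
   Returns the number of characters written, not counting the NUL. */
size_t make_visible(const char *src, char *dst)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        if (*src == '\t') {
            dst[n++] = '>';
            dst[n++] = '\b';
            dst[n++] = '-';
        } else if (*src == '\b') {
            dst[n++] = '<';   /* assumed mirror image of the tab sequence */
            dst[n++] = '\b';
            dst[n++] = '-';
        } else {
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
    return n;
}
```

On a video terminal the '-' simply replaces the '>' or '<', which is why the exercise only makes sense on hardware that overstrikes.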
“return 0” from main is not necessary even now (C99).
Struct members were in the same global namespace, so there was no ambiguity possible.
The Unix v6 sources used this quite a bit for cheap type overloading. Weird when you first see it.
Guess you never used a paper terminal (or line printers). Overstriking was the norm. APL depended on it. There were magical overstriking video terminals too, such as the Tektronix 4014.
I believe void* first appeared in pcc, Steven Johnson’s portable C compiler, although I don’t remember the exact reason why. It might have been to allow explicitly an idea of “pointer to anything” in an era when a byte pointer and a word pointer might have very different representations, but that’s just a guess.
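The practical difference between a char*-returning allocator and a void*-returning one can be sketched as follows. old_alloc is a hypothetical stand-in for the first-edition alloc (here it just forwards to malloc), and new_int_kr and new_int_ansi are invented names.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the first-edition allocator, which returned char *.
   Hypothetical: it simply forwards to malloc. */
char *old_alloc(size_t nbytes)
{
    assert(nbytes > 0);
    return malloc(nbytes);
}

/* K&R style: the char * result must be cast to the wanted type. */
int *new_int_kr(int value)
{
    int *p = (int *) old_alloc(sizeof(int));

    if (p != NULL)
        *p = value;
    return p;
}

/* ANSI C style: malloc returns void *, which converts without a cast. */
int *new_int_ansi(int value)
{
    int *p = malloc(sizeof *p);

    if (p != NULL)
        *p = value;
    return p;
}
```

Both results are released with free; in C++ the second form would still need the cast.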
printf was special but so was nargs (number of arguments) a sorta-related function that needed to be implemented afresh with every new PDP-11 MMU, and always had bugs.
C with classes was, I believe, the first place not to require dereferencing a function pointer, perhaps because it made methods clumsy. It then appeared in pcc, although I think that was pretty much by accident. I remember asking for it to be added to some compiler for consistency as well as convenience, but I’m not sure whether it was pcc or VAX cc or maybe the 68020 pcc. DMR’s cc didn’t do it.
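A short sketch of the two call forms being discussed, plus the two ways of obtaining the pointer in ANSI C. The names twice and call_both_ways are made up for the example.

```c
#include <assert.h>

static int twice(int x)
{
    return 2 * x;
}

int call_both_ways(int x)
{
    int (*fp)(int) = twice;    /* function name decays to a pointer */
    int (*fq)(int) = &twice;   /* explicit address-of, allowed by ANSI C */
    int a = (*fp)(x);          /* K&R form: dereference, then call */
    int b = fq(x);             /* ANSI form: call through the pointer */

    assert(a == b);            /* both forms reach the same function */
    return a;
}
```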
Whoever broke directory I/O is my enemy for life.
I don’t believe const and volatile belong in C at all, but that’s just an opinion.
Linking with -lS got you standard I/O, as it was called, but it wasn’t really standard yet. The real standard C library came for free. What you now know as standard I/O came in V7.
In V6 there was #define and #include, implemented by cc.c, which was just a wrapper for the C compiler’s multiple passes (c0, c1, c2, as, ld). The abominations like ## and #if came much later; ## was invented by the ANSI committee as part of making the preprocessor the worst part of C, after wide character handling. Do not invent in committee!
For the frame buffer on our custom hardware at U of T that lived with registers at address 0, we wrote 0->reg and such to access the control registers defined in a struct like:
struct X {
int reg;
…
};
You’ve barely scratched the surface of what was different then, but enjoy.
I think I have a newer edition of the book printed in 1984 which I am sure is not the second edition, since it has the same cover as your book.
I can’t remember the teletype example or the returning-in-parentheses stuff. I guess the preface said that an ANSI standard was being developed and that some changes had been made since the first edition.
I have to check though. Once I find the book I will look at the points you wrote and check which ones were changed before second edition (if any).
Rob Pike wrote: “Whoever broke directory I/O is my enemy for life.”
Especially directory O, right? (-: I’ve heard that Unix directories were user-writable early on, but by the time I got to Unix (between V6 and V7), that was not possible.
As for exit() being in stdio, no. In K&R C, it was not necessary to declare extern functions. Any undefined word used as a call was assumed to be an extern.
The really alien thing about Unix in the 1970s was the size of it. It really was possible to read and understand _everything_. When my school got its V7 tapes, I printed off all eight chapters of the man pages, took them home and read them that night. They were only about 3/4 inch thick (2 cm). Maybe 300 pages, probably less, and most were quite brief. And you’ve no doubt seen the Lions book, which includes the full kernel source code in ~100 pages. (And Digital published full schematics for many of their machines of that era, but I digress.)
The fact that the structure member names came from a common name space (the reason you could say 0->bar) is why Unix-defined structures have a short prefix on every member. Each member in struct tm, for example, begins with the string “tm_”. The use of these pesky prefixes has now been immortalized in the Posix standard.
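Since ANSI C each struct has its own member name space, so the prefixes are no longer needed to avoid clashes. A minimal sketch, with invented struct and function names:

```c
#include <assert.h>
#include <stddef.h>

/* Two unrelated structs that both use the member name x, with different
   types. This relies on per-struct member name spaces. */
struct point {
    int x;
    int y;
};

struct sample {
    double x;
};

/* The compiler picks the right x from the type of the left operand. */
double sum_x(const struct point *p, const struct sample *s)
{
    assert(p != NULL && s != NULL);
    return p->x + s->x;
}
```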
When Dennis heard that I was writing a quick and dirty C compiler from the 1973 definition of C for a project in 2003, he said that function prototypes were a long overdue feature even in 1973. If he had it to do again they would have been in the language from the beginning.
In Solaris 10 and OpenSolaris, reading a directory still works. I haven’t tried it in Solaris 11.
The Classic C argument lists are easy to understand when you remember two things:
– V6 C mostly managed without types at all, they were mainly added to distinguish between integers and pointers, and until C99 you could still omit the ‘int’ type, e.g., “auto x;” was a legal way to declare an int variable. So you already had funcname(args) { body } declarations. Adding the types between the header and the body meant not disrupting the existing header.
– Algol 68 and Pascal put argument types inside the header; previous languages, including Fortran, Algol 60, PL/I, Algol W, and Simula 67, put them between the header and the body. C was following a well established tradition. The problem was not where the argument types were written, but that they didn’t have to be written at all and the compiler didn’t use them in processing calls.
Void, enums, and struct assignment/pass by value were added in between Unix V6 and Unix V7, so some time around 1980.
I remember writing those exit(0)-s at the end of main()-s.
It’s interesting to look back and observe how the old C was slowly diverging from old old Bash.
It’s not very often that I get to appreciate articles that are as good as this.
I shared this on my twitter. Thank you for posting this.
While I have not tried this on FreeBSD in about 7 years, as of FreeBSD 9 (released 8 years ago), one can open() and read() directories. However, the data returned was in the directory format of the underlying filesystem. This behavior appeared to be valid at least for FreeBSD UFS and FAT.
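On systems where reading a directory with read() no longer works, the portable route is the POSIX directory API that replaced it. A minimal sketch, assuming a POSIX system with dirent.h; count_entries is a made-up helper name.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* Count the entries in a directory using opendir/readdir instead of
   reading the directory file directly.
   Returns -1 if the directory cannot be opened. */
long count_entries(const char *path)
{
    DIR *dir;
    struct dirent *entry;
    long count = 0;

    assert(path != NULL);
    dir = opendir(path);
    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL)
        count++;               /* entry->d_name holds the file name */
    closedir(dir);
    return count;
}
```

The on-disk layout stays hidden behind struct dirent, which is the point: the program no longer depends on the directory format of the underlying filesystem.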