When someone says “I want a programming language in which I need only say what I wish done,” give him a lollipop.
What is C? The simple answer, a widely used programming language developed in the early 1970s at Bell Laboratories, conveys little of C’s special flavor. Before we become immersed in the details of the language, let’s take a look at where C came from, what it was designed for, and how it has changed over the years (Section 1.1). We’ll also discuss C’s strengths and weaknesses and see how to get the most out of the language (Section 1.2).
1.1 History of C #
Let’s take a quick look at C’s history, from its origins, to its coming of age as a standardized language, to its influence on recent languages.
Origins #
C is a by-product of the UNIX operating system, which was developed at Bell Laboratories by Ken Thompson, Dennis Ritchie, and others. Thompson single-handedly wrote the original version of UNIX, which ran on the DEC PDP-7 computer, an early minicomputer with only 8K words of main memory (this was 1969, after all!).
Like other operating systems of the time, UNIX was written in assembly language. Programs written in assembly language are usually painful to debug and hard to enhance; UNIX was no exception. Thompson decided that a higher-level language was needed for the further development of UNIX, so he designed a small language named B. Thompson based B on BCPL, a systems programming language developed in the mid-1960s. BCPL, in turn, traces its ancestry to Algol 60, one of the earliest (and most influential) programming languages.
Ritchie soon joined the UNIX project and began programming in B. In 1970, Bell Labs acquired a PDP-11 for the UNIX project. Once B was up and running on the PDP-11, Thompson rewrote a portion of UNIX in B. By 1971, it became apparent that B was not well-suited to the PDP-11, so Ritchie began to develop an extended version of B. He called his language NB (“New B”) at first, and then, as it began to diverge more from B, he changed the name to C. The language was stable enough by 1973 that UNIX could be rewritten in C. The switch to C provided an important benefit: portability. By writing C compilers for other computers at Bell Labs, the team could get UNIX running on those machines as well.
Standardization #
C continued to evolve during the 1970s, especially between 1977 and 1979. It was during this period that the first book on C appeared. The C Programming Language, written by Brian Kernighan and Dennis Ritchie and published in 1978, quickly became the bible of C programmers. In the absence of an official standard for C, this book (known as K&R or the “White Book” to aficionados) served as a de facto standard.
During the 1970s, there were relatively few C programmers, and most of them were UNIX users. By the 1980s, however, C had expanded beyond the narrow confines of the UNIX world. C compilers became available on a variety of machines running under different operating systems. In particular, C began to establish itself on the fast-growing IBM PC platform.
With C’s increasing popularity came problems. Programmers who wrote new C compilers relied on K&R as a reference. Unfortunately, K&R was fuzzy about some language features, so compilers often treated these features differently. Also, K&R failed to make a clear distinction between which features belonged to C and which were part of UNIX. To make matters worse, C continued to change after K&R was published, with new features being added and a few older features removed. The need for a thorough, precise, and up-to-date description of the language soon became apparent. Without such a standard, numerous dialects would have arisen, threatening the portability of C programs, one of the language’s major strengths.
The development of a U.S. standard for C began in 1983 under the auspices of the American National Standards Institute (ANSI). After many revisions, the standard was completed in 1988 and formally approved in December 1989 as ANSI standard X3.159-1989. In 1990, it was approved by the International Organization for Standardization (ISO) as international standard ISO/IEC 9899:1990. This version of the language is usually referred to as C89 or C90, to distinguish it from the original version of C, often called K&R C. Appendix C summarizes the major differences between C89 and K&R C.
The language underwent a few changes in 1995 (described in a document known as Amendment 1). More significant changes occurred with the publication of a new standard, ISO/IEC 9899:1999, in 1999. The language described in this standard is commonly known as C99. The terms “ANSI C,” “ANSI/ISO C,” and “ISO C” (once used to describe C89) are now ambiguous, thanks to the existence of two standards.
C-Based Languages #
C has had a huge influence on modern-day programming languages, many of which borrow heavily from it. Of the many C-based languages, several are especially prominent:
- C++ includes all the features of C, but adds classes and other features to support object-oriented programming.
- Java is based on C++ and therefore inherits many C features.
- C# is a more recent language derived from C++ and Java.
- Perl was originally a fairly simple scripting language; over time it has grown and adopted many of the features of C.
Considering the popularity of these newer languages, it’s logical to ask whether it’s worth the trouble to learn C. I think it is, for several reasons. First, learning C can give you greater insight into the features of C++, Java, C#, Perl, and the other C-based languages. Programmers who learn one of these languages first often fail to master basic features that were inherited from C. Second, there are a lot of older C programs around; you may find yourself needing to read and maintain this code. Third, C is still widely used for developing new software, especially in situations where memory or processing power is limited or where the simplicity of C is desired.
If you haven’t already used one of the newer C-based languages, you’ll find that this book is excellent preparation for learning these languages. It emphasizes data abstraction, information hiding, and other principles that play a large role in object-oriented programming. C++ includes all the features of C, so you’ll be able to use everything you learn from this book if you later tackle C++. Many of the features of C can be found in the other C-based languages as well.
1.2 Strengths and Weaknesses of C #
Like any other programming language, C has strengths and weaknesses. Both stem from the language’s original use (writing operating systems and other systems software) and its underlying philosophy:
- C is a low-level language. To serve as a suitable language for systems programming, C provides access to machine-level concepts (bytes and addresses, for example) that other programming languages try to hide. C also provides operations that correspond closely to a computer’s built-in instructions so that programs can be fast. Since application programs rely on it for input/output, storage management, and numerous other services, an operating system can’t afford to be slow.
- C is a small language. C provides a more limited set of features than many languages. (The reference manual in the second edition of K&R covers the entire language in 40 pages.) To keep the number of features small, C relies heavily on a “library” of standard functions. (A “function” is similar to what other programming languages might call a “procedure,” “subroutine,” or “method.”)
- C is a permissive language. C assumes that you know what you’re doing, so it allows you a wider degree of latitude than many languages. Moreover, C doesn’t mandate the detailed error-checking found in other languages.
Strengths #
C’s strengths help explain why the language has become so popular:
- Efficiency. Efficiency has been one of C’s advantages from the beginning. Because C was intended for applications where assembly language had traditionally been used, it was crucial that C programs could run quickly and in limited amounts of memory.
- Portability. Although program portability wasn’t a primary goal of C, it has turned out to be one of the language’s strengths. When a program must run on computers ranging from PCs to supercomputers, it is often written in C. One reason for the portability of C programs is that—thanks to C’s early association with UNIX and the later ANSI/ISO standards—the language hasn’t splintered into incompatible dialects. Another is that C compilers are small and easily written, which has helped make them widely available. Finally, C itself has features that support portability (although there’s nothing to prevent programmers from writing non-portable programs).
- Power. C’s large collection of data types and operators helps make it a powerful language. In C, it’s often possible to accomplish quite a bit with just a few lines of code.
- Flexibility. Although C was originally designed for systems programming, it has no inherent restrictions that limit it to this arena. C is now used for applications of all kinds, from embedded systems to commercial data processing. Moreover, C imposes very few restrictions on the use of its features; operations that would be illegal in other languages are often permitted in C. For example, C allows a character to be added to an integer value (or, for that matter, a floating-point number). This flexibility can make programming easier, although it may allow some bugs to slip through.
- Standard library. One of C’s great strengths is its standard library, which contains hundreds of functions for input/output, string handling, storage allocation, and other useful operations.
- Integration with UNIX. C is particularly powerful in combination with UNIX (including the popular variant known as Linux). In fact, some UNIX tools assume that the user knows C.
Weaknesses #
C’s weaknesses arise from the same source as many of its strengths:
- C programs can be error-prone. C’s flexibility makes it an error-prone language. Programming mistakes that would be caught in many other languages can’t be detected by a C compiler. In this respect, C is a lot like assembly language, where most errors aren’t detected until the program is run. To make matters worse, C contains a number of pitfalls for the unwary. In later chapters, we’ll see how an extra semicolon can create an infinite loop or a missing & symbol can cause a program crash.
- C programs can be difficult to understand. Although C is a small language by most measures, it has a number of features that aren’t found in all programming languages (and that consequently are often misunderstood). These features can be combined in a great variety of ways, many of which, although obvious to the original author of a program, can be hard for others to understand. Another problem is the terse nature of C programs. C was designed at a time when interactive communication with computers was tedious at best. As a result, C was purposefully kept terse to minimize the time required to enter and edit programs. C’s flexibility can also be a negative factor; programmers who are too clever for their own good can make programs almost impossible to understand.
- C programs can be difficult to modify. Large programs written in C can be hard to change if they haven’t been designed with maintenance in mind. Modern programming languages usually provide features such as classes and packages that support the division of a large program into more manageable pieces. C, unfortunately, lacks such features.
Obfuscated C #
Even C’s most ardent admirers admit that C code can be hard to read. The annual International Obfuscated C Code Contest actually encourages contestants to write the most confusing C programs possible. The winners are truly baffling, as 1990’s “Best Small Program” shows:
```c
v, i, j, k, l, s, a[99];
main()
{
    for (scanf("%d", &s); *a - s; v = a[j *= v] - a[i], k = i < s,
         j += (v = j < s && (!k && !!printf(2 + "\n\n%c" - (!l << !j),
         " #Q"[l ^ v ? (l ^ j) & 1 : 2]) && ++l || a[i] < s && v &&
         v - i + j && v + i - j)) && !(l %= s),
         v || (i == j ? a[i += k] = 0 : ++a[i]) >= s * k && ++a[--i])
        ;
}
```
This program, written by Doron Osovlanski and Baruch Nissenbaum, prints all solutions to the Eight Queens problem (the problem of placing eight queens on a chessboard in such a way that no queen attacks any other queen). In fact, it works for any number of queens between four and 99. For more winning programs, visit www.ioccc.org, the contest’s website.
Effective Use of C #
Using C effectively requires taking advantage of C’s strengths while avoiding its weaknesses. Here are a few suggestions:
- Learn how to avoid C pitfalls. Hints for avoiding pitfalls are scattered throughout this book; just look for the ⚠ symbol. For a more extensive list of pitfalls, see Andrew Koenig’s C Traps and Pitfalls (Reading, Mass.: Addison-Wesley, 1989). Modern compilers will detect common pitfalls and issue warnings, but no compiler spots them all.
- Use software tools to make programs more reliable. C programmers are prolific tool builders (and users). One of the most famous C tools is named lint, which is traditionally provided with UNIX. lint can subject a program to a more extensive error analysis than most C compilers. If lint (or a similar program) is available, it’s a good idea to use it. Another useful tool is a debugger. Because of the nature of C, many bugs can’t be detected by a C compiler; these show up instead in the form of run-time errors or incorrect output. Consequently, using a good debugger is practically mandatory for C programmers.
- Take advantage of existing code libraries. One of the benefits of using C is that so many other people also use it; it’s a good bet that they’ve written code you can employ in your own programs. C code is often bundled into libraries (collections of functions); obtaining a suitable library is a good way to reduce errors and save considerable programming effort. Libraries for common tasks, including user-interface development, graphics, communications, database management, and networking, are readily available. Some libraries are in the public domain, some are open source, and some are sold commercially.
- Adopt a sensible set of coding conventions. A coding convention is a style rule that a programmer has decided to adopt even though it’s not enforced by the language. Well-chosen conventions help make programs more uniform, easier to read, and easier to modify. Conventions are important when using any programming language, but especially so with C. As noted above, C’s highly flexible nature makes it possible for programmers to write code that is all but unreadable. The programming examples in this book follow one set of conventions, but there are other, equally valid, conventions in use. (We’ll discuss some of the alternatives from time to time.) Which set you use is less important than adopting some conventions and sticking to them.
- Avoid tricks and overly complex code. C encourages programming tricks. There are usually several ways to accomplish a given task in C; programmers are often tempted to choose the method that’s most concise. Don’t get carried away; the shortest solution is often the hardest to comprehend. In this book, I’ll illustrate a style that’s reasonably concise but still understandable.
- Stick to the standard. Most C compilers provide language features and library functions that aren’t part of the C89 or C99 standards. For portability, it’s best to avoid using nonstandard features and libraries unless they’re absolutely necessary.