C and related programming languages
Hayo Thielecke www.cs.bham.ac.uk/~hxt
There are a number of reasons for learning about C:
This course is not one of those “C in 7 Days by Cutting and Pasting” books. Not all C constructs will be covered, let alone all of C++. Instead, C will be seen in a wider computer science context.
The minimum prerequisite for this course is some basic imperative programming. For instance, you should know how to write the factorial function via recursion or a loop.
For simple programs of this kind, Java and C are almost identical, so we will not cover this material again.
In this course, we first concentrate on C before moving on to some of C++. C gives us the ability to write code close to the hardware, as needed in systems programming. C++ adds features for structuring large programs. As C is (essentially) a subset of C++, what is said about C also applies to C++. We will concentrate on code that you cannot write in Java.
The themes of this course are:
C syntax has become part of the pop culture, so much so that almost everyone can write bad C code. Parts of the syntax of C are such that an obfuscated C contest has been running for many years. We will mostly avoid going into any dark corners of the syntax.
C was never intended for beginners and works best when you understand how the compiler works.
Our first C program is Hello World, which your IDE may generate for you automatically:
#include <stdio.h> /* printf etc */

int main(int argc, char * argv[])
{
    printf("Hello again, World!\n");
    return 0;
}
At first sight, the code looks very much as it would in Java. We have a main function that takes the parameters from the command line as an array called argv. But if we look more closely at how this array is passed, we see differences from Java. In fact, this is our first example of pointers in C, and pointers will be one of the main topics of this course.
The type constructor for pointers is written as *.
Pointers are not integers. p + 10 makes sense, but p + p does not.
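As a minimal sketch of the point about pointer arithmetic (the function names sum_ints and distance are hypothetical, chosen only for illustration): adding an integer to a pointer moves it by that many elements, and subtracting two pointers into the same array gives an integer. Adding two pointers is rejected by the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sum n integers by walking a pointer
   from the start of the array to one past its end. */
int sum_ints(const int *p, size_t n)
{
    assert(p != NULL);
    const int *end = p + n;   /* pointer + integer: n elements further on */
    int total = 0;
    while (p < end)
        total += *p++;        /* read the element, then advance by one */
    return total;
}

/* Hypothetical helper: number of elements between two pointers
   into the same array. */
ptrdiff_t distance(const int *from, const int *to)
{
    return to - from;         /* pointer - pointer yields an integer */
    /* return to + from;         would not compile: pointer + pointer */
}
```

Note that p + n is scaled by the element size: for an int pointer it moves n integers along, not n bytes.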
Here is a function that uses pointers for copying strings.
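void stringcopy(char * to, const char * from, int len)
{
    int i = 0;
    /* copy at most len characters, stopping early at the end of from */
    while ((i < len) && *from)
    {
        *to++ = *from++;
        i++; /* without this increment the bound is never enforced */
    }
    *to = '\0'; /* to must have room for len + 1 characters */
}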
String copying without bounds checks is a classic cause of buffer overflows.
We can create pointers into the stack using the address operator &. Specifically, this is used to achieve call-by-reference in functions such as scanf. To use pointers into the stack safely, we need to be aware of how the call stack works.
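A small sketch of call-by-reference with the address operator, in the same spirit as passing &x to scanf. The names swap_ints and swap_demo are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: exchange two integers through pointers. */
void swap_ints(int *a, int *b)
{
    assert(a != NULL && b != NULL);
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Hypothetical caller: x and y live in swap_demo's stack frame.
   Passing &x and &y is safe because that frame is still alive
   for the whole duration of the call to swap_ints. */
int swap_demo(void)
{
    int x = 1, y = 2;
    swap_ints(&x, &y);   /* pass addresses, not copies of the values */
    return x * 10 + y;   /* x is now 2 and y is now 1 */
}
```

The unsafe direction is the opposite one: returning &x from swap_demo would hand the caller a pointer into a stack frame that no longer exists.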
Doubly linked lists are widely used in systems programming (such as the Linux kernel and malloc). They have the advantage that the list can be manipulated very efficiently, for instance removing a node from the middle of the list.
The use of a void pointer gives us a certain amount of polymorphism.
struct doublylinked
{
void * data;
struct doublylinked * prev;
struct doublylinked * next;
};
This is also an example of a recursive type. The structure tag sets up the recursion.
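To illustrate why removal from the middle is cheap, here is a hedged sketch using the structure above. The functions insert_after and unlink_node are hypothetical names, and the sketch assumes a list whose ends are marked by NULL pointers (kernel-style lists are typically circular instead).

```c
#include <assert.h>
#include <stddef.h>

struct doublylinked
{
    void *data;
    struct doublylinked *prev;
    struct doublylinked *next;
};

/* Hypothetical helper: link node into the list directly after pos. */
void insert_after(struct doublylinked *pos, struct doublylinked *node)
{
    assert(pos != NULL && node != NULL);
    node->prev = pos;
    node->next = pos->next;
    if (pos->next != NULL)
        pos->next->prev = node;
    pos->next = node;
}

/* Hypothetical helper: unlink node in constant time. No traversal
   is needed, because the node knows both of its neighbours. */
void unlink_node(struct doublylinked *node)
{
    assert(node != NULL);
    if (node->prev != NULL)
        node->prev->next = node->next;
    if (node->next != NULL)
        node->next->prev = node->prev;
    node->prev = NULL;
    node->next = NULL;
}
```

In a singly linked list, the same removal would first require a walk from the head to find the predecessor.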
The syntax of types is one of the less polished parts of C, perhaps because C evolved from the untyped language B and acquired types as an afterthought. In modern languages like CAML and Haskell, there is a clean way to build up expressions for complex types from simpler ones. By contrast, C uses a curiously inside-out syntax that takes some getting used to.
In C, postfix operators bind more tightly than prefix. This matters for type syntax:
char *argv[] for an array of pointers
char *f() for a function returning a pointer
char (*f)() for a pointer to function
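The three declaration forms can be tried out in a short sketch. All names here (greeting, get_greeting, first_letter, first_via_pointer, count_words) are hypothetical; the sketch also assumes the array of pointers is terminated by NULL, as argv is.

```c
#include <stddef.h>

static char greeting[] = "hello";

/* char *f(): the postfix () binds first, so this is a function
   that returns a pointer to char. */
char *get_greeting(void)
{
    return greeting;
}

/* An ordinary function returning char, used as a target below. */
char first_letter(void)
{
    return greeting[0];
}

char first_via_pointer(void)
{
    /* char (*fp)(): the parentheses force the prefix * to bind first,
       so fp is a pointer to a function returning char. */
    char (*fp)(void) = first_letter;
    return fp();
}

/* char *words[]: the postfix [] binds first, so this is an array
   of pointers to char. Assumed to end with a NULL entry. */
size_t count_words(char *words[])
{
    size_t n = 0;
    while (words[n] != NULL)
        n++;
    return n;
}
```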
struct { string s; float f; } holds a string AND a float
union { string s; float f; } holds a string OR a float
(6.7.2.1) struct-or-union-specifier:
    struct-or-union identifier_opt { struct-declaration-list }
    struct-or-union identifier
(6.7.2.1) struct-or-union:
    struct
    union
Here the identifier is the tag that names the structure. The whole phrase forms a type specifier, which can be used in the same way as a primitive type such as int.
Various forms of trees are ubiquitous in programming. A leading example is parse trees for grammars. In functional languages, we use data types to build trees. In Java, we have to tie ourselves in knots and use the Composite pattern. In C, there is an idiom of using structures and unions, together with enum, to construct trees.
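The struct, union and enum idiom might look as follows for a small expression tree. This is only a sketch: the type names exprkind and expr and the function eval are hypothetical. The enum acts as a tag that records which member of the union is currently valid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag: which kind of node this is. */
enum exprkind { CONSTANT, PLUS, TIMES };

struct expr
{
    enum exprkind kind;            /* tag selecting the union member */
    union
    {
        int value;                 /* valid when kind == CONSTANT */
        struct
        {
            struct expr *left;
            struct expr *right;
        } children;                /* valid when kind is PLUS or TIMES */
    } u;
};

/* Hypothetical evaluator: recurse over the tree, switching on the tag. */
int eval(const struct expr *e)
{
    assert(e != NULL);
    switch (e->kind)
    {
    case CONSTANT:
        return e->u.value;
    case PLUS:
        return eval(e->u.children.left) + eval(e->u.children.right);
    case TIMES:
        return eval(e->u.children.left) * eval(e->u.children.right);
    }
    return 0; /* not reached for a valid tag */
}
```

Nothing stops a program from reading the wrong union member, so keeping the tag and the union in step is the programmer's responsibility.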
C is flexible enough that we can write object-oriented code, even if it is laborious. We can write C code using structures and arrays of function pointers that closely follows the C++ object model.
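A rough sketch of this idea, assuming a simplified object model in which every object starts with a pointer to an array of function pointers (a hand-written virtual function table). All names (shape, rectangle, square, call_method and so on) are hypothetical, and real C++ compilers differ in layout details.

```c
#include <assert.h>
#include <stddef.h>

struct shape;
typedef int (*method)(const struct shape *self);

/* Hypothetical method indices into the table. */
enum { AREA, PERIMETER, NUM_METHODS };

/* "Base class": just the pointer to the method table. */
struct shape { const method *vptr; };

/* "Derived classes": the base comes first, so a pointer to the
   object can also be used as a pointer to its struct shape. */
struct rectangle { struct shape base; int width, height; };
struct square    { struct shape base; int side; };

static int rectangle_area(const struct shape *self)
{
    const struct rectangle *r = (const struct rectangle *)self;
    return r->width * r->height;
}

static int rectangle_perimeter(const struct shape *self)
{
    const struct rectangle *r = (const struct rectangle *)self;
    return 2 * (r->width + r->height);
}

static int square_area(const struct shape *self)
{
    const struct square *s = (const struct square *)self;
    return s->side * s->side;
}

static int square_perimeter(const struct shape *self)
{
    const struct square *s = (const struct square *)self;
    return 4 * s->side;
}

/* One table per "class", shared by all of its objects. */
static const method rectangle_vtable[NUM_METHODS] = { rectangle_area, rectangle_perimeter };
static const method square_vtable[NUM_METHODS] = { square_area, square_perimeter };

void rectangle_init(struct rectangle *r, int width, int height)
{
    r->base.vptr = rectangle_vtable;
    r->width = width;
    r->height = height;
}

void square_init(struct square *s, int side)
{
    s->base.vptr = square_vtable;
    s->side = side;
}

/* Dynamic dispatch: look the function up in the object's own table. */
int call_method(const struct shape *s, int index)
{
    assert(s != NULL && index >= 0 && index < NUM_METHODS);
    return s->vptr[index](s);
}
```

The caller of call_method does not need to know whether it holds a rectangle or a square, which is the effect that virtual functions give in C++.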
We will focus mainly on the use of virtual functions in C++ for polymorphism.
Java classes are very closely based on those in C++.
Inheritance has a bright future behind it. C++ was never as fundamentalist about objects as Java. It retained struct and used templates for containers.
C++ has multiple inheritance, whereas Java only allows a single base class. Multiple inheritance is approximated in Java by interfaces.
C++ makes a distinction between public and private inheritance.
Templates give us the ability to write code that is parametric in a type. This is similar to the polymorphism in Haskell or ML. A leading application for templates is writing type-safe container classes.
Bjarne Stroustrup: Foundations of C++
http://www.stroustrup.com/ETAPS-corrected-draft.pdf
Draft ISO standard for C:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
Draft ISO standard for C++11:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3242.pdf
Brian W. Kernighan and Dennis M. Ritchie: The C Programming Language. Prentice Hall
James O. Coplien: Advanced C++ Programming Styles and Idioms. Addison Wesley
Robert C. Seacord: Secure Coding in C and C++. Addison Wesley