This section describes known problems that affect users of the compiler. Most of these are not compiler bugs per se; if they were, we would fix them. But the result for a user may be like the result of a bug.
Some of these problems are due to bugs in other software, some are missing features that are too much work to add, and some are places where people's opinions differ as to what is best.
There are several noteworthy differences between GNU C and most existing (non-ANSI) versions of C. The “-traditional” option eliminates many of these incompatibilities, but not all, by telling GNU C to behave like the other C compilers.
The compiler normally makes string constants read-only. If several identical-looking string constants are used, the compiler stores only one copy of the string.
One consequence is that you cannot call mktemp with a string constant argument. The function mktemp always alters the string its argument points to.
Another consequence is that sscanf does not work on some systems when passed a string constant as its format control string or input. This is because sscanf incorrectly tries to write into the string constant. Likewise fscanf and scanf.
The best solution to these problems is to change the program to use char-array variables with initialization strings for these purposes instead of string constants. But if this is not possible, you can use the “-fwritable-strings” flag, which directs the compiler to handle string constants the same way most C compilers do. “-traditional” also has this effect, among others.
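As a minimal sketch of the char-array approach, the following uses a writable array initialized from a string. The function fill_template is a hypothetical stand-in for any routine that, like mktemp, writes into the string its argument points to; it is not a library function.

```c
#include <string.h>

/* Hypothetical stand-in for a routine that modifies its argument in place,
   as mktemp does.  It overwrites each trailing 'X' with '0'. */
static void fill_template(char *name)
{
    size_t n = strlen(name);

    while (n > 0 && name[n - 1] == 'X')
        name[--n] = '0';
}

const char *make_name(void)
{
    /* A char array with an initialization string is ordinary writable
       storage, unlike a string constant passed directly as the argument. */
    static char name[] = "/tmp/fooXXXXXX";

    fill_template(name);
    return name;
}
```

Because name is an array rather than a pointer to a string constant, the write inside fill_template is well defined without “-fwritable-strings”.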
-2147483648 is positive.
This is because 2147483648 cannot fit in the type int, so (following the ANSI C rules) its data type is unsigned long int. Negating this value yields 2147483648 again.
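One common way to write the most negative int without this surprise is sketched below, assuming a 32-bit int. The function names are illustrative only. The expression never mentions 2147483648, so every constant in it fits in int.

```c
#include <limits.h>

/* Assumes int is 32 bits wide.  2147483647 fits in int, so the whole
   expression is evaluated in int and the result is negative. */
int smallest_int(void)
{
    return -2147483647 - 1;
}

/* The portable spelling: let <limits.h> supply the value. */
int smallest_int_from_header(void)
{
    return INT_MIN;
}
```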
The compiler does not substitute macro arguments when they appear inside of string constants. For example, the following macro
#define foo(a) "a"
will produce output "a" regardless of what the argument a is.
The “-traditional” option directs the compiler to handle such cases (among others) in the old-fashioned (non-ANSI) fashion.
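When the goal is to turn a macro argument into a string, ANSI C provides the “#” stringizing operator for that purpose. The sketch below contrasts the two forms; the macro name str and the function names are illustrative only.

```c
#define foo(a) "a"   /* ANSI C: always the one-character string "a" */
#define str(a) #a    /* ANSI C stringizing: the argument's spelling */

const char *literal_a(void)
{
    return foo(x + 1);   /* the argument is ignored */
}

const char *stringized(void)
{
    return str(x + 1);   /* yields the string "x + 1" */
}
```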
When you use setjmp and longjmp, the only automatic variables guaranteed to remain valid are those declared volatile. This is a consequence of automatic register allocation. Consider this function:
jmp_buf j;

foo ()
{
  int a, b;

  a = fun1 ();
  if (setjmp (j))
    return a;

  a = fun2 ();
  /* longjmp (j) may occur in fun3. */
  return a + fun3 ();
}
Here a may or may not be restored to its first value when the longjmp occurs. If a is allocated in a register, then its first value is restored; otherwise, it keeps the last value stored in it.
If you use the “-W” option with the “-O” option, you will get a warning when the compiler thinks such a problem might be possible.
The “-traditional” option directs GNU C to put variables in the stack by default, rather than in registers, in functions that call setjmp. This results in the behavior found in traditional C compilers.
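A minimal sketch of the portable fix is to declare the variable volatile, so that the value it holds after the longjmp is the last one stored. The bodies of fun1, fun2 and fun3 below are hypothetical stand-ins chosen only to make the example complete.

```c
#include <setjmp.h>

static jmp_buf j;

/* Hypothetical helpers: fun3 always jumps back to the setjmp call. */
static int fun1(void) { return 1; }
static int fun2(void) { return 2; }
static int fun3(void) { longjmp(j, 1); return 0; /* not reached */ }

int foo(void)
{
    volatile int a;   /* volatile: kept in memory, valid after longjmp */

    a = fun1();
    if (setjmp(j))
        return a;     /* reads the last value stored, that of fun2 */

    a = fun2();
    return a + fun3();
}
```

With the volatile qualifier, foo returns the value from fun2 regardless of optimization level or register allocation.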
Programs that use preprocessing directives in the middle of macro arguments do not work with the compiler. For example, a program like this will not work:
foobar (
#define luser
hack)
ANSI C does not permit such a construct. It would make sense to support it when “-traditional” is used, but it is too much work to implement.
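Programs that rely on this construct usually do so to choose a macro argument conditionally. One possible rewrite, sketched below, moves the directives in front of the macro call and selects the argument through a helper macro. The names USE_HACK and FOOBAR_ARG, and the definition of foobar, are hypothetical.

```c
/* Hypothetical macro standing in for the one being called. */
#define foobar(x) ((x) * 10)

/* Choose the argument before the call, not in the middle of it. */
#ifdef USE_HACK
#define FOOBAR_ARG 1
#else
#define FOOBAR_ARG 2
#endif

int call_foobar(void)
{
    return foobar(FOOBAR_ARG);   /* no directive inside the argument list */
}
```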
Declarations of external variables and functions within a block apply only to the block containing the declaration. In other words, they have the same scope as any other declaration in the same place.
In some other C compilers, an extern declaration affects all the rest of the file even if it happens within a block.
The “-traditional” option directs GNU C to treat all extern declarations as global, like traditional compilers.
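A program that depended on the traditional behavior can usually be fixed by moving the extern declaration to file scope, where it is visible to every function that follows. A minimal sketch, with hypothetical names:

```c
/* Declared at file scope, so both functions below can see it. */
extern int shared_count;

int read_count(void)
{
    return shared_count;
}

int bump_count(void)
{
    return ++shared_count;
}

/* The definition may appear later in the file or in another file. */
int shared_count = 41;
```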
In traditional C, you can combine type modifiers such as long with a typedef name, as shown here:
typedef int foo;
typedef long foo bar;
In ANSI C, this is not allowed: long and other type modifiers require an explicit int. Because this criterion is expressed by Bison grammar rules rather than C code, the “-traditional” flag cannot alter it.
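The declaration can be rewritten to satisfy ANSI C by spelling out the base type rather than going through the typedef name. A minimal sketch follows; bar_size is a hypothetical helper included only so that the result can be checked.

```c
typedef int foo;
typedef long int bar;   /* explicit int in place of "long foo" */

/* Hypothetical helper: reports the size of the resulting type. */
unsigned long bar_size(void)
{
    return (unsigned long) sizeof(bar);
}
```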
PCC allows typedef names to be used as function parameters. The difficulty described immediately above applies here too.
PCC allows whitespace in the middle of compound assignment operators such as “+=”. The compiler, following the ANSI standard, does not allow this. The difficulty described immediately above applies here too.
The compiler complains about unterminated character constants inside of preprocessing conditionals that fail. Some programs have English comments enclosed in conditionals that are guaranteed to fail; if these comments contain apostrophes, the compiler will probably report an error. For example, this code would produce an error:
#if 0
You can't expect this to work.
#endif
The best solution to such a problem is to put the text into an actual C comment delimited by “/*...*/”. However, “-traditional” suppresses these error messages.
Many user programs contain the declaration “long time ();”. In the past, the system header files on many systems did not actually declare time, so it did not matter what type your program declared it to return. But in systems with ANSI C headers, time is declared to return time_t, and if that is not the same as long, then “long time ();” is erroneous.
The solution is to change your program to use time_t as the return type of time.
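A minimal sketch of the corrected usage takes the declaration of time from the standard header instead of declaring it by hand. The function name current_time is illustrative only.

```c
#include <time.h>   /* declares time and time_t; no "long time ();" needed */

time_t current_time(void)
{
    time_t now = time(NULL);   /* time_t, whatever its underlying type */

    return now;
}
```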
When compiling functions that return float, PCC converts the return value to double. The compiler actually returns a float. If you are concerned with PCC compatibility, you should declare your functions to return double; you might as well say what you mean.
When compiling functions that return structures or unions, the code the compiler outputs normally uses a method different from that used on most versions of UNIX. As a result, code compiled with the compiler cannot call a structure-returning function compiled with PCC, and vice versa.
The method used by the compiler is as follows: a structure or union that is 1, 2, 4 or 8 bytes long is returned like a scalar. A structure or union with any other size is stored into an address supplied by the caller (usually in a special, fixed register, but on some machines it is passed on the stack). The machine-description macros STRUCT_VALUE and STRUCT_INCOMING_VALUE tell the compiler where to pass this address.
By contrast, PCC on most target machines returns structures and unions of any size by copying the data into an area of static storage, and then returning the address of that storage as if it were a pointer value. The caller must copy the data from that memory area to the place where the value is wanted. The compiler does not use this method because it is slower and nonreentrant.
On some newer machines, PCC uses a reentrant convention for all structure and union returning. The compiler on most of these machines uses a compatible convention when returning structures and unions in memory, but still returns small structures and unions in registers.
You can tell the compiler to use a compatible convention for all structure and union returning with the option “-fpcc-struct-return”.
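Where the option cannot be used, one way to avoid depending on either structure-return convention is to have the caller supply the address of the result as an ordinary pointer argument. This is only a sketch of that general technique, with hypothetical names; it is not something the compiler does for you.

```c
/* Hypothetical structure whose size is not 1, 2, 4 or 8 bytes. */
struct point3 { int x, y, z; };

/* The result is written through an explicit pointer argument, so no
   structure value crosses the function-call boundary. */
void make_point(struct point3 *out, int x, int y, int z)
{
    out->x = x;
    out->y = y;
    out->z = z;
}

int sum_point(void)
{
    struct point3 p;

    make_point(&p, 1, 2, 3);
    return p.x + p.y + p.z;
}
```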
GNU C complains about program fragments such as “0x74ae-0x4000” that appear to be two hexadecimal constants separated by the minus operator. Actually, this string is a single preprocessing token. Each such token must correspond to one token in C. Since this does not, GNU C prints an error message. Although it may appear obvious that what is meant is an operator and two values, the ANSI C standard specifically requires that this be treated as erroneous.
A preprocessing token is a preprocessing number if it begins with a digit and is followed by letters, underscores, digits, periods and “e+”, “e-”, “E+”, or “E-” character sequences.
To make the above program fragment valid, place whitespace in front of the minus sign. This whitespace will end the preprocessing number.
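The rule can be illustrated with a small scanner that measures how many characters a preprocessing number would absorb. This is a simplified sketch that follows only the description given here (it ignores, for example, numbers that begin with a period), and pp_number_length is a hypothetical name, not part of the compiler.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns the length of the preprocessing number at the start of s,
   or 0 if s does not begin with a digit. */
size_t pp_number_length(const char *s)
{
    size_t i;

    if (!isdigit((unsigned char) s[0]))
        return 0;

    i = 1;
    for (;;) {
        unsigned char c = (unsigned char) s[i];

        /* "e+", "e-", "E+" and "E-" are absorbed as a pair. */
        if ((c == 'e' || c == 'E') && (s[i + 1] == '+' || s[i + 1] == '-'))
            i += 2;
        else if (isalnum(c) || c == '_' || c == '.')
            i += 1;
        else
            break;
    }
    return i;
}
```

Applied to “0x74ae-0x4000” the scanner consumes the entire string as one token, while for “0x74ae -0x4000” it stops at the space, leaving the minus sign to be read as an operator.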