Channel: keeping simple

Pointer alias analysis in C


Perhaps there is some reason to provide a mechanism for asserting, in a particular patch of code, that the compiler is free to make optimistic assumptions about the kinds of aliasing that can occur. I don’t know any acceptable way of changing the language specification to express the possibility of this kind of optimization, and I don’t know how much performance improvement is likely to result. I would encourage compiler-writers to experiment with extensions, by #pragma or otherwise, to see what ideas and improvements they can come up with, but I am certain that nothing resembling the noalias proposal should be in the Standard. – Dennis Ritchie

 

As far as I can tell, compiler developers prefer to limit both pointer aliasing (multiple pointers to the same region of memory) and type-punning (multiple pointers with different types to the same region of memory), even though these are quite different issues. The “noalias” discussion was ostensibly driven by the difficulty of vectorizing code given overlapping pointers, an issue that has been considered in FORTRAN almost since the dawn of time. Essentially, pointer aliasing creates a classical consistency issue, like the ones we have in distributed computing. Suppose we have code:

 *x += *y; *(x+1) += *(y+1);

Given a two-element vector addition instruction, the compiler could potentially get this done in parallel: maybe “load v,x” to load *x and *(x+1) into the vector register, “add v,y” for the addition, and “store v,x” to complete it. But maybe the bad programmer has an overlapping alias and x = y+1. If y points to [10,20,30], then the step-by-step version produces [10,30,30] and then [10,30,60], but the parallel operation produces [10,30,50]. This gets more depressing for large-scale matrix operations, which is where it was first encountered in FORTRAN.

So compiler writers would like to have unaliased pointers, but in C, aliased pointers are very useful, and not only for pathological overlapping data structures. The obvious example is computing a checksum for a packet. Instead of worrying about the possibly complex structure of the packet, the programmer might want to think of it as a sequence of 64-bit unsigned ints and just xor them, but in this case p->ipnumber and *(packetwords+3) might overlap! Maybe there are other reasons for wanting unaliased pointers (the examples in Regehr’s post are not all that compelling) and someone can tell me about them, but if the only reason is to permit compilers to use vector instructions in loops, it seems like a simple opt-in could be used. C restrict tries to get there, but it’s impressively unclear.
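To make the overlap problem concrete, here is a minimal sketch in C. The function names are invented for illustration, and the "vector" version only simulates a two-lane load/add/store with scalars; it is not real SIMD code. Working the y = [10,20,30], x = y+1 case by hand, the sequential version leaves [10,30,60] while the load-then-add-then-store version leaves [10,30,50]. The third function shows how the C99 restrict opt-in is spelled: it is a promise from the programmer that the two regions do not overlap, and calling it with overlapping pointers is undefined behavior.

```c
#include <stddef.h>

/* Sequential semantics: exactly what the two C statements say.
 * The second statement sees any change the first made through x. */
void add2_sequential(int *x, const int *y)
{
    *x += *y;
    *(x + 1) += *(y + 1);
}

/* Simulated two-lane vector version: read everything first,
 * add lane by lane, then write both results back. */
void add2_vectorlike(int *x, const int *y)
{
    int vx0 = x[0], vx1 = x[1];   /* "load v,x"               */
    int vy0 = y[0], vy1 = y[1];   /* operands for "add v,y"   */
    vx0 += vy0;
    vx1 += vy1;
    x[0] = vx0;                   /* "store v,x"              */
    x[1] = vx1;
}

/* Opt-in form: restrict promises that x and y do not overlap, so a
 * compiler may pick either strategy above. Must NOT be called with
 * overlapping pointers. */
void add2_restrict(int *restrict x, const int *restrict y)
{
    x[0] += y[0];
    x[1] += y[1];
}
```

With disjoint arrays all three functions agree; they only diverge (or, for the restrict version, become undefined) when x points into the region y reads from.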

The noalias proposal was “opt-in” but mandated in standard library functions, and it would have had, as Ritchie pointed out, a negative effect on the language. But the same project came in through a back door, in a more far-reaching mode, via the “axiomatic false” interpretation of “undefined behavior” and the impressively obscure discussion of aliasing and effective types in the standard. Anyone who thinks the C Standard has a clear or even excusable model of type-punning and pointer aliasing could read this horrifying discussion, which ends with the understatement “The conclusion for the de facto semantics here is not completely clear.” (via @comex on Twitter), or look at John Regehr’s authoritative post with the even more understated title:

The Strict Aliasing Situation is Pretty Bad

 

See this for a more optimistic take.
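Returning to the packet checksum: a sketch of one commonly used way to treat a packet as a run of 64-bit words without depending on how the effective-type rules treat a pointer cast. Each word is copied out with memcpy, which is defined for any object type, and compilers typically turn the small fixed-size copy into a plain load. The struct layout and the name xor_words are hypothetical, invented for this example; the field ipnumber is borrowed from the text above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet layout, invented for this sketch. */
struct packet {
    uint32_t ipnumber;
    uint32_t port;
    uint64_t payload[3];
};

/* XOR together the 64-bit words in the first nbytes of buf.
 * Trailing bytes that do not fill a whole word are ignored. */
uint64_t xor_words(const void *buf, size_t nbytes)
{
    const unsigned char *bytes = buf;
    uint64_t sum = 0;

    for (size_t i = 0; i + sizeof(uint64_t) <= nbytes; i += sizeof(uint64_t)) {
        uint64_t word;
        /* Copy instead of casting: no uint64_t lvalue ever aliases
         * the struct's own fields. */
        memcpy(&word, bytes + i, sizeof word);
        sum ^= word;
    }
    return sum;
}
```

A call such as xor_words(&p, sizeof p) on a struct packet p reads p.ipnumber and p.port as part of the first word, which is exactly the overlap the programmer wanted, and the result depends on byte order as it would with the cast version.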
