Thursday, 11 April 2013

[C++] Achieve undefined behaviour through static_cast

Lately I have been doing some heavy C++ programming. Due to external constraints I was forced to use C++98 instead of the shiny new C++11 standard, which meant writing simplified versions of some standard library and core language features myself. One of the core features I missed most was anonymous functions (aka lambdas). While writing a replacement I came across a curious piece of code that invokes undefined behaviour. It is only a curiosity: nothing new, and nothing the standard does not already cover.

Would you believe me if I told you that the this pointer can point to an object whose type is not the class that defines the member function, without using reinterpret_cast, const_cast or even dynamic_cast? Let's see it:


static const int SIZE = 1000000;
 
struct Son1 : Parent {
  char c[SIZE];
  char First() {
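    // Looks redundant, since this is already a Son1*. Can the cast fail?
    // (dynamic_cast needs a polymorphic Parent, i.e. one with a virtual function.)
    Son1* MyThis = dynamic_cast<Son1*>(static_cast<Parent*>(this));
    return this->c[SIZE-1];
  }
};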

Can it fail? Yes: when the object that this points to is not actually a Son1. Let's force it:
struct Parent { char misteriousFunc() { return 'p'; } }; // Never called; it only gives Parent a member
static const int SIZE = 1000000;
struct Son1 : Parent {
  char c[SIZE]; // We will use only the last element. It's huge!
  Son1() { c[SIZE-1] = 'a'; } // Let's give it some value
  char First() {
    return this->c[SIZE-1]; // Can this fail?
  }
};
 
struct Son2 : Parent { // Note: this class is tiny compared to the previous one.
  char Second() { return 'z'; } // Let's return a different value than Son1 does
};
  

int main() {
  Parent* one = new Son1(); //Instantiates huge class
  Parent* two = new Son2(); //Instantiates little class
  char (Son1::* oneFunc)() = &Son1::First;
  char (Son2::* twoFunc)() = &Son2::Second;
 
  // OK, so we have created two pointers to member functions. What's the problem? Let's cast them to the parent type. It makes sense, don't you think? Abstraction and all that.
  char (Parent::* parentFuncOne)() = static_cast<char (Parent::*)()>(oneFunc);
  char (Parent::* parentFuncTwo)() = static_cast<char (Parent::*)()>(twoFunc);
 
  // Now let's use them. On a given object we can invoke any operation defined in one of its ancestors.
  (one->*parentFuncOne)(); // So we invoke an operation of Parent on a Parent. It should be OK.
  (two->*parentFuncOne)(); // Same applies here
}

Have you seen it? What is the last line really doing?
(two->*parentFuncOne)(); // So we invoke an operation of Parent on a Son2. It should be OK.

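It invokes Son1::First on a Son2 instance. So, inside the function, the implicit this parameter points to an object of a different type! The standard makes this undefined behaviour: a pointer to member may only be applied to an object whose dynamic type actually contains that member. Let's look at it closely: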
char Son1::First() {
  return this->c[SIZE-1]; // this is declared as Son1*, but it points to a Son2 object, not a Son1!
}
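
Why does this compile without any of the "dangerous" casts? Pointers to members convert implicitly from base to derived, and static_cast is allowed to perform the reverse conversion. Here is a minimal sketch of both directions, using hypothetical names (Base, Derived, checkConversions) that are not part of the code above. It only exercises the well-defined case, where the object really is a Derived:

```cpp
#include <cassert>

// Hypothetical types for illustration only.
struct Base {
  char inBase() { return 'b'; }
};

struct Derived : Base {
  char inDerived() { return 'd'; }
};

typedef char (Base::*BaseFn)();
typedef char (Derived::*DerivedFn)();

inline void checkConversions() {
  Derived d;
  Base* b = &d;

  // Base -> Derived: implicit, because every Derived contains a Base.
  DerivedFn up = &Base::inBase;
  assert((d.*up)() == 'b');

  // Derived -> Base: needs an explicit static_cast. Calling through it is
  // only valid because b really points to a Derived object.
  BaseFn down = static_cast<BaseFn>(&Derived::inDerived);
  assert((b->*down)() == 'd');
}
```

The compiler accepts the second conversion on trust: nothing checks that the object used later has the right dynamic type.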

On most systems, once the redundant dynamic_cast is taken out, this typically crashes because it accesses memory the process does not own: a Son2 does not have 1000000 bytes. It has no data members at all, so its sizeof is typically 1!
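
The size mismatch can be checked directly. This is a sketch that reuses the class names from the listing, stripped down to their data members, and assumes Parent is empty with no virtual functions. Exact sizes are implementation-defined, so the checks rely only on lower bounds (checkSizes is a hypothetical helper name):

```cpp
#include <cassert>
#include <cstddef>

static const int SIZE = 1000000;

struct Parent {};                        // assumption: empty, no virtual functions
struct Son1 : Parent { char c[SIZE]; };
struct Son2 : Parent {};

inline void checkSizes() {
  // Son1 has to hold the whole array.
  assert(sizeof(Son1) >= static_cast<std::size_t>(SIZE));
  // Every complete object occupies at least one byte, even an empty class.
  assert(sizeof(Son2) >= 1);
  // The offset of c[SIZE-1] lies far beyond the end of a Son2 object.
  assert(static_cast<std::size_t>(SIZE - 1) >= sizeof(Son2));
}
```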


TL;DR: I'm awesome because I can make this fail:

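struct Son1 : Parent { // Parent must be polymorphic for dynamic_cast to compile
  char First() {
    Son1* MyThis = dynamic_cast<Son1*>(static_cast<Parent*>(this));
    assert(MyThis); // Needs <cassert>; typically fires when First is invoked on a Son2
    return 'a';
  }
};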
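
Checking inside the member function comes too late, because the call itself is already undefined. One possible way to stay safe is to test the dynamic type before invoking the pointer to member. The following is only a sketch under two assumptions: Parent can be given a virtual destructor, and callIfSon1 is a hypothetical helper, not something from the code above. The array is shrunk to 16 bytes because its size does not matter here:

```cpp
#include <cassert>

struct Parent {
  virtual ~Parent() {}  // assumption: makes Parent polymorphic so dynamic_cast works
};

struct Son1 : Parent {
  char c[16];
  Son1() { c[15] = 'a'; }
  char First() { return c[15]; }
};

struct Son2 : Parent {
  char Second() { return 'z'; }
};

// Hypothetical helper: invokes f only when p really points to a Son1.
// Returns true and stores the result in out; otherwise returns false
// and leaves out untouched.
inline bool callIfSon1(Parent* p, char (Son1::*f)(), char& out) {
  assert(f != 0);                    // a null pointer to member must not be invoked
  Son1* s = dynamic_cast<Son1*>(p);  // null when the dynamic type is not Son1
  if (!s) return false;
  out = (s->*f)();
  return true;
}
```

With a Son1 object, callIfSon1(&obj, &Son1::First, out) returns true and stores 'a' in out; with a Son2 it returns false and the member function is never entered.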