IDA interpreting C++ interfaces in C

In C++ we have classes. In memory, a class is essentially a C struct holding some data, plus functions that operate on it. Its members and member functions may be private, public or protected. A class may have a single base class (single inheritance) or several of them (multiple inheritance), as you may have seen in some complicated class hierarchies. In this article I will explain how to define classes in IDA carefully, so that the IDA decompiler can turn those nasty offsets into actual members or member functions of the particular class.

The basics

As you may know, it's possible to load a C header into IDA, which then turns the declarations in it into types that you can apply while reversing. For example, you create a header file, declare a struct there, load it into IDA, and the struct is then defined inside IDA, simple as that. This is how we'll define our C++ classes. But wait, how can we declare a C++ class inside a C header file? Well, we can, but it requires some knowledge of how classes are laid out in memory.

Single inheritance

class IBaseInterface
{
public:
    virtual ~IBaseInterface() {}
};

class IInterface : public IBaseInterface
{
public:
    virtual ~IInterface() {}

    virtual void a() {}
    virtual void b() {}
    virtual void c() {}
};

When a class has at least one virtual function, the compiler creates a so-called "Virtual Function Table" (VFT) for it. The VFT is essentially just an array of pointers, each pointing to a virtual function implementation. A pointer to this VFT is placed at the very beginning of the object (offset 0). The offsets used throughout this article assume a 32-bit target, where a pointer takes 4 bytes.
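If you want to see this for yourself, here is a minimal sketch that peeks at the first pointer-sized bytes of an object. The helper names `read_vft_pointer` and `objects_share_vft` are made up for this example, and the layout it relies on is what MSVC, GCC and Clang do in practice, not something the C++ standard promises.

```cpp
#include <cassert>
#include <cstring>

class IBaseInterface {
public:
    virtual ~IBaseInterface() {}
};

class IInterface : public IBaseInterface {
public:
    virtual ~IInterface() {}
    virtual void a() {}
    virtual void b() {}
    virtual void c() {}
};

// Hypothetical helper: copies the first pointer-sized bytes of an object.
// Assumption: the compiler stores the VFT pointer at offset 0, which is an
// implementation detail of the common ABIs and not a language guarantee.
inline const void* read_vft_pointer(const void* object) {
    const void* vft = nullptr;
    std::memcpy(&vft, object, sizeof(vft));
    return vft;
}

// Hypothetical helper: two objects of the same class should point at the
// same table, because the table belongs to the class, not to the object.
inline bool objects_share_vft() {
    IInterface first;
    IInterface second;
    const void* table = read_vft_pointer(&first);
    return table != nullptr && table == read_vft_pointer(&second);
}
```

With no data members, `sizeof(IInterface)` should come out equal to the size of one pointer, which is exactly the single VFT pointer.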

In-memory representation

class IBaseInterface_vftbl
   +---
 0 | &IBaseInterface::`virtual destructor`
 4 +---

class IInterface_vftbl
   +---
 0 | &IInterface::`virtual destructor` // overridden from IBaseInterface
 4 | &IInterface::a
 8 | &IInterface::b
12 | &IInterface::c
16 +---

class IInterface
   +---
 0 | IInterface_vftbl* __vftable // the VF table is always the first member
   | if IInterface had any members, they would be all placed here
 4 +---

As we can see, the IInterface class implements three unique virtual functions and one virtual destructor (overridden from IBaseInterface). These are all placed in the IInterface's VFT. So far the class doesn't have any members, so it only contains this pointer.
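To make the idea of "an array of pointers" concrete, here is a rough hand-written model of what a virtual call boils down to. Everything in it is hypothetical and exists only for illustration: the `_model` names, the `last_slot` field (the real IInterface has no data members) and the helper functions.

```cpp
#include <cassert>

struct IInterface_model;  // hypothetical stand-in for an IInterface object

// One function pointer per virtual function, in declaration order.
struct IInterface_vftbl_model {
    void (*destructor)(IInterface_model* self);
    void (*a)(IInterface_model* self);
    void (*b)(IInterface_model* self);
    void (*c)(IInterface_model* self);
};

struct IInterface_model {
    const IInterface_vftbl_model* vftable;  // first member, so it sits at offset 0
    int last_slot;                          // demo-only: records which slot ran
};

inline void model_destructor(IInterface_model* self) { self->last_slot = 0; }
inline void model_a(IInterface_model* self) { self->last_slot = 1; }
inline void model_b(IInterface_model* self) { self->last_slot = 2; }
inline void model_c(IInterface_model* self) { self->last_slot = 3; }

// The table itself: constant data shared by every object of the class.
const IInterface_vftbl_model g_model_vftbl = {
    model_destructor, model_a, model_b, model_c
};

// Roughly what a compiled `object->b()` does: load the table pointer from
// the object, pick the slot, and call it with the object as `this`.
inline int call_b(IInterface_model* object) {
    object->vftable->b(object);
    return object->last_slot;
}

// Builds one object, calls b() through the table and reports the slot hit.
inline int demo_call_b() {
    IInterface_model object = { &g_model_vftbl, -1 };
    return call_b(&object);
}
```

This is the shape we want the decompiler to show us: a named table member and a named slot instead of a raw offset.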

Class that inherits from IInterface

Now, what if we create another class that inherits from IInterface? The approach is similar, but more complicated. In short, the third class (CClass) overrides some of IInterface's functions and also adds new virtual functions and one non-virtual function. All of the virtual ones end up in a single VFT created for this class.

class CClass : public IInterface
{
public:
    virtual ~CClass() {}

    // virtual void a() {} <-- the one from IInterface will be used instead
    virtual void b() {}
    virtual void c() {}

public:
    virtual void do_bad_stuff() {} // these two create a new VFT entry
    virtual void do_nice_stuff() {}

    void do_class_things() {} // non-virtual isn't included inside VFT

public:
    int data;
};

In-memory representation

class IBaseInterface_vftbl // still has a standalone copy of itself
   +---
 0 | &IBaseInterface::`virtual destructor`
 4 +---

class IInterface_vftbl // same here
   +---
 0 | &IInterface::`virtual destructor` // overridden from IBaseInterface
 4 | &IInterface::a
 8 | &IInterface::b
12 | &IInterface::c
16 +---

class CClass_vftbl
   +---
 0 | &CClass::`virtual destructor` // overridden from IBaseInterface
 4 | &IInterface::a // CClass didn't implement a(), use from IInterface
 8 | &CClass::b // overridden from IInterface
12 | &CClass::c
16 | &CClass::do_bad_stuff // CClass virtual methods
20 | &CClass::do_nice_stuff
24 +---

class IInterface
   +---
 0 | CClass_vftbl* __vftable // IInterface + IBaseInterface VFT gets merged with CClass's VFT
   | // if IInterface had any members, they would be all placed here
 4 +---

class CClass_mbrs
   +---
 0 | data
 4 +---

class CClass
   +---
 0 | IInterface __baseclass_IInterface // The base class is just "copy-pasted" here
 4 | CClass_mbrs __members
 8 +---

It is important to note that while both the IBaseInterface and IInterface VFTs get merged into CClass's VFT, IBaseInterface and IInterface each retain a standalone copy of their own VFT somewhere in memory.
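Both points can be checked with a small experiment. This is only a sketch under the same assumption as before (VFT pointer at offset 0, as the common compilers do it), and the helpers `read_vft_pointer`, `base_shares_address` and `tables_differ` are names invented for this example.

```cpp
#include <cassert>
#include <cstring>

class IBaseInterface {
public:
    virtual ~IBaseInterface() {}
};

class IInterface : public IBaseInterface {
public:
    ~IInterface() override {}
    virtual void a() {}
    virtual void b() {}
    virtual void c() {}
};

class CClass : public IInterface {
public:
    ~CClass() override {}
    void b() override {}
    void c() override {}
    virtual void do_bad_stuff() {}
    virtual void do_nice_stuff() {}
    int data = 0;
};

// Hypothetical helper; assumes the VFT pointer is stored at offset 0.
inline const void* read_vft_pointer(const void* object) {
    const void* vft = nullptr;
    std::memcpy(&vft, object, sizeof(vft));
    return vft;
}

// The base class is "copy-pasted" at the start of the derived object, so
// converting CClass* to IInterface* should not move the pointer.
inline bool base_shares_address() {
    CClass object;
    IInterface* as_base = &object;
    return static_cast<void*>(as_base) == static_cast<void*>(&object);
}

// A plain IInterface keeps using its own standalone table, while a CClass
// object points at the merged CClass table.
inline bool tables_differ() {
    IInterface base;
    CClass derived;
    return read_vft_pointer(&base) != read_vft_pointer(&derived);
}
```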

Multiple inheritance

Now let's create a new class called CAnotherClass and make it inherit from multiple classes at once. The approach is again very similar, but more complicated.

class IInterface1
{
public:
    virtual void a1() {}
    virtual void b1() {}
    virtual void c1() {}

    virtual void interface1() {}
};

class CAnotherClass : public CClass, public IInterface1
{
public:
    virtual ~CAnotherClass() {}

    virtual void a1() {}
    virtual void b1() {}
    virtual void c1() {}

    virtual void another_class() {}

public:
    int another_data;
};

In-memory representation

class IInterface1_vftbl // original VFT for IInterface1, placed somewhere in memory
   +---
 0 | &IInterface1::a1
 4 | &IInterface1::b1
 8 | &IInterface1::c1
12 | &IInterface1::interface1
16 +---

// CAnotherClass's copy of IInterface1 with overridden functions (2nd inheritance)
class CAnotherClass_IInterface1_vftbl
   +---
 0 | &CAnotherClass::a1 // overridden by CAnotherClass
 4 | &CAnotherClass::b1 // same here
 8 | &CAnotherClass::c1 // same here
12 | &IInterface1::interface1 // IInterface1's unique virtual method
16 +---

class CAnotherClass_vftbl
   +---
 0 | &CAnotherClass::`virtual destructor` // overridden from IBaseInterface
 4 | &IInterface::a // again didn't implement a(), use from IInterface
 8 | &CClass::b // didn't implement, use from CClass
12 | &CClass::c // same here
16 | &CClass::do_bad_stuff // same here
20 | &CClass::do_nice_stuff // same here
24 | &CAnotherClass::another_class // now comes CAnotherClass's virtual methods
28 +---

class CAnotherClass_IInterface1
   +---
 0 | CAnotherClass_IInterface1_vftbl* __vftable
 4 +---

class IInterface
   +---
 0 | CAnotherClass_vftbl* __vftable
 4 +---

class CClass_mbrs
   +---
 0 | data
 4 +---

class CClass
   +---
 0 | IInterface __baseclass_IInterface
 4 | CClass_mbrs __members
 8 +---

class CAnotherClass_mbrs
   +---
 0 | another_data
 4 +---

class CAnotherClass
   +---
 0 | CClass __baseclass_CClass // 1st base class (contains CAnotherClass's VFT)
 8 | CAnotherClass_IInterface1 __baseclass_CAnotherClass_IInterface1 // 2nd inheritance
12 | CAnotherClass_mbrs _members
16 +---

In this example, CAnotherClass derives from CClass and IInterface1. IInterface1 has its own standalone VFT with its virtual functions; however, a second one is created for CAnotherClass because it overrides some of IInterface1's methods.

So now a CAnotherClass object contains two VFT pointers: one for the table merged with its first base class, which also holds its own new virtual functions, and another one for the virtual functions of the second base class.
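A side effect you will run into while reversing is that a pointer to the second base class does not point at the start of the object. The sketch below measures that shift; `first_base_offset` and `second_base_offset` are hypothetical helper names, and the exact numbers depend on the compiler and pointer size (8 in the 32-bit diagrams above).

```cpp
#include <cassert>
#include <cstddef>

class IBaseInterface {
public:
    virtual ~IBaseInterface() {}
};

class IInterface : public IBaseInterface {
public:
    ~IInterface() override {}
    virtual void a() {}
    virtual void b() {}
    virtual void c() {}
};

class CClass : public IInterface {
public:
    ~CClass() override {}
    void b() override {}
    void c() override {}
    virtual void do_bad_stuff() {}
    virtual void do_nice_stuff() {}
    int data = 0;
};

class IInterface1 {
public:
    virtual void a1() {}
    virtual void b1() {}
    virtual void c1() {}
    virtual void interface1() {}
};

class CAnotherClass : public CClass, public IInterface1 {
public:
    ~CAnotherClass() override {}
    void a1() override {}
    void b1() override {}
    void c1() override {}
    virtual void another_class() {}
    int another_data = 0;
};

// Byte distance from the start of the object to its CClass part.
inline std::ptrdiff_t first_base_offset() {
    CAnotherClass object;
    CClass* first = &object;
    return reinterpret_cast<const char*>(first) -
           reinterpret_cast<const char*>(&object);
}

// Byte distance from the start of the object to its IInterface1 part,
// which is where the second VFT pointer lives.
inline std::ptrdiff_t second_base_offset() {
    CAnotherClass object;
    IInterface1* second = &object;
    return reinterpret_cast<const char*>(second) -
           reinterpret_cast<const char*>(&object);
}
```

The first base should report 0, while the second one lands after the whole CClass part, so functions reached through the second table see an adjusted pointer.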

Even more inheritance?

So far we have only seen a class with two base classes; however, there can be far more than that. As you can see, this gets really complicated when you have a lot of inheritance, but the pattern is the same every time. The first base class's VFT always gets merged with the derived class's VFT, and every other base class gets its own VFT.

C implementation of the class

Now that we know what the class looks like in memory, we can recreate the class concept in C. Let's recreate CAnotherClass in C.

// header file containing C classes

struct IInterface;
struct IInterface_vftbl
{
    void (__thiscall* virtual_destructor)(IInterface* _this); // &IInterface::~IInterface
    void (__thiscall* a)                 (IInterface* _this); // &IInterface::a
    void (__thiscall* b)                 (IInterface* _this); // &IInterface::b
    void (__thiscall* c)                 (IInterface* _this); // &IInterface::c
};

struct CClass;
struct CClass_vftbl
{
    void (__thiscall* virtual_destructor)(CClass* _this);     // &CClass::~CClass
    void (__thiscall* a)                 (IInterface* _this); // &IInterface::a
    void (__thiscall* b)                 (CClass* _this);     // &CClass::b
    void (__thiscall* c)                 (CClass* _this);     // &CClass::c
    void (__thiscall* do_bad_stuff)      (CClass* _this);     // &CClass::do_bad_stuff
    void (__thiscall* do_nice_stuff)     (CClass* _this);     // &CClass::do_nice_stuff
};

struct IInterface1;
struct IInterface1_vftbl
{
    void (__thiscall* a1)                (IInterface1* _this); // &IInterface1::a1
    void (__thiscall* b1)                (IInterface1* _this); // &IInterface1::b1
    void (__thiscall* c1)                (IInterface1* _this); // &IInterface1::c1
    void (__thiscall* interface1)        (IInterface1* _this); // &IInterface1::interface1
};

// declared up front, because both tables below refer to CAnotherClass
struct CAnotherClass;
struct CAnotherClass_IInterface1_vftbl
{
    void (__thiscall* a1)                (CAnotherClass* _this); // &CAnotherClass::a1
    void (__thiscall* b1)                (CAnotherClass* _this); // &CAnotherClass::b1
    void (__thiscall* c1)                (CAnotherClass* _this); // &CAnotherClass::c1
    void (__thiscall* interface1)        (IInterface1* _this);   // &IInterface1::interface1
};

struct CAnotherClass_vftbl
{
    void (__thiscall* virtual_destructor)(CAnotherClass* _this); // &CAnotherClass::~CAnotherClass
    void (__thiscall* a)                 (IInterface* _this);    // &IInterface::a
    void (__thiscall* b)                 (CClass* _this);        // &CClass::b
    void (__thiscall* c)                 (CClass* _this);        // &CClass::c
    void (__thiscall* do_bad_stuff)      (CClass* _this);        // &CClass::do_bad_stuff
    void (__thiscall* do_nice_stuff)     (CClass* _this);        // &CClass::do_nice_stuff
    void (__thiscall* another_class)     (CAnotherClass* _this); // &CAnotherClass::another_class
};

struct CAnotherClass_IInterface1
{
    CAnotherClass_IInterface1_vftbl* __vftable;
};

struct IInterface
{
    CAnotherClass_vftbl* __vftable;
};

struct CClass_mbrs
{
    int data;
};

struct CClass
{
    IInterface __baseclass_IInterface;
    CClass_mbrs __members;
};

struct CAnotherClass_mbrs
{
    int another_data;
};

struct CAnotherClass
{
    CClass __baseclass_CClass;
    CAnotherClass_IInterface1 __baseclass_CAnotherClass_IInterface1;
    CAnotherClass_mbrs _members;
};
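Before trusting a header like this, it can help to compile a trimmed copy of the object structs and check the offsets against the diagrams. The sketch below is an assumption-laden simplification: the table pointers are reduced to opaque `const void*`, the leading double underscores are dropped from the member names because such names are reserved in standard C++, and `matches_32bit_diagram` is a helper invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Table pointers are kept opaque here; only the object layout matters.
struct IInterface                { const void* vftable; };
struct CClass_mbrs               { int data; };
struct CClass                    { IInterface baseclass_IInterface; CClass_mbrs members; };
struct CAnotherClass_IInterface1 { const void* vftable; };
struct CAnotherClass_mbrs        { int another_data; };

struct CAnotherClass {
    CClass                    baseclass_CClass;
    CAnotherClass_IInterface1 baseclass_CAnotherClass_IInterface1;
    CAnotherClass_mbrs        members;
};

// The first table pointer sits at the very start of the object.
static_assert(offsetof(CAnotherClass, baseclass_CClass) == 0,
              "first base must start at offset 0");

// The second table pointer follows the complete first base.
static_assert(offsetof(CAnotherClass, baseclass_CAnotherClass_IInterface1) == sizeof(CClass),
              "second base must follow the first one directly");

// True only when the layout matches the 32-bit numbers from the diagrams
// (second base at 8, members at 12, total size 16).
inline bool matches_32bit_diagram() {
    return offsetof(CAnotherClass, baseclass_CAnotherClass_IInterface1) == 8
        && offsetof(CAnotherClass, members) == 12
        && sizeof(CAnotherClass) == 16;
}
```

On a 64-bit build `matches_32bit_diagram()` should return false, which is a handy reminder that the offsets in this article only apply to 32-bit binaries.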

And that's it. If you load this header into IDA, you should have a fully working C version of the C++ CAnotherClass. Hope it helped!