Introduction

I’d always wondered why it’s impossible to define two functions with the same name in C, even if they have different parameters. Every time I tried, I’d get a compiler or linker error saying, “bro, you can’t do it”, while doing the same in C++ and other compiled languages worked fine. Eventually, I learned that the difference lies in a compilation technique known as name mangling.

Definition

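Name mangling (also called name decoration) is a technique in which the compiler encodes extra information, such as scope and parameter types, into the names of functions and variables. This way every entity ends up with a unique symbol name, even when several of them share the same name in the source code.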

Where Name Mangling Happens

First, let’s understand where exactly name mangling happens. It occurs during compilation in languages like C++. When the compiler translates your source code into machine code, it needs to ensure that every function, method, or variable has a unique identifier in the binary. This is particularly important in C++ because of features like:

  • Function Overloading: Multiple functions can have the same name but different parameter lists.

  • Namespaces: Functions or variables in different namespaces can have the same name.

  • Classes: Member functions in different classes can have the same name.

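To achieve this, the compiler “mangles” the names by encoding the function’s name, parameter types, and scope (namespace or class) into a unique string. The exact scheme depends on the compiler and ABI. Under the Itanium C++ ABI used by GCC and Clang, for instance, the return type of an ordinary (non-template) function is not part of the mangled name. This ensures that the linker can resolve references correctly.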

Why Name Mangling Matters

Without name mangling, we would face the same issues as in the C programming language, where every externally visible function name must be unique across the entire program. This would lead to:

  • Name Resolution Problems: If two functions have the same name, the linker wouldn’t know which one to use.

  • Modularity Issues: Libraries and modules could not have functions with the same name, even if they are in different namespaces or classes.

  • No Static Polymorphism: Features like function overloading would not be possible because the compiler wouldn’t be able to distinguish between functions with the same name but different parameter lists.

Learn By Doing

Here we have a simple example in C++ with function overloading:

#include <iostream>

int add(int a, int b) {
    return a + b;
}

double add(double a, double b) {
    return a + b;
}

int main() {
    std::cout << add(1, 2) << std::endl;
    std::cout << add(2.5, 3.0) << std::endl;
    return 0;
}

Now let’s look at the mangled names in the object file the compiler produces:

g++ -c main.cpp -o main.o
nm main.o | grep add

You’ll see output like this:

0000000000000014 T _Z3adddd
0000000000000000 T _Z3addii

Here, _Z3addii and _Z3adddd are the mangled names for the add functions. The _Z prefix marks a mangled C++ name, 3 is the length of the identifier add, and the ii and dd suffixes represent the parameter types (two ints and two doubles, respectively).
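If you want to decode such names from code rather than by eye, GCC and Clang ship a demangling helper, abi::__cxa_demangle, in the <cxxabi.h> header. The sketch below wraps it in a small function; the wrapper name demangle is my own, and the code assumes a toolchain that follows the Itanium C++ ABI (it will not build with MSVC).

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>   // std::free
#include <string>

// Returns the human-readable form of a mangled symbol,
// or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // Passing nullptr for the buffer lets the function allocate one with malloc.
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);  // the caller owns the malloc'ed buffer
    return result;
}
```

Calling demangle("_Z3addii") should give back add(int, int), and demangle("_Z3adddd") should give add(double, double). On the command line, piping nm output through c++filt does the same job.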

Now let’s look at the symbols in C, which does not mangle names. Here is a simple example:

#include <stdio.h>

int add(int a, int b) {
    return a + b;
}

int main() {
    printf("%d\n", add(1, 2));
    return 0;
}

Compile the C code and inspect the symbols using nm:

gcc -o main main.c
nm main | grep add

You will see something like:

0000000000001139 T add

Here, the function name add is not mangled. It appears exactly as it is in the source code.

Some fun

Now let’s play with name mangling in a fun, hands-on example. We will define a function in C++ and compile it into a static library, leaving its mangled name as it is, and then call it from C as an extern function.

Let’s start by writing the C++ function that we’ll later call from C. Create a file called manglelib.cpp:

#include <iostream>

void mangle_me(int x) {
    std::cout << "Hi, I am a C++ function called from C using its mangled name! Value: " << x << std::endl;
}

Next, we’ll compile the C++ code and pack it into a static library. A static library is an archive of object files that can be linked into a program at link time.

g++ -c manglelib.cpp -o manglelib.o
ar rcs manglelib.a manglelib.o

Now, let’s find out what the mangled name of the mangle_me function is. We’ll use the nm command to inspect the symbols in the static library.

nm manglelib.a | grep mangle_me

You’ll see output like this:

000000000000009b t _GLOBAL__sub_I__Z9mangle_mei
0000000000000000 T _Z9mangle_mei

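The uppercase T entry is the global function we are after. The lowercase t entry is a local, compiler-generated static initializer for this file, which most likely shows up because of the <iostream> include.

Now that we know the mangled name, let’s write a C program that calls the mangle_me function using its mangled name. Create a file called main.c: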

#include <stdio.h>

extern void _Z9mangle_mei(int x);

int main() {
    printf("Calling C++ function from C using its mangled name...\n");
    _Z9mangle_mei(42);
    return 0;
}

Next, we’ll compile the C code and link it with the static library. We will use gcc for this, and since gcc does not link the C++ standard library by default, we add -lstdc++ ourselves:

gcc main.c manglelib.a -o main -lstdc++

Finally, let’s run the program:

./main

Here we go:

Calling C++ function from C using its mangled name...
Hi, I am a C++ function called from C using its mangled name! Value: 42

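This fun example demonstrates how name mangling works in C++ and how you can use it to call C++ functions from C. Keep in mind that mangled names depend on the compiler and its ABI, so this trick is fragile and not something to rely on in real projects.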

Beyond C and C++: FFI (Foreign Function Interface)

While this example is fun and educational, it’s important to note that there’s a more portable way to achieve interoperability between programming languages: FFI (Foreign Function Interface).

What Is FFI?

FFI is a mechanism that allows code written in one programming language to call functions written in another. It typically works by having both sides agree on a common calling convention and on plain, unmangled symbol names (most often those of the C ABI), so you don’t have to rely on compiler-specific details like name mangling.
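On the C++ side, the usual way to expose such an unmangled entry point is extern "C". The sketch below wraps the mangle_me function from earlier in a C-linkage function; the wrapper name call_mangle_me and its status-code convention are my own assumptions, not part of any library.

```cpp
#include <iostream>

// Ordinary C++ function: its symbol is mangled
// (_Z9mangle_mei under the Itanium C++ ABI, as seen earlier).
void mangle_me(int x) {
    std::cout << "Hi, I am a C++ function! Value: " << x << std::endl;
}

// Hypothetical wrapper with C linkage: the exported symbol should be
// plain `call_mangle_me`, with no mangling applied.
extern "C" int call_mangle_me(int x) {
    try {
        mangle_me(x);
        return 0;    // success
    } catch (...) {
        return -1;   // never let a C++ exception escape into C code
    }
}
```

A C program can then declare int call_mangle_me(int x); and call it by that name, regardless of how a particular C++ compiler mangles mangle_me. Returning a status code instead of throwing keeps the interface expressible in plain C.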

Tools and Further Readings

Here is a list of resources that were used for this blog:

Tools and Commands

  1. nm Command:

    • The nm command is used to display symbol information in object files, static libraries, and shared libraries.
    • GNU nm Documentation
  2. gcc (GNU Compiler Collection):

Further Readings

  1. FFI (Foreign Function Interface):