Stage 2 - GCC Plugin Project

Introduction

In the open-source world, blogs are a lifeline for sharing tech ideas—like how devs on GCC or LLVM keep up with each other. I’m diving into compiler optimization, and this week’s about my Stage 2 GCC Plugin Project. I built a plugin to spot duplicate functions in code, which ties into real-world stuff like slimming down Linux kernel builds. Let’s break it down!

What I Did

My goal was to make a GCC plugin that checks for function clones: functions that do the same thing under different names. I used GIMPLE, GCC’s middle-end intermediate representation, to compare them on three things: the number of basic blocks (straight-line code chunks), the number of statements, and the sequence of GIMPLE statement codes. If all three match, I print PRUNE (cut it out!); if not, NOPRUNE (keep it). The check is structural only, so it never looks at operands. It’s like a code declutter tool.
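To make the comparison concrete, here’s a tiny standalone sketch of the idea with no GCC headers involved. The FunctionShape struct and the classify function are names I made up for illustration; the real plugin collects the same three pieces of data inside the pass itself.

```cpp
#include <string>
#include <vector>

// Hypothetical summary of one function's GIMPLE shape.
// The field names are illustrative and are not GCC types.
struct FunctionShape {
    int bb_count;                 // number of basic blocks
    int stmt_count;               // number of statements across all blocks
    std::vector<int> stmt_codes;  // statement codes in visit order
};

// Returns "PRUNE" when the candidate has the same shape as the base and
// "NOPRUNE" otherwise. Operands are not compared, so a match is only a
// structural hint, not proof that the two functions behave identically.
std::string classify(const FunctionShape &base, const FunctionShape &candidate) {
    bool same = base.bb_count == candidate.bb_count
             && base.stmt_count == candidate.stmt_count
             && base.stmt_codes == candidate.stmt_codes;
    return same ? "PRUNE" : "NOPRUNE";
}
```

Two functions with the same block count and statement count can still differ in one statement code, which is why the code sequence is part of the check.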

Setting It Up

I started with a custom GCC build I called gcc-test-001. To make the plugin work, I needed GCC’s internal headers from the plugin include folder. I compiled it with g++ using the -fPIC and -shared flags to get a shared library, plus -fno-rtti because GCC itself is built without RTTI. I hit some compile errors, so I defined PREFERRED_DEBUGGING_TYPE=DWARF2 and flag_checking=0 with -D to get past them. Here’s the command I used:

g++ -fPIC -shared -o project.so /home/sdmakki/git/gcc/gcc/pass_project.cc \
  -I/home/sdmakki/git/gcc/gcc \
  -I/home/sdmakki/git/gcc/include \
  -I/home/sdmakki/gcc-test-001/lib/gcc/x86_64-pc-linux-gnu/15.0.1/plugin/include \
  -I/home/sdmakki/gcc-test-001/include \
  -fno-rtti \
  -DPREFERRED_DEBUGGING_TYPE=DWARF2 -Dflag_checking=0
    

It’s a mouthful, but it points the compiler at all the right headers!

Writing the Plugin

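The plugin’s a GIMPLE pass called project. The pass class derives from gimple_opt_pass, and plugin_init registers it through the PLUGIN_PASS_MANAGER_SETUP callback so it runs right after the cfg pass. It runs once per function, grabs the function’s structure (blocks and statements), and compares it to a “base” function. The base name comes from the function whose name contains .resolver, and the first matching function the pass sees becomes the baseline. I added debug logs to /tmp/plugin_debug.log to track what’s happening, which was super helpful when nothing showed up at first. Check the full code below!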
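The name matching is the fiddly part, so here’s a rough standalone sketch of it using only std::string. The helper names base_from_resolver and is_variant_of are mine, not part of the plugin or of GCC, and the variant name in the usage note is made up. One deliberate difference: this sketch refuses to match anything while the base name is still empty, whereas the plugin’s prefix check lets every function through in that case.

```cpp
#include <string>

// Hypothetical helper: if `name` contains ".resolver", return everything
// before it (e.g. "scale_samples.resolver" -> "scale_samples").
// Returns an empty string when there is no ".resolver" in the name.
std::string base_from_resolver(const std::string &name) {
    std::string::size_type pos = name.find(".resolver");
    return pos == std::string::npos ? std::string() : name.substr(0, pos);
}

// Hypothetical helper: true when `name` starts with `base` and is not the
// resolver itself. Assumption: an empty base never matches, which is
// stricter than the prefix test used in the plugin.
bool is_variant_of(const std::string &name, const std::string &base) {
    if (base.empty()) return false;
    return name.compare(0, base.size(), base) == 0
        && name.find(".resolver") == std::string::npos;
}
```

With a base of scale_samples, a name such as scale_samples.arch_x86_64_v3 would count as a variant, while scale_samples.resolver and sum_sample would not. Because it’s a plain prefix test, an unrelated function whose name merely starts with the base name would also match.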

Testing It Out

I tested it on clone-test-core.c, a file with clone functions like sum_sample and scale_samples. Here’s how I ran it:

~/gcc-test-001/bin/gcc -O2 -fplugin=./project.so -fdump-tree-project -c clone-test-core.c
    

This spit out dump files like clone-test-core.c.019t.project and clone-test-core.c.369t.project. At first, no PRUNE or NOPRUNE showed up—ugh! But after fixing the dump_file checks and base matching, it worked. Seeing those messages was a rush.
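If you’d rather not eyeball the dump files, something like this could pull the verdicts out of the dump text. It’s only a sketch: collect_verdicts is a hypothetical helper of mine, and it assumes each verdict sits on its own line in the "PRUNE: name" or "NOPRUNE: name" form that the plugin prints.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Scans dump text line by line and returns (verdict, function name) pairs
// for lines that start with "PRUNE: " or "NOPRUNE: ". Other lines, such as
// the banner or the "BBs: ..." summary, are ignored.
std::vector<std::pair<std::string, std::string>>
collect_verdicts(const std::string &dump_text) {
    static const std::string tags[] = {"NOPRUNE: ", "PRUNE: "};
    std::vector<std::pair<std::string, std::string>> out;
    std::istringstream in(dump_text);
    std::string line;
    while (std::getline(in, line)) {
        for (const std::string &tag : tags) {
            if (line.compare(0, tag.size(), tag) == 0) {
                // Drop the trailing ": " from the tag to get the verdict.
                out.emplace_back(tag.substr(0, tag.size() - 2),
                                 line.substr(tag.size()));
                break;
            }
        }
    }
    return out;
}
```

Reading the dump file into a string and passing it in is left out here to keep the sketch short.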

What I Learned

- Built and ran a custom GCC plugin—huge for understanding compilers.
- Nailed a GIMPLE pass to analyze code structure.
- Figured out clone detection with basic blocks and statements.
- Debugged with logs—patience is key!
- Tested it on real code and got tree dumps working.

The Code

Here’s the plugin (pass_project.cc):

extern "C" int plugin_is_GPL_compatible = 1;

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree.h"
#include "gimple.h"
#include "tree-pass.h"
#include "ssa.h"
#include "tree-pretty-print.h"
#include "gimple-iterator.h"
#include "gimple-walk.h"
#include "internal-fn.h"
#include "gimple-pretty-print.h"
#include "cgraph.h"
#include "gimple-ssa.h"
#include "attribs.h"
#include "pretty-print.h"
#include "tree-inline.h"
#include "intl.h"
#include "function.h"
#include "basic-block.h"
#include "plugin.h"
#include "plugin-version.h"

#include <cstdio>
#include <string>
#include <vector>

namespace {
const pass_data pass_data_project = {
  GIMPLE_PASS, "project", OPTGROUP_NONE, TV_NONE, PROP_cfg, 0, 0, 0, 0
};

class pass_project : public gimple_opt_pass {
public:
  pass_project(gcc::context *ctxt) : gimple_opt_pass(pass_data_project, ctxt) {}
  bool gate(function *) override { return true; }
  unsigned int execute(function *) override;
};

unsigned int pass_project::execute(function *fun) {
    if (!fun || !fun->cfg) return 0;

    FILE *debug_file = fopen("/tmp/plugin_debug.log", "a");
    if (debug_file) {
        fprintf(debug_file, ">>> GCC PASS PROJECT ACTIVE <<<\n");
        fclose(debug_file);
    }

    if (!dump_file) return 0;

    fprintf(dump_file, ">>> GCC PASS PROJECT ACTIVE <<<\n");

    static std::string base_function;
    static int base_bb_count = 0;
    static int base_stmt_count = 0;
    static std::vector<int> base_codes;

    cgraph_node *node = cgraph_node::get(fun->decl);
    if (!node) return 0;

    const char *fname = node->name();
    if (!fname) return 0;

    std::string name(fname);

    // The first function whose name contains ".resolver" sets the base name.
    if (base_function.empty() && name.find(".resolver") != std::string::npos) {
        base_function = name.substr(0, name.find(".resolver"));
        fprintf(dump_file, "Found base function: %s\n", base_function.c_str());
        return 0;
    }

    // Skip resolvers and anything that does not start with the base name.
    // Note: while base_function is still empty, find() returns 0 for every
    // name, so all non-resolver functions pass this filter.
    if (name.find(base_function) != 0 || name.find(".resolver") != std::string::npos) return 0;

    int bb_count = 0;
    int stmt_count = 0;
    std::vector<int> current_codes;

    // Count blocks and statements, recording each statement's GIMPLE code.
    basic_block bb;
    FOR_EACH_BB_FN(bb, fun) {
        bb_count++;
        for (gimple_stmt_iterator gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
            stmt_count++;
            current_codes.push_back(static_cast<int>(gimple_code(gsi_stmt(gsi))));
        }
    }

    if (base_bb_count == 0 && base_stmt_count == 0) {
        base_bb_count = bb_count;
        base_stmt_count = stmt_count;
        base_codes = current_codes;
        fprintf(dump_file, "Base function: %s\n", name.c_str());
        fprintf(dump_file, "BBs: %d, Stmts: %d\n", base_bb_count, base_stmt_count);
    } else {
        if (bb_count != base_bb_count || stmt_count != base_stmt_count || current_codes != base_codes) {
            fprintf(dump_file, "NOPRUNE: %s\n", name.c_str());
        } else {
            fprintf(dump_file, "PRUNE: %s\n", name.c_str());
        }
    }

    return 0;
}

} // namespace

gimple_opt_pass *make_pass_project(gcc::context *ctxt) {
    return new pass_project(ctxt);
}

int plugin_init(struct plugin_name_args *plugin_info, struct plugin_gcc_version *version) {
    const char *plugin_name = plugin_info->base_name;
    struct register_pass_info pass_info;

    printf(">>> Plugin initialized successfully <<<\n");

    pass_info.pass = make_pass_project(nullptr);
    pass_info.reference_pass_name = "cfg";
    pass_info.ref_pass_instance_number = 1;
    pass_info.pos_op = PASS_POS_INSERT_AFTER;

    register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, nullptr, &pass_info);

    return 0;
}
    

Wrapping Up

Stage 2’s done! I’ve got a working GCC plugin that spots clones and helps trim code fat. Debugging was a slog, but it taught me tons about GCC’s plugin API and GIMPLE. Next up: more optimization adventures in SPO600. Thoughts on this? Hit me up in the comments!
