In the open-source world, blogs are a lifeline for sharing tech ideas—like how devs on GCC or LLVM keep up with each other. I’m diving into compiler optimization, and this week’s about my Stage 2 GCC Plugin Project. I built a plugin to spot duplicate functions in code, which ties into real-world stuff like slimming down Linux kernel builds. Let’s break it down!
My goal was to make a GCC plugin that checks for function clones: functions that do the same thing under different names. I used GIMPLE, GCC’s mid-level intermediate representation, to compare them by counting basic blocks (straight-line chunks of code), counting statements, and collecting each statement’s GIMPLE code. If all three match the base function, I print PRUNE (cut it out!); if not, NOPRUNE (keep it). It’s like a code declutter tool.
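To make the comparison concrete, here is a minimal standalone sketch of the idea, with no GCC headers involved. The names ToyFunction, FunctionShape, summarize, and verdict are hypothetical and exist only for illustration; the integer codes stand in for whatever gimple_code() would return inside the real pass.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a function body: each inner vector is one basic
// block, and each int is the code of one statement in that block.
using ToyFunction = std::vector<std::vector<int>>;

// The three things the comparison looks at.
struct FunctionShape {
    int bb_count = 0;
    int stmt_count = 0;
    std::vector<int> stmt_codes;  // statement codes in visiting order
};

// Walk every block and statement once, collecting counts and codes.
FunctionShape summarize(const ToyFunction &fn) {
    FunctionShape shape;
    for (const auto &block : fn) {
        ++shape.bb_count;
        for (int code : block) {
            ++shape.stmt_count;
            shape.stmt_codes.push_back(code);
        }
    }
    return shape;
}

// "PRUNE" only when block count, statement count, and code sequence all match.
std::string verdict(const FunctionShape &base, const FunctionShape &candidate) {
    bool same = base.bb_count == candidate.bb_count
             && base.stmt_count == candidate.stmt_count
             && base.stmt_codes == candidate.stmt_codes;
    return same ? "PRUNE" : "NOPRUNE";
}
```

One thing this sketch makes visible: the check is structural only. Operands are never compared, and because the codes are flattened into one list, two functions with the same statements split differently across the same number of blocks would still come out as PRUNE.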
I started with a custom GCC build I called gcc-test-001. To make the plugin work, I needed GCC’s internal headers from the build’s plugin/include folder. I compiled it with g++ using the -fPIC and -shared flags to produce a shared library. I hit some compile errors, so I defined PREFERRED_DEBUGGING_TYPE=DWARF2 and flag_checking=0 with -D to keep it smooth. Here’s the command I used:
g++ -fPIC -shared -o project.so /home/sdmakki/git/gcc/gcc/pass_project.cc \
    -I/home/sdmakki/git/gcc/gcc \
    -I/home/sdmakki/git/gcc/include \
    -I/home/sdmakki/gcc-test-001/lib/gcc/x86_64-pc-linux-gnu/15.0.1/plugin/include \
    -I/home/sdmakki/gcc-test-001/include \
    -fno-rtti \
    -DPREFERRED_DEBUGGING_TYPE=DWARF2 -Dflag_checking=0
It’s a mouthful, but it points the compiler at all the right headers!
The plugin’s a GIMPLE pass called project. The pass class derives from gimple_opt_pass and is registered to run right after GCC’s cfg pass. It scans each function, grabs its structure (blocks and statements), and compares it to a “base” function (the plugin takes the base name from the function tagged with .resolver). I added debug logs to /tmp/plugin_debug.log to track what’s happening, which was super helpful when nothing showed up at first. Check the full code below!
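The base-matching step is plain string work, so it can be sketched on its own. This is a rough illustration, not the plugin’s exact code: base_from_resolver and is_variant_of are hypothetical helper names, and the function names in the usage note are made-up examples.

```cpp
#include <string>

// Returns the text before ".resolver", or an empty string when the name
// carries no ".resolver" tag.
std::string base_from_resolver(const std::string &name) {
    std::size_t pos = name.find(".resolver");
    return pos == std::string::npos ? std::string() : name.substr(0, pos);
}

// True when `name` starts with `base` and is not the resolver itself.
// Assumption: an empty base matches nothing. That guard is added here for
// the sketch; a bare prefix test would treat an empty base as matching
// every name.
bool is_variant_of(const std::string &name, const std::string &base) {
    if (base.empty())
        return false;
    return name.compare(0, base.size(), base) == 0
        && name.find(".resolver") == std::string::npos;
}
```

For example, base_from_resolver("scale_samples.resolver") gives "scale_samples", and is_variant_of("scale_samples.default", "scale_samples") is true. Prefix matching is loose, though: an unrelated function named scale_samples_v2 would also pass, so a stricter check might be worth it.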
I tested it on clone-test-core.c, a file with clone functions like sum_sample and scale_samples. Here’s how I ran it:
~/gcc-test-001/bin/gcc -O2 -fplugin=./project.so -fdump-tree-project -c clone-test-core.c
This spat out dump files like clone-test-core.c.019t.project and clone-test-core.c.369t.project. At first, no PRUNE or NOPRUNE showed up. Ugh! But after fixing the dump_file checks and base matching, it worked. Seeing those messages was a rush.
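When the dumps get long, a tiny helper can count the verdicts instead of eyeballing them. This is only a sketch and not part of the plugin: VerdictTally and tally_verdicts are hypothetical names, and it assumes verdict lines start with "PRUNE:" or "NOPRUNE:", which is the format the pass prints.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Running totals of the two verdicts found in a dump.
struct VerdictTally {
    int prune = 0;
    int noprune = 0;
};

// Reads a dump line by line and counts lines that begin with a verdict tag.
VerdictTally tally_verdicts(std::istream &in) {
    VerdictTally tally;
    std::string line;
    while (std::getline(in, line)) {
        // rfind(prefix, 0) == 0 is a "starts with" test.
        if (line.rfind("NOPRUNE:", 0) == 0)
            ++tally.noprune;
        else if (line.rfind("PRUNE:", 0) == 0)
            ++tally.prune;
    }
    return tally;
}
```

It takes any std::istream, so it works the same with a std::ifstream opened on one of the .project dump files or a std::istringstream in a quick test.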
- Built and ran a custom GCC plugin—huge for understanding compilers.
- Nailed a GIMPLE pass to analyze code structure.
- Figured out clone detection with basic blocks and statements.
- Debugged with logs—patience is key!
- Tested it on real code and got tree dumps working.
Here’s the plugin (pass_project.cc):
// GCC refuses to load a plugin that does not export this symbol.
extern "C" int plugin_is_GPL_compatible = 1;

#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "backend.h"
#include "tree.h"
#include "gimple.h"
#include "tree-pass.h"
#include "ssa.h"
#include "tree-pretty-print.h"
#include "gimple-iterator.h"
#include "gimple-walk.h"
#include "internal-fn.h"
#include "gimple-pretty-print.h"
#include "cgraph.h"
#include "gimple-ssa.h"
#include "attribs.h"
#include "pretty-print.h"
#include "tree-inline.h"
#include "intl.h"
#include "function.h"
#include "basic-block.h"
#include "plugin.h"
#include "plugin-version.h"
#include <string>
#include <vector>
#include <cstdio>

namespace {

const pass_data pass_data_project = {
    GIMPLE_PASS,   // type
    "project",     // name (also the dump file suffix)
    OPTGROUP_NONE, // optinfo_flags
    TV_NONE,       // tv_id
    PROP_cfg,      // properties_required
    0,             // properties_provided
    0,             // properties_destroyed
    0,             // todo_flags_start
    0              // todo_flags_finish
};

class pass_project : public gimple_opt_pass {
public:
    pass_project(gcc::context *ctxt)
        : gimple_opt_pass(pass_data_project, ctxt) {}

    bool gate(function *) override { return true; }
    unsigned int execute(function *) override;
};

unsigned int pass_project::execute(function *fun) {
    if (!fun || !fun->cfg)
        return 0;

    // Side log, so there is a trace even when no dump file was requested.
    FILE *debug_file = fopen("/tmp/plugin_debug.log", "a");
    if (debug_file) {
        fprintf(debug_file, ">>> GCC PASS PROJECT ACTIVE <<<\n");
        fclose(debug_file);
    }

    // dump_file is non-null only when dumping is enabled for this pass
    // (e.g. -fdump-tree-project).
    if (!dump_file)
        return 0;
    fprintf(dump_file, ">>> GCC PASS PROJECT ACTIVE <<<\n");

    // State kept across calls: execute() runs once per function.
    static std::string base_function;
    static int base_bb_count = 0;
    static int base_stmt_count = 0;
    static std::vector<int> base_codes;

    cgraph_node *node = cgraph_node::get(fun->decl);
    if (!node)
        return 0;
    const char *fname = node->name();
    if (!fname)
        return 0;
    std::string name(fname);

    // The first ".resolver" function seen supplies the base name.
    if (base_function.empty() && name.find(".resolver") != std::string::npos) {
        base_function = name.substr(0, name.find(".resolver"));
        fprintf(dump_file, "Found base function: %s\n", base_function.c_str());
        return 0;
    }

    // Skip functions that do not start with the base name, and resolvers.
    // Note: while base_function is still empty, find() returns 0, so every
    // non-resolver function passes this check.
    if (name.find(base_function) != 0 || name.find(".resolver") != std::string::npos)
        return 0;

    // Count blocks and statements, recording each statement's GIMPLE code.
    int bb_count = 0;
    int stmt_count = 0;
    std::vector<int> current_codes;
    basic_block bb;
    FOR_EACH_BB_FN(bb, fun) {
        bb_count++;
        for (gimple_stmt_iterator gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
            stmt_count++;
            current_codes.push_back(static_cast<int>(gimple_code(gsi_stmt(gsi))));
        }
    }

    if (base_bb_count == 0 && base_stmt_count == 0) {
        // First matching function becomes the base for later comparisons.
        base_bb_count = bb_count;
        base_stmt_count = stmt_count;
        base_codes = current_codes;
        fprintf(dump_file, "Base function: %s\n", name.c_str());
        fprintf(dump_file, "BBs: %d, Stmts: %d\n", base_bb_count, base_stmt_count);
    } else {
        if (bb_count != base_bb_count || stmt_count != base_stmt_count
            || current_codes != base_codes) {
            fprintf(dump_file, "NOPRUNE: %s\n", name.c_str());
        } else {
            fprintf(dump_file, "PRUNE: %s\n", name.c_str());
        }
    }
    return 0;
}

} // namespace

gimple_opt_pass *make_pass_project(gcc::context *ctxt) {
    return new pass_project(ctxt);
}

int plugin_init(struct plugin_name_args *plugin_info,
                struct plugin_gcc_version *version) {
    const char *plugin_name = plugin_info->base_name;
    struct register_pass_info pass_info;

    printf(">>> Plugin initialized successfully <<<\n");

    // Run the pass right after the first instance of the "cfg" pass.
    pass_info.pass = make_pass_project(nullptr);
    pass_info.reference_pass_name = "cfg";
    pass_info.ref_pass_instance_number = 1;
    pass_info.pos_op = PASS_POS_INSERT_AFTER;

    register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, nullptr, &pass_info);
    return 0;
}
Stage 2’s done! I’ve got a working GCC plugin that spots clones and helps trim code fat. Debugging was a slog, but it taught me tons about GCC’s plugin API and GIMPLE. Next up: more optimization adventures in SPO600. Thoughts on this? Hit me up in the comments!