pi: package pyghidra-mcp + wire as OMP MCP server

Adds two inline Python derivations to home/progs/pi.nix:

  - ghidrecomp 0.5.9 (clearbluejar/ghidrecomp) — required by pyghidra-mcp,
    not in nixpkgs.
  - pyghidra-mcp 0.2.2 (clearbluejar/pyghidra-mcp) — headless MCP server
    that exposes Ghidra's analysis primitives (decompile, disassemble,
    list_strings, get_xrefs_to, etc.) over Model Context Protocol stdio.

The wrapper bakes in GHIDRA_INSTALL_DIR=${pkgs.ghidra}/lib/ghidra so
pyghidra discovers the Ghidra install at runtime without env munging.

Wires into OMP via:
  - home.packages: pyghidra-mcp + pkgs.ghidra (GUI for occasional manual
    exploration alongside the agent-driven flow).
  - ~/.omp/agent/mcp.json: registers a 'ghidra' MCP server that spawns
    pyghidra-mcp on stdio when any of its tools are invoked.
  - ~/.omp/agent/skills/ghidra/SKILL.md: tells the agent when to reach
    for Ghidra (static binary RE) vs. usbmon (dynamic capture) vs. the
    built-in tools, and gives the canonical exploration workflow.

Replaces the previously recommended LaurieWired/GhidraMCP, which has
been stale since June 2025. clearbluejar/pyghidra-mcp is actively
maintained (last commit 3 days before this one), pure-Python via
pyghidra+jpype, and multi-binary capable in a single session.

Verified: pi.nix parses, the yarn NixOS closure evaluates, both
derivations build, and the wrapped binary's --help works (Ghidra runtime
discovered correctly via GHIDRA_INSTALL_DIR).
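
The manual --help check above could be captured as a small derivation so
it reruns on every build. This is only a sketch and not part of this
commit: the name "pyghidra-mcp-help-check" is hypothetical, the
`pyghidra-mcp` argument stands for the derivation defined in pi.nix, and
the writable HOME is an assumption about what pyghidra needs inside the
build sandbox.

```nix
# Hypothetical smoke test; not part of this commit.
# `pyghidra-mcp` is assumed to be the derivation defined in pi.nix.
{ pkgs, pyghidra-mcp }:

pkgs.runCommand "pyghidra-mcp-help-check"
  {
    # Puts the wrapped binary on PATH for the build script below.
    nativeBuildInputs = [ pyghidra-mcp ];
  }
  ''
    # Assumption: pyghidra may want a writable home directory.
    export HOME="$TMPDIR"
    # The build fails if the wrapped binary cannot start and print its usage.
    pyghidra-mcp --help > "$out"
  ''
```

Building this derivation would fail if the wrapper's GHIDRA_INSTALL_DIR
stopped pointing at a usable Ghidra install, to the same extent that the
manual --help run exercises that path.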
2026-05-04 20:28:13 -04:00
parent 9ef9389672
commit feae0f8002


@@ -61,6 +61,64 @@ let
) (findSkillDirs inputs.android-skills)
);
# ghidrecomp: command-line Ghidra decompiler. Required by pyghidra-mcp,
# not in nixpkgs as of now.
ghidrecomp = pkgs.python3Packages.buildPythonPackage rec {
pname = "ghidrecomp";
version = "0.5.9";
pyproject = true;
src = pkgs.python3Packages.fetchPypi {
inherit pname version;
hash = "sha256-ocluLUic2qMREO7kXWum8l3VZ/parj/WtQ9JgOood6I=";
};
nativeBuildInputs = [ pkgs.python3Packages.setuptools ];
propagatedBuildInputs = [ pkgs.python3Packages.pyghidra ];
pythonImportsCheck = [ "ghidrecomp" ];
meta = {
description = "Python command-line Ghidra decompiler";
homepage = "https://github.com/clearbluejar/ghidrecomp";
license = lib.licenses.mit;
};
};
# pyghidra-mcp: headless MCP server exposing Ghidra analysis primitives over
# the Model Context Protocol (clearbluejar/pyghidra-mcp). Replaces the
# better-known LaurieWired/GhidraMCP, which has been stale since mid-2025.
# Pure-Python via pyghidra/jpype — no Ghidra GUI required.
pyghidra-mcp = pkgs.python3Packages.buildPythonApplication rec {
pname = "pyghidra-mcp";
version = "0.2.2";
pyproject = true;
src = pkgs.python3Packages.fetchPypi {
pname = "pyghidra_mcp";
inherit version;
hash = "sha256-d3I9TP+OkLu6lU2994PR+77vIqB+4z8pHkHl56GNreY=";
};
nativeBuildInputs = [ pkgs.python3Packages.hatchling ];
propagatedBuildInputs = with pkgs.python3Packages; [
pyghidra
mcp
click
click-option-group
chromadb
ghidrecomp
];
# pyghidra discovers the Ghidra install via GHIDRA_INSTALL_DIR; bake it in
# at the wrapper level so the agent doesn't need to set it.
makeWrapperArgs = [
"--set"
"GHIDRA_INSTALL_DIR"
"${pkgs.ghidra}/lib/ghidra"
];
pythonImportsCheck = [ "pyghidra_mcp" ];
meta = {
description = "Python command-line Ghidra MCP server";
homepage = "https://github.com/clearbluejar/pyghidra-mcp";
license = lib.licenses.mit;
mainProgram = "pyghidra-mcp";
};
};
# Browser path for the playwright skill body.
playwrightChromium =
let
@@ -79,6 +137,8 @@ in
{
home.packages = [
inputs.llm-agents.packages.${pkgs.stdenv.hostPlatform.system}.omp
pyghidra-mcp
pkgs.ghidra # GUI Ghidra for occasional manual exploration
];
home.file = androidSkillFiles // {
@@ -88,6 +148,22 @@ in
# model/provider config: ~/.omp/agent/models.yml
".omp/agent/models.yml".text = builtins.toJSON ompModels;
# MCP server config: ~/.omp/agent/mcp.json
# OMP discovers servers from this file at startup. The ghidra entry below
# spawns pyghidra-mcp on stdio when the agent invokes any of its tools.
".omp/agent/mcp.json".text = builtins.toJSON {
"$schema" = "https://raw.githubusercontent.com/can1357/oh-my-pi/main/packages/coding-agent/src/config/mcp-schema.json";
mcpServers = {
ghidra = {
command = lib.getExe pyghidra-mcp;
args = [
"--transport"
"stdio"
];
};
};
};
# global instructions loaded at startup
".omp/agent/AGENTS.md".text = ''
You are an intelligent and observant agent.
@@ -210,5 +286,69 @@ in
export PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1
```
'';
".omp/agent/skills/ghidra/SKILL.md".text = ''
---
name: ghidra
description: Static binary reverse engineering via the Ghidra MCP server (pyghidra-mcp). Use for analyzing compiled binaries (Windows .exe, Linux ELF, Mach-O, .NET, JVM, raw firmware blobs) when you need decompiled C-pseudocode, function-level analysis, string searches inside .rdata, cross-references to imported APIs, or call-graph navigation.
---
# Ghidra (via pyghidra-mcp)
A headless MCP server is configured at `mcpServers.ghidra` in
`~/.omp/agent/mcp.json` and binds Ghidra's analysis engine to MCP tools
you can call directly. The Ghidra install lives at
`${pkgs.ghidra}/lib/ghidra`; pyghidra-mcp picks it up via the
GHIDRA_INSTALL_DIR env var that's wired into the binary's wrapper.
## When to use this
- Static analysis of any compiled binary you have on disk (or can extract
from a game install, container image, firmware dump, etc.).
- Finding the decision logic behind a runtime behavior, e.g. where in
F1 23's executable the adaptive-trigger code lives and what parameters
it passes.
- Extracting embedded tuning tables from `.rdata`/`.data` sections.
- Discovering which Sony / Steam / Windows HID APIs a game calls.
## Workflow
The first invocation imports a binary into a fresh Ghidra project and
runs auto-analysis (10-90 minutes depending on size). Subsequent calls
are fast.
Typical exploration sequence for a stripped C++ game binary:
1. `list_strings(filter="DualSense")` (or other relevant substring) to
find string literals; Codemasters/Ubisoft typically don't strip these.
2. `list_imports()` filtered for HID / Sony / Steam APIs to find the
haptic call surface.
3. `get_xrefs_to(<address-of-string-or-import>)` to surface call sites.
4. `decompile_function_by_address(<addr>)` to read C-pseudocode.
5. `set_decompiler_comment` and `rename_function` as you identify
components, so the database remembers your findings across calls.
## Loading a binary
Drop the binary somewhere readable (don't commit it to git: license + size)
and pass the absolute path to pyghidra-mcp's import tool:
```
/tmp/games/f1_23/F1_23_dx12.exe
/tmp/games/cyberpunk/Cyberpunk2077.exe
```
Auto-analysis runs once per binary; the project database persists in
`~/.cache/pyghidra-mcp/` so re-invocations are fast.
## What this is NOT for
- Dynamic capture: use usbmon + Wireshark for live HID traffic.
- PS5 binaries: encrypted, out of scope.
- Decoding live network traffic: use separate tooling.
Reverse engineering for interoperability is permitted under DMCA §1201(f)
and analogous EU provisions. Don't share decrypted/cracked binaries.
'';
};
}