nuclear-treestump/pydepgate

Live in production

A zero dependency lightweight static analyzer designed for adversarial-shape code in python to detect supply chain attacks before they reach your interpreter.

★ 10 · ▲ 5 since joining · ⑂ 1 · Python · Push 8d ago · Listed 16d ago · 4 open issues · Apache-2.0

nuclear-treestump.github.io/pydepgate/

apache2-license, cve, detector, open-source, open-source-security, python, python-security, python-security-tool

Python 99.3%
Shell 0.5%
Dockerfile 0.2%

1 Review

thejaycampbell · 16d ago

PyDepGate is one of the more interesting security tools I’ve looked at in this batch. It has a very specific threat model: Python code that can execute before a user’s script really starts, especially .pth files, sitecustomize.py, usercustomize.py, package __init__.py, setup.py, and generated console entry points. That focus is valuable because it is different from the usual “known vulnerable dependency” or generic static-analysis story. The README makes the problem concrete and explains why startup vectors are dangerous instead of assuming the reader already knows.
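
To make the .pth vector concrete: CPython's site module executes any .pth line that begins with "import " or "import" followed by a tab, and treats other non-comment lines as paths. The sketch below flags such lines by reading text only, never running it. It is a minimal illustration, not PyDepGate's actual analyzer; the names PthFinding and find_executable_pth_lines are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PthFinding:
    """Hypothetical record for one executable line in a .pth file."""
    line_number: int
    text: str


def find_executable_pth_lines(pth_text: str) -> list[PthFinding]:
    """Return lines the site module would exec() at interpreter startup.

    Mirrors the prefix check in CPython's site.addpackage: a line is
    executed when it starts with "import " or "import\t". The input is
    only read as text; nothing is imported or evaluated here.
    """
    findings = []
    for number, line in enumerate(pth_text.splitlines(), start=1):
        if line.startswith(("import ", "import\t")):
            findings.append(PthFinding(number, line.rstrip()))
    return findings


# Illustrative sample: a comment, a plain path entry, and an executable line.
sample = "\n".join([
    "# vendored paths",
    "../vendored/libs",
    "import base64; exec(base64.b64decode('cHJpbnQoMSk='))",
])

for finding in find_executable_pth_lines(sample):
    print(f"line {finding.line_number}: {finding.text}")
```

Only the third line of the sample is reported, because the first two are a comment and a path entry.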

The project is surprisingly mature for a repo with only a handful of stars. It has 200+ commits, PyPI packaging, Docker publishing, Apache-2.0 licensing, a security policy, contributing docs, a roadmap, CodeQL, unit-test workflows, SARIF validation, and a structured src/pydepgate package layout. The README is detailed without being vague: it covers wheel/sdist/package/file scanning, --single, --peek, recursive decoded-payload scanning, CI mode, custom rules, SARIF output, exit codes, Docker usage, pre-commit hooks, and a clear analyzer pipeline.

The design choices are strong. Zero runtime dependencies makes sense for a supply-chain defense tool, and the project is explicit about never executing, compiling, importing, or deserializing inspected input. The analyzer set also feels thoughtful: encoding abuse, dynamic execution, obfuscated strings, suspicious stdlib usage, and density-style signals. The rules engine is a good touch because it lets the same raw signal mean different things depending on context; a base64 blob in a normal library file is not the same risk as one in a .pth file.
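
As a rough sketch of that context-sensitivity idea, the same raw signal can be mapped to a different severity depending on where it was found. Everything here is assumed for illustration: the signal names, the severity labels, the escalation table, and the functions is_startup_vector and severity are hypothetical and do not reflect PyDepGate's real rule format.

```python
from pathlib import PurePosixPath

# Hypothetical set of file names that can run before user code starts.
STARTUP_NAMES = {"sitecustomize.py", "usercustomize.py", "setup.py", "__init__.py"}

# Hypothetical baseline severities for raw analyzer signals.
BASE_SEVERITY = {"base64_blob": "low", "dynamic_exec": "medium"}

# Hypothetical escalation applied when the file is a startup vector.
ESCALATE = {"info": "low", "low": "high", "medium": "critical"}


def is_startup_vector(path: str) -> bool:
    """True for .pth files and the startup-related file names above."""
    p = PurePosixPath(path)
    return p.suffix == ".pth" or p.name in STARTUP_NAMES


def severity(signal: str, path: str) -> str:
    """Map a raw signal to a severity, escalating in startup-vector files."""
    base = BASE_SEVERITY.get(signal, "info")
    return ESCALATE[base] if is_startup_vector(path) else base


print(severity("base64_blob", "pkg/utils/data.py"))  # low
print(severity("base64_blob", "site-packages/evil.pth"))  # high
```

Keeping the raw signal separate from the final severity is the design point: analyzers stay simple, and the context rules can be tuned without touching them.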

The main concern is trust calibration. Security tools need adoption, peer review, and careful false-positive reporting, and this project is still very small publicly: around 5 stars, no forks, and no open issues at the time I checked. I would add a short “known limitations” section near the top, publish a few real-world case studies, and include benchmark/false-positive results against a representative PyPI sample. The latest release, v0.4.7 on May 24, 2026, also appears ahead of the pyproject.toml version shown on main as 0.4.5, so keeping version metadata visibly in sync would avoid confusion.

Overall, PyDepGate looks technically serious and unusually well documented. It is still early from an ecosystem-trust perspective, but the threat model is sharp, the safety posture is sensible, and the CI/SARIF/Docker/PyPI integrations make it practical enough for security-minded Python teams to experiment with.

@0xIkari · 15d ago

Thank you for the review of pydepgate! I'm not sure when you pulled the repo, but the toml is showing 0.4.7 for me, committed three days ago. Nonetheless, good callout. I've had a bit of trouble keeping version metadata in sync due to shipping velocity.

Addressing the trust calibration point, I do actually test against a corpus of PyPI packages. I picked several large packages (numpy, scipy, pandas, and others, roughly 130 packages in total) and their dependencies, and ran against them to check for false positives. While yes, I should be looking at more, I figured a spot check against software old enough to vote would be a good way to get weird behavior to express itself. That's where the dragons are: old packages carry years of accumulated patterns and inherited build hacks that surface analyzer edge cases faster than synthetic samples ever would. I should publish the test list explicitly, and I'll get that up.

For the CVE scanner side, I tested by picking 200 package names from a hat that were present in the CVE DB and confirmed that the findings map correctly. There are also reports from scans of the LiteLLM 1.82.8 supply chain attack in samples/ in the repo. Those cover text, JSON, SARIF, and decoded-findings output across the same incident, so the multi-format coverage is exercised against a real CVE rather than a synthetic case.

On the Known Limitations point, the section does exist: https://github.com/nuclear-treestump/pydepgate/blob/main/README.md#known-limitations, but it's near the bottom of the README rather than near the top, so the structural critique is fair. Substantive limitations and edge-case documentation will live on the docs site.