- Add CLAUDE.md with development standards - Remove docs/architecture.md and docs/flows.md (obsoleted by redesign spec) - Track docs/code-review.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
14 KiB
Flow CLI -- Code Review
Reviewed 2026-03-15 against commit
24d682a(main). Source: ~6,900 LOC across 21 modules. Tests: ~2,600 LOC, 167 pass / 6 skip.
1. Reason to Exist
Flow solves a real problem: managing a personal dev environment across machines. It unifies dotfiles linking, machine bootstrapping, container management, SSH entry, and binary package installs into one CLI. The alternative is a pile of shell scripts or Ansible for what is essentially personal-workstation setup. Flow is lighter, more opinionated, and self-hosted (config lives in your dotfiles repo). This is a reasonable project to maintain.
2. Features Supported
| Area | Implemented | Notes |
|---|---|---|
enter |
SSH handoff with tmux, terminfo warnings, target config | Works |
dev |
Container create/exec/connect/list/stop/remove/respawn | Docker & Podman |
dotfiles |
init/link/unlink/undo/status/sync/relink/clean/edit, modules, repo ops | Most complex subsystem |
bootstrap |
Profile-driven provisioning: packages, shell, locale, hostname, ssh-keygen, runcmd, post-link | Works |
package |
Binary package install/list/remove from manifest | Real install; "remove" only forgets state |
sync |
Git health check across ~/projects |
Works |
completion |
Dynamic zsh completion with context-aware candidates | Thorough |
3. Mismatches Between Docs/README and Source Code
3.1 Architecture doc describes layers that don't fully exist yet
docs/architecture.md describes a clean adapter/service/runtime split:
flow.commands.*— argparse registration and compatibility wrappers onlyflow.services.*— domain behavior
In practice:
commands/sync.pystill contains ~170 lines of its own git logic (_git,_check_repo,run_check,run_fetch,run_summary) that duplicateservices/projects.pyalmost line-for-line. The service exists but the command module never imports or uses it.commands/package.pydoes its own JSON state loading and binary install calls instead of delegating toservices/packages.py. The service classPackageServiceexists but is never used anywhere.commands/dotfiles.pyis a 193-line shim that re-exports ~20 private functions fromservices/dotfiles.py, each wrapped in_sync_service_module(). The command module is not "argparse registration only" -- it's a compatibility layer that monkeypatches module-level constants into the service at runtime.
Verdict: The architecture doc describes the target state, not the actual state. Two service
modules (projects, packages) are dead code.
3.2 README says "environments is not supported" but code still checks for it
services/bootstrap.py:63 raises RuntimeError if environments key is found. This is actually
correct defensive code, but the README makes it sound like the key is simply ignored. Minor.
3.3 README documents flow dev create api -i tm0/node -p ~/projects/api
The code works, but the container is always sleep infinity + exec-in. README doesn't mention this
is the execution model (not a standard container entry point).
3.4 docs/flows.md lists "Keep But Treat As Convenience Aliases" section
These aliases aren't flagged in code or docs as aliases. dotfiles sync is described as
"repo pull + modules sync", but in code it's _pull_dotfiles + _sync_modules -- not actually
calling the repo/modules commands. Functionally equivalent but not aliased.
4. Code Quality Assessment
4.1 Bugs
Duplicate method definition in system.py
FileSystem.write_bytes is defined twice (lines 259-261 and 263-265). The second definition
silently shadows the first. They're identical, so no behavioral bug, but this is a clear copy-paste
error.
commands/container.py has unused legacy functions
_container_exists, _container_running, and _tmux_fallback are defined at module level (lines
61-122) but never called -- the ContainerService class has its own versions. Dead code.
commands/bootstrap.py monkeypatches the service module
def _sync_service_module() -> None:
_service.shutil = shutil
_service.urllib = urllib
_service._copy_install_item = _copy_install_item
This replaces shutil and urllib on the service module object at runtime. The service already
imports these -- this patching has no real effect (the service's own imports win).
_copy_install_item patch is circular: it reassigns the service function to a wrapper that calls
the original. This is effectively a no-op.
dotfiles.py command module monkeypatches module-level constants
def _sync_service_module() -> None:
_service.DOTFILES_DIR = DOTFILES_DIR
_service.MODULES_DIR = MODULES_DIR
_service.LINKED_STATE = LINKED_STATE
...
The service already imports these from flow.core.paths. This override only matters if the command
module's own constants diverge from the service's -- but they're imported from the same source. The
whole pattern is a code smell from an incomplete refactoring.
4.2 Massive Code Duplication
This is the single biggest quality issue. The bootstrap service duplicates nearly every function
from package_defs.py:
services/package_defs.py |
services/bootstrap.py |
|---|---|
linux_detect_package_manager() |
_linux_detect_package_manager() |
resolve_package_manager() |
_resolve_package_manager() |
get_package_catalog() |
_get_package_catalog() |
normalize_profile_package_entry() |
_normalize_profile_package_entry() |
resolve_package_spec() |
_resolve_package_spec() |
resolve_pkg_source_name() |
_resolve_pkg_source_name() |
platform_lookup_keys() |
_platform_lookup_keys() |
resolve_binary_platform_vars() |
_resolve_binary_platform_vars() |
resolve_binary_asset() |
_resolve_binary_asset() |
resolve_binary_download_url() |
_resolve_binary_download_url() |
validate_declared_install_path() |
_validate_declared_install_path() |
install_destination() |
_install_destination() |
install_strip_prefix() |
_install_strip_prefix() |
strip_prefix() |
_strip_prefix() |
BinaryInstaller.install() |
_install_binary_package() |
BinaryInstaller.copy_install_item() |
_copy_install_item() |
profile_template_context() |
_profile_template_context() |
render_template_value() |
_render_template_value() |
That's 18 duplicate function pairs with near-identical implementations. package_defs.py was
clearly extracted as the canonical version, but bootstrap.py was never updated to import from it.
Similarly, services/projects.py duplicates commands/sync.py.
Total duplicate code: roughly 500-600 lines.
4.3 Module-Level Singletons
services/dotfiles.py and services/bootstrap.py use module-level singletons:
_RUNNER = CommandRunner()
_FS = FileSystem()
These are then patched by the command modules. This makes the service modules hard to test in
isolation and creates hidden mutable global state. The FlowContext already carries a
SystemRuntime with runner, fs, and git -- but the service code ignores it and uses the
module-level singletons instead.
4.4 Tests Reach Into Private APIs
Tests import private (underscore-prefixed) functions directly:
from flow.commands.bootstrap import (
_ensure_required_variables,
_get_profiles,
_install_binary_package,
...
)
This couples tests to internal implementation details. Combined with the command-module shimming pattern, it means the test suite is testing the wrong layer (command shims instead of service logic directly).
4.5 Inconsistent Error Handling
Two exception hierarchies coexist:
FlowError(RuntimeError)-- defined incore/errors.py, used bysystem.pyand services- Plain
RuntimeError-- used throughoutbootstrap.py,dotfiles.py, and command modules
The CLI entry point catches RuntimeError as the generic error type, so both work. But FlowError
was clearly intended to be the project-wide error type and is underused. The bootstrap.py service
(1001 lines) never imports or uses FlowError.
4.6 stow.py is Unused
core/stow.py (358 lines) implements a GNU Stow-style tree folding/unfolding algorithm with
LinkTree, TreeFolder, LinkOperation. It has 310 lines of tests. But nothing in the
application imports or uses it. The dotfiles service has its own LinkSpec-based approach. This
is dead code (or a planned feature that never landed).
4.7 action.py is Unused
core/action.py (120 lines) defines Action and ActionExecutor for plan-then-execute workflows.
It has 115 lines of tests. But no command or service uses it. The bootstrap command does its own
sequential execution. Dead code.
4.8 process.py is a Thin Wrapper With Odd Restrictions
def run_command(command, console, *, check=True, shell=True, capture=False):
if not shell:
raise RuntimeError("run_command only supports shell commands")
It rejects shell=False even though the parameter exists in the signature. This wrapper creates a
new CommandRunner() on every call instead of using the one from context.
4.9 Minor Issues
- No type checking enforcement.
pyproject.tomlhas no mypy/pyright configuration. Type annotations exist but are never validated. ConsoleLoggerhardcodes ANSI codes. No TTY detection, no--no-colorflag. Piping output to a file produces escape sequences.flow package removeonly forgets state. README anddocs/flows.mdboth acknowledge this, but the CLI help text says "Remove installed packages" without qualification. Users will expect files to be deleted.- SSH keygen quoting.
bootstrap.py:747passesshlex.quoteresults into a string that is later shell-executed. Double quoting could be an issue with values containing spaces, though unlikely in practice for key filenames.
5. Summary
What's Good
- Clear domain model. The manifest format is well-designed: profiles, packages, dotfiles layout, modules, templates. It's genuinely useful.
- Transactional dotfiles linking. The undo system with snapshots and backup directories is well-thought-out and more robust than most dotfile managers.
- Defensive input validation. Config parsing handles multiple formats (dict keys, lists, string shorthand), missing values, and type mismatches gracefully.
- Test coverage exists. 167 tests pass. The dotfiles folding and e2e container tests show real investment in correctness.
- Clean CLI surface. Argparse registration is consistent, aliases work, help text is clear.
- Zsh completion is thorough. Context-aware dynamic completion is a nice touch.
What Needs Work
| Priority | Issue | Effort |
|---|---|---|
| High | ~600 lines of duplicated code between bootstrap and package_defs | Delete duplicates, import from package_defs |
| High | Two dead modules (stow.py, action.py) + dead service modules (projects.py, packages.py) = ~900 lines of unused code | Delete or wire up |
| High | Command modules bypass their own service layer | Finish the refactoring described in architecture.md |
| Medium | Module-level singletons vs FlowContext.runtime | Use runtime from context consistently |
| Medium | _sync_service_module() monkeypatching pattern |
Remove once services use their own imports |
| Medium | Inconsistent error types (RuntimeError vs FlowError) | Standardize on FlowError |
| Low | No color/TTY detection | Add --no-color or detect isatty |
| Low | package remove semantics |
Either implement real uninstall or rename to forget |
| Low | Duplicate write_bytes method |
Delete the duplicate |
Overall Assessment
This is a functional but mid-refactoring codebase. The core functionality works -- dotfiles linking, bootstrapping, container management, and SSH entry all do what they claim. The manifest model and transactional undo are genuinely well-designed.
The main debt is architectural: a service-layer extraction was started but left incomplete, resulting in large amounts of duplicated code, dead modules, and a monkeypatching compatibility layer. The tests pass but are coupled to the wrong abstraction layer.
For a personal "vibe-coded" tool, this is solid. For production or team use, the duplication and dead code need cleanup before adding more features.