Files
flow/docs/code-review.md
Tomas Mirchev 457fbc669c Add CLAUDE.md, remove old docs, track code review
- Add CLAUDE.md with development standards
- Remove docs/architecture.md and docs/flows.md (obsoleted by redesign spec)
- Track docs/code-review.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 04:27:57 +02:00

14 KiB

Flow CLI -- Code Review

Reviewed 2026-03-15 against commit 24d682a (main). Source: ~6,900 LOC across 21 modules. Tests: ~2,600 LOC, 167 pass / 6 skip.


1. Reason to Exist

Flow solves a real problem: managing a personal dev environment across machines. It unifies dotfiles linking, machine bootstrapping, container management, SSH entry, and binary package installs into one CLI. The alternative is a pile of shell scripts or Ansible for what is essentially personal-workstation setup. Flow is lighter, more opinionated, and self-hosted (config lives in your dotfiles repo). This is a reasonable project to maintain.


2. Features Supported

Area Implemented Notes
enter SSH handoff with tmux, terminfo warnings, target config Works
dev Container create/exec/connect/list/stop/remove/respawn Docker & Podman
dotfiles init/link/unlink/undo/status/sync/relink/clean/edit, modules, repo ops Most complex subsystem
bootstrap Profile-driven provisioning: packages, shell, locale, hostname, ssh-keygen, runcmd, post-link Works
package Binary package install/list/remove from manifest Real install; "remove" only forgets state
sync Git health check across ~/projects Works
completion Dynamic zsh completion with context-aware candidates Thorough

3. Mismatches Between Docs/README and Source Code

3.1 Architecture doc describes layers that don't fully exist yet

docs/architecture.md describes a clean adapter/service/runtime split:

flow.commands.* — argparse registration and compatibility wrappers only flow.services.* — domain behavior

In practice:

  • commands/sync.py still contains ~170 lines of its own git logic (_git, _check_repo, run_check, run_fetch, run_summary) that duplicate services/projects.py almost line-for-line. The service exists but the command module never imports or uses it.
  • commands/package.py does its own JSON state loading and binary install calls instead of delegating to services/packages.py. The service class PackageService exists but is never used anywhere.
  • commands/dotfiles.py is a 193-line shim that re-exports ~20 private functions from services/dotfiles.py, each wrapped in _sync_service_module(). The command module is not "argparse registration only" -- it's a compatibility layer that monkeypatches module-level constants into the service at runtime.

Verdict: The architecture doc describes the target state, not the actual state. Two service modules (projects, packages) are dead code.

3.2 README says "environments is not supported" but code still checks for it

services/bootstrap.py:63 raises RuntimeError if environments key is found. This is actually correct defensive code, but the README makes it sound like the key is simply ignored. Minor.

3.3 README documents flow dev create api -i tm0/node -p ~/projects/api

The code works, but the container is always sleep infinity + exec-in. README doesn't mention this is the execution model (not a standard container entry point).

3.4 docs/flows.md lists "Keep But Treat As Convenience Aliases" section

These aliases aren't flagged in code or docs as aliases. dotfiles sync is described as "repo pull + modules sync", but in code it's _pull_dotfiles + _sync_modules -- not actually calling the repo/modules commands. Functionally equivalent but not aliased.


4. Code Quality Assessment

4.1 Bugs

Duplicate method definition in system.py

FileSystem.write_bytes is defined twice (lines 259-261 and 263-265). The second definition silently shadows the first. They're identical, so no behavioral bug, but this is a clear copy-paste error.

commands/container.py has unused legacy functions

_container_exists, _container_running, and _tmux_fallback are defined at module level (lines 61-122) but never called -- the ContainerService class has its own versions. Dead code.

commands/bootstrap.py monkeypatches the service module

def _sync_service_module() -> None:
    _service.shutil = shutil
    _service.urllib = urllib
    _service._copy_install_item = _copy_install_item

This replaces shutil and urllib on the service module object at runtime. The service already imports these -- this patching has no real effect (the service's own imports win). _copy_install_item patch is circular: it reassigns the service function to a wrapper that calls the original. This is effectively a no-op.

dotfiles.py command module monkeypatches module-level constants

def _sync_service_module() -> None:
    _service.DOTFILES_DIR = DOTFILES_DIR
    _service.MODULES_DIR = MODULES_DIR
    _service.LINKED_STATE = LINKED_STATE
    ...

The service already imports these from flow.core.paths. This override only matters if the command module's own constants diverge from the service's -- but they're imported from the same source. The whole pattern is a code smell from an incomplete refactoring.

4.2 Massive Code Duplication

This is the single biggest quality issue. The bootstrap service duplicates nearly every function from package_defs.py:

services/package_defs.py services/bootstrap.py
linux_detect_package_manager() _linux_detect_package_manager()
resolve_package_manager() _resolve_package_manager()
get_package_catalog() _get_package_catalog()
normalize_profile_package_entry() _normalize_profile_package_entry()
resolve_package_spec() _resolve_package_spec()
resolve_pkg_source_name() _resolve_pkg_source_name()
platform_lookup_keys() _platform_lookup_keys()
resolve_binary_platform_vars() _resolve_binary_platform_vars()
resolve_binary_asset() _resolve_binary_asset()
resolve_binary_download_url() _resolve_binary_download_url()
validate_declared_install_path() _validate_declared_install_path()
install_destination() _install_destination()
install_strip_prefix() _install_strip_prefix()
strip_prefix() _strip_prefix()
BinaryInstaller.install() _install_binary_package()
BinaryInstaller.copy_install_item() _copy_install_item()
profile_template_context() _profile_template_context()
render_template_value() _render_template_value()

That's 18 duplicate function pairs with near-identical implementations. package_defs.py was clearly extracted as the canonical version, but bootstrap.py was never updated to import from it. Similarly, services/projects.py duplicates commands/sync.py.

Total duplicate code: roughly 500-600 lines.

4.3 Module-Level Singletons

services/dotfiles.py and services/bootstrap.py use module-level singletons:

_RUNNER = CommandRunner()
_FS = FileSystem()

These are then patched by the command modules. This makes the service modules hard to test in isolation and creates hidden mutable global state. The FlowContext already carries a SystemRuntime with runner, fs, and git -- but the service code ignores it and uses the module-level singletons instead.

4.4 Tests Reach Into Private APIs

Tests import private (underscore-prefixed) functions directly:

from flow.commands.bootstrap import (
    _ensure_required_variables,
    _get_profiles,
    _install_binary_package,
    ...
)

This couples tests to internal implementation details. Combined with the command-module shimming pattern, it means the test suite is testing the wrong layer (command shims instead of service logic directly).

4.5 Inconsistent Error Handling

Two exception hierarchies coexist:

  • FlowError(RuntimeError) -- defined in core/errors.py, used by system.py and services
  • Plain RuntimeError -- used throughout bootstrap.py, dotfiles.py, and command modules

The CLI entry point catches RuntimeError as the generic error type, so both work. But FlowError was clearly intended to be the project-wide error type and is underused. The bootstrap.py service (1001 lines) never imports or uses FlowError.

4.6 stow.py is Unused

core/stow.py (358 lines) implements a GNU Stow-style tree folding/unfolding algorithm with LinkTree, TreeFolder, LinkOperation. It has 310 lines of tests. But nothing in the application imports or uses it. The dotfiles service has its own LinkSpec-based approach. This is dead code (or a planned feature that never landed).

4.7 action.py is Unused

core/action.py (120 lines) defines Action and ActionExecutor for plan-then-execute workflows. It has 115 lines of tests. But no command or service uses it. The bootstrap command does its own sequential execution. Dead code.

4.8 process.py is a Thin Wrapper With Odd Restrictions

def run_command(command, console, *, check=True, shell=True, capture=False):
    if not shell:
        raise RuntimeError("run_command only supports shell commands")

It rejects shell=False even though the parameter exists in the signature. This wrapper creates a new CommandRunner() on every call instead of using the one from context.

4.9 Minor Issues

  • No type checking enforcement. pyproject.toml has no mypy/pyright configuration. Type annotations exist but are never validated.
  • ConsoleLogger hardcodes ANSI codes. No TTY detection, no --no-color flag. Piping output to a file produces escape sequences.
  • flow package remove only forgets state. README and docs/flows.md both acknowledge this, but the CLI help text says "Remove installed packages" without qualification. Users will expect files to be deleted.
  • SSH keygen quoting. bootstrap.py:747 passes shlex.quote results into a string that is later shell-executed. Double quoting could be an issue with values containing spaces, though unlikely in practice for key filenames.

5. Summary

What's Good

  • Clear domain model. The manifest format is well-designed: profiles, packages, dotfiles layout, modules, templates. It's genuinely useful.
  • Transactional dotfiles linking. The undo system with snapshots and backup directories is well-thought-out and more robust than most dotfile managers.
  • Defensive input validation. Config parsing handles multiple formats (dict keys, lists, string shorthand), missing values, and type mismatches gracefully.
  • Test coverage exists. 167 tests pass. The dotfiles folding and e2e container tests show real investment in correctness.
  • Clean CLI surface. Argparse registration is consistent, aliases work, help text is clear.
  • Zsh completion is thorough. Context-aware dynamic completion is a nice touch.

What Needs Work

Priority Issue Effort
High ~600 lines of duplicated code between bootstrap and package_defs Delete duplicates, import from package_defs
High Two dead modules (stow.py, action.py) + dead service modules (projects.py, packages.py) = ~900 lines of unused code Delete or wire up
High Command modules bypass their own service layer Finish the refactoring described in architecture.md
Medium Module-level singletons vs FlowContext.runtime Use runtime from context consistently
Medium _sync_service_module() monkeypatching pattern Remove once services use their own imports
Medium Inconsistent error types (RuntimeError vs FlowError) Standardize on FlowError
Low No color/TTY detection Add --no-color or detect isatty
Low package remove semantics Either implement real uninstall or rename to forget
Low Duplicate write_bytes method Delete the duplicate

Overall Assessment

This is a functional but mid-refactoring codebase. The core functionality works -- dotfiles linking, bootstrapping, container management, and SSH entry all do what they claim. The manifest model and transactional undo are genuinely well-designed.

The main debt is architectural: a service-layer extraction was started but left incomplete, resulting in large amounts of duplicated code, dead modules, and a monkeypatching compatibility layer. The tests pass but are coupled to the wrong abstraction layer.

For a personal "vibe-coded" tool, this is solid. For production or team use, the duplication and dead code need cleanup before adding more features.