# Flow CLI -- Code Review > Reviewed 2026-03-15 against commit `24d682a` (main). Source: ~6,900 LOC across 21 modules. Tests: > ~2,600 LOC, 167 pass / 6 skip. --- ## 1. Reason to Exist Flow solves a real problem: managing a personal dev environment across machines. It unifies dotfiles linking, machine bootstrapping, container management, SSH entry, and binary package installs into one CLI. The alternative is a pile of shell scripts or Ansible for what is essentially personal-workstation setup. Flow is lighter, more opinionated, and self-hosted (config lives in your dotfiles repo). This is a reasonable project to maintain. --- ## 2. Features Supported | Area | Implemented | Notes | | ------------ | --------------------------------------------------------------------------------------------- | ----------------------------------------- | | `enter` | SSH handoff with tmux, terminfo warnings, target config | Works | | `dev` | Container create/exec/connect/list/stop/remove/respawn | Docker & Podman | | `dotfiles` | init/link/unlink/undo/status/sync/relink/clean/edit, modules, repo ops | Most complex subsystem | | `bootstrap` | Profile-driven provisioning: packages, shell, locale, hostname, ssh-keygen, runcmd, post-link | Works | | `package` | Binary package install/list/remove from manifest | Real install; "remove" only forgets state | | `sync` | Git health check across `~/projects` | Works | | `completion` | Dynamic zsh completion with context-aware candidates | Thorough | --- ## 3. Mismatches Between Docs/README and Source Code ### 3.1 Architecture doc describes layers that don't fully exist yet `docs/architecture.md` describes a clean adapter/service/runtime split: > `flow.commands.*` — argparse registration and compatibility wrappers only `flow.services.*` — > domain behavior In practice: - **`commands/sync.py`** still contains ~170 lines of its own git logic (`_git`, `_check_repo`, `run_check`, `run_fetch`, `run_summary`) that duplicate `services/projects.py` almost line-for-line. The service exists but the command module **never imports or uses it**. - **`commands/package.py`** does its own JSON state loading and binary install calls instead of delegating to `services/packages.py`. The service class `PackageService` exists but is **never used anywhere**. - **`commands/dotfiles.py`** is a 193-line shim that re-exports ~20 private functions from `services/dotfiles.py`, each wrapped in `_sync_service_module()`. The command module is not "argparse registration only" -- it's a compatibility layer that monkeypatches module-level constants into the service at runtime. **Verdict:** The architecture doc describes the target state, not the actual state. Two service modules (`projects`, `packages`) are dead code. ### 3.2 README says "`environments` is not supported" but code still checks for it `services/bootstrap.py:63` raises `RuntimeError` if `environments` key is found. This is actually correct defensive code, but the README makes it sound like the key is simply ignored. Minor. ### 3.3 README documents `flow dev create api -i tm0/node -p ~/projects/api` The code works, but the container is always `sleep infinity` + exec-in. README doesn't mention this is the execution model (not a standard container entry point). ### 3.4 `docs/flows.md` lists "Keep But Treat As Convenience Aliases" section These aliases aren't flagged in code or docs as aliases. `dotfiles sync` is described as "`repo pull` + `modules sync`", but in code it's `_pull_dotfiles` + `_sync_modules` -- not actually calling the repo/modules commands. Functionally equivalent but not aliased. --- ## 4. Code Quality Assessment ### 4.1 Bugs #### Duplicate method definition in `system.py` `FileSystem.write_bytes` is defined **twice** (lines 259-261 and 263-265). The second definition silently shadows the first. They're identical, so no behavioral bug, but this is a clear copy-paste error. #### `commands/container.py` has unused legacy functions `_container_exists`, `_container_running`, and `_tmux_fallback` are defined at module level (lines 61-122) but never called -- the `ContainerService` class has its own versions. Dead code. #### `commands/bootstrap.py` monkeypatches the service module ```python def _sync_service_module() -> None: _service.shutil = shutil _service.urllib = urllib _service._copy_install_item = _copy_install_item ``` This replaces `shutil` and `urllib` on the service module object at runtime. The service already imports these -- this patching has no real effect (the service's own imports win). `_copy_install_item` patch is circular: it reassigns the service function to a wrapper that calls the original. This is effectively a no-op. #### `dotfiles.py` command module monkeypatches module-level constants ```python def _sync_service_module() -> None: _service.DOTFILES_DIR = DOTFILES_DIR _service.MODULES_DIR = MODULES_DIR _service.LINKED_STATE = LINKED_STATE ... ``` The service already imports these from `flow.core.paths`. This override only matters if the command module's own constants diverge from the service's -- but they're imported from the same source. The whole pattern is a code smell from an incomplete refactoring. ### 4.2 Massive Code Duplication This is the single biggest quality issue. The bootstrap service duplicates nearly every function from `package_defs.py`: | `services/package_defs.py` | `services/bootstrap.py` | | ------------------------------------- | ------------------------------------ | | `linux_detect_package_manager()` | `_linux_detect_package_manager()` | | `resolve_package_manager()` | `_resolve_package_manager()` | | `get_package_catalog()` | `_get_package_catalog()` | | `normalize_profile_package_entry()` | `_normalize_profile_package_entry()` | | `resolve_package_spec()` | `_resolve_package_spec()` | | `resolve_pkg_source_name()` | `_resolve_pkg_source_name()` | | `platform_lookup_keys()` | `_platform_lookup_keys()` | | `resolve_binary_platform_vars()` | `_resolve_binary_platform_vars()` | | `resolve_binary_asset()` | `_resolve_binary_asset()` | | `resolve_binary_download_url()` | `_resolve_binary_download_url()` | | `validate_declared_install_path()` | `_validate_declared_install_path()` | | `install_destination()` | `_install_destination()` | | `install_strip_prefix()` | `_install_strip_prefix()` | | `strip_prefix()` | `_strip_prefix()` | | `BinaryInstaller.install()` | `_install_binary_package()` | | `BinaryInstaller.copy_install_item()` | `_copy_install_item()` | | `profile_template_context()` | `_profile_template_context()` | | `render_template_value()` | `_render_template_value()` | That's **18 duplicate function pairs** with near-identical implementations. `package_defs.py` was clearly extracted as the canonical version, but `bootstrap.py` was never updated to import from it. Similarly, `services/projects.py` duplicates `commands/sync.py`. Total duplicate code: roughly 500-600 lines. ### 4.3 Module-Level Singletons `services/dotfiles.py` and `services/bootstrap.py` use module-level singletons: ```python _RUNNER = CommandRunner() _FS = FileSystem() ``` These are then patched by the command modules. This makes the service modules hard to test in isolation and creates hidden mutable global state. The `FlowContext` already carries a `SystemRuntime` with `runner`, `fs`, and `git` -- but the service code ignores it and uses the module-level singletons instead. ### 4.4 Tests Reach Into Private APIs Tests import private (underscore-prefixed) functions directly: ```python from flow.commands.bootstrap import ( _ensure_required_variables, _get_profiles, _install_binary_package, ... ) ``` This couples tests to internal implementation details. Combined with the command-module shimming pattern, it means the test suite is testing the wrong layer (command shims instead of service logic directly). ### 4.5 Inconsistent Error Handling Two exception hierarchies coexist: - `FlowError(RuntimeError)` -- defined in `core/errors.py`, used by `system.py` and services - Plain `RuntimeError` -- used throughout `bootstrap.py`, `dotfiles.py`, and command modules The CLI entry point catches `RuntimeError` as the generic error type, so both work. But `FlowError` was clearly intended to be the project-wide error type and is underused. The `bootstrap.py` service (1001 lines) never imports or uses `FlowError`. ### 4.6 `stow.py` is Unused `core/stow.py` (358 lines) implements a GNU Stow-style tree folding/unfolding algorithm with `LinkTree`, `TreeFolder`, `LinkOperation`. It has 310 lines of tests. But **nothing in the application imports or uses it**. The dotfiles service has its own `LinkSpec`-based approach. This is dead code (or a planned feature that never landed). ### 4.7 `action.py` is Unused `core/action.py` (120 lines) defines `Action` and `ActionExecutor` for plan-then-execute workflows. It has 115 lines of tests. But **no command or service uses it**. The bootstrap command does its own sequential execution. Dead code. ### 4.8 `process.py` is a Thin Wrapper With Odd Restrictions ```python def run_command(command, console, *, check=True, shell=True, capture=False): if not shell: raise RuntimeError("run_command only supports shell commands") ``` It rejects `shell=False` even though the parameter exists in the signature. This wrapper creates a new `CommandRunner()` on every call instead of using the one from context. ### 4.9 Minor Issues - **No type checking enforcement.** `pyproject.toml` has no mypy/pyright configuration. Type annotations exist but are never validated. - **`ConsoleLogger` hardcodes ANSI codes.** No TTY detection, no `--no-color` flag. Piping output to a file produces escape sequences. - **`flow package remove` only forgets state.** README and `docs/flows.md` both acknowledge this, but the CLI help text says "Remove installed packages" without qualification. Users will expect files to be deleted. - **SSH keygen quoting.** `bootstrap.py:747` passes `shlex.quote` results into a string that is later shell-executed. Double quoting could be an issue with values containing spaces, though unlikely in practice for key filenames. --- ## 5. Summary ### What's Good - **Clear domain model.** The manifest format is well-designed: profiles, packages, dotfiles layout, modules, templates. It's genuinely useful. - **Transactional dotfiles linking.** The undo system with snapshots and backup directories is well-thought-out and more robust than most dotfile managers. - **Defensive input validation.** Config parsing handles multiple formats (dict keys, lists, string shorthand), missing values, and type mismatches gracefully. - **Test coverage exists.** 167 tests pass. The dotfiles folding and e2e container tests show real investment in correctness. - **Clean CLI surface.** Argparse registration is consistent, aliases work, help text is clear. - **Zsh completion is thorough.** Context-aware dynamic completion is a nice touch. ### What Needs Work | Priority | Issue | Effort | | -------- | ------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------- | | High | ~600 lines of duplicated code between bootstrap and package_defs | Delete duplicates, import from package_defs | | High | Two dead modules (stow.py, action.py) + dead service modules (projects.py, packages.py) = ~900 lines of unused code | Delete or wire up | | High | Command modules bypass their own service layer | Finish the refactoring described in architecture.md | | Medium | Module-level singletons vs FlowContext.runtime | Use runtime from context consistently | | Medium | `_sync_service_module()` monkeypatching pattern | Remove once services use their own imports | | Medium | Inconsistent error types (RuntimeError vs FlowError) | Standardize on FlowError | | Low | No color/TTY detection | Add `--no-color` or detect `isatty` | | Low | `package remove` semantics | Either implement real uninstall or rename to `forget` | | Low | Duplicate `write_bytes` method | Delete the duplicate | ### Overall Assessment This is a **functional but mid-refactoring codebase**. The core functionality works -- dotfiles linking, bootstrapping, container management, and SSH entry all do what they claim. The manifest model and transactional undo are genuinely well-designed. The main debt is architectural: a service-layer extraction was started but left incomplete, resulting in large amounts of duplicated code, dead modules, and a monkeypatching compatibility layer. The tests pass but are coupled to the wrong abstraction layer. For a personal "vibe-coded" tool, this is solid. For production or team use, the duplication and dead code need cleanup before adding more features.