
Support Copilot CLI multi-storage recall and provider-based fallback #5

Closed
jshessen wants to merge 2 commits into dezgit2025:main from jshessen:bug/issue-3-multistorage-recall


Conversation

@jshessen
Contributor

@jshessen jshessen commented Apr 22, 2026

Summary

This PR resolves the Copilot CLI 1.0.34+ storage compatibility issue by introducing provider-based multi-storage recall instead of assuming a single SQLite layout.

Closes #3

Storage model coverage by provider

| Provider | Storage location(s) | Storage format | Commands supported |
| --- | --- | --- | --- |
| cli (Copilot CLI) | `~/.copilot/session-store.db` | SQLite (legacy/current where present) | list, search, show, files, checkpoints, health, schema-check |
| cli fallback | `~/.copilot/session-state/*/events.jsonl` | JSONL events | list, search, show (schema-check reports compatibility mode) |
| vscode | `~/.config/Code/User/workspaceStorage/**/chatSessions/*.jsonl` (+ flatpak/snap variants) | JSONL logs | list, search, show, files |
| jetbrains | `~/.config/github-copilot/chat-sessions/*` + related chat files | file-backed JSON/session logs | list, search, show, files |
| neovim | `~/.config/github-copilot/**` and `~/.local/share/nvim/**` chat JSON/JSONL | file-backed JSON/JSONL | list, search, show, files |

Environment overrides supported

  • SESSION_RECALL_DB
  • SESSION_RECALL_CLI_STATE_ROOT
  • SESSION_RECALL_VSCODE_STORAGE
  • SESSION_RECALL_JETBRAINS_ROOT
  • SESSION_RECALL_NEOVIM_ROOT
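
A minimal sketch of how these overrides might be resolved, assuming each variable simply replaces one default path. The variable names are the ones listed above; the `DEFAULTS` mapping and the `resolve_root` helper are hypothetical and not taken from this PR, and pairing each variable with a single Linux default is a simplification of the storage table.

```python
import os
from pathlib import Path

# Hypothetical mapping of override variable -> default location.
# Variable names come from the PR; pairing each one with a single
# default path is an assumption made for illustration only.
DEFAULTS = {
    "SESSION_RECALL_DB": "~/.copilot/session-store.db",
    "SESSION_RECALL_CLI_STATE_ROOT": "~/.copilot/session-state",
    "SESSION_RECALL_VSCODE_STORAGE": "~/.config/Code/User/workspaceStorage",
    "SESSION_RECALL_JETBRAINS_ROOT": "~/.config/github-copilot",
    "SESSION_RECALL_NEOVIM_ROOT": "~/.config/github-copilot",
}


def resolve_root(var, env=None):
    """Return the override for `var` if set and non-empty, else its default."""
    env = os.environ if env is None else env
    raw = env.get(var) or DEFAULTS[var]
    return Path(raw).expanduser()
```

Passing `env` explicitly keeps the helper easy to test; in normal use it would be called as `resolve_root("SESSION_RECALL_DB")` and read from the process environment.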

Runtime behavior and fallback semantics

Provider discovery

At runtime, providers are discovered and filtered to those actually available on the machine:

  • CLI provider is available if either SQLite DB exists or session-state event files exist.
  • File-backed providers are available when their roots exist.
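
The two availability rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `discovery.py`: the function names, the `file_roots` mapping, and the use of plain provider-name strings are all hypothetical.

```python
from pathlib import Path


def cli_available(db_path: Path, state_root: Path) -> bool:
    """CLI provider: the SQLite DB exists, or at least one events.jsonl exists."""
    if db_path.is_file():
        return True
    return state_root.is_dir() and any(state_root.glob("*/events.jsonl"))


def discover(db_path: Path, state_root: Path, file_roots: dict) -> list:
    """Return the names of providers that are usable on this machine."""
    active = []
    if cli_available(db_path, state_root):
        active.append("cli")
    for name, root in file_roots.items():
        if root.exists():  # file-backed providers only need their root
            active.append(name)
    return active
```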

Command behavior in multi-storage mode

  • list, search, show, files, checkpoints now route through active providers.
  • schema-check is provider-aware:
    • Performs strict schema checks when CLI SQLite is present.
    • Returns compatibility detail (session-state-or-sqlite) when fallback event storage is active.
  • health is provider-aware:
    • Runs SQLite health dimensions when SQLite is available.
    • Falls back to provider compatibility dimensions when SQLite is not available.
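
One way to picture the provider-aware `schema-check` is the sketch below. The result dictionary shape, the `required_tables` parameter, and the function name are assumptions for illustration; only the `session-state-or-sqlite` detail string comes from the description above.

```python
import sqlite3
from pathlib import Path


def schema_check(db_path: Path, state_root: Path, required_tables: set) -> dict:
    """Strict check when SQLite is present; compatibility detail otherwise."""
    if db_path.is_file():
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            ).fetchall()
        finally:
            conn.close()
        missing = sorted(required_tables - {name for (name,) in rows})
        return {"mode": "sqlite", "ok": not missing, "missing": missing}
    if state_root.is_dir() and any(state_root.glob("*/events.jsonl")):
        # Fallback event storage is active: report compatibility mode.
        return {"mode": "compat", "ok": True, "detail": "session-state-or-sqlite"}
    return {"mode": "none", "ok": False, "detail": "no CLI storage found"}
```

A `health` command could follow the same branching, running its SQLite dimensions in the first branch and compatibility dimensions in the second.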

UX and recall quality improvements

  • Added session-recall repos to summarize discovered repositories/workspaces across providers.
  • Improved repository attribution for CLI session-state parsing.
  • Improved list/repos alignment and sparse-scope behavior so that recall output better matches the sessions actually discovered.

What changed

  • Added provider architecture:
    • src/session_recall/providers/ (base.py, discovery.py, copilot_cli.py, file_backends.py, common.py)
  • Added repo discovery command:
    • src/session_recall/commands/repos.py
  • Updated command paths for provider fallback and multi-storage context:
    • list, search, show, files, checkpoints, health, schema-check
  • Improved repo detection/output formatting:
    • src/session_recall/util/detect_repo.py
    • src/session_recall/util/format_output.py
  • Added/updated tests:
    • test_provider_backends.py
    • test_health_schema_multistorage.py
    • test_repo_scope_fallback.py
    • test_repos_command.py
    • updates in test_list_sessions.py

How this differs from PR #4

PR #4 primarily introduces a DB-layer adapter approach (db/jsonl_store.py + connect/schema integration).

This PR addresses the same issue via provider routing and command-layer fallback, including:

  • dynamic provider discovery,
  • explicit session-state support for CLI,
  • cross-surface file-backed providers (VS Code / JetBrains / Neovim),
  • and provider-aware health/schema semantics.

Both PRs target Issue #3, but they differ in architecture and operational surface area.
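
To make the command-layer approach concrete, here is a rough sketch of a `list` command fanning out across active providers. The `Session` and `Provider` types and the `list_across` function are hypothetical and do not reflect the interfaces in `providers/base.py`.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Session:
    session_id: str
    provider: str
    updated_at: float  # POSIX timestamp


class Provider(Protocol):
    name: str

    def list_sessions(self) -> Iterable[Session]: ...


def list_across(providers: Iterable[Provider], limit: int = 10) -> list:
    """Merge sessions from every active provider, newest first."""
    merged = [s for p in providers for s in p.list_sessions()]
    merged.sort(key=lambda s: s.updated_at, reverse=True)
    return merged[:limit]
```

With this shape the command never touches a storage format directly; each provider hides whether its sessions come from SQLite or from JSONL files.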

Validation

  • pytest src/session_recall/tests/ -q → 105 passed
  • ruff check src/ → all checks passed
  • Focused suite:
    • pytest -q src/session_recall/tests/test_provider_backends.py src/session_recall/tests/test_repo_scope_fallback.py src/session_recall/tests/test_repos_command.py src/session_recall/tests/test_list_sessions.py src/session_recall/tests/test_health_schema_multistorage.py → 18 passed

CONTRIBUTING checklist

  • Tests pass: pytest src/session_recall/tests/ -q
  • Lint passes: ruff check src/
  • No new runtime dependencies added
  • Docs updated for behavior changes (README.md)

dezgit2025 added a commit that referenced this pull request Apr 28, 2026
PR #5 remediation — 21 findings addressed across 6 phases:

Phase 1 — Foundation fixes:
  - CC1: File backends opt-in via SESSION_RECALL_ENABLE_FILE_BACKENDS
  - F7: list --limit default reverted to 10
  - F5: search excerpt restored to 250-char truncation
  - F1+F21: Deterministic labels + macOS VS Code path
  - CC4: Asymmetric lookback (5d JSONL / 30d SQLite)
  - F10: schema_problems() on repos command
  - F13: Provider field shortened/omitted

Phase 1.5 — WSL/Linux compatibility:
  - VS Code Server path, XDG dirs verified

Phase 2 — Structure + hardening:
  - CC2: file_backends.py split into providers/file/ subpackage
  - copilot_cli.py split into providers/copilot_cli/ subpackage
  - F3: Bounded JSONL reader (iter_jsonl_bounded)

Phase 3 — Security:
  - F2: Symlink guard (is_under_root) at all glob sites
  - F4: Trust level field + sentinel fence for untrusted content
  - F6: mtime prefilter + early termination

Phase 4 — Regression tests:
  - Token budget tests (list/search/files byte limits)
  - Adversarial tests (symlink, JSONL bomb, injection, nested JSON)

Phase 5 — Conventions:
  - CLAUDE.md LOC cap relaxed to 200/300
  - Version bumped to 0.2.0
  - CHANGELOG.md reformatted to Keep a Changelog

Phase 6 — Documentation:
  - README: What's New, multi-storage section, env vars, upgrade instructions
  - deploy/install.md: v0.2.0 upgrade guide, multi-storage config
  - PyPI publish workflow (.github/workflows/publish.yml)

171 tests passing (90 → 171). Zero runtime dependencies.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
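
For readers unfamiliar with the Phase 2 and Phase 3 hardening items, the sketch below shows what a symlink guard and a bounded JSONL reader could look like. Only the names `is_under_root` and `iter_jsonl_bounded` come from the commit message; the signatures, default limits, and skip-on-error behavior are assumptions and may differ from the shipped code.

```python
import json
from pathlib import Path
from typing import Iterator


def is_under_root(path: Path, root: Path) -> bool:
    """True if `path`, with symlinks resolved, still lies inside `root`."""
    try:
        path.resolve().relative_to(root.resolve())
        return True
    except ValueError:
        return False


def iter_jsonl_bounded(path: Path, max_lines: int = 1000,
                       max_line_bytes: int = 65536) -> Iterator[dict]:
    """Yield JSON objects while capping lines read and bytes held per line."""
    with path.open("rb") as fh:
        for _ in range(max_lines):
            raw = fh.readline(max_line_bytes + 1)
            if not raw:
                break  # end of file
            if len(raw) > max_line_bytes:
                # Oversized line: discard the remainder in bounded chunks.
                while raw and not raw.endswith(b"\n"):
                    raw = fh.readline(max_line_bytes)
                continue
            try:
                obj = json.loads(raw)
            except ValueError:
                continue  # malformed JSON or undecodable bytes
            if isinstance(obj, dict):
                yield obj
```

A caller would typically check `is_under_root(candidate, root)` on every path returned by a glob before handing it to `iter_jsonl_bounded`.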
dezgit2025 added a commit that referenced this pull request Apr 28, 2026
#11)

* Support multi-storage recall for Copilot CLI 1.0.34+ (fixes #3)

* Align PR with CONTRIBUTING checks (ruff clean + docs update)

* ci: add PyPI publish workflow on tag push

Trusted Publisher OIDC — no API tokens needed.
Triggers: git tag v* + git push origin v*
Pipeline: test matrix → build → PyPI publish → GitHub Release

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat: v0.2.0 — multi-storage recall, security hardening, token budgets

PR #5 remediation — 21 findings addressed across 6 phases:

Phase 1 — Foundation fixes:
  - CC1: File backends opt-in via SESSION_RECALL_ENABLE_FILE_BACKENDS
  - F7: list --limit default reverted to 10
  - F5: search excerpt restored to 250-char truncation
  - F1+F21: Deterministic labels + macOS VS Code path
  - CC4: Asymmetric lookback (5d JSONL / 30d SQLite)
  - F10: schema_problems() on repos command
  - F13: Provider field shortened/omitted

Phase 1.5 — WSL/Linux compatibility:
  - VS Code Server path, XDG dirs verified

Phase 2 — Structure + hardening:
  - CC2: file_backends.py split into providers/file/ subpackage
  - copilot_cli.py split into providers/copilot_cli/ subpackage
  - F3: Bounded JSONL reader (iter_jsonl_bounded)

Phase 3 — Security:
  - F2: Symlink guard (is_under_root) at all glob sites
  - F4: Trust level field + sentinel fence for untrusted content
  - F6: mtime prefilter + early termination

Phase 4 — Regression tests:
  - Token budget tests (list/search/files byte limits)
  - Adversarial tests (symlink, JSONL bomb, injection, nested JSON)

Phase 5 — Conventions:
  - CLAUDE.md LOC cap relaxed to 200/300
  - Version bumped to 0.2.0
  - CHANGELOG.md reformatted to Keep a Changelog

Phase 6 — Documentation:
  - README: What's New, multi-storage section, env vars, upgrade instructions
  - deploy/install.md: v0.2.0 upgrade guide, multi-storage config
  - PyPI publish workflow (.github/workflows/publish.yml)

171 tests passing (90 → 171). Zero runtime dependencies.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: resolve ruff lint errors (unused imports + __all__ re-exports)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: make test_schema_check_missing_db CI-compatible

Test now handles both local (session-state exists) and CI
(no Copilot CLI installed) environments gracefully.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: update progress.md (all phases complete) + save session context

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: jshessen <jeff.hessenflow@gmail.com>
Co-authored-by: Desi Villanueva <217994822+dezgit2025@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dezgit2025
Owner

Addressed via #11 — took the multi-storage concept but reimplemented with security hardening and provider isolation. Thank you for the contribution and the idea, @jshessen! The provider architecture, trust fencing, bounded JSONL reads, and token budget enforcement in v0.2.0 were all inspired by your work here.

@dezgit2025 dezgit2025 closed this Apr 28, 2026
@dezgit2025
Owner

Hey @jshessen — wanted to give you proper credit here. Your PR laid the groundwork for the multi-storage provider architecture we shipped in v0.2.0. The provider-based fallback design, the VS Code/JetBrains/Neovim backend concepts, and the JSONL event parsing all trace back to your work on this PR.

We ended up refactoring significantly before merging (different module structure, tighter schema validation, some scope changes), but the core idea of pluggable storage providers originated here. You've been added to the Contributors section in the README. Thanks for the solid foundation. 🙏

dezgit2025 added a commit that referenced this pull request Apr 29, 2026
Acknowledge jshessen's multi-storage provider architecture work
that laid the groundwork for the v0.2.0 provider system.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jshessen
Contributor Author

> Hey @jshessen — wanted to give you proper credit here. Your PR laid the groundwork for the multi-storage provider architecture we shipped in v0.2.0. The provider-based fallback design, the VS Code/JetBrains/Neovim backend concepts, and the JSONL event parsing all trace back to your work on this PR.
>
> We ended up refactoring significantly before merging (different module structure, tighter schema validation, some scope changes), but the core idea of pluggable storage providers originated here. You've been added to the Contributors section in the README. Thanks for the solid foundation. 🙏

Perfect -- I am just glad it can be extended to cover the other use cases


Development

Successfully merging this pull request may close these issues.

[Bug] Support Copilot CLI 1.0.34 session-state DB schema (session-store.db no longer present)
