
feat: add SQLite database support to OpenCode analyzer #120

Open

mike1858 wants to merge 3 commits into main from feat/opencode-sqlite-support

Conversation

@mike1858 (Member) commented on Feb 17, 2026 (edited by coderabbitai bot)

Summary

OpenCode has migrated from individual JSON message files to a SQLite database (opencode.db). This adds seamless support for the new format alongside the existing JSON files — no new tab, all data merges under the existing OpenCode tab.

What changed

SQLite parsing

  • Parse messages from ~/.local/share/opencode/opencode.db using the message, session, project, and part tables
  • Batch-load tool call stats from the part table with a LIKE pre-filter to avoid deserializing large non-tool parts (text, reasoning, etc.)
  • Open database read-only with WAL support and busy timeout for safe concurrent access while OpenCode is running
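
A minimal sketch of such a read-only open, assuming the analyzer uses the rusqlite crate (the PR does not name the SQLite library, so the crate, function name, and timeout value below are illustrative, not the actual `open_db` implementation):

```rust
use std::time::Duration;

use rusqlite::{Connection, OpenFlags};

/// Hypothetical helper; the real `open_db` in src/analyzers/opencode.rs may differ.
fn open_opencode_db(path: &std::path::Path) -> rusqlite::Result<Connection> {
    // A read-only open never takes the write lock, so a running OpenCode
    // instance (which may have the database in WAL mode) is not blocked.
    let conn = Connection::open_with_flags(path, OpenFlags::SQLITE_OPEN_READ_ONLY)?;
    // If OpenCode briefly holds a lock, retry for a few seconds instead of
    // failing the whole parse immediately.
    conn.busy_timeout(Duration::from_secs(5))?;
    Ok(conn)
}
```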

Seamless integration

  • Deduplication: Messages are deduplicated across JSON files and the SQLite DB during the migration period using the same global_hash formula (opencode_{session_id}_{msg_id}); see the sketch after this list
  • Dynamic contribution strategy: MultiSession when DB exists (correct for multi-message source), SingleMessage for JSON-only installs
  • File watching: Watches both the legacy storage/message/ directory and the parent opencode/ directory for SQLite DB changes
  • Data path validation: Accepts both opencode.db and legacy .json files
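
For illustration, the deduplication key mentioned above reduces to a plain string; this is a sketch of the formula only, and the helper name here is hypothetical:

```rust
/// Hypothetical helper illustrating the shared global_hash formula; the real
/// code may build the string inline rather than through a function like this.
fn opencode_global_hash(session_id: &str, msg_id: &str) -> String {
    format!("opencode_{session_id}_{msg_id}")
}

fn main() {
    // The same message read from a legacy JSON file and from opencode.db
    // produces the same key, so only one copy survives deduplication.
    assert_eq!(
        opencode_global_hash("ses_abc", "msg_123"),
        "opencode_ses_abc_msg_123"
    );
}
```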

Refactoring

Extracted shared logic into reusable helpers:

  • compute_message_stats() — stats computation from message + tool stats
  • build_conversation_message() — ConversationMessage construction
  • json_to_conversation_message() — legacy JSON path wrapper
  • Made OpenCodeMessage.id and session_id #[serde(default)] so the same struct parses both full JSON files and DB data blobs (which omit those fields; they come from DB columns instead)
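
A hedged sketch of that serde change; only `id` and `session_id` come from the PR, while the remaining field and all types are placeholders rather than the analyzer's actual struct:

```rust
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct OpenCodeMessage {
    // Present in full JSON message files; absent from SQLite `data` blobs,
    // where they are filled in from the message table's columns instead.
    #[serde(default)]
    id: String,
    #[serde(default)]
    session_id: String,
    // Placeholder for the rest of the message payload.
    #[serde(default)]
    role: Option<String>,
}
```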

Compatibility

Supports three user states:

  1. JSON only (older OpenCode versions) — works as before
  2. SQLite only (new OpenCode after migration) — reads from DB
  3. Both (during migration transition) — reads from both, deduplicates

Tests

Added 20 new tests covering:

  • SQLite data blob parsing (assistant, user, minimal)
  • Stats computation (with cost, user messages, tool stats preservation)
  • build_conversation_message with various project hash fallbacks
  • Global hash consistency between JSON and SQLite paths
  • In-memory SQLite integration tests (projects, sessions, tool stats, end-to-end message conversion)

All 220 tests pass (up from 200). Clippy, fmt, and doc checks clean.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added SQLite database support while keeping legacy JSON compatibility
    • Automatic deduplication across JSON and SQLite sources
    • Live detection of database and legacy data changes; improved migration handling
    • More resilient message parsing with broader optional metadata support
  • Tests

    • Expanded test coverage for SQLite parsing, mixed-source workflows, stats aggregation, and migration scenarios

coderabbitai bot commented on Feb 17, 2026 (edited)

Warning

Rate limit exceeded

@mike1858 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 17 minutes and 55 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Adds SQLite (opencode.db) support to the OpenCode analyzer alongside legacy JSON: database access helpers, SQLite message parsing and tool-stats aggregation, unified conversion to ConversationMessage, parallel parsing of mixed sources, deduplication by global hash, and updated discovery/watch logic.

Changes

| Cohort / File(s) | Summary |
|---|---|
| SQLite DB integration & loaders (`src/analyzers/opencode.rs`) | Adds open_db, DbProject, DbSession, load_projects_from_db, load_sessions_from_db, batch_load_tool_stats_from_db, and parse_sqlite_messages for read-only SQLite querying and pre-aggregated tool stats. |
| Unified message conversion (`src/analyzers/opencode.rs`) | Introduces json_to_conversation_message, build_conversation_message, compute_message_stats, expanded OpenCodeMessage serde fields (many made optional/defaulted), and logic to merge JSON- and DB-origin messages into ConversationMessage. |
| Parsing flow & parallelization (`src/analyzers/opencode.rs`) | Implements parse_sources_parallel_with_paths, parse_sources_parallel, source partitioning (JSON vs DB), deduplication by global hash, and get_stats_with_sources that aggregates both source types. |
| Data discovery & watcher updates (`src/analyzers/opencode.rs`) | Updates get_data_glob_patterns, discover_data_sources, is_available, is_valid_data_path, get_watch_directories, and contribution_strategy to recognize opencode.db and legacy JSON directories and adapt strategy (MultiSession vs SingleMessage). |
| Filesystem helpers & legacy JSON support (`src/analyzers/opencode.rs`) | Adds storage_root, db_path, app_dir, has_sqlite_db, has_json_messages, load_projects, load_sessions, extract_tool_stats_from_parts, and ms_to_datetime to support legacy filesystem parts and timestamp normalization. |
| Tests (`src/analyzers/opencode.rs`) | Extensive tests added for SQLite blob parsing, in-memory DB flows, timestamp conversion, stats computation, message construction, tool-stats extraction, and global-hash consistency between JSON and SQLite sources. |

Sequence Diagram

```mermaid
sequenceDiagram
    participant Analyzer as OpenCode Analyzer
    participant Discovery as Data Discovery
    participant FS as Filesystem (JSON)
    participant DB as SQLite DB
    participant Parser as Parser/Converter
    participant Stats as Tool Stats Aggregator
    participant Dedup as Deduplicator
    participant Output as Output
    Analyzer->>Discovery: discover data sources
    Discovery->>FS: detect legacy JSON message dirs
    Discovery->>DB: detect opencode.db
    par JSON path
        FS->>Parser: load JSON files
        Parser->>Stats: extract tool stats from parts
        Stats-->Parser: per-message tool stats
        Parser->>Parser: json_to_conversation_message
    and DB path
        DB->>Parser: open_db (read-only WAL mode)
        Parser->>Stats: batch_load_tool_stats_from_db
        Stats-->Parser: aggregated tool stats
        Parser->>Parser: parse_sqlite_messages -> build_conversation_message
    end
    Parser->>Dedup: emit messages (with global hash)
    Dedup->>Output: unified, deduplicated messages
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Hopped from files to a database den,
Two homes for messages, I visited again.
I stitched their stories, stats in my paw,
Hashes aligned—what a nifty law! 🥕

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding SQLite database support to the OpenCode analyzer, which is the core objective of the PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Fixes clippy::field_reassign_with_default triggered by CI's
`cargo clippy --tests -- -D warnings`.
@coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (5)
src/analyzers/opencode.rs (5)

459-477: Duplicated tool-stats accumulation logic.

The tool-name matching and stat incrementing in extract_tool_stats_from_parts (lines 459–477) and batch_load_tool_stats_from_db (lines 633–652) are identical. Extracting a shared helper would reduce the chance of future divergence.

♻️ Proposed shared helper

```rust
fn accumulate_tool_stat(stats: &mut Stats, tool_name: &str, value: &OwnedValue) {
    stats.tool_calls += 1;
    match tool_name {
        "read" => {
            stats.files_read += 1;
        }
        "glob" => {
            stats.file_searches += 1;
            if let Some(count) = value
                .get("state")
                .and_then(|s| s.get("metadata"))
                .and_then(|m| m.get("count"))
                .and_then(|c| c.as_u64())
            {
                stats.files_read += count;
            }
        }
        _ => {}
    }
}
```

Then both call sites become:

```diff
-stats.tool_calls += 1;
-
-match tool_name {
-    "read" => {
-        stats.files_read += 1;
-    }
-    "glob" => {
-        stats.file_searches += 1;
-        ...
-    }
-    _ => {}
-}
+accumulate_tool_stat(&mut stats, tool_name, &value);
```

Also applies to: 633-652

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/analyzers/opencode.rs` around lines 459 - 477, Extract the duplicated
tool-stat logic into a helper function (e.g., fn accumulate_tool_stat(stats:
&mut Stats, tool_name: &str, value: &OwnedValue)) and replace the matching
blocks in extract_tool_stats_from_parts and batch_load_tool_stats_from_db with
calls to this helper; the helper should increment stats.tool_calls and handle
"read" and "glob" cases (including extracting the nested
"state"->"metadata"->"count" as_u64 to add to stats.files_read and increment
stats.file_searches for "glob"). Ensure the function is visible to both call
sites (module-level) and use the same Stats and OwnedValue types as in the
original code.

883-927: Duplicated source-partitioning and parsing logic across methods.

get_stats_with_sources (lines 883–927) repeats the partition → load-context → parallel-parse-JSON → sequential-parse-DB pattern that already exists in parse_sources_parallel_with_paths (lines 814–860). Consider reusing parse_sources_parallel:

```rust
fn get_stats_with_sources(&self, sources: Vec<DataSource>) -> Result<AgenticCodingToolStats> {
    let messages = self.parse_sources_parallel(&sources);
    // ... aggregate stats from `messages` ...
}
```

This eliminates ~40 lines of duplicated logic and ensures future changes to the parsing pipeline are applied in one place.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/analyzers/opencode.rs` around lines 883 - 927, get_stats_with_sources
currently reimplements the partition/load/parallel-JSON/sequential-DB parsing
logic already implemented in parse_sources_parallel_with_paths (aka
parse_sources_parallel); replace the duplicated block in get_stats_with_sources
with a call to that parsing helper to obtain Vec<ConversationMessage> (or adapt
the helper to return that type), e.g. let messages =
self.parse_sources_parallel_with_paths(sources) and then aggregate stats from
messages; ensure any error handling or storage_root-dependent behavior is
centralized in parse_sources_parallel_with_paths and update
get_stats_with_sources to use its return value for further aggregation.

597-603: LIKE pre-filter may miss tool parts with unexpected JSON formatting.

The two LIKE patterns cover "type":"tool" and "type": "tool", but won't match other valid JSON whitespace variants (e.g., "type" : "tool" or multi-line formatting). Since the filter is an optimization and false negatives would silently drop tool stats, consider a single broader pattern or a note documenting the assumption about OpenCode's serialization format.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/analyzers/opencode.rs` around lines 597 - 603, The current conn.prepare
call uses two specific LIKE patterns that miss valid JSON spacing/formatting
variants and can silently drop tool parts; replace the fragile LIKE filter with
a robust check such as using SQLite JSON functions (e.g., json_extract(data,
'$.type') = 'tool') or broaden the pattern to a single catch‐all before parsing,
so all parts with type=="tool" are reliably detected; update the SQL string
passed to conn.prepare (the query in the SELECT message_id, data FROM part ...)
accordingly and ensure subsequent code that deserializes data still handles
non-tool rows if you keep a looser pre-filter.
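
A hedged sketch of the broader pre-filter suggested above, assuming the analyzer queries through rusqlite and that SQLite's JSON1 functions are available (they are in rusqlite's bundled build); the function name and return type are illustrative:

```rust
use rusqlite::Connection;

fn tool_part_rows(conn: &Connection) -> rusqlite::Result<Vec<(String, String)>> {
    // json_extract is whitespace-agnostic, unlike the LIKE patterns, so tool
    // parts are matched however the JSON blob happens to be serialized.
    let mut stmt = conn.prepare(
        "SELECT message_id, data FROM part WHERE json_extract(data, '$.type') = 'tool'",
    )?;
    let rows = stmt
        .query_map([], |row| Ok((row.get(0)?, row.get(1)?)))?
        .collect::<rusqlite::Result<Vec<_>>>()?;
    Ok(rows)
}
```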

789-805: parse_source reloads all projects & sessions for every individual JSON file.

When called in a loop (e.g., from a watcher processing one file at a time), load_projects and load_sessions are invoked per file. This is fine for one-off parses but is worth noting; the batch path (parse_sources_parallel_with_paths) correctly loads context once.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/analyzers/opencode.rs` around lines 789 - 805, parse_source currently
calls load_projects and load_sessions on every invocation (reloading context per
JSON file); change parse_source to avoid per-file reloads by accepting preloaded
context or reusing a cached value: update parse_source signature to take
projects and sessions (e.g., add parameters like projects: &ProjectsType,
sessions: &SessionsType or a single Context struct), remove the internal calls
to load_projects/load_sessions and use the supplied preloaded data, and update
call sites (including parse_sources_parallel_with_paths and the watcher loop) to
load projects/sessions once and pass them through; alternatively implement a
small memoized/cache lookup keyed by storage_root inside parse_source if
changing the signature is impractical.

853-858: Consider structured logging instead of eprintln!.

Using eprintln! for error reporting (also at line 921) mixes analyzer output with stderr. If the project uses a logging framework (e.g., tracing), switching to tracing::warn! would allow log-level filtering and structured metadata.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/analyzers/opencode.rs` around lines 853 - 858, Replace the eprintln!
calls in the Err(e) arms (e.g., the block that prints "Failed to parse OpenCode
SQLite DB {:?}: {}" and the similar call near line 921) with structured tracing
logs: import tracing and use tracing::warn! (or trace/debug/info as appropriate)
with named fields for the path and error (for example: tracing::warn!(path =
%source.path.display(), error = %e, "Failed to parse OpenCode SQLite DB");).
Ensure you remove the eprintln! usage, add the necessary use tracing::...
import, and format the message as structured metadata so the logs can be
filtered and queried.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/analyzers/opencode.rs`:
- Around line 1206-1208: Replace the two-step reassignment of a default Stats
with a single struct initialization: instead of creating tool_stats via
Stats::default() and then setting tool_calls and files_read, construct
tool_stats using the struct init pattern (base it on Default::default()) and set
tool_calls and files_read inline; target the variable tool_stats and the Stats
type, replacing the existing two assignments with the combined initialization.
- Around line 47-50: The doc for has_sqlite_db() claims it checks existence and
schema but the implementation only checks file existence; either update the
comment to state it only checks existence, or implement a lightweight schema
check: in has_sqlite_db() (or a helper called from it) open the SQLite at
Self::db_path(), run a simple query against sqlite_master to ensure the expected
table (e.g., "message") exists (for example: SELECT name FROM sqlite_master
WHERE type='table' AND name='message' LIMIT 1), and return true only if the file
exists, the DB opens, and the table is present; ensure errors opening/queries
are handled and result in false.
- Around line 705-707: The SQLite-path fallback uses session.project_id while
the JSON-path uses session.id, causing inconsistent project_hash; update the
SQLite-path fallback to use the session id instead. Locate the variables
session_title, worktree, fallback in opencode.rs and change the fallback
assignment from session.map(|s| s.project_id.as_str()) to use session.id (e.g.,
session.map(|s| s.id.as_str()) or session.map(|s| s.id.clone()) as appropriate)
so both JSON and SQLite paths use the same session_id fallback.

@coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/analyzers/opencode.rs (1)

878-946: 🛠️ Refactor suggestion | 🟠 Major

get_stats_with_sources duplicates parse_sources_parallel logic.

The JSON/SQLite partitioning, context loading, parallel parsing, and deduplication are duplicated here. Since parse_sources_parallel already handles all of this (including deduplication), this method could delegate to it:

♻️ Proposed refactor

```diff
 fn get_stats_with_sources(
     &self,
     sources: Vec<DataSource>,
 ) -> Result<crate::types::AgenticCodingToolStats> {
-    // Partition sources into JSON files and DB files.
-    let (db_sources, json_sources): (Vec<_>, Vec<_>) = sources
-        .iter()
-        .partition(|s| s.path.extension().is_some_and(|ext| ext == "db"));
-
-    let mut all_messages: Vec<ConversationMessage> = Vec::new();
-
-    // --- Parse JSON sources in parallel ---
-    if !json_sources.is_empty()
-        && let Some(storage_root) = Self::storage_root()
-    {
-        // ... ~30 lines of duplicated parsing ...
-    }
-
-    // --- Parse SQLite sources ---
-    for source in db_sources {
-        // ... duplicated DB parsing ...
-    }
-
-    // Deduplicate
-    let messages = crate::utils::deduplicate_by_global_hash(all_messages);
+    let messages = self.parse_sources_parallel(&sources);
     // Aggregate stats.
     let mut daily_stats = crate::utils::aggregate_by_date(&messages);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/analyzers/opencode.rs` around lines 878 - 946, get_stats_with_sources
duplicates the JSON/SQLite partitioning, context loading, parallel parsing, and
deduplication already implemented in parse_sources_parallel; replace the body of
get_stats_with_sources with a delegation to parse_sources_parallel and then
adapt its returned messages/stats into the AgenticCodingToolStats struct.
Specifically: call Self::parse_sources_parallel(sources) (ensuring
storage_root/context are handled there), use the returned deduplicated messages
to compute daily_stats and num_conversations (reuse
crate::utils::aggregate_by_date) and construct the AgenticCodingToolStats with
analyzer_name from self.display_name(); remove the duplicated json/db parsing
and parse_sqlite_messages usage from get_stats_with_sources. Ensure any helper
functions referenced (storage_root, parse_sources_parallel,
deduplicate_by_global_hash) are used rather than reimplemented.
🧹 Nitpick comments (1)
src/analyzers/opencode.rs (1)

853-858: Consider structured logging instead of eprintln!.

Using eprintln! for error reporting in a TUI application may interfere with the UI. If the project has a logging framework (e.g., tracing or log), prefer warn! or error! macros for better observability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/analyzers/opencode.rs` around lines 853 - 858, Replace the direct stderr
print in the Err arm that uses eprintln! with a structured logging macro (e.g.,
tracing::error! or log::error!) so the error doesn't disrupt the TUI; keep the
same context by logging the source.path and the error (e), and add the
appropriate use/import (tracing::error or log::error) at the top of the module;
target the Err(e) => block that references source.path and e and swap eprintln!
for the chosen logging macro with a clear message and structured fields if using
tracing (e.g., error!(path = %source.path, error = %e, "Failed to parse OpenCode
SQLite DB")).

Comment on lines +270 to +276

```rust
if let Some(tokens) = &msg.tokens {
    s.input_tokens = tokens.input;
    s.output_tokens = tokens.output;
    s.reasoning_tokens = tokens.reasoning;
    s.cache_creation_tokens = tokens.cache.write;
    s.cache_read_tokens = tokens.cache.read;
    s.cached_tokens = tokens.cache.read;
```
coderabbitai bot commented on Feb 17, 2026

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Search for Stats struct definition
ast-grep --pattern 'struct Stats {
 $$$
}'

Repository: Piebald-AI/splitrail

Length of output: 46


🏁 Script executed:

# Also search for the Stats struct in case it uses a different pattern or derives
rg -t rs 'struct Stats' -A 30

Repository: Piebald-AI/splitrail

Length of output: 90


🏁 Script executed:

# Check tokens structure to understand what fields are available
rg -t rs 'tokens.*cache' -B 2 -A 2

Repository: Piebald-AI/splitrail

Length of output: 90


🏁 Script executed:

# Search for Stats struct without type flag
rg 'struct Stats' -A 30

Repository: Piebald-AI/splitrail

Length of output: 11228


🏁 Script executed:

# Find the Stats struct definition more directly in opencode.rs
rg 'pub struct Stats|struct Stats' src/analyzers/ -A 25

Repository: Piebald-AI/splitrail

Length of output: 8510


🏁 Script executed:

# Check what fields are in Stats by searching for cache_read_tokens and cached_tokens
rg 'cache_read_tokens|cached_tokens' -B 2 -A 2

Repository: Piebald-AI/splitrail

Length of output: 50376


Line 276 incorrectly sets cached_tokens to only cache reads — should sum both write and read tokens

Currently only assigns tokens.cache.read. Based on all other analyzer implementations, cached_tokens should represent the total cached tokens: tokens.cache.write + tokens.cache.read.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/analyzers/opencode.rs` around lines 270 - 276, The field s.cached_tokens
is incorrectly set to only tokens.cache.read; update the assignment in the block
that checks msg.tokens so s.cached_tokens = tokens.cache.write +
tokens.cache.read (sum write and read) instead of using tokens.cache.read alone;
locate the code around the msg.tokens handling where s.input_tokens,
s.output_tokens, s.reasoning_tokens, s.cache_creation_tokens and
s.cache_read_tokens are set and change the s.cached_tokens assignment
accordingly.
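
A small self-contained illustration of the convention the comment describes; the struct and field names below are simplified stand-ins, not the project's actual Stats type:

```rust
#[derive(Debug, Default, PartialEq)]
struct CacheStats {
    cache_creation_tokens: u64,
    cache_read_tokens: u64,
    cached_tokens: u64,
}

fn cache_stats(write: u64, read: u64) -> CacheStats {
    CacheStats {
        cache_creation_tokens: write,
        cache_read_tokens: read,
        // The suggested fix: cached_tokens is the total, not just reads.
        cached_tokens: write + read,
    }
}

fn main() {
    assert_eq!(cache_stats(10, 5).cached_tokens, 15);
}
```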

Comment on lines +527 to +530

```rust
struct DbSession {
    project_id: String,
    title: String,
}
```
coderabbitai bot commented on Feb 17, 2026

⚠️ Potential issue | 🟡 Minor

DbSession.title is non-optional but DB column might be NULL.

DbSession.title is String, and row.get(2)? will fail if the DB title column is NULL. Failed rows are silently dropped via .flatten() (line 581), which means sessions with NULL titles would be lost — and all their messages would lose session metadata.

The legacy JSON model already uses Option<String> for title (line 116). Consider making DbSession.title an Option<String> for resilience:

🛡️ Proposed fix

```diff
 struct DbSession {
     project_id: String,
-    title: String,
+    title: Option<String>,
 }
```

And update row.get(2)? to row.get::<_, Option<String>>(2)? with a corresponding adjustment at line 705.

Also applies to: 568-578

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/analyzers/opencode.rs` around lines 527 - 530, DbSession.title is
currently String but the DB title column can be NULL, causing row.get(2)? to
fail and rows to be lost; change the DbSession struct's title field to
Option<String> and update the query extraction to use row.get::<_,
Option<String>>(2)? (replace any plain row.get(2)? calls), then adjust
downstream code that expects a String (e.g., the mapping/usage around the former
line 705) to handle Option<String> safely (provide fallback or propagate None)
so sessions with NULL titles are preserved instead of being dropped.

Comment on lines +596 to +603

```rust
// Use LIKE to pre-filter for tool-type parts — avoids deserializing text,
// reasoning, step-start, etc. parts which are typically much larger.
let Ok(mut stmt) = conn.prepare(
    "SELECT message_id, data FROM part WHERE data LIKE '%\"type\":\"tool\"%' OR data LIKE '%\"type\": \"tool\"%'",
) else {
    return map;
};
```
coderabbitai bot commented on Feb 17, 2026

⚠️ Potential issue | 🟡 Minor

LIKE pre-filter may miss tool parts with non-standard whitespace.

The SQL LIKE '%"type":"tool"%' pattern handles zero and one space, but won't match if the JSON was pretty-printed with newlines or tabs between "type": and "tool". Since this is a pre-filter and the actual type check happens post-parse (line 625), a miss here silently drops tool stats.

If OpenCode ever pretty-prints part data, consider a broader pattern or removing the LIKE filter (relying solely on post-parse filtering). For now this is low-risk since DB blobs are typically compact JSON.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/analyzers/opencode.rs` around lines 596 - 603, The LIKE pre-filter in the
conn.prepare call (query string selecting from part WHERE data LIKE
'%"type":"tool"%' OR data LIKE '%"type": "tool"%') can miss tool parts with
arbitrary whitespace/newlines; update the query to either remove the LIKE filter
entirely and rely on the post-parse type check (the code path that inspects
parsed JSON around line 625) or replace the pattern with a broader match (e.g.,
use REGEXP/JSON functions if supported) so that all candidate rows are returned;
modify the SQL in the conn.prepare invocation accordingly and keep the existing
post-parse filtering logic intact.

- Fix doc comment on has_sqlite_db() to match implementation (only
 checks file existence, not schema) [comment 1]
- Fix inconsistent fallback_project_hash between JSON and SQLite paths:
 JSON used session.id but SQLite used session.project_id, causing
 different project_hash values for the same message depending on which
 source won deduplication. Both now use session_id. [comment 2]
- Extract shared accumulate_tool_stat() helper to deduplicate the
 tool-name matching logic between extract_tool_stats_from_parts (JSON
 filesystem) and batch_load_tool_stats_from_db (SQLite). [nitpick 1]
- Collapse get_stats_with_sources() to reuse parse_sources_parallel()
 instead of reimplementing the partition/parse/dedup pipeline, removing
 ~40 lines of duplicated logic. [nitpick 2]
- Document the LIKE pre-filter assumption: OpenCode uses JSON.stringify
 without pretty-printing, so the two patterns cover all expected
 formatting. False positives are harmless (filtered in Rust). [nitpick 3]
Skipped two nitpicks that don't apply:
- parse_source per-file reload: trait method signature is fixed, batch
 path already handles it, and this matches the pre-existing pattern.
- eprintln! vs tracing: the entire codebase uses eprintln!, not tracing.

Reviewers

coderabbitai[bot] requested changes

Requested changes must be addressed to merge this pull request.

