Paired-data pipeline: 4 bugs in CSI recorder + ground-truth aligner corrupt or block camera-supervised training data #1007

Open
Labels: bug (Something isn't working)

Description

Context

Found during ADR-152 §2.2 measurement (b) (2026-06-10/11), when a fresh 40-minute paired collection initially aligned to zero windows and the trained-model forensics exposed silent data corruption. These bugs also retroactively explain pathologies in earlier sessions (#645, #509). Full forensic record: benchmarks/wiflow-std/RESULTS.md on branch feat/adr-152-wiflow-std-benchmark.

Bug 1 — scripts/record-csi-udp.py stamps local time with a Z (UTC) suffix

parse_csi_packet() builds timestamp via time.strftime('%Y-%m-%dT%H:%M:%S.') + ... + 'Z', which is local wall time labeled as UTC. The camera collector writes true-epoch ts_ns. The aligner parses the CSI ISO string as UTC, so camera and CSI timestamps disagree by the local UTC offset (−4 h under EDT) and alignment produces 0 pairs. Workaround used: --clock-offset-ms=-14400000. Fix: write datetime.now(timezone.utc).isoformat(), or just use the already-present ts_ns in the aligner (preferred; see Bug 4 note).
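A minimal sketch of the failure mode, assuming the elided part of the expression is a millisecond field. The helper names buggy_timestamp, utc_timestamp, and parse_as_utc are hypothetical and are not taken from scripts/record-csi-udp.py; the aligner is imitated here by a plain UTC parse.

```python
import time
from datetime import datetime, timezone


def buggy_timestamp(epoch_s: float) -> str:
    """Imitates the bug pattern: local wall time formatted, then suffixed with 'Z'.

    The millisecond field is an assumption; the issue elides that part.
    """
    millis = int((epoch_s % 1) * 1000)
    return time.strftime('%Y-%m-%dT%H:%M:%S.', time.localtime(epoch_s)) + f'{millis:03d}Z'


def utc_timestamp(epoch_s: float) -> str:
    """One possible fix: a timezone-aware UTC datetime rendered as ISO 8601."""
    return datetime.fromtimestamp(epoch_s, tz=timezone.utc).isoformat()


def parse_as_utc(iso: str) -> float:
    """Stand-in for the aligner: trusts the string's UTC designator."""
    return datetime.fromisoformat(iso.replace('Z', '+00:00')).timestamp()


if __name__ == '__main__':
    t = 1781143789.25  # epoch taken from the recording filename, plus 250 ms
    skew_s = parse_as_utc(buggy_timestamp(t)) - t
    print('buggy skew (s):', round(skew_s))  # equals the local UTC offset
    print('fixed skew (s):', round(parse_as_utc(utc_timestamp(t)) - t))
```

On a machine whose local zone is UTC the buggy skew is zero, which may be why the problem can go unnoticed on servers and only shows up on a workstation in a zone such as EDT.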

Bug 2 — scripts/align-ground-truth.js dilutes window confidence with non-detection frames

loadGroundTruth() keeps records with keypoints: [] (an empty array is truthy in JavaScript) at confidence 0; window avgConf then averages detections and empties together. At a normal ~27% MediaPipe detection rate, every window's avgConf lands at ~0.22 (0.27 × 0.80), below the 0.5 threshold, so all windows are rejected even when the detections themselves average 0.80 confidence. Fix: skip empty-keypoint records at load (treat them as gaps); confidence statistics should be computed over detections only. --min-camera-frames still guards sparse windows.
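The dilution arithmetic can be reproduced with a small sketch. It is written in Python for illustration only (the aligner itself is JavaScript), and the record layout with keypoints and confidence fields, as well as the function names, are assumptions rather than the script's actual schema.

```python
from statistics import mean


def load_ground_truth(records):
    """Keep only frames that contain at least one keypoint; empties become gaps.

    Note: an empty list is falsy in Python, unlike an empty array in JavaScript,
    so the truthiness check that fails in the aligner happens to work here.
    """
    return [r for r in records if r.get('keypoints')]


def window_avg_conf(frames):
    """Mean confidence over the frames given; 0.0 for an empty window."""
    return mean(f['confidence'] for f in frames) if frames else 0.0


# Synthetic window: 27 detections at 0.80 confidence and 73 empty frames at 0.
records = (
    [{'keypoints': [[0.5, 0.5]], 'confidence': 0.80}] * 27
    + [{'keypoints': [], 'confidence': 0.0}] * 73
)

diluted = window_avg_conf(records)                           # empties included
detections_only = window_avg_conf(load_ground_truth(records))

if __name__ == '__main__':
    print(f'diluted avgConf:         {diluted:.3f}')          # below 0.5
    print(f'detections-only avgConf: {detections_only:.3f}')  # above 0.5
```

With the empties filtered at load time, the 0.5 threshold tests detection quality, while window density is left to --min-camera-frames.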

Bug 3 — heterogeneous csi_shape with silent zero-padding

extractCsiMatrix() stamps the window's subcarrier count from window[0].subcarriers and zero-pads/truncates the other 19 frames to match. Tonight's session contained windows of shape [70,20], [134,20], [26,20], [12,20], and [20,20], and ~20% of frames inside even native-70 windows were silently zero-padded. Mixed-subcarrier frames come from the ESP32 emitting different packet formats (HT20/HT40/fragments). Fix: either filter frames to the session's modal subcarrier count before windowing, or record the per-frame subcarrier count and reject mixed windows; never silently pad.
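Both proposed fixes can be sketched briefly. This is Python for illustration, not the aligner's JavaScript, and the per-frame amplitudes field and both function names are hypothetical.

```python
from collections import Counter


def filter_to_modal_subcarriers(frames):
    """Option 1: keep only frames whose subcarrier count is the session's mode.

    Returns (kept_frames, modal_count); modal_count is None for an empty session.
    """
    counts = Counter(len(f['amplitudes']) for f in frames)
    if not counts:
        return [], None
    modal, _ = counts.most_common(1)[0]
    return [f for f in frames if len(f['amplitudes']) == modal], modal


def window_is_homogeneous(window):
    """Option 2: a window is acceptable only if every frame has the same width."""
    return len({len(f['amplitudes']) for f in window}) <= 1


if __name__ == '__main__':
    # Synthetic session: mostly 70-subcarrier frames with a few other formats.
    session = (
        [{'amplitudes': [0.0] * 70}] * 8
        + [{'amplitudes': [0.0] * 134}]
        + [{'amplitudes': [0.0] * 26}]
    )
    kept, modal = filter_to_modal_subcarriers(session)
    print(modal, len(kept), window_is_homogeneous(session), window_is_homogeneous(kept))
```

Filtering before windowing keeps more data than rejecting mixed windows, at the cost of small time gaps inside a window; which trade-off suits the trainer is a decision for the maintainers.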

Bug 4 — transposed shape label in extractCsiMatrix

The matrix is filled frame-major (matrix[f * nSc + s]) but declared shape: [nSc, nFrames] (~line 351). Consumers that trust the label transpose the data. Found because the measurement-(b) trainer had to correct it on load. Fix the label or the fill order, and add a round-trip test.
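A round-trip test along the lines requested could look like the sketch below. It is Python for illustration (the real function is JavaScript), it takes the "fix the label" option, and the function names and the dict layout with data and shape keys are assumptions.

```python
def extract_csi_matrix(window):
    """Flatten frames frame-major and label the shape to match that layout."""
    n_frames, n_sc = len(window), len(window[0])
    if any(len(frame) != n_sc for frame in window):
        raise ValueError('mixed subcarrier counts; refusing to pad')
    flat = [0.0] * (n_frames * n_sc)
    for f, frame in enumerate(window):
        for s, value in enumerate(frame):
            flat[f * n_sc + s] = value          # frame-major fill, as in the issue
    return {'data': flat, 'shape': [n_frames, n_sc]}  # label agrees with the fill


def unflatten(matrix):
    """What a consumer that trusts the label does: row-major reshape by shape."""
    rows, cols = matrix['shape']
    data = matrix['data']
    return [data[r * cols:(r + 1) * cols] for r in range(rows)]


if __name__ == '__main__':
    # Non-square on purpose: a square window would hide a transposed label.
    window = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 frames x 2 subcarriers
    good = extract_csi_matrix(window)
    assert unflatten(good) == window               # round trip succeeds

    mislabeled = {'data': good['data'], 'shape': [2, 3]}  # the [nSc, nFrames] label
    assert unflatten(mislabeled) != window         # consumer sees scrambled rows
    print('round trip ok')
```

The test window should use distinct values and unequal dimensions, since an all-zero or square fixture passes even with the wrong label.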

Acceptance

  • A fresh paired session aligns with zero clock-offset flags needed
  • Window kept-rate ≈ csi_frames/20 × detection_coverage (no silent confidence collapse)
  • No zero-padded frames in output windows; csi_shape homogeneous per file
  • Shape label matches memory layout (tested)
  • Re-run alignment on tonight's raw files (data/recordings/csi-1781143789.csi.jsonl + data/ground-truth/keypoints_20260610_221000.jsonl) reproduces ≥2,046 pairs without workarounds

Related

#645 (paired-data quantity/quality tracking), #509 (external reproducibility), ADR-152 §2.2, the 92.9% retraction (CHANGELOG + PR #535).

🤖 Generated with claude-flow
