Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Add ForeachMVCCSpecificNodeInRange #22137

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
LeftHandCold wants to merge 10 commits into matrixorigin:main
base: main
Choose a base branch
Loading
from LeftHandCold:debug_ForeachMVCCNodeInRange2

Conversation

Copy link
Contributor

@LeftHandCold LeftHandCold commented Jul 9, 2025
edited by qodo-merge-pro bot
Loading

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #22129

What this PR does / why we need it:

ForeachMVCCSpecificNodeInRange is used by "do checkpoint".
The purpose is to ensure that the checkpoint contains all data before the end timestamp.


PR Type

Bug fix


Description

  • Fix checkpoint timeout by adjusting MVCC node traversal logic

  • Modify checkpoint entry validation to require only 1 entry instead of 2

  • Add new ForeachMVCCSpecificNodeInRange method for checkpoint operations

  • Ensure checkpoint contains all data before end timestamp


Changes diagram

flowchart LR
 A["Checkpoint Entry Validation"] --> B["MVCC Node Traversal"]
 B --> C["ForeachMVCCSpecificNodeInRange"]
 C --> D["Checkpoint Data Collection"]
 D --> E["Snapshot Restore Fix"]
Loading

Changes walkthrough 📝

Relevant files
Bug fix
db.go
Relax checkpoint entry validation requirements

pkg/vm/engine/disttae/db.go

  • Change checkpoint entry validation from requiring 2 entries to 1
  • Modify end timestamp calculation to use last entry instead of
    penultimate
  • +2/-2
    ckp_writer.go
    Update checkpoint collector traversal method

    pkg/vm/engine/tae/logtail/ckp_writer.go

  • Replace ForeachMVCCNodeInRange with ForeachMVCCSpecificNodeInRange
  • Update checkpoint collector to use new traversal method
  • +1/-1
    Enhancement
    object.go
    Add specialized MVCC traversal for checkpoints

    pkg/vm/engine/tae/catalog/object.go

  • Add new ForeachMVCCSpecificNodeInRange method for checkpoint
    operations
  • Implement transaction waiting logic for uncommitted nodes
  • Add specific logic for appendable objects and delete node handling
  • +33/-0

    Need help?
  • Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
  • Check out the documentation for more information.
  • @matrix-meow matrix-meow added the size/S Denotes a PR that changes [10,99] lines label Jul 9, 2025
    Copy link

    qodo-merge-pro bot commented Jul 9, 2025

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    🎫 Ticket compliance analysis ✅

    22129 - PR Code Verified

    Compliant requirements:

    • Fix restore from snapshot timeout issue
    • Address checkpoint-related timeout problems during snapshot operations

    Requires further human verification:

    • Ensure snapshot workflow completes successfully without timing out

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    Recommended focus areas for review

    Logic Change

    The checkpoint validation logic changed from requiring 2 entries to 1 entry, and end timestamp calculation now uses the last entry instead of penultimate. This fundamental change in checkpoint validation needs careful verification to ensure it doesn't break existing functionality.

    if len(checkpointEntries) < 1 {
    	return false
    }
    // The end time of the penultimate checkpoint must not be less than the ts of the snapshot,
    // because the data of the snapshot may exist in the wal and be collected to the next checkpoint
    end := checkpointEntries[len(checkpointEntries)-1].GetEnd()
    Transaction Safety

    The new ForeachMVCCSpecificNodeInRange method includes transaction waiting logic and complex conditional checks for appendable objects and delete nodes. The transaction state handling and the various conditional branches need thorough validation to ensure correctness.

    func (entry *ObjectEntry) ForeachMVCCSpecificNodeInRange(start, end types.TS, f func(*txnbase.TxnMVCCNode) error) error {
    	needWait, txn := entry.GetLastMVCCNode().NeedWaitCommitting(end.Next())
    	if needWait {
    		txn.GetTxnState(true)
    	}
    	var createIn, deleteIn bool
    	createIn, _ = entry.CreateNode.PreparedIn(start, end)
    	if createIn {
    		if err := f(&entry.CreateNode); err != nil {
    			return err
    		}
    	}
    	deleteIn, _ = entry.DeleteNode.PreparedIn(start, end)
    	if !deleteIn {
    		if !entry.IsAppendable() {
    			return nil
    		}
    		if !createIn {
    			return nil
    		}
    		if !entry.DeleteNode.IsCommitted() {
    			return nil
    		}
    	}
    	if err := f(&entry.DeleteNode); err != nil {
    		return err
    	}
    	return nil

    Copy link

    qodo-merge-pro bot commented Jul 9, 2025
    edited
    Loading

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion Impact
    Possible issue
    Add nil check for transaction

    The code doesn't handle the case where txn might be nil when needWait is true.
    This could lead to a panic when calling txn.GetTxnState(true).

    pkg/vm/engine/tae/catalog/object.go [735-738]

     needWait, txn := entry.GetLastMVCCNode().NeedWaitCommitting(end.Next())
    -if needWait {
    +if needWait && txn != nil {
     txn.GetTxnState(true)
     }
    • Apply / Chat
    Suggestion importance[1-10]: 8

    __

    Why: The suggestion correctly identifies a potential nil pointer dereference that could cause a panic and provides a proper fix, improving the code's robustness.

    Medium
    • Update

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Reviewers

    @gouhongshen gouhongshen gouhongshen approved these changes

    @XuPeng-SH XuPeng-SH Awaiting requested review from XuPeng-SH XuPeng-SH is a code owner

    @volgariver6 volgariver6 Awaiting requested review from volgariver6 volgariver6 is a code owner

    At least 1 approving review is required to merge this pull request.

    Assignees

    No one assigned

    Labels

    kind/bug Something isn't working kind/enhancement Review effort 4/5 size/S Denotes a PR that changes [10,99] lines

    Projects

    None yet

    Milestone

    No milestone

    Development

    Successfully merging this pull request may close these issues.

    AltStyle によって変換されたページ (->オリジナル) /