Add --transaction-isolation flag #1441

Open
timvaillancourt wants to merge 5 commits into github:master
from timvaillancourt:isolation_level_flag

Conversation

Collaborator

timvaillancourt commented Aug 14, 2024

A Pull Request should be associated with an Issue.

Related issue: #1262

Further notes in https://github.com/github/gh-ost/blob/master/.github/CONTRIBUTING.md
Thank you! We are open to PRs, but please understand if for technical reasons we are unable to accept each and any PR

Description

This PR resolves #1262 by adding a --transaction-isolation flag that supports both REPEATABLE-READ (the default, and what GitHub tests) and READ-COMMITTED.
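As a rough illustration of what such a flag involves, here is a minimal Go sketch that parses the flag, rejects anything other than the two levels named above, and builds the matching MySQL statement. The names `parseIsolation` and `isolationStatement` are hypothetical and not taken from the gh-ost source, and how gh-ost actually applies the level to its connections is not shown in this thread. Only the `SET SESSION TRANSACTION ISOLATION LEVEL ...` syntax is standard MySQL.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// defaultIsolation mirrors the default named in the PR description.
const defaultIsolation = "REPEATABLE-READ"

// isolationStatement is a hypothetical helper (not from the gh-ost source).
// It accepts the two values named in the PR, case-insensitively, and returns
// the session-level SQL statement for that isolation level.
func isolationStatement(value string) (string, error) {
	level := strings.ToUpper(strings.TrimSpace(value))
	switch level {
	case "REPEATABLE-READ", "READ-COMMITTED":
		// The SQL statement spells the level with spaces, not hyphens.
		return "SET SESSION TRANSACTION ISOLATION LEVEL " +
			strings.ReplaceAll(level, "-", " "), nil
	default:
		return "", fmt.Errorf("unsupported --transaction-isolation value %q", value)
	}
}

// parseIsolation parses command-line arguments with a private FlagSet so the
// sketch can be exercised without touching the global flag state.
func parseIsolation(args []string) (string, error) {
	fs := flag.NewFlagSet("isolation-sketch", flag.ContinueOnError)
	isolation := fs.String("transaction-isolation", defaultIsolation,
		"REPEATABLE-READ or READ-COMMITTED")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return isolationStatement(*isolation)
}

func main() {
	stmt, err := parseIsolation(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(stmt)
}
```

Run with no arguments, the sketch prints the REPEATABLE READ statement. With `--transaction-isolation=READ-COMMITTED` it prints the READ COMMITTED one. Any other value is rejected with an error.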

In case this PR introduced Go code changes:

  • contributed code is using same conventions as original code
  • script/cibuild returns with no formatting errors, build errors or unit test errors.

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Contributor

arthurschreiber commented Oct 19, 2024

Do we really need this flag? I'm absolutely sure that MySQL replication uses READ_COMMITTED when applying RBR (row-based replication) changes from the binlog. Since gh-ost does not support statement-based replication, there's no point in using REPEATABLE_READ for the changelog applier, and we should be able to use READ_COMMITTED always.

For the table copy part, I don't see how READ_COMMITTED would have any negative side effects either. Right now, REPEATABLE_READ might copy an "old" version of the row data, but the changelog applier will fix that up afterwards.

With READ_COMMITTED, we'll always read the "latest" version of the data, so in theory there could be fewer changes for the changelog applier to apply, but I don't see any downsides to this. 🤔
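The argument above can be sketched with a toy in-memory model. This is not InnoDB's MVCC and not gh-ost code: `change`, `copyRow`, `applyChanges`, and `simulate` are all hypothetical names, and the model assumes a single row that is updated once while the copy is running.

```go
package main

import "fmt"

// change is one row-based event: set row id to value.
type change struct {
	id    int
	value string
}

// copyRow simulates the table-copy read of a single row. A snapshot
// (REPEATABLE READ style) read sees the value as of transaction start;
// a READ COMMITTED style read sees the latest committed value.
func copyRow(snapshotValue, latestValue string, snapshotRead bool) string {
	if snapshotRead {
		return snapshotValue
	}
	return latestValue
}

// applyChanges replays the recorded events onto the ghost table in order.
func applyChanges(ghost map[int]string, events []change) {
	for _, e := range events {
		ghost[e.id] = e.value
	}
}

// simulate copies one row, then replays the change that was committed while
// the copy was running. It returns the value right after the copy and the
// value after the replay.
func simulate(snapshotRead bool) (copied, final string) {
	atSnapshot := "v1"                       // value when the copy transaction began
	events := []change{{id: 1, value: "v2"}} // committed during the copy
	latest := "v2"                           // latest committed value at read time

	ghost := map[int]string{}
	ghost[1] = copyRow(atSnapshot, latest, snapshotRead)
	copied = ghost[1]

	applyChanges(ghost, events)
	return copied, ghost[1]
}

func main() {
	for _, snapshotRead := range []bool{true, false} {
		copied, final := simulate(snapshotRead)
		fmt.Printf("snapshotRead=%v copied=%s final=%s\n", snapshotRead, copied, final)
	}
}
```

In this model the snapshot read copies the stale `v1` while the other read copies `v2`, and both end at `v2` once the event is replayed. That matches the claim that the applier corrects a stale copy, under the assumptions stated above.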

Collaborator Author

timvaillancourt commented Oct 19, 2024

@arthurschreiber I pondered using READ_COMMITTED 100% of the time in an earlier thread, when setting the transaction isolation was introduced, but there were some concerns about using a new isolation level. Letting users who were running READ_COMMITTED before the enforcement was introduced keep doing so seemed like the easiest way forward.

But for the most part I agree: REPEATABLE_READ isolation is usually not required and can actually introduce stale results, as you mention. There is one spot where a snapshot read might have a benefit, however: the calculation of the min/max chunk ranges. Snapshot isolation guarantees that the min and max queries both operate on the same data, which sounds like a good thing, but I'm not 100% sure.
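To make the min/max point concrete, here is a toy Go sketch, assuming both range queries run inside one transaction and another session commits an insert between them. It models the table as a slice of integer keys; `chunkRange`, `minKey`, and `maxKey` are hypothetical names, and nothing here reflects how gh-ost actually issues these queries.

```go
package main

import "fmt"

// minKey returns the smallest key; keys must be non-empty.
func minKey(keys []int) int {
	m := keys[0]
	for _, k := range keys[1:] {
		if k < m {
			m = k
		}
	}
	return m
}

// maxKey returns the largest key; keys must be non-empty.
func maxKey(keys []int) int {
	m := keys[0]
	for _, k := range keys[1:] {
		if k > m {
			m = k
		}
	}
	return m
}

// chunkRange reads the minimum key, lets another session commit an insert,
// then reads the maximum key. With snapshot=true both reads use the state
// from transaction start; otherwise the second read sees the new row.
func chunkRange(table []int, concurrentInsert int, snapshot bool) (lo, hi int) {
	frozen := append([]int(nil), table...) // state when the transaction began
	live := append([]int(nil), table...)   // state other sessions keep changing

	lo = minKey(live)                     // frozen and live are still identical here
	live = append(live, concurrentInsert) // another session commits an insert

	if snapshot {
		return lo, maxKey(frozen)
	}
	return lo, maxKey(live)
}

func main() {
	table := []int{10, 20, 30}

	lo, hi := chunkRange(table, 40, true)
	fmt.Printf("snapshot:       min=%d max=%d\n", lo, hi)

	lo, hi = chunkRange(table, 40, false)
	fmt.Printf("read committed: min=%d max=%d\n", lo, hi)
}
```

The snapshot variant reports a range of 10 to 30, while the other reports 10 to 40 because its second query sees the newly committed row. Whether that difference matters for the migration is exactly the open question in the comment above.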

Reviewers

rashiq: awaiting requested review (code owner)

meiji163: awaiting requested review (code owner)

At least 1 approving review is required to merge this pull request.

Development

Successfully merging this pull request may close these issues.

Regression in master: SET transaction_isolation = 'repeatable_read';
