opt(stream): add option to directly copy over tables from lower levels (#1700) #1872
mangalaman93 requested review from akon-dey, billprovince, joshua-goldstein and skrdgraph as code owners on February 14, 2023 10:51
CLA assistant check: All committers have signed the CLA.
mangalaman93 force-pushed the aman/block-length branch 2 times, most recently from e8583aa to a3c70d1, on February 14, 2023 14:39
mangalaman93 force-pushed the aman/copy-levels branch from 16131ef to 0c4052d on February 14, 2023 15:36
mangalaman93 force-pushed the aman/block-length branch from a3c70d1 to 5c374e2 on February 14, 2023 15:46
mangalaman93 force-pushed the aman/copy-levels branch 2 times, most recently from 5e7a48d to 1f99a5c, on February 14, 2023 15:52
mangalaman93 force-pushed the aman/block-length branch from 5c374e2 to d1fe28b on February 15, 2023 19:36
mangalaman93 force-pushed the aman/copy-levels branch from 1f99a5c to 0e2acc6 on February 15, 2023 19:37
mangalaman93 force-pushed the aman/block-length branch 2 times, most recently from 1375846 to 12a0e42, on February 18, 2023 11:40
mangalaman93 force-pushed the aman/copy-levels branch 3 times, most recently from f1a05da to 63a4948, on February 20, 2023 05:37
mangalaman93 force-pushed the aman/block-length branch from 12a0e42 to 72a4c72 on February 21, 2023 03:43
mangalaman93 force-pushed the aman/copy-levels branch from 63a4948 to 1ab2de4 on February 22, 2023 09:11
mangalaman93 force-pushed the aman/copy-levels branch 2 times, most recently from f7c65e6 to ed0f577, on February 24, 2023 13:10
This code has a data race; I am looking into it:
==================
WARNING: DATA RACE
Write at 0x00c000b100e8 by goroutine 2957:
github.com/dgraph-io/badger/v3.(*StreamWriter).Write.func1()
/home/aman/gocode/pkg/mod/github.com/dgraph-io/badger/v3@v3.2103.6-0.20230214155941-0e7c6a7a614a/stream_writer.go:212 +0x167
github.com/dgraph-io/ristretto/z.(*Buffer).SliceIterate()
/home/aman/gocode/pkg/mod/github.com/dgraph-io/ristretto@v0.1.1/z/buffer.go:290 +0x30a
github.com/dgraph-io/badger/v3.(*StreamWriter).Write()
/home/aman/gocode/pkg/mod/github.com/dgraph-io/badger/v3@v3.2103.6-0.20230214155941-0e7c6a7a614a/stream_writer.go:146 +0x194
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*countIndexer).writeIndex()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/count_index.go:188 +0x9c7
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*countIndexer).addCountEntry.func1()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/count_index.go:100 +0x47
Previous write at 0x00c000b100e8 by goroutine 228:
github.com/dgraph-io/badger/v3.(*StreamWriter).Write.func1()
/home/aman/gocode/pkg/mod/github.com/dgraph-io/badger/v3@v3.2103.6-0.20230214155941-0e7c6a7a614a/stream_writer.go:212 +0x167
github.com/dgraph-io/ristretto/z.(*Buffer).SliceIterate()
/home/aman/gocode/pkg/mod/github.com/dgraph-io/ristretto@v0.1.1/z/buffer.go:290 +0x30a
github.com/dgraph-io/badger/v3.(*StreamWriter).Write()
/home/aman/gocode/pkg/mod/github.com/dgraph-io/badger/v3@v3.2103.6-0.20230214155941-0e7c6a7a614a/stream_writer.go:146 +0x194
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*reducer).startWriting.func2()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/reduce.go:349 +0xd1
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*reducer).startWriting()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/reduce.go:389 +0x284
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*reducer).reduce.func4()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/reduce.go:475 +0x64
Goroutine 2957 (running) created at:
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*countIndexer).addCountEntry()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/count_index.go:100 +0x4c9
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*reducer).startWriting.func1.2()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/reduce.go:338 +0x4e
github.com/dgraph-io/ristretto/z.(*Buffer).SliceIterate()
/home/aman/gocode/pkg/mod/github.com/dgraph-io/ristretto@v0.1.1/z/buffer.go:290 +0x30a
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*reducer).startWriting.func1()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/reduce.go:336 +0x1cb
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*reducer).startWriting()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/reduce.go:392 +0x2bc
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*reducer).reduce.func4()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/reduce.go:475 +0x64
Goroutine 228 (running) created at:
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*reducer).reduce()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/reduce.go:475 +0x4c5
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*reducer).run.func1()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/reduce.go:109 +0xbe4
github.com/dgraph-io/dgraph/dgraph/cmd/bulk.(*reducer).run.func2()
/home/aman/gocode/src/github.com/dgraph-io/dgraph/dgraph/cmd/bulk/reduce.go:123 +0x66
==================
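In the trace above, goroutine 2957 (via countIndexer.writeIndex) and goroutine 228 (via reducer.startWriting) both write to the same address inside StreamWriter.Write without synchronization. The following is a minimal sketch of one common way to remove this kind of race, by serializing the shared update with a mutex. It is not necessarily the fix that applies here: sharedWriter, Write, Written and writeConcurrently are hypothetical names for illustration and are not Badger's StreamWriter API.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedWriter is a hypothetical stand-in for a writer whose Write method
// updates internal state. It is NOT Badger's StreamWriter.
type sharedWriter struct {
	mu      sync.Mutex
	written int // guarded by mu
}

// Write records n units. The mutex serializes the update so that
// concurrent callers do not race on the written field.
func (w *sharedWriter) Write(n int) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.written += n
}

// Written returns the running total, read under the same lock.
func (w *sharedWriter) Written() int {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.written
}

// writeConcurrently calls Write from several goroutines that share one
// writer and returns the final total.
func writeConcurrently(goroutines, perGoroutine int) int {
	w := &sharedWriter{}
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				w.Write(1)
			}
		}()
	}
	wg.Wait()
	return w.Written()
}

func main() {
	fmt.Println(writeConcurrently(8, 1000)) // prints 8000
}
```

Running a program like this with go run -race should report no race. Removing the Lock/Unlock calls from Write should make the detector produce a report with the same shape as the one above.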
mangalaman93 force-pushed the aman/copy-levels branch from ed0f577 to 741ab83 on March 1, 2023 07:18
mangalaman93 changed the base branch from main-deprecated-v4 to main on March 1, 2023 07:18
mangalaman93 force-pushed the aman/copy-levels branch from 081b41b to edb2318 on March 2, 2023 07:28
mangalaman93 force-pushed the aman/copy-levels branch from 8873caa to 4615899 on March 15, 2023 19:44
mangalaman93 force-pushed the aman/sw branch from edcc231 to 663ea22 on March 15, 2023 19:45
ghost reviewed on Mar 28, 2023 and left a comment:
Can you write a benchmark for this?
mangalaman93 force-pushed the aman/sw branch from 663ea22 to 1bd711c on May 17, 2023 04:43
mangalaman93 force-pushed the aman/copy-levels branch from 4615899 to e39ebb5 on May 17, 2023 04:43
mangalaman93 force-pushed the aman/copy-levels branch from e39ebb5 to 32230b1 on June 7, 2023 14:37
mangalaman93 force-pushed the aman/sw branch 2 times, most recently from 77973c6 to 908259a, on June 12, 2023 04:07
mangalaman93 force-pushed the aman/copy-levels branch from 32230b1 to 08a3e2d on June 12, 2023 05:03
mangalaman93 force-pushed the aman/copy-levels branch from 08a3e2d to 1218725 on July 19, 2023 04:41
This PR has been stale for 60 days and will be closed automatically in 7 days. Comment to keep it open.
mangalaman93 force-pushed the aman/copy-levels branch from 1218725 to f31ff23 on January 27, 2025 10:40
✅ Deploy Preview for badger-docs canceled.
mangalaman93 force-pushed the aman/copy-levels branch 2 times, most recently from c024d71 to ffd74f3, on January 27, 2025 11:29
closing PR for now, keeping branch active
mangalaman93 force-pushed the aman/copy-levels branch from ffd74f3 to 88c37d9 on June 27, 2025 13:31
mangalaman93 force-pushed the aman/copy-levels branch from 88c37d9 to 48955c8 on June 27, 2025 13:34
This PR also includes a bug fix from PR #1712, commit 58d0674.
This PR adds a FullCopy option to Stream, which allows sending tables in their entirety to the writer. When this option is set to true, the tables from the last 2 levels are copied over directly. This increases the stream speed while also lowering memory consumption on the DB that is streaming the KVs.
For a 71GB compressed and encrypted DB, we observed a 3x improvement in speed. The DB contained ~65GB in the last 2 levels, with the remainder in the levels above.
To use this option, the following must be set:
stream.KeyToList = nil
stream.ChooseKey = nil
stream.SinceTs = 0
db.managedTxns = true
If a stream writer is used to receive the KVs, the encryption mode must be the same on the sender and the receiver. This restricts db.StreamDB() to using the same encryption mode in both the input and output DB. A TODO has been added for allowing different encryption modes.
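The preconditions listed above can be read as a single validation step that runs before a full copy is attempted. The sketch below is only an illustration under stated assumptions and is not the code in this PR: fullCopyConfig and checkFullCopy are hypothetical names, the struct merely summarizes the settings named in the description instead of using Badger's real types, and the encryption mode is simplified to an encrypted/unencrypted flag.

```go
package main

import (
	"errors"
	"fmt"
)

// fullCopyConfig is a hypothetical summary of the settings named in the PR
// description. It is NOT a Badger type.
type fullCopyConfig struct {
	HasKeyToList      bool   // true if stream.KeyToList is non-nil
	HasChooseKey      bool   // true if stream.ChooseKey is non-nil
	SinceTs           uint64 // value of stream.SinceTs
	ManagedTxns       bool   // value of db.managedTxns
	SenderEncrypted   bool   // simplified encryption mode of the sender
	ReceiverEncrypted bool   // simplified encryption mode of the receiver
}

// checkFullCopy returns the first violated precondition, or nil if the
// configuration satisfies all of them.
func checkFullCopy(c fullCopyConfig) error {
	switch {
	case c.HasKeyToList:
		return errors.New("full copy requires stream.KeyToList to be nil")
	case c.HasChooseKey:
		return errors.New("full copy requires stream.ChooseKey to be nil")
	case c.SinceTs != 0:
		return fmt.Errorf("full copy requires stream.SinceTs to be 0, got %d", c.SinceTs)
	case !c.ManagedTxns:
		return errors.New("full copy requires db.managedTxns to be true")
	case c.SenderEncrypted != c.ReceiverEncrypted:
		return errors.New("sender and receiver must use the same encryption mode")
	}
	return nil
}

func main() {
	ok := fullCopyConfig{ManagedTxns: true}
	fmt.Println(checkFullCopy(ok)) // prints <nil>

	bad := ok
	bad.SinceTs = 42
	fmt.Println(checkFullCopy(bad))
}
```

Returning the first violation keeps the sketch short; a real check might instead collect every violated precondition so that the caller can fix them all in one pass.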