Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

segmentio/topk

Folders and files

NameName
Last commit message
Last commit date

Latest commit

History

7 Commits

Repository files navigation

HeavyKeeper

Note

Segment has paused maintenance on this project, but may return it to an active status in the future. Issues and pull requests from external contributors are not being considered, although internal contributions may appear from time to time. The project remains available under its open source license for anyone to use.

This package implements the HeavyKeeper algorithm for efficiently finding top-K flows in an unbounded set of flows.

Paper: https://www.usenix.org/system/files/conference/atc18/atc18-gong.pdf

The reference implementation is pretty difficult to follow. I found the RedisBloom implementation to be far eaier to read, if you're interested in seeing a more battle-tested implementation. The RedisBloom implementation also has some optimizations that might be worth looking at (a decay lookup table, for example) and it supports arbitrary increments greater than one, which I've implemented here.

This implementation uses a default width and depth to simplify usage:

  • width = k * log(k) (minimum of 256)
  • height = log(k) (minimum of 3)

Usage

hk := topk.New(100, 0.9)
hk.Add("foo", 1)
hk.Add("bar", 5)
hk.Add("baz", 1)
hk.Add("baz", 1)
for _, fc := range hk.Top() {
 fmt.Printf("%s = %d\n", fc.Flow, fc.Count)
}
// bar = 5
// baz = 2
// foo = 1
count, ok := hk.Count("bar")
fmt.Println(count, ok)
// 5 true

Benchmarks

The algorithm itself is rather efficient on its own; I haven't invested any time in further optimizing things (yet).

goos: darwin
goarch: amd64
pkg: github.com/segmentio/topk
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkAdd/K=10-12 	17065675	 79.38 ns/op	 0 B/op	 0 allocs/op
BenchmarkAdd/K=50-12 	11193319	 106.5 ns/op	 0 B/op	 0 allocs/op
BenchmarkAdd/K=100-12 	 9880362	 131.8 ns/op	 0 B/op	 0 allocs/op
BenchmarkAdd/K=500-12 	 7442464	 159.6 ns/op	 0 B/op	 0 allocs/op
BenchmarkAdd/K=1000-12 	 7125268	 167.8 ns/op	 0 B/op	 0 allocs/op
BenchmarkAdd/K=5000-12 	 5797017	 206.3 ns/op	 0 B/op	 0 allocs/op
BenchmarkAdd/K=10000-12 	 5218218	 233.2 ns/op	 0 B/op	 0 allocs/op

About

Go implementation of HeavyKeeper for efficiently finding top K flows

Topics

Resources

License

Code of conduct

Stars

Watchers

Forks

Packages

No packages published

Contributors 2

Languages

AltStyle によって変換されたページ (->オリジナル) /