Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

long-lived pools/background jobs #93

bobheadxi started this conversation in Ideas
Discussion options

I think there might be a category of use cases where one might want to maintain a long-lived pool where:

  • tasks can be continuously submitted, with per-task error handling (ideally per-task panic handling as well, rather than aggregated panic handling) and concurrency limits
  • on service shutdown, offer a mechansim to signal and wait for all goroutines to stop

Somewhat related: #86 (comment), #89 (comment)

This might be something that's useful to offer separately, since I think all the conc.WaitGroup-based mechanisms should definitely not be used for long-lived goroutine pools (most notably because panics get swallowed until Wait())

You must be logged in to vote

Replies: 2 comments 4 replies

Comment options

I agree that this is a category of use for goroutines that is not currently nicely handled by conc. I don't have a good vision of how we can solve this while maintaining the scoped-concurrency spirit of conc though.

The stream package currently suffers from this too -- it doesn't propagate a panic until the stream is complete. So using it for long-lived streams is usually inappropriate. For the sake of a shared starting point, I'm going to explore the design here in terms of the existing stream package.

I think the biggest design hurdle here is, in order to propagate a panic before calling s.Wait(), we need something running in the parent goroutine. Consider the following snippet:

func processStream(input <-chan int, output chan<- int) {
	s := stream.New().WithMaxGoroutines(8)
	for elem := range input {
		elem := elem
		s.Go(func() stream.Callback {
			res := fetch(elem)
			return func() { output <- res }
		})
	}
	s.Wait()
}

While blocked on a read from input, it is impossible to propagate a panic to the parent goroutine. We could probably improve on the situation slightly by making it possible to propagate a panic from s.Go(), but that will still only trigger when we have a new task to spawn.

Another important consideration is, without a cancellation mechanism, groups don't work well for long-running goroutines because we can't propagate a panic until all the spawned goroutines exit. So even if we have a way to propagate a panic, we don't have a way to tell the other goroutines to exit unless we're using something like ContextPool.

In general, I think it's an antipattern to "handle" a panic by logging it or whatever, so I don't want to make it super easy to do that by adding something like "WithPanicHandler" (this is still achievable by using recover in your spawned tasks if you really want it). However, I think it might be valuable to allow opting out of the panic propagation, falling back to the default "crash when a panic hits the top of a goroutine" behavior.

I'm not sure this actually brought us closer to a solution, but I'll keep mulling over it 🙂 I'd be curious to see design ideas here.

You must be logged in to vote
1 reply
Comment options

bobheadxi Feb 24, 2023
Collaborator Author

Throwing a convoluted quick sketch out: maybe we can have something that handles incoming work as a queue, sends panics to a results channel, and monitors for cancellation and panics to interrupt work:

type Processor struct {
	ctx context.Context
	cancel func()
	wg sync.WaitGroup
	startDone chan struct{}
	work chan func(context.Context)
	panics chan *panics.RecoveredPanic
}
func (p *Processor) Submit(fn func(context.Context)) { p.work <- fn }
func (p *Processor) Start() {
	for {
		select {
		case <-p.ctx.Done():
			close(p.startDone)
			break
		case recovered := <-p.panics:
			p.cancel()
			p.wg.Wait()
			panic(recovered) // or send to Wait to throw?
		// TODO maybe have work handling and panic/cancel handling in separate goroutines
		case fn := <-p.work:
			// TODO something more robust like concurrency handling
			p.wg.Add(1)
			go func() {
				defer p.wg.Done()
				recovered := panics.Try(func() {
					fn(p.ctx)
				})
				if recovered != nil {
					p.panics <- recovered
				}
			}()
		}
	}
}
func (p *Processor) Cancel() {
	p.cancel()
}
func (p *Processor) Wait() {
	p.wg.Wait()
	<-p.startDone
}
Comment options

long-lived goroutine pools

Hi, I'm new to conc so sorry if I'm jumping in inappropriately, but, is there really a case for long-lived pools? Thread pools are a thing in other languages due to the cost of threads, but a conc pool is not expensive to create (or put another way, if you are creating so many pools that it becomes a problem, your probably have other issues.)

The thing I like about conc is that it steers me away from patterns that are more likely to result in headaches than benefits.

You must be logged in to vote
3 replies
Comment options

Hi @iamnoah! Good question. You're not wrong -- there are not many cases for long-lived pools. There is no real need to re-use pools over a long period of time since they are so cheap to create.

Maybe it would be more clear to phrase it as "pools with long-lived goroutines." Consider a situation like listening on two different ports. For each port, a goroutine is spawned to handle new connections. These two goroutines exist for the lifetime of the server. However, if either panics or errors, you might still want to return control to the spawning goroutine to do some cleanup tasks.

Right now, conc doesn't handle this situation well because a panic/error will not return control to the spawning goroutine until all goroutines have exited. If one listener fails and we catch its panic, the second listener will still happily continue listening for all eternity.

IMO, that's the crux of this discussion -- first, whether that's a situation conc can handle, and then whether it's a situation conc should handle. Because I agree: I do not want to introduce features in conc that make it easy to do bad things.

Comment options

Right now, conc doesn't handle this situation well because a panic/error will not return control to the spawning goroutine until all goroutines have exited.

What is the downside of WithCancelOnError? When I have multiple long-lived goroutines, I need them all to do their own cleanup before doing any final cleanup.

Comment options

bobheadxi Feb 24, 2023
Collaborator Author

Thread pools are a thing in other languages due to the cost of threads, but a conc pool is not expensive to create

There is no real need to re-use pools over a long period of time since they are so cheap to create.

I meant don't mean long-lived pools in the sense the goroutines themselves are the resource - one example is the one Camden raises, where e.g. one might have some long-lived processes, though like you mention WithCancelOnError can be used for this because if a process fails (or panics #92) the other task will be cancelled and control will be returned to the caller to handle.

However, another example is one where you want to control concurrency and cleanup over arbitrary tasks that might be submitted (as described in the original discussion post). For example, a processor that handles tasks being submitted continuously, such that it can:

  • only handle N at a time
  • handle tasks failing independently
  • signal that all tasks that are currently running should stop
  • etc

The original discussion post links a few requests regarding similar use cases that I think are along similar lines where whether conc/pool can/should be used is a bit dubious. #93 (comment) has another good example

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Category
Ideas
Labels
None yet

AltStyle によって変換されたページ (->オリジナル) /