applyparallelworker.c

/*-------------------------------------------------------------------------
 * applyparallelworker.c
 *     Support routines for applying xact by parallel apply worker
 *
 *
 * IDENTIFICATION
 *    src/backend/replication/logical/applyparallelworker.c
 *
 * This file contains the code to launch, set up, and tear down a parallel apply
 * worker which receives the changes from the leader worker and invokes routines
 * to apply those on the subscriber database. Additionally, this file contains
 * routines that are intended to support setting up, using, and tearing down a
 * ParallelApplyWorkerInfo which is required so the leader worker and parallel
 * apply workers can communicate with each other.
 *
 * The parallel apply workers are assigned (if available) as soon as xact's
 * first stream is received for subscriptions that have set their 'streaming'
 * option as parallel. The leader apply worker will send changes to this new
 * worker via shared memory. We keep this worker assigned till the transaction
 * commit is received and also wait for the worker to finish at commit. This
 * preserves commit ordering and avoids file I/O in most cases, although we
 * still need to spill to a file if there is no worker available. See comments
 * atop logical/worker.c to know more about streamed xacts whose changes are
 * spilled to disk. It is important to maintain commit order to avoid failures
 * due to: (a) transaction dependencies - say if we insert a row in the first
 * transaction and update it in the second transaction on publisher then
 * allowing the subscriber to apply both in parallel can lead to failure in the
 * update; (b) deadlocks - allowing transactions that update the same set of
 * rows/tables in the opposite order to be applied in parallel can lead to
 * deadlocks.
 *
 * A worker pool is used to avoid restarting workers for each streaming
 * transaction. We maintain each worker's information (ParallelApplyWorkerInfo)
 * in the ParallelApplyWorkerPool. After successfully launching a new worker,
 * its information is added to the ParallelApplyWorkerPool. Once the worker
 * finishes applying the transaction, it is marked as available for re-use.
 * Now, before starting a new worker to apply the streaming transaction, we
 * check the list for any available worker. Note that we retain a maximum of
 * half the max_parallel_apply_workers_per_subscription workers in the pool and
 * after that, we simply exit the worker after applying the transaction.
 *
 * XXX This worker pool threshold is arbitrary and we can provide a GUC
 * variable for this in the future if required.
 *
 * The leader apply worker will create a separate dynamic shared memory segment
 * when each parallel apply worker starts. The reason for this design is that
 * we cannot predict how many workers will be needed. It may be possible to
 * allocate enough shared memory in one segment based on the maximum number of
 * parallel apply workers (max_parallel_apply_workers_per_subscription), but
 * this would waste memory if no process is actually started.
 *
 * The dynamic shared memory segment contains: (a) a shm_mq that is used to
 * send changes in the transaction from leader apply worker to parallel apply
 * worker; (b) another shm_mq that is used to send errors (and other messages
 * reported via elog/ereport) from the parallel apply worker to leader apply
 * worker; (c) necessary information to be shared among parallel apply workers
 * and the leader apply worker (i.e. members of ParallelApplyWorkerShared).
 *
 * Locking Considerations
 * ----------------------
 * We have a risk of deadlock due to concurrently applying the transactions in
 * parallel mode that were independent on the publisher side but became
 * dependent on the subscriber side due to the different database structures
 * (like schema of subscription tables, constraints, etc.) on each side. This
 * can happen even without parallel mode when there are concurrent operations
 * on the subscriber. In order to detect the deadlocks among leader (LA) and
 * parallel apply (PA) workers, we use lmgr locks when the PA waits for the
 * next stream (set of changes) and LA waits for PA to finish the transaction.
 * An alternative approach could be to not allow parallelism when the schema of
 * tables is different between the publisher and subscriber but that would be
 * too restrictive and would require the publisher to send much more
 * information than it is currently sending.
 *
 * Consider a case where the subscribed table does not have a unique key on the
 * publisher and has a unique key on the subscriber. The deadlock can happen in
 * the following ways:
 *
 * 1) Deadlock between the leader apply worker and a parallel apply worker
 *
 * Consider that the parallel apply worker (PA) is executing TX-1 and the
 * leader apply worker (LA) is executing TX-2 concurrently on the subscriber.
 * Now, LA is waiting for PA because of the unique key constraint of the
 * subscribed table while PA is waiting for LA to send the next stream of
 * changes or transaction finish command message.
 *
 * In order for lmgr to detect this, we have LA acquire a session lock on the
 * remote transaction (by pa_lock_stream()) and have PA wait on the lock before
 * trying to receive the next stream of changes. Specifically, LA will acquire
 * the lock in AccessExclusive mode before sending the STREAM_STOP and will
 * release it if already acquired after sending the STREAM_START, STREAM_ABORT
 * (for toplevel transaction), STREAM_PREPARE, and STREAM_COMMIT. The PA will
 * acquire the lock in AccessShare mode after processing STREAM_STOP and
 * STREAM_ABORT (for subtransaction) and then release the lock immediately
 * after acquiring it.
 *
 * The lock graph for the above example will look as follows:
 * LA (waiting to acquire the lock on the unique index) -> PA (waiting to
 * acquire the stream lock) -> LA
 *
 * This way, when PA is waiting for LA for the next stream of changes, we can
 * have a wait-edge from PA to LA in lmgr, which will make us detect the
 * deadlock between LA and PA.
 *
 * 2) Deadlock between the leader apply worker and parallel apply workers
 *
 * This scenario is similar to the first case but TX-1 and TX-2 are executed by
 * two parallel apply workers (PA-1 and PA-2 respectively). In this scenario,
 * PA-2 is waiting for PA-1 to complete its transaction while PA-1 is waiting
 * for subsequent input from LA. Also, LA is waiting for PA-2 to complete its
 * transaction in order to preserve the commit order. There is a deadlock among
 * the three processes.
 *
 * In order for lmgr to detect this, we have PA acquire a session lock (this is
 * a different lock than the one referred to in the previous case, see
 * pa_lock_transaction()) on the transaction being applied and have LA wait on
 * the lock before proceeding in the transaction finish commands. Specifically,
 * PA will acquire this lock in AccessExclusive mode before executing the first
 * message of the transaction and release it at the xact end. LA will acquire
 * this lock in AccessShare mode at transaction finish commands and release it
 * immediately.
 *
 * The lock graph for the above example will look as follows:
 * LA (waiting to acquire the transaction lock) -> PA-2 (waiting to acquire the
 * lock due to unique index constraint) -> PA-1 (waiting to acquire the stream
 * lock) -> LA
 *
 * This way when LA is waiting to finish the transaction end command to preserve
 * the commit order, we will be able to detect deadlock, if any.
 *
 * One might think we can use XactLockTableWait(), but XactLockTableWait()
 * considers PREPARED TRANSACTION as still in progress which means the lock
 * won't be released even after the parallel apply worker has prepared the
 * transaction.
 *
 * 3) Deadlock when the shm_mq buffer is full
 *
 * In the previous scenario (i.e. PA-1 and PA-2 are executing transactions
 * concurrently), if the shm_mq buffer between LA and PA-2 is full, LA has to
 * wait to send messages, and this wait doesn't appear in lmgr.
 *
 * To avoid this wait, we use a non-blocking write and wait with a timeout. If
 * the timeout is exceeded, the LA will serialize all the pending messages to
 * a file and indicate to PA-2 that it needs to read that file for the
 * remaining messages. Then LA will start waiting for commit as in the previous
 * case which will detect deadlock if any. See pa_send_data() and
 * enum TransApplyAction.
 *
 * Lock types
 * ----------
 * Both the stream lock and the transaction lock mentioned above are
 * session-level locks because both locks could be acquired outside the
 * transaction, and the stream lock in the leader needs to persist across
 * transaction boundaries i.e. until the end of the streaming transaction.
 *-------------------------------------------------------------------------
 */
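The lock graphs described in the header comment can be pictured as a toy wait-for graph. What follows is a minimal, self-contained sketch and not PostgreSQL's deadlock detector: the process indices, the `waits_for` array, and `has_wait_cycle()` are hypothetical names introduced only for illustration. It assumes each process waits on at most one other process. Once the stream-lock wait of PA-1 is represented as an edge, walking the edges from LA finds the cycle LA -> PA-2 -> PA-1 -> LA.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical process identifiers for the toy model. */
enum { LA = 0, PA1 = 1, PA2 = 2, NPROCS = 3, NOT_WAITING = -1 };

/*
 * Return true if following wait-for edges from 'start' leads back to
 * 'start'. waits_for[p] is the process that p is blocked on, or
 * NOT_WAITING if p is not blocked on a lock visible to the detector.
 */
static bool
has_wait_cycle(const int waits_for[NPROCS], int start)
{
    int cur;
    int steps;

    assert(start >= 0 && start < NPROCS);

    cur = waits_for[start];

    /* A simple path visits at most NPROCS nodes, so bound the walk. */
    for (steps = 0; steps < NPROCS && cur != NOT_WAITING; steps++)
    {
        if (cur == start)
            return true;
        cur = waits_for[cur];
    }

    return false;
}
```

If PA-1's wait for the next stream were not expressed as a lock wait (its entry is `NOT_WAITING`), the same walk ends without finding a cycle, which is the undetected-deadlock case the stream lock is meant to avoid.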

#include "postgres.h"

#include "libpq/pqformat.h"
#include "libpq/pqmq.h"
#include "pgstat.h"
#include "postmaster/interrupt.h"
#include "replication/logicallauncher.h"
#include "replication/logicalworker.h"
#include "replication/origin.h"
#include "replication/worker_internal.h"
#include "storage/ipc.h"
#include "storage/lmgr.h"
#include "tcop/tcopprot.h"
#include "utils/inval.h"
#include "utils/memutils.h"
#include "utils/syscache.h"

#define PG_LOGICAL_APPLY_SHM_MAGIC 0x787ca067

/*
 * DSM keys for parallel apply worker. Unlike other parallel execution code,
 * since we don't need to worry about DSM keys conflicting with plan_node_id we
 * can use small integers.
 */
#define PARALLEL_APPLY_KEY_SHARED       1
#define PARALLEL_APPLY_KEY_MQ           2
#define PARALLEL_APPLY_KEY_ERROR_QUEUE  3

/* Queue size of DSM, 16 MB for now. */
#define DSM_QUEUE_SIZE (16 * 1024 * 1024)

/*
 * Error queue size of DSM. It is desirable to make it large enough that a
 * typical ErrorResponse can be sent without blocking. That way, a worker that
 * errors out can write the whole message into the queue and terminate without
 * waiting for the user backend.
 */
#define DSM_ERROR_QUEUE_SIZE (16 * 1024)

/*
 * There are three fields in each message received by the parallel apply
 * worker: start_lsn, end_lsn and send_time. Because we have updated these
 * statistics in the leader apply worker, we can ignore these fields in the
 * parallel apply worker (see function LogicalRepApplyLoop).
 */
#define SIZE_STATS_MESSAGE (2 * sizeof(XLogRecPtr) + sizeof(TimestampTz))

/*
 * The type of session-level lock on a transaction being applied on a logical
 * replication subscriber.
 */
#define PARALLEL_APPLY_LOCK_STREAM  0
#define PARALLEL_APPLY_LOCK_XACT    1

/*
 * Hash table entry to map xid to the parallel apply worker state.
 */
typedef struct ParallelApplyWorkerEntry
{
    TransactionId xid;          /* Hash key -- must be first */
    ParallelApplyWorkerInfo *winfo;
} ParallelApplyWorkerEntry;

/*
 * A hash table used to cache the state of streaming transactions being applied
 * by the parallel apply workers.
 */
static HTAB *ParallelApplyTxnHash = NULL;

/*
 * A list (pool) of active parallel apply workers. The information for
 * the new worker is added to the list after successfully launching it. The
 * list entry is removed if there are already enough workers in the worker
 * pool at the end of the transaction. For more information about the worker
 * pool, see comments atop this file.
 */
static List *ParallelApplyWorkerPool = NIL;

/*
 * Information shared between leader apply worker and parallel apply worker.
 */
ParallelApplyWorkerShared *MyParallelShared = NULL;

/*
 * Is there a message sent by a parallel apply worker that the leader apply
 * worker needs to receive?
 */
volatile sig_atomic_t ParallelApplyMessagePending = false;

/*
 * Cache the parallel apply worker information required for applying the
 * current streaming transaction. It is used to save the cost of searching the
 * hash table when applying the changes between STREAM_START and STREAM_STOP.
 */
static ParallelApplyWorkerInfo *stream_apply_worker = NULL;

/* A list to maintain subtransactions, if any. */
static List *subxactlist = NIL;

static void pa_free_worker_info(ParallelApplyWorkerInfo *winfo);
static ParallelTransState pa_get_xact_state(ParallelApplyWorkerShared *wshared);
static PartialFileSetState pa_get_fileset_state(void);

/*
 * Returns true if it is OK to start a parallel apply worker, false otherwise.
 */
static bool
pa_can_start(void)
{
    /* Only leader apply workers can start parallel apply workers. */
    if (!am_leader_apply_worker())
        return false;

    /*
     * It is good to check for any change in the subscription parameter to
     * avoid the case where for a very long time the change doesn't get
     * reflected. This can happen when there is a constant flow of streaming
     * transactions that are handled by parallel apply workers.
     *
     * It is better to do it before the below checks so that the latest values
     * of subscription can be used for the checks.
     */
    maybe_reread_subscription();

    /*
     * Don't start a new parallel apply worker if the subscription is not
     * using parallel streaming mode, or if the publisher does not support
     * parallel apply.
     */
    if (!MyLogicalRepWorker->parallel_apply)
        return false;

    /*
     * Don't start a new parallel worker if user has set skiplsn as it's
     * possible that they want to skip the streaming transaction. For
     * streaming transactions, we need to serialize the transaction to a file
     * so that we can get the last LSN of the transaction to judge whether to
     * skip before starting to apply the change.
     *
     * One might think that we could allow parallelism if the first lsn of the
     * transaction is greater than skiplsn, but we don't send it with the
     * STREAM START message, and it doesn't seem worth sending the extra eight
     * bytes with the STREAM START to enable parallelism for this case.
     */
    if (!XLogRecPtrIsInvalid(MySubscription->skiplsn))
        return false;

    /*
     * For streaming transactions that are being applied using a parallel
     * apply worker, we cannot decide whether to apply the change for a
     * relation that is not in the READY state (see
     * should_apply_changes_for_rel) as we won't know remote_final_lsn by that
     * time. So, we don't start the new parallel apply worker in this case.
     */
    if (!AllTablesyncsReady())
        return false;

    return true;
}

/*
 * Set up a dynamic shared memory segment.
 *
 * We set up a control region that contains a fixed-size worker info
 * (ParallelApplyWorkerShared), a message queue, and an error queue.
 *
 * Returns true on success, false on failure.
 */
static bool
pa_setup_dsm(ParallelApplyWorkerInfo *winfo)
{
    shm_toc_estimator e;
    Size segsize;
    dsm_segment *seg;
    shm_toc *toc;
    ParallelApplyWorkerShared *shared;
    shm_mq *mq;
    Size queue_size = DSM_QUEUE_SIZE;
    Size error_queue_size = DSM_ERROR_QUEUE_SIZE;

    /*
     * Estimate how much shared memory we need.
     *
     * Because the TOC machinery may choose to insert padding of oddly-sized
     * requests, we must estimate each chunk separately.
     *
     * We need one key to register the location of the header, and two other
     * keys to track the locations of the message queue and the error message
     * queue.
     */
    shm_toc_initialize_estimator(&e);
    shm_toc_estimate_chunk(&e, sizeof(ParallelApplyWorkerShared));
    shm_toc_estimate_chunk(&e, queue_size);
    shm_toc_estimate_chunk(&e, error_queue_size);

    shm_toc_estimate_keys(&e, 3);
    segsize = shm_toc_estimate(&e);

    /* Create the shared memory segment and establish a table of contents. */
    seg = dsm_create(segsize, 0);
    if (!seg)
        return false;

    toc = shm_toc_create(PG_LOGICAL_APPLY_SHM_MAGIC, dsm_segment_address(seg),
                         segsize);

    /* Set up the header region. */
    shared = shm_toc_allocate(toc, sizeof(ParallelApplyWorkerShared));
    SpinLockInit(&shared->mutex);

    shared->xact_state = PARALLEL_TRANS_UNKNOWN;
    pg_atomic_init_u32(&(shared->pending_stream_count), 0);
    shared->last_commit_end = InvalidXLogRecPtr;
    shared->fileset_state = FS_EMPTY;

    shm_toc_insert(toc, PARALLEL_APPLY_KEY_SHARED, shared);

    /* Set up message queue for the worker. */
    mq = shm_mq_create(shm_toc_allocate(toc, queue_size), queue_size);
    shm_toc_insert(toc, PARALLEL_APPLY_KEY_MQ, mq);
    shm_mq_set_sender(mq, MyProc);

    /* Attach the queue. */
    winfo->mq_handle = shm_mq_attach(mq, seg, NULL);

    /* Set up error queue for the worker. */
    mq = shm_mq_create(shm_toc_allocate(toc, error_queue_size),
                       error_queue_size);
    shm_toc_insert(toc, PARALLEL_APPLY_KEY_ERROR_QUEUE, mq);
    shm_mq_set_receiver(mq, MyProc);

    /* Attach the queue. */
    winfo->error_mq_handle = shm_mq_attach(mq, seg, NULL);

    /* Return results to caller. */
    winfo->dsm_seg = seg;
    winfo->shared = shared;

    return true;
}

/*
 * Try to get a parallel apply worker from the pool. If none is available then
 * start a new one.
 */
static ParallelApplyWorkerInfo *
pa_launch_parallel_worker(void)
{
    MemoryContext oldcontext;
    bool launched;
    ParallelApplyWorkerInfo *winfo;
    ListCell *lc;

    /* Try to get an available parallel apply worker from the worker pool. */
    foreach(lc, ParallelApplyWorkerPool)
    {
        winfo = (ParallelApplyWorkerInfo *) lfirst(lc);

        if (!winfo->in_use)
            return winfo;
    }

    /*
     * Start a new parallel apply worker.
     *
     * The worker info can be used for the lifetime of the worker process, so
     * create it in a permanent context.
     */
    oldcontext = MemoryContextSwitchTo(ApplyContext);

    winfo = (ParallelApplyWorkerInfo *) palloc0(sizeof(ParallelApplyWorkerInfo));

    /* Set up shared memory. */
    if (!pa_setup_dsm(winfo))
    {
        MemoryContextSwitchTo(oldcontext);
        pfree(winfo);
        return NULL;
    }

    launched = logicalrep_worker_launch(WORKERTYPE_PARALLEL_APPLY,
                                        MyLogicalRepWorker->dbid,
                                        MySubscription->oid,
                                        MySubscription->name,
                                        MyLogicalRepWorker->userid,
                                        InvalidOid,
                                        dsm_segment_handle(winfo->dsm_seg),
                                        false);

    if (launched)
    {
        ParallelApplyWorkerPool = lappend(ParallelApplyWorkerPool, winfo);
    }
    else
    {
        pa_free_worker_info(winfo);
        winfo = NULL;
    }

    MemoryContextSwitchTo(oldcontext);

    return winfo;
}

/*
 * Allocate a parallel apply worker that will be used for the specified xid.
 *
 * We first try to get an available worker from the pool, if any and then try
 * to launch a new worker. On successful allocation, remember the worker
 * information in the hash table so that we can get it later for processing the
 * streaming changes.
 */
void
pa_allocate_worker(TransactionId xid)
{
    bool found;
    ParallelApplyWorkerInfo *winfo = NULL;
    ParallelApplyWorkerEntry *entry;

    if (!pa_can_start())
        return;

    winfo = pa_launch_parallel_worker();
    if (!winfo)
        return;

    /* First time through, initialize parallel apply worker state hashtable. */
    if (!ParallelApplyTxnHash)
    {
        HASHCTL ctl;

        MemSet(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(TransactionId);
        ctl.entrysize = sizeof(ParallelApplyWorkerEntry);
        ctl.hcxt = ApplyContext;

        ParallelApplyTxnHash = hash_create("logical replication parallel apply workers hash",
                                           16, &ctl,
                                           HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
    }

    /* Create an entry for the requested transaction. */
    entry = hash_search(ParallelApplyTxnHash, &xid, HASH_ENTER, &found);
    if (found)
        elog(ERROR, "hash table corrupted");

    /* Update the transaction information in shared memory. */
    SpinLockAcquire(&winfo->shared->mutex);
    winfo->shared->xact_state = PARALLEL_TRANS_UNKNOWN;
    winfo->shared->xid = xid;
    SpinLockRelease(&winfo->shared->mutex);

    winfo->in_use = true;
    winfo->serialize_changes = false;
    entry->winfo = winfo;
}

/*
 * Find the assigned worker for the given transaction, if any.
 */
ParallelApplyWorkerInfo *
pa_find_worker(TransactionId xid)
{
    bool found;
    ParallelApplyWorkerEntry *entry;

    if (!TransactionIdIsValid(xid))
        return NULL;

    if (!ParallelApplyTxnHash)
        return NULL;

    /* Return the cached parallel apply worker if valid. */
    if (stream_apply_worker)
        return stream_apply_worker;

    /* Find an entry for the requested transaction. */
    entry = hash_search(ParallelApplyTxnHash, &xid, HASH_FIND, &found);
    if (found)
    {
        /* The worker must not have exited. */
        Assert(entry->winfo->in_use);
        return entry->winfo;
    }

    return NULL;
}

/*
 * Makes the worker available for reuse.
 *
 * This removes the parallel apply worker entry from the hash table so that it
 * can't be used. If there are enough workers in the pool, it stops the worker
 * and frees the corresponding info. Otherwise it just marks the worker as
 * available for reuse.
 *
 * For more information about the worker pool, see comments atop this file.
 */
static void
pa_free_worker(ParallelApplyWorkerInfo *winfo)
{
    Assert(!am_parallel_apply_worker());
    Assert(winfo->in_use);
    Assert(pa_get_xact_state(winfo->shared) == PARALLEL_TRANS_FINISHED);

    if (!hash_search(ParallelApplyTxnHash, &winfo->shared->xid, HASH_REMOVE, NULL))
        elog(ERROR, "hash table corrupted");

    /*
     * Stop the worker if there are enough workers in the pool.
     *
     * XXX Additionally, we also stop the worker if the leader apply worker
     * serializes part of the transaction data due to a send timeout. This is
     * because the message could be partially written to the queue and there
     * is no way to clean the queue other than resending the message until it
     * succeeds. Instead of trying to send the data which anyway would have
     * been serialized and then letting the parallel apply worker deal with
     * the spurious message, we stop the worker.
     */
    if (winfo->serialize_changes ||
        list_length(ParallelApplyWorkerPool) >
        (max_parallel_apply_workers_per_subscription / 2))
    {
        logicalrep_pa_worker_stop(winfo);
        pa_free_worker_info(winfo);

        return;
    }

    winfo->in_use = false;
    winfo->serialize_changes = false;
}
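The retain-or-stop decision made in pa_free_worker() can be isolated as a small predicate. This is a minimal sketch under stated assumptions, not code from this file: `should_stop_worker()` and its parameters are hypothetical, with `max_per_sub` standing in for max_parallel_apply_workers_per_subscription and `pool_size` for the current length of the worker pool.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether a worker that just finished its transaction should be
 * stopped (true) or kept in the pool for reuse (false).
 *
 * pool_size:          current number of workers in the pool
 * max_per_sub:        stand-in for max_parallel_apply_workers_per_subscription
 * serialized_changes: the leader spilled part of the transaction to a file
 */
static bool
should_stop_worker(int pool_size, int max_per_sub, bool serialized_changes)
{
    assert(pool_size >= 0 && max_per_sub >= 0);

    /* A worker whose queue may hold a partial message is never reused. */
    if (serialized_changes)
        return true;

    /* Integer division, matching the "half the limit" threshold. */
    return pool_size > (max_per_sub / 2);
}
```

With a limit of 2, for example, one pooled worker is retained and a second one is stopped after it finishes, because 2 > 2 / 2.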

/*
 * Free the parallel apply worker information and unlink the files with
 * serialized changes if any.
 */
static void
pa_free_worker_info(ParallelApplyWorkerInfo *winfo)
{
    Assert(winfo);

    if (winfo->mq_handle)
        shm_mq_detach(winfo->mq_handle);

    if (winfo->error_mq_handle)
        shm_mq_detach(winfo->error_mq_handle);

    /* Unlink the files with serialized changes. */
    if (winfo->serialize_changes)
        stream_cleanup_files(MyLogicalRepWorker->subid, winfo->shared->xid);

    if (winfo->dsm_seg)
        dsm_detach(winfo->dsm_seg);

    /* Remove from the worker pool. */
    ParallelApplyWorkerPool = list_delete_ptr(ParallelApplyWorkerPool, winfo);

    pfree(winfo);
}

/*
 * Detach the error queue for all parallel apply workers.
 */
void
pa_detach_all_error_mq(void)
{
    ListCell *lc;

    foreach(lc, ParallelApplyWorkerPool)
    {
        ParallelApplyWorkerInfo *winfo = (ParallelApplyWorkerInfo *) lfirst(lc);

        if (winfo->error_mq_handle)
        {
            shm_mq_detach(winfo->error_mq_handle);
            winfo->error_mq_handle = NULL;
        }
    }
}

/*
 * Check if there are any pending spooled messages.
 */
static bool
pa_has_spooled_message_pending(void)
{
    PartialFileSetState fileset_state;

    fileset_state = pa_get_fileset_state();

    return (fileset_state != FS_EMPTY);
}

/*
 * Replay the spooled messages once the leader apply worker has finished
 * serializing changes to the file.
 *
 * Returns false if there aren't any pending spooled messages, true otherwise.
 */
static bool
pa_process_spooled_messages_if_required(void)
{
    PartialFileSetState fileset_state;

    fileset_state = pa_get_fileset_state();

    if (fileset_state == FS_EMPTY)
        return false;

    /*
     * If the leader apply worker is busy serializing the partial changes then
     * acquire the stream lock now and wait for the leader worker to finish
     * serializing the changes. Otherwise, the parallel apply worker won't get
     * a chance to receive a STREAM_STOP (and acquire the stream lock) until
     * the leader has serialized all changes, which can lead to an undetected
     * deadlock.
     *
     * Note that the fileset state can be FS_SERIALIZE_DONE once the leader
     * worker has finished serializing the changes.
     */
    if (fileset_state == FS_SERIALIZE_IN_PROGRESS)
    {
        pa_lock_stream(MyParallelShared->xid, AccessShareLock);
        pa_unlock_stream(MyParallelShared->xid, AccessShareLock);

        fileset_state = pa_get_fileset_state();
    }

    /*
     * We cannot read the file immediately after the leader has serialized all
     * changes to the file because there may still be messages in the memory
     * queue. We will apply all spooled messages the next time we call this
     * function and that will ensure there are no messages left in the memory
     * queue.
     */
    if (fileset_state == FS_SERIALIZE_DONE)
    {
        pa_set_fileset_state(MyParallelShared, FS_READY);
    }
    else if (fileset_state == FS_READY)
    {
        apply_spooled_messages(&MyParallelShared->fileset,
                               MyParallelShared->xid,
                               InvalidXLogRecPtr);
        pa_set_fileset_state(MyParallelShared, FS_EMPTY);
    }

    return true;
}
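The state handling in pa_process_spooled_messages_if_required() amounts to a small state machine on the parallel apply worker's side. The sketch below is a simplified toy model and makes several assumptions: the `Demo*` names are hypothetical, the leader's transitions are not modelled, and the stream-lock wait in the in-progress state is replaced by an immediate return (the real function re-reads the state after the wait). It shows the deferral described in the comment above: the file is replayed only on the call after the state has been moved to ready.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the four fileset states used above. */
typedef enum
{
    DEMO_FS_EMPTY,
    DEMO_FS_SERIALIZE_IN_PROGRESS,
    DEMO_FS_SERIALIZE_DONE,
    DEMO_FS_READY
} DemoFileSetState;

/*
 * Advance the state by one call, as seen from the parallel apply worker.
 * Returns false when nothing is spooled, true otherwise. *applied is set
 * to true only on the call that would replay the spooled file.
 */
static bool
demo_process_spooled(DemoFileSetState *state, bool *applied)
{
    *applied = false;

    switch (*state)
    {
        case DEMO_FS_EMPTY:
            return false;
        case DEMO_FS_SERIALIZE_IN_PROGRESS:
            /* The real code waits on the stream lock; the toy just returns. */
            return true;
        case DEMO_FS_SERIALIZE_DONE:
            /* Defer reading so the in-memory queue is drained first. */
            *state = DEMO_FS_READY;
            return true;
        case DEMO_FS_READY:
            *applied = true;
            *state = DEMO_FS_EMPTY;
            return true;
    }

    return false;
}
```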

/*
 * Interrupt handler for main loop of parallel apply worker.
 */
static void
ProcessParallelApplyInterrupts(void)
{
    CHECK_FOR_INTERRUPTS();

    if (ShutdownRequestPending)
    {
        ereport(LOG,
                (errmsg("logical replication parallel apply worker for subscription \"%s\" has finished",
                        MySubscription->name)));

        proc_exit(0);
    }

    if (ConfigReloadPending)
    {
        ConfigReloadPending = false;
        ProcessConfigFile(PGC_SIGHUP);
    }
}

/* Parallel apply worker main loop. */
static void
LogicalParallelApplyLoop(shm_mq_handle *mqh)
{
    shm_mq_result shmq_res;
    ErrorContextCallback errcallback;
    MemoryContext oldcxt = CurrentMemoryContext;

    /*
     * Init the ApplyMessageContext which we clean up after each replication
     * protocol message.
     */
    ApplyMessageContext = AllocSetContextCreate(ApplyContext,
                                                "ApplyMessageContext",
                                                ALLOCSET_DEFAULT_SIZES);

    /*
     * Push apply error context callback. Fields will be filled while applying
     * a change.
     */
    errcallback.callback = apply_error_callback;
    errcallback.previous = error_context_stack;
    error_context_stack = &errcallback;

    for (;;)
    {
        void *data;
        Size len;

        ProcessParallelApplyInterrupts();

        /* Ensure we are reading the data into our memory context. */
        MemoryContextSwitchTo(ApplyMessageContext);

        shmq_res = shm_mq_receive(mqh, &len, &data, true);

        if (shmq_res == SHM_MQ_SUCCESS)
        {
            StringInfoData s;
            int c;

            if (len == 0)
                elog(ERROR, "invalid message length");

            initReadOnlyStringInfo(&s, data, len);

            /*
             * The first byte of messages sent from leader apply worker to
             * parallel apply workers can only be PqReplMsg_WALData.
             */
            c = pq_getmsgbyte(&s);
            if (c != PqReplMsg_WALData)
                elog(ERROR, "unexpected message \"%c\"", c);

            /*
             * Ignore statistics fields that have been updated by the leader
             * apply worker.
             *
             * XXX We can avoid sending the statistics fields from the leader
             * apply worker but for that, it needs to rebuild the entire
             * message by removing these fields which could be more work than
             * simply ignoring these fields in the parallel apply worker.
             */
            s.cursor += SIZE_STATS_MESSAGE;

            apply_dispatch(&s);
        }
        else if (shmq_res == SHM_MQ_WOULD_BLOCK)
        {
            /* Replay the changes from the file, if any. */
            if (!pa_process_spooled_messages_if_required())
            {
                int rc;

                /* Wait for more work. */
                rc = WaitLatch(MyLatch,
                               WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                               1000L,
                               WAIT_EVENT_LOGICAL_PARALLEL_APPLY_MAIN);

                if (rc & WL_LATCH_SET)
                    ResetLatch(MyLatch);
            }
        }
        else
        {
            Assert(shmq_res == SHM_MQ_DETACHED);

            ereport(ERROR,
                    (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                     errmsg("lost connection to the logical replication apply worker")));
        }

        MemoryContextReset(ApplyMessageContext);
        MemoryContextSwitchTo(oldcxt);
    }

    /* Pop the error context stack. */
    error_context_stack = errcallback.previous;

    MemoryContextSwitchTo(oldcxt);
}
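The per-message handling in LogicalParallelApplyLoop() (check the type byte, then skip the three statistics fields) can be sketched in isolation. This is a standalone illustration under stated assumptions: `demo_payload_offset()` and the `DEMO_*` macros are hypothetical, the type byte is assumed to be 'w', and the LSN and timestamp fields are assumed to be 8 bytes each. The extra length check is an addition for the sketch and is not part of the loop above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins: two 8-byte LSNs and one 8-byte timestamp. */
#define DEMO_MSG_WALDATA 'w'
#define DEMO_SIZE_STATS  (2 * sizeof(uint64_t) + sizeof(int64_t))

/*
 * Return the offset of the logical replication payload inside 'buf', or -1
 * if the message is empty, has an unexpected type byte, or is too short to
 * hold the statistics fields.
 */
static long
demo_payload_offset(const unsigned char *buf, size_t len)
{
    size_t offset = 1 + DEMO_SIZE_STATS;    /* type byte + skipped stats */

    if (buf == NULL || len == 0)
        return -1;
    if (buf[0] != DEMO_MSG_WALDATA)
        return -1;
    if (len < offset)
        return -1;

    return (long) offset;
}
```

Under these assumptions the payload handed to the dispatcher starts 25 bytes into the message: one type byte followed by 24 bytes of statistics that the leader has already accounted for.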

/*
 * Make sure the leader apply worker tries to read from our error queue one more
 * time. This guards against the case where we exit uncleanly without sending
 * an ErrorResponse, for example because some code calls proc_exit directly.
 *
 * Also explicitly detach from dsm segment to invoke on_dsm_detach callbacks,
 * if any. See ParallelWorkerShutdown for details.
 */
static void
pa_shutdown(int code, Datum arg)
{
    SendProcSignal(MyLogicalRepWorker->leader_pid,
                   PROCSIG_PARALLEL_APPLY_MESSAGE,
                   INVALID_PROC_NUMBER);

    dsm_detach((dsm_segment *) DatumGetPointer(arg));
}

853

854/*

855 * Parallel apply worker entry point.

856 */

857void

858 ParallelApplyWorkerMain(Datum main_arg)

859{

860 ParallelApplyWorkerShared *shared;

861 dsm_handle handle;

862 dsm_segment *seg;

863 shm_toc *toc;

864 shm_mq *mq;

865 shm_mq_handle *mqh;

866 shm_mq_handle *error_mqh;

867 RepOriginId originid;

868 int worker_slot = DatumGetInt32(main_arg);

869 char originname[NAMEDATALEN];

870

871 InitializingApplyWorker = true;

872

873 /*

874 * Setup signal handling.

875 *

876 * Note: We intentionally used SIGUSR2 to trigger a graceful shutdown

877 * initiated by the leader apply worker. This helps to differentiate it

878 * from the case where we abort the current transaction and exit on

879 * receiving SIGTERM.

880 */

881 pqsignal(SIGHUP, SignalHandlerForConfigReload);

882 pqsignal(SIGTERM, die);

883 pqsignal(SIGUSR2, SignalHandlerForShutdownRequest);

884 BackgroundWorkerUnblockSignals();

885

886 /*

887 * Attach to the dynamic shared memory segment for the parallel apply, and

888 * find its table of contents.

889 *

890 * Like parallel query, we don't need a resource owner by this time. See

891 * ParallelWorkerMain.

892 */

893 memcpy(&handle, MyBgworkerEntry->bgw_extra, sizeof(dsm_handle));

894 seg = dsm_attach(handle);

895 if (!seg)

896 ereport(ERROR,

897 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),

898 errmsg("could not map dynamic shared memory segment")));

899

900 toc = shm_toc_attach(PG_LOGICAL_APPLY_SHM_MAGIC, dsm_segment_address(seg));

901 if (!toc)

902 ereport(ERROR,

903 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),

904 errmsg("invalid magic number in dynamic shared memory segment")));

905

906 /* Look up the shared information. */

907 shared = shm_toc_lookup(toc, PARALLEL_APPLY_KEY_SHARED, false);

908 MyParallelShared = shared;

909

910 /*

911 * Attach to the message queue.

912 */

913 mq = shm_toc_lookup(toc, PARALLEL_APPLY_KEY_MQ, false);

914 shm_mq_set_receiver(mq, MyProc);

915 mqh = shm_mq_attach(mq, seg, NULL);

916

917 /*

918 * Primary initialization is complete. Now, we can attach to our slot.

919 * This is to ensure that the leader apply worker does not write data to

920 * the uninitialized memory queue.

921 */

922 logicalrep_worker_attach(worker_slot);

923

924 /*

925 * Register the shutdown callback after we are attached to the worker

926 * slot. This is to ensure that MyLogicalRepWorker remains valid when this

927 * callback is invoked.

928 */

929 before_shmem_exit(pa_shutdown, PointerGetDatum(seg));

930

931 SpinLockAcquire(&MyParallelShared->mutex);

932 MyParallelShared->logicalrep_worker_generation = MyLogicalRepWorker->generation;

933 MyParallelShared->logicalrep_worker_slot_no = worker_slot;

934 SpinLockRelease(&MyParallelShared->mutex);

935

936 /*

937 * Attach to the error queue.

938 */

939 mq = shm_toc_lookup(toc, PARALLEL_APPLY_KEY_ERROR_QUEUE, false);

940 shm_mq_set_sender(mq, MyProc);

941 error_mqh = shm_mq_attach(mq, seg, NULL);

942

943 pq_redirect_to_shm_mq(seg, error_mqh);

944 pq_set_parallel_leader(MyLogicalRepWorker->leader_pid,

945 INVALID_PROC_NUMBER);

946

947 MyLogicalRepWorker->last_send_time = MyLogicalRepWorker->last_recv_time =

948 MyLogicalRepWorker->reply_time = 0;

949

950 InitializeLogRepWorker();

951

952 InitializingApplyWorker = false;

953

954 /* Setup replication origin tracking. */

955 StartTransactionCommand();

956 ReplicationOriginNameForLogicalRep(MySubscription->oid, InvalidOid,

957 originname, sizeof(originname));

958 originid = replorigin_by_name(originname, false);

959

960 /*

961 * The parallel apply worker doesn't need to monopolize this replication

962 * origin which was already acquired by its leader process.

963 */

964 replorigin_session_setup(originid, MyLogicalRepWorker->leader_pid);

965 replorigin_session_origin = originid;

966 CommitTransactionCommand();

967

968 /*

969 * Setup callback for syscache so that we know when something changes in

970 * the subscription relation state.

971 */

972 CacheRegisterSyscacheCallback(SUBSCRIPTIONRELMAP,

973 invalidate_syncing_table_states,

974 (Datum) 0);

975

976 set_apply_error_context_origin(originname);

977

978 LogicalParallelApplyLoop(mqh);

979

980 /*

981 * The parallel apply worker must not get here because the parallel apply

982 * worker will only stop when it receives a SIGTERM or SIGUSR2 from the

983 * leader, or SIGINT from itself, or when there is an error. None of these

984 * cases will allow the code to reach here.

985 */

986 Assert(false);

987}

988

989/*

990 * Handle receipt of an interrupt indicating a parallel apply worker message.

991 *

992 * Note: this is called within a signal handler! All we can do is set a flag

993 * that will cause the next CHECK_FOR_INTERRUPTS() to invoke

994 * ProcessParallelApplyMessages().

995 */

996void

997 HandleParallelApplyMessageInterrupt(void)

998{

999 InterruptPending = true;

1000 ParallelApplyMessagePending = true;

1001 SetLatch(MyLatch);

1002}

1003

1004/*

1005 * Process a single protocol message received from a single parallel apply

1006 * worker.

1007 */

1008static void

1009 ProcessParallelApplyMessage(StringInfo msg)

1010{

1011 char msgtype;

1012

1013 msgtype = pq_getmsgbyte(msg);

1014

1015 switch (msgtype)

1016 {

1017 case PqMsg_ErrorResponse:

1018 {

1019 ErrorData edata;

1020

1021 /* Parse ErrorResponse. */

1022 pq_parse_errornotice(msg, &edata);

1023

1024 /*

1025 * If desired, add a context line to show that this is a

1026 * message propagated from a parallel apply worker. Otherwise,

1027 * it can sometimes be confusing to understand what actually

1028 * happened.

1029 */

1030 if (edata.context)

1031 edata.context = psprintf("%s\n%s", edata.context,

1032 _("logical replication parallel apply worker"));

1033 else

1034 edata.context = pstrdup(_("logical replication parallel apply worker"));

1035

1036 /*

1037 * Context beyond that should use the error context callbacks

1038 * that were in effect in LogicalRepApplyLoop().

1039 */

1040 error_context_stack = apply_error_context_stack;

1041

1042 /*

1043 * The actual error must have been reported by the parallel

1044 * apply worker.

1045 */

1046 ereport(ERROR,

1047 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),

1048 errmsg("logical replication parallel apply worker exited due to error"),

1049 errcontext("%s", edata.context)));

1050 }

1051

1052 /*

1053 * Don't need to do anything about NoticeResponse and

1054 * NotificationResponse as the logical replication worker doesn't

1055 * need to send messages to the client.

1056 */

1057 case PqMsg_NoticeResponse:

1058 case PqMsg_NotificationResponse:

1059 break;

1060

1061 default:

1062 elog(ERROR, "unrecognized message type received from logical replication parallel apply worker: %c (message length %d bytes)",

1063 msgtype, msg->len);

1064 }

1065}
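
The context handling in ProcessParallelApplyMessage() above can be sketched outside the server. This is a minimal illustration only: `append_worker_context` and its fixed-size static buffer are hypothetical stand-ins for the psprintf()/pstrdup() calls in the listing, and no PostgreSQL headers are involved.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define WORKER_CONTEXT "logical replication parallel apply worker"

/*
 * Hypothetical helper: return the context text with a trailing line naming
 * the parallel apply worker. If there is no existing context, the worker
 * line becomes the whole context. The result lives in a static buffer, so
 * it is overwritten by the next call.
 */
static const char *
append_worker_context(const char *context)
{
    static char buf[256];
    int         n;

    if (context != NULL)
        n = snprintf(buf, sizeof(buf), "%s\n%s", context, WORKER_CONTEXT);
    else
        n = snprintf(buf, sizeof(buf), "%s", WORKER_CONTEXT);

    /* This sketch assumes the combined text fits in the buffer. */
    assert(n >= 0 && (size_t) n < sizeof(buf));

    return buf;
}
```

The real code allocates the result in the current memory context instead of a static buffer, so it has no fixed length limit.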

1066

1067/*

1068 * Handle any queued protocol messages received from parallel apply workers.

1069 */

1070void

1071 ProcessParallelApplyMessages(void)

1072{

1073 ListCell *lc;

1074 MemoryContext oldcontext;

1075

1076 static MemoryContext hpam_context = NULL;

1077

1078 /*

1079 * This is invoked from ProcessInterrupts(), and since some of the

1080 * functions it calls contain CHECK_FOR_INTERRUPTS(), there is a potential

1081 * for recursive calls if more signals are received while this runs. It's

1082 * unclear that recursive entry would be safe, and it doesn't seem useful

1083 * even if it is safe, so let's block interrupts until done.

1084 */

1085 HOLD_INTERRUPTS();

1086

1087 /*

1088 * Moreover, CurrentMemoryContext might be pointing almost anywhere. We

1089 * don't want to risk leaking data into long-lived contexts, so let's do

1090 * our work here in a private context that we can reset on each use.

1091 */

1092 if (!hpam_context) /* first time through? */

1093 hpam_context = AllocSetContextCreate(TopMemoryContext,

1094 "ProcessParallelApplyMessages",

1095 ALLOCSET_DEFAULT_SIZES);

1096 else

1097 MemoryContextReset(hpam_context);

1098

1099 oldcontext = MemoryContextSwitchTo(hpam_context);

1100

1101 ParallelApplyMessagePending = false;

1102

1103 foreach(lc, ParallelApplyWorkerPool)

1104 {

1105 shm_mq_result res;

1106 Size nbytes;

1107 void *data;

1108 ParallelApplyWorkerInfo *winfo = (ParallelApplyWorkerInfo *) lfirst(lc);

1109

1110 /*

1111 * The leader will detach from the error queue and set it to NULL

1112 * before preparing to stop all parallel apply workers, so we don't

1113 * need to handle error messages anymore. See

1114 * logicalrep_worker_detach.

1115 */

1116 if (!winfo->error_mq_handle)

1117 continue;

1118

1119 res = shm_mq_receive(winfo->error_mq_handle, &nbytes, &data, true);

1120

1121 if (res == SHM_MQ_WOULD_BLOCK)

1122 continue;

1123 else if (res == SHM_MQ_SUCCESS)

1124 {

1125 StringInfoData msg;

1126

1127 initStringInfo(&msg);

1128 appendBinaryStringInfo(&msg, data, nbytes);

1129 ProcessParallelApplyMessage(&msg);

1130 pfree(msg.data);

1131 }

1132 else

1133 ereport(ERROR,

1134 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),

1135 errmsg("lost connection to the logical replication parallel apply worker")));

1136 }

1137

1138 MemoryContextSwitchTo(oldcontext);

1139

1140 /* Might as well clear the context on our way out */

1141 MemoryContextReset(hpam_context);

1142

1143 RESUME_INTERRUPTS();

1144}

1145

1146/*

1147 * Send the data to the specified parallel apply worker via shared-memory

1148 * queue.

1149 *

1150 * Returns false if the attempt to send data via shared memory times out, true

1151 * otherwise.

1152 */

1153bool

1154 pa_send_data(ParallelApplyWorkerInfo *winfo, Size nbytes, const void *data)

1155{

1156 int rc;

1157 shm_mq_result result;

1158 TimestampTz startTime = 0;

1159

1160 Assert(!IsTransactionState());

1161 Assert(!winfo->serialize_changes);

1162

1163 /*

1164 * We don't try to send data to parallel worker for 'immediate' mode. This

1165 * is primarily used for testing purposes.

1166 */

1167 if (unlikely(debug_logical_replication_streaming == DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE))

1168 return false;

1169

1170/*

1171 * This timeout is a bit arbitrary but testing revealed that it is sufficient

1172 * to send the message unless the parallel apply worker is waiting on some

1173 * lock or there is a serious resource crunch. See the comments atop this file

1174 * to know why we are using a non-blocking way to send the message.

1175 */

1176#define SHM_SEND_RETRY_INTERVAL_MS 1000

1177#define SHM_SEND_TIMEOUT_MS (10000 - SHM_SEND_RETRY_INTERVAL_MS)

1178

1179 for (;;)

1180 {

1181 result = shm_mq_send(winfo->mq_handle, nbytes, data, true, true);

1182

1183 if (result == SHM_MQ_SUCCESS)

1184 return true;

1185 else if (result == SHM_MQ_DETACHED)

1186 ereport(ERROR,

1187 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),

1188 errmsg("could not send data to shared-memory queue")));

1189

1190 Assert(result == SHM_MQ_WOULD_BLOCK);

1191

1192 /* Wait before retrying. */

1193 rc = WaitLatch(MyLatch,

1194 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,

1195 SHM_SEND_RETRY_INTERVAL_MS,

1196 WAIT_EVENT_LOGICAL_APPLY_SEND_DATA);

1197

1198 if (rc & WL_LATCH_SET)

1199 {

1200 ResetLatch(MyLatch);

1201 CHECK_FOR_INTERRUPTS();

1202 }

1203

1204 if (startTime == 0)

1205 startTime = GetCurrentTimestamp();

1206 else if (TimestampDifferenceExceeds(startTime, GetCurrentTimestamp(),

1207 SHM_SEND_TIMEOUT_MS))

1208 return false;

1209 }

1210}
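
The retry loop in pa_send_data() above starts its timer only after the first wait, which appears to be why SHM_SEND_TIMEOUT_MS is defined as 10000 minus one retry interval. The sketch below replays that arithmetic with a simulated clock. Everything in it (`fake_queue`, `fake_try_send`, `send_with_timeout`, the `simulate_*` helpers) is a hypothetical stand-in, not the shm_mq API, and the detached-queue case is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RETRY_INTERVAL_MS 1000
#define SEND_TIMEOUT_MS   (10000 - RETRY_INTERVAL_MS)

typedef enum { SEND_SUCCESS, SEND_WOULD_BLOCK } send_result;

/* Hypothetical queue: refuses the first blocked_attempts sends. */
typedef struct
{
    int     blocked_attempts;   /* sends that will still report WOULD_BLOCK */
    int64_t now_ms;             /* simulated clock, advanced only by waits */
} fake_queue;

static send_result
fake_try_send(fake_queue *q)
{
    if (q->blocked_attempts > 0)
    {
        q->blocked_attempts--;
        return SEND_WOULD_BLOCK;
    }
    return SEND_SUCCESS;
}

/* Stand-in for waiting on the latch: one full retry interval elapses. */
static void
fake_wait(fake_queue *q)
{
    q->now_ms += RETRY_INTERVAL_MS;
}

/* Returns true once the send succeeds, false if the timeout is reached. */
static bool
send_with_timeout(fake_queue *q)
{
    int64_t start_ms = -1;      /* -1 means the timer has not started */

    for (;;)
    {
        if (fake_try_send(q) == SEND_SUCCESS)
            return true;

        fake_wait(q);

        /* The timer starts after the first wait, as in the listing. */
        if (start_ms < 0)
            start_ms = q->now_ms;
        else if (q->now_ms - start_ms >= SEND_TIMEOUT_MS)
            return false;
    }
}

static bool
simulate_send(int blocked_attempts)
{
    fake_queue q = { blocked_attempts, 0 };

    assert(blocked_attempts >= 0);
    return send_with_timeout(&q);
}

static int64_t
simulate_elapsed_ms(int blocked_attempts)
{
    fake_queue q = { blocked_attempts, 0 };

    (void) send_with_timeout(&q);
    return q.now_ms;
}
```

Under these assumptions a queue that stays full makes the sender give up after ten waits, about ten seconds of simulated time. A queue that frees up within nine waits still gets the data.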

1211

1212/*

1213 * Switch to PARTIAL_SERIALIZE mode for the current transaction -- this means

1214 * that the current data and any subsequent data for this transaction will be

1215 * serialized to a file. This is done to prevent possible deadlocks with

1216 * another parallel apply worker (refer to the comments atop this file).

1217 */

1218void

1219 pa_switch_to_partial_serialize(ParallelApplyWorkerInfo *winfo,

1220 bool stream_locked)

1221{

1222 ereport(LOG,

1223 (errmsg("logical replication apply worker will serialize the remaining changes of remote transaction %u to a file",

1224 winfo->shared->xid)));

1225

1226 /*

1227 * The parallel apply worker could be stuck for some reason (say, waiting

1228 * on a lock held by another backend), so stop trying to send data directly to

1229 * it and start serializing data to the file instead.

1230 */

1231 winfo->serialize_changes = true;

1232

1233 /* Initialize the stream fileset. */

1234 stream_start_internal(winfo->shared->xid, true);

1235

1236 /*

1237 * Acquire the stream lock, if not already held, to make sure that the parallel

1238 * apply worker will wait for the leader to release the stream lock until

1239 * the end of the transaction.

1240 */

1241 if (!stream_locked)

1242 pa_lock_stream(winfo->shared->xid, AccessExclusiveLock);

1243

1244 pa_set_fileset_state(winfo->shared, FS_SERIALIZE_IN_PROGRESS);

1245}

1246

1247/*

1248 * Wait until the parallel apply worker's transaction state has reached or

1249 * exceeded the given xact_state.

1250 */

1251static void

1252 pa_wait_for_xact_state(ParallelApplyWorkerInfo *winfo,

1253 ParallelTransState xact_state)

1254{

1255 for (;;)

1256 {

1257 /*

1258 * Stop if the transaction state has reached or exceeded the given

1259 * xact_state.

1260 */

1261 if (pa_get_xact_state(winfo->shared) >= xact_state)

1262 break;

1263

1264 /* Wait to be signalled. */

1265 (void) WaitLatch(MyLatch,

1266 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,

1267 10L,

1268 WAIT_EVENT_LOGICAL_PARALLEL_APPLY_STATE_CHANGE);

1269

1270 /* Reset the latch so we don't spin. */

1271 ResetLatch(MyLatch);

1272

1273 /* An interrupt may have occurred while we were waiting. */

1274 CHECK_FOR_INTERRUPTS();

1275 }

1276}

1277

1278/*

1279 * Wait until the parallel apply worker's transaction finishes.

1280 */

1281static void

1282 pa_wait_for_xact_finish(ParallelApplyWorkerInfo *winfo)

1283{

1284 /*

1285 * Wait until the parallel apply worker sets the state to

1286 * PARALLEL_TRANS_STARTED which means it has acquired the transaction

1287 * lock. This is to prevent the leader apply worker from acquiring the

1288 * transaction lock earlier than the parallel apply worker.

1289 */

1290 pa_wait_for_xact_state(winfo, PARALLEL_TRANS_STARTED);

1291

1292 /*

1293 * Wait for the transaction lock to be released. This is required to

1294 * detect deadlock among leader and parallel apply workers. Refer to the

1295 * comments atop this file.

1296 */

1297 pa_lock_transaction(winfo->shared->xid, AccessShareLock);

1298 pa_unlock_transaction(winfo->shared->xid, AccessShareLock);

1299

1300 /*

1301 * Check if the state becomes PARALLEL_TRANS_FINISHED in case the parallel

1302 * apply worker failed while applying changes causing the lock to be

1303 * released.

1304 */

1305 if (pa_get_xact_state(winfo->shared) != PARALLEL_TRANS_FINISHED)

1306 ereport(ERROR,

1307 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),

1308 errmsg("lost connection to the logical replication parallel apply worker")));

1309}

1310

1311/*

1312 * Set the transaction state for a given parallel apply worker.

1313 */

1314void

1315 pa_set_xact_state(ParallelApplyWorkerShared *wshared,

1316 ParallelTransState xact_state)

1317{

1318 SpinLockAcquire(&wshared->mutex);

1319 wshared->xact_state = xact_state;

1320 SpinLockRelease(&wshared->mutex);

1321}

1322

1323/*

1324 * Get the transaction state for a given parallel apply worker.

1325 */

1326static ParallelTransState

1327 pa_get_xact_state(ParallelApplyWorkerShared *wshared)

1328{

1329 ParallelTransState xact_state;

1330

1331 SpinLockAcquire(&wshared->mutex);

1332 xact_state = wshared->xact_state;

1333 SpinLockRelease(&wshared->mutex);

1334

1335 return xact_state;

1336}

1337

1338/*

1339 * Cache the parallel apply worker information.

1340 */

1341void

1342 pa_set_stream_apply_worker(ParallelApplyWorkerInfo *winfo)

1343{

1344 stream_apply_worker = winfo;

1345}

1346

1347/*

1348 * Form a unique savepoint name for the streaming transaction.

1349 *

1350 * Note that different subscriptions for publications on different nodes can

1351 * receive the same remote xid, so we need to use the subscription id along with it.

1352 *

1353 * Returns the name in the supplied buffer.

1354 */

1355static void

1356 pa_savepoint_name(Oid suboid, TransactionId xid, char *spname, Size szsp)

1357{

1358 snprintf(spname, szsp, "pg_sp_%u_%u", suboid, xid);

1359}
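
The naming scheme in pa_savepoint_name() above can be tried in isolation. This is a small sketch: plain `unsigned int` replaces Oid and TransactionId, `NAME_BUF_LEN` is an assumed buffer size rather than the real NAMEDATALEN, and `make_savepoint_name` and `savepoint_name_for` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_BUF_LEN 64         /* assumed size for this sketch */

/*
 * Build "pg_sp_<subscription id>_<remote xid>" in the caller's buffer.
 * Including the subscription id keeps names distinct when two subscriptions
 * happen to receive the same remote xid.
 */
static void
make_savepoint_name(unsigned int suboid, unsigned int xid,
                    char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    snprintf(buf, buflen, "pg_sp_%u_%u", suboid, xid);
}

/* Convenience wrapper returning a static buffer, for demonstration only. */
static const char *
savepoint_name_for(unsigned int suboid, unsigned int xid)
{
    static char name[NAME_BUF_LEN];

    make_savepoint_name(suboid, xid, name, sizeof(name));
    return name;
}
```

For example, `savepoint_name_for(16394, 742)` yields `pg_sp_16394_742`. The longest possible name, with two ten-digit values, is 27 characters, so it fits the assumed buffer.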

1360

1361/*

1362 * Define a savepoint for a subxact in parallel apply worker if needed.

1363 *

1364 * The parallel apply worker can figure out if a new subtransaction was

1365 * started by checking if the new change arrived with a different xid. In that

1366 * case define a named savepoint, so that we are able to rollback to it

1367 * if required.

1368 */

1369void

1370 pa_start_subtrans(TransactionId current_xid, TransactionId top_xid)

1371{

1372 if (current_xid != top_xid &&

1373 !list_member_xid(subxactlist, current_xid))

1374 {

1375 MemoryContext oldctx;

1376 char spname[NAMEDATALEN];

1377

1378 pa_savepoint_name(MySubscription->oid, current_xid,

1379 spname, sizeof(spname));

1380

1381 elog(DEBUG1, "defining savepoint %s in logical replication parallel apply worker", spname);

1382

1383 /* We must be in transaction block to define the SAVEPOINT. */

1384 if (!IsTransactionBlock())

1385 {

1386 if (!IsTransactionState())

1387 StartTransactionCommand();

1388

1389 BeginTransactionBlock();

1390 CommitTransactionCommand();

1391 }

1392

1393 DefineSavepoint(spname);

1394

1395 /*

1396 * CommitTransactionCommand is needed to start a subtransaction after

1397 * issuing a SAVEPOINT inside a transaction block (see

1398 * StartSubTransaction()).

1399 */

1400 CommitTransactionCommand();

1401

1402 oldctx = MemoryContextSwitchTo(TopTransactionContext);

1403 subxactlist = lappend_xid(subxactlist, current_xid);

1404 MemoryContextSwitchTo(oldctx);

1405 }

1406}

1407

1408/* Reset the list that maintains subtransactions. */

1409void

1410 pa_reset_subtrans(void)

1411{

1412 /*

1413 * We don't need to free this explicitly as the allocated memory will be

1414 * freed at the transaction end.

1415 */

1416 subxactlist = NIL;

1417}

1418

1419/*

1420 * Handle STREAM ABORT message when the transaction was applied in a parallel

1421 * apply worker.

1422 */

1423void

1424 pa_stream_abort(LogicalRepStreamAbortData *abort_data)

1425{

1426 TransactionId xid = abort_data->xid;

1427 TransactionId subxid = abort_data->subxid;

1428

1429 /*

1430 * Update origin state so we can restart streaming from correct position

1431 * in case of crash.

1432 */

1433 replorigin_session_origin_lsn = abort_data->abort_lsn;

1434 replorigin_session_origin_timestamp = abort_data->abort_time;

1435

1436 /*

1437 * If the two XIDs are the same, it is in fact an abort of the toplevel xact, so

1438 * just free the subxactlist.

1439 */

1440 if (subxid == xid)

1441 {

1442 pa_set_xact_state(MyParallelShared, PARALLEL_TRANS_FINISHED);

1443

1444 /*

1445 * Release the lock as we might be processing an empty streaming

1446 * transaction in which case the lock won't be released during

1447 * transaction rollback.

1448 *

1449 * Note that it's ok to release the transaction lock before aborting

1450 * the transaction because even if the parallel apply worker dies due

1451 * to crash or some other reason, such a transaction would still be

1452 * considered aborted.

1453 */

1454 pa_unlock_transaction(xid, AccessExclusiveLock);

1455

1456 AbortCurrentTransaction();

1457

1458 if (IsTransactionBlock())

1459 {

1460 EndTransactionBlock(false);

1461 CommitTransactionCommand();

1462 }

1463

1464 pa_reset_subtrans();

1465

1466 pgstat_report_activity(STATE_IDLE, NULL);

1467 }

1468 else

1469 {

1470 /* OK, so it's a subxact. Rollback to the savepoint. */

1471 int i;

1472 char spname[NAMEDATALEN];

1473

1474 pa_savepoint_name(MySubscription->oid, subxid, spname, sizeof(spname));

1475

1476 elog(DEBUG1, "rolling back to savepoint %s in logical replication parallel apply worker", spname);

1477

1478 /*

1479 * Search the subxactlist, determine the offset tracked for the

1480 * subxact, and truncate the list.

1481 *

1482 * Note that for an empty sub-transaction we won't find the subxid

1483 * here.

1484 */

1485 for (i = list_length(subxactlist) - 1; i >= 0; i--)

1486 {

1487 TransactionId xid_tmp = lfirst_xid(list_nth_cell(subxactlist, i));

1488

1489 if (xid_tmp == subxid)

1490 {

1491 RollbackToSavepoint(spname);

1492 CommitTransactionCommand();

1493 subxactlist = list_truncate(subxactlist, i);

1494 break;

1495 }

1496 }

1497 }

1498}
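
The subtransaction branch of pa_stream_abort() above searches the list from the newest entry backwards and truncates it at the match. The sketch below mimics that with a fixed-size array. `subxact_stack`, `subxact_push`, `subxact_rollback_to` and `remaining_after_rollback` are hypothetical names, and the actual savepoint rollback is left out.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SUBXACTS 16

/* Hypothetical stand-in for subxactlist: xids in the order they started. */
typedef struct
{
    uint32_t xids[MAX_SUBXACTS];
    int      count;
} subxact_stack;

static void
subxact_push(subxact_stack *s, uint32_t xid)
{
    assert(s->count < MAX_SUBXACTS);
    s->xids[s->count++] = xid;
}

/*
 * Search from the newest entry backwards. On a match, drop that entry and
 * everything started after it, which is what truncating the list to length
 * i achieves. Returns 1 if found, 0 otherwise (an empty subtransaction was
 * never recorded, so it is not found).
 */
static int
subxact_rollback_to(subxact_stack *s, uint32_t subxid)
{
    for (int i = s->count - 1; i >= 0; i--)
    {
        if (s->xids[i] == subxid)
        {
            s->count = i;
            return 1;
        }
    }
    return 0;
}

/* Demo: start subxacts 101, 102, 103, roll back one, report what is left. */
static int
remaining_after_rollback(uint32_t subxid)
{
    subxact_stack s = { {0}, 0 };

    subxact_push(&s, 101);
    subxact_push(&s, 102);
    subxact_push(&s, 103);
    (void) subxact_rollback_to(&s, subxid);
    return s.count;
}
```

Rolling back to 102 leaves only 101 in the list. Rolling back to an xid that was never recorded leaves all three entries untouched.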

1499

1500/*

1501 * Set the fileset state for a particular parallel apply worker. The fileset

1502 * will be set once the leader worker has serialized all changes to the file

1503 * so that it can be used by the parallel apply worker.

1504 */

1505void

1506 pa_set_fileset_state(ParallelApplyWorkerShared *wshared,

1507 PartialFileSetState fileset_state)

1508{

1509 SpinLockAcquire(&wshared->mutex);

1510 wshared->fileset_state = fileset_state;

1511

1512 if (fileset_state == FS_SERIALIZE_DONE)

1513 {

1514 Assert(am_leader_apply_worker());

1515 Assert(MyLogicalRepWorker->stream_fileset);

1516 wshared->fileset = *MyLogicalRepWorker->stream_fileset;

1517 }

1518

1519 SpinLockRelease(&wshared->mutex);

1520}

1521

1522/*

1523 * Get the fileset state for the current parallel apply worker.

1524 */

1525static PartialFileSetState

1526 pa_get_fileset_state(void)

1527{

1528 PartialFileSetState fileset_state;

1529

1530 Assert(am_parallel_apply_worker());

1531

1532 SpinLockAcquire(&MyParallelShared->mutex);

1533 fileset_state = MyParallelShared->fileset_state;

1534 SpinLockRelease(&MyParallelShared->mutex);

1535

1536 return fileset_state;

1537}

1538

1539/*

1540 * Helper functions to acquire and release a lock for each stream block.

1541 *

1542 * Set locktag_field4 to PARALLEL_APPLY_LOCK_STREAM to indicate that it's a

1543 * stream lock.

1544 *

1545 * Refer to the comments atop this file to see how the stream lock is used.

1546 */

1547void

1548 pa_lock_stream(TransactionId xid, LOCKMODE lockmode)

1549{

1550 LockApplyTransactionForSession(MyLogicalRepWorker->subid, xid,

1551 PARALLEL_APPLY_LOCK_STREAM, lockmode);

1552}

1553

1554void

1555 pa_unlock_stream(TransactionId xid, LOCKMODE lockmode)

1556{

1557 UnlockApplyTransactionForSession(MyLogicalRepWorker->subid, xid,

1558 PARALLEL_APPLY_LOCK_STREAM, lockmode);

1559}

1560

1561/*

1562 * Helper functions to acquire and release a lock for each local transaction

1563 * apply.

1564 *

1565 * Set locktag_field4 to PARALLEL_APPLY_LOCK_XACT to indicate that it's a

1566 * transaction lock.

1567 *

1568 * Note that all the callers must pass a remote transaction ID instead of a

1569 * local transaction ID as xid. This is because the local transaction ID will

1570 * only be assigned while applying the first change in the parallel apply but

1571 * it's possible that the first change in the parallel apply worker is blocked

1572 * by a concurrently executing transaction in another parallel apply worker. We

1573 * can only communicate the local transaction id to the leader after applying

1574 * the first change so it won't be able to wait after sending the xact finish

1575 * command using this lock.

1576 *

1577 * Refer to the comments atop this file to see how the transaction lock is

1578 * used.

1579 */

1580void

1581 pa_lock_transaction(TransactionId xid, LOCKMODE lockmode)

1582{

1583 LockApplyTransactionForSession(MyLogicalRepWorker->subid, xid,

1584 PARALLEL_APPLY_LOCK_XACT, lockmode);

1585}

1586

1587void

1588 pa_unlock_transaction(TransactionId xid, LOCKMODE lockmode)

1589{

1590 UnlockApplyTransactionForSession(MyLogicalRepWorker->subid, xid,

1591 PARALLEL_APPLY_LOCK_XACT, lockmode);

1592}

1593

1594/*

1595 * Decrement the number of pending streaming blocks and wait on the stream lock

1596 * if there is no pending block available.

1597 */

1598void

1599 pa_decr_and_wait_stream_block(void)

1600{

1601 Assert(am_parallel_apply_worker());

1602

1603 /*

1604 * It is only possible to not have any pending stream chunks when we are

1605 * applying spooled messages.

1606 */

1607 if (pg_atomic_read_u32(&MyParallelShared->pending_stream_count) == 0)

1608 {

1609 if (pa_has_spooled_message_pending())

1610 return;

1611

1612 elog(ERROR, "invalid pending streaming chunk 0");

1613 }

1614

1615 if (pg_atomic_sub_fetch_u32(&MyParallelShared->pending_stream_count, 1) == 0)

1616 {

1617 pa_lock_stream(MyParallelShared->xid, AccessShareLock);

1618 pa_unlock_stream(MyParallelShared->xid, AccessShareLock);

1619 }

1620}
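
The counter logic in pa_decr_and_wait_stream_block() above depends on the atomic subtraction returning the new value, so only the caller that reaches zero goes on to wait on the stream lock. The sketch below shows the same check with C11 atomics in place of the pg_atomic_* wrappers. `decr_and_should_wait` and `count_waits` are hypothetical names, and the spooled-message and error paths are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Decrement the pending-chunk counter and report whether the caller has
 * just consumed the last pending chunk. atomic_fetch_sub() returns the
 * old value, so the new value is old - 1.
 */
static bool
decr_and_should_wait(atomic_uint *pending)
{
    unsigned int old = atomic_fetch_sub(pending, 1);

    assert(old > 0);            /* the listing raises an error in this case */
    return old - 1 == 0;
}

/* Demo: consume `initial` chunks and count how often a wait is needed. */
static int
count_waits(unsigned int initial)
{
    atomic_uint pending;
    int         waits = 0;

    atomic_init(&pending, initial);
    for (unsigned int i = 0; i < initial; i++)
    {
        if (decr_and_should_wait(&pending))
            waits++;
    }
    return waits;
}
```

However many chunks are queued, only the final decrement reports that a wait is needed. In the listing, that is the point where the worker blocks on the stream lock until the leader releases it.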

1621

1622/*

1623 * Finish processing the streaming transaction in the leader apply worker.

1624 */

1625void

1626 pa_xact_finish(ParallelApplyWorkerInfo *winfo, XLogRecPtr remote_lsn)

1627{

1628 Assert(am_leader_apply_worker());

1629

1630 /*

1631 * Unlock the shared object lock so that parallel apply worker can

1632 * continue to receive and apply changes.

1633 */

1634 pa_unlock_stream(winfo->shared->xid, AccessExclusiveLock);

1635

1636 /*

1637 * Wait for that worker to finish. This is necessary to maintain commit

1638 * order which avoids failures due to transaction dependencies and

1639 * deadlocks.

1640 */

1641 pa_wait_for_xact_finish(winfo);

1642

1643 if (!XLogRecPtrIsInvalid(remote_lsn))

1644 store_flush_position(remote_lsn, winfo->shared->last_commit_end);

1645

1646 pa_free_worker(winfo);

1647}

ParallelApplyWorkerEntry

struct ParallelApplyWorkerEntry ParallelApplyWorkerEntry

stream_apply_worker

static ParallelApplyWorkerInfo * stream_apply_worker

Definition: applyparallelworker.c:252

ParallelApplyWorkerPool

static List * ParallelApplyWorkerPool

Definition: applyparallelworker.c:234

pa_set_xact_state

void pa_set_xact_state(ParallelApplyWorkerShared *wshared, ParallelTransState xact_state)

Definition: applyparallelworker.c:1315

pa_unlock_stream

void pa_unlock_stream(TransactionId xid, LOCKMODE lockmode)

Definition: applyparallelworker.c:1555

pa_setup_dsm

static bool pa_setup_dsm(ParallelApplyWorkerInfo *winfo)

Definition: applyparallelworker.c:327

DSM_ERROR_QUEUE_SIZE

#define DSM_ERROR_QUEUE_SIZE

Definition: applyparallelworker.c:195

ParallelApplyMessagePending

volatile sig_atomic_t ParallelApplyMessagePending

Definition: applyparallelworker.c:245

pa_can_start

static bool pa_can_start(void)

Definition: applyparallelworker.c:265

HandleParallelApplyMessageInterrupt

void HandleParallelApplyMessageInterrupt(void)

Definition: applyparallelworker.c:997

ProcessParallelApplyMessages

void ProcessParallelApplyMessages(void)

Definition: applyparallelworker.c:1071

SHM_SEND_TIMEOUT_MS

#define SHM_SEND_TIMEOUT_MS

DSM_QUEUE_SIZE

#define DSM_QUEUE_SIZE

Definition: applyparallelworker.c:187

pa_savepoint_name

static void pa_savepoint_name(Oid suboid, TransactionId xid, char *spname, Size szsp)

Definition: applyparallelworker.c:1356

pa_stream_abort

void pa_stream_abort(LogicalRepStreamAbortData *abort_data)

Definition: applyparallelworker.c:1424

ProcessParallelApplyInterrupts

static void ProcessParallelApplyInterrupts(void)

Definition: applyparallelworker.c:713

ProcessParallelApplyMessage

static void ProcessParallelApplyMessage(StringInfo msg)

Definition: applyparallelworker.c:1009

pa_get_fileset_state

static PartialFileSetState pa_get_fileset_state(void)

Definition: applyparallelworker.c:1526

pa_free_worker_info

static void pa_free_worker_info(ParallelApplyWorkerInfo *winfo)

Definition: applyparallelworker.c:596

PARALLEL_APPLY_LOCK_XACT

#define PARALLEL_APPLY_LOCK_XACT

Definition: applyparallelworker.c:210

pa_lock_stream

void pa_lock_stream(TransactionId xid, LOCKMODE lockmode)

Definition: applyparallelworker.c:1548

subxactlist

static List * subxactlist

Definition: applyparallelworker.c:255

pa_has_spooled_message_pending

static bool pa_has_spooled_message_pending()

Definition: applyparallelworker.c:643

pa_shutdown

static void pa_shutdown(int code, Datum arg)

Definition: applyparallelworker.c:845

pa_set_fileset_state

void pa_set_fileset_state(ParallelApplyWorkerShared *wshared, PartialFileSetState fileset_state)

Definition: applyparallelworker.c:1506

pa_reset_subtrans

void pa_reset_subtrans(void)

Definition: applyparallelworker.c:1410

pa_get_xact_state

static ParallelTransState pa_get_xact_state(ParallelApplyWorkerShared *wshared)

Definition: applyparallelworker.c:1327

PARALLEL_APPLY_KEY_SHARED

#define PARALLEL_APPLY_KEY_SHARED

Definition: applyparallelworker.c:182

pa_lock_transaction

void pa_lock_transaction(TransactionId xid, LOCKMODE lockmode)

Definition: applyparallelworker.c:1581

MyParallelShared

ParallelApplyWorkerShared * MyParallelShared

Definition: applyparallelworker.c:239

pa_detach_all_error_mq

void pa_detach_all_error_mq(void)

Definition: applyparallelworker.c:623

LogicalParallelApplyLoop

static void LogicalParallelApplyLoop(shm_mq_handle *mqh)

Definition: applyparallelworker.c:735

pa_wait_for_xact_state

static void pa_wait_for_xact_state(ParallelApplyWorkerInfo *winfo, ParallelTransState xact_state)

Definition: applyparallelworker.c:1252

pa_start_subtrans

void pa_start_subtrans(TransactionId current_xid, TransactionId top_xid)

Definition: applyparallelworker.c:1370

PARALLEL_APPLY_KEY_ERROR_QUEUE

#define PARALLEL_APPLY_KEY_ERROR_QUEUE

Definition: applyparallelworker.c:184

pa_switch_to_partial_serialize

void pa_switch_to_partial_serialize(ParallelApplyWorkerInfo *winfo, bool stream_locked)

Definition: applyparallelworker.c:1219

pa_free_worker

static void pa_free_worker(ParallelApplyWorkerInfo *winfo)

Definition: applyparallelworker.c:557

pa_xact_finish

void pa_xact_finish(ParallelApplyWorkerInfo *winfo, XLogRecPtr remote_lsn)

Definition: applyparallelworker.c:1626

PARALLEL_APPLY_KEY_MQ

#define PARALLEL_APPLY_KEY_MQ

Definition: applyparallelworker.c:183

pa_wait_for_xact_finish

static void pa_wait_for_xact_finish(ParallelApplyWorkerInfo *winfo)

Definition: applyparallelworker.c:1282

SIZE_STATS_MESSAGE

#define SIZE_STATS_MESSAGE

Definition: applyparallelworker.c:203

SHM_SEND_RETRY_INTERVAL_MS

#define SHM_SEND_RETRY_INTERVAL_MS

pa_send_data

bool pa_send_data(ParallelApplyWorkerInfo *winfo, Size nbytes, const void *data)

Definition: applyparallelworker.c:1154

pa_allocate_worker

void pa_allocate_worker(TransactionId xid)

Definition: applyparallelworker.c:471

pa_process_spooled_messages_if_required

static bool pa_process_spooled_messages_if_required(void)

Definition: applyparallelworker.c:659

pa_set_stream_apply_worker

void pa_set_stream_apply_worker(ParallelApplyWorkerInfo *winfo)

Definition: applyparallelworker.c:1342

ParallelApplyTxnHash

static HTAB * ParallelApplyTxnHash

Definition: applyparallelworker.c:225

PARALLEL_APPLY_LOCK_STREAM

#define PARALLEL_APPLY_LOCK_STREAM

Definition: applyparallelworker.c:209

pa_find_worker

ParallelApplyWorkerInfo * pa_find_worker(TransactionId xid)

Definition: applyparallelworker.c:519

pa_unlock_transaction

void pa_unlock_transaction(TransactionId xid, LOCKMODE lockmode)

Definition: applyparallelworker.c:1588

pa_launch_parallel_worker

static ParallelApplyWorkerInfo * pa_launch_parallel_worker(void)

Definition: applyparallelworker.c:404

ParallelApplyWorkerMain

void ParallelApplyWorkerMain(Datum main_arg)

Definition: applyparallelworker.c:858

PG_LOGICAL_APPLY_SHM_MAGIC

#define PG_LOGICAL_APPLY_SHM_MAGIC

Definition: applyparallelworker.c:175

pa_decr_and_wait_stream_block

void pa_decr_and_wait_stream_block(void)

Definition: applyparallelworker.c:1599

pg_atomic_sub_fetch_u32

static uint32 pg_atomic_sub_fetch_u32(volatile pg_atomic_uint32 *ptr, int32 sub_)

Definition: atomics.h:437

pg_atomic_init_u32

static void pg_atomic_init_u32(volatile pg_atomic_uint32 *ptr, uint32 val)

Definition: atomics.h:219

pg_atomic_read_u32

static uint32 pg_atomic_read_u32(volatile pg_atomic_uint32 *ptr)

Definition: atomics.h:237

stream_cleanup_files

void stream_cleanup_files(Oid subid, TransactionId xid)

Definition: worker.c:5350

ApplyMessageContext

MemoryContext ApplyMessageContext

Definition: worker.c:471

InitializingApplyWorker

bool InitializingApplyWorker

Definition: worker.c:499

apply_dispatch

void apply_dispatch(StringInfo s)

Definition: worker.c:3747

ReplicationOriginNameForLogicalRep

void ReplicationOriginNameForLogicalRep(Oid suboid, Oid relid, char *originname, Size szoriginname)

Definition: worker.c:641

apply_error_context_stack

ErrorContextCallback * apply_error_context_stack

Definition: worker.c:469

stream_start_internal

void stream_start_internal(TransactionId xid, bool first_segment)

Definition: worker.c:1666

set_apply_error_context_origin

void set_apply_error_context_origin(char *originname)

Definition: worker.c:6260

ApplyContext

MemoryContext ApplyContext

Definition: worker.c:472

apply_error_callback

void apply_error_callback(void *arg)

Definition: worker.c:6118

store_flush_position

void store_flush_position(XLogRecPtr remote_lsn, XLogRecPtr local_lsn)

Definition: worker.c:3911

maybe_reread_subscription

void maybe_reread_subscription(void)

Definition: worker.c:5007

InitializeLogRepWorker

void InitializeLogRepWorker(void)

Definition: worker.c:5705

apply_spooled_messages

void apply_spooled_messages(FileSet *stream_fileset, TransactionId xid, XLogRecPtr lsn)

Definition: worker.c:2238

MySubscription

Subscription * MySubscription

Definition: worker.c:479

TimestampDifferenceExceeds

bool TimestampDifferenceExceeds(TimestampTz start_time, TimestampTz stop_time, int msec)

Definition: timestamp.c:1781

GetCurrentTimestamp

TimestampTz GetCurrentTimestamp(void)

Definition: timestamp.c:1645

pgstat_report_activity

void pgstat_report_activity(BackendState state, const char *cmd_str)

Definition: backend_status.c:572

STATE_IDLE

@ STATE_IDLE

Definition: backend_status.h:28

BackgroundWorkerUnblockSignals

void BackgroundWorkerUnblockSignals(void)

Definition: bgworker.c:927

unlikely

#define unlikely(x)

Definition: c.h:402

MemSet

#define MemSet(start, val, len)

Definition: c.h:1019

TransactionId

uint32 TransactionId

Definition: c.h:657

Size

size_t Size

Definition: c.h:610

TimestampTz

int64 TimestampTz

Definition: timestamp.h:39

dsm_segment_handle

dsm_handle dsm_segment_handle(dsm_segment *seg)

Definition: dsm.c:1123

dsm_detach

void dsm_detach(dsm_segment *seg)

Definition: dsm.c:803

dsm_segment_address

void * dsm_segment_address(dsm_segment *seg)

Definition: dsm.c:1095

dsm_create

dsm_segment * dsm_create(Size size, int flags)

Definition: dsm.c:516

dsm_attach

dsm_segment * dsm_attach(dsm_handle h)

Definition: dsm.c:665

dsm_handle

uint32 dsm_handle

Definition: dsm_impl.h:55

hash_search

void * hash_search(HTAB *hashp, const void *keyPtr, HASHACTION action, bool *foundPtr)

Definition: dynahash.c:952

hash_create

HTAB * hash_create(const char *tabname, int64 nelem, const HASHCTL *info, int flags)

Definition: dynahash.c:358

error_context_stack

ErrorContextCallback * error_context_stack

Definition: elog.c:95

errcode

int errcode(int sqlerrcode)

Definition: elog.c:854

errmsg

int errmsg(const char *fmt,...)

Definition: elog.c:1071

_

#define _(x)

Definition: elog.c:91

LOG

#define LOG

Definition: elog.h:31

errcontext

#define errcontext

Definition: elog.h:198

DEBUG1

#define DEBUG1

Definition: elog.h:30

ERROR

#define ERROR

Definition: elog.h:39

elog

#define elog(elevel,...)

Definition: elog.h:226

ereport

#define ereport(elevel,...)

Definition: elog.h:150

InterruptPending

volatile sig_atomic_t InterruptPending

Definition: globals.c:32

MyLatch

struct Latch * MyLatch

Definition: globals.c:63

ProcessConfigFile

void ProcessConfigFile(GucContext context)

Definition: guc-file.l:120

PGC_SIGHUP

@ PGC_SIGHUP

Definition: guc.h:75

Assert

#define Assert(condition)

HASH_FIND

@ HASH_FIND

Definition: hsearch.h:113

HASH_REMOVE

@ HASH_REMOVE

Definition: hsearch.h:115

HASH_ENTER

@ HASH_ENTER

Definition: hsearch.h:114

HASH_CONTEXT

#define HASH_CONTEXT

Definition: hsearch.h:102

HASH_ELEM

#define HASH_ELEM

Definition: hsearch.h:95

HASH_BLOBS

#define HASH_BLOBS

Definition: hsearch.h:97

SignalHandlerForShutdownRequest

void SignalHandlerForShutdownRequest(SIGNAL_ARGS)

Definition: interrupt.c:104

ShutdownRequestPending

volatile sig_atomic_t ShutdownRequestPending

Definition: interrupt.c:28

ConfigReloadPending

volatile sig_atomic_t ConfigReloadPending

Definition: interrupt.c:27

SignalHandlerForConfigReload

void SignalHandlerForConfigReload(SIGNAL_ARGS)

Definition: interrupt.c:61

interrupt.h

CacheRegisterSyscacheCallback

void CacheRegisterSyscacheCallback(int cacheid, SyscacheCallbackFunction func, Datum arg)

Definition: inval.c:1812

inval.h

before_shmem_exit

void before_shmem_exit(pg_on_exit_callback function, Datum arg)

Definition: ipc.c:337

proc_exit

void proc_exit(int code)

Definition: ipc.c:104

ipc.h

SetLatch

void SetLatch(Latch *latch)

Definition: latch.c:290

ResetLatch

void ResetLatch(Latch *latch)

Definition: latch.c:374

WaitLatch

int WaitLatch(Latch *latch, int wakeEvents, long timeout, uint32 wait_event_info)

Definition: latch.c:172

logicalrep_worker_launch

bool logicalrep_worker_launch(LogicalRepWorkerType wtype, Oid dbid, Oid subid, const char *subname, Oid userid, Oid relid, dsm_handle subworker_dsm, bool retain_dead_tuples)

Definition: launcher.c:317

logicalrep_worker_attach

void logicalrep_worker_attach(int slot)

Definition: launcher.c:731

logicalrep_pa_worker_stop

void logicalrep_pa_worker_stop(ParallelApplyWorkerInfo *winfo)

Definition: launcher.c:657

MyLogicalRepWorker

LogicalRepWorker * MyLogicalRepWorker

Definition: launcher.c:56

max_parallel_apply_workers_per_subscription

int max_parallel_apply_workers_per_subscription

Definition: launcher.c:54

list_delete_ptr

List * list_delete_ptr(List *list, void *datum)

Definition: list.c:872

lappend

List * lappend(List *list, void *datum)

Definition: list.c:339

lappend_xid

List * lappend_xid(List *list, TransactionId datum)

Definition: list.c:393

list_member_xid

bool list_member_xid(const List *list, TransactionId datum)

Definition: list.c:742

list_truncate

List * list_truncate(List *list, int new_size)

Definition: list.c:631

UnlockApplyTransactionForSession

void UnlockApplyTransactionForSession(Oid suboid, TransactionId xid, uint16 objid, LOCKMODE lockmode)

Definition: lmgr.c:1227

LockApplyTransactionForSession

void LockApplyTransactionForSession(Oid suboid, TransactionId xid, uint16 objid, LOCKMODE lockmode)

Definition: lmgr.c:1209

lmgr.h

LOCKMODE

int LOCKMODE

Definition: lockdefs.h:26

AccessExclusiveLock

#define AccessExclusiveLock

Definition: lockdefs.h:43

AccessShareLock

#define AccessShareLock

Definition: lockdefs.h:36

logicallauncher.h

logicalworker.h

MemoryContextReset

void MemoryContextReset(MemoryContext context)

Definition: mcxt.c:400

TopTransactionContext

MemoryContext TopTransactionContext

Definition: mcxt.c:171

pstrdup

char * pstrdup(const char *in)

Definition: mcxt.c:1759

pfree

void pfree(void *pointer)

Definition: mcxt.c:1594

palloc0

void * palloc0(Size size)

Definition: mcxt.c:1395

TopMemoryContext

MemoryContext TopMemoryContext

Definition: mcxt.c:166

CurrentMemoryContext

MemoryContext CurrentMemoryContext

Definition: mcxt.c:160

memutils.h

AllocSetContextCreate

#define AllocSetContextCreate

Definition: memutils.h:129

ALLOCSET_DEFAULT_SIZES

#define ALLOCSET_DEFAULT_SIZES

Definition: memutils.h:160

RESUME_INTERRUPTS

#define RESUME_INTERRUPTS()

Definition: miscadmin.h:135

CHECK_FOR_INTERRUPTS

#define CHECK_FOR_INTERRUPTS()

Definition: miscadmin.h:122

HOLD_INTERRUPTS

#define HOLD_INTERRUPTS()

Definition: miscadmin.h:133

replorigin_session_origin_timestamp

TimestampTz replorigin_session_origin_timestamp

Definition: origin.c:165

replorigin_by_name

RepOriginId replorigin_by_name(const char *roname, bool missing_ok)

Definition: origin.c:226

replorigin_session_origin

RepOriginId replorigin_session_origin

Definition: origin.c:163

replorigin_session_setup

void replorigin_session_setup(RepOriginId node, int acquired_by)

Definition: origin.c:1120

replorigin_session_origin_lsn

XLogRecPtr replorigin_session_origin_lsn

Definition: origin.c:164

origin.h

MemoryContextSwitchTo

static MemoryContext MemoryContextSwitchTo(MemoryContext context)

Definition: palloc.h:124

NAMEDATALEN

#define NAMEDATALEN

Definition: pg_config_manual.h:29

lfirst

#define lfirst(lc)

Definition: pg_list.h:172

list_length

static int list_length(const List *l)

Definition: pg_list.h:152

NIL

#define NIL

Definition: pg_list.h:68

list_nth_cell

static ListCell * list_nth_cell(const List *list, int n)

Definition: pg_list.h:277

lfirst_xid

#define lfirst_xid(lc)

Definition: pg_list.h:175

pgstat.h

pqsignal

#define pqsignal

Definition: port.h:531

snprintf

#define snprintf

Definition: port.h:239

postgres.h

PointerGetDatum

static Datum PointerGetDatum(const void *X)

Definition: postgres.h:332

Datum

uint64_t Datum

Definition: postgres.h:70

DatumGetPointer

static Pointer DatumGetPointer(Datum X)

Definition: postgres.h:322

DatumGetInt32

static int32 DatumGetInt32(Datum X)

Definition: postgres.h:212

InvalidOid

#define InvalidOid

Definition: postgres_ext.h:37

Oid

unsigned int Oid

Definition: postgres_ext.h:32

MyBgworkerEntry

BackgroundWorker * MyBgworkerEntry

Definition: postmaster.c:200

pq_getmsgbyte

int pq_getmsgbyte(StringInfo msg)

Definition: pqformat.c:399

pqformat.h

pq_set_parallel_leader

void pq_set_parallel_leader(pid_t pid, ProcNumber procNumber)

Definition: pqmq.c:82

pq_parse_errornotice

void pq_parse_errornotice(StringInfo msg, ErrorData *edata)

Definition: pqmq.c:222

pq_redirect_to_shm_mq

void pq_redirect_to_shm_mq(dsm_segment *seg, shm_mq_handle *mqh)

Definition: pqmq.c:53

pqmq.h

INVALID_PROC_NUMBER

#define INVALID_PROC_NUMBER

Definition: procnumber.h:26

SendProcSignal

int SendProcSignal(pid_t pid, ProcSignalReason reason, ProcNumber procNumber)

Definition: procsignal.c:284

PROCSIG_PARALLEL_APPLY_MESSAGE

@ PROCSIG_PARALLEL_APPLY_MESSAGE

Definition: procsignal.h:38

PqReplMsg_WALData

#define PqReplMsg_WALData

Definition: protocol.h:77

PqMsg_NotificationResponse

#define PqMsg_NotificationResponse

Definition: protocol.h:41

PqMsg_ErrorResponse

#define PqMsg_ErrorResponse

Definition: protocol.h:44

PqMsg_NoticeResponse

#define PqMsg_NoticeResponse

Definition: protocol.h:49

psprintf

char * psprintf(const char *fmt,...)

Definition: psprintf.c:43

debug_logical_replication_streaming

int debug_logical_replication_streaming

Definition: reorderbuffer.c:229

DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE

@ DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE

Definition: reorderbuffer.h:34

shm_mq_set_sender

void shm_mq_set_sender(shm_mq *mq, PGPROC *proc)

Definition: shm_mq.c:224

shm_mq_create

shm_mq * shm_mq_create(void *address, Size size)

Definition: shm_mq.c:177

shm_mq_detach

void shm_mq_detach(shm_mq_handle *mqh)

Definition: shm_mq.c:843

shm_mq_set_receiver

void shm_mq_set_receiver(shm_mq *mq, PGPROC *proc)

Definition: shm_mq.c:206

shm_mq_receive

shm_mq_result shm_mq_receive(shm_mq_handle *mqh, Size *nbytesp, void **datap, bool nowait)

Definition: shm_mq.c:572

shm_mq_send

shm_mq_result shm_mq_send(shm_mq_handle *mqh, Size nbytes, const void *data, bool nowait, bool force_flush)

Definition: shm_mq.c:329

shm_mq_attach

shm_mq_handle * shm_mq_attach(shm_mq *mq, dsm_segment *seg, BackgroundWorkerHandle *handle)

Definition: shm_mq.c:290

shm_mq_result

Definition: shm_mq.h:37

SHM_MQ_SUCCESS

@ SHM_MQ_SUCCESS

Definition: shm_mq.h:38

SHM_MQ_WOULD_BLOCK

@ SHM_MQ_WOULD_BLOCK

Definition: shm_mq.h:39

SHM_MQ_DETACHED

@ SHM_MQ_DETACHED

Definition: shm_mq.h:40

shm_toc_allocate

void * shm_toc_allocate(shm_toc *toc, Size nbytes)

Definition: shm_toc.c:88

shm_toc_estimate

Size shm_toc_estimate(shm_toc_estimator *e)

Definition: shm_toc.c:263

shm_toc_create

shm_toc * shm_toc_create(uint64 magic, void *address, Size nbytes)

Definition: shm_toc.c:40

shm_toc_insert

void shm_toc_insert(shm_toc *toc, uint64 key, void *address)

Definition: shm_toc.c:171

shm_toc_lookup

void * shm_toc_lookup(shm_toc *toc, uint64 key, bool noError)

Definition: shm_toc.c:232

shm_toc_attach

shm_toc * shm_toc_attach(uint64 magic, void *address)

Definition: shm_toc.c:64

shm_toc_estimate_chunk

#define shm_toc_estimate_chunk(e, sz)

Definition: shm_toc.h:51

shm_toc_initialize_estimator

#define shm_toc_initialize_estimator(e)

Definition: shm_toc.h:49

shm_toc_estimate_keys

#define shm_toc_estimate_keys(e, cnt)

Definition: shm_toc.h:53

SpinLockInit

#define SpinLockInit(lock)

Definition: spin.h:57

SpinLockRelease

#define SpinLockRelease(lock)

Definition: spin.h:61

SpinLockAcquire

#define SpinLockAcquire(lock)

Definition: spin.h:59

MyProc

PGPROC * MyProc

Definition: proc.c:66

appendBinaryStringInfo

void appendBinaryStringInfo(StringInfo str, const void *data, int datalen)

Definition: stringinfo.c:281

initStringInfo

void initStringInfo(StringInfo str)

Definition: stringinfo.c:97

initReadOnlyStringInfo

static void initReadOnlyStringInfo(StringInfo str, char *data, int len)

Definition: stringinfo.h:157

BackgroundWorker::bgw_extra

char bgw_extra[BGW_EXTRALEN]

Definition: bgworker.h:99

ErrorContextCallback

Definition: elog.h:296

ErrorContextCallback::previous

struct ErrorContextCallback * previous

Definition: elog.h:297

ErrorContextCallback::callback

void(* callback)(void *arg)

Definition: elog.h:298

ErrorData

Definition: elog.h:420

ErrorData::context

char * context

Definition: elog.h:436

HASHCTL

Definition: hsearch.h:66

HTAB

Definition: dynahash.c:222

List

Definition: pg_list.h:54

LogicalRepStreamAbortData

Definition: logicalproto.h:187

LogicalRepStreamAbortData::abort_lsn

XLogRecPtr abort_lsn

Definition: logicalproto.h:190

LogicalRepStreamAbortData::xid

TransactionId xid

Definition: logicalproto.h:188

LogicalRepStreamAbortData::subxid

TransactionId subxid

Definition: logicalproto.h:189

LogicalRepStreamAbortData::abort_time

TimestampTz abort_time

Definition: logicalproto.h:191

LogicalRepWorker::last_recv_time

TimestampTz last_recv_time

Definition: worker_internal.h:106

LogicalRepWorker::generation

uint16 generation

Definition: worker_internal.h:49

LogicalRepWorker::parallel_apply

bool parallel_apply

Definition: worker_internal.h:87

LogicalRepWorker::reply_time

TimestampTz reply_time

Definition: worker_internal.h:108

LogicalRepWorker::stream_fileset

FileSet * stream_fileset

Definition: worker_internal.h:78

LogicalRepWorker::subid

Oid subid

Definition: worker_internal.h:61

LogicalRepWorker::dbid

Oid dbid

Definition: worker_internal.h:55

LogicalRepWorker::leader_pid

pid_t leader_pid

Definition: worker_internal.h:84

LogicalRepWorker::last_send_time

TimestampTz last_send_time

Definition: worker_internal.h:105

LogicalRepWorker::userid

Oid userid

Definition: worker_internal.h:58

MemoryContextData

Definition: memnodes.h:118

ParallelApplyWorkerEntry

Definition: applyparallelworker.c:216

ParallelApplyWorkerEntry::winfo

ParallelApplyWorkerInfo * winfo

Definition: applyparallelworker.c:218

ParallelApplyWorkerEntry::xid

TransactionId xid

Definition: applyparallelworker.c:217

ParallelApplyWorkerInfo

Definition: worker_internal.h:203

ParallelApplyWorkerInfo::serialize_changes

bool serialize_changes

Definition: worker_internal.h:223

ParallelApplyWorkerInfo::in_use

bool in_use

Definition: worker_internal.h:229

ParallelApplyWorkerInfo::error_mq_handle

shm_mq_handle * error_mq_handle

Definition: worker_internal.h:214

ParallelApplyWorkerInfo::dsm_seg

dsm_segment * dsm_seg

Definition: worker_internal.h:216

ParallelApplyWorkerInfo::mq_handle

shm_mq_handle * mq_handle

Definition: worker_internal.h:208

ParallelApplyWorkerInfo::shared

ParallelApplyWorkerShared * shared

Definition: worker_internal.h:231

ParallelApplyWorkerShared

Definition: worker_internal.h:153

ParallelApplyWorkerShared::logicalrep_worker_slot_no

int logicalrep_worker_slot_no

Definition: worker_internal.h:171

ParallelApplyWorkerShared::mutex

slock_t mutex

Definition: worker_internal.h:154

ParallelApplyWorkerShared::pending_stream_count

pg_atomic_uint32 pending_stream_count

Definition: worker_internal.h:177

ParallelApplyWorkerShared::fileset

FileSet fileset

Definition: worker_internal.h:196

ParallelApplyWorkerShared::xid

TransactionId xid

Definition: worker_internal.h:156

ParallelApplyWorkerShared::fileset_state

PartialFileSetState fileset_state

Definition: worker_internal.h:195

ParallelApplyWorkerShared::logicalrep_worker_generation

uint16 logicalrep_worker_generation

Definition: worker_internal.h:170

ParallelApplyWorkerShared::xact_state

ParallelTransState xact_state

Definition: worker_internal.h:167

ParallelApplyWorkerShared::last_commit_end

XLogRecPtr last_commit_end

Definition: worker_internal.h:183

StringInfoData

Definition: stringinfo.h:47

StringInfoData::cursor

int cursor

Definition: stringinfo.h:51

StringInfoData::data

char * data

Definition: stringinfo.h:48

StringInfoData::len

int len

Definition: stringinfo.h:49

Subscription::skiplsn

XLogRecPtr skiplsn

Definition: pg_subscription.h:126

Subscription::name

char * name

Definition: pg_subscription.h:128

Subscription::oid

Oid oid

Definition: pg_subscription.h:123

dsm_segment

Definition: dsm.c:67

shm_mq_handle

Definition: shm_mq.c:138

shm_mq

Definition: shm_mq.c:72

shm_toc_estimator

Definition: shm_toc.h:44

shm_toc

Definition: shm_toc.c:27

syscache.h

AllTablesyncsReady

bool AllTablesyncsReady(void)

Definition: tablesync.c:1770

invalidate_syncing_table_states

void invalidate_syncing_table_states(Datum arg, int cacheid, uint32 hashvalue)

Definition: tablesync.c:280

tcopprot.h

TransactionIdIsValid

#define TransactionIdIsValid(xid)

Definition: transam.h:41

ListCell

Definition: pg_list.h:46

WL_TIMEOUT

#define WL_TIMEOUT

Definition: waiteventset.h:37

WL_EXIT_ON_PM_DEATH

#define WL_EXIT_ON_PM_DEATH

Definition: waiteventset.h:39

WL_LATCH_SET

#define WL_LATCH_SET

Definition: waiteventset.h:34

SIGHUP

#define SIGHUP

Definition: win32_port.h:158

SIGUSR2

#define SIGUSR2

Definition: win32_port.h:171

worker_internal.h

ParallelTransState

Definition: worker_internal.h:118

PARALLEL_TRANS_UNKNOWN

@ PARALLEL_TRANS_UNKNOWN

Definition: worker_internal.h:119

PARALLEL_TRANS_STARTED

@ PARALLEL_TRANS_STARTED

Definition: worker_internal.h:120

PARALLEL_TRANS_FINISHED

@ PARALLEL_TRANS_FINISHED

Definition: worker_internal.h:121

am_parallel_apply_worker

static bool am_parallel_apply_worker(void)

Definition: worker_internal.h:364

WORKERTYPE_PARALLEL_APPLY

@ WORKERTYPE_PARALLEL_APPLY

Definition: worker_internal.h:34

PartialFileSetState

Definition: worker_internal.h:141

FS_EMPTY

@ FS_EMPTY

Definition: worker_internal.h:142

FS_SERIALIZE_DONE

@ FS_SERIALIZE_DONE

Definition: worker_internal.h:144

FS_READY

@ FS_READY

Definition: worker_internal.h:145

FS_SERIALIZE_IN_PROGRESS

@ FS_SERIALIZE_IN_PROGRESS

Definition: worker_internal.h:143

am_leader_apply_worker

static bool am_leader_apply_worker(void)

Definition: worker_internal.h:357

DefineSavepoint

void DefineSavepoint(const char *name)

Definition: xact.c:4385

IsTransactionState

bool IsTransactionState(void)

Definition: xact.c:387

StartTransactionCommand

void StartTransactionCommand(void)

Definition: xact.c:3071

IsTransactionBlock

bool IsTransactionBlock(void)

Definition: xact.c:4983

BeginTransactionBlock

void BeginTransactionBlock(void)

Definition: xact.c:3936

CommitTransactionCommand

void CommitTransactionCommand(void)

Definition: xact.c:3169

RollbackToSavepoint

void RollbackToSavepoint(const char *name)

Definition: xact.c:4579

EndTransactionBlock

bool EndTransactionBlock(bool chain)

Definition: xact.c:4056

AbortCurrentTransaction

void AbortCurrentTransaction(void)

Definition: xact.c:3463

XLogRecPtrIsInvalid

#define XLogRecPtrIsInvalid(r)

Definition: xlogdefs.h:29

RepOriginId

uint16 RepOriginId

Definition: xlogdefs.h:68

XLogRecPtr

uint64 XLogRecPtr

Definition: xlogdefs.h:21

InvalidXLogRecPtr

#define InvalidXLogRecPtr

Definition: xlogdefs.h:28
