[BP 7.2.4] - XDCR - Xmem nozzle cleanup is stuck due to waiting on non-existent bandwidth throttler

Description

When a pipeline (replication) is configured with a bandwidth limit and the pipeline stops, the Xmem nozzles perform a cleanup. This cleanup gets stuck because the writers (i.e. the Xmem nozzle goroutines writing to the socket) are waiting for the bandwidth throttler (referred to simply as the throttler from here on) to release some capacity/quota.

However, because the pipeline is stopping, the throttler goroutine also exits. The writers are therefore left waiting on a throttler that no longer exists.

The following stack trace for the Xmem nozzle shows it waiting on the bandwidth throttler:

Steps to reproduce:

  1. Create a replication with a bandwidth usage limit.

  2. Ensure that the usage limit is low enough that the writers block all the time. The following log line indicates this situation:

  3. Pause the replication.

  4. Check the goroutine stack trace.
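For the last step, the sketch below shows how a Go process can produce a full goroutine dump from within, using the standard `runtime/pprof` package. It is an illustration only: `goroutineDump` and `countInState` are hypothetical helpers, and how a running goxdcr process exposes its goroutine dump is not covered in this ticket.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"strings"
	"time"
)

// goroutineDump returns the stacks of all goroutines in the current process.
// With debug level 2 the output uses the same format as an unrecovered panic,
// with one "goroutine N [state]:" header per goroutine.
func goroutineDump() string {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return ""
	}
	return buf.String()
}

// countInState counts goroutine headers whose state starts with `state`,
// for example "chan receive" or "select".
func countInState(dump, state string) int {
	return strings.Count(dump, "["+state)
}

func main() {
	parked := make(chan struct{})
	go func() { <-parked }()          // stays blocked, like a writer waiting for quota
	time.Sleep(50 * time.Millisecond) // give the goroutine time to park

	dump := goroutineDump()
	fmt.Println("goroutines blocked on channel receive:", countInState(dump, "chan receive"))
}
```

In a dump taken after pausing the replication, the goroutines of interest are the ones whose wait state and top stack frames point at the throttler.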

Activity

Neil Huang January 3, 2024 at 6:39 PM

Release Notes
Problem Description: If the bandwidth throttler is used, a race condition may occur during pipeline shutdown that leaves the out nozzle unable to exit.
Resolution: Fixed the race condition during shutdown to ensure the out nozzle closes properly.
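One common Go pattern for making such a wait shutdown-aware is to select on a finish channel alongside the quota channel. The sketch below illustrates that pattern only. It is not the change made in goxdcr commit 2518e09, and `waitForQuota`, `finCh`, and `errStopped` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errStopped is a hypothetical sentinel returned when shutdown wins the wait.
var errStopped = errors.New("pipeline stopped")

// waitForQuota blocks until the throttler releases quota or the finish
// channel is closed, whichever happens first.
func waitForQuota(tokens <-chan struct{}, finCh <-chan struct{}) error {
	select {
	case <-tokens:
		return nil
	case <-finCh:
		return errStopped
	}
}

func main() {
	tokens := make(chan struct{}) // no throttler is running, so nothing is ever sent
	finCh := make(chan struct{})

	// Simulate the pipeline stopping shortly after the writer starts waiting.
	go func() {
		time.Sleep(10 * time.Millisecond)
		close(finCh)
	}()

	fmt.Println(waitForQuota(tokens, finCh)) // prints "pipeline stopped"
}
```

Because closing a channel wakes every receiver, a single `close(finCh)` during shutdown releases all writers at once, regardless of whether the throttler goroutine is still alive.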

Ayush Nayyar December 12, 2023 at 8:47 AM

Verified on 7.2.4-7045.

CB robot November 7, 2023 at 3:41 PM

Build couchbase-server-7.2.4-6949 contains goxdcr commit 2518e09 with commit message:
: Xmem nozzle cleanup is stuck due to waiting on non-existent bandwidth throttler

Fixed
Details

Is this a Regression?

No

Triage

Untriaged

Created November 3, 2023 at 5:50 PM
Updated August 31, 2024 at 11:07 AM
Resolved November 29, 2023 at 2:19 PM