[BP 7.2.6] - XDCR - Deadlock because of routingUpdater when pipeline is stopping

Description

Found in QE testing by

Bidirectional replication was running in mobile mode (ECCV and mobile settings on) when changes_left got stuck. Pprof files (both from the same cluster, taken at two different times) are attached.

Components

Fix versions

Labels

Environment

None

Link to Log File, atop/blg, CBCollectInfo, Core dump

None

Release Notes Description

None

Activity

Sumukh Bhat January 8, 2025 at 3:55 AM

ECCV stands for enableCrossClusterVersioning, a bucket setting, and mobile is an XDCR setting for a future feature. QE found this bug while testing that feature, but it was coincidental that the bug was hit with those settings. The deadlock could also be hit with neither setting enabled.

Nirvair Bhinder January 7, 2025 at 9:46 PM

Hi, can you please explain what "ECCV and mobile settings" means here?

Ayush Nayyar August 12, 2024 at 2:59 PM

Verified on 7.2.6-8103.

Beth Favini August 12, 2024 at 11:16 AM

We are working on release notes for Server 7.2.6. If this issue should be included in the release notes, please do the following:

  • Add the releasenote label

  • Include a brief description of the issue and resolution (from the customer's point of view).

Thank you

Sumukh Bhat June 7, 2024 at 7:45 AM

Release notes:
The fix addresses a corner case in which the routing updater, while trying to raise a backfill-related event, gets stuck because of a race with the closing of the replication pipeline.
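
For illustration only, here is a minimal Go sketch of the general kind of race described in the release note. It is not the actual XDCR code or the actual fix: raiseEvent, eventCh and finCh are hypothetical names. It assumes the updater hands events to an unbuffered channel whose consumer may already have exited when the pipeline stops. A bare send would block forever in that case; selecting on a shutdown channel lets the updater give up instead.

```go
package main

import (
	"fmt"
	"time"
)

// raiseEvent is a hypothetical helper, not an XDCR function. It tries to
// hand an event to eventCh but gives up as soon as finCh is closed.
// It returns true only if the event was delivered.
func raiseEvent(eventCh chan<- string, finCh <-chan struct{}, event string) bool {
	select {
	case eventCh <- event:
		return true
	case <-finCh:
		// The pipeline is stopping; drop the event rather than block.
		return false
	}
}

func main() {
	eventCh := make(chan string) // unbuffered: a send blocks until someone receives
	finCh := make(chan struct{}) // closed when the (assumed) pipeline stops
	done := make(chan bool)

	// Updater goroutine. Nobody receives from eventCh, as if the event
	// consumer had already exited during pipeline stop.
	go func() {
		done <- raiseEvent(eventCh, finCh, "backfill")
	}()

	// Pipeline stop: signal shutdown, then wait for the updater to exit.
	close(finCh)

	select {
	case delivered := <-done:
		fmt.Println("updater exited, delivered:", delivered)
	case <-time.After(time.Second):
		fmt.Println("updater is stuck")
	}
}
```

If raiseEvent were a plain `eventCh <- event`, the goroutine in this sketch would never return and the wait would fall through to the timeout branch, which is the shape of hang that shows up as a blocked goroutine in a pprof dump.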

Fixed
Details

Assignee

Reporter

Is this a Regression?

Unknown

Triage

Untriaged

Story Points

Priority

Created June 3, 2024 at 1:03 PM
Updated January 8, 2025 at 3:55 AM
Resolved June 6, 2024 at 9:48 AM