drbd/0002-drbd-improve-decision-about-marking-a-failed-disk-Ou.patch
heming zhao 6ee9ba5898 - Update DRBD version from 9.1.22 to 9.1.23 (boo#1234849)
* Changelog from Linbit:
    9.1.23 (api:genl2/proto:86-101,118-121/transport:18)
    --------
     * Fix a corner case that can happen when DRBD establishes multiple
       connections in parallel, which could lead one connection to end up in
       an inconsistent replication state of WFBitMapT/Established
     * Fix a corner case in which a reconciliation resync ends up in
       WFBitMapT/Established
     * Restrict protocol compatibility to the most recent 8.4 and 9.0 releases
     * Fix a corner case causing a module ref leak on drbd_transport_tcp;
       if it hits, you cannot rmmod it
     * rate-limit resync progress while resync is paused
     * resync-target inherits history UUIDs when resync finishes,
       this can prevent unexpected "unrelated data" events later
     * Updated compatibility code for Linux 6.11 and 6.12
  * remove patches which are already included in the new version:
     0001-drbd-properly-rate-limit-resync-progress-reports.patch
     0002-drbd-inherit-history-UUIDs-from-sync-source-when-res.patch
     0003-build-compat-fix-line-offset-in-annotation-pragmas-p.patch
     0004-drbd-fix-exposed_uuid-going-backward.patch
     0005-drbd-Proper-locking-around-new_current_uuid-on-a-dis.patch
     0006-build-CycloneDX-fix-bom-ref-add-purl.patch
     0007-build-Another-update-to-the-spdx-files.patch
     0008-build-generate-spdx.json-not-tag-value-format.patch
     0009-compat-fix-gen_patch_names-for-bdev_file_open_by_pat.patch
     0010-compat-fix-nla_nest_start_noflag-test.patch
     0011-compat-fix-blk_alloc_disk-rule.patch
     0012-drbd-remove-const-from-function-return-type.patch
     0013-drbd-don-t-set-max_write_zeroes_sectors-in-decide_on.patch
     0014-drbd-split-out-a-drbd_discard_supported-helper.patch
     0015-drbd-atomically-update-queue-limits-in-drbd_reconsid.patch
     0016-compat-test-and-patch-for-queue_limits_start_update.patch
     0017-compat-specify-which-essential-change-was-not-made.patch
     0018-gen_patch_names-reorder-blk_mode_t.patch
     0019-compat-fix-blk_queue_update_readahead-patch.patch
     0020-compat-test-and-patch-for-que_limits-max_hw_discard_.patch
     0021-compat-fixup-write_zeroes__no_capable.patch
     0022-compat-fixup-queue_flag_discard__yes_present.patch
     0023-drbd-move-flags-to-queue_limits.patch
     0024-compat-test-and-patch-for-queue_limits.features.patch
     0025-drbd-Annotate-struct-fifo_buffer-with-__counted_by.patch
     0026-compat-test-and-patch-for-__counted_by.patch
     0027-drbd-fix-function-cast-warnings-in-state-machine.patch
     0028-Add-missing-documentation-of-peer_device-parameter-t.patch
     0030-drbd-kref_put-path-when-kernel_accept-fails.patch
     0031-build-fix-typo-in-Makefile.spatch.patch
     0032-drbd-open-do-not-delay-open-if-already-Primary.patch
  * remove patches which are no longer needed:
     boo1231290_fix_drbd_build_error_against_kernel_v6.11.0.patch
     boo1233222_fix_drbd_build_error_against_kernel_v6.11.6.patch
  * update:
     drbd_git_revision
     drbd.spec
  * add upstream patches to align with commit d64ebe7eb7df:
     0001-drbd-Fix-memory-leak.patch

OBS-URL: https://build.opensuse.org/package/show/network:ha-clustering:Factory/drbd?expand=0&rev=155
2024-12-27 03:43:25 +00:00


From f2cd05b8d60d27f43b07175b92ef4c2a69b8e3a2 Mon Sep 17 00:00:00 2001
From: Joel Colledge <joel.colledge@linbit.com>
Date: Wed, 6 Sep 2023 15:49:44 +0200
Subject: [PATCH 02/20] drbd: improve decision about marking a failed disk
 Outdated

Sometimes it is possible to update the metadata even after our disk has
failed. We were too eager to remove the MDF_WAS_UP_TO_DATE flag in this
case.

Firstly, we used the "NOW" states, so would mark our metadata Outdated
if we were a Primary with UpToDate data and no peers, and our disk
failed. Use the "NEW" states instead.

Secondly, do not consider peers that are disconnecting, because they
will not see that our disk state is Failed, and so will outdate
themselves. We do not want to outdate both nodes in this situation.
---
drbd/drbd_state.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/drbd/drbd_state.c b/drbd/drbd_state.c
index 7e6e3477893d..8b60afeb097b 100644
--- a/drbd/drbd_state.c
+++ b/drbd/drbd_state.c
@@ -2489,15 +2489,24 @@ static void initialize_resync(struct drbd_peer_device *peer_device)
 /* Is there a primary with access to up to date data known */
 static bool primary_and_data_present(struct drbd_device *device)
 {
-	bool up_to_date_data = device->disk_state[NOW] == D_UP_TO_DATE;
-	bool primary = device->resource->role[NOW] == R_PRIMARY;
+	bool up_to_date_data = device->disk_state[NEW] == D_UP_TO_DATE;
+	struct drbd_resource *resource = device->resource;
+	bool primary = resource->role[NEW] == R_PRIMARY;
 	struct drbd_peer_device *peer_device;
 
 	for_each_peer_device(peer_device, device) {
-		if (peer_device->connection->peer_role[NOW] == R_PRIMARY)
+		struct drbd_connection *connection = peer_device->connection;
+
+		/* Do not consider the peer if we are disconnecting. */
+		if (resource->remote_state_change &&
+		    drbd_twopc_between_peer_and_me(connection) &&
+		    resource->twopc_reply.is_disconnect)
+			continue;
+
+		if (connection->peer_role[NEW] == R_PRIMARY)
 			primary = true;
 
-		if (peer_device->disk_state[NOW] == D_UP_TO_DATE)
+		if (peer_device->disk_state[NEW] == D_UP_TO_DATE)
 			up_to_date_data = true;
 	}
 
@@ -4808,6 +4817,7 @@ change_cluster_wide_state(bool (*change)(struct change_context *, enum change_ph
 	} else if (context->mask.conn == conn_MASK && context->val.conn == C_DISCONNECTING) {
 		reply->target_reachable_nodes = NODE_MASK(context->target_node_id);
 		reply->reachable_nodes &= ~reply->target_reachable_nodes;
+		reply->is_disconnect = 1;
 	} else {
 		reply->target_reachable_nodes = reply->reachable_nodes;
 	}
--
2.35.3
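
The two changes described in the commit message can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: the `sketch_*` types, the `which` and `skip_disconnecting` parameters, and the per-peer `disconnecting` flag are hypothetical simplifications introduced here, not DRBD's real structures. In the actual patch the "disconnecting" condition is derived from `remote_state_change`, `drbd_twopc_between_peer_and_me()` and `twopc_reply.is_disconnect`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for DRBD's role and disk states. */
enum sketch_role { SK_SECONDARY, SK_PRIMARY };
enum sketch_disk { SK_FAILED, SK_OUTDATED, SK_UP_TO_DATE };
enum sketch_which { SK_NOW, SK_NEW };	/* index into the [NOW]/[NEW] pairs */

struct sketch_peer {
	enum sketch_role role[2];
	enum sketch_disk disk[2];
	bool disconnecting;	/* assumption: models the is_disconnect check */
};

struct sketch_node {
	enum sketch_role role[2];
	enum sketch_disk disk[2];
	const struct sketch_peer *peers;
	size_t n_peers;
};

/*
 * Is a Primary with access to up-to-date data known?
 * `which` selects the old (NOW) or the about-to-be-applied (NEW) states;
 * `skip_disconnecting` ignores peers we are in the middle of leaving.
 */
bool sketch_primary_and_data_present(const struct sketch_node *node,
				     enum sketch_which which,
				     bool skip_disconnecting)
{
	bool up_to_date_data = node->disk[which] == SK_UP_TO_DATE;
	bool primary = node->role[which] == SK_PRIMARY;
	size_t i;

	for (i = 0; i < node->n_peers; i++) {
		const struct sketch_peer *peer = &node->peers[i];

		/* A disconnecting peer will not learn that our disk failed. */
		if (skip_disconnecting && peer->disconnecting)
			continue;

		if (peer->role[which] == SK_PRIMARY)
			primary = true;
		if (peer->disk[which] == SK_UP_TO_DATE)
			up_to_date_data = true;
	}

	return primary && up_to_date_data;
}
```

In this model, a lone Primary whose disk goes from UpToDate to Failed still reports "primary and data present" when evaluated with `SK_NOW`, but not with `SK_NEW`. That is the first case the commit message describes. Likewise, a Primary peer with UpToDate data only counts while `skip_disconnecting` is false, which corresponds to the second case.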