Skip to content
/ server Public

MDEV-38624: prevent parallel slave from skipping ER_CONNECTION_KILLED error#4658

Merged
gkodinov merged 2 commits intoMariaDB:10.11from
nadaelsayed11:fix/MDEV-38624-never-skip-connection-killed
Feb 27, 2026
Merged

MDEV-38624: prevent parallel slave from skipping ER_CONNECTION_KILLED error#4658
gkodinov merged 2 commits intoMariaDB:10.11from
nadaelsayed11:fix/MDEV-38624-never-skip-connection-killed

Conversation

@nadaelsayed11
Copy link
Contributor

No description provided.

@CLAassistant
Copy link

CLAassistant commented Feb 16, 2026

CLA assistant check
All committers have signed the CLA.

@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label Feb 16, 2026
@gkodinov gkodinov changed the title Fix/MDEV-38624: prevent parallel slave from skipping ER_CONNECTION_KILLED error MDEV-38624: prevent parallel slave from skipping ER_CONNECTION_KILLED error Feb 16, 2026
Copy link
Contributor

@andrelkin andrelkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi!

Thanks for the patch, please check my two notes.
More to that, find out what possible tests cover these changes, and if none, please add them too.

Cheers,

Andrei

inline int ignored_error_code(int err_code)
{
// Force slave to stop on this error even if in --slave-skip-errors
if (err_code == ER_CONNECTION_KILLED)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is a possible place to mask ER_CONNECTION_KILLED error, it is less specific than
concurrency_error_code() to which ER_CONNECTION_KILLED belong just naturally.
Please consider that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At any rate the error-to-skip init_slave_skip_errors() function needs a warning about the supplied ER_CONNECTION_KILLED will be ignored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked concurrency_error_code(), and I think it is better than ignored_error_code(), but I need to ask about the function name, the ER_CONNECTION_KILLED, is not a concurrency error, right?

Is it good to add a connection error to concurrency errors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I will check the tests and add a warning message.

Thanks a lot :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ER_CONNECTION_KILLED is a sort of concurrency error in the slave parallel setup. Yet its semantic is not about error, at facing it the parallel slave won't stop, rather it will retry.

However, let me adjust my yesterday point and by that to simplify this work.
Along with the init_slave_skip_errors() warning let's simply skip setting its bit in slave_error_mask.
Notice the single-threaded "legacy" slave is fine with this. There is nothing wrong with a requirement your patch would impose: if its lonely thread is connection-killed, it then must exit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made the fix in init_slave_skip_errors() and added a warning in all option -where I cleared the bit map from connection error- and another warning in skip arg if it had the connection error.

Also, I made 2 tests for each senario please check them

@gkodinov gkodinov self-assigned this Feb 16, 2026
Copy link
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a preliminary review. Please make sure about the following:

  • you have buildbot tests running fine: right now you have compilation errors
  • you have some extra commits showing up. Please re-clone your branch from the latest main branch, then cherry pick the diff you need (preferably in a single commit) and override push this into your tree.
  • get a commit message that conforms to the coding standards please.

I will leave the technical review to Andrei. He has already started.

@andrelkin
Copy link
Contributor

I did not mention last time, but the PR contains unrelated commits. Those are better be stripped off sooner.

@gkodinov
Copy link
Member

The preferred way to update the base branch around here is with rebase. See https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/keeping-your-pull-request-in-sync-with-the-base-branch. This guarantees that only your changes will be visible in the PR.

I would suggest the following to make a clean single-commit PR:

  • export only your changes from your existing branch. I'd do this by git diff origin/main...<your branch> > diff.diff.
  • inspect diff.diff and ensure it contains only your changes
  • drop your local branch: git branch -d <your local branch>
  • re-create your local branch based on the latest main: git checkout upstream/main; git branch <your branch>; git checkout <your branch>
  • apply your changes and resolve any conflicts : git apply diff.diff
  • commit: git commit ....
  • push your new branch to your forked tree: git push --force origin/<your branch> <your branch>

This should ensure that your branch is based on the latest main from the upstream repo and that it has a single commit delta that contains only your changes.

Copy link
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testing my script. please ignore.

@nadaelsayed11 nadaelsayed11 force-pushed the fix/MDEV-38624-never-skip-connection-killed branch from af749ce to 1f1a08e Compare February 18, 2026 00:32
@nadaelsayed11 nadaelsayed11 changed the base branch from main to 10.11 February 18, 2026 00:50
@nadaelsayed11 nadaelsayed11 changed the base branch from 10.11 to main February 18, 2026 00:51
@nadaelsayed11 nadaelsayed11 changed the base branch from main to 10.11 February 18, 2026 00:53
@nadaelsayed11 nadaelsayed11 changed the base branch from 10.11 to main February 18, 2026 00:54
@nadaelsayed11
Copy link
Contributor Author

I cleared my branch, and it has only my commit now, but I couldn't clear the other 9 commits. I need to do cherry pick now, but I need to know if I should make my base main or 10.11?

Copy link
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to https://github.com/MariaDB/server/blob/main/CODING_STANDARDS.md "Bug fixes should be based against the earliest maintained branch in which the bug can be reproduced". This is a bug fix. I'd say go for 10.6 if you can reproduce it there.

On the PR content: I'm not the best one to ask, but I believe gh will just display whatever extra commits you have in your branch. This is why I believe dropping the branch and re-creating it with a single commit containing only your change will do the trick.

The important outcome should be that your branch (as pushed in your cloned tree) should not contain more than 1 commit compared to its base branch.

Copy link
Contributor

@andrelkin andrelkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Nada!

The patch is close to be perfect. I have few suggestions for the test part.

Cheers,

Andrei

@nadaelsayed11 nadaelsayed11 changed the base branch from main to 10.6 February 20, 2026 22:20
@nadaelsayed11 nadaelsayed11 force-pushed the fix/MDEV-38624-never-skip-connection-killed branch 3 times, most recently from f308a94 to ad82034 Compare February 21, 2026 00:57
Copy link
Contributor

@andrelkin andrelkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Nada!
The commit looks very good.

Cheers,
Andrei

Copy link
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great that you managed to do a single commit. Please now add a commit message compliant to CODING_STANDARDS.md.

Also, please re-base to 10.11: this is not a critical bug fix.

Prevent ER_CONNECTION_KILLED (1927) from being added to the slave_error_mask bitmap in init_slave_skip_errors(), even when explicitly listed in --slave-skip-errors or when using --slave-skip-errors=all. A warning is emitted in the error log when either is attempted.
@nadaelsayed11 nadaelsayed11 force-pushed the fix/MDEV-38624-never-skip-connection-killed branch from ad82034 to f25ff47 Compare February 24, 2026 00:26
@nadaelsayed11 nadaelsayed11 changed the base branch from 10.6 to 10.11 February 24, 2026 00:26
@nadaelsayed11
Copy link
Contributor Author

I have rebased the commit and updated its message. Please check.

Copy link
Contributor

@andrelkin andrelkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@nadaelsayed11
Copy link
Contributor Author

Thanks a lot. Are there any other changes to get merged?

@nadaelsayed11
Copy link
Contributor Author

Hello @gkodinov, are there any other changes?

Copy link
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@gkodinov gkodinov enabled auto-merge (rebase) February 27, 2026 08:39
@gkodinov gkodinov merged commit c84dfbd into MariaDB:10.11 Feb 27, 2026
15 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

Development

Successfully merging this pull request may close these issues.

4 participants