
Kill hanging processes of not started servers (reworked fix) #332


Merged (2 commits) on Mar 17, 2022

Conversation

ylobankov
Contributor

This change reverts the previous fix, which was wrong, and provides a
proper solution to the issue.

It was found that test-run does not kill hanging processes of tarantool
servers that failed to start, so they are left running. The situation
can be reproduced by creating the main server and then a replica server
that is unable to join the master, for example, due to a lack of user
permissions. In this case, the test fails on the server start timeout
and test-run kills only the main server process.
This patch fixes the issue.

Fixes #256
Follows up #276
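
The scenario described above can be illustrated outside test-run. The following is a minimal sketch, not the code from this patch: `start_with_timeout`, `StartFailed`, and the `is_ready` callback are hypothetical names, and a sleeping Python child stands in for a server that hangs during startup. It only shows the general idea of killing a process that did not become ready before the start timeout.

```python
import subprocess
import sys
import time


class StartFailed(Exception):
    """Hypothetical error raised when a server does not become ready."""

    def __init__(self, proc):
        super().__init__("server did not start in time")
        self.proc = proc


def start_with_timeout(cmd, is_ready, timeout=1.0, poll_interval=0.05):
    """Spawn cmd and wait until is_ready(proc) is true or the timeout expires.

    If the child never becomes ready, it is killed and reaped so that
    no hanging process is left behind.
    """
    proc = subprocess.Popen(cmd)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_ready(proc):
            return proc
        if proc.poll() is not None:
            break  # the child exited on its own before becoming ready
        time.sleep(poll_interval)
    # Not started in time: kill the child if it is still alive, then reap it.
    if proc.poll() is None:
        proc.kill()
    proc.wait()
    raise StartFailed(proc)
```

With a child that hangs forever and an `is_ready` callback that never returns true, the call raises `StartFailed`, and the attached `proc` already has a return code, which means the process is gone rather than left hanging.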

This reverts commit 9c2cc2d.

The fix provided in the mentioned commit is wrong. Many tests in tarantool
only install a server without starting it, and such tests fail with the
following error:

    [001] [2022-03-17 18:20:36.437767] DEBUG: [Instance proxy] Stopping the server...
    [001]
    [001] Test.run() received the following error:
    [001] Traceback (most recent call last):
    [001]   File "/Users/y.lobankov/Workspace/tarantool/test-run/lib/test.py", line 195, in run
    [001]     self.execute(server)
    [001]   File "/Users/y.lobankov/Workspace/tarantool/test-run/lib/tarantool_server.py", line 389, in execute
    [001]     ts.stop_nondefault(signal=signal.SIGKILL)
    [001]   File "/Users/y.lobankov/Workspace/tarantool/test-run/lib/preprocessor.py", line 443, in stop_nondefault
    [001]     v.stop(silent=True, signal=signal)
    [001]   File "/Users/y.lobankov/Workspace/tarantool/test-run/lib/tarantool_server.py", line 1026, in stop
    [001]     if self.process is not None and self.process.returncode is None:
    [001] AttributeError: 'TarantoolServer' object has no attribute 'process'
    [001]

If the server is never started, the server object does not have the
`process` attribute.
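
The failure mode can be shown with a small standalone sketch. This is not the real test-run code: `SketchServer` and its methods are hypothetical, and the sketch assumes, as the traceback suggests, that only a start attempt sets `process`. It illustrates one defensive way to stop a server that was installed but never started.

```python
import subprocess
import sys


class SketchServer:
    """Hypothetical stand-in for a server object (not the test-run class)."""

    def install(self):
        # Installing prepares the server but spawns nothing,
        # so the `process` attribute is not set here.
        self.installed = True

    def start(self):
        # Only a start attempt creates the child process.
        self.process = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"]
        )

    def stop(self):
        # getattr() with a default avoids AttributeError when the
        # server was installed but never started.
        process = getattr(self, "process", None)
        if process is None or process.poll() is not None:
            return False  # nothing is running, nothing to stop
        process.kill()
        process.wait()
        return True
```

Calling `stop()` on an installed-only `SketchServer` returns `False` instead of raising `AttributeError`, while a started server is killed and reaped. Initializing `process` to `None` in a constructor would be an equally reasonable design.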
@ylobankov ylobankov requested a review from Totktonada March 17, 2022 15:59
Member

@Totktonada Totktonada left a comment


LGTM.

@ylobankov ylobankov merged commit f63d85e into master Mar 17, 2022
@ylobankov ylobankov deleted the ylobankov/kill-procs-of-not-started-servers branch March 17, 2022 20:13
kyukhin added a commit to kyukhin/tarantool that referenced this pull request Mar 18, 2022
Bump test-run to new version with the following improvements:
- Add timeout for starting tarantool server (tarantool/test-run#302)
- Kill hanging processes of not started servers (tarantool/test-run#332)
- Rerun all failed tests, not only marked as fragile (tarantool/test-run#329)

NO_DOC=testing
NO_TEST=testing
NO_CHANGELOG=testing
kyukhin added a commit to tarantool/tarantool that referenced this pull request Mar 18, 2022
Bump test-run to new version with the following improvements:
- Add timeout for starting tarantool server (tarantool/test-run#302)
- Kill hanging processes of not started servers (tarantool/test-run#332)
- Rerun all failed tests, not only marked as fragile (tarantool/test-run#329)

NO_DOC=testing
NO_TEST=testing
NO_CHANGELOG=testing
kyukhin added a commit to tarantool/tarantool that referenced this pull request Mar 21, 2022
Bump test-run to new version with the following improvements:
- Add timeout for starting tarantool server (tarantool/test-run#302)
- Kill hanging processes of not started servers (tarantool/test-run#332)
- Rerun all failed tests, not only marked as fragile
  (tarantool/test-run#329)

NO_DOC=testing
NO_TEST=testing
NO_CHANGELOG=testing
kyukhin added a commit to tarantool/tarantool that referenced this pull request Mar 21, 2022
Bump test-run to new version with the following improvements:
- Add timeout for starting tarantool server (tarantool/test-run#302)
- Kill hanging processes of not started servers (tarantool/test-run#332)
- Rerun all failed tests, not only marked as fragile
  (tarantool/test-run#329)

NO_DOC=testing
NO_TEST=testing
NO_CHANGELOG=testing

(cherry picked from commit 0b46154)
Development

Successfully merging this pull request may close these issues.

test-run does not kill server that hangs in an instance script
2 participants