Description
After PR #6787, exec has become more stable in the concurrent and container-shutdown paths. However, it still suffers from some race-condition style failures. These have mainly been seen in the CI environment; the same tests have passed locally (run consecutively) against a high-resource, complex VSAN deployment. Below is a short catalog of the observed intermittent failures:
Simple Concurrent Exec:
- Failed with the following CommitHandler error:
Feb 26 2018 16:27:06.142Z ERROR op=301.310: CommitHandler error on handle(c13c27abff7908e6146498325fa944c3) for 46bec94ca67b332d0f924df4f4d7c84168693b876a8aa15658050b2e3b2bf46e: The operation is not allowed in the current state.
Exec During Poweroff Of A Container Performing A Long Running Task:
- Failed because all execs actually succeeded (the test expectations might need to be improved).
- Container reported being in an invalid state.
Exec During Poweroff Of A Container Performing A Short Running Task:
- The long running exec returned an rc of 1 even though its output appeared to be present (looks successful in the portlayer of Exec-Failure-1); see the reproduction sketch below.
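The poweroff failures above come from racing an exec against a stopping container; a minimal local reproduction sketch is below (the busybox image, sleep duration, and single exec are illustrative assumptions, not the actual CI test):
# Start a container running a long task, then race an exec against its poweroff.
id=$(docker run -d busybox /bin/sleep 600)
docker exec $id /bin/echo exec-during-poweroff &
docker stop $id
wait
# Check what state the container ended up reporting.
docker inspect --format '{{.State.Status}}' $id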
REFERENCE LOGS:
Exec-CI-Failure-1-16518.zip
Exec-CI-Failure-2-16515.zip
Exec-Failure-3-FROM-FULL-CI.html.zip
Exec-Failure-3-FROM-FULL-CI-16510.zip
Currently using the following for basic concurrency testing:
c=1; date
id=`docker ps -q | awk '{print $1}'`
for i in `seq 1 $c`; do docker exec $id /bin/echo /tmp/$i & done
for i in `seq 1 $c`; do wait %$i 2>/dev/null; done
date
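This starts $c execs in the background against the first container returned by docker ps and then waits on each job; raising c increases the concurrency level.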
TODO:
- Investigate why vCenter reconfigure slows down over time with the number of execs. It could be the number of extraconfig keys in the VM config (we should not be sending the entire set to VC, but VC may be relaying the entire set to ESX - compare/contrast with ESX directly). It could also be that reconfigures result in an accrual of state, irrespective of the keys.
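As a first data point for the reconfigure question, per-exec latency can be sampled over a long run; a rough sketch is below (the iteration count and use of /bin/true are assumptions for illustration). The extraConfig key count on the containerVM could also be compared before and after the run (e.g. via govc vm.info -e, if available) to see whether keys accrue.
# Print the iteration number and the wall-clock duration of each exec
# so any growth in reconfigure time over the run is visible.
id=$(docker ps -q | head -1)
for i in $(seq 1 500); do
  start=$(date +%s.%N)
  docker exec $id /bin/true
  end=$(date +%s.%N)
  echo "$i $(echo "$end - $start" | bc)"
done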