This repository was archived by the owner on Feb 8, 2024. It is now read-only.

@mssawant

A motr client restart is treated as a permanent failure, so
when an existing motr client restarts, Hare updates its
existing fid. A motr process comprises the motr services
associated with it. Interested motr modules subscribe to
changes in service states, and motr confc (the configuration
cache) maintains a hierarchy of processes and services. If a
process fid changes while its service fids do not, a given
process can no longer be mapped to its corresponding
services. It is therefore important to update the service
fids along with their respective process fid.

Solution:

  • Allocate service fids along with their corresponding process fid.
  • Make the fid allocation routine generic so that it works for
    any motr configuration object.
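
To illustrate the two points above, here is a minimal sketch in Python, not the code from this PR. `Fid`, `FidAllocator` and `realloc_process` are hypothetical names, as are the service names in the usage note. The sketch assumes that the fid container carries the object type as an ASCII character in its top byte, which is consistent with the `0x72...` ('r') process fids reported by `hctl status`, and that services use type 's'.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Iterator, List, Tuple


@dataclass(frozen=True)
class Fid:
    container: int
    key: int

    def __str__(self) -> str:
        return f"{self.container:#x}:{self.key:#x}"


class FidAllocator:
    """Hypothetical generic allocator: one key counter per object type."""

    def __init__(self) -> None:
        self._keys: Dict[str, Iterator[int]] = {}

    def alloc(self, type_char: str) -> Fid:
        # Assumption: the type character occupies the top byte of the container.
        container = (ord(type_char) << 56) | 1
        keys = self._keys.setdefault(type_char, count(0))
        return Fid(container, next(keys))


def realloc_process(allocator: FidAllocator,
                    service_names: List[str]) -> Tuple[Fid, Dict[str, Fid]]:
    """Allocate a new process fid together with new fids for its services."""
    process_fid = allocator.alloc('r')                              # process
    service_fids = {n: allocator.alloc('s') for n in service_names}  # services
    return process_fid, service_fids
```

Calling `realloc_process(allocator, ["svc-a", "svc-b"])` again after a client restart returns a fresh process fid and fresh service fids in one step, so the process-to-service mapping never refers to a stale process fid.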

Signed-off-by: Mandar Sawant [email protected]

@mssawant
Author

retest this please

Contributor

@Shreya-18 Shreya-18 left a comment

Looks good!!

@mssawant
Author

mssawant commented Sep 1, 2022

retest this please.

@supriyachavan4398
Contributor

Created custom build at https://eos-jenkins.colo.seagate.com/job/GitHub-custom-ci-builds/job/generic/job/custom-ci/7734/console
Deployed 6N cluster at https://eos-jenkins.colo.seagate.com/job/Cortx-Automation/job/RGW/job/setup-cortx-rgw-cluster/12135/

[root@ssc-vm-g2-rhev4-1630 ~]# kubectl get po
NAME                             READY   STATUS    RESTARTS   AGE
cortx-consul-client-6vv5k        1/1     Running   0          51m
cortx-consul-client-h6ntr        1/1     Running   0          51m
cortx-consul-client-mgm4g        1/1     Running   0          51m
cortx-consul-client-n9sz2        1/1     Running   0          51m
cortx-consul-client-ntpt6        1/1     Running   0          51m
cortx-consul-server-0            1/1     Running   0          51m
cortx-consul-server-1            1/1     Running   0          51m
cortx-consul-server-2            1/1     Running   0          51m
cortx-control-656977d448-h4lk7   1/1     Running   0          51m
cortx-data-g0-0                  3/3     Running   0          51m
cortx-data-g0-1                  3/3     Running   0          51m
cortx-data-g0-2                  3/3     Running   0          51m
cortx-data-g0-3                  3/3     Running   0          51m
cortx-data-g0-4                  3/3     Running   0          51m
cortx-data-g1-0                  3/3     Running   0          51m
cortx-data-g1-1                  3/3     Running   0          51m
cortx-data-g1-2                  3/3     Running   0          51m
cortx-data-g1-3                  3/3     Running   0          51m
cortx-data-g1-4                  3/3     Running   0          51m
cortx-ha-75c7899475-8snhz        3/3     Running   0          51m
cortx-kafka-0                    1/1     Running   0          51m
cortx-kafka-1                    1/1     Running   0          51m
cortx-kafka-2                    1/1     Running   0          51m
cortx-server-0                   2/2     Running   0          51m
cortx-server-1                   2/2     Running   0          51m
cortx-server-2                   2/2     Running   0          51m
cortx-server-3                   2/2     Running   0          51m
cortx-server-4                   2/2     Running   0          51m
cortx-zookeeper-0                1/1     Running   0          51m
cortx-zookeeper-1                1/1     Running   0          51m
cortx-zookeeper-2                1/1     Running   0          51m

[root@ssc-vm-g2-rhev4-1630 ~]# kubectl exec -it cortx-data-g0-1 -- /bin/bash
Defaulted container "cortx-hax" out of: cortx-hax, cortx-motr-confd, cortx-motr-io-001, node-config (init), cortx-setup (init)
[root@cortx-data-g0-1 /]# hctl status -d
Bytecount:
    critical : 0
    damaged : 0
    degraded : 0
    healthy : 1677721
Data pool:
    # fid name
    0x6f00000000000001:0x0 'storage-set-1__sns'
Profile:
    # fid name: pool(s)
    0x7000000000000001:0x0 'Profile_the_pool': 'storage-set-1__sns' 'storage-set-1__dix' None
Services:
    cortx-data-g0-0.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x0          inet:tcp:cortx-data-g0-0.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x1          inet:tcp:cortx-data-g0-0.cortx-data-headless.cortx.svc.cluster.local@21002
    [started]  confd               0x7200000000000001:0x2          inet:tcp:cortx-data-g0-0.cortx-data-headless.cortx.svc.cluster.local@21001
    cortx-data-g0-1.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x3          inet:tcp:cortx-data-g0-1.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x4          inet:tcp:cortx-data-g0-1.cortx-data-headless.cortx.svc.cluster.local@21002
    [started]  confd               0x7200000000000001:0x5          inet:tcp:cortx-data-g0-1.cortx-data-headless.cortx.svc.cluster.local@21001
    cortx-data-g0-2.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x6          inet:tcp:cortx-data-g0-2.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x7          inet:tcp:cortx-data-g0-2.cortx-data-headless.cortx.svc.cluster.local@21002
    [started]  confd               0x7200000000000001:0x8          inet:tcp:cortx-data-g0-2.cortx-data-headless.cortx.svc.cluster.local@21001
    cortx-data-g0-3.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x9          inet:tcp:cortx-data-g0-3.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0xa          inet:tcp:cortx-data-g0-3.cortx-data-headless.cortx.svc.cluster.local@21002
    [started]  confd               0x7200000000000001:0xb          inet:tcp:cortx-data-g0-3.cortx-data-headless.cortx.svc.cluster.local@21001
    cortx-data-g0-4.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0xc          inet:tcp:cortx-data-g0-4.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0xd          inet:tcp:cortx-data-g0-4.cortx-data-headless.cortx.svc.cluster.local@21002
    [started]  confd               0x7200000000000001:0xe          inet:tcp:cortx-data-g0-4.cortx-data-headless.cortx.svc.cluster.local@21001
    cortx-data-g1-0.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0xf          inet:tcp:cortx-data-g1-0.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x10         inet:tcp:cortx-data-g1-0.cortx-data-headless.cortx.svc.cluster.local@21002
    [started]  confd               0x7200000000000001:0x11         inet:tcp:cortx-data-g1-0.cortx-data-headless.cortx.svc.cluster.local@21001
    cortx-data-g1-1.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x12         inet:tcp:cortx-data-g1-1.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x13         inet:tcp:cortx-data-g1-1.cortx-data-headless.cortx.svc.cluster.local@21002
    [started]  confd               0x7200000000000001:0x14         inet:tcp:cortx-data-g1-1.cortx-data-headless.cortx.svc.cluster.local@21001
    cortx-data-g1-2.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x15         inet:tcp:cortx-data-g1-2.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x16         inet:tcp:cortx-data-g1-2.cortx-data-headless.cortx.svc.cluster.local@21002
    [started]  confd               0x7200000000000001:0x17         inet:tcp:cortx-data-g1-2.cortx-data-headless.cortx.svc.cluster.local@21001
    cortx-data-g1-3.cortx-data-headless.cortx.svc.cluster.local  (RC)
    [started]  hax                 0x7200000000000001:0x18         inet:tcp:cortx-data-g1-3.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x19         inet:tcp:cortx-data-g1-3.cortx-data-headless.cortx.svc.cluster.local@21002
    [started]  confd               0x7200000000000001:0x1a         inet:tcp:cortx-data-g1-3.cortx-data-headless.cortx.svc.cluster.local@21001
    cortx-data-g1-4.cortx-data-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x1b         inet:tcp:cortx-data-g1-4.cortx-data-headless.cortx.svc.cluster.local@22001
    [started]  ioservice           0x7200000000000001:0x1c         inet:tcp:cortx-data-g1-4.cortx-data-headless.cortx.svc.cluster.local@21002
    [started]  confd               0x7200000000000001:0x1d         inet:tcp:cortx-data-g1-4.cortx-data-headless.cortx.svc.cluster.local@21001
    cortx-server-0.cortx-server-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x1e         inet:tcp:cortx-server-0.cortx-server-headless.cortx.svc.cluster.local@22001
    [started]  rgw_s3              0x7200000000000001:0x1f         inet:tcp:cortx-server-0.cortx-server-headless.cortx.svc.cluster.local@22501
    cortx-server-1.cortx-server-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x20         inet:tcp:cortx-server-1.cortx-server-headless.cortx.svc.cluster.local@22001
    [started]  rgw_s3              0x7200000000000001:0x21         inet:tcp:cortx-server-1.cortx-server-headless.cortx.svc.cluster.local@22501
    cortx-server-2.cortx-server-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x22         inet:tcp:cortx-server-2.cortx-server-headless.cortx.svc.cluster.local@22001
    [started]  rgw_s3              0x7200000000000001:0x23         inet:tcp:cortx-server-2.cortx-server-headless.cortx.svc.cluster.local@22501
    cortx-server-3.cortx-server-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x24         inet:tcp:cortx-server-3.cortx-server-headless.cortx.svc.cluster.local@22001
    [started]  rgw_s3              0x7200000000000001:0x25         inet:tcp:cortx-server-3.cortx-server-headless.cortx.svc.cluster.local@22501
    cortx-server-4.cortx-server-headless.cortx.svc.cluster.local
    [started]  hax                 0x7200000000000001:0x26         inet:tcp:cortx-server-4.cortx-server-headless.cortx.svc.cluster.local@22001
    [started]  rgw_s3              0x7200000000000001:0x27         inet:tcp:cortx-server-4.cortx-server-headless.cortx.svc.cluster.local@22501
Devices:
    cortx-data-g0-0.cortx-data-headless.cortx.svc.cluster.local
    [online]  /dev/sdd
    [online]  /dev/sde
    [online]  /dev/sdc
    cortx-data-g0-1.cortx-data-headless.cortx.svc.cluster.local
    [online]  /dev/sdd
    [online]  /dev/sde
    [online]  /dev/sdc
    cortx-data-g0-2.cortx-data-headless.cortx.svc.cluster.local
    [online]  /dev/sdd
    [online]  /dev/sde
    [online]  /dev/sdc
    cortx-data-g0-3.cortx-data-headless.cortx.svc.cluster.local
    [online]  /dev/sdd
    [online]  /dev/sde
    [online]  /dev/sdc
    cortx-data-g0-4.cortx-data-headless.cortx.svc.cluster.local
    [online]  /dev/sdd
    [online]  /dev/sde
    [online]  /dev/sdc
    cortx-data-g1-0.cortx-data-headless.cortx.svc.cluster.local
    [online]  /dev/sdg
    [online]  /dev/sdh
    [online]  /dev/sdf
    cortx-data-g1-1.cortx-data-headless.cortx.svc.cluster.local
    [online]  /dev/sdg
    [online]  /dev/sdh
    [online]  /dev/sdf
    cortx-data-g1-2.cortx-data-headless.cortx.svc.cluster.local
    [online]  /dev/sdg
    [online]  /dev/sdh
    [online]  /dev/sdf
    cortx-data-g1-3.cortx-data-headless.cortx.svc.cluster.local
    [online]  /dev/sdg
    [online]  /dev/sdh
    [online]  /dev/sdf
    cortx-data-g1-4.cortx-data-headless.cortx.svc.cluster.local
    [online]  /dev/sdg
    [online]  /dev/sdh
    [online]  /dev/sdf
    cortx-server-0.cortx-server-headless.cortx.svc.cluster.local
    cortx-server-1.cortx-server-headless.cortx.svc.cluster.local
    cortx-server-2.cortx-server-headless.cortx.svc.cluster.local
    cortx-server-3.cortx-server-headless.cortx.svc.cluster.local
    cortx-server-4.cortx-server-headless.cortx.svc.cluster.local

I also checked all process fids and their service fids in the Consul KV store; all are in the online state.
cc. @mssawant, @d-nayak

@mssawant
Author

mssawant commented Sep 2, 2022

Thanks @supriyachavan4398.
Going ahead with merging this PR.

@mssawant mssawant merged commit 9b96255 into Seagate:main Sep 2, 2022