
Slots are free but never fully used #16

@ynn2023


Hello!

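The cluster had been running for a while. Since about 10:00 this morning, the number of RUNNING jobs on each machine has stayed at around ten or so. Machine load is not high, yet no matter how many jobs are pending, the RUNNING count stays at that level and no further jobs are dispatched to the compute servers.
Queue parameters: RES_REQ: order[ut:mem] span[hosts=1]
I ran badmin reconfig, but the state is unchanged.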

Please take a look at this behavior. Many thanks!

(Two screenshots attached.)

The last messages in mbatchd.log are as follows:
May 27 09:50:09 718 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 09:50:17 832 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 09:50:26 876 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 09:50:35 911 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 09:50:36 954 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 09:51:53 1041 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 09:54:43 2136 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 10:12:29 10780 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 10:12:43 11048 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 10:20:56 16716 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 11:30:19 6550 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 11:32:03 20832 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 11:39:36 24068 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 11:46:01 24199 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 11:46:18 26988 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 13:22:48 5795 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 13:37:03 12223 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
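
In case it helps with triage, here is a rough sketch of how these errors could be tallied per hour to see whether they line up with the time the RUNNING count stopped growing. The field layout is only inferred from the lines above (timestamp, three numeric fields that are not interpreted here, function name, message); the helper name `count_by_hour` and the regular expression are illustrative assumptions, not part of the scheduler.

```python
import re
from collections import Counter

# A few lines copied verbatim from the log excerpt above.
SAMPLE = """\
May 27 09:50:09 718 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 09:50:17 832 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 10:12:29 10780 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
May 27 13:37:03 12223 3 20 do_jobInfoReq: chanWrite_() failed, Connection reset by peer.
"""

# Assumed layout: "<Mon> <day> <HH:MM:SS> <num> <num> <num> <function>: <message>".
# The meaning of the three numeric fields is not assumed here.
LINE_RE = re.compile(
    r"^\w{3}\s+\d{1,2}\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r"\d+\s+\d+\s+\d+\s+(?P<func>\w+):\s+(?P<msg>.*)$"
)


def count_by_hour(text):
    """Count log lines per (hour, function name); lines that do not match are skipped."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        counts[(match["hour"], match["func"])] += 1
    return counts


if __name__ == "__main__":
    for (hour, func), n in sorted(count_by_hour(SAMPLE).items()):
        print(f"{hour}:00  {func}  {n}")
```

To run it on the real file, `count_by_hour(open(path).read())` can be called with the path to mbatchd.log on the master host.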
