Description
Ask your question here:
Hi,
We have a C++ gRPC service running, and we use Knative Serving to autoscale pods based on the number of incoming requests. Scale-up and scale-down work nicely with Knative, but some requests fail when they are routed to pods in the Terminating state: the client gets a stream error when the pod is killed at the end of the termination grace period while the request is still in progress.
This doesn't happen every time. We have observed that even when pods stay in the Terminating state for about 5 minutes, new requests arriving during this period usually go to other pods (or new pods get created), but at times new requests are routed into terminating pods, which causes the error.
We tried handling SIGTERM to do a server shutdown, but it didn't help much; we still see new requests going to terminating pods, and the error happening more frequently.
I want to understand how to make Knative stop sending new requests to a pod once it enters the Terminating state.
Your suggestions are highly appreciated.
Here is my service code:
std::unique_ptr<grpc::Server> server;
// Thread function that performs the graceful shutdown.
void doShutdown()
{
    cout << "Entering doShutdown" << endl;
    //getchar(); // press a key to shut down the thread
    auto deadline = std::chrono::system_clock::now() +
                    std::chrono::milliseconds(300);
    server->Shutdown(deadline);
    //server->Shutdown();
    std::cout << "Server is shutting down." << std::endl;
}
void signal_handler(int signal_num)
{
    //std::lock_guard<std::recursive_mutex> lock(server_mutex);
    cout << "The interrupt signal is (" << signal_num << "). \n";
    LOG_INFO(LogLayer::Application) << "The interrupt signal is " << signal_num;
    switch (signal_num)
    {
    case SIGINT:
        std::puts("It was SIGINT");
        LOG_INFO(LogLayer::Application) << "It was SIGINT called";
        break;
    case SIGTERM:
        std::puts("It was SIGTERM");
        LOG_INFO(LogLayer::Application) << "It was SIGTERM called";
        break;
    default:
        break;
    }
    // It terminates the program
    LOG_INFO(LogLayer::Application) << "Calling Server Shutdown ";
    cout << "Calling Server Shutdown" << endl;
    std::thread t = std::thread(doShutdown);
    LOG_INFO(LogLayer::Application) << "Call exit() ";
    cout << "Calling exit()" << endl;
    t.join();
    //exit(0);
}
int appMain(const variables_map &values)
{
    const auto port = boost::any_cast<std::string>(values[Services_Common_Options::PORT].value());
    MyServiceImpl my_service;

    grpc::EnableDefaultHealthCheckService(true);
    grpc::reflection::InitProtoReflectionServerBuilderPlugin();

    grpc::ServerBuilder builder;
    builder.AddChannelArgument(GRPC_ARG_KEEPALIVE_TIME_MS, 1000 * 60 * 1);
    builder.AddChannelArgument(GRPC_ARG_KEEPALIVE_TIMEOUT_MS, 1000 * 10);
    builder.AddChannelArgument(GRPC_ARG_HTTP2_MIN_SENT_PING_INTERVAL_WITHOUT_DATA_MS, 1000 * 10);
    builder.AddChannelArgument(GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA, 0);
    builder.AddChannelArgument(GRPC_ARG_KEEPALIVE_PERMIT_WITHOUT_CALLS, 1);

    //TODO: use a secure SSL connection
    builder.AddListeningPort(port, grpc::InsecureServerCredentials());
    // Register "service" as the instance through which we'll communicate with
    // clients. In this case it corresponds to a synchronous service.
    builder.RegisterService(&my_service);
    // Finally assemble the server.
    server = builder.BuildAndStart();
    LOG_INFO(LogLayer::Application) << SERVICE_NAME << " listening on " << port;

    /*std::signal(SIGTERM, signal_handler);
    std::signal(SIGSEGV, signal_handler);
    std::signal(SIGINT, signal_handler);
    std::signal(SIGABRT, signal_handler);*/

    // Wait for the server to shutdown. Note that some other thread must be
    // responsible for shutting down the server for this call to ever return.
    cout << "Server waiting" << endl;
    server->Wait();
    LOG_INFO(LogLayer::Application) << "Server Shutdown";
    cout << "Server exited" << endl;
    return 0;
}
And here is my Knative service YAML:
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: MyKnativeService
spec:
  template:
    metadata:
      name: MyKnativeService-rev1
      annotations:
        # Target 10 in-flight-requests per pod.
        #autoscaling.knative.dev/target: "1"
        container-concurrency-target-percentage: "80"
        autoscaling.knative.dev/targetUtilizationPercentage: "100"
        #autoscaling.knative.dev/metric: "concurrency"
        autoscaling.knative.dev/initialScale: "0"
        autoscaling.knative.dev/minScale: "0"
        autoscaling.knative.dev/maxScale: "100"
        autoscaling.knative.dev/scaleDownDelay: "3m"
    spec:
      containerConcurrency: 1
      containers:
        - name: MyKnativeService_container
          image: ppfaservice:latest
          imagePullPolicy: Always
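One revision-level knob worth checking in this context (a config sketch, not a verified fix) is Knative's `timeoutSeconds`, which caps how long a single request may run. Long-running streams that outlive the pod's termination grace period can still be cut off when the pod is finally killed, so the request timeout and the grace period need to be sized together:

spec:
  template:
    spec:
      # Maximum duration of a single request for this revision
      # (Knative revision field; the default is 300 seconds).
      timeoutSeconds: 300
      containerConcurrency: 1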