-
Notifications
You must be signed in to change notification settings - Fork 3.9k
Description
@xCASx reported a bug where gRPC would get "hung" after a point, which included the following stack trace:
Jul 22, 2019 7:37:47 PM io.grpc.internal.LogExceptionRunnable run
SEVERE: Exception while executing runnable io.grpc.internal.SharedResourceHolder$2@73cfe40e
io.grpc.netty.shaded.io.netty.channel.ChannelException: eventfd_write() failed: Bad file descriptor
at io.grpc.netty.shaded.io.netty.channel.epoll.Native.eventFdWrite(Native Method)
at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.wakeup(EpollEventLoop.java:167)
at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.shutdownGracefully(SingleThreadEventExecutor.java:603)
at io.grpc.netty.shaded.io.netty.util.concurrent.MultithreadEventExecutorGroup.shutdownGracefully(MultithreadEventExecutorGroup.java:163)
at io.grpc.netty.shaded.io.grpc.netty.Utils$DefaultEventLoopGroupResource.close(Utils.java:346)
at io.grpc.netty.shaded.io.grpc.netty.Utils$DefaultEventLoopGroupResource.close(Utils.java:318)
at io.grpc.internal.SharedResourceHolder$2.run(SharedResourceHolder.java:145)
at io.grpc.internal.LogExceptionRunnable.run(LogExceptionRunnable.java:43)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
If we look in SharedResourceHolder, if resource.close(instance)
fails throws instances.remove(resource)
is not run. While we shouldn't encourage exceptions to be thrown during close, we would like it to be able to recover eventually. In this case, any future resource fetches will get the partially-closed resource, which will immediately fail.
grpc-java/core/src/main/java/io/grpc/internal/SharedResourceHolder.java
Lines 145 to 146 in d7b9438
resource.close(instance); | |
instances.remove(resource); |
@xCASx, a workaround would be to keep the client objects alive for as long as possible. We generally encourage that for performance, but as long as one client object is alive we won't attempt to shut down this executor.