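Seeing this title, a sharp-eyed reader may well think: isn't this just a deadlock? But the situation discussed today is not a deadlock in the strict sense; at most it is a deadlock in a broader sense. On top of that, jstack shows no deadlock information for it.
# Threads waiting on each other because of improper thread pool usage
# Today's example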
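    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.Future;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class Test {
        // One core thread, at most one thread, unbounded queue
        static ThreadPoolExecutor threadPoolExecutor =
                new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());

        public static void main(String[] args) throws ExecutionException, InterruptedException {
            Future<String> outterFuture = threadPoolExecutor.submit(() -> {
                // The inner task goes to the same pool
                Future<String> innerFuture = threadPoolExecutor.submit(() -> {
                    System.out.println("inner finish");
                    return "inner finish";
                });
                // Blocks the pool's only worker thread
                String s = innerFuture.get();
                System.out.println("outter get inner finish: " + s);
                System.out.println("outter finish");
                return "outter finish";
            });
            String s = outterFuture.get();
            System.out.println("process get outter finish: " + s);
        }
    }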
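In other words, task 1 is submitted to the pool, task 1 submits task 2 from inside itself, and task 1 then waits for task 2's result. Some readers will spot the problem right away. This is of course a simplified version; in real code the thread pool usage can be far more obscure. Running this method immediately leaves the two tasks waiting on each other.
# What jstack shows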
    2020-09-12 09:52:41
    Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode):

    "Attach Listener" #11 daemon prio=9 os_prio=0 tid=0x00007fbf38001000 nid=0x37c waiting on condition [0x0000000000000000]
       java.lang.Thread.State: ...

    ... 9 os_prio=0 tid=0x00007fbf980d2000 nid=0x7930 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C1 CompilerThread3" #8 daemon prio=9 os_prio=0 tid=0x00007fbf980c7000 nid=0x792f waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007fbf980c4800 nid=0x792e waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007fbf980c3000 nid=0x792d waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007fbf980c0000 nid=0x792c waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007fbf980be800 nid=0x792b runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fbf9808b800 nid=0x792a in Object.wait() [0x00007fbf84371000]
       java.lang.Thread.State: WAITING (on object monitor)
            at java.lang.Object.wait(Native Method)
            - waiting on <0x00000006c8e01a60> (a java.lang.ref.ReferenceQueue$Lock)
            at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
            - locked <0x00000006c8e01a60> (a java.lang.ref.ReferenceQueue$Lock)
            at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
            at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

    "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fbf98086800 nid=0x7929 in Object.wait() [0x00007fbf84472000]
       java.lang.Thread.State: WAITING (on object monitor)
            at java.lang.Object.wait(Native Method)
            - waiting on <0x00000006c8e0f950> (a java.lang.ref.Reference$Lock)
            at java.lang.Object.wait(Object.java:502)
            at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
            - locked <0x00000006c8e0f950> (a java.lang.ref.Reference$Lock)
            at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

    "main" #1 prio=5 os_prio=0 tid=0x00007fbf98008800 nid=0x791e waiting on condition [0x00007fbf9e635000]
       java.lang.Thread.State: WAITING (parking)
            at sun.misc.Unsafe.park(Native Method)
            - parking to wait for <0x00000006c8e177b8> (a java.util.concurrent.FutureTask)
            at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
            at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
            at java.util.concurrent.FutureTask.get(FutureTask.java:191)
            at Test.main(Test.java:31)

    "VM Thread" os_prio=0 tid=0x00007fbf9807f000 nid=0x7928 runnable

    "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fbf9801d800 nid=0x791f runnable
    "GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fbf9801f800 nid=0x7920 runnable
    "GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007fbf98021800 nid=0x7921 runnable
    "GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007fbf98023000 nid=0x7922 runnable
    "GC task thread#4 (ParallelGC)" os_prio=0 tid=0x00007fbf98025000 nid=0x7923 runnable
    "GC task thread#5 (ParallelGC)" os_prio=0 tid=0x00007fbf98027000 nid=0x7925 runnable
    "GC task thread#6 (ParallelGC)" os_prio=0 tid=0x00007fbf98028800 nid=0x7926 runnable
    "GC task thread#7 (ParallelGC)" os_prio=0 tid=0x00007fbf9802a800 nid=0x7927 runnable

    "VM Periodic Task Thread" os_prio=0 tid=0x00007fbf980d5000 nid=0x7931 waiting on condition

    JNI global references: 201

jstack did not report a deadlock on its own. In a real system, with many more business and component threads in the dump, the situation is even harder to diagnose.
# Thread pool parameters explained
Below is the ThreadPoolExecutor constructor that takes the most parameters:
    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue,
                              ThreadFactory threadFactory,
                              RejectedExecutionHandler handler)
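The parameters have the following meanings (search online for the finer details):
* corePoolSize: the number of core threads in the pool
* maximumPoolSize: the maximum number of threads in the pool
* keepAliveTime: how long an idle thread is kept alive
* unit: the time unit of keepAliveTime
* workQueue: the buffer queue the pool uses to hold tasks that are waiting to run
* threadFactory: the factory the pool uses to create threads
* handler: the pool's policy for handling rejected tasks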
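To make the parameter list concrete, here is a minimal sketch that passes all seven arguments explicitly and then looks at what the pool does with a second task while its only worker is busy. The class name `PoolParamsDemo` and its helper methods are made up for illustration; the thread factory and rejection policy shown are, as far as I know, the JDK defaults that the shorter constructors fall back to.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the original example.
public class PoolParamsDemo {

    // Same sizing as the example in this post, with all seven arguments spelled out.
    static ThreadPoolExecutor newSingleWorkerPool() {
        return new ThreadPoolExecutor(
                1,                                    // corePoolSize
                1,                                    // maximumPoolSize
                0L,                                   // keepAliveTime
                TimeUnit.MILLISECONDS,                // unit
                new LinkedBlockingQueue<Runnable>(),  // workQueue (unbounded)
                Executors.defaultThreadFactory(),     // threadFactory
                new ThreadPoolExecutor.AbortPolicy()  // handler
        );
    }

    // Submits a second task while the first still occupies the only worker.
    // Returns {pool size, number of queued tasks} observed at that moment.
    static int[] snapshotWhileBusy() {
        ThreadPoolExecutor pool = newSingleWorkerPool();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            pool.submit(() -> {
                started.countDown();
                release.await();              // keep the worker busy until released
                return null;
            });
            started.await();                  // the only worker is now occupied
            pool.submit(() -> "second");      // cannot start, so it goes to the queue
            return new int[] { pool.getPoolSize(), pool.getQueue().size() };
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            release.countDown();              // let the first task finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] s = snapshotWhileBusy();
        System.out.println("threads=" + s[0] + ", queued=" + s[1]);
    }
}
```

With an unbounded queue the pool never grows beyond its core size, so the second task simply waits in `workQueue` until the single worker becomes free.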
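# Cause analysis 1
In the example both the core pool size and the maximum pool size are 1, so the pool can run only one task at a time; tasks that cannot run yet wait in the queue. That is where the trouble starts. Submitting the `outter` task takes up the single core thread. Inside it, `outter` submits the `inner` task and waits for its result. But `inner` never runs: it sits in the queue and can only be picked up once the worker thread is free, and that worker is the one running `outter`. So `outter` occupies the pool's entire capacity while waiting for `inner`'s result, and `inner` waits for the pool to run it and therefore never returns a result.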
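The analysis can be checked with a small sketch. It uses `get` with a timeout so that the program does not hang, and compares running the inner task on the same single-thread pool against running it on a second pool. `StarvationDemo`, `singleWorker` and `outerWaitsForInner` are hypothetical names, and the 500 ms timeout is an arbitrary choice for the demo, not a recommendation.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class, not part of the original example.
public class StarvationDemo {

    // Same configuration as the pool in this post: one worker, unbounded queue.
    static ExecutorService singleWorker() {
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    // The outer task submits the inner task to innerPool and waits up to 500 ms for it.
    static String outerWaitsForInner(ExecutorService outerPool, ExecutorService innerPool) {
        Future<String> outer = outerPool.submit(() -> {
            Future<String> inner = innerPool.submit(() -> "inner finish");
            try {
                return inner.get(500, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                inner.cancel(false);          // the inner task never got a worker
                return "timed out";
            }
        });
        try {
            return outer.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService shared = singleWorker();
        ExecutorService separate = singleWorker();
        try {
            // Inner task queued behind the outer task on the same single worker
            System.out.println("same pool:      " + outerWaitsForInner(shared, shared));
            // Inner task runs on its own pool, so the outer task gets its result
            System.out.println("separate pools: " + outerWaitsForInner(shared, separate));
        } finally {
            shared.shutdown();
            separate.shutdown();
        }
    }
}
```

Giving dependent tasks their own pool is only one possible way out; the point of the sketch is that the wait disappears as soon as the inner task no longer competes for the worker that is waiting for it.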
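# Cause analysis 2
The jstack log does in fact contain the clue, though not where it first seems to be. In the threads named Reference Handler and Finalizer, the `waiting on` and `locked` addresses are identical, which looks as if each thread were waiting on itself. That pattern is normal, however: a thread has to own a monitor before it can call `Object.wait()`, and it gives the monitor up while it waits, so these JVM housekeeping threads look the same in any healthy process (they appear in the same state in the deadlock dump later in this post). The thread to look at is `main`: it is parked in `FutureTask.get` (`parking to wait for <0x00000006c8e177b8> (a java.util.concurrent.FutureTask)`), waiting for `outter`, while the pool's only worker is blocked in the same way inside `innerFuture.get()`.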
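# Deadlock
Let's first review what a deadlock is.
# Deadlock conditions
1. Mutual exclusion: while a resource is being used (held) by one thread, no other thread can use it.
2. No preemption: a requester cannot forcibly take a resource away from its holder; only the holder can release it voluntarily.
3. Hold and wait: a requester keeps the resources it already holds while requesting others.
4. Circular wait: there is a cycle of waiters, for example P1 holds a resource P2 needs, P2 holds a resource P3 needs, and P3 holds a resource P1 needs.
# Deadlock example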
    public class DeadLock implements Runnable {
        private static Object obj1 = new Object();
        private static Object obj2 = new Object();
        private boolean flag;

        public DeadLock(boolean flag) {
            this.flag = flag;
        }

        @Override
        public void run() {
            System.out.println(Thread.currentThread().getName() + " running");
            if (flag) {
                synchronized (obj1) {
                    System.out.println(Thread.currentThread().getName() + " has locked obj1");
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    synchronized (obj2) {
                        // Never reached
                        System.out.println("After 1 second, " + Thread.currentThread().getName() + " locked obj2");
                    }
                }
            } else {
                synchronized (obj2) {
                    System.out.println(Thread.currentThread().getName() + " has locked obj2");
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    synchronized (obj1) {
                        // Never reached
                        System.out.println("After 1 second, " + Thread.currentThread().getName() + " locked obj1");
                    }
                }
            }
        }

        public static void main(String[] args) {
            // Thread names are kept as in the original so that they match the jstack output
            Thread t1 = new Thread(new DeadLock(true), "线程1");
            Thread t2 = new Thread(new DeadLock(false), "线程2");
            t1.start();
            t2.start();
        }
    }
# What jstack shows
    Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.171-b11 mixed mode):

    "DestroyJavaVM" #13 prio=5 os_prio=0 tid=0x0000000003866000 nid=0x2ffc waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "线程2" #12 prio=5 os_prio=0 tid=0x000000001e6b8000 nid=0x20e4 waiting for monitor entry [0x000000001f8bf000]
       java.lang.Thread.State: BLOCKED (on object monitor)
            at com.wp.security.springboot.DeadLock.run(DeadLock.java:42)
            - waiting to lock <0x000000076b47b980> (a java.lang.Object)
            - locked <0x000000076b47b990> (a java.lang.Object)
            at java.lang.Thread.run(Thread.java:748)

    "线程1" #11 prio=5 os_prio=0 tid=0x000000001eec8800 nid=0x11d8 waiting for monitor entry [0x000000001f7bf000]
       java.lang.Thread.State: BLOCKED (on object monitor)
            at com.wp.security.springboot.DeadLock.run(DeadLock.java:28)
            - waiting to lock <0x000000076b47b990> (a java.lang.Object)
            - locked <0x000000076b47b980> (a java.lang.Object)
            at java.lang.Thread.run(Thread.java:748)

    "Service Thread" #10 daemon prio=9 os_prio=0 tid=0x000000001e607000 nid=0x3888 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C1 CompilerThread2" #9 daemon prio=9 os_prio=2 tid=0x000000001e57c800 nid=0x1a1c waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread1" #8 daemon prio=9 os_prio=2 tid=0x000000001e56f000 nid=0x37b4 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread0" #7 daemon prio=9 os_prio=2 tid=0x000000001e56e800 nid=0x1eb0 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "Monitor Ctrl-Break" #6 daemon prio=5 os_prio=0 tid=0x000000001e56a800 nid=0x2298 runnable [0x000000001e9be000]
       java.lang.Thread.State: RUNNABLE
            at java.net.SocketInputStream.socketRead0(Native Method)
            at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
            at java.net.SocketInputStream.read(SocketInputStream.java:171)
            at java.net.SocketInputStream.read(SocketInputStream.java:141)
            at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
            at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
            at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
            - locked <0x000000076b4cf910> (a java.io.InputStreamReader)
            at java.io.InputStreamReader.read(InputStreamReader.java:184)
            at java.io.BufferedReader.fill(BufferedReader.java:161)
            at java.io.BufferedReader.readLine(BufferedReader.java:324)
            - locked <0x000000076b4cf910> (a java.io.InputStreamReader)
            at java.io.BufferedReader.readLine(BufferedReader.java:389)
            at com.intellij.rt.execution.application.AppMainV2$1.run(AppMainV2.java:61)

    "Attach Listener" #5 daemon prio=5 os_prio=2 tid=0x000000001cf8a000 nid=0x1e84 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "Signal Dispatcher" #4 daemon prio=9 os_prio=2 tid=0x000000001cf74000 nid=0x2330 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "Finalizer" #3 daemon prio=8 os_prio=1 tid=0x000000001cf4e800 nid=0x4168 in Object.wait() [0x000000001e2bf000]
       java.lang.Thread.State: WAITING (on object monitor)
            at java.lang.Object.wait(Native Method)
            - waiting on <0x000000076b208ed0> (a java.lang.ref.ReferenceQueue$Lock)
            at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
            - locked <0x000000076b208ed0> (a java.lang.ref.ReferenceQueue$Lock)
            at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
            at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:212)

    "Reference Handler" #2 daemon prio=10 os_prio=2 tid=0x0000000003956000 nid=0x3478 in Object.wait() [0x000000001e1bf000]
       java.lang.Thread.State: WAITING (on object monitor)
            at java.lang.Object.wait(Native Method)
            - waiting on <0x000000076b206bf8> (a java.lang.ref.Reference$Lock)
            at java.lang.Object.wait(Object.java:502)
            at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
            - locked <0x000000076b206bf8> (a java.lang.ref.Reference$Lock)
            at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

    "VM Thread" os_prio=2 tid=0x000000001cf27000 nid=0x47a4 runnable

    "GC task thread#0 (ParallelGC)" os_prio=0 tid=0x000000000387b800 nid=0x1ec8 runnable
    "GC task thread#1 (ParallelGC)" os_prio=0 tid=0x000000000387d000 nid=0x47a0 runnable
    "GC task thread#2 (ParallelGC)" os_prio=0 tid=0x000000000387e800 nid=0x3364 runnable
    "GC task thread#3 (ParallelGC)" os_prio=0 tid=0x0000000003881800 nid=0x4848 runnable

    "VM Periodic Task Thread" os_prio=2 tid=0x000000001e5e5800 nid=0x1318 waiting on condition

    JNI global references: 12

    Found one Java-level deadlock:
    =============================
    "线程2":
      waiting to lock monitor 0x000000001cf4b598 (object 0x000000076b47b980, a java.lang.Object),
      which is held by "线程1"
    "线程1":
      waiting to lock monitor 0x000000001cf4ded8 (object 0x000000076b47b990, a java.lang.Object),
      which is held by "线程2"

    Java stack information for the threads listed above:
    ===================================================
    "线程2":
            at com.wp.security.springboot.DeadLock.run(DeadLock.java:42)
            - waiting to lock <0x000000076b47b980> (a java.lang.Object)
            - locked <0x000000076b47b990> (a java.lang.Object)
            at java.lang.Thread.run(Thread.java:748)
    "线程1":
            at com.wp.security.springboot.DeadLock.run(DeadLock.java:28)
            - waiting to lock <0x000000076b47b990> (a java.lang.Object)
            - locked <0x000000076b47b980> (a java.lang.Object)
            at java.lang.Thread.run(Thread.java:748)

    Found 1 deadlock.
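Here the resources listed after `waiting to lock` and `locked` for 线程1 (thread 1) and 线程2 (thread 2) make the picture obvious at a glance: each thread is waiting for the object the other has locked. The end of the jstack output also says so explicitly: `Found one Java-level deadlock`.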
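# Why jstack cannot detect this on its own
In the thread pool example no thread is blocked trying to acquire a lock that another thread holds, so strictly speaking it is not a deadlock and jstack reports none. The worker is parked waiting on a `FutureTask`, which has no owning thread for jstack to follow. The deadlock example, by contrast, is clear-cut: two threads each hold one lock and try to take the other's. That is a true deadlock, and jstack finds it.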
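The difference can also be observed from inside the JVM with `ThreadMXBean.findDeadlockedThreads()`, which, like jstack's deadlock report, only covers threads blocked on object monitors or ownable synchronizers. The sketch below is a hypothetical `DeadlockCheckDemo` class that assumes a HotSpot-style JVM where this call is supported; it stages both situations and asks the JVM how many deadlocked threads it sees. The monitor deadlock uses daemon threads so the process can still exit.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not part of the original example.
public class DeadlockCheckDemo {

    // Number of threads the JVM itself reports as deadlocked (0 when it finds none).
    static int reportedDeadlockedThreads() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    // The thread pool situation: the only worker waits for a task queued behind it.
    static int checkStarvedPool() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CountDownLatch innerQueued = new CountDownLatch(1);
        try {
            pool.submit(() -> {
                Future<String> inner = pool.submit(() -> "inner finish");
                innerQueued.countDown();
                return inner.get();           // blocks until the pool is shut down
            });
            innerQueued.await();
            Thread.sleep(300);                // give the worker time to park in get()
            return reportedDeadlockedThreads();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();               // interrupts the stuck worker
        }
    }

    // A classic two-lock deadlock on daemon threads.
    static int checkMonitorDeadlock() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startDaemon(() -> lockInOrder(a, b, bothHoldFirst));
        startDaemon(() -> lockInOrder(b, a, bothHoldFirst));
        try {
            // Poll for up to about 5 seconds until the JVM reports the cycle
            for (int i = 0; i < 50 && reportedDeadlockedThreads() == 0; i++) {
                Thread.sleep(100);
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return reportedDeadlockedThreads();
    }

    static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await();        // wait until both threads hold their first lock
            } catch (InterruptedException e) {
                return;
            }
            synchronized (second) {
                System.out.println("never reached");
            }
        }
    }

    static void startDaemon(Runnable r) {
        Thread t = new Thread(r);
        t.setDaemon(true);
        t.start();
    }

    public static void main(String[] args) {
        // Run the pool check first: the monitor deadlock stays in the JVM once created.
        System.out.println("starved pool:     " + checkStarvedPool());
        System.out.println("monitor deadlock: " + checkMonitorDeadlock());
    }
}
```

On the JVMs I would expect this to run on, the starved pool yields 0 reported threads and the monitor deadlock yields 2, which matches what the two jstack dumps in this post show.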
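# How to spot deadlock-like mutual waiting
When something like this happens and jstack gives no hint, it is genuinely hard to find the problem just by reading through the business threads. I compared the thread dumps of the two examples and noticed the keywords `waiting on`, `waiting to lock`, `parking to wait for` and `locked`, so I looked them up.
* `waiting on condition` means a conditional wait that is not `Object.wait`, for example after calling sleep or park
* `parking to wait for` means the thread has called park
* `waiting to lock` means the thread is waiting to acquire a lock object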
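My guess is that jstack detects the deadlock in the second example from the `waiting to lock` and `locked` entries: each thread holds a lock that the other is waiting to acquire, which is a deadlock in the strict sense. A `waiting on` and a `locked` entry with the same address, by contrast, is simply what a thread inside `Object.wait()` looks like, and it shows up in both dumps. What the thread pool case actually produces is `parking to wait for <...> (a java.util.concurrent.FutureTask)`. So to spot this kind of deadlock-like mutual waiting, look for threads parked in `FutureTask.get` and check whether the task they are waiting for is still sitting in the queue of the very pool they are running on.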
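As a rough illustration of hunting for this kind of wait without jstack's help, the following sketch scans live threads for ones that are `WAITING` with `FutureTask.get` on their stack, which is the frame that appears in the `main` thread of the first dump. `FutureWaitScan`, its methods and the thread name `demo-worker` are all made up for this example, and the check is only a heuristic: a thread waiting on a future is perfectly normal unless the task it waits for can never be scheduled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the original example.
public class FutureWaitScan {

    // Names of live threads that are WAITING with FutureTask.get on their stack.
    // A timed get(...) would show up as TIMED_WAITING and is not covered here.
    static List<String> threadsParkedInFutureGet() {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            if (e.getKey().getState() != Thread.State.WAITING) {
                continue;
            }
            for (StackTraceElement frame : e.getValue()) {
                if ("java.util.concurrent.FutureTask".equals(frame.getClassName())
                        && "get".equals(frame.getMethodName())) {
                    names.add(e.getKey().getName());
                    break;
                }
            }
        }
        return names;
    }

    // Recreates the situation from this post on a pool whose worker is named "demo-worker".
    static List<String> scanStarvedPool() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(), r -> new Thread(r, "demo-worker"));
        try {
            pool.submit(() -> {
                Future<String> inner = pool.submit(() -> "inner finish");
                return inner.get();           // the only worker waits for a queued task
            });
            List<String> found = threadsParkedInFutureGet();
            // Poll for up to about 5 seconds until the worker has parked
            for (int i = 0; i < 50 && !found.contains("demo-worker"); i++) {
                Thread.sleep(100);
                found = threadsParkedInFutureGet();
            }
            return found;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();               // interrupts the stuck worker
        }
    }

    public static void main(String[] args) {
        System.out.println("parked in FutureTask.get: " + scanStarvedPool());
    }
}
```

In a real service, a finding like this would still need to be combined with pool state, for example a non-empty `getQueue()` while every worker of that pool is parked, before concluding that the pool has starved itself.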