记一次跟踪php进程的代码问题

记一次追踪php进程bug

现象:
项目中bin/invite_email.php这个后台运行的进程,每10分钟重启一次来保证后台进程可以有机会执行到更新后的代码.但是下午发现 invite_email.php这个进程的启动时间是上午的9:20分.以致lpush到redis后,这个进程应该lpop一直没有执行.

跟踪:
pid: 22978,

<code>[root@ Dianji_Web_12 /proc/22978]# strace -p 22978
Process 22978 attached - interrupt to quit
clock_gettime(CLOCK_MONOTONIC, {25088659, 736169268}) = 0
poll([{fd=8, events=POLLIN|POLLPRI}], 1, 836) = 0 (Timeout)
poll([{fd=8, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
clock_gettime(CLOCK_MONOTONIC, {25088660, 572234268}) = 0
clock_gettime(CLOCK_MONOTONIC, {25088660, 572302268}) = 0
clock_gettime(CLOCK_MONOTONIC, {25088660, 572461268}) = 0
poll([{fd=8, events=POLLIN|POLLPRI}], 1, 1000) = 0 (Timeout)
poll([{fd=8, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
clock_gettime(CLOCK_MONOTONIC, {25088661, 572976268}) = 0
clock_gettime(CLOCK_MONOTONIC, {25088661, 573026268}) = 0
clock_gettime(CLOCK_MONOTONIC, {25088661, 573073268}) = 0
poll([{fd=8, events=POLLIN|POLLPRI}], 1, 1000 &lt;unfinished ...&gt;
</code>

一直是poll状态,文件描述符是8

<code>[root@ Dianji_Web_12 /proc/22978]# ll /proc/22978/fd/
总计 0
lr-x------ 1 root root 64 02-24 16:08 0 -&gt; /dev/null
l-wx------ 1 root root 64 02-24 16:08 1 -&gt; /tmp/invite_mail_xnw.log
l-wx------ 1 root root 64 02-24 16:08 2 -&gt; /tmp/invite_mail_xnw.log
lrwx------ 1 root root 64 02-24 16:08 3 -&gt; socket:[1167074663]
lrwx------ 1 root root 64 02-24 16:08 4 -&gt; socket:[1167074664]
lrwx------ 1 root root 64 02-24 16:08 5 -&gt; socket:[1167074665]
lrwx------ 1 root root 64 02-24 16:08 6 -&gt; socket:[1167075855]
lrwx------ 1 root root 64 02-24 16:08 7 -&gt; socket:[1167075856]
lrwx------ 1 root root 64 02-24 16:08 8 -&gt; socket:[1167085181]
</code>

上面的结果知道他是一个socket

<code>[root@ Dianji_Web_12 /proc/22978]# lsof -p 22978
...........此处省略n行..
php     22978 root    2w   REG        8,3   873908    3808044 /tmp/invite_mail_xnw.log
php     22978 root    3u  IPv4 1167074663                 TCP localhost:35653-&gt;localhost:entextxid (ESTABLISHED)
php     22978 root    4u  IPv4 1167074664                 TCP localhost:45819-&gt;push.lava.cn:6381 (CLOSE_WAIT)
php     22978 root    5u  IPv4 1167074665                 TCP localhost:45820-&gt;push.lava.cn:6381 (CLOSE_WAIT)
php     22978 root    6u  sock        0,5          1167075855 can't identify protocol
php     22978 root    7u  IPv4 1167075856                 TCP localhost:44916-&gt;localhost:13001 (ESTABLISHED)
php     22978 root    8u  IPv4 1167085181                 TCP 117.79.236.12:35050-&gt;211.153.75.9:http (ESTABLISHED)
</code>

看lsof的最后一行是文件描述符8所打开的socket,是连接到211.153.75.9的80端口,但是一直处 于读取数据状态.也就是说211.153.75.9这个http server没有返回数据.有了这条线,就好办了.这个陌生的ip肯定是通过域名解析得到的.仔细排查,发现是curl到api.eduyun.cn导致 的.检查curl这段代码.原来是忘加超时时间了..

<code>    curl_setopt($ch, CURLOPT_TIMEOUT, 4); 
</code>

引以为诫..