这是很久以前遇到的一个问题,做下记录
之前公司网站总出现502,进过排查发现,是因为php-fpm配置的原因。
502 一般是服务器资源耗尽,导致nginx(我们用的是nginx)无法创建新的进程(限制于php-fpm的配置),无法提供服务。
这里主要介绍几个参数
pm = dynamic pm.max_children = 100 pm.start_servers = 30 pm.min_spare_servers = 30 pm.max_spare_servers = 100 pm.max_requests = 500 request_terminate_timeout = 100 request_slowlog_timeout = 5s slowlog = /home/wwwlogs/slow.log pm.status_path = /nginx_status
dynamic说明nginx是动态增加进程的,最大是max_children=100(502就是因为同时有100个进程在使用,nginx无法提供服务而返回的),但这个参数也要根据服务器内存来设置,一般来说一个进程占用20~30M,可以大概算一下。若你的并发链接经常大于这个数,你就要考虑换大一点的内存了(在加了max_requests参数的前提下)。
重点:max_requests表示每个进程使用多少次后关闭。因为nginx每个进程打开后都会存在一个类似进程池的东西里面,不会释放内存,这样就会造成内存浪费,而且一个进程使用久后可能会内存泄漏,max_requests就是用来解决这个问题的,后面的500可以自行设置,不可太小,太小也会造成502,最好是在每天凌晨1点reload一下nginx。
假设服务器可用内存为 8G
如何查看可用内存?
free -m total used free shared buffers cached Mem: 988 927 61 0 56 348 -/+ buffers/cache: 522 466 Swap: 1983 152 1831
以我的1G虚拟机为例
实际可用内存为:buffers+cached+free = 462 ,还有462M可以用(其实看第二行466也是可使用内存)
单个进程消耗内存为 20M
如何查看每个fpm进程占用的内存大小?
ps --no-headers -o "rss,cmd" -C php-fpm | awk '{ sum+=$1 } END { printf ("%d%s\n", sum/NR/1024,"M") }'
20M8 *1024 / 20 = 409.6
那么 max_children 设为400即可(注意,这里的8要为上面计算出的可用内存)
配置 pm.status_path 来实时查看进程使用情况(生产环境不推荐使用)
pm.status_path = /nginx_status
nginx 配置中做相应配置
server{
listen 80 ;
location ~ ^/(nginx_status)$ {
include fastcgi_params;
fastcgi_pass unix:/tmp/php-cgi.sock;
fastcgi_param SCRIPT_FILENAME $fastcgi_script_name;
}
}fastcgi_params 文件内容(将它放到配置文件的同级目录)
fastcgi_param QUERY_STRING $query_string; fastcgi_param REQUEST_METHOD $request_method; fastcgi_param CONTENT_TYPE $content_type; fastcgi_param CONTENT_LENGTH $content_length; fastcgi_param SCRIPT_NAME $fastcgi_script_name; fastcgi_param REQUEST_URI $request_uri; fastcgi_param DOCUMENT_URI $document_uri; fastcgi_param DOCUMENT_ROOT $document_root; fastcgi_param SERVER_PROTOCOL $server_protocol; fastcgi_param HTTPS $https if_not_empty; fastcgi_param GATEWAY_INTERFACE CGI/1.1; fastcgi_param SERVER_SOFTWARE nginx/$nginx_version; fastcgi_param REMOTE_ADDR $remote_addr; fastcgi_param REMOTE_PORT $remote_port; fastcgi_param SERVER_ADDR $server_addr; fastcgi_param SERVER_PORT $server_port; fastcgi_param SERVER_NAME $server_name; # PHP only, required if PHP was built with --enable-force-cgi-redirect fastcgi_param REDIRECT_STATUS 200; # PHP only, required if PHP was built with --enable-force-cgi-redirect fastcgi_param REDIRECT_STATUS 200;
重启nginx ,访问 http://192.168.1.136/nginx_status,其中 192.168.1.136改为你的ip地址即可
有四种输出格式 full,json,xml,html
http://192.168.1.136/nginx_status?full 将显示所有进程的详细信息
pool: www process manager: dynamic start time: 22/Nov/2017:10:07:06 +0800 start since: 2878 accepted conn: 62 listen queue: 0 max listen queue: 0 listen queue len: 0 idle processes: 10 active processes: 1 total processes: 11 max active processes: 1 max children reached: 0 slow requests: 0 ************************ pid: 48176 state: Idle start time: 22/Nov/2017:10:07:06 +0800 start since: 2878 requests: 6 request duration: 534 request method: GET request URI: /nginx_status?json&full content length: 0 user: - script: /nginx_status last request cpu: 0.00 last request memory: 262144 ************************ pid: 48177 state: Running start time: 22/Nov/2017:10:07:06 +0800 start since: 2878 requests: 7 request duration: 141 request method: GET request URI: /nginx_status?full content length: 0 user: - script: /nginx_status last request cpu: 0.00 last request memory: 0 ....
其中 full 又可以和其它三个参数并用如
http://192.168.1.136/nginx_status?json&full 将以json的方式显示输出所有的进程信息
自己写了个分析统计代码,以我的虚拟机为例
$j = myCurl('http://192.168.1.136/nginx_status?json&full');
$arr = json_decode($j, true);
$requestTime = !empty($_GET['rtime']) ? $_GET['rtime'] : '';
$requests = !empty($_GET['rs']) ? $_GET['rs'] : '';
$pmsg = array(
'pool' => 'fpm池子名称,大多数为www',
'process manager' => '进程管理方式,值:static, dynamic or ondemand. dynamic',
'start time' => '启动日期,如果reload了php-fpm,时间会更新',
'start since' => '运行时长',
'accepted conn' => '当前池子接受的请求数',
'listen queue' => '请求等待队列,如果这个值不为0,那么要增加FPM的进程数量',
'max listen queue' => '请求等待队列最高的数量',
'listen queue len' => 'socket等待队列长度',
'idle processes' => '空闲进程数量',
'active processes' => '活跃进程数量',
'total processes' => '总进程数量',
'max active processes' => '最大的活跃进程数量(FPM启动开始算)',
'max children reached' => '大道进程最大数量限制的次数,如果这个数量不为0,那说明你的最大进程数量太小了,请改大一点。',
'slow requests' => '慢请求'
);
$pinfomsg = array(
'pid' => '进程PID,可以单独kill这个进程',
'state' => '当前进程的状态',
'start time' => '进程启动的日期',
'start since' => '当前进程运行时长',
'requests' => '当前进程处理了多少个请求',
'request duration' => '请求时长(微妙)',
'request method' => '请求方法 (GET, POST, …)',
'request uri' => '请求URI',
'content length' => '请求内容长度 (仅用于 POST)',
'user' => '用户 (PHP_AUTH_USER) (or ‘-’ 如果没设置)',
'script' => 'PHP脚本',
'last request cpu' => '最后一个请求CPU使用率。',
'last request memory' => '上一个请求使用的内存'
);
$run = array();
foreach ($arr as $k => $p) {
if ($k == 'processes') {
$red = '';
if ($requestTime) {
echo '---------------------------------------------------------------请求时长------------------------------------------------------------------------<br>';
$processes = _multi_array_sort($p, 'request duration', SORT_DESC);
$red = 'request duration';
} elseif ($requests) {
echo '-------------------------------------------------------------进程使用次数----------------------------------------------------------------------<br>';
$processes = _multi_array_sort($p, 'requests', SORT_DESC);
$red = 'requests';
} else {
$processes = $p;
}
foreach ($processes as $pinfo) {
if (strtolower($pinfo['state']) == 'running') {
$run[] = $pinfo;
}
$script[] = $pinfo['script'];
$scriptUrl[$pinfo['script']][] = urldecode($pinfo['request uri']);
echo '-----------------------------------------------------------------------------------------------------------------------------------------------<br>';
foreach ($pinfo as $k1 => $pinfo1) {
switch ($k1) {
case 'request duration':
$pinfo2 = ($pinfo1 / 1000000) . ' 秒';
$light = $red == 'request duration' ? 1 : 0;
break;
case 'start time':
$pinfo2 = date('Y-m-d H:i:s', $pinfo1);
break;
case 'start since':
$pinfo2 = time2day($pinfo1);
break;
case 'requests':
$pinfo2 = $pinfo1;
$light = $red == 'requests' ? 1 : 0;
break;
case 'request uri':
$pinfo2 = urldecode($pinfo1);
break;
default :
$pinfo2 = $pinfo1;
$light = '';
break;
}
style($pinfo2, $pinfomsg[$k1], $light);
}
}
} else {
$pred = '';
switch ($k) {
case 'start time':
$p1 = date('Y-m-d H:i:s', $p);
break;
case 'start since':
$p1 = time2day($p);
break;
case 'listen queue':
$p1 = $p;
$pred = 1;
break;
case 'max children reached':
$p1 = $p;
$pred = 1;
break;
default :
$p1 = $p;
break;
}
style($p1, $pmsg[$k], $pred);
}
}
echo '-----------------------------------------访问路径------------------------------------------------------';
echo "<pre>";
print_r(array_count_values($script));
print_r($scriptUrl);
echo "</pre>";
echo '-----------------------------------------正在执行------------------------------------------------------';
echo "<pre>";
print_r($run);
echo "</pre>";
function style($val, $k, $redk = '') {
$style = '';
if ($redk) {
$style = 'style="color:red;"';
}
echo '<div ' . $style . '><span style="display:inline-block;width:40%;">' . $val . '</span><span style="display:inline-block;width:60%;">' . $k . '</span></div>';
}
function time2day($str) {
$min = $str / 60;
$hours = $min / 60;
$days = floor($hours / 24);
$hours = floor($hours - ($days * 24));
$min = floor($min - ($days * 60 * 24) - ($hours * 60));
$s = $str % 60;
if ($days !== 0)
$day = $days . " 天 ";
if ($hours !== 0)
$day .= $hours . " 小时 ";
$day .= $min . " 分钟 ";
$day .= $s . " 秒 ";
return $day;
}
function _multi_array_sort($multi_array, $sort_key, $sort = SORT_ASC) {
if (is_array($multi_array)) {
foreach ($multi_array as $row_array) {
if (is_array($row_array)) {
$key_array[] = $row_array[$sort_key];
} else {
return false;
}
}
} else {
return false;
}
array_multisort($key_array, $sort, $multi_array);
return $multi_array;
}
function myCurl($url) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$dataBlock = curl_exec($ch);
curl_close($ch);
return $dataBlock;
}具体的对应信息上面已经有了,还做了访问路径/页面的统计,最后显示出正在执行的进程
效果图

简单的统计
