computer's Diary

Posts about Cisco, SHELL, C, Qt, C++, Linux, networking, Windows Script, and more

Evaluating my own server with sar-analyzer, a tool I wrote myself

So, I gave it a try.

The target is a low-spec machine on AWS.

#### Report by sar-analyzer ####

-- Report of CPU utilization --

Highest Average value of '%usr(%user)' for CPU all is 21.07 (01/10/18)
Lowest Average value of '%usr(%user)' for CPU all is 0.02 (01/02/18)
Highest Average value of '%sys(%system)' for CPU all is 33.84 (01/10/18)
Lowest Average value of '%sys(%system)' for CPU all is 0.05 (01/02/18)
Highest Average value of '%iowait' for CPU all is 0.20 (01/10/18)
Lowest Average value of '%iowait' for CPU all is 0.05 (01/02/18)
Highest Average value of '%idle' for CPU all is 99.82 (01/02/18)
Lowest Average value of '%idle' for CPU all is 43.98 (01/10/18)

Highest Average value of '%usr(%user)' for CPU 0 is 21.07 (01/10/18)
Lowest Average value of '%usr(%user)' for CPU 0 is 0.02 (01/02/18)
Highest Average value of '%sys(%system)' for CPU 0 is 33.84 (01/10/18)
Lowest Average value of '%sys(%system)' for CPU 0 is 0.05 (01/02/18)
Highest Average value of '%iowait' for CPU 0 is 0.20 (01/10/18)
Lowest Average value of '%iowait' for CPU 0 is 0.05 (01/02/18)
Highest Average value of '%idle' for CPU 0 is 99.82 (01/02/18)
Lowest Average value of '%idle' for CPU 0 is 43.98 (01/10/18)
--------
Each CPU can be in one of four states: user, sys, idle, iowait.
If '%usr' is over 60%, applications are busy. Check with the ps command which application is busy.
If '%sys' is higher than '%usr', the kernel is busy. Check whether cswch is high.
If '%iowait' is high, the CPU is idle while waiting for outstanding I/O. Note that iowait is sometimes meaningless.
Swapping or heavy disk I/O may be the cause, so check the swap statistics. Also check the process and memory statistics.
If '%idle' is lower than 30%, you may need a new CPU or more cores.
Check not only 'CPU all' but also the values for each CPU. If some of them are high, check the sar file for that date.
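
The rules of thumb above are easy to apply by hand, but a small sketch makes them concrete. This is only an illustration of the thresholds quoted in the report text, not how sar-analyzer itself is implemented; the function name `cpu_findings` and the labels it returns are made up for this example. '%iowait' is left out because the report gives no numeric threshold for it.

```python
def cpu_findings(usr, sys_, idle):
    """Apply the report's CPU rules of thumb to one day's average values.

    Hypothetical helper for illustration; thresholds are the ones
    quoted in the report text (60% for %usr, 30% for %idle).
    """
    findings = []
    if usr > 60.0:
        findings.append("applications busy")
    if sys_ > usr:
        findings.append("kernel busy")
    if idle < 30.0:
        findings.append("cpu shortage")
    return findings


# Busiest day in the report above (01/10/18): %usr, %sys, %idle
print(cpu_findings(21.07, 33.84, 43.98))
# Quietest day in the report above (01/02/18)
print(cpu_findings(0.02, 0.05, 99.82))
```

Note that the '%sys' over '%usr' rule fires on both days, including the nearly idle one (0.05 is still larger than 0.02), so it is better read as a hint to look at cswch than as a verdict.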

-- Report of queue length and load averages --

Highest Average value of 'runq-sz' is 1 (12/30/17)
Lowest Average value of 'runq-sz' is 0 (01/01/18)
Highest Average value of 'plist-sz' is 357 (01/03/18)
Lowest Average value of 'plist-sz' is 293 (01/10/18)
Highest Average value of 'ldavg-1' is 0.56 (01/10/18)
Lowest Average value of 'ldavg-1' is 0.00 (01/01/18)
Highest Average value of 'ldavg-5' is 1.40 (01/10/18)
Lowest Average value of 'ldavg-5' is 0.00 (01/01/18)
Highest Average value of 'ldavg-15' is 0.77 (01/10/18)
Lowest Average value of 'ldavg-15' is 0.00 (01/01/18)
--------
If 'runq-sz' is over 2, the box is CPU bound.
If that is the case, you might need more CPU power to do the task.
If 'plist-sz' is higher than, for example, 10,000, there are waits.
If 'ldavg-<minutes>' exceeds the number of cores, the CPU load is high.
Check the number of cores with: $ cat /proc/cpuinfo | grep 'cpu cores'
Check the number of physical CPUs with: $ cat /proc/cpuinfo | grep 'physical id'
Check whether hyperthreading is enabled with: $ cat /proc/cpuinfo | grep 'siblings'
Divide 'siblings' by 'cpu cores'; if the result is not 1, that is, the two values differ, hyperthreading is enabled.
So, if you have 8 cores, the highest value is 800.00, and anything above 70% of that value would indicate trouble.
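
The /proc/cpuinfo checks above can be sketched in a few lines. The sample text here is invented for illustration (it is not from this AWS machine), and the function names are hypothetical. On a real box you would pass in the contents of /proc/cpuinfo instead.

```python
# Invented /proc/cpuinfo excerpt, used only so the example runs anywhere.
SAMPLE_CPUINFO = """\
processor\t: 0
physical id\t: 0
siblings\t: 8
cpu cores\t: 4
"""


def parse_cpuinfo(text):
    """Return (cpu_cores, siblings) from the first processor entry."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        # setdefault keeps the first occurrence of each key
        fields.setdefault(key.strip(), value.strip())
    return int(fields["cpu cores"]), int(fields["siblings"])


def hyperthreading_enabled(text):
    """Hyperthreading is on when siblings exceeds the core count."""
    cores, siblings = parse_cpuinfo(text)
    return siblings > cores


def load_exceeds_cores(ldavg, cores):
    """The report's rule: load is high when ldavg exceeds the core count."""
    return ldavg > cores


print(hyperthreading_enabled(SAMPLE_CPUINFO))
# Highest ldavg-5 in the report above, assuming a single core
print(load_exceeds_cores(1.40, 1))
```

The single-core figure is an assumption based on the report listing only CPU 0. If it holds, the highest 'ldavg-5' of 1.40 was above the core count on 01/10/18.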

-- Report task creation and system switching activity --

Highest Average value of '%proc/s' is 758.82 (01/10/18)
Lowest Average value of '%proc/s' is 0.20 (01/02/18)
Highest Average value of '%cswch/s' is 10015.51 (01/10/18)
Lowest Average value of '%cswch/s' is 80.21 (01/06/18)
--------
proc/s shows the number of tasks created per second.
Check the order of magnitude. It depends on the number of cores, but under 100 would be fine.
cswch/s shows the number of context switches per second.
Check the order of magnitude. It depends on the number of cores, but under 10000 would be fine.
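
As a quick illustration, the two ceilings mentioned above can be put in a lookup table and compared against the daily averages. The names `LIMITS` and `over_limit` are hypothetical, and the ceilings are only the rough figures from the report text, which themselves depend on the number of cores.

```python
# Rough ceilings quoted in the report text; real limits depend on core count.
LIMITS = {"proc/s": 100.0, "cswch/s": 10000.0}


def over_limit(metric, value):
    """Return True when a daily average exceeds the report's rough ceiling."""
    return value > LIMITS[metric]


# Highest daily averages in the report above (01/10/18)
print(over_limit("proc/s", 758.82))
print(over_limit("cswch/s", 10015.51))
```

Both peaks are from 01/10/18, the same day as the highest '%sys', which fits the advice to check cswch when the kernel looks busy.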

-- Report paging statistics --

Highest Average value of '%fault/s' is 48092.51 (01/10/18)
Lowest Average value of '%fault/s' is 86.95 (01/11/18)
Highest Average value of '%majflt/s' is 0.35 (01/10/18)
Lowest Average value of '%majflt/s' is 0.00 (01/01/18)
Highest Average value of '%vmeff/s' is 100.00 (01/12/18)
Lowest Average value of '%vmeff/s' is 0.00 (01/06/18)
--------
If fault/s is high, programs may be requesting memory. Check with, say, '# ps -o min_flt,maj_flt <PID>'.
If majflt/s is high, some big program was started on that day.
If vmeff is 0, there is no need to worry about memory. If vmeff is not 0 and is over 90.00, it is good.
If vmeff is under 30.00, something is wrong.
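
The vmeff rule has three stated bands and one gap (between 30 and 90) that the report does not comment on. A minimal sketch of that classification, with a made-up function name and labels:

```python
def classify_vmeff(vmeff):
    """Classify a %vmeff value using the bands quoted in the report text.

    Hypothetical helper; the 30..90 range is not covered by the report,
    so it is labeled "unclear" here rather than guessed at.
    """
    if vmeff == 0.0:
        return "no memory pressure"
    if vmeff > 90.0:
        return "good"
    if vmeff < 30.0:
        return "something is wrong"
    return "unclear"


# Highest and lowest daily averages in the report above
print(classify_vmeff(100.00))
print(classify_vmeff(0.00))
```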

-- Report memory utilization statistics --

Highest Average value of '%memused/s' is 85.49 (01/01/18)
Lowest Average value of '%memused/s' is 33.22 (01/10/18)
Highest Average value of 'kbcommit' is 2620525 (01/03/18)
Lowest Average value of 'kbcommit' is 2048996 (01/10/18)
Highest Average value of '%commit/s' is 258.92 (01/03/18)
Lowest Average value of '%commit/s' is 203.63 (01/10/18)
--------
Even if %memused is around 99.0%, that is OK on Linux.
Check the highest value of kbcommit. This is the amount of memory the system needs. If it is lacking, consider adding more memory.
If %commit is over 100%, a memory shortage is occurring. Increase swap or add more memory.
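
Since sar defines %commit as kbcommit relative to total memory (RAM plus swap), the two figures can be combined to estimate how much memory the machine has. This is a back-of-the-envelope sketch using daily averages, so the result is approximate; `estimate_total_kb` is a hypothetical name.

```python
def estimate_total_kb(kbcommit, commit_pct):
    """Estimate RAM + swap in kB from kbcommit and %commit.

    Relies on %commit = kbcommit / (RAM + swap) * 100.
    """
    return kbcommit / (commit_pct / 100.0)


# Highest day (01/03/18) and lowest day (01/10/18) in the report above
for kbcommit, pct in [(2620525, 258.92), (2048996, 203.63)]:
    total_kb = estimate_total_kb(kbcommit, pct)
    print(f"{total_kb / 1024:.0f} MiB total, {kbcommit / 1024:.0f} MiB committed")
```

Both days work out to roughly 1 GB of total memory, against more than 2 GB committed even on the quietest day.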

-- Report I/O and transfer rate statistics --

Highest Average value of 'tps' is 5.70 (01/10/18)
Lowest Average value of 'tps' is 1.79 (01/11/18)
Highest Average value of 'bread/s' is 106.29 (01/10/18)
Lowest Average value of 'bread/s' is 0.04 (01/08/18)
Highest Average value of 'bwrtn/s' is 104.01 (01/10/18)
Lowest Average value of 'bwrtn/s' is 42.03 (01/06/18)
--------
tps is the total number of transfers per second that were issued to physical devices.
A transfer is an I/O request to a physical device.
Multiple logical requests can be combined into a single I/O request to the device.
A transfer is of indeterminate size.
bread/s is the total amount of data read from the devices in blocks per second.
Blocks are equivalent to sectors and therefore have a size of 512 bytes.
bwrtn/s is the total amount of data written to the devices in blocks per second.
If these values are over, say, 10000 or so, I/O was heavy on that day. Check the related sar file.
Also check iowait on the CPU.
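
Because the block counts are in 512-byte units, converting them to bytes per second makes the numbers easier to judge. A minimal sketch, with a hypothetical function name:

```python
BLOCK_SIZE = 512  # bytes per block, as stated in the report text


def blocks_to_bytes(blocks_per_sec):
    """Convert a bread/s or bwrtn/s value to bytes per second."""
    return blocks_per_sec * BLOCK_SIZE


# Highest daily averages in the report above (01/10/18)
for name, value in [("bread/s", 106.29), ("bwrtn/s", 104.01)]:
    kib = blocks_to_bytes(value) / 1024
    print(f"{name}: {kib:.1f} KiB/s")
```

Even the busiest day comes to only a few tens of KiB per second, far below the 10000 blocks per second mark mentioned above.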

-- Report activity for each block device --

Highest Average value of 'areq-sz' of dev202-0 is 18.44 (01/10/18)
Lowest Average value of 'areq-sz' of dev202-0 is 8.49 (01/01/18)
Highest Average value of '%util' of dev202-0 is 0.13 (01/10/18)
Lowest Average value of '%util' of dev202-0 is 0.05 (01/06/18)
--------
'areq-sz' is the average size (in kilobytes) of the I/O requests that were issued to the device.
Note: In previous versions, this field was known as avgrq-sz and was expressed in sectors.
'%util' is the percentage of elapsed time during which I/O requests were issued to the device
(bandwidth utilization for the device). Device saturation occurs when this value
is close to 100% for devices serving requests serially. But for devices serving requests in
parallel, such as RAID arrays and modern SSDs, this number does not reflect their performance limits.

-- Report swap space utilization statistics --

Highest Average value of '%swpused' is 0.00 (01/01/18)
Lowest Average value of '%swpused' is 0.00 (01/01/18)
--------
%swpused is the percentage of used swap space.
If it's high, the system is memory bound.
--------

Yeah, I guess it is short on memory after all.

 

github.com