Computer Diary

Posts about Cisco, SHELL, C, Qt, C++, Linux, networking, Windows Script, and more

An Experiment with LACP (802.3ad)

In this experiment, we test whether Link Aggregation can be established using EtherChannel on the switch and bonding on the server.

The topology is as follows.

f:id:intrajp:20180325190140p:plain

Now let's configure things, starting with the switch.

IOU1#debug etherchannel event
PAgP/LACP Shim Events debugging is on

IOU1#conf t

IOU1(config)#int range e0/0 - 1
IOU1(config-if-range)#chann
IOU1(config-if-range)#channel-pro
IOU1(config-if-range)#channel-protocol ?
lacp Prepare interface for LACP protocol
pagp Prepare interface for PAgP protocol

IOU1(config-if-range)#channel-protocol lacp

IOU1(config-if-range)#channel-group 1 mode active
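Putting the transcript together, the switch-side configuration boils down to a few lines (a minimal sketch inferred from the session above; interface names follow the IOU topology):

```
interface range Ethernet0/0 - 1
 channel-protocol lacp
 channel-group 1 mode active
```

`channel-group 1 mode active` makes the switch actively initiate LACP negotiation; `passive` would only answer a peer that initiates.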

IOU1#sho ip int bri
Interface IP-Address OK? Method Status Protocol
Ethernet0/0 unassigned YES unset up up
Ethernet0/1 unassigned YES unset up up
Ethernet0/2 unassigned YES unset up up
Ethernet0/3 unassigned YES unset up up
Ethernet1/0 unassigned YES unset up up
Ethernet1/1 unassigned YES unset up up
Ethernet1/2 unassigned YES unset up up
Ethernet1/3 unassigned YES unset up up
Ethernet2/0 unassigned YES unset up up
Ethernet2/1 unassigned YES unset up up
Ethernet2/2 unassigned YES unset up up
Ethernet2/3 unassigned YES unset up up
Ethernet3/0 unassigned YES unset up up
Ethernet3/1 unassigned YES unset up up
Ethernet3/2 unassigned YES unset up up
Ethernet3/3 unassigned YES unset up up
Port-channel1 unassigned YES unset up up
Vlan1 unassigned YES unset administratively down down

When it works, it looks like this:

IOU1#show etherchannel summary
Flags: D - down P - bundled in port-channel
I - stand-alone s - suspended
H - Hot-standby (LACP only)
R - Layer3 S - Layer2
U - in use f - failed to allocate aggregator

M - not in use, minimum links not met
u - unsuitable for bundling
w - waiting to be aggregated
d - default port


Number of channel-groups in use: 1
Number of aggregators: 1

Group Port-channel Protocol Ports
------+-------------+-----------+-----------------------------------------------
1 Po1(SU) LACP Et0/0(P) Et0/1(P)

 

 

IOU1#
*Mar 25 12:02:34.740: FEC: lacp_fec_unbundle_internal: Et0/0
*Mar 25 12:02:34.740: FEC: lacp_switch_delete_port_from_agport_internal: removing Et0/0 from Po1
*Mar 25 12:02:34.740: FEC: pagp_switch_delete_port_from_agport_list: afb->nports-- = 1 [Et0/0]
*Mar 25 12:02:35.027: FEC: lacp_fec_unbundle_internal: Et0/1
*Mar 25 12:02:35.027: FEC: lacp_switch_delete_port_from_agport_internal: removing Et0/1 from Po1
*Mar 25 12:02:35.027: FEC: pagp_switch_delete_port_from_agport_list: afb->nports-- = 0 [Et0/1]
*Mar 25 12:02:35.744: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet0/0, changed state to down
IOU1#
*Mar 25 12:02:36.031: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet0/1, changed state to down
IOU1#
*Mar 25 12:02:37.031: %LINK-3-UPDOWN: Interface Port-channel1, changed state to down
*Mar 25 12:02:38.031: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel1, changed state to down
IOU1#
*Mar 25 12:02:39.878: FEC: lacp_fec_dontbundle_internal: Et0/0
IOU1#
*Mar 25 12:02:39.878: %EC-5-L3DONTBNDL2: Et0/0 suspended: LACP currently not enabled on the remote port.
*Mar 25 12:02:40.270: FEC: lacp_fec_dontbundle_internal: Et0/1
IOU1#
*Mar 25 12:02:40.270: %EC-5-L3DONTBNDL2: Et0/1 suspended: LACP currently not enabled on the remote port.
IOU1#
*Mar 25 12:02:50.219: FEC: lacp_fec_unbundle_internal: Et0/1
*Mar 25 12:02:50.219: FEC: Unbundling req for port: Et0/1 neitherstand-alone Agport nor in a bundle
*Mar 25 12:02:50.219: FEC: lacp_switch_add_port_to_associated_list_internal: Et0/1 added to list for Po1
*Mar 25 12:02:50.219: FEC: lacp_fec_unbundle_internal: Et0/0
*Mar 25 12:02:50.219: FEC: Unbundling req for port: Et0/0 neitherstand-alone Agport nor in a bundle
*Mar 25 12:02:50.220: FEC: lacp_switch_add_port_to_associated_list_internal: Et0/0 added to list for Po1
*Mar 25 12:02:51.223: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet0/1, changed state to up
*Mar 25 12:02:51.223: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet0/0, changed state to up
IOU1#
*Mar 25 12:02:53.810: FEC: pagp_switch_add_port_to_agport_list: afb->nports++ = 1 [Et0/0]
*Mar 25 12:02:53.810: FEC: lacp_switch_add_port_to_agport_internal: Et0/0 added to aggregator Po1 list
*Mar 25 12:02:53.889: FEC: pagp_switch_add_port_to_agport_list: afb->nports++ = 2 [Et0/1]
*Mar 25 12:02:53.889: FEC: lacp_switch_add_port_to_agport_internal: Et0/1 added to aggregator Po1 list
IOU1#
*Mar 25 12:02:55.814: %LINK-3-UPDOWN: Interface Port-channel1, changed state to up
*Mar 25 12:02:56.819: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel1, changed state to up

 

Output of the following command:

IOU1#show etherchannel port-channel
Channel-group listing:
----------------------

Group: 1
----------
Port-channels in the group:
---------------------------

Port-channel: Po1 (Primary Aggregator)

------------

Age of the Port-channel = 0d:01h:33m:19s
Logical slot/port = 16/0 Number of ports = 2
HotStandBy port = null
Port state = Port-channel Ag-Inuse
Protocol = LACP
Port security = Disabled

Ports in the Port-channel:

Index Load Port EC state No of bits
------+------+------+------------------+-----------
0 00 Et0/0 Active 0
0 00 Et0/1 Active 0

Time since last port bundled: 0d:00h:28m:03s Et0/1
Time since last port Un-bundled: 0d:00h:29m:30s Et0/1

 

Output of the following command:

IOU1#show etherchannel load-balance
EtherChannel Load-Balancing Configuration:
src-dst-ip

EtherChannel Load-Balancing Addresses Used Per-Protocol:
Non-IP: Source XOR Destination MAC address
IPv4: Source XOR Destination IP address
IPv6: Source XOR Destination IP address
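The src-dst-ip policy shown above picks the member link deterministically per flow. As a rough illustration (a toy model, not the exact IOS hash), XOR the low bits of the two IP addresses and reduce modulo the number of links:

```shell
# Toy model of src-dst-ip load balancing: XOR the last octets of the
# source and destination IPs, then reduce modulo the link count.
# Real switches hash more bits, but the flow-stickiness is the same idea.
pick_link() {
  # $1: src last octet, $2: dst last octet, $3: number of member links
  echo $(( ($1 ^ $2) % $3 ))
}
pick_link 1 2 2    # a flow between .1 and .2 on a 2-link bundle
```

Because the hash depends only on addresses, a single flow never spans two links; aggregate throughput gains appear only across multiple flows.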

 

Output of the following command:

IOU1#show etherchannel detail
Channel-group listing:
----------------------

Group: 1
----------
Group state = L2
Ports: 2 Maxports = 4
Port-channels: 1 Max Port-channels = 4
Protocol: LACP
Minimum Links: 0


Ports in the group:
-------------------
Port: Et0/0
------------

Port state = Up Mstr Assoc In-Bndl
Channel group = 1 Mode = Active Gcchange = -
Port-channel = Po1 GC = - Pseudo port-channel = Po1
Port index = 0 Load = 0x00 Protocol = LACP

Flags: S - Device is sending Slow LACPDUs F - Device is sending fast LACPDUs.
A - Device is in active mode. P - Device is in passive mode.

Local information:
LACP port Admin Oper Port Port
Port Flags State Priority Key Key Number State
Et0/0 SA bndl 32768 0x1 0x1 0x1 0x3D

Partner's information:

LACP port Admin Oper Port Port
Port Flags Priority Dev ID Age key Key Number State
Et0/0 FA 255 0800.27a9.4bf0 16s 0x0 0x9 0x1 0xF

Age of the port in the current state: 0d:00h:22m:22s

Port: Et0/1
------------

Port state = Up Mstr Assoc In-Bndl
Channel group = 1 Mode = Active Gcchange = -
Port-channel = Po1 GC = - Pseudo port-channel = Po1
Port index = 0 Load = 0x00 Protocol = LACP

Flags: S - Device is sending Slow LACPDUs F - Device is sending fast LACPDUs.
A - Device is in active mode. P - Device is in passive mode.

Local information:
LACP port Admin Oper Port Port
Port Flags State Priority Key Key Number State
Et0/1 SA bndl 32768 0x1 0x1 0x2 0x3D

Partner's information:

LACP port Admin Oper Port Port
Port Flags Priority Dev ID Age key Key Number State
Et0/1 FA 255 0800.27a9.4bf0 16s 0x0 0x9 0x2 0xF

Age of the port in the current state: 0d:00h:22m:21s

Port-channels in the group:
---------------------------

Port-channel: Po1 (Primary Aggregator)

------------

Age of the Port-channel = 0d:01h:27m:37s
Logical slot/port = 16/0 Number of ports = 2
HotStandBy port = null
Port state = Port-channel Ag-Inuse
Protocol = LACP
Port security = Disabled

Ports in the Port-channel:

Index Load Port EC state No of bits
------+------+------+------------------+-----------
0 00 Et0/0 Active 0
0 00 Et0/1 Active 0

Time since last port bundled: 0d:00h:22m:21s Et0/1
Time since last port Un-bundled: 0d:00h:23m:48s Et0/1

 

The server-side bonding configuration is the standard one (since this is 802.3ad, use mode=4).
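For reference, a typical mode-4 setup on a RHEL/CentOS-style system looks roughly like this (a sketch; the device names enp0s3/enp0s8 match the output below, and the option values are common defaults, not necessarily the exact ones used here):

```
# /etc/sysconfig/network-scripts/ifcfg-bond0 (sketch)
DEVICE=bond0
TYPE=Bond
BONDING_OPTS="mode=4 miimon=100 lacp_rate=fast"
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-enp0s3 (repeat for enp0s8)
DEVICE=enp0s3
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
```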

When it works, it looks like this:

# cat /proc/net/bonding/bond0

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 08:00:27:a9:4b:f0
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 2
Actor Key: 9
Partner Key: 1
Partner Mac Address: aa:bb:cc:00:01:00

Slave Interface: enp0s3
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 08:00:27:a9:4b:f0
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
system priority: 65535
system mac address: 08:00:27:a9:4b:f0
port key: 9
port priority: 255
port number: 1
port state: 63
details partner lacp pdu:
system priority: 32768
system mac address: aa:bb:cc:00:01:00
oper key: 1
port priority: 32768
port number: 1
port state: 61

Slave Interface: enp0s8
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 08:00:27:e0:d6:ae
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
system priority: 65535
system mac address: 08:00:27:a9:4b:f0
port key: 9
port priority: 255
port number: 2
port state: 63
details partner lacp pdu:
system priority: 32768
system mac address: aa:bb:cc:00:01:00
oper key: 1
port priority: 32768
port number: 2
port state: 61

  

Now for some experiments.

On the server, run:

# ifdown bond0

# ifup bond0

and also try stopping and restarting the switch, then check the results.

Switch debug log when # ifdown bond0 was run on the server (one example):

IOU1#
*Mar 25 12:09:08.045: FEC: lacp_fec_unbundle_internal: Et0/0
*Mar 25 12:09:08.045: FEC: lacp_switch_delete_port_from_agport_internal: removing Et0/0 from Po1
*Mar 25 12:09:08.045: FEC: pagp_switch_delete_port_from_agport_list: afb->nports-- = 1 [Et0/0]
*Mar 25 12:09:08.329: FEC: lacp_fec_unbundle_internal: Et0/1
*Mar 25 12:09:08.329: FEC: lacp_switch_delete_port_from_agport_internal: removing Et0/1 from Po1
*Mar 25 12:09:08.329: FEC: pagp_switch_delete_port_from_agport_list: afb->nports-- = 0 [Et0/1]
*Mar 25 12:09:09.052: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet0/0, changed state to down
IOU1#
*Mar 25 12:09:09.330: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet0/1, changed state to down
*Mar 25 12:09:10.330: %LINK-3-UPDOWN: Interface Port-channel1, changed state to down
IOU1#
*Mar 25 12:09:11.331: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel1, changed state to down
IOU1#
*Mar 25 12:09:13.590: FEC: lacp_fec_dontbundle_internal: Et0/1
IOU1#
*Mar 25 12:09:13.590: %EC-5-L3DONTBNDL2: Et0/1 suspended: LACP currently not enabled on the remote port.
*Mar 25 12:09:13.917: FEC: lacp_fec_dontbundle_internal: Et0/0
IOU1#
*Mar 25 12:09:13.917: %EC-5-L3DONTBNDL2: Et0/0 suspended: LACP currently not enabled on the remote port.

 

 

The contents of /proc/net/bonding/bond0 while the switch is stopped (link aggregation is down) (one example):

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: down
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Min links: 2
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 08:00:27:a9:4b:f0
Active Aggregator Info:
Aggregator ID: 7
Number of ports: 1
Actor Key: 9
Partner Key: 1
Partner Mac Address: 00:00:00:00:00:00

Slave Interface: enp0s3
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 08:00:27:a9:4b:f0
Slave queue ID: 0
Aggregator ID: 7
Actor Churn State: monitoring
Partner Churn State: monitoring
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
system priority: 65535
system mac address: 08:00:27:a9:4b:f0
port key: 9
port priority: 255
port number: 1
port state: 79
details partner lacp pdu:
system priority: 65535
system mac address: 00:00:00:00:00:00
oper key: 1
port priority: 255
port number: 1
port state: 1

Slave Interface: enp0s8
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 08:00:27:e0:d6:ae
Slave queue ID: 0
Aggregator ID: 8
Actor Churn State: monitoring
Partner Churn State: monitoring
Actor Churned Count: 0
Partner Churned Count: 1
details actor lacp pdu:
system priority: 65535
system mac address: 08:00:27:a9:4b:f0
port key: 9
port priority: 255
port number: 2
port state: 71
details partner lacp pdu:
system priority: 65535
system mac address: 00:00:00:00:00:00
oper key: 1
port priority: 255
port number: 1
port state: 1

 

Switch debug log when # ifup bond0 was run on the server (one example):

IOU1#
*Mar 25 12:11:37.006: FEC: lacp_fec_unbundle_internal: Et0/1
*Mar 25 12:11:37.006: FEC: Unbundling req for port: Et0/1 neitherstand-alone Agport nor in a bundle
*Mar 25 12:11:37.006: FEC: lacp_switch_add_port_to_associated_list_internal: Et0/1 added to list for Po1
*Mar 25 12:11:37.006: FEC: lacp_fec_unbundle_internal: Et0/0
*Mar 25 12:11:37.006: FEC: Unbundling req for port: Et0/0 neitherstand-alone Agport nor in a bundle
*Mar 25 12:11:37.006: FEC: lacp_switch_add_port_to_associated_list_internal: Et0/0 added to list for Po1
*Mar 25 12:11:38.006: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet0/1, changed state to up
*Mar 25 12:11:38.006: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet0/0, changed state to up
IOU1#
*Mar 25 12:11:40.447: FEC: pagp_switch_add_port_to_agport_list: afb->nports++ = 1 [Et0/0]
*Mar 25 12:11:40.447: FEC: lacp_switch_add_port_to_agport_internal: Et0/0 added to aggregator Po1 list
*Mar 25 12:11:40.448: %IDBMAN-3-INVALIDAGGPORTBANDWIDTH: Port-channel1(16 / 0) has an invalid bandwidth value of 0
*Mar 25 12:11:40.639: FEC: pagp_switch_add_port_to_agport_list: afb->nports++ = 2 [Et0/1]
*Mar 25 12:11:40.639: FEC: lacp_switch_add_port_to_agport_internal: Et0/1 added to aggregator Po1 list
IOU1#
*Mar 25 12:11:40.639: %IDBMAN-3-INVALIDAGGPORTBANDWIDTH: Port-channel1(16 / 0) has an invalid bandwidth value of 0
IOU1#
*Mar 25 12:11:42.450: %LINK-3-UPDOWN: Interface Port-channel1, changed state to up
*Mar 25 12:11:43.455: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel1, changed state to up

 

The contents of /proc/net/bonding/bond0 after the switch was rebooted (one example):

Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Min links: 2
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 08:00:27:a9:4b:f0
Active Aggregator Info:
Aggregator ID: 7
Number of ports: 2
Actor Key: 9
Partner Key: 1
Partner Mac Address: aa:bb:cc:00:01:00

Slave Interface: enp0s3
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 08:00:27:a9:4b:f0
Slave queue ID: 0
Aggregator ID: 7
Actor Churn State: monitoring
Partner Churn State: monitoring
Actor Churned Count: 0
Partner Churned Count: 1
details actor lacp pdu:
system priority: 65535
system mac address: 08:00:27:a9:4b:f0
port key: 9
port priority: 255
port number: 1
port state: 63
details partner lacp pdu:
system priority: 32768
system mac address: aa:bb:cc:00:01:00
oper key: 1
port priority: 32768
port number: 1
port state: 13

Slave Interface: enp0s8
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 08:00:27:e0:d6:ae
Slave queue ID: 0
Aggregator ID: 7
Actor Churn State: monitoring
Partner Churn State: monitoring
Actor Churned Count: 1
Partner Churned Count: 2
details actor lacp pdu:
system priority: 65535
system mac address: 08:00:27:a9:4b:f0
port key: 9
port priority: 255
port number: 2
port state: 15
details partner lacp pdu:
system priority: 32768
system mac address: aa:bb:cc:00:01:00
oper key: 1
port priority: 32768
port number: 2
port state: 5

 

 

Even after rebooting the switch, the link came back up without problems.

(Reference)

https://www.kernel.org/doc/Documentation/networking/bonding.txt

 

Adding a second hard disk to a system whose OS is installed on a single disk, and migrating to RAID1 (mirroring)

1. Prepare a disk of the same size (or a new disk with larger capacity).

2. Copy the partition table.
# sfdisk -d /dev/sda | sfdisk --force /dev/sdb
Verify:
# lsblk -f
Create a filesystem on /dev/sdb1, mirroring /dev/sda1:
# mkfs.ext4 /dev/sdb1
Verify:
# lsblk -f

3. Check the current layout:
# fdisk -l
Change the partition types on /dev/sdb to Linux raid autodetect:
run fdisk /dev/sdb and use "t" to change each partition's type to "fd".
Check again:
# fdisk -l
Also confirm with parted:
# parted
(parted) select /dev/sdb
(parted) print

(Example output)
========
1 ... boot,raid
2 ... raid
========
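The interactive fdisk steps can also be fed from a here-document (a sketch, assuming exactly two partitions on /dev/sdb; double-check the device name before running, as this rewrites the partition table):

```shell
# Change both partition types on /dev/sdb to fd (Linux raid autodetect).
# t = change type, then the partition number, then the type code; w = write.
fdisk /dev/sdb <<'EOF'
t
1
fd
t
2
fd
w
EOF
```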

# mdadm --zero-superblock /dev/sdb1
# mdadm --zero-superblock /dev/sdb2
# mdadm --create /dev/md0 --level=1 --metadata=0.90 --raid-disks=2 missing /dev/sdb1
# mdadm --create /dev/md1 --level=1 --metadata=0.90 --raid-disks=2 missing /dev/sdb2
Verify:
# cat /proc/mdstat
# lsblk -f
# fdisk -l

4. Copy the contents of /boot to /dev/mdX.
# mkdir /mnt/tmp
Initialize the filesystem:
# mkfs.ext4 /dev/mdX
# mount /dev/mdX /mnt/tmp
# rsync -aAXv --delete /boot/* /mnt/tmp
# umount /mnt/tmp

5. Edit device.map.
# vim /boot/grub/device.map
(Example)
========
# this device map was generated by anaconda
(hd0) /dev/sda
----
Add the following line:
(hd1) /dev/sdb
----
========
Do the same for the copy on /dev/mdX:
# vim /mnt/tmp/grub/device.map
Add the following line and save:
(hd1) /dev/sdb

Then run:
# grub-install /dev/sdb
(Example output)
Installation finished. No error reported.
This is the contents of the device map /boot/grub/device.map.
Check if this is correct or not. If any of the lines is incorrect,
fix it and re-run the script 'grub-install'.

# this device map was generated by anaconda
(hd0) /dev/sda
(hd1) /dev/sdb
----

That is what we want.
There is still more work to do.
At this point, save the state of the virtual machine.
Create snapshot - "RAID-TEST-1"
------------------------------------------------
Next, edit the GRUB configuration:
# vim /boot/grub/grub.conf
Remove rd_NO_MD from the kernel options.
Reboot - it came up fine.

Now try:
# mdadm --detail --scan
(Output)
ARRAY /dev/md/0_0 metadata=0.90 UUID=xxxx:xxxx:xxxx:xxxx
ARRAY /dev/md/1_0 metadata=0.90 UUID=xxxx:xxxx:xxxx:xxxx

# mdadm --detail --scan > /etc/mdadm.conf
Save the virtual machine state again.

#### Create snapshot - "RAID-TEST-2"

Reboot.
It came up fine.

Next, let's boot from /dev/sdb1.
Edit /etc/fstab so that /boot points at the UUID of the ext4 filesystem on /dev/sdb1.
Reboot.
It appears to have booted from the ext4 filesystem on /dev/sdb1 without problems.
Save the virtual machine state again.

#### Create snapshot - "RAID-TEST-3"

Initialize the filesystem:
# mkfs.ext4 /dev/mdX
# lsblk -f

Create an LVM physical volume on the RAID array /dev/md127 and add it to the existing volume group VolGroup.
# pvcreate /dev/mdX
(Example output)
Physical volume "/dev/mdX" successfully created
# vgextend VolGroup /dev/mdX
(Example output)
Volume group "VolGroup" successfully extended
Verify:

# pvdisplay
(Example output)
--- Physical volume ---
PV Name /dev/sda2
VG Name VolGroup
PV Size 31.51 GiB /not usable 3.00 MiB
Allocatable yes (but full)
PE Size 4.00 MiB
Total PE 8066
Free PE 0
Allocated PE 8066
PV UUID xxxx-xxxx-xxxx...
...(snip)...

--- Physical volume ---
PV Name /dev/mdX
VG Name VolGroup
PV Size 31.51 GiB /not usable 2.94 MiB
Allocatable yes (but full)
PE Size 4.00 MiB
Total PE 8066
Free PE 8066
Allocated PE 0
PV UUID xxxx-xxxx-xxxx...
...(snip)...

Also check with:
# vgs
(Example output)
VG #PV #LV #SN Attr VSize VFree
VolGroup 2 2 0 wz- -n- 63.02g 31.51g

Check which devices back each logical volume:
# lvs -o+devices
LV VG Attr LSize Pool Origin Data% Move Log Cpy%Sync Convert Devices
lv_root VolGroup -wi-ao---- 28.31g /dev/sda2(0)
lv_swap VolGroup -wi-ao---- 3.20g /dev/sda2(7247)

Move the data (note: a mere copy would not let us remove /dev/sda2 later).
Run:
# pvmove -i 2 /dev/sda2 /dev/mdX
(Example output)
...(snip)...
/dev/sda2: Moved: 3.5%
...(snip)...
When it reaches 100%, it is done.

Save the virtual machine state again.

#### Create snapshot - "RAID-TEST-4"

Reboot.

Now, let's remove /dev/sda2.
Remove /dev/sda2 from VolGroup:
# vgreduce VolGroup /dev/sda2
(Example output)
Removed "/dev/sda2" from volume group "VolGroup"
# pvremove /dev/sda2
(Example output)
Labels on physical volume "/dev/sda2" successfully wiped
Verify:
# pvdisplay
(Example output)
--- Physical volume ---
PV Name /dev/mdX
VG Name VolGroup
PV Size 31.51 GiB / not usable 2.94 MiB
...(snip)...

Now check lsblk -f:
# lsblk -f
(Output)
sdb2 linux_raid_mem
|
-md127 LVM2_member
|-VolGroup-lv_root (dm-0)
ext4 xxxx-xxxx-... /
|-VolGroup-lv_swap (dm-1)
ext4 xxxx-xxxx-... [SWAP]

Everything has moved off sda and onto sdb.

#### Save as snapshot "RAID-TEST-4.5" ####

It booted fine.

Edit grub.conf:

Remove the following kernel options:
rd_NO_MD rd_NO_DM
Recreate mdadm.conf:
# mdadm --detail --scan > /etc/mdadm.conf
Then rebuild the initramfs, including the configuration and the RAID drivers:
# dracut -f -v --mdadmconf --lvmconf --add-drivers "raid1 raid10 raid456"

If the initramfs was created in /boot without errors,
save the virtual machine state again.

#### Create snapshot - "RAID-TEST-5"

Reboot:
# shutdown -r now

It came up.
Save the virtual machine state again.

#### Create snapshot - "RAID-TEST-6"

Reboot.
It worked.
My understanding is that we are now booting from /dev/sdb.
grub.conf still says root (hd0,0).
We will leave that as it is.
The system came up.
Save the virtual machine state again.

#### Create snapshot - "RAID-TEST-7"

Now for the experiment: detaching /dev/sda.
Before that, edit fstab so that /boot mounts from the array:
/dev/md127 /boot ext4 defaults 1 2
(Then try running mdadm once more.)

Delete the partitions on /dev/sda and write the change:
# fdisk /dev/sda
d 1 d 2 w

Reboot.
It does not seem to boot properly.
Boot from the rescue DVD:
# grub
I ran grub> find /grub/grub.conf
grub> root
(hd0,0): Filesystem type unknown, partition type 0x0
grub> find /grub/grub.conf (or find /grub/stage1)
(hd0,0)
Then:
grub> root (hd0,0)
Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd0)
grub> root
(hd0,0): Filesystem type is ext2fs, partition type 0x0
grub> reboot

It booted.

#### Create snapshot - "RAID-TEST-8"

# lsblk -f
Use the above to confirm that nothing hangs off sda any more.
Now let's detach the first hard disk in VirtualBox:
shut down and reduce the machine to a single hard disk.
Wow, it booted!

But this does not yet count as migrating from a single-disk system to another disk in a RAID1 state,
because right now there is only one hard disk. So let's connect one more disk
of the same capacity and make it a real RAID1.
Now run the following command:

# lsblk -f

NAME FSTYPE LABEL UUID MOUNTPOINT
sr0
sda
sda1 linux_raid_mem xxxxxxx
md127 ext4 xxxxxxx
sda2 linux_raid_mem xxxxxxx
md126 LVM2_member xxxxxxx
VolGroup-lv_root (dm-0)
ext4 xxxxxxx
VolGroup-lv_swap (dm-1)
swap xxxxxxx
Excellent!

#### Create snapshot - "RAID-TEST-9"

Now let's add a hard disk.

Add a disk of the same capacity, and proceed the same way as before:

# sfdisk -d /dev/sda | sfdisk --force /dev/sdb
# mkfs.ext4 /dev/sdb1
# mkfs.ext4 /dev/sdb2

# cat /proc/mdstat
Personalities : [raid1]
md126 : active raid1 sda2[1]
33041344 blocks [2/1] [_U]
md127 : active raid1 sda1[1]
511936 blocks [2/1] [_U]
unused devices: <none>

# mdadm -D /dev/mdX
# mdadm -D /dev/mdX

# mdadm --manage /dev/mdX --add /dev/sdb1
# mdadm --manage /dev/mdX --add /dev/sdb2

# cat /proc/mdstat
Watch until the rebuild reaches 100%.
Done.

#### Create snapshot - "RAID-TEST-10"

Detach the old hard disk.
Reboot.
It does not come up.
Boot into rescue mode and reinstall GRUB:
# grub
grub> root
(hd0,0)
Since that was already set, run:
grub> root (hd0,0)
grub> setup (hd0)
grub> reboot
bash# reboot
It came up.
If GRUB cannot find its files, locate them with:
grub> find /grub/grub.conf

#### Create snapshot - "RAID-TEST-11"

Now let's add a hard disk again.
On boot, GRUB complained "no such partition..."; from the GRUB menu I chose edit and set root (hd0,0).
It came up.

Connect the existing hard disk to port 0.
Connect the new hard disk to port 1.
A slightly larger capacity is a good choice.
# fdisk /dev/sdb
Partition it the same way.

First, run fdisk against /dev/sda, print the table with p, and note it down:
/dev/sda1 * 1 64 51200 fd
/dev/sda2 64 4178 33041408 fd

# mdadm --manage /dev/mdX --add /dev/sdb1
# mdadm --manage /dev/mdX --add /dev/sdb2

# cat /proc/mdstat
Wait until the rebuild reaches 100%.
Run the following command and confirm the layout matches the old disk:
# lsblk -f

#### Create snapshot - "RAID-TEST-12"

Reboot.
It comes up.

Try editing grub.conf:
change (hd0,0) to (hd1,0).
It still comes up.

Put it back: (hd1,0) to (hd0,0).

Now, finally, let's detach disks.
Detach the new hard disk.
Boot. It came up.
Shut down.
Detach the old hard disk (RAID-NEWDISK-ADDED) and connect the new one (RAID-NEWLY-ADDED2).
Boot.
Congratulations! Success.

How to recover a software RAID1 array

While testing RAID1, I found that if you detach a hard disk and then reattach it,
it is not recognized properly unless you intervene.

Suppose you are running RAID1 across two hard disks,
and as a test you detach one disk and then reattach it.
Will it be recognized correctly? No, it will not.

Let's check.
----------------------------
# cat /proc/mdstat
(Example output)
Personalities : [raid1]
md0 : active raid1 sda1[1] ★<---- only one disk visible
1023936 blocks super 1.0 [2/1] [_U] ★<---- ditto

md1 : active raid1 sda2[1]
32512896 blocks super 1.1 [2/1] [_U] ★<---- only one disk visible
bitmap: 1/1 pages [4KB], 65536KB chunk ★<---- ditto

unused devices: <none>

Even though we physically reconnected the second disk, the software has not recognized it.
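This degraded state can also be spotted mechanically: in /proc/mdstat, a healthy two-disk mirror shows [2/2] [UU], while a degraded one shows an underscore in place of a member. A small sketch (degraded_arrays is a hypothetical helper; the path is a parameter so a saved copy of the file works too):

```shell
# List md arrays whose mdstat status shows a missing member ("_" in the
# [UU]-style field, e.g. "[2/1] [_U]").  $1 is a file in /proc/mdstat
# format, so e.g.: degraded_arrays /proc/mdstat
degraded_arrays() {
  awk '/^md/ { name = $1 } /\[[U_]*_[U_]*\]/ { print name }' "$1"
}
```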
Let's also check with the mdadm command.
---------------------------
# mdadm -D /dev/md0
(Example output)

...(snip)...
Raid Devices : 2
Total Devices : 1 ★<---- shows 1
...(snip)...
Active Devices : 1 ★<---- shows 1
Working Devices : 1 ★<---- shows 1
Failed Devices : 0
Spare Devices : 0

Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda1
1 0 0 1 removed ★<---- shows removed
---------------------------
# mdadm -D /dev/md1
(Example output)

...(snip)...
Raid Devices : 2
Total Devices : 1 ★<---- shows 1
...(snip)...
Active Devices : 1 ★<---- shows 1
Working Devices : 1 ★<---- shows 1
Failed Devices : 0
Spare Devices : 0

Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda2
1 0 0 1 removed ★<---- shows removed
---------------------------
The array is still in the detached state.

Normally it should look like this:
---------------------------
# mdadm -D /dev/md0
(Example output)

...(snip)...
Raid Devices : 2
Total Devices : 2
...(snip)...
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda1
1 8 17 1 active sync /dev/sdb1
---------------------------
# mdadm -D /dev/md1
(Example output)

...(snip)...
Raid Devices : 2
Total Devices : 2
...(snip)...
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda2
1 8 18 1 active sync /dev/sdb2
---------------------------
To fix this, re-add the disk with the mdadm command.

Here, suppose the opposite case: only /dev/sdb1 and /dev/sdb2 were listed.
Run the following commands:

# mdadm --manage /dev/md0 -a /dev/sda1
(Example output)
mdadm: added /dev/sda1

# mdadm --manage /dev/md1 -a /dev/sda2
(Example output)
mdadm: re-added /dev/sda2

Now run mdadm -D /dev/md0 and mdadm -D /dev/md1 to confirm the disks were added.
-----------------------------
Then run:

# cat /proc/mdstat
(Example output)
Personalities : [raid1]
md0 : active raid1 sda1[2] sdb1[1]
1023936 blocks super 1.0 [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
32512896 blocks super 1.1 [2/1] [_U]
[====>...............] recovery = 16.2% (5276416/32512896) finish=2.4min
speed=188443K/sec
bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>

As shown above, the array is recovering incrementally.
Eventually it should complete like this:
---------------------
# cat /proc/mdstat
(Example output)
Personalities : [raid1]
md0 : active raid1 sda1[2] sdb1[1]
1023936 blocks super 1.0 [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
32512896 blocks super 1.1 [2/2] [UU]
bitmap: 1/1 pages [4KB], 65536KB chunk

unused devices: <none>
-----------------------------

In practice, until I rebooted once the status showed

bitmap: 0/1 pages [0KB]..., so a single reboot afterwards may be a good idea.

Software RAID is well worth using.

UnixBench results before the Spectre & Meltdown patch, after the patch, and with nopti.

OS: CentOS 6.9

Old kernel
2.6.32-696.el6

BYTE UNIX Benchmarks (Version 5.1.3)

System: localhost.localdomain: GNU/Linux
OS: GNU/Linux -- 2.6.32-696.el6.x86_64 -- #1 SMP Tue Mar 21 19:29:05 UTC
2017
Machine: x86_64 (x86_64)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz (5184.1 bogomips)
x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
21:58:20 up 29 min, 2 users, load average: 0.21, 0.08, 0.02; runlevel
2018-02-20

------------------------------------------------------------------------
Benchmark Run: 火 2月 20 2018 21:58:20 - 22:26:29
1 CPU in system; running 1 parallel copy of tests

Dhrystone 2 using register variables 40858230.7 lps (10.0 s, 7 samples)
Double-Precision Whetstone 4610.9 MWIPS (9.8 s, 7 samples)
Execl Throughput 5350.3 lps (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 1172325.8 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 311188.0 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 2578243.0 KBps (30.0 s, 2 samples)
Pipe Throughput 1954746.6 lps (10.0 s, 7 samples)
Pipe-based Context Switching 418002.3 lps (10.0 s, 7 samples)
Process Creation 14349.6 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 6546.5 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 884.9 lpm (60.0 s, 2 samples)
System Call Overhead 3074144.1 lps (10.0 s, 7 samples)

System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 40858230.7 3501.1
Double-Precision Whetstone 55.0 4610.9 838.3
Execl Throughput 43.0 5350.3 1244.2
File Copy 1024 bufsize 2000 maxblocks 3960.0 1172325.8 2960.4
File Copy 256 bufsize 500 maxblocks 1655.0 311188.0 1880.3
File Copy 4096 bufsize 8000 maxblocks 5800.0 2578243.0 4445.2
Pipe Throughput 12440.0 1954746.6 1571.3
Pipe-based Context Switching 4000.0 418002.3 1045.0
Process Creation 126.0 14349.6 1138.9
Shell Scripts (1 concurrent) 42.4 6546.5 1544.0
Shell Scripts (8 concurrent) 6.0 884.9 1474.8
System Call Overhead 15000.0 3074144.1 2049.4
========
System Benchmarks Index Score 1743.4

 

Spectre & Meltdown patched kernel
2.6.32-696.18.7

BYTE UNIX Benchmarks (Version 5.1.3)

System: localhost.localdomain: GNU/Linux
OS: GNU/Linux -- 2.6.32-696.18.7.el6.x86_64 -- #1 SMP Thu Jan 4 17:31:22 UTC
2018
Machine: x86_64 (x86_64)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz (5184.1 bogomips)
x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
23:57:53 up 1 min, 2 users, load average: 1.24, 0.53, 0.20; runlevel
2018-02-20

------------------------------------------------------------------------
Benchmark Run: 火 2月 20 2018 23:57:53 - 00:26:02
1 CPU in system; running 1 parallel copy of tests

Dhrystone 2 using register variables 37888385.8 lps (10.0 s, 7 samples)
Double-Precision Whetstone 4506.9 MWIPS (10.0 s, 7 samples)
Execl Throughput 3336.3 lps (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 341121.1 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 90343.2 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 1180761.0 KBps (30.0 s, 2 samples)
Pipe Throughput 492884.3 lps (10.0 s, 7 samples)
Pipe-based Context Switching 150710.6 lps (10.0 s, 7 samples)
Process Creation 11108.0 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 4923.0 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 682.2 lpm (60.0 s, 2 samples)
System Call Overhead 431462.7 lps (10.0 s, 7 samples)

System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 37888385.8 3246.6
Double-Precision Whetstone 55.0 4506.9 819.4
Execl Throughput 43.0 3336.3 775.9
File Copy 1024 bufsize 2000 maxblocks 3960.0 341121.1 861.4
File Copy 256 bufsize 500 maxblocks 1655.0 90343.2 545.9
File Copy 4096 bufsize 8000 maxblocks 5800.0 1180761.0 2035.8
Pipe Throughput 12440.0 492884.3 396.2
Pipe-based Context Switching 4000.0 150710.6 376.8
Process Creation 126.0 11108.0 881.6
Shell Scripts (1 concurrent) 42.4 4923.0 1161.1
Shell Scripts (8 concurrent) 6.0 682.2 1136.9
System Call Overhead 15000.0 431462.7 287.6
========
System Benchmarks Index Score 824.5

 

Results on the same kernel with the nopti kernel option added:

BYTE UNIX Benchmarks (Version 5.1.3)

System: localhost.localdomain: GNU/Linux
OS: GNU/Linux -- 2.6.32-696.18.7.el6.x86_64 -- #1 SMP Thu Jan 4 17:31:22 UTC
2018
Machine: x86_64 (x86_64)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz (5184.1 bogomips)
x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
03:50:07 up 1 min, 2 users, load average: 1.08, 0.44, 0.16; runlevel
2018-02-21

------------------------------------------------------------------------
Benchmark Run: 水 2月 21 2018 03:50:07 - 04:18:15
1 CPU in system; running 1 parallel copy of tests

Dhrystone 2 using register variables 38510363.9 lps (10.0 s, 7 samples)
Double-Precision Whetstone 4467.9 MWIPS (9.8 s, 7 samples)
Execl Throughput 4413.4 lps (29.9 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 870529.3 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 255228.5 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 2259120.5 KBps (30.0 s, 2 samples)
Pipe Throughput 1494173.8 lps (10.0 s, 7 samples)
Pipe-based Context Switching 216663.0 lps (10.0 s, 7 samples)
Process Creation 12555.5 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 6162.9 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 802.3 lpm (60.0 s, 2 samples)
System Call Overhead 1684728.3 lps (10.0 s, 7 samples)

System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 38510363.9 3299.9
Double-Precision Whetstone 55.0 4467.9 812.4
Execl Throughput 43.0 4413.4 1026.4
File Copy 1024 bufsize 2000 maxblocks 3960.0 870529.3 2198.3
File Copy 256 bufsize 500 maxblocks 1655.0 255228.5 1542.2
File Copy 4096 bufsize 8000 maxblocks 5800.0 2259120.5 3895.0
Pipe Throughput 12440.0 1494173.8 1201.1
Pipe-based Context Switching 4000.0 216663.0 541.7
Process Creation 126.0 12555.5 996.5
Shell Scripts (1 concurrent) 42.4 6162.9 1453.5
Shell Scripts (8 concurrent) 6.0 802.3 1337.2
System Call Overhead 15000.0 1684728.3 1123.2
========
System Benchmarks Index Score 1388.7

 

 With Page Table Isolation in effect, a performance drop was observed.
With nopti, much of it seems to have come back.
Whether disabling it is a good idea is another matter, of course.
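On kernels that expose the sysfs vulnerabilities interface, whether PTI is actually active (and whether nopti took effect) can be confirmed directly. A minimal sketch; the sysfs path exists only on patched kernels, hence the guard:

```shell
# Show the kernel's own view of the Meltdown mitigation, and whether
# nopti was passed on the kernel command line at boot.
if [ -r /sys/devices/system/cpu/vulnerabilities/meltdown ]; then
    cat /sys/devices/system/cpu/vulnerabilities/meltdown
else
    echo "no sysfs vulnerabilities interface on this kernel"
fi
if grep -qw nopti /proc/cmdline; then
    echo "nopti is set (PTI disabled)"
else
    echo "nopti not set"
fi
```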

 

Using iptables instead of firewalld on CentOS 7

On CentOS 7, firewalld is the default, but you can use iptables instead.

The steps are as follows.

 

1. Install the iptables-services package

# yum install iptables-services

# ls -la /etc/sysconfig/iptables

The configuration file had already been created.

 

2. Convert the policy firewalld has generated into an iptables policy

# iptables -S | tee ~/firewalld_iptables_rules
# ip6tables -S | tee ~/firewalld_ip6tables_rules

 With a command like the one below, you should be able to copy and paste the firewalld policy into /etc/sysconfig/iptables. Also remember to copy any rules you created yourself that this filter does not catch.

grep 'ACCEPT\|DROP\|QUEUE\|RETURN\|REJECT\|LOG' ~/firewalld_iptables_rules > firewalld_iptables_rules_x

Copy the firewalld policy rules into /etc/sysconfig/iptables as appropriate.
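As an alternative to filtering the `iptables -S` output by hand, `iptables-save` emits the complete running ruleset in exactly the format `/etc/sysconfig/iptables` expects. A sketch, to be run as root; note it also dumps firewalld's own helper chains, which you may want to prune afterwards:

```shell
# Back up the packaged config files, then dump the live ruleset
# (currently managed by firewalld) into the iptables-services files.
cp -a /etc/sysconfig/iptables  /etc/sysconfig/iptables.bak  2>/dev/null || true
cp -a /etc/sysconfig/ip6tables /etc/sysconfig/ip6tables.bak 2>/dev/null || true
iptables-save  > /etc/sysconfig/iptables
ip6tables-save > /etc/sysconfig/ip6tables
```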

 

3. Configure which services start

Stop firewalld from starting at boot, and make iptables start instead.

# systemctl disable firewalld.service

# systemctl mask firewalld.service

# systemctl enable iptables.service

Reboot the system.

 

4. Tuning

# systemctl status iptables.service

If it did not come up correctly, check the logs and adjust.

You may have copied over firewalld-specific settings, so fix them as needed.
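If firewalld-specific chains did make it into the file, one way to spot and drop them is by chain-name prefix. The prefix list below is an assumption based on firewalld's usual naming (IN_<zone>, FWDI_<zone>, INPUT_direct, *_ZONES, and so on); adjust it to what you actually see, and remember that jump rules into removed chains must go too:

```shell
# Filter out chain definitions and rules referencing firewalld's helper
# chains; the prefix patterns are assumptions, not an exhaustive list.
grep -Ev 'IN_|FWD|PRE_|POST_|_direct|_ZONES' \
    ~/firewalld_iptables_rules > ~/iptables_rules_clean
```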

 

(Reference)
https://www.digitalocean.com/community/tutorials/how-to-migrate-from-firewalld-to-iptables-on-centos-7

Running spectre_meltdown_checker (latest version)

The output has become more detailed, and you can see that mitigations have progressed.

 The CPU itself is vulnerable, but the kernel mitigations seem to be holding things together, more or less.

# ./spectre_meltdown_checker.sh
Spectre and Meltdown mitigation detection tool v0.34+

Checking for vulnerabilities on current system
Kernel is Linux 4.14.14-300.fc27.x86_64 #1 SMP Fri Jan 19 13:19:54 UTC 2018 x86_64
CPU is Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz

Hardware check
* Hardware support (CPU microcode) for mitigation techniques
* Indirect Branch Restricted Speculation (IBRS)
* SPEC_CTRL MSR is available: YES
* CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
* Indirect Branch Prediction Barrier (IBPB)
* PRED_CMD MSR is available: YES
* CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
* Single Thread Indirect Branch Predictors (STIBP)
* SPEC_CTRL MSR is available: YES
* CPU indicates STIBP capability: YES
* Enhanced IBRS (IBRS_ALL)
* CPU indicates ARCH_CAPABILITIES MSR availability: NO
* ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
* CPU explicitly indicates not being vulnerable to Meltdown (RDCL_NO): NO
* CPU microcode is known to cause stability problems: YES (model 94 stepping 3 ucode 0xc2)

The microcode your CPU is running on is known to cause instability problems,
such as intempestive reboots or random crashes.
You are advised to either revert to a previous microcode version (that might not have
the mitigations for Spectre), or upgrade to a newer one if available.

* CPU vulnerability to the three speculative execution attacks variants
* Vulnerable to Variant 1: YES
* Vulnerable to Variant 2: YES
* Vulnerable to Variant 3: YES

CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Mitigated according to the /sys interface: NO (kernel confirms your system is vulnerable)
* Kernel has array_index_mask_nospec: NO
* Checking count of LFENCE instructions following a jump in kernel: NO (only 5 jump-then-lfence instructions found, should be >= 30 (heuristic))
> STATUS: VULNERABLE (Kernel source needs to be patched to mitigate the vulnerability)

CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigated according to the /sys interface: YES (kernel confirms that the mitigation is active)
* Mitigation 1
* Kernel is compiled with IBRS/IBPB support: NO
* Currently enabled features
* IBRS enabled for Kernel space: NO
* IBRS enabled for User space: NO
* IBPB enabled: NO
* Mitigation 2
* Kernel compiled with retpoline option: YES
* Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline compilation)
* Retpoline enabled: YES
> STATUS: NOT VULNERABLE (Mitigation: Full generic retpoline)

CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Mitigated according to the /sys interface: YES (kernel confirms that the mitigation is active)
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)

A false sense of security is worse than no security at all, see --disclaimer

Running spectre_meltdown_checker

Laptop

# ./spectre_meltdown_checker.sh
Spectre and Meltdown mitigation detection tool v0.31

Checking for vulnerabilities against running kernel Linux 4.14.13-300.fc27.x86_64 #1 SMP Thu Jan 11 04:00:01 UTC 2018 x86_64
CPU is Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz

CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking whether we're safe according to the /sys interface: NO (kernel confirms your system is vulnerable)
> STATUS: VULNERABLE (Vulnerable)

CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Checking whether we're safe according to the /sys interface: NO (kernel confirms your system is vulnerable)
> STATUS: VULNERABLE (Vulnerable: Minimal generic ASM retpoline)

CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Checking whether we're safe according to the /sys interface: YES (kernel confirms that the mitigation is active)
> STATUS: NOT VULNERABLE (Mitigation: PTI)

A false sense of security is worse than no security at all, see --disclaimer
------------------------

A low-spec AWS server

# ./spectre_meltdown_checker.sh
Spectre and Meltdown mitigation detection tool v0.31

Checking for vulnerabilities against running kernel Linux 4.14.11-300.fc27.x86_64 #1 SMP Wed Jan 3 13:52:28 UTC 2018 x86_64
CPU is Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz

CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking count of LFENCE opcodes in kernel: NO
> STATUS: VULNERABLE (only 25 opcodes found, should be >= 70, heuristic to be improved when official patches become available)

CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigation 1
* Hardware (CPU microcode) support for mitigation
* The SPEC_CTRL MSR is available: YES
* The SPEC_CTRL CPUID feature bit is set: NO
* Kernel support for IBRS: NO
* IBRS enabled for Kernel space: NO
* IBRS enabled for User space: NO
* Mitigation 2
* Kernel compiled with retpoline option: NO
* Kernel compiled with a retpoline-aware compiler: NO
> STATUS: VULNERABLE (IBRS hardware + kernel support OR kernel with retpoline are needed to mitigate the vulnerability)

CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Checking if we're running under Xen PV (64 bits): NO
> STATUS: NOT VULNERABLE (PTI mitigates the vulnerability)

A false sense of security is worse than no security at all, see --disclaimer

--------

Tried it with an old kernel.

 

$ ./spectre_meltdown_checker.sh
Spectre and Meltdown mitigation detection tool v0.31

Note that you should launch this script with root privileges to get accurate information.
We'll proceed but you might see permission denied errors.
To run it as root, you can try the following command: sudo ./spectre_meltdown_checker.sh

Checking for vulnerabilities against running kernel Linux 4.14.8-300.fc27.x86_64 #1 SMP Wed Dec 20 19:00:18 UTC 2017 x86_64
CPU is Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz

CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Checking count of LFENCE opcodes in kernel: NO
> STATUS: VULNERABLE (only 25 opcodes found, should be >= 70, heuristic to be improved when official patches become available)

CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigation 1
* Hardware (CPU microcode) support for mitigation
* The SPEC_CTRL MSR is available: NO
* The SPEC_CTRL CPUID feature bit is set: NO
* Kernel support for IBRS: NO
* IBRS enabled for Kernel space: NO
* IBRS enabled for User space: NO
* Mitigation 2
* Kernel compiled with retpoline option: NO
* Kernel compiled with a retpoline-aware compiler: NO
> STATUS: VULNERABLE (IBRS hardware + kernel support OR kernel with retpoline are needed to mitigate the vulnerability)

CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Kernel supports Page Table Isolation (PTI): NO
* PTI enabled and active: NO
* Checking if we're running under Xen PV (64 bits): NO
> STATUS: VULNERABLE (PTI is needed to mitigate the vulnerability)

A false sense of security is worse than no security at all, see --disclaimer

Checking the spectre PoC

https://gist.github.com/intrajp/ae240cc69b37537957eadb29103bd9be

[ae240cc69b37537957eadb29103bd9be-a7ac31bcd12657a3d8dfa868b4c23e39ee68b137]$ ./spectre
Reading 40 bytes:
Reading at malicious_x = 0xffffffffffdffac0... Unclear: 0x54='T' score=976 (second best: 0x01 score=806)
Reading at malicious_x = 0xffffffffffdffac1... Unclear: 0x68='h' score=999 (second best: 0x01 score=778)
Reading at malicious_x = 0xffffffffffdffac2... Unclear: 0x65='e' score=998 (second best: 0x01 score=803)
Reading at malicious_x = 0xffffffffffdffac3... Unclear: 0x20=' ' score=999 (second best: 0x01 score=811)
Reading at malicious_x = 0xffffffffffdffac4... Unclear: 0x4D='M' score=999 (second best: 0x01 score=818)
Reading at malicious_x = 0xffffffffffdffac5... Unclear: 0x61='a' score=999 (second best: 0x01 score=823)
Reading at malicious_x = 0xffffffffffdffac6... Unclear: 0x67='g' score=999 (second best: 0x01 score=802)
Reading at malicious_x = 0xffffffffffdffac7... Unclear: 0x69='i' score=999 (second best: 0x01 score=825)
Reading at malicious_x = 0xffffffffffdffac8... Unclear: 0x63='c' score=999 (second best: 0x01 score=808)
Reading at malicious_x = 0xffffffffffdffac9... Unclear: 0x20=' ' score=999 (second best: 0x01 score=833)
Reading at malicious_x = 0xffffffffffdffaca... Unclear: 0x57='W' score=998 (second best: 0x01 score=792)
Reading at malicious_x = 0xffffffffffdffacb... Unclear: 0x6F='o' score=999 (second best: 0x01 score=793)
Reading at malicious_x = 0xffffffffffdffacc... Unclear: 0x72='r' score=999 (second best: 0x01 score=789)
Reading at malicious_x = 0xffffffffffdffacd... Unclear: 0x64='d' score=999 (second best: 0x01 score=824)
Reading at malicious_x = 0xffffffffffdfface... Unclear: 0x73='s' score=999 (second best: 0x01 score=769)
Reading at malicious_x = 0xffffffffffdffacf... Unclear: 0x20=' ' score=999 (second best: 0x01 score=812)
Reading at malicious_x = 0xffffffffffdffad0... Unclear: 0x61='a' score=999 (second best: 0x01 score=810)
Reading at malicious_x = 0xffffffffffdffad1... Unclear: 0x72='r' score=999 (second best: 0x01 score=792)
Reading at malicious_x = 0xffffffffffdffad2... Unclear: 0x65='e' score=998 (second best: 0x01 score=799)
Reading at malicious_x = 0xffffffffffdffad3... Unclear: 0x20=' ' score=999 (second best: 0x01 score=802)
Reading at malicious_x = 0xffffffffffdffad4... Unclear: 0x53='S' score=999 (second best: 0x01 score=793)
Reading at malicious_x = 0xffffffffffdffad5... Unclear: 0x71='q' score=998 (second best: 0x01 score=809)
Reading at malicious_x = 0xffffffffffdffad6... Unclear: 0x75='u' score=999 (second best: 0x01 score=824)
Reading at malicious_x = 0xffffffffffdffad7... Unclear: 0x65='e' score=999 (second best: 0x01 score=819)
Reading at malicious_x = 0xffffffffffdffad8... Unclear: 0x61='a' score=999 (second best: 0x01 score=801)
Reading at malicious_x = 0xffffffffffdffad9... Unclear: 0x6D='m' score=999 (second best: 0x01 score=783)
Reading at malicious_x = 0xffffffffffdffada... Unclear: 0x69='i' score=999 (second best: 0x01 score=821)
Reading at malicious_x = 0xffffffffffdffadb... Unclear: 0x73='s' score=999 (second best: 0x01 score=803)
Reading at malicious_x = 0xffffffffffdffadc... Unclear: 0x68='h' score=999 (second best: 0x01 score=801)
Reading at malicious_x = 0xffffffffffdffadd... Unclear: 0x20=' ' score=997 (second best: 0x01 score=832)
Reading at malicious_x = 0xffffffffffdffade... Unclear: 0x4F='O' score=999 (second best: 0x01 score=771)
Reading at malicious_x = 0xffffffffffdffadf... Unclear: 0x73='s' score=999 (second best: 0x01 score=818)
Reading at malicious_x = 0xffffffffffdffae0... Unclear: 0x73='s' score=999 (second best: 0x01 score=821)
Reading at malicious_x = 0xffffffffffdffae1... Unclear: 0x69='i' score=999 (second best: 0x01 score=837)
Reading at malicious_x = 0xffffffffffdffae2... Unclear: 0x66='f' score=998 (second best: 0x01 score=770)
Reading at malicious_x = 0xffffffffffdffae3... Unclear: 0x72='r' score=998 (second best: 0x01 score=820)
Reading at malicious_x = 0xffffffffffdffae4... Unclear: 0x61='a' score=996 (second best: 0x01 score=795)
Reading at malicious_x = 0xffffffffffdffae5... Unclear: 0x67='g' score=986 (second best: 0x01 score=755)
Reading at malicious_x = 0xffffffffffdffae6... Unclear: 0x65='e' score=951 (second best: 0x01 score=760)
Reading at malicious_x = 0xffffffffffdffae7... Unclear: 0x2E='.' score=995 (second best: 0x01 score=802)

Evaluating my own server with sar-analyzer, a tool I wrote myself

 So I gave it a try.

This is the low-spec AWS machine.

#### Report by sar-analyzer ####

-- Report of CPU utilization --

Highest Average value of '%usr(%user)' for CPU all is 21.07 (01/10/18)
Lowest Average value of '%usr(%user)' for CPU all is 0.02 (01/02/18)
Highest Average value of '%sys(%system)' for CPU all is 33.84 (01/10/18)
Lowest Average value of '%sys(%system)' for CPU all is 0.05 (01/02/18)
Highest Average value of '%iowait' for CPU all is 0.20 (01/10/18)
Lowest Average value of '%iowait' for CPU all is 0.05 (01/02/18)
Highest Average value of '%idle' for CPU all is 99.82 (01/02/18)
Lowest Average value of '%idle' for CPU all is 43.98 (01/10/18)

Highest Average value of '%usr(%user)' for CPU 0 is 21.07 (01/10/18)
Lowest Average value of '%usr(%user)' for CPU 0 is 0.02 (01/02/18)
Highest Average value of '%sys(%system)' for CPU 0 is 33.84 (01/10/18)
Lowest Average value of '%sys(%system)' for CPU 0 is 0.05 (01/02/18)
Highest Average value of '%iowait' for CPU 0 is 0.20 (01/10/18)
Lowest Average value of '%iowait' for CPU 0 is 0.05 (01/02/18)
Highest Average value of '%idle' for CPU 0 is 99.82 (01/02/18)
Lowest Average value of '%idle' for CPU 0 is 43.98 (01/10/18)
--------
Each CPU can be in one of four states: user, sys, idle, iowait.
If '%usr' is over 60%, applications are in a busy state. Check with the ps command which application is busy.
If '%sys' is over '%usr', the kernel is in a busy state. Check whether cswch is high or not.
If '%iowait' is high, the CPU is mostly waiting on other work. Note that iowait is sometimes meaningless.
Check swap statistics; high disk I/O could also be the cause. Also check process and memory statistics.
If '%idle' is lower than 30%, you may need a new CPU or more cores.
Check not only 'CPU all' but each CPU's values, and if some of them are high, check the sar file of that date.

-- Report of queue length and load averages --

Highest Average value of 'runq-sz' is 1 (12/30/17)
Lowest Average value of 'runq-sz' is 0 (01/01/18)
Highest Average value of 'plist-sz' is 357 (01/03/18)
Lowest Average value of 'plist-sz' is 293 (01/10/18)
Highest Average value of 'ldavg-1' is 0.56 (01/10/18)
Lowest Average value of 'ldavg-1' is 0.00 (01/01/18)
Highest Average value of 'ldavg-5' is 1.40 (01/10/18)
Lowest Average value of 'ldavg-5' is 0.00 (01/01/18)
Highest Average value of 'ldavg-15' is 0.77 (01/10/18)
Lowest Average value of 'ldavg-15' is 0.00 (01/01/18)
--------
If 'runq-sz' is over 2, the box is CPU bound.
If that is the case, you might need more CPU power to do the task.
If 'plist-sz' is higher than, say, 10,000, there are waits.
If 'ldavg-<minutes>' exceeds the number of cores, CPU load is high.
Check the number of cores with: $ cat /proc/cpuinfo | grep 'cpu cores'
Check the number of physical CPUs with: $ cat /proc/cpuinfo | grep 'physical id'
Check whether hyperthreading is enabled with: $ cat /proc/cpuinfo | grep 'siblings'
Divide the result of the above command; if it is not the same as the core count, hyperthreading is enabled.
So, if you have 8 cores, the highest value is 800.00, and above 70% of this value would mean trouble.
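The /proc/cpuinfo checks suggested above can be combined into one small script. A sketch for x86 Linux; some VMs omit the 'physical id' and 'cpu cores' fields, hence the guards:

```shell
# Count physical CPUs, cores per CPU, and siblings per CPU;
# siblings greater than cores means hyperthreading is enabled.
physical=$(grep 'physical id' /proc/cpuinfo | sort -u | wc -l)
cores=$(awk -F': ' '/cpu cores/ {print $2; exit}' /proc/cpuinfo)
siblings=$(awk -F': ' '/^siblings/ {print $2; exit}' /proc/cpuinfo)
echo "physical CPUs: $physical  cores/CPU: ${cores:-?}  siblings: ${siblings:-?}"
if [ -n "$cores" ] && [ -n "$siblings" ] && [ "$siblings" -gt "$cores" ]; then
    echo "hyperthreading: enabled"
else
    echo "hyperthreading: disabled (or not reported)"
fi
```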

-- Report task creation and system switching activity --

Highest Average value of '%proc/s' is 758.82 (01/10/18)
Lowest Average value of '%proc/s' is 0.20 (01/02/18)
Highest Average value of '%cswch/s' is 10015.51 (01/10/18)
Lowest Average value of '%cswch/s' is 80.21 (01/06/18)
--------
proc/s shows the number of tasks created per second.
Check the order of magnitude. It depends on the cores, but under 100 would be fine.
cswch/s shows the number of CPU context switches per second.
Check the order of magnitude. It depends on the cores, but under 10000 would be fine.

-- Report paging statistics --

Highest Average value of '%fault/s' is 48092.51 (01/10/18)
Lowest Average value of '%fault/s' is 86.95 (01/11/18)
Highest Average value of '%majflt/s' is 0.35 (01/10/18)
Lowest Average value of '%majflt/s' is 0.00 (01/01/18)
Highest Average value of '%vmeff/s' is 100.00 (01/12/18)
Lowest Average value of '%vmeff/s' is 0.00 (01/06/18)
--------
If fault/s is high, programs may be demanding memory. Check with, say, '# ps -o min_flt,maj_flt <PID>'.
If majflt/s is high, some big program was started somehow on that day.
If vmeff/s is 0, there is no worry about memory; if it is non-zero and over 90.00, that is good.
If vmeff/s is under 30.00, something is wrong.

-- Report memory utilization statistics --

Highest Average value of '%memused/s' is 85.49 (01/01/18)
Lowest Average value of '%memused/s' is 33.22 (01/10/18)
Highest Average value of 'kbcommit' is 2620525 (01/03/18)
Lowest Average value of 'kbcommit' is 2048996 (01/10/18)
Highest Average value of '%commit/s' is 258.92 (01/03/18)
Lowest Average value of '%commit/s' is 203.63 (01/10/18)
--------
Even if %memused is around 99.0%, it's OK with Linux.
Check the highest value of kbcommit. This amount of memory is needed for the system. If lacking, consider adding more memory.
If %commit is over 100%, memory shortage is occurring. Gain swap or add more memory.

-- Report I/O and transfer rate statistics --

Highest Average value of 'tps' is 5.70 (01/10/18)
Lowest Average value of 'tps' is 1.79 (01/11/18)
Highest Average value of 'bread/s' is 106.29 (01/10/18)
Lowest Average value of 'bread/s' is 0.04 (01/08/18)
Highest Average value of 'bwrtn/s' is 104.01 (01/10/18)
Lowest Average value of 'bwrtn/s' is 42.03 (01/06/18)
--------
tps is total number of transfers per second that were issued to physical devices.
A transfer is an I/O request to a physical device.
Multiple logical requests can be combined into a single I/O request to the device.
A transfer is of indeterminate size.
bread/s is Total amount of data read from the devices in blocks per second.
Blocks are equivalent to sectors and therefore have a size of 512 bytes.
bwrtn/s is Total amount of data written to devices in blocks per second.
If these values are, say, over 10000 or so, I/O was heavy on that day. Check the related sar file.
Check iowait on CPU, also.

-- Report activity for each block device --

Highest Average value of 'areq-sz' of dev202-0 is 18.44 (01/10/18)
Lowest Average value of 'areq-sz' of dev202-0 is 8.49 (01/01/18)
Highest Average value of '%util' of dev202-0 is 0.13 (01/10/18)
Lowest Average value of '%util' of dev202-0 is 0.05 (01/06/18)
--------
'areq-sz' is the average size (in kilobytes) of the I/O requests that were issued to the device.
Note: In previous versions, this field was known as avgrq-sz and was expressed in sectors.
'%util' is percentage of elapsed time during which I/O requests were issued to the device
(bandwidth utilization for the device). Device saturation occurs when this value
is close to 100% for devices serving requests serially. But for devices serving requests in
parallel, such as RAID arrays and modern SSDs, this number does not reflect their performance limits.

-- Report swap space utilization statistics --

Highest Average value of '%swpused' is 0.00 (01/01/18)
Lowest Average value of '%swpused' is 0.00 (01/01/18)
--------
'%swpused' is the percentage of used swap space.
If it is high, the system is memory bound.
--------

Looks like I really am short on memory after all.

 
