Running ceilometer meter-list reports an authentication error (HTTP 401)

The exact error is:

The request you have made requires authentication. (HTTP 401) (Request-ID: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Solution:
Unset the admin token environment variable (OS_TOKEN).

[root@server_2 ~]# unset OS_TOKEN

After that, run the command again and everything works:

[root@server_2 ~]# ceilometer meter-list
+-------+-------+-------+--------------------------------------+---------+----------------------------------+
| Name  | Type  | Unit  | Resource ID                          | User ID | Project ID                       |
+-------+-------+-------+--------------------------------------+---------+----------------------------------+
| image | gauge | image | 9b3d8758-9863-4f21-87d8-bf4bb98884e2 | None    | 0341849e239042fba0fce28f32e541b0 |
| image | gauge | image | 9f41c726-387c-478c-bbc4-3efa5f7cdd64 | None    | 0341849e239042fba0fce28f32e541b0 |
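For context: once OS_TOKEN is unset, the client falls back to the password credentials from admin-openrc.sh. A minimal sketch of such a file, with placeholder values in the style of the install guide (your password and endpoint will differ):

export OS_PROJECT_DOMAIN_NAME=default
export OS_USER_DOMAIN_NAME=default
export OS_PROJECT_NAME=admin
export OS_USERNAME=admin
export OS_PASSWORD=ADMIN_PASS        # placeholder
export OS_AUTH_URL=http://controller:35357/v3
export OS_IDENTITY_API_VERSION=3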


Fix for Swift containers disappearing shortly after creation

Yesterday, while adding the Swift component, I hit a strange problem: data uploaded to a container from the command line would disappear on its own, sometimes within a few seconds, sometimes after tens of seconds. After digging through Google results, my troubleshooting summary is:

1. NTP is misconfigured on the time server or on the storage nodes; in short, the clocks are out of sync (a quick check is sketched after this list). I had indeed forgotten to configure NTP on the storage node, but the problem remained after I fixed it;

2. Delete the .gz ring files under /etc/swift, including the backup folder; in other words, throw away the ring configuration and rebuild it from scratch. After I deleted and rebuilt, everything worked.
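For the clock-sync check in point 1, a minimal verification, assuming chrony is in use as in the install guide:

# run on the controller and on each storage node, then compare
chronyc sources
date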

Below is the material I found, for your reference:

Source: https://ask.openstack.org/en/question/56642/swift-containers-disappear-in-less-than-1-min/

The original poster's problem was exactly the same as mine:

Hi,

I had an RDO IceHouse installation of OpenStack on Fedora 20. It had no Swift installed, so I was trying to install it. The installation was done ~6-8 months ago, and since then some additional configs were made, so running packstack again was a bit of a risk.

Following mainly that document, I succeeded in getting swift running.

But the problem is that when I create a container (from a dashboard, for example), in 5-30 seconds it disappears.

    $ swift stat
       Account: AUTH_df715cfea8e240e3be22ba7bd56d148a
    Containers: 1
       Objects: 0
         Bytes: 0
 Accept-Ranges: bytes
   X-Timestamp: 1418907580.14515
    X-Trans-Id: tx811daed7a0d846d8b7ad7-005492cfcb
  Content-Type: text/plain; charset=utf-8

  $ swift stat
       Account: AUTH_df715cfea8e240e3be22ba7bd56d148a
    Containers: 0
       Objects: 0
         Bytes: 0
X-Put-Timestamp: 1418907615.35912
   X-Timestamp: 1418907615.35912
    X-Trans-Id: tx62147245b42340419681f-005492cfdf
  Content-Type: text/plain; charset=utf-8

Compare the two outputs above: in the first, right after the upload, everything is normal (Containers: 1); in the second, checked a moment later, the container has mysteriously disappeared (Containers: 0).

Here is how the poster solved it:

Ok, seems that it is fixed somehow

Possible reason: in parallel with the services (openstack-swift-…), I ran swift-init (which runs the same processes), and the configuration changes I made were not taken into account.

  1. Stop things with swift-init: swift-init kill all (Note: I tried to fix it by doing all the following steps without this one. It did not work, so it was crucial.)
  2. Stop all services (I run all on one node):
     for service in openstack-swift-object openstack-swift-object-replicator openstack-swift-object-updater openstack-swift-object-auditor openstack-swift-container openstack-swift-container-replicator openstack-swift-container-updater openstack-swift-container-auditor openstack-swift-account openstack-swift-account-replicator openstack-swift-account-reaper openstack-swift-account-auditor openstack-swift-proxy openstack-swift-account; do service $service stop; done
  3. Remove all files from the node (I had it at /srv/node/partition1).
  4. In /etc/swift, remove {account,container,object}{.builder,.ring.gz} (also remove things from /etc/swift/backup).
  5. Recreate the rings:
     cd /etc/swift
     swift-ring-builder account.builder create 18 1 1
     swift-ring-builder container.builder create 18 1 1
     swift-ring-builder object.builder create 18 1 1
     swiftstorage=ip-of-your-storage-node
     swift-ring-builder account.builder add z1-$swiftstorage:6202/partition1 100
     swift-ring-builder container.builder add z1-$swiftstorage:6201/partition1 100
     swift-ring-builder object.builder add z1-$swiftstorage:6200/partition1 100
     swift-ring-builder account.builder rebalance
     swift-ring-builder container.builder rebalance
     swift-ring-builder object.builder rebalance
     chown -R swift:swift .
  6. Restart the services:
     for service in openstack-swift-object openstack-swift-object-replicator openstack-swift-object-updater openstack-swift-object-auditor openstack-swift-container openstack-swift-container-replicator openstack-swift-container-updater openstack-swift-container-auditor openstack-swift-account openstack-swift-account-replicator openstack-swift-account-reaper openstack-swift-account-auditor openstack-swift-proxy openstack-swift-account; do service $service start; done

Now stuff started working… Hope it can help someone.

I went further and simply reformatted the partition and started over. Note: if you reboot and re-mount, remember to give /srv the right ownership, i.e. chown -R swift.swift /srv, otherwise uploads will fail with 404:

[root@server_2 ~]# swift upload C1 admin-openrc.sh
Warning: failed to create container 'C1': 404 Not Found:

Not Found

The resource could not be found.<

Object PUT failed: http://176.204.66.102:8080/v1/AUTH_0341849e239042fba0fce28f32e541b0/C1/admin-openrc.sh 404 Not Found [first 60 chars of response]

Not Found

The resource could not be found.<

Set the ownership, then upload again, and it works:
[root@object1 swift]# chown -R swift.swift /srv/

[root@server_2 ~]# swift upload C1 admin-openrc.sh
admin-openrc.sh



Error when creating a container: failed to create container 'container2': 404 Not Found:

Not Found

The resource could not be found.<

Creating a container with swift returned a 404; Google turned up the following:

Original source: https://answers.launchpad.net/swift/+question/235980

A 404 on a PUT means the "group of things" above the request failed the existence check. So if you get a 404 on an object PUT, it's because the proxy failed the container existence check. If you get a 404 on a container PUT, it's because the proxy failed the account existence check.

Check the following:

1) are you running account_autocreate? You should, it is normally on by default [1]
2) are you running with 0 recheck_account_existence? 'cause that's broken [2]

I'm curious if when you list/stat the account does it have any containers in it? Swift will lazy provision authorized accounts, and that code has been worked on in the last development cycle. If you have containers/objects in then the account definitely has a real physical data file and then it definitely exists - but if there's no data in the account it can mean different things from a troubleshooting perspective. There may be some backend error talking to the account servers - if you're unable to resolve the issue thoroughly examine any ERROR lines from /var/log/syslog for messages relating to problems connecting to account servers.

1. http://docs.openstack.org/developer/swift/deployment_guide.html#proxy-server-configuration
2. see: https://bugs.launchpad.net/swift/+bug/1224734

The fix was what one commenter said:

Actually account autocreate is not failing. Problem is with permission on /srv/[1-4]/. I corrected the permission and it's solved now.

In other words, on the storage node, give ownership of /srv to the swift user, as shown below.
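On each storage node that means something like the following (this post's devices are mounted under /srv; adjust the path to your layout):

chown -R swift:swift /srv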


openstack-swift-object.service and related services log errors on restart

[root@object1 swift]# systemctl status openstack-swift-object.service openstack-swift-object-auditor.service openstack-swift-object-replicator.service openstack-swift-object-updater.service -l
● openstack-swift-object.service - OpenStack Object Storage (swift) - Object Server
   Loaded: loaded (/usr/lib/systemd/system/openstack-swift-object.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-02-19 00:20:53 EST; 3s ago
 Main PID: 24340 (swift-object-se)
    Tasks: 4
   CGroup: /system.slice/openstack-swift-object.service
           ├─24340 /usr/bin/python2 /usr/bin/swift-object-server /etc/swift/object-server.conf
           ├─24376 /usr/bin/python2 /usr/bin/swift-object-server /etc/swift/object-server.conf
           ├─24378 /usr/bin/python2 /usr/bin/swift-object-server /etc/swift/object-server.conf
           └─24379 /usr/bin/python2 /usr/bin/swift-object-server /etc/swift/object-server.conf

Feb 19 00:20:53 object1 systemd[1]: Started OpenStack Object Storage (swift) - Object Server.
Feb 19 00:20:53 object1 object-server[24340]: Started child 24376
Feb 19 00:20:53 object1 object-server[24340]: Started child 24378
Feb 19 00:20:53 object1 object-server[24340]: Started child 24379

● openstack-swift-object-auditor.service - OpenStack Object Storage (swift) - Object Auditor
   Loaded: loaded (/usr/lib/systemd/system/openstack-swift-object-auditor.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-02-19 00:20:53 EST; 3s ago
 Main PID: 24337 (swift-object-au)
    Tasks: 2
   CGroup: /system.slice/openstack-swift-object-auditor.service
           └─24337 /usr/bin/python2 /usr/bin/swift-object-auditor /etc/swift/object-server.conf

Feb 19 00:20:53 object1 systemd[1]: Started OpenStack Object Storage (swift) - Object Auditor.
Feb 19 00:20:53 object1 object-auditor[24337]: Exception dumping recon cache: #012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 2597, in dump_recon_cache#012 with lock_file(cache_file, lock_timeout, unlink=False) as cf:#012 File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__#012 return self.gen.next()#012 File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 1877, in lock_file#012 fd = os.open(filename, flags)#012OSError: [Errno 13] Permission denied: '/var/cache/swift/object.recon'
Feb 19 00:20:53 object1 object-auditor[24337]: Exception dumping recon cache: #012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 2597, in dump_recon_cache#012 with lock_file(cache_file, lock_timeout, unlink=False) as cf:#012 File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__#012 return self.gen.next()#012 File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 1877, in lock_file#012 fd = os.open(filename, flags)#012OSError: [Errno 13] Permission denied: '/var/cache/swift/object.recon'
Feb 19 00:20:53 object1 object-auditor[24372]: Begin object audit "forever" mode (ZBF)
Feb 19 00:20:53 object1 object-auditor[24372]: Object audit (ZBF) "forever" mode completed: 0.00s. Total quarantined: 0, Total errors: 0, Total files/sec: 0.00, Total bytes/sec: 0.00, Auditing time: 0.00, Rate: 0.00
Feb 19 00:20:53 object1 object-auditor[24373]: Begin object audit "forever" mode (ALL)
Feb 19 00:20:53 object1 object-auditor[24373]: Object audit (ALL) "forever" mode completed: 0.00s. Total quarantined: 0, Total errors: 0, Total files/sec: 0.00, Total bytes/sec: 0.00, Auditing time: 0.00, Rate: 0.00

● openstack-swift-object-replicator.service - OpenStack Object Storage (swift) - Object Replicator
   Loaded: loaded (/usr/lib/systemd/system/openstack-swift-object-replicator.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-02-19 00:20:53 EST; 3s ago
 Main PID: 24338 (swift-object-re)
    Tasks: 1
   CGroup: /system.slice/openstack-swift-object-replicator.service
           └─24338 /usr/bin/python2 /usr/bin/swift-object-replicator /etc/swift/object-server.conf

Feb 19 00:20:53 object1 systemd[1]: Started OpenStack Object Storage (swift) - Object Replicator.
Feb 19 00:20:53 object1 object-replicator[24338]: Starting object replicator in daemon mode.
Feb 19 00:20:53 object1 object-replicator[24338]: Starting object replication pass.
Feb 19 00:20:53 object1 object-replicator[24338]: Nothing replicated for 0.00165200233459 seconds.
Feb 19 00:20:53 object1 object-replicator[24338]: Object replication complete. (0.00 minutes)
Feb 19 00:20:53 object1 object-replicator[24338]: Exception dumping recon cache: #012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 2597, in dump_recon_cache#012 with lock_file(cache_file, lock_timeout, unlink=False) as cf:#012 File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__#012 return self.gen.next()#012 File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 1877, in lock_file#012 fd = os.open(filename, flags)#012OSError: [Errno 13] Permission denied: '/var/cache/swift/object.recon'

● openstack-swift-object-updater.service - OpenStack Object Storage (swift) - Object Updater
   Loaded: loaded (/usr/lib/systemd/system/openstack-swift-object-updater.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-02-19 00:20:53 EST; 3s ago
 Main PID: 24342 (swift-object-up)
    Tasks: 1
   CGroup: /system.slice/openstack-swift-object-updater.service
           └─24342 /usr/bin/python2 /usr/bin/swift-object-updater /etc/swift/object-server.conf

Feb 19 00:20:53 object1 systemd[1]: Started OpenStack Object Storage (swift) - Object Updater.

Restarting any of the above services (systemctl restart openstack-swift-object.service openstack-swift-object-auditor.service openstack-swift-object-replicator.service openstack-swift-object-updater.service) logged the same error.

The error is as follows:

Feb 19 00:20:53 object1 object-auditor[24337]: Exception dumping recon cache: #012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 2597, in dump_recon_cache#012 with lock_file(cache_file, lock_timeout, unlink=False) as cf:#012 File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__#012 return self.gen.next()#012 File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 1877, in lock_file#012 fd = os.open(filename, flags)#012OSError: [Errno 13] Permission denied: '/var/cache/swift/object.recon'

The fix, per https://bugs.launchpad.net/openstack-manuals/+bug/1569878:

at the place of the command "chown -R root:swift /var/cache/swift", we need also execute the command "chmod -R 775 /var/cache/swift", as when we make dir (as root user) with command "mkdir -p /var/cache/swift", the access permission of dir /var/cache/swift will be 755 (as rwxr-xr-x), which means "swift" user can not write file in this dir. so the exception like below will occur in the log /var/log/messages. "[Errno 13] Permission denied: '/var/cache/swift/object.recon'"
---------------------------
Apr 13 20:36:27 kvm-sz-004-003 object-auditor: Exception dumping recon cache: #012Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 2597, in dump_recon_cache#012 with lock_file(cache_file, lock_timeout, unlink=False) as cf:#012 File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__#012 return self.gen.next()#012 File "/usr/lib/python2.7/site-packages/swift/common/utils.py", line 1877, in lock_file#012 fd = os.open(filename, flags)#012OSError: [Errno 13] Permission denied: '/var/cache/swift/object.recon'
----------------------------
if we execute the command "chmod -R 775 /var/cache/swift", this kind of exception will not occur.

-----------------------------------
Release: 0.1 on 2016-04-13 06:51
SHA: 97286840e4879e0e0ec745b9c813d396f218f032
Source: http://git.openstack.org/cgit/openstack/openstack-manuals/tree/doc/install-guide/source/swift-storage-install.rst
URL: http://docs.openstack.org/liberty/install-guide-rdo/swift-storage-install.html

In other words, set the permissions on /var/cache/swift to 775 (not 755: mode 755 is what mkdir leaves behind, and it denies the swift group write access). Even if the mode already looks right, run chmod once anyway, then restart the services and everything works. The full sequence is below.
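On the storage node (the service list matches the restart command above):

chown -R root:swift /var/cache/swift
chmod -R 775 /var/cache/swift
systemctl restart openstack-swift-object.service openstack-swift-object-auditor.service openstack-swift-object-replicator.service openstack-swift-object-updater.service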


Swift: creating rings from builder files

Reposted from: https://blog.csdn.net/rollingwayne/article/details/38396877

The main command is swift-ring-builder.

Usage:

swift-ring-builder add

swift-ring-builder create

swift-ring-builder list_parts

swift-ring-builder rebalance

Details:

The three values after swift-ring-builder <builder_file> create are <part_power> <replicas> <min_part_hours>:

<part_power> is the exponent in a power of 2: 2^part_power gives the total number of partitions. If set to 2, then 2^2 = 4 partitions in total. The partition count is best set to 1024 or more.

<replicas> is the number of copies of each object stored in Swift.

<min_part_hours> is the minimum time before a partition may be changed again, which prevents the next change from happening before the previous one has been synchronized.

swift-ring-builder account/container/object.builder create 10 3 24

swift-ring-builder <builder_file> add z<zone>-<ip>:<port>/<device_name> <weight>

For example: swift-ring-builder account/container/object.builder add z1-10.0.0.1:6000/swift01 1024. After the commands have been run, three ring files appear in the /etc/swift directory.

ip is the address of each host running the swift services; the three services have these default ports:

account : 6002

container : 6001

object : 6000

The final weight is a relative value: for example, if you set a 1 TB disk to 100, a 2 TB disk should be set to 200, so that swift automatically places more data on the 2 TB disk.

To summarize, the workflow for creating the ring files is:

1. Run the swift-ring-builder account.builder/container.builder/object.builder create command

2. Run the swift-ring-builder account.builder/container.builder/object.builder add command

3. Run the swift-ring-builder account.builder/container.builder/object.builder rebalance command. After it completes, the compressed ring files (account.ring.gz, container.ring.gz, object.ring.gz) appear in /etc/swift alongside the three builder files: account.builder, container.builder, and object.builder.

4. If you have multiple nodes, copy the generated /etc/swift/*.gz, swift.conf, and *-server.conf to /etc/swift on the other nodes. A consolidated example follows.
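Putting the whole flow together for one ring (the IP 10.0.0.1, device name sdb1, and weight 100 are illustrative; the create parameters and ports follow the defaults above):

cd /etc/swift
swift-ring-builder account.builder create 10 3 24
swift-ring-builder account.builder add z1-10.0.0.1:6002/sdb1 100
swift-ring-builder account.builder rebalance
# repeat for container.builder (port 6001) and object.builder (port 6000),
# then copy the resulting *.ring.gz files to /etc/swift on every node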


libvirt_storage_backend_rbd.so error and fix

The libvirtd virtualization service fails to start, reporting: error : virModuleLoadFile:53 : internal error: Failed to load module '/usr/lib64/libvirt/storage-backend/libvirt_storage_backend_rbd.so': /usr/lib64/libvirt/storage-backend/libvirt_storage_backend_rbd.so: undefined symbol: rbd_diff_iterate2.

Cause: an older version of libvirt-daemon-driver-storage-rbd, usually found in older RHEL 7.x releases, is missing the librbd1 dependency (a missing dynamic library symbol).

You can inspect the module like this:

[root@rhvh42 storage-backend]# file libvirt_storage_backend_rbd.so
libvirt_storage_backend_rbd.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=f0289019f4ff9a0c87c47c928457be3672448e47, stripped

The missing symbol lives in the librbd1 package, so the fix is:

yum update librbd1
systemctl restart libvirtd
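To verify the symbol resolves after the update (ldd and nm are standard tools; the module path comes from the error message above, and the librbd.so.1 path is an assumption that may vary):

ldd /usr/lib64/libvirt/storage-backend/libvirt_storage_backend_rbd.so | grep librbd
nm -D /usr/lib64/librbd.so.1 | grep rbd_diff_iterate2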

Detailed notes on the CPU options in a KVM guest configuration file

1. Modify the configuration file on the KVM host

[root@localhost ~]# virsh edit CentOS-7.3-X86_64
Change this in the XML configuration file:
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>IvyBridge</model>
  </cpu>
to:
  <cpu mode='host-passthrough'/>
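To confirm the change took effect after saving (domain name as in the example above):

[root@localhost ~]# virsh dumpxml CentOS-7.3-X86_64 | grep -A 2 '<cpu'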

2. The CPU models KVM defines (i.e., the models it can emulate by default)

[root@localhost ~]# cat /usr/share/libvirt/cpu_map.xml | tail -11
    <model name='POWERPC_e5500'>
      <vendor name='Freescale'/>
      <pvr value='0x80240000' mask='0xffff0000'/>
    </model>
    <model name='POWERPC_e6500'>
      <vendor name='Freescale'/>
      <pvr value='0x80400000' mask='0xffff0000'/>
    </model>
  </arch>
</cpus>
As shown above, only part of the content is excerpted here. The x86/POWER model names include:
'486' 'pentium' 'pentium2' 'pentium3' 'pentiumpro' 'coreduo' 'n270' 'core2duo' 'qemu32' 'kvm32' 'cpu64-rhel5' 'cpu64-rhel6' 'kvm64' 'qemu64' 'Conroe' 'Penryn' 'Nehalem' 'Westmere' 'SandyBridge' 'Haswell' 'athlon' 'phenom' 'Opteron_G1' 'Opteron_G2' 'Opteron_G3' 'Opteron_G4' 'Opteron_G5' 'POWER7' 'POWER7_v2.1' 'POWER7_v2.3'
Using a named model like these mainly serves to guarantee compatibility between different hosts when migrating guests.

3. The main CPU configuration modes

a. custom: you define the model yourself (the default)
<cpu mode='custom' match='exact'>
    <model fallback='allow'>kvm64</model>
 ...
    <feature policy='require' name='monitor'/>
</cpu>
b. host-model (based on the physical CPU's features, the closest standard CPU model is chosen; if no CPU mode is specified, this is the mode used)
  <cpu mode='host-model' />
c. host-passthrough (the physical CPU is exposed directly to the guest; inside the guest you see exactly the physical CPU's model)
 <cpu mode='host-passthrough'/>

4. Inside the guest, view the CPU information:

[root@localhost ~]# cat /proc/cpuinfo 
processor: 0
vendor_id: GenuineIntel
cpu family: 6
model: 58
model name: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
stepping: 9
microcode: 0x1
cpu MHz: 2494.342
cache size: 4096 KB
physical id: 0
siblings: 1
core id: 0
cpu cores: 1
apicid: 0
initial apicid: 0
fpu: yes
fpu_exception: yes
cpuid level: 13
wp: yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc arch_perfmon rep_good nopl pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm fsgsbase tsc_adjust smep
bogomips: 4988.68
clflush size: 64
cache_alignment: 64
address sizes: 42 bits physical, 48 bits virtual
power management:

5. Notes on host-passthrough

  • Use it when some physical CPU features must be passed through to the guest, for example for nested virtualization
  • Use it when the guest must see exactly the same CPU brand and model as the physical CPU, which matters on some public clouds for user experience
  • Note: guests cannot be migrated between hosts whose CPUs are different models

Fix for the KVM migration error: Host CPU does not provide required features: xxx,xxx,xxx,...

In my lab, when I migrated a KVM guest from host A to host B, starting it failed as follows:

[root@server_3 qemu]# virsh start openstack
error: Failed to start domain openstack
error: the CPU is incompatible with host CPU: Host CPU does not provide required features: fma, x2apic, movbe, tsc-deadline, xsave, avx, f16c, rdrand, fsgsbase, bmi1, hle, avx2, smep, bmi2, erms, invpcid, rtm, mpx, rdseed, adx, smap, xsaveopt, xsavec, xgetbv1, abm, 3dnowprefetch

The error means the new host's CPU does not support certain instruction sets; indeed, the CPU on the destination host is many years old. The fix is simple: change the CPU model in the guest's XML to match the host.

The CPU definition in my old XML file was:

<cpu mode='custom' match='exact' check='partial'>
    <model fallback='allow'>Skylake-Client</model>
</cpu>

Change it to follow the host:

  <cpu mode='host-passthrough' check='none'/>

Redefine the domain, start it again, and it succeeds; a sketch of that step follows.
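If you edited a copy of the XML rather than using virsh edit (which redefines on save), redefine and start explicitly (the file path here is an assumption):

[root@server_3 qemu]# virsh define /etc/libvirt/qemu/openstack.xml
[root@server_3 qemu]# virsh start openstack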


KVM Virtualization Technology

KVM (short for Kernel-based Virtual Machine) is a virtual machine built into the kernel. It is somewhat similar to Xen but aims for simpler operation; for example, to run a VM you only need to load the corresponding kvm module, which then waits in the background. Unlike Xen's full emulation, KVM requires hardware virtualization support from the chip (Intel's VT extensions or AMD's AMD-V extensions).

Repost note: reposted from Itweet's blog.

In this chapter we use VMware to virtualize a Linux environment, install the KVM virtualization software in that Linux system, and get hands-on experience of what kind of technology KVM actually is.

Can a VMware virtual machine support KVM virtualization?

A VM created in VMware does not support KVM by default, since KVM needs chip-level extensions; fortunately VMware provides a complete solution through its virtualization engine setting.

VMware version: VMware® Workstation 11.0.0 build-2305329

First, start VMware, create a CentOS 6.x virtual machine, and install it normally; the VM's virtualization engine preferred mode defaults to "Automatic".

To let the CentOS guest that VMware virtualizes support KVM, change its virtualization engine: with the VM powered off, open Edit virtual machine settings > Hardware, select the Processors entry, and in the Virtualization engine area on the right set the preferred mode to Intel VT-x/EPT or AMD-V/RVI, check "Virtualize Intel VT-x/EPT or AMD-V/RVI", then click OK.

KVM requires the host's processor to support virtualization (VT-x for Intel processors, AMD-V for AMD processors). You can check whether yours does with:

 grep --color -E '(vmx|svm)' /proc/cpuinfo

If the command prints nothing, your processor does not support hardware virtualization and you cannot use KVM.

Installing the KVM virtualization software

To install the KVM software we need a Linux operating system environment; the version chosen here is CentOS release 6.8 (Final). Install KVM inside this VMware-virtualized machine as follows:

  • First, install the EPEL repo: sudo rpm -ivh http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm
  • Install the KVM software: sudo yum install qemu-kvm qemu-kvm-tools virt-manager libvirt
  • Start the KVM service: sudo /etc/init.d/libvirtd start

After it starts you can check the state with /etc/init.d/libvirtd status. At this point KVM automatically creates a local bridge, virbr0; you can view its details with:

# ifconfig virbr0
virbr0    Link encap:Ethernet  HWaddr 52:54:00:D7:23:AD  
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

KVM uses NAT networking by default: the guest gets a private IP (for example, in the 192.168.122.0/24 range) and reaches the outside world through NAT on the host.

# brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.525400d723ad       yes             virbr0-nic

A local bridge virbr0 is created with two ports: virbr0-nic is the bridge's internal port, and vnet0 is the guest gateway port (192.168.122.1).

After a guest starts, it is configured with 192.168.122.1 (vnet0) as its gateway; all network operations are handled by the host system.

DNS/DHCP is implemented by a dnsmasq instance that the host starts and manages:

ps aux|grep dnsmasq

Note: starting libvirtd also starts iptables automatically and writes some default rules.

# iptables -nvL -t nat
Chain PREROUTING (policy ACCEPT 304 packets, 38526 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain POSTROUTING (policy ACCEPT 7 packets, 483 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 MASQUERADE  tcp  --  *      *       192.168.122.0/24    !192.168.122.0/24    masq ports: 1024-65535 
    0     0 MASQUERADE  udp  --  *      *       192.168.122.0/24    !192.168.122.0/24    masq ports: 1024-65535 
    0     0 MASQUERADE  all  --  *      *       192.168.122.0/24    !192.168.122.0/24    

Chain OUTPUT (policy ACCEPT 7 packets, 483 bytes)
 pkts bytes target     prot opt in     out     source               destination

Creating a guest with KVM

Upload an image file: CentOS-6.6-x86_64-bin-DVD1.iso

Create a raw-format file with qemu-img (note: qcow2 and raw are the disk file formats used by QEMU (KVM) guests), 5 GB in size.

qemu-img create -f raw /data/Centos-6.6-x68_64.raw 5G

Inspect the raw disk file just created:

qemu-img info /data/Centos-6.6-x68_64.raw 

image: /data/Centos-6.6-x68_64.raw
file format: raw
virtual size: 5.0G (5368709120 bytes)
disk size: 0

Start the KVM guest and install the operating system:

virt-install  --virt-type kvm --name CentOS-6.6-x86_64 --ram 512 --cdrom /data/CentOS-6.6-x86_64-bin-DVD1.iso --disk path=/data/Centos-6.6-x68_64.raw --network network=default --graphics vnc,listen=0.0.0.0 --noautoconsole

After starting it, check the state with the command below; by default port 5900 is opened on the host OS, which you can connect to with a VNC client to install the operating system visually.

# netstat -ntlp|grep 5900
tcp        0      0 0.0.0.0:5900                0.0.0.0:*                   LISTEN      2504/qemu-kvm

Note: a guest installed by KVM is just a process in the background, so you cannot immediately tell which one is which; each additional guest's port number increases by 1, and the first one created gets 5900. virsh can map a guest to its port, as shown below.
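To see which VNC display a given guest owns, virsh has a dedicated subcommand (domain name from this example; ":0" corresponds to port 5900, ":1" to 5901, and so on):

# virsh vncdisplay CentOS-6.6-x86_64
:0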

Remote VM management software

We can use VNC remote-management software to install the operating system. Two good clients I have used: one for Windows, and one for Mac that only needs a Google Chrome extension installed to get going. The software: TightVNC, and VNC® Viewer for Google Chrome.

If, like me, you use the Google Chrome VNC extension: in the Address box enter hostIP:5900, leave the Picture Quality selector at its default, and click Connect to reach the OS installation screen. Install the system the usual way, wait for the install to finish and reboot, and you can then use the KVM-virtualized operating system normally.

For TightVNC usage, see the official manual.

Managing KVM guests

KVM guests are managed with the virsh command. libvirt is the virtualization library for Linux: a long-term stable C API supporting KVM/QEMU, Xen, LXC, and other mainstream virtualization solutions. Link: libvirt.org/
virsh is libvirt's corresponding shell command.

Show the status of all guests:

virsh list --all

Start a guest:

virsh start [NAME]

List running guests:

virsh list
  • To browse the common commands: virsh --help | less

libvirt guest configuration files

The libvirt configuration files for guests live under /etc/libvirt/qemu; in production we need to modify their network settings.

# ll
total 8
-rw-------. 1 root root 3047 Oct 19  2016 Centos-6.6-x68_64.xml
drwx------. 3 root root 4096 Oct 17  2016 networks

Note: do not edit the xml files directly; use the provided command:

 virsh edit Centos-6.6-x68_64

KVM has three network types: bridged, NAT, and host-only. The default is NAT, in which other machines cannot log in to the guest; in production, bridged is generally chosen.

Monitoring KVM guests

  • Install the monitoring software
yum install virt-top -y
  • Check guest resource usage
virt-top

virt-top 23:46:39 - x86_64 1/1CPU 3392MHz 3816MB
1 domains, 1 active, 1 running, 0 sleeping, 0 paused, 0 inactive D:0 O:0 X:0
CPU: 5.6%  Mem: 2024 MB (2024 MB by guests)

   ID S RDRQ WRRQ RXBY TXBY %CPU %MEM    TIME   NAME                                                                                                 
    1 R    0    1   52    0  5.6 53.0   5:16.15 centos-6.8

Changing KVM from NAT to bridged mode [case study]

Before starting the case study, the essential facts: the host IP is 192.168.2.200, and the guest OS is Centos-6.6-x68_64.

Bring up the guest NIC

ifup eth0

The NIC here is in NAT mode: the guest can reach the internet and ping other machines, but other machines cannot log in to it!

View the NIC information on the host:

brctl show

ifconfig virbr0

ifconfig vnet0

Set up the bridge; all of this is done on the KVM host:

  • Step 1: create a bridge, attach eth0 to the new bridge, remove eth0's IP, and let the new bridge take over eth0's IP
brctl addbr br0  # create a bridge

brctl show       # show bridge info

brctl addif br0 eth0 && ip addr del dev eth0 192.168.2.200/24 && ifconfig br0 192.168.2.200/24 up

brctl show      # check the result
ifconfig br0    # verify that br0 has taken over eth0's IP

Note: the IP address here is the host's IP

  • Step 2: change the guest to bridge to br0, performed on the host
virsh list --all

ps aux |grep kvm

virsh stop Centos-6.6-x68_64

virsh list --all

Change the guest to bridge to the host: on line 52 change the type to bridge, and on line 54 change the source to bridge='br0'.

# virsh edit Centos-6.6-x68_64  # the command

52     <interface type='network'>
     53       <mac address='52:54:00:2a:2d:60'/>
     54       <source network='default'/>
     55       <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
     56     </interface>

Change it to:
52     <interface type='bridge'>
     53       <mac address='52:54:00:2a:2d:60'/>
     54       <source bridge='br0'/>
     55       <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
     56     </interface>

Start the guest and compare the bridge before and after: vnet0 is now bridged to br0.

Before starting:

# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.000c29f824c9       no              eth0
virbr0          8000.525400353d8e       yes             virbr0-nic

After starting:

# virsh start CentOS-6.6-x86_64
Domain CentOS-6.6-x86_64 started

# brctl show                   
bridge name     bridge id               STP enabled     interfaces
br0             8000.000c29f824c9       no              eth0
                                                        vnet0
virbr0          8000.525400353d8e       yes             virbr0-nic

After logging in via VNC and reconfiguring the IP, you can see that DHCP works: the guest is bridged onto the existing subnet, obtains its IP automatically, and sits in the same IP range as the host.

# ifup eth0

Logging in to this server from the host succeeds:

# ssh 192.168.2.108
root@192.168.2.108's password:
Last login: Sat Jan 30 12:40:28 2016

Logging in to this guest from another server on the same subnet also succeeds. That completes bridged networking for KVM-managed servers; in production environments, bridged networking is essential.

Summary

Creating guests with the kvm-related commands, installing, and debugging are essential skills: many existing private and public cloud products are built on KVM, so learning basic KVM usage is very useful for maintaining an OpenStack cluster. Furthermore, all OpenStack images are produced with this same underlying KVM technology and then uploaded to OpenStack's image management module, after which cloud instances can be launched from those images.

By now you should appreciate that KVM is a very low-level, core virtualization technology, while OpenStack is an upper-layer wrapper around technologies like KVM that makes operating and maintaining KVM guests convenient and visual. That is the bottom layer of today's booming cloud-computing stack; the figure below shows how it fits together.

[Figure: Libvirt_support]

As the figure shows, even without OpenStack we can still operate guests through libvirt, just more tediously and less maintainably; with OpenStack, managing, maintaining, and using the underlying virtualization becomes very convenient.


Fixing systemd's "start request repeated too quickly for xxx.service"

Linux: systemd's start request repeated too quickly for xxx.service

Reposted from https://www.hiroom2.com/2017/02/18/linux-systemd-s-start-request-repeated-too-quickly-for-xxx-service/

Repeating "systemctl restart xxx" more than 6 times in 10 seconds causes an error. This article describes the workaround.

1 start request repeated too quickly for xxx.service

This error exists to prevent a broken service from being restarted over and over when the system is in an error state.

If service restarts exceed the value of StartLimitBurst within the time specified by StartLimitInterval, service startup fails. StartLimitInterval defaults to 10 seconds and StartLimitBurst defaults to 5.
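You can read a unit's current limits directly; note that on newer systemd the interval property is named StartLimitIntervalSec, and the interval value is reported in microseconds:

$ systemctl show -p StartLimitInterval -p StartLimitBurst dhcpd
StartLimitInterval=10000000
StartLimitBurst=5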

For example, dhcpd behaves as below: the service startup fails on the 6th "systemctl restart dhcpd".

$ i=1
$ while : ; do
  echo ${i}
  i=$(expr ${i} + 1);
  sudo systemctl restart dhcpd || break
done
1
2
3
4
5
6
Job for dhcpd.service failed because start of the service was
attempted too often. See "systemctl status dhcpd.service" and
"journalctl -xe" for details.
To force a start use "systemctl reset-failed dhcpd.service" followed
by "systemctl start dhcpd.service" again.

"start request repeated too quickly for dhcpd.service" is in the journal.

$ sudo journalctl -xeu dhcpd
<snip>
systemd[1]: start request repeated too quickly for dhcpd.service
<snip>

2 Use systemctl reload

Use "systemctl reload" if the service can reload.

$ systemctl show -p CanReload named
CanReload=yes

Running "systemctl reload" more than 6 times within 10 seconds will not fail.

$ sudo systemctl start named
$ i=1
$ while : ; do
  echo ${i}
  i=$(expr ${i} + 1);
  sudo systemctl reload named || break
done
1
2
3
4
5
6
7
8
<snip>

3 Use StartLimitBurst=0 if the service cannot reload

There are services which cannot reload. The dhcpd community seems not to have the resources for implementing and maintaining reload.

$ systemctl show -p CanReload dhcpd
CanReload=no

For a service that cannot reload, disable the start-rate check by setting StartLimitBurst to 0.

Find the systemd unit file with the following command.

$ systemctl show -p FragmentPath dhcpd
FragmentPath=/usr/lib/systemd/system/dhcpd.service

Write StartLimitBurst=0 in the [Service] section.

$ diff -uprN /usr/lib/systemd/system/dhcpd.service{.org,}
--- /usr/lib/systemd/system/dhcpd.service.org   2017-02-17 10:57:45.657561554 -0500
+++ /usr/lib/systemd/system/dhcpd.service       2017-02-17 12:25:15.733977821 -0500
@@ -8,6 +8,7 @@ After=time-sync.target
 [Service]
 Type=notify
 ExecStart=/usr/sbin/dhcpd -f -cf /etc/dhcp/dhcpd.conf -user dhcpd -group dhcpd --no-pid
+StartLimitBurst=0

 [Install]
 WantedBy=multi-user.target

Load the changed unit file.

$ sudo systemctl daemon-reload
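As an alternative to editing the packaged unit file (which a package update can overwrite), the same setting can live in a drop-in override; a sketch using systemd's built-in editor, which also reloads the daemon for you:

$ sudo systemctl edit dhcpd
# in the editor that opens, add:
[Service]
StartLimitBurst=0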

Running "systemctl restart" more than 6 times within 10 seconds will no longer fail.

$ i=1
$ while : ; do
  echo ${i}
  i=$(expr ${i} + 1);
  sudo systemctl restart dhcpd || break
done
1
2
3
4
5
6
7
8
<snip>