Winse Blog

Every stop along the road is scenery; all the bustle strives for the best; all the toil is for tomorrow. What is there to fear?

Puppet Basics

Basic Usage

Installation

quick

simple example

https://docs.puppet.com/puppet/4.4/reference/quick_start_user_group.html

puppet apply -e "user { 'jargyle': ensure => present, }"
puppet apply -e "group { 'web': ensure => present, }"

puppet resource -e group web
puppet resource -e user jargyle

cd /etc/puppetlabs/code/environments/production/manifests

[root@cu2 manifests]# vi site.pp
group { 'web':
  ensure => present, # absent, present
}

user { 'jargyle':
  ensure => present,
  home => '/home/jargyle',
  shell => '/bin/bash',
  password_max_age => '99999',
  password_min_age => '0',
  groups => web,
}

puppet parser validate site.pp
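
You can also dry-run the manifest before agents pick it up (a sketch; --noop reports what would change without changing anything):

puppet apply --noop site.pp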

module helloworld

if $fqdn != 'cu2.esw.cn' {
  class { 'ntp':
    runmode      => 'cron',
    cron_command => 'ntpdate cu2',
    require      => [ Package['ntp', 'ntpdate'], File['/etc/cron.hourly'] ],
  }
}

  • hosts: take care on machines with multiple NICs

class { 'hosts':
  dynamic_mode => true,
  dynamic_ip   => $::ipaddress_bond0,
}

if $fqdn =~ /.*\.ds\.ctyun/ {
  class { 'hosts':
    dynamic_mode => true,
  }
}

cron { 'run-puppet':
  command => "source /etc/profile; puppet agent --test >/tmp/puppet-cron.log 2>&1",
  minute  => inline_template('<%= @hostname.hash.abs % 60 %>'),
}

file { '/etc/puppetlabs/mcollective/facts.yaml':
  owner    => root,
  group    => root,
  mode     => '400',
  loglevel => debug, # reduce noise in Puppet reports
  content  => inline_template("<%= scope.to_hash.reject { |k,v| k.to_s =~ /(uptime_seconds|timestamp|free)/ }.to_yaml %>"), # exclude rapidly changing facts
}

modules install

https://docs.puppet.com/puppet/latest/reference/modules_installing.html

The full name of a Forge module is formatted as username-modulename.

https://docs.puppet.com/puppet/latest/reference/modules_fundamentals.html#writing-modules

[root@cu2 code]# cd environments/production/modules/
[root@cu2 modules]# puppet module generate --skip-interview winse-hello

[root@cu2 modules]# puppet module install puppetlabs-stdlib
Notice: Preparing to install into /etc/puppetlabs/code/environments/production/modules ...
Notice: Downloading from https://forgeapi.puppetlabs.com ...
Notice: Installing -- do not interrupt ...
/etc/puppetlabs/code/environments/production/modules
└── puppetlabs-stdlib (v4.11.0)
[root@cu2 modules]# puppet module list
/etc/puppetlabs/code/environments/production/modules
├── puppetlabs-stdlib (v4.11.0)
└── winse-hello (v0.1.0)
/etc/puppetlabs/code/modules (no modules installed)
/opt/puppetlabs/puppet/modules (no modules installed)

sudo puppet module install ~/puppetlabs-apache-0.10.0.tar.gz --ignore-dependencies

Listing installed modules: use the module tool's list action to see which modules you have installed (and which directory they're installed in).

Use the --tree option to view the modules arranged by dependency instead of by location on disk.

In Puppet 4 plugin sync is enabled by default (pluginsync=true), so no extra steps are needed.
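
To double-check on any node (a sketch; puppet config print shows the effective value of a setting):

puppet config print pluginsync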

# https://github.com/example42/puppet-nrpe/issues/1
[root@cu2 modules]# tar zxvf puppet-hosts-2.0.18.tar.gz  
[root@cu2 modules]# tar zxvf puppi-2.1.12.tar.gz 
[root@cu2 modules]# ll
total 16
drwxr-xr-x 3 root root 4096 Apr 22 14:37 helloworld
drwxrwxr-x 6 root root 4096 Aug 10  2015 hosts
drwxrwxr-x 7 root root 4096 Aug  8  2015 puppi
drwxr-xr-x 6 root root 4096 Jan 12 19:08 stdlib

[root@cu2 modules]# vi /etc/puppetlabs/code/environments/production/manifests/site.pp 
node default {
  class { 'hosts': 
    dynamic_mode => true,
  }
}

# Result: apparently only live hosts get added; run agent -t twice in sequence and every agent ends up in the hosts file
[root@hadoop-slaver3 ~]# puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for hadoop-slaver3.ds.ctyun
Info: Applying configuration version '1461309849'
Notice: Applied catalog in 0.06 seconds
[root@hadoop-slaver3 ~]# cat /etc/hosts
# HEADER: This file was autogenerated at 2016-04-22 07:23:45 +0000
# HEADER: by puppet.  While it can still be managed manually, it
# HEADER: is definitely not recommended.
172.17.0.5      hadoop-slaver3
127.0.0.1       localhost
::1     localhost       ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

172.17.42.1     cu2     cu2.esw.cn
172.17.0.5      hadoop-slaver3.ds.ctyun hadoop-slaver3
172.17.0.1      hadoop-master1.ds.ctyun hadoop-master1
172.17.0.2      hadoop-master2.ds.ctyun hadoop-master2
172.17.0.3      hadoop-slaver1.ds.ctyun hadoop-slaver1
172.17.0.4      hadoop-slaver2.ds.ctyun hadoop-slaver2
  • ntp

Docker containers cannot change the system time!!

https://github.com/example42/puppet-ntp

[root@cu2 ~]# cd /etc/puppetlabs/code/environments/production/modules/
[root@cu2 modules]# ll
total 20
drwxr-xr-x 3 root root 4096 Apr 22 14:37 helloworld
drwxrwxr-x 6 root root 4096 Aug 10  2015 hosts
drwxrwxr-x 5 root root 4096 Oct 30 00:24 ntp
drwxrwxr-x 7 root root 4096 Aug  8  2015 puppi
drwxr-xr-x 6 root root 4096 Jan 12 19:08 stdlib

[root@cu2 ~]# cat /etc/puppetlabs/code/environments/production/manifests/site.pp 
node default {

  file { '/etc/cron.hourly':
    ensure => directory,
  }
 
  package { ['ntp', 'ntpdate']:
    ensure => installed,
  }

  class { 'ntp':
    runmode => 'cron',
    cron_command => 'ntpdate cu2',
    require => [ Package['ntp', 'ntpdate'], File['/etc/cron.hourly'] ],
  }

  if $fqdn =~ /.*\.ds\.ctyun/  {
    class { 'hosts':
      dynamic_mode => true,
    }
  }

}


[root@hadoop-master2 puppetlabs]# puppet agent -t
...
[root@hadoop-master2 puppetlabs]# ll /etc/cron.hourly/
total 4
-rwxr-xr-x 1 root root 197 Apr 22 08:59 ntpdate
[root@hadoop-master2 puppetlabs]# cat /etc/cron.hourly/ntpdate 
#!/bin/bash
# Managed by Puppet
export PATH=$PATH:/usr/bin:/usr/sbin:/bin:/sbin

# Wait up to 600 seconds 
randomsec=$RANDOM
let "randomsec %= 600"
sleep $randomsec

ntpdate cu2 >/dev/null

exit 0
  • sudo
[root@cu2 modules]# mv saz-sudo-3.1.0 sudo
[root@cu2 modules]# ll
total 20
drwxrwxr-x 6 hadoop root  4096 Aug 10  2015 hosts
drwxrwxr-x 5 hadoop root  4096 Oct 30  2015 ntp
drwxrwxr-x 7 hadoop root  4096 Aug  8  2015 puppi
drwxr-xr-x 6 hadoop root  4096 Jan 12 19:08 stdlib
drwxr-xr-x 8 hadoop games 4096 Jun  6  2015 sudo
[root@cu2 modules]# puppet apply -e "include sudo
> sudo::conf { 'hadoop':
> content => 'hadoop ALL=(ALL) NOPASSWD: ALL',
> }
> "
Notice: Compiled catalog for cu2.esw.cn in environment production in 0.64 seconds
Notice: /Stage[main]/Sudo/File[/etc/sudoers]/content: content changed '{md5}d31d7fefba87710cfaf3be96d81104d3' to '{md5}dc7c9180ad39e78a8c91291f4743437b'
Notice: /Stage[main]/Sudo/File[/etc/sudoers.d/]/mode: mode changed '0750' to '0550'
Notice: /Stage[main]/Main/Sudo::Conf[hadoop]/File[10_hadoop]/ensure: defined content as '{md5}627f25fd210c1351a6ed664c93b5be37'
Notice: /Stage[main]/Main/Sudo::Conf[hadoop]/Exec[sudo-syntax-check for file /etc/sudoers.d/10_hadoop]: Triggered 'refresh' from 1 events
Notice: Applied catalog in 0.43 seconds

The above covers the simple uses of Puppet; but what about large files...

Files

Sometimes you just need to push a few one-off files, and spinning up a dedicated module for that is cumbersome. With the fileserver you can sync them straight from site.pp.

  1. Add a mount point to fileserver.conf
[aj_files]
    path /etc/puppetlabs/code/environments/production/files
    allow *

Also fix ownership of the files directory: chown -R puppet files

  2. Add the file-sync resources to site.pp
file {'/etc/ssh/sshd_config':
  ensure   => 'file',
  source   => 'puppet:///aj_files/etc/ssh/sshd_config',
  notify   => Service['sshd'],
}

service{'sshd':
  ensure     => 'running',
  enable     => 'true',
  hasstatus  => 'true', 
  hasrestart => 'true',
  restart    => '/etc/init.d/sshd reload',  # use reload instead of restart
}

When there are many files, use a loop:

$binaries = ["facter", "hiera", "mco", "puppet", "puppetserver"]

# function call with lambda:
$binaries.each |String $binary| {
  file {"/usr/bin/$binary":
    ensure => link,
    target => "/opt/puppetlabs/bin/$binary",
  }
}

Or:

# one-off defined resource type, in
# /etc/puppetlabs/code/environments/production/modules/puppet/manifests/binary/symlink.pp
define puppet::binary::symlink ($binary = $title) {
  file {"/usr/bin/$binary":
    ensure => link,
    target => "/opt/puppetlabs/bin/$binary",
  }
}

# using defined type for iteration, somewhere else in your manifests
$binaries = ["facter", "hiera", "mco", "puppet", "puppetserver"]

puppet::binary::symlink { $binaries: }

Templates

https://docs.puppet.com/puppet/latest/reference/lang_relationships.html#ordering-and-notification

Node Definitions

Official documentation

PuppetExplorer Setup

Note: PuppetExplorer requires an existing PuppetDB installation (see the post "PuppetDB Installation and Configuration").

The UI PuppetDB serves on port 8080 is bare-bones, but that port actually exposes a large set of APIs. PuppetExplorer drives its display off those RESTful query endpoints and is far more detailed than the stock PuppetDB UI.
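
To get a feel for those endpoints, you can query them directly and inspect the JSON that PuppetExplorer renders (a sketch; the nodes query assumes PuppetDB 4's pdb/query/v4 API):

# list known nodes
curl http://cu3:8080/pdb/query/v4/nodes
# the population metric shown on the dashboard
curl http://cu3:8080/metrics/v1/mbeans/puppetlabs.puppetdb.population:name=num-active-nodes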

There are two ways to set up PuppetExplorer:

  • Both services under the same domain: proxy /api through to PuppetDB:8080.
  • Two separate services, each with its own address: edit config.js and deal with the CORS issues.

https://github.com/spotify/puppetexplorer

  • The recommended way to install it is on the same host as your PuppetDB instance. Then proxy /api to port 8080 of your PuppetDB instance (except the /commands endpoint). This avoids the need for any CORS headers.
  • It is possible to have it on a separate domain from your PuppetDB though. If you do, make sure you have the correct Access-Control-Allow-Origin header and a Access-Control-Expose-Headers: X-Records header.

Adapting to PuppetDB 4

The released version has not been updated for months, and the new API paths differ slightly:

# puppetdb-4.0
/metrics/v1/mbeans/puppetlabs.puppetdb.population:name=num-active-nodes

# puppetexplorer-2.0.0
/metrics/v1/mbeans/puppetlabs.puppetdb.query.population:type=default,name=num-nodes

Just fix the string app.js uses to build the URL, deleting .query and type=default,:
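
A one-liner like this does the rewrite (a sketch; the location of app.js inside the puppetexplorer directory and the exact string may differ per release, so the .bak backup is kept):

sed -i.bak 's/puppetlabs\.puppetdb\.query\.population:type=default,/puppetlabs.puppetdb.population:/' app.js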

The result once configured:

Same-host access configuration

Use nginx to serve the HTML, with proxy_pass forwarding to cu3:8080 (the PuppetDB service):

[hadoop@cu2 puppetexplorer-2.0.0]$ mv config.js.example config.js

[hadoop@cu2 nginx]$ vi conf/nginx.conf
...
# note the trailing / on the paths!!
location /api/ {
  proxy_pass http://cu3:8080/;
  proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $http_host;
}
location /puppetexplorer {
  alias /opt/puppetlabs/puppetexplorer-2.0.0;
}

[hadoop@cu2 nginx]$ sbin/nginx -s reload

Then browse to http://cu2:8888/puppetexplorer.

nginx configuration references: how proxy_pass handles the / path, and nginx as proxy to jetty.

Apache configuration

[root@hadoop-master1 html]# vi /etc/httpd/conf/httpd.conf 
...
# puppetdb
ProxyPass /puppetdb/api/ http://hadoop-master1:8080/

[root@hadoop-master1 html]# vi puppetexplorer/config.js 
...
PUPPETDB_SERVERS = [
  ['production', '/puppetdb/api'],
  ['testing', '/puppetdb/api']
];

Separate hosts: cross-origin access

Honestly, I don't recommend this approach at all. But the CORS setup genuinely surprised me; what I had assumed was completely backwards. When JavaScript served from A calls B, the CORS headers are set on server B: it is B that must allow A to access it!!

[root@hadoop-master2 puppetexplorer]# vi config.js 
// List of PuppetDB servers, pairs of name, URL and $http config object
// The first one will be used as the default server
PUPPETDB_SERVERS = [
  ['production', 'http://cu2:8888'],
  ['testing', 'http://cu2:8888']
];

# Nginx config: add the cross-origin header (tighten the allowed origins as needed)
location ~ /(metrics|pdb) {
  add_header "Access-Control-Allow-Origin" "*";
  proxy_pass http://cu3:8080;
}
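
Before pointing PuppetExplorer at it you can confirm the header is actually emitted (a sketch; -I fetches response headers only):

curl -sI http://cu2:8888/pdb/query/v4/nodes | grep -i access-control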

References

--END

PuppetDB Installation and Configuration

After installing PuppetDB you still need to adjust the PuppetServer configuration. The test machines are modest, so PuppetDB goes on cu3.

  • cu2: master server, CA server, postgresql
  • cu3: puppetdb, agent
[root@cu3 puppet]# puppetdb -v
puppetdb version: 4.0.0

[root@cu2 ~]# puppetserver -v
puppetserver version: 2.3.1
[root@cu2 ~]# puppet -V
4.4.1

Older releases had a resource (inventory) export feature; as of Puppet 4 it has been replaced by PuppetDB. See the official docs: Inventory Service

Likewise there is no longer any point installing the old Ruby puppet-dashboard; splitting front end from back end is the trend: the back end serves APIs and the front end renders them via AJAX.

Installing PuppetDB

https://docs.puppetlabs.com/puppetdb/latest/install_from_packages.html

Given the special network environment in China, see the first post for creating a local repo: the Puppet 4.4.1 installation guide

[root@cu3 ~]# yum install puppetdb
Loaded plugins: fastestmirror
Setting up Install Process
Loading mirror speeds from cached hostfile
 * epel: ftp.cuhk.edu.hk
Resolving Dependencies
--> Running transaction check
---> Package puppetdb.noarch 0:4.0.0-1.el6 will be installed
--> Processing Dependency: java-1.8.0-openjdk-headless for package: puppetdb-4.0.0-1.el6.noarch
--> Running transaction check
---> Package java-1.8.0-openjdk-headless.x86_64 1:1.8.0.77-0.b03.el6_7 will be installed
--> Processing Dependency: tzdata-java >= 2014f-1 for package: 1:java-1.8.0-openjdk-headless-1.8.0.77-0.b03.el6_7.x86_64
--> Processing Dependency: jpackage-utils for package: 1:java-1.8.0-openjdk-headless-1.8.0.77-0.b03.el6_7.x86_64
--> Running transaction check
---> Package jpackage-utils.noarch 0:1.7.5-3.14.el6 will be installed
---> Package tzdata-java.noarch 0:2016c-1.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===========================================================================================================================================================================================
 Package                                                Arch                              Version                                            Repository                               Size
===========================================================================================================================================================================================
Installing:
 puppetdb                                               noarch                            4.0.0-1.el6                                        puppet-local                             21 M
Installing for dependencies:
 java-1.8.0-openjdk-headless                            x86_64                            1:1.8.0.77-0.b03.el6_7                             updates                                  32 M
 jpackage-utils                                         noarch                            1.7.5-3.14.el6                                     base                                     60 k
 tzdata-java                                            noarch                            2016c-1.el6                                        updates                                 179 k

Transaction Summary
===========================================================================================================================================================================================
Install       4 Package(s)

Total size: 53 M
Total download size: 53 M
Installed size: 126 M
Is this ok [y/N]: y
Downloading Packages:
(1/3): java-1.8.0-openjdk-headless-1.8.0.77-0.b03.el6_7.x86_64.rpm                                                                                                  |  32 MB     00:00     
(2/3): puppetdb-4.0.0-1.el6.noarch.rpm                                                                                                                              |  21 MB     00:00     
(3/3): tzdata-java-2016c-1.el6.noarch.rpm                                                                                                                           | 179 kB     00:00     
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                                                                       32 MB/s |  53 MB     00:01     
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : tzdata-java-2016c-1.el6.noarch                                                                                                                                          1/4 
  Installing : jpackage-utils-1.7.5-3.14.el6.noarch                                                                                                                                    2/4 
  Installing : 1:java-1.8.0-openjdk-headless-1.8.0.77-0.b03.el6_7.x86_64                                                                                                               3/4 
  Installing : puppetdb-4.0.0-1.el6.noarch                                                                                                                                             4/4 
Config archive not found. Not proceeding with migration
PEM files in /etc/puppetlabs/puppetdb/ssl are missing, we will move them into place for you
Warning: Unable to find all puppet certificates to copy

  This tool requires the following certificates to exist:

  * /etc/puppetlabs/puppet/ssl/certs/ca.pem
  * /etc/puppetlabs/puppet/ssl/private_keys/cu3.esw.cn.pem
  * /etc/puppetlabs/puppet/ssl/certs/cu3.esw.cn.pem

  These files may be missing due to the fact that your host's Puppet
  certificates may not have been signed yet, probably due to the
  lack of a complete Puppet agent run. Try running puppet first, for
  example:

      puppet agent --test

  Afterwards re-run this tool then restart PuppetDB to complete the SSL
  setup:

      puppetdb ssl-setup -f
  Verifying  : jpackage-utils-1.7.5-3.14.el6.noarch                                                                                                                                    1/4 
  Verifying  : tzdata-java-2016c-1.el6.noarch                                                                                                                                          2/4 
  Verifying  : puppetdb-4.0.0-1.el6.noarch                                                                                                                                             3/4 
  Verifying  : 1:java-1.8.0-openjdk-headless-1.8.0.77-0.b03.el6_7.x86_64                                                                                                               4/4 

Installed:
  puppetdb.noarch 0:4.0.0-1.el6                                                                                                                                                            

Dependency Installed:
  java-1.8.0-openjdk-headless.x86_64 1:1.8.0.77-0.b03.el6_7                   jpackage-utils.noarch 0:1.7.5-3.14.el6                   tzdata-java.noarch 0:2016c-1.el6                  

Complete!

PuppetDB talks to the puppetserver, so it needs signed certificates. If the local puppet-agent certificate was already signed before installation, the package scripts copy the certs into the puppetdb/ssl directory automatically. Here we sign the agent first, then configure puppetdb's SSL.

[root@cu3 ~]# puppet agent --server cu2.esw.cn --test
Info: Creating a new SSL key for cu3.esw.cn
Info: Caching certificate for ca
Info: csr_attributes file loading from /etc/puppetlabs/puppet/csr_attributes.yaml
Info: Creating a new SSL certificate request for cu3.esw.cn
Info: Certificate Request fingerprint (SHA256): 16:CB:A3:6D:21:69:78:D0:0D:37:1F:A7:C1:86:2E:55:7F:B1:60:77:05:EC:F5:37:81:12:28:73:61:1A:4F:20
Info: Caching certificate for ca
Exiting; no certificate found and waitforcert is disabled

# sign on the master: puppet cert sign cu3.esw.cn

[root@cu3 ~]# puppet agent --server cu2.esw.cn --test
Info: Caching certificate for cu3.esw.cn
Info: Caching certificate_revocation_list for ca
Info: Caching certificate for cu3.esw.cn
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Caching catalog for cu3.esw.cn
Info: Applying configuration version '1461159906'
Info: Creating state file /opt/puppetlabs/puppet/cache/state/state.yaml
Notice: Applied catalog in 0.02 seconds
[root@cu3 ~]# puppetdb ssl-setup -f
PEM files in /etc/puppetlabs/puppetdb/ssl are missing, we will move them into place for you
Copying files: /etc/puppetlabs/puppet/ssl/certs/ca.pem, /etc/puppetlabs/puppet/ssl/private_keys/cu3.esw.cn.pem and /etc/puppetlabs/puppet/ssl/certs/cu3.esw.cn.pem to /etc/puppetlabs/puppetdb/ssl
Backing up /etc/puppetlabs/puppetdb/conf.d/jetty.ini to /etc/puppetlabs/puppetdb/conf.d/jetty.ini.bak.1461159930 before making changes
Updated default settings from package installation for ssl-host in /etc/puppetlabs/puppetdb/conf.d/jetty.ini.
Updated default settings from package installation for ssl-port in /etc/puppetlabs/puppetdb/conf.d/jetty.ini.
Updated default settings from package installation for ssl-key in /etc/puppetlabs/puppetdb/conf.d/jetty.ini.
Updated default settings from package installation for ssl-cert in /etc/puppetlabs/puppetdb/conf.d/jetty.ini.
Updated default settings from package installation for ssl-ca-cert in /etc/puppetlabs/puppetdb/conf.d/jetty.ini.
[root@cu3 ~]# 

Installing PostgreSQL

With SSL in place, the next step is the database connection. The default configuration in puppet 4.4 only includes the postgres database, so install it with yum; the steps are listed briefly below.

https://docs.puppetlabs.com/puppetdb/latest/configure.html#using-postgresql

[root@cu2 ~]# yum localinstall http://yum.postgresql.org/9.4/redhat/rhel-6-x86_64/pgdg-centos94-9.4-1.noarch.rpm
[root@cu2 ~]# yum install postgresql94-server
[root@cu2 ~]# yum install postgresql94-contrib

[root@cu2 ~]# service postgresql-9.4 initdb
Initializing database:                                     [  OK  ]
[root@cu2 ~]# service postgresql-9.4 status
postgresql-9.4 is stopped
[root@cu2 ~]# service postgresql-9.4 start
Starting postgresql-9.4 service:                           [  OK  ]


# check the PGDATA directory first!!
[root@cu2 data]# grep "PGDATA=" /etc/init.d/postgresql-9.4 
PGDATA=/usr/local/pgsql/data
OLDPGDATA=` sed -n 's/^PGDATA=//p' /etc/init.d/postgresql-$PGPREVMAJORVERSION`
NEWPGDATA=` sed -n 's/^PGDATA=//p' /etc/init.d/postgresql-$PGMAJORVERSION`


# switch to the postgres user and verify $PGDATA is correct!! otherwise fix it in .bash_profile yourself!!
[root@cu2 puppet]# su - postgres
-bash-4.1$ echo $PGDATA
/usr/local/pgsql/data

# create the user
-bash-4.1$ createuser -DRSP puppetdb
Enter password for new role: 
Enter it again: 
-bash-4.1$ 
-bash-4.1$ createdb -E utf8 -O puppetdb puppetdb

-bash-4.1$ psql puppetdb -c 'create extension pg_trgm'
CREATE EXTENSION

# configure client access (the PostgreSQL equivalent of MySQL privileges)
-bash-4.1$ vi $PGDATA/pg_hba.conf 
host    all             all              0.0.0.0/0               md5

# restart
[root@cu2 puppet]# service postgresql-9.4 restart
Stopping postgresql-9.4 service:                           [  OK  ]
Starting postgresql-9.4 service:                           [  OK  ]

# test
[root@cu2 puppet]# psql -h localhost puppetdb puppetdb
psql (9.4.5)
Type "help" for help.

puppetdb=> 
puppetdb=> \q

Check the postgres port:

[root@cu2 puppet]# netstat -anp | grep post
tcp        0      0 0.0.0.0:5432                0.0.0.0:*                   LISTEN      8126/postmaster     
tcp        0      0 :::5432                     :::*                        LISTEN      8126/postmaster     
udp        0      0 ::1:39400                   ::1:39400                   ESTABLISHED 8126/postmaster     
unix  2      [ ACC ]     STREAM     LISTENING     954965338 8126/postmaster     /tmp/.s.PGSQL.5432

# after a client connects:
[root@cu2 ~]# netstat -anp | grep post
tcp        0      0 0.0.0.0:5432                0.0.0.0:*                   LISTEN      8126/postmaster     
tcp        0      0 192.168.0.214:5432          192.168.0.148:60626         ESTABLISHED 20589/postgres 
...

Starting PuppetDB

[root@cu3 ~]# vi /etc/puppetlabs/puppetdb/conf.d/database.ini 
[database]
classname = org.postgresql.Driver
subprotocol = postgresql

# The database address, i.e. //HOST:PORT/DATABASE_NAME
subname = //cu2:5432/puppetdb

# Connect as a specific user
username = puppetdb

# Use a specific password
password = puppetdb

# How often (in minutes) to compact the database
# gc-interval = 60
# expired nodes disappear from the num-active-nodes API query, but are not yet deleted from the pgsql database; you can also run puppet node deactivate by hand
# node-ttl = 30d
# unset (disabled) by default; same format as node-ttl
# node-purge-ttl = 
# report-ttl = 14d

# Number of seconds before any SQL query is considered 'slow'; offending
# queries will not be interrupted, but will be logged at the WARN log level.
log-slow-statements = 10


# be sure to change this, or the web UI is reachable from localhost only!!
[root@cu3 ~]# vi /etc/puppetlabs/puppetdb/conf.d/jetty.ini
...
host = 0.0.0.0

# JVM options
[root@cu3 ~]# less /etc/sysconfig/puppetdb 
JAVA_BIN="/usr/local/jdk1.7.0_17/bin/java"
JAVA_ARGS="-XX:MaxPermSize=128m -Xmx2g"

[root@cu3 ~]# service puppetdb start
Starting puppetdb:                                         [  OK  ]
[root@cu3 ~]# 
[root@cu3 ~]# service puppetdb status
puppetdb (pid  8452) is running...

# 8081 is the HTTPS port puppetserver writes to; 8080 is the HTTP web-UI port
[root@cu3 ~]# netstat -anp | grep 8081
tcp        0      0 :::8081                     :::*                        LISTEN      8794/java           

Open port 8080 in a browser to view cluster status. Right now nothing shows up: the master still has to be configured to send its data to puppetdb.

Master-side configuration

https://docs.puppet.com/puppetdb/latest/connect_puppet_master.html

# install the plug-in
# the master also needs puppetdb-termini, otherwise it errors out
[root@cu2 puppet]# yum install puppetdb-termini
Loaded plugins: fastestmirror, priorities
Setting up Install Process
Loading mirror speeds from cached hostfile
 * epel: mirrors.opencas.cn
Resolving Dependencies
--> Running transaction check
---> Package puppetdb-termini.noarch 0:3.2.4-1.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

==========================================================================================================================================================================
 Package                                      Arch                               Version                                   Repository                                Size
==========================================================================================================================================================================
Installing:
 puppetdb-termini                             noarch                             3.2.4-1.el6                               puppet-local                              25 k

Transaction Summary
==========================================================================================================================================================================
Install       1 Package(s)

Total download size: 25 k
Installed size: 69 k
Is this ok [y/N]: y
Downloading Packages:
puppetdb-termini-3.2.4-1.el6.noarch.rpm                                                                                                            |  25 kB     00:00     
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : puppetdb-termini-3.2.4-1.el6.noarch                                                                                                                    1/1 
  Verifying  : puppetdb-termini-3.2.4-1.el6.noarch                                                                                                                    1/1 

Installed:
  puppetdb-termini.noarch 0:3.2.4-1.el6                                                                                                                                   

Complete!


# the hostname in this URL must match the name in the CA certificate!! plain cu3 is wrong!!
# /etc/puppetlabs/puppet
[root@cu2 puppet]# vi puppetdb.conf 
[main]
server_urls = https://cu3.esw.cn:8081

[root@cu2 puppet]# vi puppet.conf 
# This file can be used to override the default puppet settings.
# See the following links for more details on what settings are available:
# - https://docs.puppetlabs.com/puppet/latest/reference/config_important_settings.html
# - https://docs.puppetlabs.com/puppet/latest/reference/config_about_settings.html
# - https://docs.puppetlabs.com/puppet/latest/reference/config_file_main.html
# - https://docs.puppetlabs.com/puppet/latest/reference/configuration.html
[master]
vardir = /opt/puppetlabs/server/data/puppetserver
logdir = /var/log/puppetlabs/puppetserver
rundir = /var/run/puppetlabs/puppetserver
pidfile = /var/run/puppetlabs/puppetserver/puppetserver.pid
codedir = /etc/puppetlabs/code

#autosign = true

storeconfigs = true
storeconfigs_backend = puppetdb
reports = store,puppetdb

[root@cu2 puppet]# puppet master --configprint route_file
/etc/puppetlabs/puppet/routes.yaml

[root@hadoop-master2 puppet]# vi routes.yaml 
---
master:
  facts:
    terminus: puppetdb
    cache: yaml

[root@cu2 puppet]# service puppetserver restart
Stopping puppetserver:                                     [  OK  ]
Starting puppetserver:                                     [  OK  ]

[root@cu2 puppet]# puppet agent --server cu2.esw.cn --test 
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Caching catalog for cu2.esw.cn
Info: Applying configuration version '1461162748'
Notice: Applied catalog in 0.01 seconds

If the puppet-agent service isn't running, run --test once on each machine to contact the PuppetServer; the host count then shows up on the puppetdb 8080 page.
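
A quick way to do that from the master (a sketch, assuming passwordless SSH to the agents):

for h in cu1 cu3 cu4 cu5; do
  ssh $h "puppet agent --server cu2.esw.cn --test"
done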

--END

Puppet Basics: Domain Names and Certificates

It is no exaggeration to say that 90% of the problems in a first Puppet setup are domain-name related!! Every node-to-node connection is certificate-authenticated, and certificates are bound to domain names.

This post covers checking and configuring the FQDN (Fully Qualified Domain Name), plus the certificate operations in Puppet 4.4.

Environment

The test environment is a handful of cloud hosts, named per project convention (meaning the cloud's internal DNS cannot resolve them). Unless noted otherwise the OS is CentOS 6.

  • cu2: master, CA server
  • cu1/cu3/cu4/cu5: agents

What follows is the domain-name situation before deployment, working through the naming headaches step by step. To spare yourself the trouble, use proper FQDNs from the start.

Re-signing the master's certificate

Testing right after installation, the default server name is puppet; so either make puppet resolve to a host, or pass the --server flag.

# the default server name puppet does not resolve to any host
[root@cu2 ~]# puppet agent --test
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: getaddrinfo: Name or service not known
Info: Retrieving pluginfacts
Error: /File[/opt/puppetlabs/puppet/cache/facts.d]: Failed to generate additional resources using 'eval_generate': getaddrinfo: Name or service not known
Error: /File[/opt/puppetlabs/puppet/cache/facts.d]: Could not evaluate: Could not retrieve file metadata for puppet:///pluginfacts: getaddrinfo: Name or service not known
Info: Retrieving plugin
Error: /File[/opt/puppetlabs/puppet/cache/lib]: Failed to generate additional resources using 'eval_generate': getaddrinfo: Name or service not known
Error: /File[/opt/puppetlabs/puppet/cache/lib]: Could not evaluate: Could not retrieve file metadata for puppet:///plugins: getaddrinfo: Name or service not known
Error: Could not retrieve catalog from remote server: getaddrinfo: Name or service not known
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
Error: Could not send report: getaddrinfo: Name or service not known


# adding the search domain doesn't help: the DNS server doesn't know our custom names
[root@cu2 ~]# cat /etc/resolv.conf 
; generated by /sbin/dhclient-script
search ds.ctyun
nameserver 192.168.0.1
[root@cu2 ~]# puppet agent --server cu2.ds.ctyun --test
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: getaddrinfo: Name or service not known
Info: Retrieving pluginfacts
Error: /File[/opt/puppetlabs/puppet/cache/facts.d]: Failed to generate additional resources using 'eval_generate': getaddrinfo: Name or service not known
Error: /File[/opt/puppetlabs/puppet/cache/facts.d]: Could not evaluate: Could not retrieve file metadata for puppet:///pluginfacts: getaddrinfo: Name or service not known
Info: Retrieving plugin
Error: /File[/opt/puppetlabs/puppet/cache/lib]: Failed to generate additional resources using 'eval_generate': getaddrinfo: Name or service not known
Error: /File[/opt/puppetlabs/puppet/cache/lib]: Could not evaluate: Could not retrieve file metadata for puppet:///plugins: getaddrinfo: Name or service not known
Error: Could not retrieve catalog from remote server: getaddrinfo: Name or service not known
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
Error: Could not send report: getaddrinfo: Name or service not known
[root@cu2 ~]# ping cu2.ds.ctyun
ping: unknown host cu2.ds.ctyun


# the fabled -f flag is no help here
[root@cu2 ~]# hostname -i
192.168.0.x
[root@cu2 ~]# hostname -f
cu2


# add a custom domain and reset the FQDN hostname; instead of changing the hostname you can add **domain esw.cn** to /etc/resolv.conf
[root@cu2 ~]# vi /etc/hosts
192.168.0.x cu1 cu1.esw.cn
192.168.0.x cu2 cu2.esw.cn

192.168.0.x cu3 cu3.esw.cn
192.168.0.x cu4 cu4.esw.cn
192.168.0.x cu5 cu5.esw.cn

[root@cu2 ~]# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=cu2.esw.cn
[root@cu2 ~]# hostname cu2.esw.cn
[root@cu2 ~]# hostname
cu2.esw.cn

# confirm
[root@cu2 ~]# puppet config print certname
cu2.esw.cn

[root@cu2 puppet]# dnsdomainname -v
gethostname()=`cu2.esw.cn'
Resolving `cu2.esw.cn' ...
Result: h_name=`cu2'
Result: h_aliases=`cu2.esw.cn'
Result: h_addr_list=`192.168.0.214'

[root@cu2 puppet]# hostname -f -v
gethostname()=`cu2.esw.cn'
Resolving `cu2.esw.cn' ...
Result: h_name=`cu2'
Result: h_aliases=`cu2.esw.cn'
Result: h_addr_list=`192.168.0.214'
cu2


# clean out the certificate already issued for this host
[root@cu2 ~]# puppet cert list --all
+ "cu2.ds.ctyun" (SHA256) A6:30:6D:80:A8:04:60:56:4C:F3:D5:3C:9A:5C:2A:11:6C:A6:A9:F7:6E:5E:A5:37:59:28:5B:B6:E3:D3:73:D5 (alt names: "DNS:puppet", "DNS:cu2.ds.ctyun")

[root@cu2 ~]# puppet cert clean cu2.ds.ctyun
Notice: Revoked certificate with serial 2
Notice: Removing file Puppet::SSL::Certificate cu2.ds.ctyun at '/etc/puppetlabs/puppet/ssl/ca/signed/cu2.ds.ctyun.pem'
Notice: Removing file Puppet::SSL::Certificate cu2.ds.ctyun at '/etc/puppetlabs/puppet/ssl/certs/cu2.ds.ctyun.pem'
Notice: Removing file Puppet::SSL::Key cu2.ds.ctyun at '/etc/puppetlabs/puppet/ssl/private_keys/cu2.ds.ctyun.pem'


# the server's own certificate changed, so restarting puppetserver regenerates and re-signs it
[root@cu2 ~]# service puppetserver restart
Stopping puppetserver:                                     [  OK  ]
Starting puppetserver:                                     [  OK  ]

[root@cu2 puppet]# tree /etc/puppetlabs/puppet/ssl
/etc/puppetlabs/puppet/ssl
├── ca
│   ├── ca_crl.pem
│   ├── ca_crt.pem
│   ├── ca_key.pem
│   ├── ca_pub.pem
│   ├── inventory.txt
│   ├── private
│   ├── requests
│   ├── serial
│   └── signed
│       └── cu2.esw.cn.pem
├── certificate_requests
├── certs
│   ├── ca.pem
│   └── cu2.esw.cn.pem
├── crl.pem
├── private
├── private_keys
│   └── cu2.esw.cn.pem
└── public_keys
    └── cu2.esw.cn.pem

9 directories, 12 files

[root@cu2 ~]# puppet agent --server cu2.esw.cn --test
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Caching catalog for cu2.esw.cn
Info: Applying configuration version '1461149778'
Info: Creating state file /opt/puppetlabs/puppet/cache/state/state.yaml
Notice: Applied catalog in 0.01 seconds

Re-signing agents

When an agent's domain name is wrong, cleaning up the signing requests needs the master's cooperation.

# first sync /etc/hosts to every agent node


# cu1 connects to the server cu2
[root@cu1 ~]# puppet agent --server cu2.esw.cn --test
Info: Creating a new SSL key for cu1.ds.ctyun
Info: Caching certificate for ca
Info: csr_attributes file loading from /etc/puppetlabs/puppet/csr_attributes.yaml
Info: Creating a new SSL certificate request for cu1.ds.ctyun
Info: Certificate Request fingerprint (SHA256): 4F:D6:DC:25:22:D9:44:E5:70:9F:9B:B1:0F:99:B2:AC:F5:5F:50:CE:B7:C3:AF:65:F4:E2:DF:D5:2D:6F:96:07
Info: Caching certificate for ca
Exiting; no certificate found and waitforcert is disabled


# before the domain change, a CSR for the ds.ctyun domain had already been sent
# change the host's domain, then request again
[root@cu1 ~]# vi /etc/resolv.conf 
; generated by /sbin/dhclient-script
domain esw.cn
search ds.ctyun
nameserver 192.168.0.1

[root@cu1 ~]#  puppet config print certname
cu1.esw.cn

[root@cu1 ~]# puppet agent --server cu2.esw.cn --test
Info: Creating a new SSL key for cu1.esw.cn
Info: csr_attributes file loading from /etc/puppetlabs/puppet/csr_attributes.yaml
Info: Creating a new SSL certificate request for cu1.esw.cn
Info: Certificate Request fingerprint (SHA256): B8:A1:65:B6:FE:02:87:B1:8D:0A:62:2E:FE:30:DD:B3:3B:C9:A2:B2:A1:50:11:D3:FE:03:6A:81:A6:84:C0:6B
Exiting; no certificate found and waitforcert is disabled


# the master cu2 now holds both of cu1's signing requests
[root@cu2 puppet]# puppet cert list --all
  "cu1.ds.ctyun"  (SHA256) 4F:D6:DC:25:22:D9:44:E5:70:9F:9B:B1:0F:99:B2:AC:F5:5F:50:CE:B7:C3:AF:65:F4:E2:DF:D5:2D:6F:96:07
  "cu1.esw.cn" (SHA256) B8:A1:65:B6:FE:02:87:B1:8D:0A:62:2E:FE:30:DD:B3:3B:C9:A2:B2:A1:50:11:D3:FE:03:6A:81:A6:84:C0:6B
+ "cu2.esw.cn" (SHA256) 3D:8E:4E:18:45:F4:8C:9B:71:7C:13:45:0D:8A:6F:A5:6E:22:D5:0E:B1:B0:54:29:47:02:AE:95:8B:E6:A6:B7 (alt names: "DNS:puppet", "DNS:cu2.esw.cn")


# locally remove the stale signing request, or just delete the ssl directory: rm -rf /var/lib/puppet/ssl
[root@cu1 ~]# puppet certificate_request destroy cu1.ds.ctyun
Notice: Removing file Puppet::SSL::CertificateRequest cu1.ds.ctyun at '/etc/puppetlabs/puppet/ssl/certificate_requests/cu1.ds.ctyun.pem'
1


# on the master, clean that client's stale request
# http://serverfault.com/questions/574976/puppet-trying-to-configure-puppet-client-for-first-use-but-got-some-problems-wi
[root@cu2 puppet]# puppet node clean cu1.ds.ctyun 
Notice: Removing file Puppet::SSL::CertificateRequest cu1.ds.ctyun at '/etc/puppetlabs/puppet/ssl/ca/requests/cu1.ds.ctyun.pem'
cu1.ds.ctyun


# sign on the master; the agent then syncs the manifest
[root@cu2 puppet]# puppet cert sign cu1.esw.cn
Notice: Signed certificate request for cu1.esw.cn
Notice: Removing file Puppet::SSL::CertificateRequest cu1.esw.cn at '/etc/puppetlabs/puppet/ssl/ca/requests/cu1.esw.cn.pem'

[root@cu1 ~]# puppet agent --server cu2.esw.cn --test
Info: Caching certificate_revocation_list for ca
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Caching catalog for cu1.esw.cn
Info: Applying configuration version '1461156849'
Info: Creating state file /opt/puppetlabs/puppet/cache/state/state.yaml
Notice: Applied catalog in 0.01 seconds

After fixing the domain on the remaining hosts, sign them in one batch:

[root@cu2 puppet]# puppet cert list 
  "cu3.esw.cn" (SHA256) 16:CB:A3:6D:21:69:78:D0:0D:37:1F:A7:C1:86:2E:55:7F:B1:60:77:05:EC:F5:37:81:12:28:73:61:1A:4F:20
  "cu4.esw.cn" (SHA256) CB:80:64:BD:B8:56:56:43:90:11:D4:B2:A9:7B:D8:DC:E4:0C:8D:5A:71:0B:FF:97:65:20:F5:B4:D7:15:11:B6
  "cu5.esw.cn" (SHA256) D6:9A:B0:93:98:94:D2:D2:E3:A9:55:24:EC:7A:E0:13:48:5B:26:16:6C:5A:B6:11:F5:7C:F2:56:E4:DA:D8:31
[root@cu2 puppet]# puppet cert sign --all
Notice: Signed certificate request for cu5.esw.cn
Notice: Removing file Puppet::SSL::CertificateRequest cu5.esw.cn at '/etc/puppetlabs/puppet/ssl/ca/requests/cu5.esw.cn.pem'
Notice: Signed certificate request for cu4.esw.cn
Notice: Removing file Puppet::SSL::CertificateRequest cu4.esw.cn at '/etc/puppetlabs/puppet/ssl/ca/requests/cu4.esw.cn.pem'
Notice: Signed certificate request for cu3.esw.cn
Notice: Removing file Puppet::SSL::CertificateRequest cu3.esw.cn at '/etc/puppetlabs/puppet/ssl/ca/requests/cu3.esw.cn.pem'


# final state
[root@cu2 puppet]# puppet cert list --all
+ "cu1.esw.cn" (SHA256) 46:69:EE:A8:E5:F9:FB:E3:59:63:C5:FC:52:AF:14:43:70:EF:D0:42:70:C4:0E:D2:14:E4:1C:D9:94:F8:9E:E7
+ "cu2.esw.cn" (SHA256) 3D:8E:4E:18:45:F4:8C:9B:71:7C:13:45:0D:8A:6F:A5:6E:22:D5:0E:B1:B0:54:29:47:02:AE:95:8B:E6:A6:B7 (alt names: "DNS:puppet", "DNS:cu2.esw.cn")
+ "cu3.esw.cn" (SHA256) 58:ED:A3:CC:B9:53:34:4B:64:3C:2A:B4:91:AD:0D:8F:AF:EA:B0:5C:A7:73:06:F1:A7:4B:D2:E2:06:B5:21:39
+ "cu4.esw.cn" (SHA256) DD:A2:B9:86:53:29:DB:12:A3:0C:AA:9C:11:68:72:70:72:E2:16:36:8E:20:AC:E5:48:12:36:E2:80:6C:F0:E6
+ "cu5.esw.cn" (SHA256) EE:E6:FB:D2:1A:04:AD:C3:5B:1F:4F:79:C3:B6:36:15:B5:AC:8B:8B:5D:CB:A4:AA:AF:7B:FB:50:0B:83:7E:38

Autosign configuration

It's all learning; growth comes from endless tinkering. In production, though, avoid wiping the master's signed certificates: every client has to re-sign, and anything else that was issued a certificate (puppetdb and the like) has to be reconfigured too.

Note: if PuppetDB was installed per the official steps, clean server-side certificates with puppet cert clean DOMAIN; otherwise PuppetDB keeps a stale cached copy of the certificate info.
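
Per host, the safe cleanup sequence then looks like this (a sketch; deactivate retires the node in PuppetDB, clean revokes and removes its certificate):

puppet node deactivate cu1.ds.ctyun
puppet cert clean cu1.ds.ctyun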

# https://tickets.puppetlabs.com/browse/PUP-1426
# apparently wiping all signed certs at once is not supported
[root@cu2 ~]# puppet cert clean --all 
Error: Refusing to revoke all certs, provide an explicit list of certs to revoke

# just delete the ssl directory
[root@cu2 ~]# puppet master --configprint ssldir
/etc/puppetlabs/puppet/ssl

[root@cu2 ~]# cd /etc/puppetlabs/puppet
[root@cu2 puppet]# ll
...
drwxrwx--x 8 puppet puppet 4096 Apr 20 15:10 ssl

# mind the ssl directory's permissions; only the files inside are removed here
[root@cu2 puppet]# service puppetserver stop
Stopping puppetserver:                                     [  OK  ]
[root@cu2 puppet]# 
[root@cu2 puppet]# rm -rf ssl/*


# start the service first and see what happens when a previously-signed agent reconnects
[root@cu2 puppet]# service puppetserver start
Starting puppetserver:                                     [  OK  ]

[root@cu2 puppet]# tree ssl/
ssl/
├── ca
│   ├── ca_crl.pem
│   ├── ca_crt.pem
│   ├── ca_key.pem
│   ├── ca_pub.pem
│   ├── inventory.txt
│   ├── requests
│   ├── serial
│   └── signed
│       └── cu2.esw.cn.pem
├── certificate_requests
├── certs
│   ├── ca.pem
│   └── cu2.esw.cn.pem
├── crl.pem
├── private
├── private_keys
│   └── cu2.esw.cn.pem
└── public_keys
    └── cu2.esw.cn.pem


# the agent's next run fails; delete its ssl dir, then re-sign
[root@cu3 ~]# puppet agent --server cu2.esw.cn --test
Warning: Unable to fetch my node definition, but the agent run will continue:
Warning: SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get local issuer certificate for /CN=cu2.esw.cn]
Info: Retrieving pluginfacts
Error: /File[/opt/puppetlabs/puppet/cache/facts.d]: Failed to generate additional resources using 'eval_generate': SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get local issuer certificate for /CN=cu2.esw.cn]
Error: /File[/opt/puppetlabs/puppet/cache/facts.d]: Could not evaluate: Could not retrieve file metadata for puppet:///pluginfacts: SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get local issuer certificate for /CN=cu2.esw.cn]
Info: Retrieving plugin
Error: /File[/opt/puppetlabs/puppet/cache/lib]: Failed to generate additional resources using 'eval_generate': SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get local issuer certificate for /CN=cu2.esw.cn]
Error: /File[/opt/puppetlabs/puppet/cache/lib]: Could not evaluate: Could not retrieve file metadata for puppet:///plugins: SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get local issuer certificate for /CN=cu2.esw.cn]
Error: Could not retrieve catalog from remote server: SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get local issuer certificate for /CN=cu2.esw.cn]
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
Error: Could not send report: SSL_connect returned=1 errno=0 state=error: certificate verify failed: [unable to get local issuer certificate for /CN=cu2.esw.cn]

[root@cu3 ~]# puppet agent --configprint ssldir
/etc/puppetlabs/puppet/ssl
[root@cu3 ~]# cd /etc/puppetlabs/puppet
[root@cu3 puppet]# rm -rf ssl/*
[root@cu3 puppet]# puppet agent --server cu2.esw.cn --test
Info: Creating a new SSL key for cu3.esw.cn
Info: Caching certificate for ca
Info: csr_attributes file loading from /etc/puppetlabs/puppet/csr_attributes.yaml
Info: Creating a new SSL certificate request for cu3.esw.cn
Info: Certificate Request fingerprint (SHA256): 9D:58:14:C0:CA:DD:51:77:0B:3F:EB:09:02:9B:D6:67:04:FD:48:7A:6E:CB:83:43:8D:5B:A9:78:0C:89:90:5B
Info: Caching certificate for ca
Exiting; no certificate found and waitforcert is disabled

[root@cu2 puppet]# puppet cert list --all
  "cu3.esw.cn" (SHA256) 9D:58:14:C0:CA:DD:51:77:0B:3F:EB:09:02:9B:D6:67:04:FD:48:7A:6E:CB:83:43:8D:5B:A9:78:0C:89:90:5B
+ "cu2.esw.cn" (SHA256) BA:C4:C9:CC:92:6E:45:2E:B1:7F:BC:15:49:0A:2C:BB:5F:C6:B0:73:EB:6C:21:EA:C8:A6:DD:2D:FE:DF:67:70 (alt names: "DNS:puppet", "DNS:cu2.esw.cn")
[root@cu2 puppet]# puppet cert sign --all
Notice: Signed certificate request for cu3.esw.cn
Notice: Removing file Puppet::SSL::CertificateRequest cu3.esw.cn at '/etc/puppetlabs/puppet/ssl/ca/requests/cu3.esw.cn.pem'

[root@cu3 puppet]# puppet agent --server cu2.esw.cn --test
Info: Caching certificate for cu3.esw.cn
Info: Caching certificate_revocation_list for ca
Info: Caching certificate for cu3.esw.cn
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Caching catalog for cu3.esw.cn
Info: Applying configuration version '1461205206'
Notice: Applied catalog in 0.01 seconds


# configure autosign
# https://docs.puppet.com/puppet/4.4/reference/ssl_autosign.html
# on the CA server, set autosign under the [master] section: Naïve Autosigning
[root@cu2 puppet]# vi puppet.conf 
...
autosign = true
# or use a config file: Basic Autosigning (autosign.conf)
[root@cu2 puppet]# vi autosign.conf
*.esw.cn

[root@cu2 puppet]# service puppetserver restart
Stopping puppetserver:                                     [  OK  ]
Starting puppetserver:                                     [  OK  ]


# the agent is re-signed automatically
[root@cu1 ~]# cd /etc/puppetlabs/puppet/
[root@cu1 puppet]# rm -rf ssl/*
[root@cu1 puppet]# 
[root@cu1 puppet]# puppet agent --server cu2.esw.cn --test
Info: Creating a new SSL key for cu1.esw.cn
Info: Caching certificate for ca
Info: csr_attributes file loading from /etc/puppetlabs/puppet/csr_attributes.yaml
Info: Creating a new SSL certificate request for cu1.esw.cn
Info: Certificate Request fingerprint (SHA256): D1:F5:6D:A4:91:57:DF:92:47:98:B7:C6:78:E5:C5:E0:AA:DA:70:90:0D:68:48:09:81:FA:65:98:02:F0:84:A9
Info: Caching certificate for cu1.esw.cn
Info: Caching certificate_revocation_list for ca
Info: Caching certificate for ca
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Caching catalog for cu1.esw.cn
Info: Applying configuration version '1461205750'
Notice: Applied catalog in 0.02 seconds

[root@cu2 puppet]# puppet cert list --all
+ "cu1.esw.cn" (SHA256) F9:48:1D:85:A7:44:78:71:AA:44:02:3F:98:20:DB:20:B1:DA:10:EC:3A:6A:AE:85:D4:37:EC:9E:20:AB:84:AA
+ "cu2.esw.cn" (SHA256) BA:C4:C9:CC:92:6E:45:2E:B1:7F:BC:15:49:0A:2C:BB:5F:C6:B0:73:EB:6C:21:EA:C8:A6:DD:2D:FE:DF:67:70 (alt names: "DNS:puppet", "DNS:cu2.esw.cn")
+ "cu3.esw.cn" (SHA256) BA:00:57:50:1D:91:40:0D:7D:E4:C5:99:6F:3F:77:D6:E8:C4:71:5B:8D:8C:AB:FA:D0:D4:5C:36:5D:AB:A7:1B
+ "cu4.esw.cn" (SHA256) 96:64:4A:73:EC:D7:A6:0D:73:37:82:33:2D:0D:B3:BF:A6:A8:6B:9B:D4:05:D0:2C:46:3B:E2:22:6E:43:39:91
+ "cu5.esw.cn" (SHA256) 54:48:34:BF:C9:60:8C:4C:D2:9D:C9:A3:52:2E:EB:29:AC:2E:84:2E:9E:34:F1:A3:30:83:46:0E:BF:A9:5D:9A

Besides autosign.conf, autosign can also be driven by a shell command/executable (policy-based autosigning); see the official docs: https://docs.puppet.com/puppet/4.4/reference/ssl_autosign.html
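
For example, a minimal policy script might look like this sketch (the path is hypothetical; Puppet runs the executable named by autosign with the certname as its first argument and the CSR PEM on stdin, and exit status 0 means "sign"):

#!/bin/bash
# /etc/puppetlabs/puppet/autosign-check.sh (hypothetical path; make it executable)
# sign only hosts inside our own domain
case "$1" in
  *.esw.cn) exit 0 ;;
  *)        exit 1 ;;
esac

Then point puppet.conf at it with autosign = /etc/puppetlabs/puppet/autosign-check.sh under [master].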

Passing the server on every agent run is tedious; set it in puppet.conf so each run picks it up from the config:

[root@cu2 plugins]# vi /etc/puppetlabs/puppet/puppet.conf 
...
[agent]
server = cu2.esw.cn
certname = cu2.esw.cn  # when the hostname is unreliable, pin this machine's certname here!! set it per machine!

Command Reference

puppet agent --server cu2.esw.cn --test

puppet cert list --all

puppet node clean cu1.ds.ctyun 
puppet cert clean cu2.ds.ctyun
puppet certificate_request destroy cu1.ds.ctyun

puppet cert sign cu1.esw.cn
puppet cert sign --all

puppet config print certname
puppet master --configprint ssldir
puppet agent --configprint ssldir

--END

Alluxio Getting Started, Part 2

Alluxio is the project formerly known as Tachyon. Its lead is Chinese, so the docs come with a perk: swap en for cn in the URL to read them in Chinese.

http://alluxio.org/documentation/master/cn/Architecture.html

Note: Alluxio cannot currently be deployed in Docker: mount: permission denied

This post first covers building Alluxio, then local and cluster deployment, configuring HDFS as the under storage and common CLI usage, and finally reading and writing Alluxio data from code and from Spark, plus upgrading to v1.1 to inspect the system Metrics and understand storage usage.

In hindsight: Alluxio mounts a RAM disk at startup, and you can simply point that path at /dev/shm. Everything else is straightforward and needs no root access.
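
In conf/alluxio-env.sh that shortcut would look roughly like this (a sketch; with tmpfs there is nothing to mount, so start with the NoMount option and no root access is required):

# point the worker "RAM disk" at the existing tmpfs
ALLUXIO_RAM_FOLDER=/dev/shm
ALLUXIO_WORKER_MEMORY_SIZE=512MB    # keep this below the free space in /dev/shm

Then start with bin/alluxio-start.sh all NoMount.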

Building

# download the official bin.tar.gz; building the v1.0.1 tag from GitHub is not recommended, findbugs flags two bugs in the server module
http://alluxio.org/downloads/files/1.0.1/alluxio-1.0.1-bin.tar.gz

[hadoop@cu2 ~]$ cd ~/sources/alluxio-1.0.1/
[hadoop@cu2 alluxio-1.0.1]$ export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
[hadoop@cu2 alluxio-1.0.1]$ mvn clean package assembly:single -Phadoop-2.6 -Dhadoop.version=2.6.3 -Pyarn,spark -Dmaven.test.skip=true -Dmaven.javadoc.skip=true

A successful build produces assembly/target/alluxio-1.0.1.tar.gz. Deploy straight from that tar.gz; its contents are lean and clear.

One more catch: do not enable the compileJsp profile. The build succeeds, but the deployed web UI throws ClassNotFound.

Building alluxio-1.1-snapshot on Windows needs two tweaks: the assembly step doesn't normalize line endings, and the build needs the test modules (use -DskipTests rather than -Dmaven.test.skip).

$ git diff assembly/src/main/assembly/alluxio-dist.xml
diff --git a/assembly/src/main/assembly/alluxio-dist.xml b/assembly/src/main/assembly/alluxio-dist.xml
index 14ecd19..06ddd51 100644
--- a/assembly/src/main/assembly/alluxio-dist.xml
+++ b/assembly/src/main/assembly/alluxio-dist.xml
@@ -11,6 +11,7 @@
       <outputDirectory>/bin</outputDirectory>
       <fileMode>0755</fileMode>
       <directoryMode>0755</directoryMode>
+      <lineEnding>unix</lineEnding>
     </fileSet>
     <fileSet>
       <directory>${basedir}/../conf</directory>
@@ -19,6 +20,7 @@
     <fileSet>
       <directory>${basedir}/../libexec</directory>
       <outputDirectory>/libexec</outputDirectory>
+      <lineEnding>unix</lineEnding>
     </fileSet>
     <fileSet>
       <directory>${basedir}/..</directory>

E:\git\alluxio>set MAVEN_OPTS="-Xmx2g"
E:\git\alluxio>mvn clean package assembly:single -Phadoop-2.6 -Dhadoop.version=2.6.3 -Pyarn,spark -DskipTests -Dmaven.javadoc.skip=true

Local Deployment

http://alluxio.org/documentation/master/cn/Running-Alluxio-Locally.html

[hadoop@hadoop-master2 ~]$ tar zxf alluxio-1.0.1.tar.gz  
[hadoop@hadoop-master2 ~]$ cd alluxio-1.0.1/conf/
[hadoop@hadoop-master2 conf]$ cp alluxio-env.sh.template alluxio-env.sh
[hadoop@hadoop-master2 conf]$ vi alluxio-env.sh
...
JAVA_HOME=/opt/jdk1.7.0_60
ALLUXIO_UNDERFS_ADDRESS=/home/hadoop/tmp

[hadoop@hadoop-master2 conf]$ cd ..
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio format
Connecting to localhost as hadoop...
Formatting Alluxio Worker @ hadoop-master2
Connection to localhost closed.
Formatting Alluxio Master @ localhost
[hadoop@hadoop-master2 alluxio-1.0.1]$ 

# add the hadoop user to sudoers
[root@hadoop-master2 ~]# visudo 
...
hadoop        ALL=(ALL)       NOPASSWD: ALL

# the machine previously ran hadoop, so passwordless SSH to localhost already works

# start
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio-start.sh local
Killed 0 processes on hadoop-master2
Killed 0 processes on hadoop-master2
Connecting to localhost as hadoop...
Killed 0 processes on hadoop-master2
Connection to localhost closed.
Formatting RamFS: /mnt/ramdisk (1gb)
Starting master @ localhost. Logging to /home/hadoop/alluxio-1.0.1/logs
Starting worker @ hadoop-master2. Logging to /home/hadoop/alluxio-1.0.1/logs
[hadoop@hadoop-master2 alluxio-1.0.1]$ 
[hadoop@hadoop-master2 alluxio-1.0.1]$ jps
3780 AlluxioMaster
3845 Jps
3807 AlluxioWorker

# view cluster status in the web UI at localhost:19999

# stop
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio-stop.sh all
Killed 1 processes on hadoop-master2
Killed 1 processes on hadoop-master2
Connecting to localhost as hadoop...
Killed 0 processes on hadoop-master2
Connection to localhost closed.

This follows the official steps to the letter; in production you can mount the RAM disk as root, as the cluster deployment below shows.

Cluster Deployment

The steps mirror the local setup: deploy the program to the worker nodes, mount the RAM disk on every worker, then call the start script.

# passwordless SSH between master and workers; anyone who has deployed apache-hadoop has done this already

# edit the config
[hadoop@hadoop-master2 alluxio-1.0.1]$ vi conf/workers 
bigdata1
[hadoop@hadoop-master2 alluxio-1.0.1]$ vi conf/alluxio-env.sh
ALLUXIO_MASTER_ADDRESS=hadoop-master2

# deploy the program
# use bin/alluxio copyDir <dirname> with care: it syncs the logs directory too;
# you can of course tweak the alluxio scripts, just know what they do
[hadoop@hadoop-master2 ~]$ rsync -az alluxio-1.0.1 bigdata1:~/ --exclude=logs --exclude=/*/src --exclude=underfs --exclude=journal

# mount the (worker) nodes' RAM disks as root
# simplest alternative: set ALLUXIO_RAM_FOLDER=/dev/shm to use the system tmpfs, which is memory-backed anyway
# ALLUXIO_WORKER_MEMORY_SIZE=512MB sets the RAM disk size; keep it below the size of /dev/shm
[root@hadoop-master2 ~]# cd /home/hadoop/alluxio-1.0.1
[root@hadoop-master2 alluxio-1.0.1]# bin/alluxio-mount.sh Mount workers
Connecting to bigdata1 as root...
Warning: Permanently added 'bigdata1,192.168.191.133' (RSA) to the list of known hosts.
Formatting RamFS: /mnt/ramdisk (1gb)
Connection to bigdata1 closed.

# verify on the worker
[hadoop@bigdata1 ~]$ mount
/dev/mapper/VolGroup-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/sda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
ramfs on /mnt/ramdisk type ramfs (rw,size=1gb)

# format: mainly wipes/creates the journal directory and clears the workers' local tiered-storage cache
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio format
Connecting to bigdata1 as hadoop...
Formatting Alluxio Worker @ bigdata1
Connection to bigdata1 closed.
Formatting Alluxio Master @ localhost

# start
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio-start.sh all NoMount
Killed 1 processes on hadoop-master2
Killed 1 processes on hadoop-master2
Connecting to bigdata1 as hadoop...
Killed 0 processes on bigdata1
Connection to bigdata1 closed.
Starting master @ localhost. Logging to /home/hadoop/alluxio-1.0.1/logs
Connecting to bigdata1 as hadoop...
Starting worker @ bigdata1. Logging to /home/hadoop/alluxio-1.0.1/logs
Connection to bigdata1 closed.
[hadoop@hadoop-master2 alluxio-1.0.1]$ jps
5164 AlluxioMaster
5219 Jps

[hadoop@bigdata1 alluxio-1.0.1]$ jps
1849 Jps
1829 AlluxioWorker

Check the web UI; if Running Workers is 0, go through alluxio-1.0.1/logs on the worker nodes to pin down the problem. Firewall ports closed? Something off in the config, e.g. the hosts entries?
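
Two quick checks from the master help narrow it down (a sketch; 29998/29999 are the worker RPC and data ports that appear in fileInfo output below, and the worker log file name is an assumption):

nc -zv bigdata1 29998    # worker RPC port reachable?
nc -zv bigdata1 29999    # worker data port reachable?
ssh bigdata1 "tail -50 ~/alluxio-1.0.1/logs/worker.log"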

Command-Line Hello World

[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs copyFromLocal conf/alluxio-env.sh /
Copied conf/alluxio-env.sh to /

# inspect the content on the worker (only one file exists right now, which makes it easy); get the block id from the web UI or fs fileInfo
[hadoop@bigdata1 alluxio-1.0.1]$ tail -1 /mnt/ramdisk/alluxioworker/117440512 
export ALLUXIO_WORKER_JAVA_OPTS="${ALLUXIO_JAVA_OPTS}"


# calling persist on the master leaves no data on the worker: the file lands on the node where the command ran. Awkward!!!
# alluxio.client.file.FileSystemUtils#persistFile
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs persist /alluxio-env.sh
persisted file /alluxio-env.sh with size 5493
[hadoop@hadoop-master2 alluxio-1.0.1]$ ll /home/hadoop/tmp/
total 28
-rwxrwxrwx  1 hadoop hadoop 5493 Apr 15 03:33 alluxio-env.sh

[hadoop@bigdata1 alluxio-1.0.1]$ bin/alluxio fs persist /alluxio-env.sh
/alluxio-env.sh is already persisted
[hadoop@bigdata1 alluxio-1.0.1]$ ll /home/hadoop/tmp
total 0

After persist on the master, calling persist again on the worker even reports already persisted. In a distributed setup the local disk is NOT suitable as the underfs!! The official docs themselves say "single-node local filesystem":

Alluxio provides a generic interface that makes plugging in different under storage systems simple. Currently supported: Amazon S3, OpenStack Swift, Apache HDFS, GlusterFS, and the single-node local filesystem.

Using HDFS as the Under Storage

http://alluxio.org/documentation/master/en/Configuring-Alluxio-with-HDFS.html

[hadoop@hadoop-master2 alluxio-1.0.1]$ vi conf/alluxio-env.sh
...
JAVA_HOME=/opt/jdk1.7.0_60
HADOOP_HOME=/home/hadoop/hadoop-2.6.3

# source $HADOOP_HOME/libexec/hadoop-config.sh
JAVA_LIBRARY_PATH="$HADOOP_HOME/lib/native"
ALLUXIO_JAVA_OPTS="$ALLUXIO_JAVA_OPTS -Djava.library.path=$JAVA_LIBRARY_PATH"
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$JAVA_LIBRARY_PATH

ALLUXIO_CLASSPATH=$HADOOP_HOME/etc/hadoop:$ALLUXIO_CLASSPATH
ALLUXIO_UNDERFS_ADDRESS=hdfs:///alluxio                       # a dedicated alluxio subpath is easier to manage
ALLUXIO_MASTER_ADDRESS=hadoop-master2

# wipe/create the metadata directory and the workers' local cache data
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio format
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio-start.sh master

# once the master is up, start the workers
# the RAM disk was already mounted as root above; if not, run bin/alluxio-mount.sh Mount workers as root
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio-start.sh workers NoMount
  • Usage

http://alluxio.org/documentation/master/en/Command-Line-Interface.html

[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs copyFromLocal  ~/hadoop-2.6.3/README.txt /
Copied /home/hadoop/hadoop-2.6.3/README.txt to /
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs ls /
1366.00B  04-15-2016 09:30:45:829  In Memory      /README.txt
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs location /README.txt
/README.txt with file id 33554431 is on nodes: 
bigdata1

[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs persist /README.txt
/README.txt is already persisted

# By default files are written to the cache only; this can be changed via configuration
# alluxio.client.WriteType
[hadoop@hadoop-master2 alluxio]$ export ALLUXIO_JAVA_OPTS="-Dalluxio.user.file.writetype.default=CACHE_THROUGH"
[hadoop@hadoop-master2 alluxio]$ bin/alluxio fs copyFromLocal ~/hadoop-2.6.3/README.txt /                      
Copied /home/hadoop/hadoop-2.6.3/README.txt to /
[hadoop@hadoop-master2 alluxio]$ bin/alluxio fs fileInfo /README.txt                                           
FileInfo{fileId=452984831, name=README.txt, path=/README.txt, ufsPath=hdfs:///alluxio/README.txt, length=1366, blockSizeBytes=536870912, creationTimeMs=1460765370996, completed=true, folder=false, pinned=false, cacheable=true, persisted=true, blockIds=[436207616], inMemoryPercentage=100, lastModificationTimesMs=1460765372423, ttl=-1, userName=, groupName=, permission=0, persistenceState=PERSISTED, mountPoint=false}
Containing the following blocks: 
BlockInfo{id=436207616, length=1366, locations=[BlockLocation{workerId=1, address=WorkerNetAddress{host=bigdata1, rpcPort=29998, dataPort=29999, webPort=30000}, tierAlias=MEM}]}

# Creates a 0 byte file. The file will be written to the under file system. 
# For example, touch can be used to create a file signifying the completion of analysis on a directory.
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs touch /1234.txt    
/1234.txt has been created

# For a file that is already persisted, renaming it immediately renames the file on HDFS too
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs mv /1234.txt /4321.txt
Renamed /1234.txt to /4321.txt

# An empty file is allocated no actual storage, only metadata
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs location /4321.txt    
/4321.txt with file id 67108863 is on nodes: 

# Free the memory, then delete the files under the underfs directory
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs free /
/ was successfully freed from memory.
[hadoop@hadoop-master2 hadoop-2.6.3]$ bin/hdfs dfs -rmr /alluxio/*

[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs ls /  
1366.00B  04-15-2016 09:30:45:829  Not In Memory  /README.txt
0.00B     04-15-2016 09:37:48:971  In Memory      /4321.txt
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs tail /README.txt
File does not exist: /alluxio/README.txt
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1893)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1834)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1814)

# Put README.txt back under the underfs directory with the matching file name
[hadoop@hadoop-master2 hadoop-2.6.3]$ bin/hdfs dfs -put *.txt /alluxio/ 
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs tail /README.txt
...
software:
  Hadoop Core uses the SSL libraries from the Jetty project written 
by mortbay.org.

# Load the data into memory
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs load /
/README.txt loaded
/ loaded
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs ls /  
1366.00B  04-15-2016 09:30:45:829  In Memory      /README.txt
0.00B     04-15-2016 09:37:48:971  In Memory      /4321.txt

# Load the underfs directory structure
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio loadufs / hdfs:///alluxio 
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs ls /
1366.00B  04-15-2016 09:30:45:829  In Memory      /README.txt
0.00B     04-15-2016 09:37:48:971  In Memory      /4321.txt
15.07KB   04-15-2016 10:12:33:176  Not In Memory  /LICENSE.txt
101.00B   04-15-2016 10:12:33:190  Not In Memory  /NOTICE.txt

# Inspect details via fileInfo: fileId, ufsPath, and block information
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs fileInfo /README.txt

# Wildcards have to be written like this, which is rather awkward
[hadoop@hadoop-master2 alluxio-1.1.0-SNAPSHOT]$ bin/alluxio fs rm /\\*
/4321.txt has been removed
/LICENSE.txt has been removed
/NOTICE.txt has been removed
/README.md has been removed
/README.txt has been removed

# For a file not in Alluxio but present in the underfs, reading it once loads its metadata into Alluxio
[hadoop@hadoop-master2 ~]$ alluxio fs ls /
1366.00B  04-16-2016 08:09:30:996  In Memory      /README.txt
[hadoop@hadoop-master2 ~]$ alluxio fs cat /LICENSE.txt
[hadoop@hadoop-master2 ~]$ alluxio fs ls /
1366.00B  04-16-2016 08:09:30:996  In Memory      /README.txt
15.07KB   04-16-2016 08:26:22:495  Not In Memory  /LICENSE.txt

The file structure is now roughly clear: loading the directory structure from the underfs (loadufs), loading files into Alluxio memory (fs load), and persisting Alluxio files (fs persist) each have a corresponding command. To understand mount, think of the Linux mount: treat the underfs as a disk device. The client API offers programmatic counterparts, sketched below.
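A rough sketch of those counterparts, assuming the v1.0.x FileSystem interface exposes free, loadMetadata, and mount (the paths and the HDFS URI below are made up for illustration):

// UfsOps.scala -- sketch only; method names assumed from the v1.0.x client
import alluxio.AlluxioURI
import alluxio.client.file.FileSystem

object UfsOps {
  def main(args: Array[String]): Unit = {
    val fs = FileSystem.Factory.get()
    // like `alluxio fs free /README.txt`: evict the path from worker memory
    fs.free(new AlluxioURI("/README.txt"))
    // like loadufs for a single path: pull metadata in from the underfs
    fs.loadMetadata(new AlluxioURI("/LICENSE.txt"))
    // like `alluxio fs mount`: attach another underfs path, as with a disk device
    fs.mount(new AlluxioURI("/mnt/other"), new AlluxioURI("hdfs://hadoop-master2:9000/other"))
  }
}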

But there seems to be no API to modify a file. Is modification simply not supported?? Apparently not, for now (2016-4-15 23:06:20, v1.1).

http://alluxio.org/documentation/master/en/Key-Value-System-API.html "Like files in the Alluxio filesystem, the semantics of the key-value system are also write-once."
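A hedged sketch of that API, modeled on the example in the page above (KeyValueSystem.Factory.create, createStore/openStore); the class names assume the v1.0/1.1 client, and the key-value system has to be enabled in the configuration:

// KvDemo.scala -- sketch only, following the Key-Value System API docs
import alluxio.AlluxioURI
import alluxio.client.keyvalue.KeyValueSystem

object KvDemo {
  def main(args: Array[String]): Unit = {
    val kvs = KeyValueSystem.Factory.create()
    // a store, like a file, becomes immutable once the writer is closed: write-once
    val writer = kvs.createStore(new AlluxioURI("alluxio://hadoop-master2:19998/kvstore"))
    writer.put("key1".getBytes, "value1".getBytes)
    writer.close()

    val reader = kvs.openStore(new AlluxioURI("alluxio://hadoop-master2:19998/kvstore"))
    println(new String(reader.get("key1".getBytes)))
    reader.close()
  }
}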

FileSystem API

// scala
import alluxio.{AlluxioURI, Constants}
import alluxio.client.WriteType
import alluxio.client.file.FileSystem
import alluxio.client.file.options.CreateFileOptions

import scala.language.reflectiveCalls // for the structural `close()` type below

object App {

  def using[A <: {def close() : Unit}, B](resource: A)(f: A => B): B =
    try { f(resource) } finally { resource.close() }

  def main(args: Array[String]) {
    // @see alluxio.Configuration.Configuration(boolean)
    System.setProperty(Constants.MASTER_HOSTNAME, "192.168.191.132")
    System.setProperty("HADOOP_USER_NAME", "hadoop")

    val fs = FileSystem.Factory.get();
    val path = new AlluxioURI("/README.md");
    using(fs.createFile(path, CreateFileOptions.defaults().setWriteType(WriteType.THROUGH))){ out =>
      val content =
"""FileSystem API Write.
   -------------------------
   Hello World!
"""
      out.write(content.getBytes)
    }

    using(fs.openFile(path)) { in =>
      val buffer = new Array[Byte](1024)
      val size = in.read(buffer)
      System.out.println(new String(buffer, 0, size))
    }
  }
}

# THROUGH writes to the underfs only
[hadoop@hadoop-master2 alluxio-1.0.1]$ bin/alluxio fs ls /README.md
115.00B   04-15-2016 20:36:57:345  Not In Memory  /README.md
[hadoop@hadoop-master2 alluxio-1.0.1]$ ~/hadoop-2.6.3/bin/hadoop fs -cat /alluxio/README.md
FileSystem API Write.
   -------------------------
   Hello World!
[hadoop@hadoop-master2 alluxio-1.0.1]$ 

When the program runs on Windows 10, core-site.xml has to be added under src/main/resources (earlier, to save effort, the address was written directly as hdfs:///alluxio; without loading the config the program has no idea where the namenode is).

If WriteType is set to CACHE_THROUGH, the write goes to the local cache as well as to the underfs. After the write succeeds, the file's status looks like this:

[hadoop@hadoop-master2 alluxio-1.1.0-SNAPSHOT]$ bin/alluxio fs ls /README.md
115.00B   04-15-2016 23:48:33:749  In Memory      /README.md
[hadoop@hadoop-master2 alluxio-1.1.0-SNAPSHOT]$ bin/alluxio fs fileInfo /README.md
FileInfo{fileId=318767103, name=README.md, path=/README.md, ufsPath=hdfs:///alluxio/README.md, length=115, blockSizeBytes=536870912, creationTimeMs=1460735313749, completed=true, folder=false, pinned=false, cacheable=true, persisted=true, blockIds=[301989888], inMemoryPercentage=100, lastModificationTimesMs=1460735315749, ttl=-1, userName=, groupName=, permission=0, persistenceState=PERSISTED, mountPoint=false}
Containing the following blocks: 
BlockInfo{id=301989888, length=115, locations=[BlockLocation{workerId=1, address=WorkerNetAddress{host=bigdata1, rpcPort=29998, dataPort=29999, webPort=30000}, tierAlias=MEM}]}
[hadoop@hadoop-master2 alluxio-1.1.0-SNAPSHOT]$ ~/hadoop-2.6.3/bin/hadoop fs -ls /alluxio/ 
Found 4 items
-rw-r--r--   1 hadoop supergroup      15429 2016-04-15 09:57 /alluxio/LICENSE.txt
-rw-r--r--   1 hadoop supergroup        101 2016-04-15 09:57 /alluxio/NOTICE.txt
-rwxrwxrwx   1 hadoop supergroup        115 2016-04-15 23:48 /alluxio/README.md
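The write type can also be set per file rather than globally through alluxio.user.file.writetype.default; a minimal variant of the earlier snippet, under the same assumptions:

// CacheThroughDemo.scala -- sketch only; same v1.0.x client API as the example above
import alluxio.AlluxioURI
import alluxio.client.WriteType
import alluxio.client.file.FileSystem
import alluxio.client.file.options.CreateFileOptions

object CacheThroughDemo {
  def main(args: Array[String]): Unit = {
    val fs = FileSystem.Factory.get()
    // CACHE_THROUGH: write to worker memory and the underfs in one shot
    val out = fs.createFile(new AlluxioURI("/cache-through.txt"),
      CreateFileOptions.defaults().setWriteType(WriteType.CACHE_THROUGH))
    out.write("Hello CACHE_THROUGH".getBytes)
    out.close()
  }
}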

Using Alluxio in Big-Data Applications

Hadoop 2 picks up the bound implementation through the org.apache.hadoop.fs.FileSystem service loader, so there is no need to configure fs.alluxio.impl or fs.alluxio-ft.impl in core-site.xml.

In the end it is all access to Alluxio through the Hadoop FileSystem API, as the sketch below shows.
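A minimal sketch of that path, assuming only that the alluxio client jar is on the classpath so the service loader can resolve the alluxio:// scheme (host and port follow the examples below):

// HadoopApiDemo.scala -- sketch only
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.io.IOUtils

object HadoopApiDemo {
  def main(args: Array[String]): Unit = {
    // no fs.alluxio.impl entry needed: the scheme resolves via the service loader
    val fs = FileSystem.get(new URI("alluxio://hadoop-master2:19998/"), new Configuration())
    val in = fs.open(new Path("/README.txt"))
    try IOUtils.copyBytes(in, System.out, 4096, false) finally in.close()
  }
}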

# Add environment variables in .bash_profile
[hadoop@hadoop-master2 ~]$ vi ~/.bash_profile 
...
HADOOP_HOME=~/hadoop
SPARK_HOME=~/spark
ALLUXIO_HOME=~/alluxio

PATH=$HADOOP_HOME/bin:$SPARK_HOME/bin:$ALLUXIO_HOME/bin:$MAVEN_HOME/bin:$ANT_HOME/bin:$PATH
# HADOOP_HOME and SPARK_HOME are not exported here,
# because the hadoop/spark startup scripts define them too. If exported, the symlinks would also have to be synced to the slave nodes
export PATH ANT_HOME MAVEN_HOME

[hadoop@hadoop-master2 ~]$ ln -s hadoop-2.6.3 hadoop
[hadoop@hadoop-master2 ~]$ ln -s alluxio-1.1.0-SNAPSHOT alluxio
[hadoop@hadoop-master2 ~]$ ln -s spark-1.6.0-bin-2.6.3 spark
[hadoop@hadoop-master2 ~]$ . .bash_profile 

[hadoop@hadoop-master2 ~]$ export SPARK_CLASSPATH=\
> ~/alluxio/core/client/target/alluxio-core-client-1.1.0-SNAPSHOT-jar-with-dependencies.jar 
[hadoop@hadoop-master2 ~]$ 
[hadoop@hadoop-master2 ~]$ spark-shell --master local
scala> val file=sc.textFile("alluxio://hadoop-master2:19998/README.txt")
file: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at <console>:27

scala> file.count()
res0: Long = 31

scala> file.take(2)
res1: Array[String] = Array(For the latest information about Hadoop, please visit our website at:, "")

# wordcount
scala> val op = file.flatMap(_.split(" ")).map((_,1)).reduceByKey(_ + _)
# word sort asc
scala> op.sortByKey().take(10)
# count sort desc
scala> op.map(kv => (kv._2, kv._1)).sortByKey(false).map(kv => (kv._2, kv._1)).take(10)

scala> op.map(kv => (kv._2, kv._1)).sortByKey(false).map(kv => (kv._2, kv._1)).saveAsTextFile("alluxio://hadoop-master2:19998/output/")

[hadoop@hadoop-master2 ~]$ alluxio fs cat /output/*
(,18)
(the,8)
(and,6)
(of,5)
(The,4)
(this,3)
(encryption,3)
(for,3)
...

If running on a cluster, the slave nodes need the same directory layout as the master node. Or follow the steps in the official tutorial.

# SPARK_CLASSPATH is carried into the tasks' startup environment variables
[hadoop@hadoop-master2 ~]$ rsync -az alluxio bigdata1:~/
[hadoop@hadoop-master2 ~]$ export SPARK_CLASSPATH=\
> ~/alluxio/core/client/target/alluxio-core-client-1.1.0-SNAPSHOT-jar-with-dependencies.jar
 
[hadoop@hadoop-master2 ~]$ spark-shell --master spark://hadoop-master2:7077
scala> val file=sc.textFile("alluxio://hadoop-master2:19998/README.txt")
file: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at <console>:27

scala> val op = file.flatMap(_.split(" ")).map((_,1)).reduceByKey(_ + _)
op: org.apache.spark.rdd.RDD[(String, Int)] = ShuffledRDD[4] at reduceByKey at <console>:29

scala> op.map(kv => (kv._2, kv._1)).sortByKey(false).map(kv => (kv._2, kv._1)).saveAsTextFile("alluxio://hadoop-master2:19998/output2/")

Metrics

http://www.alluxio.org/documentation/master/cn/Metrics-System.html

v1.0.1 has a corresponding API; the metrics can be viewed at http://hadoop-master2:19999/metrics/json/. The current master branch, v1.1.0, can display these metrics in the web UI. They are also easy to pull programmatically, as sketched below.
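A quick way to dump those numbers, assuming only that the endpoint serves JSON over plain HTTP:

// MetricsDump.scala -- sketch only
import scala.io.Source

object MetricsDump {
  def main(args: Array[String]): Unit = {
    // fetch the raw metrics JSON from the master web port
    val src = Source.fromURL("http://hadoop-master2:19999/metrics/json/")
    try println(src.mkString) finally src.close()
  }
}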

# Copy the configuration
[hadoop@hadoop-master2 alluxio-1.1.0-SNAPSHOT]$ cd conf
[hadoop@hadoop-master2 conf]$ cp ~/alluxio-1.0.1/conf/alluxio-env.sh ./
[hadoop@hadoop-master2 conf]$ cp ~/alluxio-1.0.1/conf/log4j.properties ./
[hadoop@hadoop-master2 conf]$ cp ~/alluxio-1.0.1/conf/workers ./ 

# Start the master (reusing the previous metadata)
# To share the metadata, change the ALLUXIO_JAVA_OPTS environment variable in alluxio-env.sh:
# add -Dalluxio.master.journal.folder=${ALLUXIO_JOURNAL_FOLDER} with ALLUXIO_JOURNAL_FOLDER=/home/hadoop/journal
[hadoop@hadoop-master2 alluxio-1.1.0-SNAPSHOT]$ bin/alluxio-start.sh master
Starting master @ hadoop-master2. Logging to /home/hadoop/alluxio-1.1.0-SNAPSHOT/logs

The v1.1.0 UI has an extra Metrics tab:

Other documentation

–END