简单介绍下软件的安装,配置。同时实践下从mysql迁移到postgres。
安装配置
这里直接使用rpm包来安装。如果是centos6.6以下版本的系统需要更新openssl。
使用YUM安装看https://wiki.postgresql.org/wiki/YUM_Installation
1
2
3
4
5
6
7
8
9
10
11
[root@hadoop-master1 postgres]# ll
total 20708
-rw-r--r-- 1 root root 1593932 Dec 11 10:02 openssl-1.0.1e-42.el6.x86_64.rpm
-rw-r--r-- 1 root root 1085208 Dec 11 09:12 postgresql94-9.4.5-1PGDG.rhel6.x86_64.rpm
-rw-r--r-- 1 root root 541376 Dec 11 09:12 postgresql94-contrib-9.4.5-1PGDG.rhel6.x86_64.rpm
-rw-r--r-- 1 root root 1600736 Dec 11 09:12 postgresql94-devel-9.4.5-1PGDG.rhel6.x86_64.rpm
-rw-r--r-- 1 root root 11485008 Dec 11 09:13 postgresql94-docs-9.4.5-1PGDG.rhel6.x86_64.rpm
-rw-r--r-- 1 root root 198968 Dec 11 09:12 postgresql94-libs-9.4.5-1PGDG.rhel6.x86_64.rpm
-rw-r--r-- 1 root root 60688 Dec 11 09:12 postgresql94-plperl-9.4.5-1PGDG.rhel6.x86_64.rpm
-rw-r--r-- 1 root root 68884 Dec 11 09:12 postgresql94-plpython-9.4.5-1PGDG.rhel6.x86_64.rpm
-rw-r--r-- 1 root root 4556880 Dec 11 09:11 postgresql94-server-9.4.5-1PGDG.rhel6.x86_64.rpm
1
2
3
4
# yum install -y openssl-1.0.1e-42.el6.x86_64.rpm
# useradd postgres
# rpm -i postgresql94-*
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
# su - postgres
$ vi .bash_profile
export PGDATA=/var/lib/pgsql/9.4/data
PG_HOME=/usr/pgsql-9.4
PATH=$PG_HOME/bin:$PATH
export PATH
$ initdb
$ vi $PGDATA/pg_hba.conf
host all all 192.168.0.0/16 md5
$ vi /var/lib/pgsql/9.4/data/postgresql.conf
listen_addresses = '*'
# 切回root
# service postgresql-9.4 start
# chkconfig postgresql-9.4 on --level 2345
pg_hba.conf用来控制什么用于可以被远程访问。而postgresql.conf修改的监听的地址,默认是localhost改成*后就可以所有地址都可以访问了。
1
2
3
4
5
-bash-4.1$ psql
create user dpi;
create database dpi owner dpi;
alter user dpi with password 'XXXX';
建表:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
CREATE TABLE t_dta_illegalweb (
...
day varchar(10) DEFAULT NULL,
...
);
create or replace function t_dta_illegalweb_insert_trigger()
returns trigger as $$
begin
return null;
end;
$$ language plpgsql;
CREATE TRIGGER trigger_t_dta_illegalweb_insert
BEFORE INSERT ON t_dta_illegalweb
FOR EACH ROW EXECUTE PROCEDURE t_dta_illegalweb_insert_trigger();
后面会使用分区表,先把触发器都建好。把框框搭好,后面修改就行了。
数据迁移
postgres创建表:
1
2
3
4
5
CREATE TABLE IF NOT EXISTS t_dta_illegalweb20151211 (check(day = '2015-12-11')) INHERITS (t_dta_illegalweb);
CREATE TABLE IF NOT EXISTS t_dta_illegalweb20151210 (check(day = '2015-12-10')) INHERITS (t_dta_illegalweb);
CREATE TABLE IF NOT EXISTS t_dta_illegalweb20151209 (check(day = '2015-12-09')) INHERITS (t_dta_illegalweb);
CREATE TABLE IF NOT EXISTS t_dta_illegalweb20151208 (check(day = '2015-12-08')) INHERITS (t_dta_illegalweb);
CREATE TABLE IF NOT EXISTS t_dta_illegalweb20151207 (check(day = '2015-12-07')) INHERITS (t_dta_illegalweb);
mysql导出数据:
1
2
3
select * from t_dta_illegalweb where day='2015-12-09' into outfile '/tmp/etl/t_dta_illegalweb20151209.sql' fields terminated by '|';
select * from t_dta_illegalweb where day='2015-12-08' into outfile '/tmp/etl/t_dta_illegalweb20151208.sql' fields terminated by '|';
select * from t_dta_illegalweb where day='2015-12-07' into outfile '/tmp/etl/t_dta_illegalweb20151207.sql' fields terminated by '|';
数据在mysql服务器的/tmp/etl目录下面。如果mysql和postgres不在同一台机,需要把这些文件拷贝到postgres的服务器。
导入数据到postgres:
用psql登录dpi,然后执行copy命令把数据导入到对应的表。
1
2
3
\copy t_dta_illegalweb20151209 from '/tmp/etl/t_dta_illegalweb20151209.sql' using delimiters '|' ;
\copy t_dta_illegalweb20151208 from '/tmp/etl/t_dta_illegalweb20151208.sql' using delimiters '|' ;
\copy t_dta_illegalweb20151207 from '/tmp/etl/t_dta_illegalweb20151207.sql' using delimiters '|' ;
程序修改
程序修改是一件头痛的事情,虽然大部分都是SQL,但是MYSQL的比较宽泛,很多语句都兼容不报错也能出来想要的结果。但是这些语句在postgres下面执行是会报错的。比如说,select count(*)对所有数据count的时刻不能加order by(提示要groupby);再比如,mysql遇到字符串字段和数字比较会统一转换成数字比较,等等这些在postgres中都需要在SQL中显示的转换的。
那么postgres的类型转换怎么实现呢?两种形式cast(X as TYPE) 或者 X::TYPE。
由于程序是用hibernate来做数据库访问的,会遇到如下的问题
如果用hql的话CAST函数hibernate首先会进行转换。(转换类型与hibernate对象的类型不匹配)
而用X::TYPE会把:TYPE作为一个name parameter。
不用hql用sql的话,要自己做对象转换,这是我们不愿意去做的事情(不然用hibernate干嘛)
各种尝试过后,修改PostgreSQLDialect来实现,添加一个自定义的hibernate函数,把字符串转成bigint即可。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
import java.sql.Types;
import org.hibernate.Hibernate;
import org.hibernate.dialect.function.SQLFunctionTemplate;
public class PostgreSQLDialect extends org.hibernate.dialect.PostgreSQLDialect {
public PostgreSQLDialect() {
super();
registerFunction( "bigint", new SQLFunctionTemplate(Hibernate.BIG_INTEGER, "cast(?1 as bigint)") );
}
}
使用如下:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
StringBuilder hql = new StringBuilder("from IllegalWebInfo where 1=1 ");
List<Object> params = new ArrayList<>();
String domain = queryBean.getDomain();
if (StringUtils.isNotBlank(domain)) {
hql.append(" and ").append("domain=?");
params.add(domain.toLowerCase());
}
String houseId = queryBean.getHouseId();
if (StringUtils.isNotBlank(houseId)) {
hql.append(" and ").append("houseId=?");
params.add(houseId);
}
String day = queryBean.getDay();
if (StringUtils.isNotBlank(day)) {
hql.append(" and ").append("day=?");
params.add(day);
}
int threshold = queryBean.getThreshold();
if(threshold > 0){
hql.append(" and ").append("bigint(visitsCount) >= ?");
params.add(BigInteger.valueOf(threshold)); // 注意这里的类型转换,把int装成bigint
}
Object[] paramArray = params.toArray();
String detailHQL = hql.toString(); // + " order by bigint(visitsCount) desc ";
List<ActiveResourcesDomainInfo> hist = activeResourcesDomainDao.findPageable(detailHQL, currentPage, pageSize, paramArray);
String countHQL = "select count(*) " + hql;
long count = (long) illegalWebDao.findByHql(countHQL, paramArray).iterator().next();
定时任务,创建和更新触发器函数
函数:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
create or replace function create_partition_table_everyday (t TEXT) returns timestamp as $$
declare
i int;
cnt int;
stmt text;
select_stmt text;
day date;
isInherit BOOLEAN;
begin
day := now() + interval '-1 day';
stmt := 'CREATE TABLE IF NOT EXISTS ' || t || to_char(day, 'YYYYMMDD') || '(check(day = ''' || to_char(day, 'YYYY-MM-DD') || ''')) INHERITS (' || t || ')';
RAISE INFO '[DEBUG] %', stmt;
EXECUTE stmt;
day := now() + interval '-183 day';
stmt := 'DROP TABLE IF EXISTS ' || t || to_char(day, 'YYYYMMDD');
RAISE INFO '[DEBUG] %', stmt;
EXECUTE stmt;
-- try-catch
BEGIN
day := now() + interval '-32 day';
stmt := 'ALTER TABLE IF EXISTS ' || t || to_char(day, 'YYYYMMDD') || ' NO INHERIT ' || t;
RAISE INFO '[DEBUG] %', stmt;
EXECUTE stmt;
EXCEPTION WHEN OTHERS THEN
RAISE INFO '[WARN] % %', SQLERRM, SQLSTATE;
END;
i := 0;
cnt := 6; -- 用于生成触发器分发最近几天的insert功能
day := now() + interval '-1 day';
stmt := ' create or replace function ' || t || '_insert_trigger() returns trigger as $' || '$ ';
stmt := stmt || ' begin ';
stmt := stmt || ' if (new.day = ''' || to_char(day, 'YYYY-MM-DD') || ''') then INSERT INTO ' || t || to_char(day, 'YYYYMMDD') || ' VALUES (new.*); ';
while i < cnt
loop
day := day + interval '-1 day';
stmt := stmt || ' elsif (new.day = ''' || to_char(day, 'YYYY-MM-DD') || ''') then INSERT INTO ' || t || to_char(day, 'YYYYMMDD') || ' VALUES (new.*); ';
i := i + 1;
end loop;
stmt := stmt || ' else raise exception ''DATE out of range. Fix the ' || t || '_insert_trigger() func!!''; ';
stmt := stmt || ' end if; ';
stmt := stmt || ' return null; ';
stmt := stmt || ' end; ';
stmt := stmt || ' $' || '$ language plpgsql; ';
RAISE INFO '[DEBUG] %', stmt;
EXECUTE stmt;
return now();
end;
$$ language plpgsql;
脚本:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
vi update_dta_postgres.sh
#!/bin/sh
source ~/.bash_profile
psql -d dpi -c "select create_partition_table_everyday('t_dta_illegalweb')"
psql -d dpi -c "select create_partition_table_everyday('t_dta_activeresources_domain')"
psql -d dpi -c "select create_partition_table_everyday('t_dta_activeresources_ip')"
$
chmod +x update_dta_postgres.sh
crontab -e
10 0 * * * sh ~/scripts/update_dta_postgres.sh >~/scripts/update_dta_postgres.log 2>&1
参考
–END