Article
oozie start guide
# 步骤记录
说明:cu2就是hadoop-master2
- 打包
[hadoop@cu2 oozie-4.2.0]$ vi bin/mkdistro.sh
MVN_OPTS="-Dbuild.time=${DATETIME} -Dvc.revision=${VC_REV} -Dvc.url=${VC_URL} "
[hadoop@cu2 oozie-4.2.0]$ bin/mkdistro.sh -DskipTests -Dmaven.javadoc.skip=true
- 依赖
打包后,文件的位置
[hadoop@cu2 ~]$ tar zxvf sources/oozie-4.2.0/distro/target/oozie-4.2.0-distro.tar.gz
下载 <http://dev.sencha.com/deploy/ext-2.2.zip>
yum install zip
[hadoop@cu2 oozie-4.2.0]$ mkdir libext
[hadoop@cu2 oozie-4.2.0]$ cd libext/
[hadoop@cu2 libext]$ ll
total 7584
-rw-rw-r-- 1 hadoop hadoop 6800612 Sep 7 16:00 ext-2.2.zip
-rw-rw-r-- 1 hadoop hadoop 960372 Feb 28 2015 mysql-connector-java-5.1.34.jar
- 安装
[hadoop@cu2 oozie-4.2.0]$ bin/oozie-setup.sh prepare-war
setup后,生成的war的位置:/home/hadoop/oozie-4.2.0/oozie-server/webapps/oozie.war
- 初始化数据库
创建数据库用户
CREATE DATABASE oozie;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
FLUSH PRIVILEGES;
GRANT ALL ON oozie.* TO 'oozie'@'localhost' IDENTIFIED BY 'oozie';
FLUSH PRIVILEGES;
show grants for oozie;
[hadoop@cu2 oozie-4.2.0]$ vi conf/oozie-site.xml
<property>
<name>oozie.service.JPAService.jdbc.driver</name><value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.url</name><value>jdbc:mysql://localhost:3306/oozie</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.username</name><value>oozie</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.password</name><value>oozie</value>
</property>
这里直接把hadoop的jar添加到脚本中,不拷贝到libext下面
[hadoop@cu2 oozie-4.2.0]$ vi bin/ooziedb.sh
OOZIECPPATH=""
if [ ! -z ${HADOOP_HOME} ] ; then
OOZIECPPATH="${OOZIECPPATH}:$($HADOOP_HOME/bin/hadoop classpath)"
fi
照着写就行了,不必考虑sql文件的存在与否
[hadoop@cu2 oozie-4.2.0]$ bin/ooziedb.sh create -sqlfile oozie.sql -run
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
Validate DB Connection
DONE
DB schema does not exist
Check OOZIE_SYS table does not exist
DONE
Create SQL schema
DONE
Create OOZIE_SYS table
DONE
Oozie DB has been created for Oozie version '4.2.0'
The SQL commands have been written to: oozie.sql
- 启动服务
由于war中没有hadoop的jar,所以这里也需要把它们添加到tomcat
[hadoop@cu2 oozie-4.2.0]$ $HADOOP_HOME/bin/hadoop classpath | sed 's/:/,/g'
/home/hadoop/hadoop-2.7.1/etc/hadoop,/home/hadoop/hadoop-2.7.1/share/hadoop/common/lib/*,/home/hadoop/hadoop-2.7.1/share/hadoop/common/*,/home/hadoop/hadoop-2.7.1/share/hadoop/hdfs,/home/hadoop/hadoop-2.7.1/share/hadoop/hdfs/lib/*,/home/hadoop/hadoop-2.7.1/share/hadoop/hdfs/*,/home/hadoop/hadoop-2.7.1/share/hadoop/yarn/lib/*,/home/hadoop/hadoop-2.7.1/share/hadoop/yarn/*,/home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/lib/*,/home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/*,/home/hadoop/hadoop-2.7.1/contrib/capacity-scheduler/*.jar
处理下把*改成*.jar
[hadoop@cu2 oozie-4.2.0]$ vi oozie-server/conf/catalina.properties
common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,/home/hadoop/hadoop-2.7.1/etc/hadoop,/home/hadoop/hadoop-2.7.1/share/hadoop/common/lib/*.jar,/home/hadoop/hadoop-2.7.1/share/hadoop/common/*.jar,/home/hadoop/hadoop-2.7.1/share/hadoop/hdfs,/home/hadoop/hadoop-2.7.1/share/hadoop/hdfs/lib/*.jar,/home/hadoop/hadoop-2.7.1/share/hadoop/hdfs/*.jar,/home/hadoop/hadoop-2.7.1/share/hadoop/yarn/lib/*.jar,/home/hadoop/hadoop-2.7.1/share/hadoop/yarn/*.jar,/home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/lib/*.jar,/home/hadoop/hadoop-2.7.1/share/hadoop/mapreduce/*.jar,/home/hadoop/hadoop-2.7.1/contrib/capacity-scheduler/*.jar
# 前台运行 bin/oozied.sh run
[hadoop@cu2 oozie-4.2.0]$ bin/oozied.sh start
http://localhost:11000/
- 测试
[hadoop@cu2 oozie-4.2.0]$ vi bin/oozie
OOZIECPPATH=""
if [ ! -z ${HADOOP_HOME} ] ; then
OOZIECPPATH="${OOZIECPPATH}:$($HADOOP_HOME/bin/hadoop classpath)"
fi
[hadoop@cu2 oozie-4.2.0]$ bin/oozie admin -oozie http://localhost:11000/oozie -status
System mode: NORMAL
- 跑个helloworld
[hadoop@cu2 oozie-4.2.0]$ tar zxvf oozie-sharelib-4.2.0.tar.gz
[hadoop@cu2 oozie-4.2.0]$ ~/hadoop-2.7.1/bin/hadoop fs -rmr share
[hadoop@cu2 oozie-4.2.0]$ ~/hadoop-2.7.1/bin/hadoop fs -put share share
[hadoop@cu2 oozie-4.2.0]$ tar zxvf oozie-examples.tar.gz
[hadoop@cu2 oozie-4.2.0]$ ~/hadoop-2.7.1/bin/hadoop fs -put examples examples
修改share后重启下oozie,sharelib在应用中会缓冲,中间上传程序不能识别,会报`Could not locate Oozie sharelib`的错。
[hadoop@cu2 oozie-4.2.0]$ vi examples/apps/map-reduce/job.properties
nameNode=hdfs://hadoop-master2:9000
jobTracker=hadoop-master2:8032
queueName=default
examplesRoot=examples
oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/apps/map-reduce/workflow.xml
outputDir=map-reduce
[hadoop@cu2 oozie-4.2.0]$ bin/oozie job -oozie http://localhost:11000/oozie -config examples/apps/map-reduce/job.properties -run
Error: E0501 : E0501: Could not perform authorization operation, User: hadoop is not allowed to impersonate hadoop
[hadoop@cu2 hadoop-2.7.1]$ vi etc/hadoop/core-site.xml
<property>
<name>hadoop.proxyuser.hadoop.hosts</name><value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name><value>*</value>
</property>
[hadoop@cu2 ~]$ for h in `cat /etc/hosts | grep slaver | awk '{print $2}' ` ; do rsync -vaz hadoop-2.7.1 $h:~/ --exclude=logs ; done
同步重启集群
注:增加以上配置后,无需重启集群,可以直接用hadoop管理员账号重新加载这两个属性值,命令为:
hdfs dfsadmin -refreshSuperUserGroupsConfiguration
yarn rmadmin -refreshSuperUserGroupsConfiguration
[hadoop@cu2 oozie-4.2.0]$ bin/oozie job -oozie http://localhost:11000/oozie -config examples/apps/map-reduce/job.properties -run
job: 0000000-150908082015741-oozie-hado-W
[hadoop@cu2 hadoop-2.7.1]$ bin/hadoop fs -cat /user/hadoop/examples/output-data/map-reduce/part-00000
尽管能看到结果了,但是不算任务执行成功。任务是有报错的`JA006: Call From cu2/192.168.0.214 to hadoop-master2:10020 failed on connection exception`
[hadoop@cu2 hadoop-2.7.1]$ sbin/mr-jobhistory-daemon.sh start historyserver
在运行一次就ok了。
# 参考
-
http://stackoverflow.com/questions/30926357/oozie-on-yarn-oozie-is-not-allowed-to-impersonate-hadoop
-
http://oozie.apache.org/docs/4.0.0/DG_QuickStart.html#Oozie_Share_Lib_Installation
-
http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/ClusterSetup.html
–END
Related
Related posts
-
杀鸡焉用牛刀:DuckDB 正取代部分 Spark 场景
2026-02-16
-
基于对象存储的 Spark 数据读写实战:从末尾追加到任意更新
2025-10-28
-
jarsperreports生成PDF中文问题
2017-01-21
-
jasperreports使用小结
2016-12-01