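Kitchen and Pan can be called from a script to run job (.kjb) and transformation (.ktr) files: Kitchen runs jobs, Pan runs transformations. There are two cases, Windows (kitchen.bat, pan.bat) and Linux (kitchen.sh, pan.sh).
A simple example.
Shell script: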
export JAVA_HOME=/usr/local/java/jdk
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/mysql-connector-java-5.1.18-bin.jar
export KETTLE_HOME=/home/www/allyes/a3tracker/bi/etl/kettle/kh_cloud/
export LC_ALL=en_US.UTF-8

echo "KETTLE_HOME=$KETTLE_HOME"
echo "starting..."
# $yesterday and $vardate are set earlier in the complete script below
yesterdayid=`date -d $yesterday +%Y%m%d`

/home/www/allyes/a3tracker/bi/etl/kettle/data-integration/kitchen.sh -param:Yesterday='2014-02-24' -file /home/www/allyes/a3tracker/bi/etl/kettle/etlscript/playdata_etl_day.kjb > /home/www/allyes/a3tracker/bi/etl/kettle/logs/a3tracker_cloud_etl_"$yesterdayid"_"$vardate".txt
The complete script:
#!/bin/sh

# Return 0 if $1 is a valid date in yyyy/mm/dd form, 1 otherwise.
check_date() {
    [ $# -ne 1 ] && return 1
    _lenStr=`expr length "$1"`
    [ "$_lenStr" -ne 10 ] && return 1
    # date normalizes the input; it must match what was typed
    date -d "$1" "+%Y/%m/%d" | grep -q "$1"
    if [ $? -eq 1 ]
    then
        return 1
    else
        return 0
    fi
}

vardate=`date +%Y%m%d%H%M%S`
echo today is `date +%Y/%m/%d`
yesterday=`date -d "yesterday" +%Y/%m/%d`

# Parse options: -d <yyyy/mm/dd> overrides the default date (yesterday).
while [ -n "$1" ]; do
    case $1 in
        -d)
            shift
            yesterday=$1
            echo "your input is $yesterday"
            shift;;
        *)
            echo "$1 is not a valid parameter"
            break;;
    esac
done

check_date "$yesterday"
if [ $? -eq 1 ]; then
    echo "date format error! date format:(<yyyy/mm/dd>)"
    exit 1
fi

echo Data aggregation date : $yesterday

export JAVA_HOME=/usr/local/java/jdk
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/mysql-connector-java-5.1.18-bin.jar
export KETTLE_HOME=/home/www/allyes/a3tracker/bi/etl/kettle/kh_cloud/
export LC_ALL=en_US.UTF-8

echo "KETTLE_HOME=$KETTLE_HOME"
echo "starting..."
yesterdayid=`date -d "$yesterday" +%Y%m%d`
/home/www/allyes/a3tracker/bi/etl/kettle/data-integration/kitchen.sh -param:Yesterday=$yesterday -file /home/www/allyes/a3tracker/bi/etl/kettle/etlscript/playdata_etl_day.kjb > /home/www/allyes/a3tracker/bi/etl/kettle/logs/a3tracker_cloud_etl_"$yesterdayid"_"$vardate".txt
echo "done!"
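The script above prints "done!" whether or not the job succeeded. Below is a minimal sketch of a wrapper that checks the result, assuming kitchen.sh exits with a non-zero status when the job fails (as the Kitchen documentation describes). The function name run_job and the KITCHEN variable are hypothetical names introduced here, and the default path is a placeholder, not a real installation path.

```shell
#!/bin/sh
# Hypothetical wrapper around kitchen.sh; point KITCHEN at your installation.
KITCHEN="${KITCHEN:-/path/to/data-integration/kitchen.sh}"

# run_job JOB_FILE DATE LOG_FILE
# Runs the job with the named parameter Yesterday and sends all output
# (stdout and stderr) to LOG_FILE. Returns kitchen's exit status.
run_job() {
    "$KITCHEN" -file "$1" "-param:Yesterday=$2" > "$3" 2>&1 || {
        status=$?
        echo "kitchen failed with exit status $status, see $3" >&2
        return "$status"
    }
    echo "done!"
}
```

Called as run_job /path/to/job.kjb "$yesterday" "$logfile" || exit 1, this lets a scheduled script stop and report the failure instead of continuing silently.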
Passing parameters on the command line:
Some articles that explain this:
http://blog.csdn.net/john_f_lau/article/details/9260863
http://forums.pentaho.com/showthread.php?54423-Passing-parameters-to-jobs-on-kitchen-command-line
http://wiki.pentaho.com/display/EAI/Named+Parameters
http://wiki.pentaho.com/display/EAI/Kitchen+User+Documentation
http://blog.csdn.net/qqzyb/article/details/8939517
http://blog.sina.com.cn/s/blog_543e73a80100k0vz.html
http://www.cnblogs.com/wxjnew/p/3620792.html
Two examples that pass multiple parameters:
/home/www/allyes/aso/kettle/kitchen.sh -file /home/www/allyes/aso/etl/test.kjb -param:os='1' -param:appstore='all' -param:dt='2014-02-24' >/home/www/allyes/aso/etl/log.txt 2>/home/www/allyes/aso/etl/error.txt
/home/www/allyes/aso/kettle/kitchen.sh -file /home/www/allyes/aso/etl/test.kjb -param:os=1 -param:appstore='all' -param:dt='2014-02-24' -level=Detailed >/home/www/allyes/aso/etl/log.txt
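When the list of parameters grows, the -param: options can be generated rather than typed by hand. A minimal sketch follows; build_params is a hypothetical helper introduced here, not part of Kettle, and it assumes that neither names nor values contain spaces.

```shell
#!/bin/sh
# build_params NAME=VALUE...
# Prints the arguments as a space-separated list of -param:NAME=VALUE options.
# Arguments without "=" are skipped with a warning on stderr.
build_params() {
    out=""
    for kv in "$@"; do
        case $kv in
            *=*) out="$out -param:$kv" ;;
            *)   echo "skipping '$kv': expected NAME=VALUE" >&2 ;;
        esac
    done
    # Drop the leading space before printing.
    echo "${out# }"
}
```

It could be used as kitchen.sh -file test.kjb $(build_params os=1 appstore=all dt=2014-02-24), with the command substitution left unquoted on purpose so that each option becomes its own argument.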
On the command line, an option name may be followed by "=", ":" or a space; all three work, e.g. kitchen.bat /file d:\ or -file=D:\ or /file:D:\
kitchen.bat /norep -file=D:/kettledata/mysal2orcle.kjb >> kitchen_%date:~0,10%.log
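For a passed-in parameter to be usable, the corresponding parameter must first be added in the transformation's settings. It can then be read with the Get Variables step.
http://wiki.pentaho.com/display/EAI/Named+Parameters
http://type-exit.org/adventures-with-open-source-bi/2010/07/using-named-parameters-in-kettle/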
There are two formats (on Linux the double quotes can be omitted; on Windows each parameter must be wrapped in double quotes):

1: Named parameters
kitchen /file:"MyJob.kjb" /param:ServerName=MyServer
With multiple parameters:
Linux: ./kitchen.sh -file:job.kjb -param:files.dir=/opt/files -param:max.date=2010-06-02
Windows: Kitchen.bat -file:job.kjb "-param:files.dir=/opt/files" "-param:max.date=2010-06-02"

2: Positional command-line arguments
kitchen /file:"your job name.kjb" "command line argument 1" "command line argument 2" "command line argument 3"....
The -listparam option lists the named parameters defined in a file, together with their defaults and descriptions, for example:
sh pan.sh -file:/tmp/foo.ktr -listparam
Parameter: MASTER_HOST=, default=localhost : The master slave server hostname to connect to
Parameter: MASTER_PORT=, default=8080 : The master slave server HTTP control port

Running the same transformation with those parameters set:
user@host:$ sh pan.sh -file:/tmp/foo.ktr -param:MASTER_HOST=192.168.1.3 -param:MASTER_PORT=8181

Windows requires you to use quotes around the parameter, otherwise the equals sign is treated as a space by the command interpreter:
c:\> Pan.bat -file:/tmp/foo.ktr "-param:MASTER_HOST=192.168.1.3" "-param:MASTER_PORT=8181"
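The -listparam output shown above is regular enough to be read by a script, for instance to check which parameters a file expects before running it. This is a minimal sketch that assumes the lines look exactly like the sample above; real output may carry extra log lines, which the pattern simply ignores. parse_listparam is a hypothetical helper name introduced here.

```shell
#!/bin/sh
# parse_listparam
# Reads -listparam output on stdin and prints one "NAME DEFAULT" pair per
# parameter line. Lines that do not match the expected layout are dropped.
parse_listparam() {
    sed -n 's/^Parameter: \([^=]*\)=.*, default=\([^ ]*\) :.*$/\1 \2/p'
}
```

For example, sh pan.sh -file:/tmp/foo.ktr -listparam | parse_listparam would reduce the listing to the parameter names and their defaults.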
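Choosing a log level with the -level option:
(In the GUI run view, the log pane has three small icons at its top left; the last one, the wrench-and-hammer icon, sets the level.)
Rowlevel: prints all logging available in Kettle, including the information produced by large numbers of complex steps.
Debugging: produces a large amount of log output, mainly for debugging, but not at the row level.
Detailed: shows more information than the Basic level; the extra output includes, for example, SQL queries and generated DDL.
Basic: the default log level; prints only the information that reflects steps or job entries.
Minimal: reports only information about the job or transformation as a whole.
Error logging only: if there is an error, shows the error message; otherwise shows nothing.
Nothing at all: produces no log output, even when errors occur.

Copyright notice: this is an original article by the blogger "longshenlmj", released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/longshenlmj/article/details/20060877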