正文
最近公司要从聚石塔上抽取数据,其中有JSON格式数据,所以学习一下Kettle解析JSON,碰到小小问题,记录一下:
(1)
2015/07/15 15:22:48 - trade_detail.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Unexpected Error : org.pentaho.di.core.exception.KettleException:
2015/07/15 15:22:48 - trade_detail.0 - The data structure is not the same inside the resource! We found 2 values for json path [$..title], which is different that the number returned for path [$..buyer_alipay_no] (1 values). We MUST have the same number of values for all paths.
2015/07/15 15:22:48 - trade_detail.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : org.pentaho.di.core.exception.KettleException:
2015/07/15 15:22:48 - trade_detail.0 - The data structure is not the same inside the resource! We found 2 values for json path [$..title], which is different that the number returned for path [$..buyer_alipay_no] (1 values). We MUST have the same number of values for all paths.
2015/07/15 15:22:48 - trade_detail.0 -
原因是由于JSON嵌套内的对象还有同样的title,所以无法解析。
方法:
在外面这层用javascript解析,然后重命名,继续往下用JSON控件解析...
JAVASCRIPT代码:
var data = JSON.parse(trade);
var alipay_id = data.alipay_id;
var buyer_alipay_no = data.buyer_alipay_no;
var seller_nick = data.seller_nick;
var orders1 = data.orders;
var title1 = data.title;
var logistics_company1 = data.title;
这样只能解决2个同名的JSON结构,当JSON存在3个同名的,1个在外层,2个嵌套在不同的对象里面,还是会报错。
既然是path问题,那就看看是否可以修改path。
官网上又这样的路径,依样画葫芦,OK!
XPath JSONPath Description / $ the root object/element . @ the current object/element / . or [] child operator .. n/a parent operator // .. recursive descent. JSONPath borrows this syntax from E4X. * * wildcard. All objects/elements regardless their names. @ n/a attribute access. JSON structures don't have attributes. [] [] subscript operator. XPath uses it to iterate over element collections and forpredicates. In Javascript and JSON it is the native array operator. | [,] Union operator in XPath results in a combination of node sets. JSONPath allows alternate names or array indices as a set. n/a [start:end:step] array slice operator borrowed from ES4. [] ?() applies a filter (script) expression. n/a () script expression, using the underlying script engine. () n/a grouping in Xpath XPath JSONPath Result /store/book/author $.store.book[*].author the authors of all books in the store //author $..author all authors /store/* $.store.* all things in store, which are some books and a red bicycle. /store//price $.store..price the price of everything in the store. //book[3] $..book[2] the third book //book[last()] $..book[(@.length-1)]$..book[-1:] the last book in order. //book[position()<3] $..book[0,1]
$..book[:2] the first two books //book[isbn] $..book[?(@.isbn)] filter all books with isbn number //book[price<10] $..book[?(@.price<10)] filter all books cheapier than 10 //* $..* all Elements in XML document. All members of JSON structure.
官网关于path的URL:+Input
根据官网提示链接到:
(2)
2015/07/15 15:30:48 - trade_detail.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Unexpected Error : org.pentaho.di.core.exception.KettleException:
2015/07/15 15:30:48 - trade_detail.0 - org.pentaho.di.core.exception.KettleException:
2015/07/15 15:22:48 - trade_detail.0 -
2015/07/15 15:22:48 - trade_detail.0 -
2015/07/15 15:22:48 - trade_detail.0 -We can not find and data with path [$..logistics_company]
原因:JSON中存在不完整的字段,无法解析。
方法:
1、用javascript脚本读取内容
2、勾选json控件-->内容-->设置忽视不完整的路径
(不过这个方式对嵌套多层的无效)
javascript脚本例子:
var json={"status":1,"info":"u83b7u53d6u6210u529f","data":[{ "name":"dingding","information":"电视迷","Region_id":"3","type":"1"},{"id":"5","name":"kenwong","information":"you are ok","Region_id":"5","type":"100"}]}
var data=json.data;
var str1 ="";
var str2 ="";
var str3 ="";
for(i=0;i<data.length;i++)
{
str1 += data[i].name +","
str2 += data[i].information +",";
str3 += data[i].id +",";
}