On a client project today we had the need to create a table from an XML data file sent in a hierarchical format. The XML file contains multiple orders. Each order is for a particular store which is provided inside a header tag. Each store can order multiple items which are provided via sub tags of each header. A simplified sample of the file looks like this:
[box type=”shadow”]
<orders>
<order_header>
<id>{6315E9FF}</id>
<account_number>12345</account_number>
<account_name>JOES CONVENIENCE</account_name>
<store_address>123 MAIN STREET</store_address>
<store_address_city>Pittsburgh</store_address_city>
<store_address_state>PA</store_address_state>
<store_address_postcode>15235</store_address_postcode>
<order_line>
<product_code>123456</product_code>
<product_externalid>123456</product_externalid>
<product_name>Peg Counter Spinner</product_name>
<order_quantity>1</order_quantity>
</order_line>
<order_line>
<product_code>723678</product_code>
<product_externalid>723678</product_externalid>
<product_name>Skittles</product_name>
<order_quantity>1</order_quantity>
</order_line>
</order_header>
<order_header>
<id>{3CB06C14}</id>
<account_number>23456</account_number>
<account_name>SMITH LIMITED</account_name>
<store_address>132 WEST AVENUE</store_address>
<store_address_city>Pittsburgh</store_address_city>
<store_address_state>PA</store_address_state>
<store_address_postcode>15213</store_address_postcode>
<order_line>
<product_code>123456</product_code>
<product_externalid>123456</product_externalid>
<product_name>Peg Counter Spinner</product_name>
<order_quantity>1</order_quantity>
</order_line>
</order_header>
</orders>
[/box]
The challenge was to write an Oracle query that could easily return both the header and detail information on a single row. After some quick research I found that this can be done by using two separate XMLTables that hit the same file. The first XMLTable contains store-level information. The second XMLTable contains item level information. When used together the second table starts with the XMLType (order_lines) that is defined in the first XMLTable. The outer query simply selects fields you need. It’s essentially a Cartesian product but it only intersects where appropriate given the “passing” clause. Here is a sample of the query to read the data. This query assumes the file name is orders.xml and it is sitting in the previously defined XMLDIR Oracle directory.
[box type=”shadow”]
SELECT stores.order_id
,stores.account_number
,stores.account_name
,stores.retail_number
,stores.store_name
,stores.store_address
,stores.store_state
,stores.store_zip
,items.*
FROM XMLTable('/orders/order_header'
passing xmltype( bfilename('STRIPE_UAT','wrigley_orders.xml'), nls_charset_id('AL32UTF8') )
columns order_id varchar2(2000) path 'id'
,account_number varchar2(50) path 'account_number'
,account_name varchar2(50) path 'account_name'
,retail_number varchar2(12) path 'store_externalid'
,store_name varchar2(50) path 'store'
,store_address varchar2(100) path 'store_address'
,store_state varchar2(5) path 'store_address_state'
,store_zip varchar2(10) path 'store_address_postcode'
,order_lines XMLTYPE path 'order_line'
) stores,
XMLTable('/order_line'
passing stores.order_lines
columns product_code varchar2(100) path 'product_code'
,product_name varchar2(100) path 'product_name'
,order_quantity varchar2(100) path 'order_quantity'
) items
[/box]
This methodology can easily be extended to query an XML structure n levels deep to return information in one result set.