I would like to import an XML file containing hierarchial data into a table in SQLserver. I guess that I am supposed to use the XML source editor and connect it to the xml file containing the following:
<?xml version="1.0"?>
<VariantFamilies>
<VariantFamily FamilyID="XXX">
<FamilyDescription Language="en-GB">Variant Family Description in English</FamilyDescription>
<FamilyDescription Language="sv-SE">Variantfamiljsbeskrivning p? svenska</FamilyDescription>
<Variants>
<Variant VariantID="1XX">
<VariantDescription Language="en-GB">Variant Description 1 in English</VariantDescription>
<VariantDescription Language="sv-SE">Variantbeskrivning 1 p? svenska</VariantDescription>
</Variant>
<Variant VariantID="2XX">
<VariantDescription Language="en-GB">Variant Description 2 in English</VariantDescription>
<VariantDescription Language="sv-SE">Variantbeskrivning 2 p? svenska</VariantDescription>
</Variant>
</Variants>
</VariantFamily>
<VariantFamily FamilyID="XYY">
<FamilyDescription Language="en-GB">Variant Family 2 Description in English</FamilyDescription>
<FamilyDescription Language="sv-SE">Variantfamiljsbeskrivning 2 p? svenska</FamilyDescription>
<Variants>
<Variant VariantID="1YY">
<VariantDescription Language="en-GB">Variant Description 1 in English</VariantDescription>
<VariantDescription Language="sv-SE">Variantbeskrivning 1 p? svenska</VariantDescription>
</Variant>
<Variant VariantID="2YY">
<VariantDescription Language="en-GB">Variant Description 2 in English</VariantDescription>
<VariantDescription Language="sv-SE">Variantbeskrivning 2 p? svenska</VariantDescription>
</Variant>
</Variants>
</VariantFamily>
</VariantFamilies>
Following columns exist in the table: WK, VariantFamily, VariantFamilyDescription, VariantID, VariantDescription.
wk is automatic,
Variant family should contain XYY,
VariantFamilyDescription should contain Variant Family 2 Description in English,
VariantID should contain 2YY,
VariantDescription should contain Variant Description 2 in English
in the outputname I can only choose one of the attributes at a time (the system has created two new items for each ID to identify the "parent". Now how do I go about to put the actual information into the columns? NOT the system created IDs. I would like to have one row per smalles grain (VariantID) hence having redundant data in Variant family column.
does it have anything to do with how the XSD is defined? or do I need to use the xml task and how do I do then?
Thankful for descriptions of how to go about.
Sincerely,Hanna,
XMLSrc adapter builds up its metadata based on the schema, a complexType in xsd will be read as a table (which is an output at xmlsrc) and elements in that complexType will be columns of that table. Hirarchial data will become multiple tables with _id columns in each related table to track their relationship.
From the xml above, I don't see you can import it to a single table using XMLSrc. Depends on your xsd, you may need further steps in XMLSrc downstreams, unionAll? merge?, to accomplish your scenario.
Thanks
Wenyang|||
I seem to be in a similar conundrum. However, what I want to do is move data elements from my XML file into multiple SQL Server database tables. I've created the XML data flow source and multiple data flow destinations, each pointing to their own table.
As Wenyang Hu pointed out, the XMLSrc adapter builds up its metadata based on the schema, a complexType in xsd will be read as a table. I've now got over 50 tables, but can only select one (from the XML Source Editor columns form). Subsequently, when I go to add a data flow path to my data flow destination, I can only select a single output, which doesn't provide me with all the data I need to load a single database table.
I hope this makes sense and look forward to a solution.
Thanks,
Johnny
|||
That's true one xmlsrc output can be hooked with only one downstream, hence if you want several xmlsrc outputs to be loaded to one single destination, then as said above you'll need other transformation(s) either in between XMLSrc and destination, or, in another dataflow to manipulate.
For example, you can use a dataflow like
XMLSrc (output1)->Sort->MergeJoin->Sort->MergeJoin>Destination
(output2)->Sort--^ ^
(output3)->Sort-|
To Sort on various key columns then join on those key columns at MergeJoin to eventually merge these 3 xmlsrc outputs into one destination. Since you have 50 outputs from xmlsrc, you can also save them into 50 destionation tables first, then use another dataflow with those tables as sources, then use other SSIS transforms to process based on your specific data hierarchy.
Thanks
Wenyang
|||Either way, that will be quite a mapping exercise. I'm wondering if the 'Union All' transformation task might work. The help on that task indicates it can have multiple inputs. If I map multiple inputs from my XMLSrc outputs, then maybe this will work. Then I can have about 10 union all task to map to my 10 SQL tables. Thoughts? I'm hoping to understand the union all as the help doesn't indicate that it does anything unusual.thanks.
|||
UnionAll unions all its input rows but a) the column metadata will be built up based only on the primary input(the first one hooked up with unionAll) b)it does not "join" on key columns which you may need to merge those hierarchical data from xml file into one destination without loosing your data logic. So unless your multiple xmlsrc outputs have identical column structure, unionall won't fit your need I think.
Thanks
Wenyang
|||I have a similar problem, except the XML source outputs would be fine if I just could reference the id of the parent node.
Example:
<Property>
<PropertyId>TheIdINeed</PropertyId>
....
....
<Buildings>
<Building id="123">
.....
......
<Rooms>
<Room id ="ABC">
.....
.....
<Beds>
<Bed id="cfg">
.....
.....
</Bed>
</Beds>
</Room>
</Rooms>
</Building>
</Buildings>
</Property>
My problem is that I can't get the PropertyId from the Xml Source output for Building, Room,Bed.
I have tried to make a reference i the XSD and a new External column, but that was just Null.
Best regards
Claus
No comments:
Post a Comment