XML FIle Parsing and validations in SSIS
I was looking for the best way of extracting an xml file in SSIS,
I have a xml source file with a specific data structure and data types which need to be transformed into a target schema of sqlserver database of a different structure,tthe source file is a complex structure which need to
be flattened and validated
With regards to data validations and business rules ,how should be the extraction based on ,should we create a XML Schema with necessary validations ,and reject the files if they don't confirm to the schema,
With regards to changing the structure of the data of the xml data ,what is the best way to implement it ,should we implement with XSLT with XML task to flatten the structure or custom coding with script task in .net framework,also
any specific things to avoid while extracting xml using ssis,i have been following previous experience of other people on this forum complaining about the size of the file.
Thanks
July 27th, 2012 8:20am
you can use XML Task to do VALIDATION, this is an example:
http://www.rad.pasfu.com/index.php?/archives/20-XML-Task-Validating-Data.html
you can use XML Task with XSLT operation to Changing STRUCTURE, like this:
http://www.rad.pasfu.com/index.php?/archives/21-XML-Task-Changing-Style-of-Data-XSLT.html
you also can use Data flow's XML Source if you want to fetch xml data and load it into destination table
http://www.rad.pasfu.com
Free Windows Admin Tool Kit Click here and download it now
July 27th, 2012 8:28am
if your xml is specific (not dynamic with "any" tag), you can generate the xsd for the files itself in ssis (under xml transfromation task) and give it a try. Also, error handling will be not be an issue in ssis. Data validation and business rules can be
implemented inside ssis.
http://blog.hoegaerden.be/2011/04/07/loading-xml-using-ssis/
regarding the flattening the file, you will get each element from the file as a separate column, which is itself flattened. XSLT will be easier to implement but script task has more options available(thigns you can do in asp.net).
xslt : http://blogs.msdn.com/b/mattm/archive/2007/12/15/xml-source-making-things-easier-with-xslt.aspx
regards
joon
July 27th, 2012 9:09am
Thank you Joon,
I was looking to what level we can validate the xml file with xml schema,in general people usually use the ssis generate schema to validate the xml file ,so what is the ideal thing to do ,is it good to create a custom schema with various validations
in it or just use the generate schema in ssis xml source task and do the rest of the validations down the flow.
If i need to validate Data types ,Column Uniqueness,and Header validations what should be the best approach to do ,should we do basic data type validations while loading the xml using xml schema and reject invalid files and with
schema changes, and do the rest of validations ,business rules in ssis.
My xml file generated 12 nodes with xml source componant which is a kind of second normal form of data which need to be flattened,i tried generating xml task with multiple schemas ,and used multiple merge joins to get it to the required
structure,but it is proving very cumbersome and i dont believe it is the right approach.
Thanks
Free Windows Admin Tool Kit Click here and download it now
July 27th, 2012 9:27am
Thank you Joon,
I was looking to what level we can validate the xml file with xml schema,
If i need to validate Data types ,Column Uniqueness,and Header validations what should be the best approach to do ,should we do basic data type validations while loading the xml using xml schema and reject invalid files and with
schema changes, and do the rest of validations ,business rules in ssis.
My xml file generated 12 nodes with xml source componant which is a kind of second normal form of data which need to be flattened,i tried generating xml task with multiple schemas ,and used multiple merge joins to get it to the required
structure,but it is proving very cumbersome and i dont believe it is the right approach.
Thanks
July 27th, 2012 9:32am