BizTalk Flat File Parsing Annotations

Flat File Parsing = LL(k) Parser
The flat file parser is entirely grammar driven and is implemented as an LL(k) parser (look-ahead LL parser). Schemas are translated into a grammar, which is in turn translated into the tables used during parsing. See http://en.wikipedia.org/wiki/LL_parser
The FFParser for BTS2004 is a streaming parser.
Source of information: newsgroups, David Downing.
Manual annotations are attributes of the /annotation/appinfo/schemaInfo element.
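To show where the manual annotations sit, here is a minimal Python sketch that reads the attributes of /annotation/appinfo/schemaInfo from a schema. The embedded schema, the attribute values, and the helper name schema_info_attributes are made up for illustration, and the BizTalk namespace URI is an assumption to check against your own .xsd files.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
# Assumption: namespace URI of the BizTalk schema annotations.
B = "http://schemas.microsoft.com/BizTalk/2003"

# Hypothetical, heavily trimmed flat file schema used only for this sketch.
SCHEMA = f"""<xs:schema xmlns:xs="{XS}" xmlns:b="{B}">
  <xs:annotation>
    <xs:appinfo>
      <b:schemaInfo standard="Flat File"
                    parser_optimization="complexity"
                    lookahead_depth="0"
                    suppress_empty_nodes="true" />
    </xs:appinfo>
  </xs:annotation>
</xs:schema>"""


def schema_info_attributes(xsd_text):
    """Return the attributes of /annotation/appinfo/schemaInfo as a dict."""
    root = ET.fromstring(xsd_text)
    path = f"{{{XS}}}annotation/{{{XS}}}appinfo/{{{B}}}schemaInfo"
    node = root.find(path)
    return dict(node.attrib) if node is not None else {}


info = schema_info_attributes(SCHEMA)
for name, value in sorted(info.items()):
    print(f"{name} = {value}")
```

The annotations discussed on the following slides are plain attributes on that one element, so they can be inspected or edited with any XML tooling.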

suppress_empty_nodes="true|false"
Removes empty nodes from the XML stream. Use it to eliminate fields that are empty after parsing when the XSD type does not allow empty values.
Default = false

suppress_empty_nodes="true|false" Sample
Settings compared: false, true
Instances:
Field1-Field2-Field3
Field1--Field3
Field1-Field2-
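The effect on the three sample instances can be imitated with a toy model in Python. This is only a sketch of the behaviour described above, not how the BizTalk parser is implemented; the function names, the record name Record1, and the element names are assumptions.

```python
def parse_delimited(line, names, delimiter="-", suppress_empty_nodes=False):
    """Toy model: split an infix-delimited record into (name, value) pairs."""
    pairs = list(zip(names, line.split(delimiter)))
    if suppress_empty_nodes:
        # Drop fields that came out empty after parsing.
        pairs = [(name, value) for name, value in pairs if value != ""]
    return pairs


def to_xml(pairs, record="Record1"):
    """Render the parsed pairs as a small XML fragment."""
    body = "".join(f"<{name}>{value}</{name}>" for name, value in pairs)
    return f"<{record}>{body}</{record}>"


NAMES = ["Field1", "Field2", "Field3"]
for instance in ["Field1-Field2-Field3", "Field1--Field3", "Field1-Field2-"]:
    for flag in (False, True):
        xml = to_xml(parse_delimited(instance, NAMES, suppress_empty_nodes=flag))
        print(f"{instance!r} suppress_empty_nodes={flag}: {xml}")
```

In this model the two settings differ only for the instances that contain an empty field: with false an empty element is kept, with true it is left out.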

generate_empty_nodes="true|false"
Applies to serialization. Generates empty nodes for records that exist in the XML instance data, that is, it adds missing fields to those records in the XML stream. This lets positional records line up correctly and puts the delimiters in place in delimited records.
Default = true

generate_empty_nodes="true|false" Sample, positional (Record1/Field1, Field2, Field3, each 3 positions)
true: AAA   CCC
false: AAACCC

generate_empty_nodes="true|false" Sample, delimited (Record1/Field1, Field2, Field3, infix delimiter ',')
true: AAA,,CCC
false: AAA,CCC
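Both serialization samples can be reproduced with a small Python sketch. It is a toy model under stated assumptions: the record is a dict in which Field2 is missing, the pad character is a space, and the function names are invented for illustration.

```python
def serialize_positional(record, layout, generate_empty_nodes=True, pad=" "):
    """Toy positional serializer; layout is a list of (field name, width)."""
    out = []
    for name, width in layout:
        if name in record:
            out.append(record[name][:width].ljust(width, pad))
        elif generate_empty_nodes:
            # Missing field: emit padding so later fields keep their columns.
            out.append(pad * width)
    return "".join(out)


def serialize_delimited(record, names, generate_empty_nodes=True, delimiter=","):
    """Toy infix-delimited serializer."""
    if generate_empty_nodes:
        # Missing field: emit an empty value so the delimiter count stays fixed.
        values = [record.get(name, "") for name in names]
    else:
        values = [record[name] for name in names if name in record]
    return delimiter.join(values)


RECORD = {"Field1": "AAA", "Field3": "CCC"}  # Field2 is absent from the XML
LAYOUT = [("Field1", 3), ("Field2", 3), ("Field3", 3)]
NAMES = [name for name, _ in LAYOUT]

for flag in (True, False):
    print(flag, repr(serialize_positional(RECORD, LAYOUT, flag)),
          repr(serialize_delimited(RECORD, NAMES, flag)))
```

With false, the receiver can no longer tell that CCC belongs to Field3, which is why the default is true.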

allow_early_termination="true|false"
Because the parser is grammar driven, delimiters encountered while parsing a positional record are not treated as delimiters; they are treated as part of the positional field currently being parsed.
This annotation lets the right-most positional field be treated as a delimited field, i.e. it can be shorter or longer than specified by its pos_length setting. Only the right-most positional field is allowed to terminate early.
Right-most = the right-most child of a positional record as it is parsed from left to right, including positional child records of the parent positional record.
With parser_optimization="complexity" and allow_early_termination="true" you effectively turn the right-most positional field into a delimited field, which allows it to terminate early.

allow_early_termination="true|false" Sample
1. Ok: AAABBBCCC(0x0D 0x0A)
2. Nok: AAABBBCC(0x0D 0x0A) AAABBBCCC(0x0D 0x0A)
With allow_early_termination="true":
2. Nok is now Ok
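The sample can be mimicked with a toy positional parser in Python. This is a sketch of the described behaviour only, assuming three fields of three positions each and records already split on CR LF; the function name and the error messages are invented.

```python
def parse_positional(line, layout, allow_early_termination=False):
    """Toy positional parser; layout is a list of (field name, width)."""
    fields = {}
    pos = 0
    last = len(layout) - 1
    for i, (name, width) in enumerate(layout):
        if i == last and allow_early_termination:
            # Right-most field behaves like a delimited field:
            # it takes whatever is left in the record.
            value = line[pos:]
        else:
            value = line[pos:pos + width]
            if len(value) < width:
                raise ValueError(f"{name}: expected {width} characters, got {len(value)}")
        fields[name] = value
        pos += len(value)
    if pos != len(line):
        raise ValueError("unexpected data after the last field")
    return fields


LAYOUT = [("Field1", 3), ("Field2", 3), ("Field3", 3)]
stream = "AAABBBCC\r\nAAABBBCCC\r\n"  # instance 2 from the sample

for flag in (False, True):
    try:
        records = [parse_positional(r, LAYOUT, flag) for r in stream.split("\r\n") if r]
        print(flag, "Ok", records)
    except ValueError as err:
        print(flag, "Nok", err)
```

Only the last field in the layout is relaxed; a short Field1 or Field2 still fails in this model, matching the rule that only the right-most field may terminate early.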

parser_optimization="complexity|speed"
Controls the grammar generated from the schema used to parse the flat file document.
complexity: produces a more complex grammar and can parse records that have complex nested optional children. The parser is much more flexible in the data it can handle with this setting, but at the expense of parsing speed, and not all data layouts can be parsed successfully.
speed: optimizes for speed and is limited in the complexity of the data that can be parsed.
When parser_optimization is set to complexity, you may get validation failures against a schema that has many optional nodes in the same group or record. You may need to set lookahead_depth to zero (0) to avoid these validation errors.
Default = speed

parser_optimization="complexity|speed" Sample
Sample from BizTalk help (do not forget to set lookahead_depth="0"): ms-help://BTS_2004/SDK/htm/ebiz_prog_pipe_vhtb.htm
Root ("," prefix)
  Field1 (optional)
  Field2 (optional)
  Field3 (optional)
  Field4 (optional)
  Record1 ("," infix)
    Field5
    Field6

parser_optimization="complexity" Sample
Instance: ,1,2,3,4
Output (Record1 mandatory):
Output (Record1 changed to optional, minOccurs=0):

parser_optimization="speed" Sample
Instance: ,1,2,3,4
Output (Record1 optional, minOccurs=0):
Output (Record1 changed to mandatory): Parsing Error!

parser_optimization="complexity|speed" Conclusion
Complexity setting: the parsing engine uses both top-down and bottom-up parsing.
Speed setting: the parsing engine uses top-down parsing only.

lookahead_depth="nn"
Controls how far ahead the parser looks in the parsing token stream to make a parsing prediction when matching data. 0 means infinite lookahead. The higher the number, the more expensive it is to locate matches during the parse, and the more memory is consumed.
Ideal: evaluate lookahead_depth from infinite (0) down to the minimum value that still works, then add 2. Because the parser is grammar driven and the schema goes through several transformations before becoming the grammar, the value cannot be correlated back to the schema itself.
Default = 3
lookahead_depth applies to speed mode as well, but because the grammar generated in speed mode is much less complex, it is harder to construct a scenario that demonstrates its effect.

lookahead_depth="nn" Sample
Instance (3 fields): Field1+Field1+Field1+
Result (depth=2): missing data! Only 2 fields in the XML.
Result (depth=3 or depth=0): Ok, all 3 fields in the XML.
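One possible reading of the tuning advice (start from infinite lookahead, find the minimum depth that still works, add 2) is sketched below in Python. Everything here is hypothetical: parse_with_depth stands for whatever test harness runs your schema against a representative instance, fake_parse merely imitates the sample above (its depth=1 result is invented), and max_depth is an arbitrary cut-off.

```python
def recommend_lookahead_depth(parse_with_depth, max_depth=32):
    """Return (smallest depth whose output matches depth 0) + 2.

    parse_with_depth(depth) is a hypothetical callable that parses a
    representative instance and returns the output, or raises ValueError.
    Falls back to 0 (infinite lookahead) when no finite depth matches.
    """
    baseline = parse_with_depth(0)  # infinite lookahead as the reference
    for depth in range(1, max_depth + 1):
        try:
            if parse_with_depth(depth) == baseline:
                return depth + 2  # safety margin suggested on the slide
        except ValueError:
            continue
    return 0


def fake_parse(depth):
    """Stand-in for the sample: depths below 3 lose the third field."""
    fields = ["Field1", "Field1", "Field1"]
    return fields if depth == 0 or depth >= 3 else fields[:2]


print(recommend_lookahead_depth(fake_parse))
```

Comparing against the depth 0 output matters because, as the sample shows, too small a depth can silently drop data instead of raising an error.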
