The %%$PARSE Function
Overview of %%$PARSE
The %%$PARSE function parses a specified string (that is, it splits the specified string into substrings) according to a specified template. A template consists of variables and "patterns" that determine the parsing process.
The format of the %%$PARSE function is
DO SET=%%$PARSE string template
In this format
-
string is the AutoEdit variable that contains the string to be parsed
-
template is the AutoEdit variable or constant that contains the template
DO SET=%%S=THIS IS A SAMPLE STRING
DO SET=%%T=A1 A2 A3 A4 A5
DO SET=%%$PARSE %%S %%T
The %%$PARSE function assigns substrings of the specified string to the specified variables according to the specified template.
The DO SET statements in the above example provide the same result as the following DO SET statements:
DO SET=%%A1=THIS
DO SET=%%A2=IS
DO SET=%%A3=A
DO SET=%%A4=SAMPLE
DO SET=%%A5=STRING
The parsing process involves the following stages:
-
The string is broken into substrings, from left to right, using the patterns in the template.
-
Each substring is parsed into words, from left to right, using the variable names in the template.
Template elements are
-
String Patterns
-
Position Patterns
-
Variables
-
Place holders (Dummy variables)
The rules of parsing are detailed in the following paragraphs.
Parsing Words
Scanning is performed from left to right and words in the string (leading and trailing blanks excluded) are matched one by one with the variables named in the template. The last variable named in the template will contain the remaining part of the string, including leading and trailing blanks.
Up to 30 variable names can be specified in a parsing template.
The following situations can be encountered:
-
The number of words in the string matches the number of variables in the template.
Each of those variables contains one word of the string. The last variable contains the last word in the string including leading and trailing blanks.
-
The number of words in the string is smaller than the number of variables named in the template.
The first variables each contain one word of the string and the extra variables receive a value of NULL (a string of 0 character length).
-
The number of words in the string is greater than the number of variables in the template
All variables but the last one contain one word of the string and the last variable named in the template contains the remaining part of the string, including leading and trailing blanks.
The DO SET statements below (which include a %%$PARSE function)
DO SET=%%S = THIS IS A SAMPLE STRING
DO SET=%%T = A1 A2 A3
DO SET=%%$PARSE %%S %%T
have the same result as the following DO SET statements:
DO SET=%%A1 = THIS
DO SET=%%A2 = IS
DO SET=%%A3 = A SAMPLE STRING
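The three situations described above can be modelled outside Control-M. The following is a minimal Python sketch, not AutoEdit code: the function name parse_words is hypothetical, and the sketch assumes that the single blank separating a word from the remainder is dropped, which the rules above do not state explicitly.

```python
def parse_words(text, names):
    """Assign blank-delimited words of `text` to `names`, left to right."""
    values = {}
    rest = text
    for index, name in enumerate(names):
        if index == len(names) - 1:
            # Last variable: takes whatever is left, blanks included.
            values[name] = rest
        else:
            # Other variables: skip leading blanks, then take one word.
            # Assumption: the one blank after the word is discarded.
            word, _, rest = rest.lstrip(" ").partition(" ")
            values[name] = word
    return values


s = "THIS IS A SAMPLE STRING"
print(parse_words(s, ["A1", "A2", "A3", "A4", "A5"]))  # as many words as variables
print(parse_words(s, ["A1", "A2", "A3"]))              # more words than variables
print(parse_words("ONE TWO", ["A1", "A2", "A3"]))      # fewer words than variables
```

In the third call, A3 receives an empty string, which corresponds to the NULL value described above.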
Using Dummy Variables (Place Holders)
A single period can be used as a dummy variable in the template. This is useful when the corresponding word in the string does not need to be stored in a named variable.
The following DO SET statements (which include a %%$PARSE function)
DO SET=%%S = THIS IS A SAMPLE STRING
DO SET=%%T = . . . A4 .
DO SET=%%$PARSE %%S %%T
have the same result as the following DO SET statement:
DO SET=%%A4 = SAMPLE
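The effect of a placeholder in the last template position can be illustrated with a small Python model. This is a sketch under assumptions, not AutoEdit code: parse_words is a hypothetical helper that applies the word-parsing rule and simply discards any value matched against a period.

```python
def parse_words(text, template):
    """Word-parse `text` against a blank-separated template; '.' stores nothing."""
    names = template.split()
    values = {}
    rest = text
    for index, name in enumerate(names):
        if index == len(names) - 1:
            value = rest  # last template element receives the remainder
        else:
            value, _, rest = rest.lstrip(" ").partition(" ")
        if name != ".":
            values[name] = value
    return values


s = "THIS IS A SAMPLE STRING"
print(parse_words(s, ". . . A4 ."))  # trailing placeholder absorbs the remainder
print(parse_words(s, ". . . A4"))    # A4 is last, so it receives the remainder
```

In this model, a trailing placeholder limits A4 to a single word. Without it, A4 is the last element of the template and receives the rest of the string.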
Using Patterns in Parsing
Patterns can be included in the template. Their purpose is to break the string into substrings before it is parsed into words. Parsing is then performed, as previously described, on the substrings and not on the original string.
Two types of patterns are available:
-
String Patterns - a character string delimited by quotes, to distinguish it from a variable name
-
Numeric (Positional) Patterns - a number, signed or unsigned
Using String Patterns
The string is scanned from left to right for a substring that matches the string pattern.
The following situations may occur:
-
A match is found, that is, a substring within the string is identical to the given string pattern.
The original string is divided into two substrings. The first substring (up to, but not including, the string pattern) is parsed into words using the variables named before the string pattern on the template. Parsing continues from the character following the matched string.
DO SET=%%S= THIS IS A SAMPLE STRING
DO SET=%%T= A1 A2 'SAMPLE' A3 A4 A5
DO SET=%%$PARSE %%S %%T
A match is found since the string SAMPLE is part of the original string.
The original string is divided into two substrings while the matched part of the string is excluded. Parsing of the first substring will use the variables listed before the match on the template while parsing of the second substring will use the variables listed after the match:
-
First substring: THIS IS A
As a result of parsing:
A1=THIS
A2=IS A
-
Second substring: STRING
As a result of parsing:
A3=STRING
A4=NULL
A5=NULL
-
A match is not found. There is no substring identical to the given string pattern within the string.
It is assumed that a match is found at the end of the string. The first substring consists of the entire string and it is parsed using only the variables named before the string pattern on the template. Parsing continues from the character following the matched string (the end of the string, in this case).
DO SET=%%S = THIS IS A SAMPLE STRING
DO SET=%%T = A1 A2 A3 'EASY' A4 A5
DO SET=%%$PARSE %%S %%T
A match was not found. The string 'EASY' does not exist within the original string.
-
First substring: THIS IS A SAMPLE STRING
As a result of parsing:
A1=THIS
A2=IS
A3=A SAMPLE STRING
-
Second substring: NULL
As a result of parsing:
A4=NULL
A5=NULL
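The way a single string pattern divides the string, in both the matched and the unmatched case, can be sketched in Python. This is an illustrative model only; split_on_pattern is a hypothetical name, and the sketch covers just the split, not the word parsing that follows it.

```python
def split_on_pattern(text, pattern):
    """Return (before, after) around the first match of `pattern` in `text`."""
    index = text.find(pattern)
    if index < 0:
        # No match: treated as if the match were at the end of the string,
        # so the first substring is the whole string and the second is empty.
        return text, ""
    # Match: the matched characters themselves belong to neither substring.
    return text[:index], text[index + len(pattern):]


s = "THIS IS A SAMPLE STRING"
print(split_on_pattern(s, "SAMPLE"))
print(split_on_pattern(s, "EASY"))
```

Each returned substring would then be parsed into words using the variables named before and after the pattern, respectively.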
Using Numeric Patterns within the Template
Numeric patterns are numbers that mark positions in the string. They are used to break the original string into substrings at the position indicated by the number.
The position specified can be absolute or relative:
-
An absolute position is specified by an unsigned number.
-
A relative position is specified by a signed number (positive or negative). Its purpose is to determine a new position within the string, relative to the last position.
-
The last position is one of the following:
-
the start of the string (position 1), if the last position was not specified previously
-
the starting position of a string pattern if a match was found
-
the position of the end of the string, if the string pattern was not matched
-
the last position specified by a numeric pattern
If the specified position exceeds the length of the string, the numeric pattern is adjusted to the end of the string. Similarly, if the specified position precedes the beginning of the string (negative or zero numeric pattern), then the beginning of the string is used as last position.
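The position arithmetic described above can be sketched in Python. This is a model under assumptions, not AutoEdit code: resolve_position is a hypothetical name, and the upper limit is taken to be the first column beyond the string (length + 1), which is how Example 3 later in this section describes the readjustment.

```python
def resolve_position(pattern, last_position, length):
    """Turn a numeric pattern into a 1-based column within a string of `length`."""
    number = int(pattern)
    if pattern[0] in "+-":
        # Signed number: relative to the last position.
        position = last_position + number
    else:
        # Unsigned number: absolute column.
        position = number
    # Assumption: out-of-range positions are adjusted to column 1 or to the
    # first column beyond the end of the string.
    return max(1, min(position, length + 1))


length = len("THIS IS A SAMPLE STRING")          # 23 characters
print(resolve_position("11", 1, length))         # absolute pattern
print(resolve_position("+10", 1, length))        # relative to the start of the string
print(resolve_position("+40", 1, length))        # beyond the end, adjusted
print(resolve_position("-13", 24, length))       # relative to a previous position
```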
A parsing template with an absolute numeric pattern:
DO SET=%%S =THIS IS A SAMPLE STRING
DO SET=%%T = A1 A2 11 A3 A4 A5
DO SET=%%$PARSE %%S %%T
-
First substring: THIS IS A (up to, but not including, position 11)
As a result of parsing:
A1=THIS
A2=IS A
-
Second substring: SAMPLE STRING (from position 11, up to the end of the string).
As a result of parsing:
A3=SAMPLE
A4=STRING
A5=NULL (0-length string)
A parsing template with a relative numeric pattern:
DO SET=%%S =THIS IS A SAMPLE STRING
DO SET=%%T = A1 A2 +10 A3 A4 A5
DO SET=%%$PARSE %%S %%T
Last position is the beginning of the string (position 1).
Position marked within the string is 1 + 10 = 11.
-
First substring: THIS IS A (up to, but not including, position 11)
As a result of parsing:
A1=THIS
A2=IS A
-
Second substring: SAMPLE STRING (from position 11, up to the end of the string).
As a result of parsing:
A3=SAMPLE
A4=STRING
A5=NULL (0-length string)
Using More Than One Pattern and Combining Pattern Types in the Template
Both types of patterns (string and numeric) can be combined in the same template. Up to 30 patterns and up to 30 variable names can be specified.
Scanning of the string proceeds from beginning of the string until the first pattern (if any).
-
String pattern - A match was found
The substring that precedes the match to the pattern is parsed using the variables named in the template before the pattern, with the last variable receiving the end of the substring, including leading and trailing blanks.
-
String pattern - A match was not found
Since no match was found in the string, it is assumed that a match is found at the end of the string. The whole string is parsed using only the variables named in the template before the pattern.
-
Numeric pattern (absolute)
The absolute numeric pattern points to a position within the string, where the beginning of the string is position 1.
The string is divided into two substrings.
-
The first substring extends from the beginning of the string and up to, but not including, the position that corresponds to the numeric pattern and it is parsed using the variables named in the template before the pattern.
-
If the absolute numeric pattern specifies a position beyond the length of the string, it is readjusted to the first position beyond the length of the string and the entire string is parsed using the variables named in the template before the pattern.
-
Relative numeric pattern
The relative numeric pattern (a signed number) specifies a position within the string, relative to the last position.
-
Last position
It is the beginning of the string when the relative numeric pattern is the first pattern in the template.
-
If the relative numeric pattern is not the first pattern in the template and the previous pattern was numeric, the last position is that specified by the previous numeric pattern.
-
If the relative numeric pattern is not the first pattern in the template and the previous pattern was a string, the last position is that of the starting character of the match (if there was a match) or the position following the end of the string (if there was no match).
-
As a result of what was just explained:
-
If a pattern was not matched until the end of the string and the following pattern is a string pattern, this new string pattern is ignored since the starting point for the new scan is the end of the string.
-
If a pattern was not matched until the end of the string and the following pattern is a numeric pattern, then the scan and subsequent parsing will resume from the new position indicated by that numeric pattern.
Example 1
A parsing template with two absolute numeric patterns (with the second position preceding the first):
The following DO SET statements:
DO SET=%%S = THIS IS A SAMPLE STRING
DO SET=%%T = A1 A2 11 A3 6 A4
DO SET=%%$PARSE %%S %%T
have the same result as the following DO SET statements:
DO SET=%%A1 = THIS
DO SET=%%A2 = IS A
DO SET=%%A3 = SAMPLE STRING
DO SET=%%A4 = IS A SAMPLE STRING
-
First substring: THIS IS A (up to, not including, position 11)
As a result of parsing:
A1=THIS
A2=IS A
-
Second substring: SAMPLE STRING (from position 11 and up to the end of the string; since the next pattern, position 6, precedes the previous position, it cannot limit this second substring)
As a result of parsing:
A3=SAMPLE STRING
-
Third substring: IS A SAMPLE STRING (from position 6 and to the end of the string)
As a result of parsing:
A4=IS A SAMPLE STRING
Example 2
A parsing template with one absolute and one relative numeric pattern:
DO SET=%%S = THIS IS A SAMPLE STRING
DO SET=%%T = A1 6 A2 +3 A3
DO SET=%%$PARSE %%S %%T
-
First substring: THIS (beginning of the string up to, but not including, position 6).
As a result of parsing:
A1=THIS
-
Second substring: IS (from position 6 up to, but not including, position 6 + 3 = 9)
As a result of parsing:
A2=IS
-
Third substring: A SAMPLE STRING (from position 9 to the end of the string)
As a result of parsing:
A3=A SAMPLE STRING
Example 3
A parsing template with two relative numeric patterns:
The following DO SET statements
DO SET=%%T = A1 A2 +40 A3 -13 A4 A5
DO SET=%%S = THIS IS A SAMPLE STRING
DO SET=%%$PARSE %%S %%T
have the same result as the following DO SET statements:
DO SET=%%A1 = THIS
DO SET=%%A2 = IS A SAMPLE STRING
DO SET=%%A3 = %%NULL
DO SET=%%A4 = SAMPLE
DO SET=%%A5 = STRING
The first numeric pattern specifies a position at column 41 (1 + 40). This is beyond the end of the string, so the position is reset to column 24 (end of the string + 1). As a result, the whole string is parsed into words using the A1 and A2 variables.
The second numeric pattern specifies a position at column 11 (end of the string + 1, minus 13), which precedes the previously specified position (41, readjusted to 24). Therefore the data from the last position (the end of the string) to the end of the string is parsed into words using the A3 variable, and A3 is set to NULL.
The remaining data (from column 11 to the end of the string) is parsed into words using the A4 and A5 variables.
Example 4
A parsing template that combines a string pattern and a numeric pattern:
The following DO SET statements
DO SET=%%S = THIS IS A SAMPLE STRING
DO SET=%%T = A1 'A' A2 +3 A3
DO SET=%%$PARSE %%S %%T
have the same result as the following DO SET statements:
DO SET=%%A1 = THIS IS
DO SET=%%A2 = A S
DO SET=%%A3 = AMPLE STRING
The string pattern ('A') is matched at column 9. The data before column 9 is parsed into words using the A1 variable. The numeric pattern (+3) specifies column 12, a position relative to the start of the match (9 + 3). The data from column 9 up to, but not including, column 12 is parsed into words using the A2 variable. The remaining data (from column 12 to the end of the string) is parsed into words using the A3 variable.
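The four examples above can be reproduced with a compact Python model of the rules in this section. It is a sketch under stated assumptions, not the product's implementation. The names parse, assign_words and is_number are hypothetical. The limit of 30 patterns and 30 variables is not enforced. One blank after each word is assumed to be dropped. When a matched string pattern is followed by a relative numeric pattern, the next substring is assumed to start at the match itself, as Example 4 shows; otherwise it starts after the match.

```python
import re


def assign_words(text, names, values):
    """Parse one substring into words for the variables collected so far."""
    rest = text
    for index, name in enumerate(names):
        if index == len(names) - 1:
            value = rest  # last variable keeps the remainder, blanks included
        else:
            value, _, rest = rest.lstrip(" ").partition(" ")
        if name != ".":  # a period is a placeholder and stores nothing
            values[name] = value


def is_number(token):
    return re.fullmatch(r"[+-]?\d+", token) is not None


def parse(string, template):
    tokens = re.findall(r"'[^']*'|\S+", template)
    end = len(string) + 1  # first column beyond the string
    start = last = 1       # substring start and reference for relative patterns
    names, values = [], {}
    for i, token in enumerate(tokens):
        if token.startswith("'"):
            pattern = token[1:-1]
            found = string.find(pattern, start - 1)
            match = found + 1 if found >= 0 else end
            stop = match
            # Assumption from Example 4: a relative numeric pattern that
            # follows a match makes the next substring start at the match.
            later = [t for t in tokens[i + 1:] if t.startswith("'") or is_number(t)]
            relative_next = bool(later) and later[0][0] in "+-"
            if found < 0 or relative_next:
                resume = match
            else:
                resume = match + len(pattern)
            last = match
        elif is_number(token):
            target = last + int(token) if token[0] in "+-" else int(token)
            target = max(1, min(target, end))
            # A position at or before the current start cannot limit the substring.
            stop = target if target > start else end
            resume = last = target
        else:
            names.append(token)
            continue
        assign_words(string[start - 1:stop - 1], names, values)
        names, start = [], resume
    assign_words(string[start - 1:], names, values)
    return values


s = "THIS IS A SAMPLE STRING"
for t in ("A1 A2 11 A3 6 A4", "A1 6 A2 +3 A3", "A1 A2 +40 A3 -13 A4 A5", "A1 'A' A2 +3 A3"):
    print(t, "->", parse(s, t))
```

Values produced by this model may carry trailing blanks where a substring ends just before a pattern (for example, A1 in Example 2), because the last variable of each substring keeps its blanks.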