Altair® Monarch®

Using Address Blocks

Address blocks provide a way to parse address text into its component parts. An address block is a named object. Its definition includes its name, postal code format flags (the type of postal codes it recognizes), a set of one or more input fields, and a set of output fields. The output fields are the "pieces" that are extracted from the input text. Possible output fields are: up to six (6) generic address lines, city, region, postal code, country, and an error code.
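The parts of an address block definition can be pictured as a small data structure. The following Python sketch is illustrative only: the class names, flag names, and field names are hypothetical and are not part of Monarch, which defines address blocks through the wizard rather than through code.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class PostalFormat(Flag):
    """Hypothetical flags, one per postal code format a block may recognize."""
    USA = auto()
    AUSTRALIA = auto()
    CANADA = auto()
    EUROPE = auto()
    UK = auto()
    IRELAND = auto()
    BRAZIL = auto()
    INDIA = auto()
    DIGITS_6 = auto()
    DIGITS_3 = auto()
    DIGITS_2 = auto()
    DIGITS_1 = auto()


# The possible output pieces: six generic address lines plus five named pieces.
OUTPUT_PIECES = [f"address_line_{i}" for i in range(1, 7)] + [
    "city", "region", "postal_code", "country", "error_code",
]


@dataclass
class AddressBlock:
    name: str
    formats: PostalFormat
    input_fields: list
    # Maps an output piece (e.g. "city") to the field name chosen for it.
    output_fields: dict = field(default_factory=dict)


block = AddressBlock(
    name="CustomerAddress",
    formats=PostalFormat.USA | PostalFormat.CANADA,
    input_fields=["RawAddress"],
    output_fields={"city": "Cust_City", "postal_code": "Cust_Zip"},
)
```

Combining the formats as flags mirrors the fact that one address block can enable several postal code formats at once.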

The individual address fields (the output fields) of an address block behave as a special kind of calculated field. They are accessible from the "Calculated Fields" list, but they have no formula, and their data type is fixed as "character" (except the error code, which is numeric).

Address blocks are defined via the Address Block wizard, which can be displayed by the following steps:

  1. Select Address Blocks from the Data group of the Table tab to display the Address Blocks menu.

  2. Select New.

Address block postal code formats

Postal Code Format

Description

5-digit

A pattern of "ddddd" is recognized as a 5-digit postal code (US ZIP code) if USA postal codes are enabled.

ZIP+4

A pattern of "ddddd-dddd" is recognized as a USA ZIP+4 postal code if USA postal codes are enabled.

4-digit

A pattern of "dddd" is recognized as a 4-digit postal code (Australia and New Zealand) if Australian postal codes are enabled.

Canada

A pattern of "ada dad" is recognized as a Canadian postal code if Canadian postal codes are enabled.
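In the pattern notation used in this table, "d" stands for a digit and "a" for a letter, while other characters (space, dash) are literal. As a rough illustration, such a pattern can be translated into a regular expression. This Python sketch is not Monarch's implementation; the function names are hypothetical, and treating letters as case-insensitive ASCII is an assumption.

```python
import re


def pattern_to_regex(pattern):
    """Translate notation such as "ada dad" into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "d":
            parts.append("[0-9]")          # "d" = one digit
        elif ch == "a":
            parts.append("[A-Za-z]")       # "a" = one letter (case is an assumption)
        else:
            parts.append(re.escape(ch))    # anything else is matched literally
    return re.compile("".join(parts))


def matches(pattern, text):
    """True if the whole text fits the pattern."""
    return pattern_to_regex(pattern).fullmatch(text) is not None
```

For example, matches("ada dad", "K1A 0B1") holds, while matches("dddd", "90210") does not, because the whole text must fit the pattern.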

 

Europe

Any of several forms like "a-dddd", "aaddddd", etc., are recognized if Continental European postal codes are enabled.

The particular patterns that are recognized as European postal codes are quite varied. Generally, there is a 3-, 4- or 5-digit number, optionally preceded by a 1-, 2- or 3-letter country prefix. The country prefix may be separated from the digits by a space, a dash, or nothing.

Ambiguities can arise when European postal codes appear without their country prefix. For example, a pattern of "ddddd" might be either a 5-digit ZIP code or a 5-digit European postal code that lacks a country prefix. If both USA and European codes are enabled, this ambiguity is resolved by looking at where the 5-digit code appears in context: at the logical end of line (EOL) it is treated as a ZIP code, and at the logical beginning of line (BOL) it is treated as a European postal code. If it appears at both BOL and EOL, it is assumed to be a ZIP code.

Similarly, there can be an ambiguity if the country prefix of a European postal code is separated from the digits by a space. If this type of pattern appears at logical EOL, it is NOT interpreted as a European postal code if the digits can be construed as a 3-, 4- or 5-digit code in their own right.
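The position rule for an unprefixed 5-digit code can be summarized in a few lines. This Python sketch is a reading of the rule above, not the parser's actual code. The function name and return values are hypothetical. Returning None for a code that is at neither BOL nor EOL, and simply picking the enabled family when only one is enabled, are assumptions, since the text does not cover those cases.

```python
from typing import Optional


def classify_five_digits(at_bol, at_eol, usa_enabled, europe_enabled):
    # type: (bool, bool, bool, bool) -> Optional[str]
    """Decide how an unprefixed "ddddd" word is interpreted."""
    if usa_enabled and europe_enabled:
        if at_eol:
            return "zip"      # EOL wins, including the BOL-and-EOL case
        if at_bol:
            return "europe"
        return None           # assumption: mid-line position is not described
    if usa_enabled:
        return "zip"          # assumption: no ambiguity to resolve
    if europe_enabled:
        return "europe"       # assumption: no ambiguity to resolve
    return None
```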

UK

A variety of patterns ("ad daa", "aad daa", "aadd daa" etc.) are recognized as UK postal codes if UK postal codes are enabled. Also, shortened patterns (those that match the first part but lack the "daa" pattern) are recognized as UK postal codes if they follow the word "London".
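The "London" rule for shortened codes can be sketched as follows. This Python example is an approximation under stated assumptions: the exact set of UK patterns Monarch accepts is not listed beyond the examples above, so the regular expression here (one or two letters, one or two digits, an optional trailing letter) is a guess at the "first part", and the function name is hypothetical.

```python
import re

# Assumed shape of the first part of a UK postal code, e.g. "M1", "NW3", "SW1A".
FIRST_PART = r"[A-Z]{1,2}[0-9]{1,2}[A-Z]?"
FULL_CODE = re.compile(rf"{FIRST_PART} [0-9][A-Z]{{2}}")   # first part + "daa"
SHORT_CODE = re.compile(FIRST_PART)


def is_uk_code(text, previous_word=""):
    """Full codes always qualify; shortened codes only right after "London"."""
    text = text.upper()
    if FULL_CODE.fullmatch(text):
        return True
    follows_london = previous_word.rstrip(",").lower() == "london"
    return follows_london and SHORT_CODE.fullmatch(text) is not None
```

With this sketch, "NW3" is accepted when previous_word is "London" and rejected otherwise, while "M1 1AA" is accepted in any position.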

Ireland

Ireland does not use postal codes, except for addresses in Dublin. A 1- or 2-digit number or the special code "6W" is recognized as an Irish postal code under the following conditions:

  • Irish postal codes are enabled.

  • The 1- or 2-digit number or "6W" is the next word after the word "Dublin".
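As an illustration of the second condition, the following Python sketch scans a list of words for a Dublin code. It assumes Irish postal codes are enabled and that the address has already been split into words. The function name is hypothetical, and ignoring a trailing comma after "Dublin" is an assumption.

```python
import re

# A 1- or 2-digit number, or the special code "6W".
DUBLIN_CODE = re.compile(r"[0-9]{1,2}|6W", re.IGNORECASE)


def find_dublin_code(words):
    """Return the word that directly follows "Dublin" if it is a valid code."""
    for previous, word in zip(words, words[1:]):
        if previous.rstrip(",").lower() == "dublin" and DUBLIN_CODE.fullmatch(word):
            return word
    return None
```

Note that a number elsewhere in the address, such as a house number, is not picked up, because only the word immediately after "Dublin" is tested.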

Since many Irish addresses do not contain a postal code, when an address lacks a postal code the parser looks specifically for clues that identify it as an Irish address. The algorithm extracts the last three logical lines (pieces between BOL and EOL) of the address, then dissects the address using the following logic:

If the last line can be recognized as some form of the name "Ireland" (such as "Ireland", "Republic of Ireland", "ROI", "Eire", or "Eireann"), then this is taken as the country name. The previous line is then taken as the region (if it starts with "County" or "Co."), and the line before that is taken as the city.

Failing that, if the last line can be recognized as a county (for example, if it starts with the word "County" or "Co."), then this is taken as the region and the previous line is taken as the city.

Failing that, the last line is taken as the city, but the "No Postal Code" error is signaled.
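The three-way fallback above can be expressed as a short function. This Python sketch follows the description but is not the parser's actual code; the names are hypothetical. Two points are assumptions: when the country is found but the previous line is not a county, the previous line is taken as the city, and no error is reported in the first two cases.

```python
import re

IRELAND_NAMES = {"ireland", "republic of ireland", "roi", "eire", "eireann"}
COUNTY_PREFIX = re.compile(r"(county|co\.)\s", re.IGNORECASE)
NO_POSTAL_CODE = 1  # "No postal code found" error code value


def parse_irish_tail(logical_lines):
    """Dissect the last three logical lines of an address with no postal code."""
    tail = [line.strip() for line in logical_lines[-3:]]
    tail = [""] * (3 - len(tail)) + tail      # pad short addresses on the left
    before, previous, last = tail
    result = {"city": "", "region": "", "country": "", "error": 0}

    if last.lower() in IRELAND_NAMES:
        result["country"] = last
        if COUNTY_PREFIX.match(previous):
            result["region"] = previous
            result["city"] = before
        else:
            # Assumption: with no county line, the previous line is the city.
            result["city"] = previous
    elif COUNTY_PREFIX.match(last):
        result["region"] = last
        result["city"] = previous
    else:
        result["city"] = last
        result["error"] = NO_POSTAL_CODE
    return result
```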

Brazil

A pattern of "ddddd-ddd" is recognized as a Brazilian postal code under the following conditions:

  • Brazilian postal codes are enabled.

  • The pattern occurs at logical BOL.

India

A pattern of "ddd ddd" is recognized as an Indian postal code if Indian postal codes are enabled.

Note: If Indian postal codes are not enabled, the pattern "ddd ddd" is still recognized, but is treated as an ordinary word. This behavior prevents its separate "ddd" parts from being misinterpreted as either a generic 3-digit code or a 3-digit European code.
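One way to picture the behavior in the note is a tokenizer that always keeps "ddd ddd" together as a single word and only labels it as a postal code when Indian codes are enabled. The Python sketch below is a hypothetical illustration of that idea, not Monarch's tokenizer. The function name and the "postal"/"word" labels are invented for the example.

```python
import re

# Either a standalone "ddd ddd" group or any run of non-space characters.
TOKEN = re.compile(r"(?<!\S)[0-9]{3} [0-9]{3}(?!\S)|\S+")
INDIAN_CODE = re.compile(r"[0-9]{3} [0-9]{3}")


def tokenize(text, india_enabled):
    """Split text into (word, kind) pairs, keeping "ddd ddd" as one word."""
    tokens = []
    for word in TOKEN.findall(text):
        is_code = india_enabled and INDIAN_CODE.fullmatch(word) is not None
        tokens.append((word, "postal" if is_code else "word"))
    return tokens
```

Because "400 001" stays a single word even when Indian codes are disabled, neither "400" nor "001" is ever offered to the 3-digit rules on its own.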

6-digit

A pattern of "dddddd" is recognized as a generic 6-digit postal code under the following conditions:

  • 6-digit postal codes are enabled.

  • The pattern occurs at logical EOL.

3-digit

A pattern of "ddd" is recognized as a generic 3-digit postal code under the following conditions:

  • 3-digit postal codes are enabled.

  • The pattern occurs at logical EOL.

Note: A pattern of "ddd" that appears at logical BOL, but NOT at logical EOL, is recognized as a European postal code (assuming European postal codes are enabled).

2-digit

A pattern of "dd" is recognized as a generic 2-digit postal code under the following conditions:

  • 2-digit postal codes are enabled.

  • The pattern occurs at logical EOL.

  • The pattern is NOT the next word after the word "cedex" (if European postal codes are enabled).

1-digit

A pattern of "d" is recognized as a generic 1-digit postal code under the following conditions:

  • 1-digit postal codes are enabled.

  • The pattern occurs at logical EOL.

  • The pattern is not the next word after the word "cedex" (if European postal codes are enabled).
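The conditions for the generic 6-, 3-, 2- and 1-digit formats share the same shape, so they can be checked by one function. The Python sketch below is a simplified reading of those conditions; the function and parameter names are hypothetical, and it deliberately ignores interactions with the other formats (for example, a "ddd" at BOL being taken as a European code).

```python
def is_generic_code(word, at_eol, previous_word, enabled_lengths, europe_enabled):
    """Check a digits-only word against the generic n-digit postal code rules.

    enabled_lengths is the set of enabled generic lengths, e.g. {2, 6}.
    """
    if not (word.isascii() and word.isdigit()):
        return False
    length = len(word)
    if length not in enabled_lengths:
        return False                      # that generic format is not enabled
    if not at_eol:
        return False                      # generic codes must be at logical EOL
    if length in (1, 2) and europe_enabled and previous_word.lower() == "cedex":
        return False                      # a number after "cedex" is not a code
    return True
```

The "cedex" test applies only to the 1- and 2-digit formats and only when European postal codes are enabled, as stated in the conditions above.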

 

Creating an address block

Steps:

  1. Open a report that contains an address.

  2. Create a template to trap the "raw" address text via a memo field.

  3. Go to Table view and select Data, Address Blocks (ALT, D, A) from the menu (or click the Address Blocks button). The Address Blocks dialog displays.

  4. Click the New button to display the Address Block wizard.

  5. On the Name and Postal Code Formats screen enter a name for the address block and select one or more expected postal code formats.

  6. On the Input Fields screen select the field or fields containing the address text.

  7. On the Output Fields screen select the desired output fields and enter suitable field names for them.

  8. Press OK to accept the address block and close the wizard, then press OK to close the Address Blocks dialog.

Address block error codes

The address block parser recognizes several error conditions. These are accessible through the "Error Code" output field. Values of this field are integers having the following meanings:

Error Code Value    Meaning

0    No error.

1    No postal code found. The parser couldn’t find anything in the input text that it recognized as a postal code. This could indicate bad input data or perhaps data containing postal codes of types that haven’t been enabled.

2    Unexpected text after postal code. Occurs only for right-aligned postal codes (such as US ZIP codes) in the case where text (other than a comma) is found after the postal code but before the country. This generally indicates bad input data, or perhaps the parser misidentified the postal code.

3    The value that was parsed out into the Country field contains digits or commas. This generally means that the parser misidentified the country piece.

4    The value that was parsed out into the Region field contains digits or commas. This generally means that the parser misidentified the region piece.

5    The value that was parsed out into the City field contains embedded digits or commas. (Digits at the end of a city name are OK, e.g., "Bern 7"). This generally means that the parser misidentified the city piece.
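When the Error Code field is exported and processed outside Monarch, giving the values names can make downstream checks easier to read. The Python sketch below names the six values and re-creates the checks behind codes 3 to 5. The enum member names and the function are hypothetical, and the order in which the three checks are applied is an assumption, since the text does not say which error takes priority.

```python
import re
from enum import IntEnum


class AddressError(IntEnum):
    NONE = 0
    NO_POSTAL_CODE = 1
    TEXT_AFTER_POSTAL_CODE = 2
    BAD_COUNTRY = 3
    BAD_REGION = 4
    BAD_CITY = 5


DIGIT_OR_COMMA = re.compile(r"[0-9,]")
TRAILING_DIGITS = re.compile(r"\s*[0-9]+$")


def check_pieces(country, region, city):
    """Return the first of error codes 3, 4, 5 that applies, else NONE."""
    if DIGIT_OR_COMMA.search(country):
        return AddressError.BAD_COUNTRY
    if DIGIT_OR_COMMA.search(region):
        return AddressError.BAD_REGION
    # Digits at the end of a city name are allowed, so strip them first.
    if DIGIT_OR_COMMA.search(TRAILING_DIGITS.sub("", city)):
        return AddressError.BAD_CITY
    return AddressError.NONE
```

For instance, a city of "Bern 7" passes the check, while "Bern 7 West" is flagged because the digit is embedded rather than trailing.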