
S2.7.2 - SSIS Tutorial - Output Tables, Filtering and Quality Scoring


1) Open an existing key generation task. (The steps below assume your table has a telephone field.)

2) Go to the Outputs tab and click to enable outputs.

The output table is created at the same time as the keys are generated.

3) Click Advanced, then go to the Advanced tab.

Where it says quality scoring, enable it. We suggest leaving the rest of the settings alone.

The other setting people commonly change is ‘consider casing’, but only change it if your data is always in CAPS; otherwise we might falsely assume something is an acronym when it isn’t.

Set abbreviate state to true.

4) Go back to the Filtering tab. As an example, set a filter on email only, then delete it.

Here you can control which records are used for key generation. For example, you could generate keys only for records whose telephone field is not null (note that this means a NULL value, not just a blank field). Or you could use a deleteflag field, as in the overlap demo package, so that keys are generated only for records that are not populated with a 1 to indicate that they are duplicates.

If you need to apply a more complicated filter, it may be better to select that subset into a separate table, or to use a view instead. We advise doing that rather than attempting something more complicated in the Filtering tab.
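The difference between a NULL value and a blank field, and the idea of moving a more complicated filter into a view, can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database so that it runs anywhere, whereas matchIT SQL works against SQL Server, and the table, column and view names (contacts, telephone, deleteflag, contacts_for_keys) are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used purely for illustration.
# The table and column names are hypothetical, not taken from matchIT SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (id INTEGER, telephone TEXT, deleteflag INTEGER)"
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        (1, "01234 567890", 0),  # populated telephone
        (2, None, 0),            # NULL telephone
        (3, "", 0),              # blank (empty string) telephone, which is not NULL
        (4, "09876 543210", 1),  # flagged with a 1 as a duplicate
    ],
)

# A plain "is not null" filter still keeps the blank record (id 3).
not_null_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM contacts WHERE telephone IS NOT NULL ORDER BY id"
    )
]

# A more complicated filter expressed as a view: a telephone that is neither
# NULL nor blank, and a record that is not flagged as a duplicate.
conn.execute(
    """
    CREATE VIEW contacts_for_keys AS
    SELECT id, telephone
    FROM contacts
    WHERE telephone IS NOT NULL
      AND TRIM(telephone) <> ''
      AND COALESCE(deleteflag, 0) <> 1
    """
)
view_ids = [
    row[0] for row in conn.execute("SELECT id FROM contacts_for_keys ORDER BY id")
]

print(not_null_ids)  # [1, 3, 4]
print(view_ids)      # [1]
```

In a setup like this, the key generation task would be pointed at the view rather than at the base table, so the filter logic lives in one place in the database.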

5) Save the task and run it.

Here are some things an output table can show you:

  • a name parsed out
  • a company name that was all caps, converted to proper case
  • extracted address elements such as premises and thoroughfares
  • a postcode that was extracted

The best way to correct and parse an address is still through our address validation module, addressIT.

Other Items:

  • show an email parsed
  • show emails with varying quality scores

Here's a key to the Email Quality Scores:
0 = empty, nonsense
1 = invalid format
2 = invalid top-level domain (com, org, uk, fr etc.)
5 = generic username (sales, support, postmaster etc.)
6 = username doesn’t match the firstname & lastname from the Input fields
7 = webmail domain (e.g. Hotmail.com, mail.com) if WebmailFiltering is enabled
9 = none of the above apply
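If you read the scores back from the output table in your own code, a small lookup can make them easier to interpret. The following is a minimal sketch, assuming you already have the scores as integers; the names EMAIL_QUALITY_SCORES, describe_email_score and summarise, and the sample scores, are hypothetical and not part of matchIT SQL. Scores that do not appear in the key above are reported as unknown rather than guessed.

```python
from collections import Counter

# Descriptions shortened from the key above; scores missing from the key
# are deliberately not given a meaning here.
EMAIL_QUALITY_SCORES = {
    0: "empty, nonsense",
    1: "invalid format",
    2: "invalid top-level domain",
    5: "generic username",
    6: "username doesn't match firstname & lastname",
    7: "webmail domain (if WebmailFiltering is enabled)",
    9: "none of the above apply",
}


def describe_email_score(score):
    """Return the description for a score, or a fallback if it is not in the key."""
    return EMAIL_QUALITY_SCORES.get(score, "unknown score")


def summarise(scores):
    """Count how many records fall under each score description."""
    counts = Counter(scores)
    return {describe_email_score(s): n for s, n in sorted(counts.items())}


# Made-up scores for illustration only.
sample_scores = [9, 5, 9, 1, 7]
summary = summarise(sample_scores)
for description, count in summary.items():
    print(f"{count} x {description}")
```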

 
