How do I view matches in MS Access?
A lot of our users are accomplished in the use of MS Access Queries and Forms. If you are a confident Access user you may prefer to look at the matches that matchIT identifies natively in Access.
The simplest way to do this is to use the Single File Wizard in matchIT. At the end of processing, one of the options for output is ‘Records in Matched Sets’; then choose ‘Matched Records (matching sets only)’. This provides a file with three columns: UniqueRef, MatchRef and SetDups. Import this file into your Access database, being careful to match the data types of UniqueRef and MatchRef to the data type of your URN.
You can now create an Access query which contains three tables: two instances of your data file and the newly imported Matched Records file. Create a relationship between the first instance of your data file and the Matched Records file where URN = UniqueRef. Then create a second relationship between the Matched Records file and the second instance of your data file where MatchRef = URN. You now have a query which will display records from both instances of the data file in matching groups. You can filter the results using the SetDups field in Matched Records (it indicates the number of duplicates in a group) and change the ordering and grouping as required. Try ordering by SetDups Descending and URN Ascending, as this will show you the larger groups of matches first.
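The three-table query described above can be sketched outside Access as well. The example below uses Python's built-in sqlite3 module purely for illustration; the table names `data` and `matched`, the `name` column and the sample rows are all invented, and only the column names URN, UniqueRef, MatchRef and SetDups come from the text above.

```python
import sqlite3

def matched_pairs(data_rows, matched_rows):
    """Join two instances of the data table through the Matched Records table."""
    con = sqlite3.connect(":memory:")
    # Hypothetical stand-ins for your data file and the imported Matched Records file
    con.execute("CREATE TABLE data (URN INTEGER PRIMARY KEY, name TEXT)")
    con.execute("CREATE TABLE matched (UniqueRef INTEGER, MatchRef INTEGER, SetDups INTEGER)")
    con.executemany("INSERT INTO data VALUES (?, ?)", data_rows)
    con.executemany("INSERT INTO matched VALUES (?, ?, ?)", matched_rows)
    rows = con.execute(
        """
        SELECT d1.URN, d1.name, d2.URN, d2.name, m.SetDups
        FROM data AS d1
        JOIN matched AS m ON d1.URN = m.UniqueRef   -- 1st relationship
        JOIN data AS d2 ON m.MatchRef = d2.URN      -- 2nd relationship
        ORDER BY m.SetDups DESC, d1.URN ASC, d2.URN ASC
        """
    ).fetchall()
    con.close()
    return rows

# Invented sample data: one pair and one group of three
data_rows = [(1, "J Smith"), (2, "John Smith"), (3, "A Jones"),
             (4, "Ann Jones"), (5, "Anne Jones")]
matched_rows = [(1, 2, 1), (3, 4, 2), (3, 5, 2)]

for row in matched_pairs(data_rows, matched_rows):
    print(row)
```

The extra `d2.URN` sort key is only there to make the order of rows within a group predictable; the larger group still appears first.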
How do I split view data into two partitions?
The option to View Data within your Main File is available as a hyperlink at the top of matchIT’s main window, or within the View menu. It allows you to browse and update the selected Main File in any order. If you want to split the view into two partitions, move the mouse to the vertical black bar in the extreme bottom left corner of the view and you will see the cursor change to a double-headed arrow.
Click and drag it to the right and you will open up a second partition, which is very useful for comparing fields at the beginning of the record with those nearer the end. The Table menu allows you to unlink the two partitions, otherwise as you move up or down, the record position in each partition is kept the same.
How can I customize matchIT reports?
The first thing to do is to create your own Bitmap or JPEG (BMP, JPG) file using a tool such as Paintbrush or Photoshop. If it is your company logo that you wish to use, you can often save an image from your web site by right-clicking on the image and selecting ‘Save Picture As’. The image should be sized so that the width is approximately 1.5 times the height for best resolution. The helpIT logo is a Bitmap sized 656 × 442 pixels, although a smaller image should work fine provided that the proportions are similar. Once your logo has been created it must be placed in the root folder of matchIT. In my case this is C:\Program Files\matchITv52.
To place your own logo on matchIT reports, simply go to the EDIT – OUTPUT LAYOUTS – REPORT BRANDING menu. Click on the control button which follows the Company Logo field and you will be able to navigate to your own Bitmap file. You should now find that matchIT reports incorporate your logo. You can also change the company name and contact details to those of your own company in the same menu option.
How do I speed up processing?
If you have finally got approval for a new machine you will probably want to know how to make matchIT perform at maximum levels. For intensive data processing environments we always recommend a dedicated machine for matchIT. Hardware is now relatively inexpensive compared to a person’s time. matchIT makes efficient use of RAM at anything up to 1GB and RAM is as cheap as chips these days. Specify the best processor you can afford; both user feedback and our own experience with the latest generation of Athlon 64 processors have been good, and these are cheaper than their Intel equivalents.
If your budget can stretch to a second hard drive you can improve performance by placing data such as Temporary Files on one drive whilst your Master file(s) is on another. This reduces the work that a single hard drive has to do, which improves performance. If you are working with very large data files it may be worth a little preparation before processing. Minimize field widths as much as possible and exclude superfluous fields; this reduces the size of the files and of the indexes used for matching, and hence speeds up processing.
How can I get matchIT to stop asking if I wish to quit the application?
Simply press Alt+F4 or Ctrl+Q. This also means that you don’t have to wait for it to close a very large table before it prompts you.
How can I match faster on larger data files?
The default third match key for matchIT is the Postcode. If you are matching to individual or family level, the purpose of this is to pick up matches where there is some discrepancy in the surname such that its phonetic key is not the same in the two records e.g. Wilson/Wislon, Walters/Waters, Morton/Horton, Smith-Robinson/Robinson, McDonald/Mc Donald. In these cases, the first name is usually the same, but sometimes names can be reversed e.g. Mary Smith/Smith Mary. You can allow for the first kind of problem much more quickly by adding the first letter of the first forename to the Postcode key i.e. use:
POSTCODE + LEFT(NAME2, 1), instead of just Postcode.
If you have first names consistently populated in matchIT, you can allow for the reversal of names by using a key of: POSTCODE + IIF(NAME1
What can the Quality Assurance Wizard help me with?
When you are checking the address output from a job, it is well worth using the Quality Assurance Wizard Output dialog to preview a one in N sample of the records that you will output – this is the second or third stage of the Quality Assurance Wizard, depending on whether you have Mailsorted the file. Even if you want to create an output file, you can choose Output to Label for the sample records and use matchIT’s predefined label format to preview formatted names and addresses on the screen – it is much easier in this view to see at a glance how the names and addresses look, compared with the normal Browse view.
By using this method, you might more easily pick up problems such as inconsistent casing, problems with the formatting of names, or Mailsort codes that differ from those you are expecting.
What are some keyboard shortcuts for Verify Matches?
Page Up = Next Pair
Page Down = Previous Pair
Ctrl + Left Cursor = Delete Left (press again to undelete)
Ctrl + Right Cursor = Delete Right (press again to undelete)
Ctrl + Page Up = Next Score
Ctrl + Page Down = Previous Score
Ctrl + Del = False Match
How can I save and restore my setup?
You may have wondered how matchIT stores all of the settings you make during a job setup. matchIT uses a set of files that can be recalled for later use, which makes repetitive jobs simple to repeat and lets you rerun the same data with slightly amended settings. The main configuration settings are stored in the files PARAMS.DBF, WEIGHTS.DBF and NAMEPARM.DBF. These are all stored in the root folder of matchIT.
There are also output layout files (*.OPL), output settings and match keys. The OPL files are ordinarily stored in the REPORT sub-folder of matchIT. The output settings are stored in the root folder of matchIT as OP_PARAMS.DBF. And the match keys are stored as IDX_PARMxx.DBF where xx is UK, US or OTH depending on the nationality setting.
You can save your own copies of PARAMS.DBF, WEIGHTS.DBF and NAMEPARM.DBF by using Save/Restore Setup and saving as a Custom Configuration. This allows you to set a file extension for the files you wish to save e.g. saving them with a file extension of JOB1 will create copies of the PARAMS.DBF, WEIGHTS.DBF and NAMEPARM.DBF and label them PARAMSUK.JOB1, WEIGHTS.JOB1 and NAMEPARM.JOB1. When required you can then use Restore Setup to a Custom Configuration to recall these files.
If you are going to start using Custom Configurations it is advisable to restore the default settings configuration for the nationality and match level that you want before restoring the custom configuration. Please don’t forget to restore the standard configuration for jobs for which your custom configuration is not required.
Can you explain the matchIT fields?
If you have ever looked at the raw table structure in matchIT after you have imported data you will have noticed that matchIT generates a host of fields which it in turn uses to process your data.
If you are using matchIT for more complex processing it can be advantageous to know the purpose of each of these fields particularly if you encounter unexpected behaviour. It is also useful for advanced quality assurance checks.
As an example the field ADD_KEY is a match key derived by matchIT to pick up matches where the postcode is not identified. By default, it is an 8 byte field, of which the first four characters represent the phonetic key of the town/city and the second four characters the street. There are over 50 derived fields in matchIT’s main table structure, each with their own specific purpose.
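To make the layout of such a derived key concrete, here is a rough sketch of an 8-character key built from a 4-character phonetic code for the town followed by a 4-character phonetic code for the street. matchIT's own phonetic algorithm is not described here, so classic Soundex is used as a stand-in; the function names and sample addresses are hypothetical.

```python
# Classic Soundex letter groups (a stand-in, not matchIT's own algorithm)
SOUNDEX_CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}

def soundex(word):
    """Return the 4-character Soundex code of word (4 spaces if it has no letters)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return "    "
    code = letters[0]
    prev = SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = SOUNDEX_CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        if c not in "HW":  # H and W do not separate letters with the same code
            prev = digit
    return (code + "000")[:4]

def add_key(town, street):
    """Illustrative 8-character key: phonetic town code + phonetic street code."""
    return soundex(town) + soundex(street)

print(add_key("Reading", "Station Road"))
print(add_key("Redding", "Staton Road"))
```

Both sample addresses produce the same key, so records like these would be compared with each other even if one of them had no postcode.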
How does matchIT V5 handle important default settings that may affect results?
In matchIT v5, the Setup Wizard remembers the match level from the last job that you ran i.e. Individual, Business, Family or Household. If however your new file does not contain the right information, then the match level will default to something that is available e.g. if the last job you ran was deduped at Individual level and this file contains company names but no contact names, the match level will be set to Business. This means that if you next load another file containing both contact and company names, the match level will still be set to Business when you get to the end of the Setup Wizard. You should always check the match level displayed at the end of the wizard and change it if necessary.
One more point to bear in mind is that matchIT v5 now remembers the input format the last time you imported a file and defaults to that next time. It also preserves other settings that are likely to be standard for your installation e.g. maximum records to compare. This does not mean that it remembers all the changes to settings that you make in the Options screen, as the majority are reset to the standard ones if you change the matching level when you next use the Setup Wizard. This is because many settings are appropriate only to particular files or match levels. If you want to customise particular options for general use, not just for a particular file, then you can do this in the Save/Restore Setup option from the Jobs/Setup menu – you must save your changed options for each match level and nationality of data that you might use.
How can I match on a company name as well as a contact name?
Not everyone’s data or requirements are the same. One of the strengths of matchIT is the ability to add and change keys, which enables you to customize your deduplication settings to suit each specific job. Many customers have contacted us in the past stating that even if contact names and addresses are the same, they do not want matchIT to regard it as a match unless the company name is the same too. Below is a set of instructions, which you can use to change your matching keys to enable you to match on both Company Name AND Contact Name.
Use the normal weights for Individual level matching and add the company name to each match key. To minimise the chance of missing matches, we suggest using two forms of the company name so resulting in twice as many match keys:
First, add COY_KEY to each match key – it doesn’t matter whether it is at the start or end of the key, but for efficiency it is better at the front of the key if the data is sorted by company name and at the end if it is sorted by Postcode/Zip.
Second, add PUN_TRIM(COMPANY,20) to each key. N.B. Prior to v5.03 you will have to add UPPER(PADR(CHRTRAN(COMPANY,["&*()+';:#/.,]+SPACE(1),),20) instead of this.
This method may cause you to miss some acronym matches and ‘contained in’ matches such as SmithKline Beecham and Beecham, so if you want to include this type of match as well, please contact us for advice.
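For readers who want to see what the second form of the company name looks like, the sketch below mimics the pre-v5.03 expression in Python: strip the listed punctuation and spaces, convert to upper case, and pad or cut to 20 characters. That PUN_TRIM behaves exactly like this is an assumption based on that expression, and the function name `company_key` is hypothetical.

```python
# Characters removed by the CHRTRAN call in the pre-v5.03 expression, plus the space
PUNCTUATION = "\"&*()+';:#/., "

def company_key(company, width=20):
    """Strip punctuation and spaces, upper-case, then pad or cut to a fixed width."""
    stripped = company.translate({ord(ch): None for ch in PUNCTUATION})
    return stripped.upper().ljust(width)[:width]

print(repr(company_key("Smith & Sons Ltd.")))
print(repr(company_key("SMITH&SONS LTD")))
```

Both calls give the same 20-character value, so differences in punctuation and spacing alone would not stop two company names from sharing a match key.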
How do I use the Helpful Output Filters?
Filters can be used through the Output To File option in the Output Menu. Once there, you need to select the Use Filter option in Filtering and Ordering and click on the edit button. You will then be allowed to build some filters, which will allow you to further customise the output you want.
Some simple examples of filters can be found below:
SEX = 'F' : outputs only the records in the file recognised as being Female.
In a multi-file job, to extract only the records from a specific file, you could use:
LISTC = 'FILE1' : LISTC is a generated field found in each master file created using the Multiple File Wizard. It is used to determine which of the component files each Master file record came from (FILE1 is just an example; you would use the name of the relevant file).
To help create more customised filters, please contact our Support Team.
How do I set deletion priorities and dedupe a file with a hierarchical source code?
When you run Flag Matches on a file after running Find Matches, matchIT uses the deletion priorities table to decide which record(s) from a pair or set should be deleted. matchIT looks at the deletion priorities table and deletes the record(s) with the lowest deletion priorities.
The Deletion Priorities can be accessed via Job/Setup>Matching Setup>Deletion Priorities. When you select this option, you browse PRIORITY.DBF and can view or change the various deletion rules. The fields you see when browsing the Deletion Priorities table are Field_Name, Field_Val, Field_Pri and Comment. The records in this table are in descending value of priority – in other words, rules that most increase the likelihood of a record being kept are at the top, with rules that most reduce it being kept at the bottom.
You can find more information on this subject by visiting the following link: Deletion Priorities
Setting deletion priorities allows the user to choose which records to keep or delete depending on the content of each record and which fields the User deems more important.
One useful example is when a User is attempting to dedupe a single file with a hierarchical source code. Adding a Source Code field (in this example scode) to the deletion priorities, setting the field values and assigning each value a different priority ensures that duplicate records with a higher priority source code are kept over dupes with a lower priority source code. Please see the example below.
FIELD_NAME  FIELD_VAL  FIELD_PRI  COMMENT
scode       H          3000       H=High / scode=Source Code (can be user defined field name and value)
scode       M          2000       M=Medium
scode       L          1000       L=Low
In this case, records with an “H” in the “scode” field would be kept as they have a higher priority. The field names and values are all relative to each User’s situation.
Note that the more important the field in determining the deletion priority, the higher you should set the Field_Pri value. It is the sum over all fields which determines the priority; thus if scode ‘H’ shown above is to have priority over everything else, its FIELD_PRI value should be significantly higher than the summed value of all the other fields in the priority list.
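The idea of summing priorities can be sketched as follows. This is only an illustration of the rule described above, not matchIT's implementation: the record layout and function names are invented, and how matchIT breaks ties between records with equal totals is not covered here (this sketch simply keeps the first).

```python
# The example rules from the table above: (field name, field value, priority)
PRIORITIES = [
    ("scode", "H", 3000),
    ("scode", "M", 2000),
    ("scode", "L", 1000),
]

def record_priority(record, priorities=PRIORITIES):
    """Sum the priority of every rule whose field value matches the record."""
    return sum(pri for field, value, pri in priorities
               if record.get(field) == value)

def choose_keeper(matched_set, priorities=PRIORITIES):
    """Return (record to keep, records to flag for deletion) for one matched set."""
    keeper = max(matched_set, key=lambda r: record_priority(r, priorities))
    to_delete = [r for r in matched_set if r is not keeper]
    return keeper, to_delete

# Invented matched set of three duplicates
dupes = [
    {"urn": 101, "scode": "L"},
    {"urn": 102, "scode": "H"},
    {"urn": 103, "scode": "M"},
]
keeper, to_delete = choose_keeper(dupes)
print(keeper["urn"], [r["urn"] for r in to_delete])
```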
How do I remove carriage returns from an Access file?
WARNING: Always work on a COPY of your source data and NOT on the original data when carrying out this type of modification.
Create a function in Access as follows:
Function MyReplace(Orig As String, Find As String, Repl As String) As String
    ' Returns Orig with every occurrence of Find replaced by Repl.
    Dim k As Long, L1 As Long, M As Long
    MyReplace = ""
    L1 = Len(Find) - 1
    M = 1                          ' start of the text not yet copied
    k = InStr(Orig, Find)
    Do While k > 0
        ' Copy the text before the match, then the replacement
        MyReplace = MyReplace & Mid(Orig, M, k - 1) & Repl
        M = M + k + L1             ' skip past the match
        k = InStr(Mid(Orig, M), Find)
    Loop
    ' Append whatever follows the last match
    MyReplace = MyReplace & Mid(Orig, M)
End Function
Now you can call the function for any field which needs updating via an update query as follows:
UPDATE [yourtable] SET [yourfield] = MyReplace([yourfield], Chr(13) & Chr(10), "")
What should I know when importing a file?
If a file is in the same input format and field layout as a file that you have previously loaded into matchIT, you don’t have to go through the Setup Wizard again. The field layout is stored in the Main File that you used previously and the Input Options specify the Input Format and what processing you want to do on Import.
To import a file into matchIT without using the Setup Wizard, first open the Main File that you used previously, or create a copy of this file’s layout by using the Copy Main File Layout option from the Tools menu. Next, select Import Records from the Import menu. matchIT then prompts you to Restore Standard Parameters or change the Input Options if necessary (via the Change Basic Parameters button) e.g. if the last file you Imported was Comma Delimited but this file is Fixed Width, or if you want to postcode or proper case this file and the last file wasn’t.
If you want to dedupe the file at a different matching level from the last file you deduped e.g. Individual last time but Family level this time, select Restore Standard Parameters and select the appropriate matching level – do this before you change options via the Change Basic Parameters button. This restores not only the matching level (in Matching Options) but also the matching weights (and the default match keys). Finally, select the input file (the source file) that you want to load into matchIT.
If there are records in the Main File from a previous run, matchIT wipes the file clean before it loads in the new data.
How do I tune matching on company name?
If you find that you are missing matches on similar company names getting scores that are too low, you can usually improve the matching by adding words and phrases to the Names table (via the Jobs/Setup menu, Names and Words option). For example, to enable Mytchett Newsagent to match Mytchett Newsagency as a sure match on company name, add “Newsagency” as an entry with type Business, Matching Equivalent of “News” – you will see that Newsagent and Newsagents already have a matching equivalent of News.
If you want to match Mytchett News and Mytchett Post Office, add “Post Office” as a Double Word entry with type Business, Matching Equivalent of “Post Office” – for this, you have to select the appropriate option from the Look at dropdown at the bottom of the screen. We don’t recommend that you set Post Office as equivalent to News or to PO, so after adding the entry for Post Office to the Names table, you will have to use Loose Business Matching to match Mytchett News and Mytchett Post Office. You can also tune company name matching via the Name Matching Matrix option in the Jobs/Setup menu.
How do I back-up job scripts?
Sometimes we get called to help clients recover customised scripts when a PC or hard disk has failed, or a system has been rebuilt. It is always best practice to back up all the files that make up these scripts, so that they can be easily restored in any of these events. In particular, you should back up:
Job.* files in the matchIT Database folder i.e. all files beginning with “job” whatever the file extension
Any files with a file extension of PRG or FXP that are used in the scripts – these may be in the matchIT Progs folder or in a folder specific to your job.
The structures of the Main Files (DBF files) that you are using in the job – you can create an empty structure that contains no data (and so takes up minimal disk space) by using the Tools menu, Main File Layout utilities, Copy Main File layout. If you don’t do this, you should back up not just the Main File DBF but also the CDX files.
What is the minimum screen resolution for matchIT?
With the increasing need to display additional information on the screen during setup and processing, matchIT v5.13 onwards requires a minimum screen resolution of 1024×768. Web statistics show that only 5% of people are now using 800×600, so we do not anticipate any difficulties in a commercial environment.
How do I avoid foreign characters in matching keys?
If you define your own match keys, you will have noticed the Key Range within the match key setup screen. If you wish to prevent records whose key fields start with special characters from being included in match keys, you must define a Start and End value. In most cases (by default) the Start key will be either 0 (zero) or A (capital A) and the End key will be zzzzzzzzzzzzzzzzzzzzzzzz. Most special characters can be avoided by using A as the Start key as opposed to 0 (zero).
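The effect of the Start and End values can be illustrated with a simple range check. The sketch assumes that keys are compared character by character in ASCII/code point order and that both ends of the range are inclusive; matchIT's actual collation may differ. The function name and sample keys are invented.

```python
END_KEY = "zzzzzzzzzzzzzzzzzzzzzzzz"

def in_key_range(key, start="A", end=END_KEY):
    """True if key sorts between start and end (inclusive, code point order)."""
    return start <= key <= end

sample_keys = ["#1 CARS", "42 HIGH ST", "ACME LTD", "ZENITH", "ÉCOLE"]

for key in sample_keys:
    print(key, in_key_range(key, start="A"), in_key_range(key, start="0"))
```

Under this assumption a Start key of A leaves out keys beginning with punctuation and also those beginning with a digit, while a Start key of 0 keeps the digit-led keys and still leaves out most punctuation. Keys beginning with an accented letter sort after the End key in either case.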
What are match keys?
If you were to compare every element in every single record in a database the process would take a very long time especially when comparing many thousands or millions of records.
A Match Key is a common element shared by a large group of records. Match Keys are beneficial in the search for matches because they eliminate records which clearly aren’t duplicates. For example if we use the surname field as a match key then two records with different surnames will not be matched. Clearly one match key is not sufficient to identify all similar records.
In the case of matchIT, match keys are used to identify candidates for far more intensive matching. matchIT also uses phonetic match keys to identify records where elements simply sound the same such as ‘Dayton’ and ‘Deighton’. Once potential matches are identified using match keys matchIT can go to work on comparing every element within potentially matching records to derive a matching score. The higher the score the more likely the records are to match.
matchIT’s match keys are totally user definable however by default we use three match keys designed to pick up the widest range of possible matches whilst maintaining performance. We can of course advise customers on keys that may better suit specific data anomalies or in particular very large datasets. Please refer to the matchIT user manual for more detailed information.
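The way match keys cut down the number of comparisons can be sketched in a few lines. This is a generic illustration of the idea, not matchIT's implementation: the two keys used (upper-cased surname, and postcode plus first initial) and the sample records are chosen for the example, and the detailed scoring step is left out.

```python
from collections import defaultdict
from itertools import combinations

def candidate_pairs(records, key_funcs):
    """Return index pairs of records that share at least one match key."""
    pairs = set()
    for key_func in key_funcs:
        blocks = defaultdict(list)
        for idx, rec in enumerate(records):
            key = key_func(rec)
            if key:  # records with an empty key are not grouped
                blocks[key].append(idx)
        for members in blocks.values():
            pairs.update(combinations(members, 2))
    return pairs

# Invented sample records
records = [
    {"first": "Mary", "last": "Wilson", "postcode": "AB1 2CD"},
    {"first": "Mary", "last": "Wislon", "postcode": "AB1 2CD"},
    {"first": "Peter", "last": "Wilson", "postcode": "ZZ9 9ZZ"},
    {"first": "Ann", "last": "Jones", "postcode": "AB1 2CD"},
]

# Illustrative keys only
key_funcs = [
    lambda r: r["last"].upper(),
    lambda r: r["postcode"] + r["first"][:1].upper(),
]

print(sorted(candidate_pairs(records, key_funcs)))
```

Of the six possible pairs among four records, only two are passed on for detailed comparison, and the misspelt surname is still picked up by the second key.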
How can I match quickly - even with large files?
For large files, matching is much faster if the data is physically sorted on the match key e.g. with matchIT’s default UK match keys, sort it by postcode, as two of the three keys start with all or part of the postcode. For US default match keys, sort it by last name if possible, as two of the three keys start with phonetic last name. This can make a tenfold difference or more in processing time on very large files.
To sort a database already imported into matchIT, set the Use index order? option on in Output to File, using an output format of DBF or VFP and choosing the appropriate index expression to sort the database by.
Use Operational Options to import a 1 in N sample of a large file, so you can check your setup and processing before you process the whole file.
Use Operational Options to enable you to browse a main file after loading data into it, to make sure that everything is as it should be before matchIT generates keys.
Use File Locations Options to specify standard sub-folder names for jobs.
Use the Q/A Wizard to delete (flag) all records that exhibit certain characteristics e.g. postcoding failures, records with no address lines.
Why did my PC crash on import?
Commonly, if your PC crashes while importing or finding matches, or if you have to end task on matchIT, this leaves Perform.dbf corrupt – which gives rise to an error when you next start matchIT.
The easiest fix is to unzip Perform.dbf/cdx from Initconfig.zip into the Database folder; however, that loses the log of previous runs, so it will not show the matching summary.
The best fix is to start Database Utilities from the matchIT program group, select Database menu, Fix Header, then the Perform.DBF table. Then select Database menu, Reindex and select the same DBF file.
What can I change to find more matches?
If you wish to perform a very precise match and identify the maximum number of duplicates then you may find more matches at lower scores. The default Minimum Score to Report is 80 however there are often very good phonetic matches at lower scores amongst likely false matches. Go to the Jobs/Setup menu — Options — Matching tab and lower the Minimum Score to Report to 75 for instance. Now when you run matching you will find the scores start at 75. You can disregard the additional matches using “Individual Score Breakdown” accessed via the Move Matches to Different Score Band dialog, which appears when you select Flag Matches.
How do I extract domains from website addresses?
To use this feature, you need to create a job script using the program extract domain.prg (new in matchITv528). Then simply map the program in the job script along with a main file containing a URL field on the same line of the script. Now edit the program file and specify the name of the field containing the URLs. Run the script, browse the main file and by default, a new “domain” field should have been generated. This will contain the domain part of any URL from the specified URL field. If a domain could not be recognised (e.g. if the URL contained “ww.helpit.com”) then the new domain field will be empty.
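As a rough idea of what domain extraction involves, here is a small Python sketch. It is not the logic of the matchIT program, which is not shown here. It assumes a URL is recognised only if it has a scheme such as http:// or starts with "www.", and that a leading "www." is dropped from the result; both rules and the function name are assumptions made for the example.

```python
from urllib.parse import urlsplit

def extract_domain(url):
    """Return the domain part of url, or an empty string if none is recognised."""
    text = url.strip().lower()
    if "://" in text:
        host = urlsplit(text).hostname or ""
    elif text.startswith("www."):
        # No scheme given: prefix // so that urlsplit treats the text as a host
        host = urlsplit("//" + text).hostname or ""
    else:
        return ""  # assumption: anything else is treated as unrecognised
    if host.startswith("www."):
        host = host[4:]
    return host if "." in host else ""

for url in ["http://www.helpit.com/products", "www.helpit.com", "ww.helpit.com"]:
    print(repr(url), "->", repr(extract_domain(url)))
```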
matchIT is telling me that I need to ensure that it has write access on the matchIT directory; how do I do this?
1. Right click on the directory in Windows Explorer and choose the Security tab from the Properties window.
2. Click Edit to change permissions. Add read/write access on the matchIT directory (e.g. C:\Program Files\matchITv52 by default) to the user account that will be running matchIT suite.
If you do not have access to change these settings, then you will need to consult your Systems Administrator.