A lightning-fast, cross-platform, in-memory matching solution that deduplicates large contact databases and matches across files of any type and size. It is the engine that powers matchIT Data Quality Solutions, for both batch and interactive use.
matchIT Hub is built around helpIT’s proven fuzzy matching engine, which has delivered a rapid return on investment for over 2,000 companies worldwide.
- Easy connectivity allows more companies to benefit quickly from effective matching across data sources – integration is extremely quick
- Superior performance delivers real-time matching on streams of data matching to large files, removing the reliance on nightly updates to achieve a Single Customer View
- Intelligent matching engine maximizes the proportion of matches that can safely be automated, unlike conventional match key solutions
- Enables the calling application to ensure that only good quality contact data is allowed to enter the database
matchIT Hub gives your existing systems and packages greatly improved control over your data quality.
matchIT Hub provides intelligent parsing and fuzzy matching of contact data, both business and personal, graded to allow automation for any type of requirement.
- Matches using any data held about the contact, including name, address, telephone, email, date of birth, web site and custom fields
- Match at multiple levels simultaneously (individual, family, household, company, custom) saving time and effort
- Implement custom matching rules with minimum effort at any stage of the process
- Name splitting, genderization and salutation generation to enable accurate personalization
- Screening for meaningless, incomplete or poor quality data to prevent it entering your database
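As a rough illustration of what graded fuzzy matching means in contrast to an exact match key, the following minimal sketch scores a pair of contact records and assigns a grade. It is not helpIT's algorithm: the use of Python's standard `difflib`, the function names, the grade labels and the thresholds are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(value.lower().split())


def similarity(a, b):
    """Return a 0..1 similarity ratio between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def grade(record_a, record_b, fields=("name", "address")):
    """Grade a candidate pair. Labels and thresholds are illustrative only."""
    score = sum(similarity(record_a[f], record_b[f]) for f in fields) / len(fields)
    if score >= 0.9:
        return "sure"      # hypothetical grade: safe to merge automatically
    if score >= 0.75:
        return "likely"    # hypothetical grade: route to manual review
    return "no match"


first = {"name": "Jon Smith", "address": "12 High Street, Leeds"}
second = {"name": "John  Smith", "address": "12 high street, Leeds"}

# An exact match key on the name treats these as different people...
exact_key_match = normalize(first["name"]) == normalize(second["name"])
# ...while the graded comparison tolerates the spelling variation.
fuzzy_grade = grade(first, second)
```

Because each pair receives a grade rather than a yes/no answer, the calling process can automate the confident matches and hold back the borderline ones.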
matchIT Hub can match within a single file or across files, and supports streaming of individual records for real-time matching.
Due to its in-memory architecture, matchIT Hub is many times faster than any other specialist contact data matching solution. It scales automatically across multiple processors – efficiently processing very high volume data. Performance depends principally on hardware and match rate (duplication or overlap rate), but examples include:
- Finds overlap of 100,000 records against 50 million preloaded records in 11 seconds (uses 13GB RAM, 20% match rate)*
- Matches 1 million records in 12 seconds (uses 500MB RAM, 11% match rate)*
- Matches 50 million records in 52 minutes (uses 15GB RAM, 12% match rate)*
*All figures are for a 10-core, hyper-threaded Windows PC with 64GB RAM.
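For comparison with your own volumes, the quoted figures can be converted into input records processed per second. This is simple division on the numbers listed above; the workloads differ (overlap against preloaded data versus deduplication), so the resulting rates are indicative only and are not directly comparable with each other.

```python
# (description, input records, elapsed seconds) taken from the examples above
benchmarks = [
    ("100,000 records against 50 million preloaded", 100_000, 11),
    ("1 million records", 1_000_000, 12),
    ("50 million records", 50_000_000, 52 * 60),
]


def records_per_second(records, seconds):
    """Average input records processed per elapsed second."""
    return records / seconds


for label, records, seconds in benchmarks:
    rate = records_per_second(records, seconds)
    print(f"{label}: about {rate:,.0f} records per second")
```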
Typically, users leverage the power of matchIT Hub via one of helpIT systems’ packaged Data Quality software solutions, but matchIT Hub has been designed to be easy to integrate into existing systems where this is required: the calling code simply applies a configuration, passes across all the data from the source(s), and reads the results. Code samples for all major programming languages are provided.
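The three-step calling pattern described above (apply a configuration, pass across the data, read the results) might look something like the sketch below. `ToyMatcher` and its methods are hypothetical stand-ins written for this illustration and are not the matchIT Hub API; the supplied code samples show the real interface. The naive pairwise comparison is there only to keep the example short.

```python
from difflib import SequenceMatcher
from itertools import combinations


class ToyMatcher:
    """Hypothetical in-memory matcher; NOT the matchIT Hub API."""

    def __init__(self):
        self.fields = ()
        self.threshold = 1.0
        self.records = []

    def configure(self, fields, threshold):
        """Step 1: apply a configuration."""
        self.fields = tuple(fields)
        self.threshold = threshold

    def add(self, record_id, record):
        """Step 2: pass across a record from any source."""
        self.records.append((record_id, record))

    def results(self):
        """Step 3: read back the matching pairs with their scores."""
        pairs = []
        for (id_a, a), (id_b, b) in combinations(self.records, 2):
            score = sum(
                SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
                for f in self.fields
            ) / len(self.fields)
            if score >= self.threshold:
                pairs.append((id_a, id_b, round(score, 2)))
        return pairs


matcher = ToyMatcher()
matcher.configure(fields=["name", "email"], threshold=0.85)
matcher.add("crm:1", {"name": "Jon Smith", "email": "jon.smith@example.com"})
matcher.add("web:7", {"name": "John Smith", "email": "jon.smith@example.com"})
matcher.add("web:8", {"name": "Mary Jones", "email": "mary@example.org"})
matches = matcher.results()
```

Record identifiers here carry a source prefix (`crm:`, `web:`) purely to show that records from several sources can be loaded into the same matching run.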
The supplied demo program allows you to immediately view matching results as part of the evaluation process, before starting integration work. To widen the scope, matchIT Hub is available integrated into Talend Open Studio, Magic XPI Integration Platform and Pentaho Data Integration for connection with a wide variety of data sources:
- Databases including Oracle, SQL Server, Teradata, MySQL, DB2, Ingres, PostgreSQL, Access
- NoSQL databases including Hadoop/HBase, Cassandra, MongoDB, CouchDB
- Other file formats including CSV, fixed width text, Excel, XML and JSON
- Software packages and data platforms including SAP, SalesForce, Hortonworks, SAS, Greenplum
matchIT Hub can be deployed to Windows, Linux and Unix platforms.
matchIT Hub can also be integrated into other Data Integration tools as required.
Why choose matchIT Hub?
Lightning speed due to processing entirely in-memory
Scales automatically across multiple processors
Maximizes the number of matches that can be automated for batch processing
What our customers say
“In our state of the art data compilation center we have integrated several of matchIT products to process hundreds of millions of records to create our Industry leading databases. It is important to be able to get to market our constantly changing data before our competitors and helpIT’s products meets the needs of our fast paced environment.”
VP of Client Services.