Activities on Existing Data

Data normalization, data formatting, and data standardization


Data de-duplication (duplicate removal): identifying and eliminating duplicate records from a dataset


Data validation, enrichment, verification, integration, consolidation, and cleansing


Exact Matching:

Exact matching compares each record in the dataset against all other records and identifies exact duplicates based on a set of predefined matching criteria. Once identified, the duplicates can be removed from the dataset.
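As a minimal sketch of exact matching, the Python below treats two records as duplicates when every field in a chosen key is identical. The field names (name, email), the sample records, and the helper drop_exact_duplicates are hypothetical and used only for illustration. A set of already-seen keys gives the same result as comparing every record against every other one, without the pairwise comparison.

```python
def drop_exact_duplicates(records, key_fields):
    """Keep the first record seen for each distinct combination of key_fields."""
    seen = set()
    unique = []
    for record in records:
        # The matching criteria: every key field must be exactly equal.
        key = tuple(record[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical sample data: the first two records are exact duplicates.
accounts = [
    {"name": "Acme Corp", "email": "info@acme.com"},
    {"name": "Acme Corp", "email": "info@acme.com"},
    {"name": "Globex", "email": "sales@globex.com"},
]

deduped = drop_exact_duplicates(accounts, ["name", "email"])
```

Here the matching criteria are simply the list of key fields; a stricter or looser notion of "exact" would change only how the key is built.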


Fuzzy Matching:

Data may contain slight variations or inconsistencies. Fuzzy matching algorithms compare records using similarity measures to decide whether two records are likely to represent the same entity, allowing for variations in spelling, formatting, or other minor differences. When two records are judged to match, one of them can be removed from the dataset.
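One possible similarity measure is the ratio computed by SequenceMatcher in Python's standard difflib module. The sketch below is an illustration under assumptions, not a recommended production setup: the 0.85 threshold, the sample names, and the helpers similarity and drop_fuzzy_duplicates are all hypothetical, and a real threshold would need tuning against the actual data.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a similarity score between 0.0 and 1.0, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def drop_fuzzy_duplicates(names, threshold=0.85):
    """Keep a name only if it is not too similar to a name that was already kept."""
    kept = []
    for name in names:
        if not any(similarity(name, existing) >= threshold for existing in kept):
            kept.append(name)
    return kept


# Hypothetical sample data: the second name is a misspelling of the first.
names = ["Acme Corporation", "Acme Corporaton", "Globex Inc"]
unique_names = drop_fuzzy_duplicates(names)
```

This version keeps whichever variant appears first. Choosing which record survives (for example the most complete or most recent one) is a separate decision the sketch does not address.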


Data Hashing:

Hashing techniques create a compact fingerprint, or hash, for each record based on its content. Records with the same hash value are considered potential duplicates and can be examined further or removed from the dataset. Hashing is often used as a preliminary step in data de-duplication to reduce the computational cost of the process, since grouping records by hash avoids comparing every pair of full records.
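A rough sketch of hashing as a preliminary step: each record is reduced to a SHA-256 digest of a few normalized fields, and records that share a digest are grouped as candidate duplicates for further review. The choice of SHA-256, the lowercase and strip normalization, the field names, and the helpers record_hash and group_by_hash are assumptions made for this example.

```python
import hashlib
from collections import defaultdict


def record_hash(record, fields):
    """Hash the normalized content of the chosen fields of one record."""
    normalized = "|".join(str(record[field]).strip().lower() for field in fields)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_by_hash(records, fields):
    """Return only the hash buckets that hold more than one record."""
    buckets = defaultdict(list)
    for record in records:
        buckets[record_hash(record, fields)].append(record)
    return {h: group for h, group in buckets.items() if len(group) > 1}


# Hypothetical sample data: the first two records differ only in case and spacing.
accounts = [
    {"name": "Acme Corp", "email": "info@acme.com"},
    {"name": " acme corp", "email": "INFO@ACME.COM"},
    {"name": "Globex", "email": "sales@globex.com"},
]

candidates = group_by_hash(accounts, ["name", "email"])
```

Each record is hashed once, so the grouping is a single pass over the data. Only records that normalize to the same content share a hash, which means misspellings are not caught at this stage.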


New Data Acquisition and Profiling

New data acquisition: obtaining and collecting fresh or previously uncollected data, including new account discovery.

Clustering and segmentation of accounts

