Category Archives: Data Management and Analytics

Topics related to data management, Big Data processing, and analytics

Data Preparation – Normalization Subsystem – Clustering using Tokens

Continuing the subject of clustering text to facilitate normalizing a data set, in this blog post I will examine clustering using tokens. Token-based clustering uses tokens to evaluate the similarity between two strings and determine membership in a cluster. The …
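
As a rough illustration of the idea in the excerpt above, token-based similarity can be sketched with a Jaccard score over word tokens. The excerpt is truncated, so this is not necessarily the method the full post describes; the function names, the sample values, and the 0.5 threshold are all assumptions made for this example.

```python
import re


def tokenize(text):
    """Lowercase the string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of the token sets of two strings (0.0 to 1.0)."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta and not tb:
        return 1.0  # two empty token sets are treated as identical
    return len(ta & tb) / len(ta | tb)


def cluster(values, threshold=0.5):
    """Greedy single-pass clustering (an assumed, simplistic strategy).

    Each value joins the first cluster whose first member is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []
    for value in values:
        for members in clusters:
            if jaccard(value, members[0]) >= threshold:
                members.append(value)
                break
        else:
            clusters.append([value])
    return clusters


# Hypothetical sample data: word order, case, and punctuation differ.
names = ["Acme Corp", "Corp Acme", "ACME corp.", "Globex Inc", "Inc. Globex"]
print(cluster(names))
```

Because the comparison works on sets of tokens, values that differ only in word order, case, or punctuation end up in the same cluster, which is what makes this kind of grouping useful as a normalization step.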

Posted in Data Management and Analytics

Data preparation – Normalization subsystem – Clustering Text using Fingerprinting

In this blog post I will examine the normalization sub-system, one of the sub-systems I called out in my earlier blog post, Data Preparation Sub-Systems. A key objective of this step is to ensure data consistency. For example, when working …

Ensuring data consistency between cloud and on-premises

Enterprises today have greater flexibility in determining whether investments in applications, platforms, and infrastructure should be capital expenditure, operational expenditure, or both. As such, enterprises are increasingly using a mix of public cloud, private cloud, and on-premises strategy …

Data Preparation for Batch and Real-time data

In this blog post I will discuss the role of data preparation when working with batch or real-time data sets. Irrespective of whether the analysis of data happens in real time or in batch, some aspects of data preparation …

Data Preparation Platform for Big Data

Before discussing the data preparation platform for Big Data, let's look at some of the requirements. Since there is no a priori knowledge of the data content, the data preparation process is a highly interactive and visual one. Getting a …

Data preparation – Cloud or not to cloud

From both a consumer and a producer perspective, the decision to go cloud or not is an important and sometimes  one. The following are some points to consider when making the decision. Data locality – Where is the majority of …

Data Preparation – Make or Buy Decision

The make-or-buy decision is always on the minds of executives in any product or services area, especially when working on cutting-edge products. The key question I always ask is: what is the core competency of the organization – …
