Data preparation – To cloud or not to cloud

From both a consumer and a producer perspective, the decision to go cloud or not is an important one. The following are some points to consider when making the decision.

  • Data locality – Where is the majority of the data that needs to be prepared being collected? High-speed networks, convenient, reliable and durable storage in the cloud, and strategies for copying data across the WAN are minimizing the effects of data locality. I came across a product called WANDISCO, which enables bidirectional replication for HDFS and HBase between two data centers. If anyone reading this blog is using it, send me your experience and use case!
  • Real-time decisions – If decisions need to be made in real time, then the preparation needs to happen close to where the data is gathered.
  • Complementary Services – Typically, data preparation is part of a value chain that sits between sources of data and consumers of processed data such as BI systems, discovery systems, graph DBs, analytic applications etc. The source data systems could include applications (CRM, HCM, SCM etc.), application logs etc. You need to evaluate the optimal location for data preparation based on the locality of the upstream and downstream applications in the value chain.
  • Security – Cloud services provide good support for security: encryption, access control and data isolation. However, on-premises businesses that are primarily focused on security and strict data governance will need to run through their security checklist to decide whether moving data to the cloud for data preparation and other downstream services is the right approach.
  • Business Reasons – Strategic decisions may demand moving to the cloud or staying on-premises. Global businesses may find moving to the cloud a strategic investment in the long run. Small and medium businesses may find the cloud an economical alternative and a faster-to-market strategy.
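
To make the data locality and real-time points a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It estimates how long it would take to copy a batch of data across the WAN and compares that with how quickly a decision is needed. The function names, the 0.8 link-efficiency factor and the sample numbers are my own assumptions for illustration, not measurements from any real network or product.

```python
def transfer_seconds(dataset_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Rough time to move dataset_gb gigabytes over a link of link_mbps megabits/s.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable (protocol overhead, contention); 0.8 is only a guess.
    """
    if dataset_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("expected dataset_gb >= 0, link_mbps > 0, 0 < efficiency <= 1")
    bits = dataset_gb * 8 * 1000**3                  # decimal gigabytes to bits
    usable_bits_per_second = link_mbps * 1000**2 * efficiency
    return bits / usable_bits_per_second


def prepare_near_source(dataset_gb: float, link_mbps: float,
                        decision_budget_seconds: float,
                        efficiency: float = 0.8) -> bool:
    """True when the copy alone would already exceed the decision time budget."""
    return transfer_seconds(dataset_gb, link_mbps, efficiency) > decision_budget_seconds


if __name__ == "__main__":
    # Hypothetical scenario: 100 GB batch, 1 Gbps link, decision needed within 60 s.
    seconds = transfer_seconds(100, 1000)
    print(f"Estimated transfer time: {seconds:.0f} s")
    print("Prepare near the source:", prepare_near_source(100, 1000, 60))
```

This only looks at transfer time. Security, complementary services and business reasons from the list above do not reduce to a formula this simple and still need to be weighed separately.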

Where would you do your data preparation, and why?


About atiru

Product Strategist and architect for harnessing value from data.
This entry was posted in Data Management and Analytics.
