By Srinivasan Sundara Rajan | May 8, 2013 09:00 AM EDT

Master Data Management (MDM) is an important aspect of data governance in enterprises. MDM enables the development of a "Single Version of Truth" by providing common descriptions for enterprise-wide entities.
Need for MDM in Big Data Processing
Before Big Data, enterprises generally managed their transaction data in traditional relational databases. One of the biggest strengths of relational databases is their ability to enforce constraints such as check constraints, primary keys, and foreign keys, which help ensure the quality of the data captured.
In spite of such support for data integrity, enterprises had duplicates in their master data that led to inaccurate analytics on that data. For example, an enterprise may target an expensive advertising campaign for a new product at its existing customers; however, because a particular customer may exist under different IDs across multiple systems, the enterprise may send its campaign materials to the same person multiple times.
Similarly, a manufacturing enterprise may be analyzing the problem and complaint records from its customers, but a lack of uniformity in product codes across regions and in problem types may result in inaccurate quantification of the issues.
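The duplicate-customer problem described above can be illustrated with a minimal sketch. The record layout, the IDs, and the matching rule (normalized name plus normalized email) are assumptions made for illustration only; real MDM tools typically use much richer matching logic.

```python
from collections import defaultdict


def normalize(record):
    """Build a crude matching key from name and email (an assumed rule)."""
    name = " ".join(record["name"].lower().split())
    email = record["email"].strip().lower()
    return (name, email)


def group_duplicates(records):
    """Return matching keys that map to more than one source-system ID."""
    groups = defaultdict(list)
    for record in records:
        groups[normalize(record)].append(record["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical customer records coming from two different systems.
customers = [
    {"id": "CRM-001", "name": "Jane  Doe", "email": "jane.doe@example.com"},
    {"id": "WEB-77", "name": "jane doe", "email": "Jane.Doe@example.com "},
    {"id": "CRM-002", "name": "John Roe", "email": "john.roe@example.com"},
]

duplicates = group_duplicates(customers)
print(duplicates)
```

Here the two "Jane Doe" records collapse onto one key, so a campaign driven by the grouped keys rather than by the raw IDs would contact her once.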
Enterprises have traditionally approached Master Data Management through the following measures:
- Develop a "single version of the truth" by establishing common descriptions for core business entities across multiple systems.
- Assess current master data maturity across the enterprise, identify target maturity and identify gaps
- Master Data Management Tool Selection
- Master data models and cleansed data
- MDM governance and stewardship
- MDM Strategy to tackle mergers and acquisitions
With the advent of Big Data processing, enterprises started analyzing massive amounts of unstructured data from unconventional sources. As a result, inconsistencies across the data are increasing, and the validation performed at data capture is very limited compared with traditional relational data capture.
For example, if an enterprise wants to target customers on social media, one customer may be represented on multiple social media forums under different names, so the chances of a campaign either overreaching a person or not reaching that person at all are very high. The same is true when microblogging sites are used to analyze the voice of the customer and categorize complaints across products. There is a high possibility that customers misspell product names or use local naming conventions for the same products, which will prevent effective analysis.
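As a rough illustration of the product-name problem, the sketch below maps a free-text product mention onto a master product list using approximate string matching from Python's standard `difflib` module. The product names, the `resolve_product` helper, and the 0.6 similarity cutoff are hypothetical choices, not something prescribed by any MDM product.

```python
import difflib

# Hypothetical master list of product names (the MDM "golden" values).
MASTER_PRODUCTS = ["UltraClean Washer", "PowerDry Dryer", "CoolBreeze Fan"]


def resolve_product(mention, cutoff=0.6):
    """Return the closest master product name, or None if nothing is close enough."""
    by_lower = {name.lower(): name for name in MASTER_PRODUCTS}
    matches = difflib.get_close_matches(
        mention.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None


print(resolve_product("ultraclean washr"))
print(resolve_product("toaster"))
```

A misspelled mention such as "ultraclean washr" resolves to the master name, while an unrelated word returns `None` and can be routed to a data steward for review. Local naming conventions that share no spelling with the master name would need an explicit alias table instead.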
Master Data Management in Big Data
The following are some approaches to integrating MDM data quality solutions into Big Data processing, so that the insights generated from massive quantities of data are accurate for the enterprise.
- Adopting Hybrid Big Data Solutions: As highlighted in my last article on Hybrid Big Data Solutions, integrating Big Data with the existing relational data, which is likely to contain the MDM source databases, is one of the easiest ways to ensure data quality on Big Data sources.
- Matching More Than Keywords: Massive quantities of unstructured data bring a greater level of ambiguity about the classification and relevance of documents, so mere keyword matching of entities to find the subject of interest is not enough. Most current Big Data examples rely on standard regular expression functions; however, the true potential of Big Data in conjunction with MDM can be achieved only if text analytics, rather than regular expressions alone, is applied to Big Data.
- Adopting a Data Virtualization Layer: A data virtualization platform provides a common hub for capturing data across traditional and Big Data sources, so business rules can be managed at this layer to ensure data quality across disparate data sources.
- Utilizing the Power of Hadoop Database Extensions: Big Data frameworks like Hadoop can keep data in their own file system, HDFS, without transforming it, and the data can be accessed using SQL-like languages. For example, Hive allows data in the Hadoop file system to be read through a SQL interface. Similarly, HBase is a column-oriented database implemented on top of HDFS. These implementations offer ways to check constraints against the underlying Big Data. For example, Hive supports JOIN across tables, which goes a long way toward checking integrity with respect to MDM.
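The JOIN-based integrity check mentioned in the last approach can be sketched as an "orphan" query: a left outer join from the Big Data records to the master table, keeping only rows with no master match. The example below is only an illustration under stated assumptions. It uses Python's built-in `sqlite3` module as a stand-in for Hive so that it runs anywhere, and the table and column names are hypothetical. Hive supports LEFT OUTER JOIN, but the exact dialect details for a real Hive deployment are not verified here.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Hive tables over HDFS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_master (product_code TEXT);
    CREATE TABLE complaints (complaint_id INTEGER, product_code TEXT);
    INSERT INTO product_master VALUES ('P100'), ('P200');
    INSERT INTO complaints VALUES (1, 'P100'), (2, 'P999'), (3, 'P200');
""")

# Complaints whose product code has no match in the master table.
ORPHAN_QUERY = """
    SELECT c.complaint_id, c.product_code
    FROM complaints c
    LEFT OUTER JOIN product_master m
      ON c.product_code = m.product_code
    WHERE m.product_code IS NULL
    ORDER BY c.complaint_id
"""

orphans = conn.execute(ORPHAN_QUERY).fetchall()
conn.close()
print(orphans)
```

Unlike a foreign key in an RDBMS, a check of this kind runs after the data has landed, so it reports violations rather than preventing them. The orphan rows would then be corrected or mapped by the MDM stewardship process.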
Summary
As enterprises continue to adopt Big Data as part of their data management, the biggest challenge will be data quality. RDBMSs have done a great job on data integrity, and Big Data should support the same. Implementing traditional Master Data Management on top of Big Data / Unified Data will go a long way toward providing correct insights from Big Data processing.
Copyright © 2013 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Srinivasan Sundara Rajan
Srinivasan Sundara Rajan (also known as Sundar) is an Enterprise Technology Enabler for realizing business capabilities. His primary focus is enabling Agile Enterprises by facilitating the adoption of the Everything as a Service model, with particular concentration on BPaaS (Business Process as a Service). He also helps enterprises get meaningful insights from their structured, unstructured, and real-time data sources. All the views expressed are Srinivasan's independent analysis of industry and solutions and are not necessarily those of his current or past organizations. Srinivasan would like to thank everyone who augmented his architectural skills with analytical ideas.