The Wayback Machine - https://web.archive.org/web/20170731193153/http://bigdata.sys-con.com:80/node/4129110


A GDPR Compliance Journey

The technology challenges, approaches, and lessons learned for the centralized testing environment

Lessons from the GDPR Compliance Journey of a Leading Financial Services Organization

In preparation for General Data Protection Regulation (GDPR) compliance, a global 100 financial services organization embarked on a journey to assess its core information processing environments with the objective of identifying opportunities to strengthen its data privacy protection programs. This article focuses on the technology challenges, approach, and lessons learned for the centralized testing environment.

Situation
Like many DevOps groups across the industry, this financial organization has adopted both continuous testing and quality testing regimes to deliver quality products using agile methodology. The organization prefers to use production data to prepare its test data. While the majority of testing is done by an internal team, certain applications are tested by outsourced offshore teams. The test environment is fairly complex, comprising Oracle, Hadoop (Parquet files), Hive, Cassandra, MS SQL, SAS, and Linux-based systems. Incremental data volume varies between 10 million and 15 million records on a weekly basis. Certain major releases of Big Data-based applications require up to 5 GB of data (~75 million records).

Challenges
In order to comply with the GDPR and prevent data privacy breaches, the testing team needed to detect and de-identify PII elements. If the team uses available de-identification methods that rely on product-specific encryption technology, such as MS SQL encryption, much of the data becomes unusable for testing for the following reasons:

  • Current methods scramble the data and make it unusable.
  • Current methods do not preserve any referential relationship between various data sources.

If the team chooses to mask the data instead, it is confronted with similar challenges. For example, to test an application that calculates the end-of-month summary balance of a customer account using an Oracle data source and a Hadoop data source, the team would not be able to use data encrypted with the available technology.
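The referential-integrity problem can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the account numbers, the two in-memory "sources", the key, and the helper names are hypothetical, and a keyed hash (HMAC-SHA256) stands in for whatever deterministic protection a real solution would use. It only shows why a join across sources survives when equal inputs map to equal tokens and fails when each value is scrambled independently.

```python
import hashlib
import hmac
import secrets

# Hypothetical shared key; a real deployment would obtain it from a key manager.
SHARED_KEY = b"demo-key-not-for-production"


def random_scramble(value: str) -> str:
    """Replace a value with unrelated random hex: a fresh output on every call."""
    return secrets.token_hex(8)


def deterministic_token(value: str, key: bytes = SHARED_KEY) -> str:
    """Map equal inputs to equal tokens by using a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


# Two hypothetical sources that share the customer account number as a key.
oracle_balances = [("ACC-1001", 250.0), ("ACC-1002", 90.0)]
hadoop_transactions = [("ACC-1001", 40.0), ("ACC-1002", -15.0), ("ACC-1001", 10.0)]


def month_end_summary(balances, transactions, protect):
    """Protect the account key in both sources, then join and sum per account."""
    totals = {protect(acct): amount for acct, amount in balances}
    unmatched = 0
    for acct, amount in transactions:
        key = protect(acct)
        if key in totals:
            totals[key] += amount
        else:
            unmatched += 1  # the transaction can no longer find its account
    return totals, unmatched


det_totals, det_unmatched = month_end_summary(
    oracle_balances, hadoop_transactions, deterministic_token
)
rnd_totals, rnd_unmatched = month_end_summary(
    oracle_balances, hadoop_transactions, random_scramble
)
```

With the deterministic tokens every transaction still finds its account, so the summary balances can be computed on de-identified data. With independent scrambling, the protected keys in the two sources no longer line up and the join yields nothing usable.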

In addition, PII often appears within comment and description fields; encrypting or masking the entire field would result in the loss of important information.
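As a rough illustration of the alternative, the sketch below replaces only the PII spans inside a free-text field and leaves the rest of the comment readable. The two regular expressions, the placeholder labels, and the sample comment are hypothetical; a real detection library would cover many more PII types and formats.

```python
import re

# Illustrative patterns only (assumptions, not a complete PII catalogue).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact_embedded_pii(text):
    """Replace only the matched PII spans; return the new text and the findings."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        def _replace(match, label=label):
            findings.append((label, match.group(0)))
            return f"<{label}>"
        text = pattern.sub(_replace, text)
    return text, findings


comment = "Customer called about late fee; contact jane.doe@example.com, SSN 123-45-6789."
redacted, found = redact_embedded_pii(comment)
print(redacted)
# Customer called about late fee; contact <EMAIL>, SSN <SSN>.
```

The business content of the comment (the late-fee call) survives, which is the information a tester would lose if the whole field were encrypted.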

More importantly, data encryption using available methods is computationally expensive and time-consuming, and it requires a large hardware infrastructure.

Approach
The organization identified the following solution criteria to mitigate the challenges identified during the assessment:

  • Autonomous Detection: Leveraging a centralized library, the solution should examine all incoming data, including embedded documents, for the presence of PII elements. The solution should also use machine learning techniques to classify sensitive documents present in a Big Data repository.
  • Format Preserving Encryption: Based on the type of PII data and the preference of the user, the solution should encrypt the data elements in the following three modes:
    • Blind mode: It should encrypt a data element if the data element matches a specific regular expression.
    • Column mode: It should encrypt the content of a specific column or field.
    • Mixed mode: It should encrypt the data elements within a specific column if the data element matches a specific regular expression.
  • Cross Platform Referential Integrity: The solution must be able to retain referential integrity between records across platforms.
  • Big Data Volume: The solution should be able to detect and encrypt sensitive data in 100 GB of data in less than one hour using commodity hardware.
  • Data Usage Monitoring: The solution should be able to record and retain information on all usage of private data for audit and compliance. In addition, the solution should be able to identify abnormal data usage by leveraging machine learning.
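The three modes in the criteria above can be sketched in Python as follows. This is an illustration under stated assumptions, not the organization's solution: the record layout, the SSN-style pattern, the key, and all function names are hypothetical. The `protect_value` helper is also only a stand-in for format-preserving encryption. It is a keyed, deterministic character substitution that keeps length, character class, and punctuation, but it is not reversible; a real implementation would use a standardized format-preserving cipher instead.

```python
import hashlib
import hmac
import re

KEY = b"demo-key-not-for-production"  # hypothetical key for illustration
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # hypothetical PII pattern


def protect_value(value, key=KEY):
    """Stand-in for format-preserving encryption (assumption, NOT reversible).

    Each digit becomes a keyed pseudo-random digit and each ASCII letter a
    letter of the same case; every other character is kept as-is.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]
        if ch.isdigit():
            out.append(str(b % 10))
        elif ch.isalpha() and ch.isascii():
            base = "A" if ch.isupper() else "a"
            out.append(chr(ord(base) + b % 26))
        else:
            out.append(ch)
    return "".join(out)


def protect_matches(text, pattern):
    """Protect only the substrings that match the pattern."""
    return pattern.sub(lambda m: protect_value(m.group(0)), text)


def apply_mode(record, mode, column=None, pattern=None):
    """Return a protected copy of a record (a dict of column name -> string)."""
    result = dict(record)
    if mode == "blind":      # any field, wherever the pattern matches
        for name, value in record.items():
            result[name] = protect_matches(value, pattern)
    elif mode == "column":   # the whole content of one column
        result[column] = protect_value(record[column])
    elif mode == "mixed":    # pattern matches, but only inside one column
        result[column] = protect_matches(record[column], pattern)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return result


record = {
    "name": "Jane Doe",
    "ssn": "123-45-6789",
    "notes": "Verified SSN 123-45-6789 by phone",
}
blind = apply_mode(record, "blind", pattern=SSN_PATTERN)
column = apply_mode(record, "column", column="name")
mixed = apply_mode(record, "mixed", column="notes", pattern=SSN_PATTERN)
```

Because the substitution is deterministic for a given key, the protected SSN in the `ssn` column is identical to the one embedded in `notes`, which is the property the referential-integrity criterion asks for across records and platforms.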

Lessons Learned

  1. Understand business and technology landscape: It is imperative to understand the current technology landscape, business practices, and emerging trends. If your technology platform and domain are monolithic today, do you expect them to remain monolithic in the near future? What would be the impact should you move some of your testing to a cloud platform? What about Big Data applications?
  2. Evaluate risks: Assess data security risks through the lens of GDPR and beyond. In addition to PII and PHI, most organizations deal with sensitive data that may not be associated with an individual. How do you detect, encrypt, and monitor other types of sensitive data, such as B2B contract information, in your testing environment?
  3. Beyond Retrofitting: Define the ideal solution characteristics prior to evaluating solutions. Retrofitting a solution to meet your business needs is often time-consuming and costly.

More Stories By Angsuman Dutta

Angsuman Dutta is the CEO and founder of Pricchaa, a Big Data security company. He is a Data Management and Analytics expert. He has helped numerous Fortune 500 enterprises with Big Data Adoption solutions primarily in Healthcare and Banking. Angsuman earned a degree in engineering from the IIT, and an MBA from the University of Chicago.
