Compression: Making the Big Smaller and Faster (Part 1)

By Nilabh Mishra

How important is data compression? Sharing information quickly and efficiently has long been an area of study and research. Companies like Google and Facebook have spent considerable time and effort developing faster and better compression algorithms. Widely used compression algorithms have existed since at least the ’70s, and the ongoing research into better ones shows just how important compression is for the Internet and for all of us.

The Need for Data Compression
The World Wide Web (WWW) has changed a great deal since it was made available to the public in 1991. Believe it or not, a copy of the world’s first website can still be browsed today. Back then, webpages were very simple. Today, they are far more complex, and there is an evident need for compression algorithms that are lossless, fast, and efficient.

There are several best practices that help optimize page load times, and webpage optimization in general is a topic for a separate blog post. In this article, we will spend some time understanding the basics of compression and how it works. In the second part of this blog, we will cover a newer compression method called “Brotli”.

Encoding and Data Compression
Let’s start by understanding what data encoding and compression are:

The word “compression” comes from the Latin word compressare, which means to press together. “Encoding” is the process of placing a sequence of characters in a specialized format that allows efficient data storage as well as transmission. Per Wikipedia: “Data compression involves encoding information using fewer bits than the original representation.”

Compression plays a key role when it comes to saving bandwidth and speeding up your site. Modern day websites involve a lot of HTTP requests and responses between the client (the browser) and the server to serve a webpage. With an overall increase in the number of HTTP requests and responses, it becomes important to ensure that these transfers are taking place at a fast and efficient rate.

HTTP works on a request-response model, as demonstrated below:

In this case, we are not using any compression method to compress the response being sent by the server.

  • The browser sends an HTTP request asking for the Index.html page
  • The server looks for the requested file and responds with the requested resource and a 200 OK HTTP status message
  • The browser receives the server’s response and renders the page

As we can see, there is no compression involved in this case. The server responded with a 300 KB file (the index.html page). If the file were larger, the response would take longer to travel over the wire, which would increase the overall page load time. Please note that we are currently looking at only a single HTTP response. Modern websites can receive hundreds of such HTTP responses from the server to render a single webpage.

The image below shows the same HTTP request – response between the browser and the server, but in this case, we use compression to reduce the size of the response being sent by the server to the browser.
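To get a feel for the difference compression makes to a response of this size, here is a minimal Python sketch. It does not make a real HTTP request. It builds a hypothetical, highly repetitive page of roughly 300 KB (the `build_page` helper and its markup are made up for illustration) and compares the payload size before and after Gzip compression using Python's standard `gzip` module.

```python
import gzip


def build_page(target_bytes=300 * 1024):
    """Build a synthetic HTML page of at least target_bytes (illustrative only)."""
    rows = []
    size = 0
    i = 0
    while size < target_bytes:
        row = f'<li class="item"><a href="/product/{i}">Product number {i}</a></li>\n'
        rows.append(row)
        size += len(row)
        i += 1
    html = "<html><body><ul>\n" + "".join(rows) + "</ul></body></html>\n"
    return html.encode("utf-8")


page = build_page()
compressed = gzip.compress(page)

# Bytes that would go on the wire without and with compression.
print(f"uncompressed: {len(page)} bytes")
print(f"gzip:         {len(compressed)} bytes")
print(f"saved:        {100 * (1 - len(compressed) / len(page)):.1f}%")
```

Real pages usually compress less well than this synthetic one, because their markup is less repetitive, so the saving printed here should be read as an upper-end illustration rather than a typical result.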

Today, complex and dynamic websites generate hundreds of HTTP requests and responses. This makes it important to have a mechanism that ensures fast and efficient data transfer between the server and the browser. This is where compression methods like Deflate and Gzip come in.

Introduction to Gzip
Gzip is a compression method that is used to make files smaller for storage and faster transmission over the network. Gzip is one of the most popular, powerful, and effective ways of compressing data and it can reduce the file size by up to 70%.

Gzip is based on the DEFLATE algorithm, which in turn is a combination of LZ77 and Huffman coding. Understanding how LZ77 works is essential to understanding how compression methods like DEFLATE and Gzip work.
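The relationship between Gzip and DEFLATE can be seen from Python's standard library. The sketch below, which uses an arbitrary repeated sentence as sample data, produces a raw DEFLATE stream with `zlib` and a Gzip stream with `gzip`, then inspects the first bytes of the Gzip output: a two-byte magic number followed by a compression-method field whose value 8 denotes DEFLATE.

```python
import gzip
import zlib

# Arbitrary sample data, chosen only for illustration.
data = b"As idle as a painted ship, upon a painted ocean. " * 100

# Raw DEFLATE stream: a negative wbits value tells zlib to omit header and trailer.
compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
raw_deflate = compressor.compress(data) + compressor.flush()

# Gzip output: a header, a DEFLATE stream, and a trailer (CRC-32 and length).
gzipped = gzip.compress(data)

print(gzipped[:2].hex())  # gzip magic number
print(gzipped[2])         # compression method field; 8 means DEFLATE
print(len(data), len(raw_deflate), len(gzipped))

# Both forms decompress back to the same bytes.
restored_raw = zlib.decompress(raw_deflate, wbits=-15)
restored_gzip = zlib.decompress(gzipped, wbits=31)  # 31 = expect a gzip wrapper
```

In other words, Gzip adds a small wrapper (identification bytes, optional metadata, and an integrity check) around data that DEFLATE has compressed.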

LZ77
Developed in the late ’70s by Abraham Lempel and Jacob Ziv, the LZ77 method of compression looks for sequences of characters that recur in a text. It compresses the text by replacing each recurring string with a pointer that backreferences an identical string encountered earlier in the text.

The pointer or backreference is of the form <relative jump, length>, where relative jump is the number of bytes between the current occurrence of the string and its previous occurrence, and length is the total number of identical bytes found.

Now let us understand this better with the help of an example. Assume there is a text file with the following text:

As idle as a painted ship, upon a painted ocean.

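In this file, we see the strings “as” and “painted” occurring multiple times. The LZ77 method replaces repeated occurrences of a string with the notation <relative jump, length>. (For simplicity, this illustration treats “As” and “as” as the same string. A real LZ77 match is byte-exact, so the capitalized “As” would not match the lowercase “as”.)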

So using LZ77, the text will get encoded in the following way:

As idle <8,2> a painted ship, upon a <21,7> ocean.

To encode the text, we took the following steps:

  1. Looked at the string and tried to find occurrences of the same “string” or “substrings”.
  2. Replaced the repeated occurrences of a string with the notation <relative jump, length>. Here, the second occurrences of the two strings “as” and “painted” were replaced with <8,2> and <21,7> respectively.
  3. The string “painted”, which would otherwise occupy 7 bytes (7 characters × 1 byte each), was compressed to occupy only 2 bytes. 2 bytes, or 16 bits, is the size assumed here for the pointer or backreference.
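The steps above can be sketched in a few lines of Python. This is a simplified, greedy illustration of the LZ77 idea, not the encoder used by DEFLATE or Gzip: the function names, the `window` and `min_length` parameters, and the token representation (a plain character for a literal, a `(jump, length)` tuple for a backreference) are all assumptions made for this example.

```python
def lz77_encode(text, window=255, min_length=3):
    """Greedy LZ77-style tokenizer: literals stay as characters,
    repeats become (relative_jump, length) tuples."""
    tokens = []
    i = 0
    while i < len(text):
        best_len, best_dist = 0, 0
        # Search the sliding window behind position i for the longest match.
        for j in range(max(0, i - window), i):
            length = 0
            while i + length < len(text) and text[j + length] == text[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_length:
            tokens.append((best_dist, best_len))
            i += best_len
        else:
            tokens.append(text[i])
            i += 1
    return tokens


def lz77_decode(tokens):
    """Rebuild the text by copying `length` characters from `jump` positions back."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(tok)
    return "".join(out)


def render(tokens):
    """Show tokens in the <relative jump, length> notation used above."""
    return "".join(f"<{t[0]},{t[1]}>" if isinstance(t, tuple) else t for t in tokens)


text = "As idle as a painted ship, upon a painted ocean."
tokens = lz77_encode(text)
print(render(tokens))
print(lz77_decode(tokens) == text)
```

Because this sketch matches bytes exactly and always takes the longest match, its output differs slightly from the hand-worked encoding: “as” is left as literals (it does not match “As”, and shorter matches fall below `min_length`), while the second “ a painted ”, including the surrounding spaces, becomes the single backreference <21,11>.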

Huffman Coding
Huffman coding is another lossless data compression algorithm. It is based on how frequently each symbol occurs, for example characters in a text file or pixel values in an image: symbols that occur more often are assigned shorter bit codes, and rarer symbols are assigned longer ones. To get a deeper understanding of this algorithm, a detailed tutorial on how Huffman coding works is a good next step.
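As a rough illustration of the idea, the following Python sketch builds a Huffman code table for a string by repeatedly merging the two least frequent groups of symbols. It is a textbook-style construction with hypothetical function names, and it represents codes as strings of "0" and "1" characters for readability rather than as packed bits. DEFLATE uses a more constrained variant of Huffman coding, which this sketch does not attempt to reproduce.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each symbol in text to its bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, tie-breaker, {symbol: code-so-far}).
    heap = [(count, idx, {sym: ""}) for idx, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, codes1 = heapq.heappop(heap)  # least frequent group
        c2, _, codes2 = heapq.heappop(heap)  # second least frequent group
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_encode(text, codes):
    return "".join(codes[ch] for ch in text)


def huffman_decode(bits, codes):
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:  # codes are prefix-free, so the first hit is the symbol
            out.append(reverse[current])
            current = ""
    return "".join(out)


text = "As idle as a painted ship, upon a painted ocean."
codes = huffman_codes(text)
bits = huffman_encode(text, codes)
print(f"{len(text) * 8} bits as 8-bit characters, {len(bits)} bits Huffman-coded")
print(huffman_decode(bits, codes) == text)
```

In this sentence the space character is the most frequent symbol, so it receives one of the shortest codes. The comparison ignores the cost of storing the code table itself, which a real format also has to transmit.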

All modern browsers support Gzip compression for HTTP responses. With Gzip, one of the most important questions is what to compress. It works best with text-based resources like static HTML, CSS, and JavaScript files, but is not very effective for resources that are already compressed, such as images. To support Gzip, the server must be configured to allow Gzip compression.

The image above shows the impact Gzip compression can have on a text-based resource like a JavaScript file. In this case, we ran 2 instant tests using Catchpoint to the URL: https://code.jquery.com/jquery-3.2.1.js.

For the first test run, we asked for an uncompressed response by passing the custom header Accept-Encoding: identity along with the request. The first image shows that no Content-Encoding was returned for that request.

In the second image, the browser sends Accept-Encoding: gzip, and the server responds with a Gzip-compressed file.

We can clearly see how Gzip can drastically compress the files to improve data transmission rate over the wire.
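The header exchange described above can be sketched from the server's side. The Python function below is a simplified illustration, not how any particular web server implements it: `respond` is a hypothetical helper, the content type is hard-coded, and quality values such as `gzip;q=0` in the Accept-Encoding header are deliberately ignored.

```python
import gzip


def respond(body: bytes, accept_encoding: str):
    """Return (headers, payload), gzip-compressing only if the client offered gzip.

    Simplification: q-values in Accept-Encoding are ignored.
    """
    offered = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    headers = {"Content-Type": "application/javascript", "Vary": "Accept-Encoding"}
    if "gzip" in offered:
        payload = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    else:
        payload = body  # identity: send the bytes unchanged
    headers["Content-Length"] = str(len(payload))
    return headers, payload


# Made-up script body standing in for a JavaScript file.
script = b"function add(a, b) { return a + b; }\n" * 2000

plain_headers, plain_payload = respond(script, "identity")
gzip_headers, gzip_payload = respond(script, "gzip, deflate")

print(plain_headers.get("Content-Encoding"), plain_headers["Content-Length"])
print(gzip_headers.get("Content-Encoding"), gzip_headers["Content-Length"])
```

The first call mirrors the test with Accept-Encoding: identity (no Content-Encoding header, full-size payload), and the second mirrors the test in which the client advertises Gzip support.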

Catchpoint’s Scheduled tests also highlight the difference between compressed and not-compressed content loading on webpages.

In the screenshot above, we see the difference in downloaded bytes for static content (CSS, JavaScript) when using Gzip vs. when not using any encoding.

Brotli Compression
A new compression method called Brotli was introduced not too long ago. The Brotli compression algorithm is optimized for the web, and specifically for small text documents. We will discuss this compression method, and what it has to offer to the World Wide Web community, in the second part of the article.

The post Compression: Making the Big Smaller and Faster (Part 1) appeared first on Catchpoint's Blog - Web Performance Monitoring.
