Under the Hood: How ExtraHop Delivers 20Gbps of Real-Time Transaction Analysis

This post is authored by ExtraHop CEO Jesse Rothstein.

When we talk to IT teams who are considering ExtraHop, there’s often a discussion about scalability. People are skeptical, and rightfully so. Many monitoring vendors sell the dream of real-time, off-the-wire transaction analysis. In reality, they only do so for a subset of traffic and for a relatively small number of concurrent flows, or they write the bulk of the data to huge disk arrays for post-hoc analysis.

We love to talk to people about scalability and performance because it matters. For real-time analysis, if you can’t keep up, you fall behind, and if you fall behind, you might never catch up again. Additionally, greater scalability of real-time monitoring offers IT teams visibility into very large environments in which they previously were flying blind, and it offers a more cost-effective approach with fewer appliances.

The EH8000: An All-in-One Operational Intelligence Platform

Our new EH8000 appliance performs real-time, L2-L7 transaction analysis at up to a sustained 20Gbps. Throughput is only part of the picture. A single EH8000 can analyze more than 400,000 transactions per second, extracting application-level health and performance metrics such as URIs associated with HTTP 500 errors, slow stored procedures in a database, or the location of corrupt files in network-attached storage. This level of performance is far beyond what other passive monitoring vendors even advertise, let alone what they actually deliver. For example, our EH8000 performs more than an order of magnitude faster than the recently announced TruView appliance from Visual Network Systems, which, according to their own materials, analyzes only one million transactions per minute, or fewer than 17,000 per second. At more than 400,000 transactions per second, the ExtraHop platform analyzes roughly 24 times as many, which makes it the clear market leader.

Even with our current lead, I believe that ExtraHop will continue to widen the scalability gap compared to other products on the market. This is a bold claim, so please allow me to explain why.

Reason #1 – ExtraHop was built from the ground up for multi-core processing.

The first reason for ExtraHop’s substantial performance lead, and the reason why I believe ExtraHop will continue to widen the gap, is that our platform was built from the ground up for multi-core processing. Network processing is embarrassingly parallel and can be easily split across multiple cores. According to Amdahl’s Law, the more of a system’s work that can run in parallel, the greater the speedup it sees from additional cores. The chart below illustrates the effect: a program that is 95% parallelized has a maximum speedup of 20x, five times the 4x maximum of a program that is only 75% parallelized.[1] While other analysis products will see some benefit from multi-core processing, the ExtraHop platform, which is unburdened by legacy architectures, will continue to see tremendous benefit.

[Chart: Amdahl’s Law, maximum speedup by parallel fraction. Source: Wikipedia]
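
The arithmetic behind the 95% versus 75% comparison is easy to check. Amdahl’s Law gives the speedup on n cores for a program whose parallel fraction is p as 1 / ((1 - p) + p / n), which approaches 1 / (1 - p) as n grows. The short sketch below evaluates both; the function names are illustrative and the core counts are arbitrary sample points.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup predicted by Amdahl's Law for a parallel fraction and core count."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the core count grows without limit."""
    return 1.0 / (1.0 - parallel_fraction)


if __name__ == "__main__":
    for p in (0.75, 0.95):
        # Show how quickly each program approaches its ceiling as cores are added.
        row = ", ".join(
            f"{n} cores: {amdahl_speedup(p, n):.2f}x" for n in (2, 8, 16, 64)
        )
        print(f"{p:.0%} parallel -> {row}; limit {amdahl_limit(p):.0f}x")
```

The serial fraction sets the ceiling: no number of cores can push the 75% program past 4x, while the 95% program keeps gaining until it nears 20x.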

Vendors who are working to convert their existing code to run faster on newer multi-core processors face an uphill battle. As a recent Dr. Dobb’s report, The State of Parallel Programming 2012, states, “Refactoring existing code is particularly challenging, so the researchers recommend that parallelism be part of the design from the start.” The report goes on to detail the types of concurrency bugs that developers often struggle with when converting existing serial code to parallel code.

Even at ExtraHop, where our software is designed for multi-core processing, we still deal with issues such as lock contention, concurrent access, NUMA (non-uniform memory access) effects, and cache ping-ponging. These are sophisticated problems that can have disastrous consequences if handled poorly, especially in this type of high-performance appliance, and there are relatively few development tools that can help.
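
One common way to sidestep lock contention and cache ping-ponging in packet processing is to pin each flow to a single core so that per-flow state is never shared between cores. The sketch below is a generic illustration of that idea, not a description of ExtraHop’s implementation; `FlowKey`, `canonical`, `core_for_flow`, and the use of CRC32 as the hash are all assumptions made for the example.

```python
import zlib
from collections import Counter
from typing import NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: int


def canonical(flow: FlowKey) -> tuple:
    """Order the two endpoints so both directions of a flow yield the same key."""
    a = (flow.src_ip, flow.src_port)
    b = (flow.dst_ip, flow.dst_port)
    return (min(a, b), max(a, b), flow.protocol)


def core_for_flow(flow: FlowKey, num_cores: int) -> int:
    """Pick a core by hashing the canonical flow key (CRC32 is a stand-in hash)."""
    digest = zlib.crc32(repr(canonical(flow)).encode("utf-8"))
    return digest % num_cores


if __name__ == "__main__":
    NUM_CORES = 8  # hypothetical core count
    # One private table per core: no table is ever touched by two cores.
    per_core_packets = [Counter() for _ in range(NUM_CORES)]
    request = FlowKey("10.0.0.5", 51234, "10.0.0.9", 80, 6)
    response = FlowKey("10.0.0.9", 80, "10.0.0.5", 51234, 6)
    for packet_flow in (request, response, request):
        core = core_for_flow(packet_flow, NUM_CORES)
        per_core_packets[core][canonical(packet_flow)] += 1
    print([dict(table) for table in per_core_packets if table])
```

Because requests and responses hash to the same core, each core can keep its flow table private and update it without taking a lock.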

Reason #2 – ExtraHop’s Engineering team is committed to performance. 

Writing high-performance code is a rarely practiced art. The majority of software developers work on front-end applications that have relatively forgiving timing constraints. ExtraHop does not have this luxury with real-time packet processing, so we are laser-focused on writing performance-sensitive code. We are constantly profiling our systems to seek out bottlenecks, especially in the packet path. If new code adds as few as 1,000 CPU cycles, we will notice. We also pay close attention to caching effects, both for dedicated per-core and shared on-die caches. This is not to say that other vendors’ engineering teams are not committed to performance, but simply that our focus on performance is one of the reasons why the ExtraHop platform performs real-time transaction analysis at a sustained 20Gbps.

As an aside, if you are a software engineer looking to solve kernel-level, systems-engineering problems and enjoy working with an outstanding team of developers, we’re hiring.

Reason #3 – ExtraHop uses OS bypass for the data plane.

ExtraHop uses a custom Linux distribution for activities on the control plane, such as running the administration UI and configuring the system. For the data plane, ExtraHop uses a proprietary networking microkernel that runs on the metal for the fastest possible performance. Optimizing packet scheduling, performing memory management, and talking directly to I/O devices all help to speed up our packet processing considerably.

In addition to packet processing, another challenge is recording the stream of health and performance metrics to persistent storage. When we were designing the ExtraHop platform, we considered many commercial and open-source databases. We ended up rejecting these options because they would have required continuous management and administrative tuning. Most importantly, these RDBMSes couldn’t handle the level of sustained reads and writes that the ExtraHop platform requires. We also tried pure file-based systems, which didn’t scale, and investigated less-structured datastores such as Berkeley DB and Tokyo Cabinet. We could have solved this problem by throwing money at it, such as by requiring our users to purchase an expensive SQL cluster, but we wanted to build an all-in-one appliance with a small footprint that required little care and feeding.

To keep our deployment simple and make real-time analysis available to users immediately, we built a proprietary, high-speed, real-time streaming datastore that is optimized for telemetry, or time-sequenced data. This datastore bypasses the operating system to read from and write to block devices directly, and it uses fast in-memory indexing so that data can be read as soon as it is written, similar to how Google uses Bigtable for web indexing.
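
The general idea of an append-only log paired with an in-memory index can be sketched in a few lines. This is a toy illustration under stated assumptions, not ExtraHop’s datastore: `TelemetryLog` and its methods are hypothetical names, an in-memory `io.BytesIO` buffer stands in for a raw block device, and samples for each metric are assumed to arrive in time order.

```python
import bisect
import io
import struct
from collections import defaultdict

# One sample on "disk": (timestamp, value) as two little-endian doubles.
RECORD = struct.Struct("<dd")


class TelemetryLog:
    """Append-only sample log plus an in-memory index of offsets per metric."""

    def __init__(self, storage=None):
        # Any seekable binary file-like object works; BytesIO keeps this self-contained.
        self.storage = storage if storage is not None else io.BytesIO()
        self.index = defaultdict(list)  # metric name -> [(timestamp, offset), ...]

    def append(self, metric: str, timestamp: float, value: float) -> None:
        """Write one sample at the end of the log and index it immediately."""
        offset = self.storage.seek(0, io.SEEK_END)
        self.storage.write(RECORD.pack(timestamp, value))
        # Assumes in-order arrival per metric, so each index list stays sorted.
        self.index[metric].append((timestamp, offset))

    def read_range(self, metric: str, start: float, end: float):
        """Return [(timestamp, value), ...] with start <= timestamp <= end."""
        entries = self.index.get(metric, [])
        lo = bisect.bisect_left(entries, (start, -1))
        hi = bisect.bisect_right(entries, (end, float("inf")))
        results = []
        for _, offset in entries[lo:hi]:
            self.storage.seek(offset)
            results.append(RECORD.unpack(self.storage.read(RECORD.size)))
        return results


if __name__ == "__main__":
    log = TelemetryLog()
    log.append("http.500", 1.0, 3.0)
    log.append("db.slow_queries", 1.5, 1.0)
    log.append("http.500", 2.0, 7.0)
    # The newest sample is readable as soon as append() returns.
    print(log.read_range("http.500", 0.0, 2.0))
```

Writes only ever go to the end of the log, and the index is updated in the same call, which is what lets a reader see a sample the moment it has been written.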

[Figure: ExtraHop Platform Architecture]

You Are Right to Care About Scalability and Performance

ExtraHop cares as much about performance as you do. Performance affects how much value you get from the product, and it also impacts data fidelity. If a load balancer, switch, firewall, or other in-line device is overloaded and drops packets, the sender will simply retransmit them (assuming a reliable transport protocol such as TCP). That doesn’t happen for an out-of-line device that receives traffic from a SPAN port or network tap. If the monitoring device is overloaded, the packets it drops are never resent to it, and analysis suffers.

When choosing a real-time transaction-analysis solution, be sure to question the vendor on scalability. Ask them when their solution was first developed and if it has been redesigned for multi-core chip architectures. If they claim a certain level of throughput, ask them if they can handle high packet rates as well—many monitoring products that do not scale in real-world environments only talk about one end of the performance curve. And, finally, be sure to contact us so we can show you the ExtraHop difference!
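
To see why throughput and packet rate are two different ends of the performance curve, it helps to work out how many packets a 20Gbps feed can carry. The sketch below is illustrative arithmetic only: it assumes standard Ethernet framing (20 bytes of preamble, start delimiter, and inter-frame gap per frame), and the 3 GHz clock and 16-core count are hypothetical figures chosen for the example, not EH8000 specifications.

```python
# Per-frame wire overhead on Ethernet: 7-byte preamble + 1-byte start-of-frame
# delimiter + 12-byte inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def max_packets_per_second(link_bps: float, frame_bytes: int) -> float:
    """Upper bound on frames per second for a given link speed and frame size."""
    wire_bits = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / wire_bits


def cycles_per_packet(clock_hz: float, cores: int, packets_per_second: float) -> float:
    """Average CPU cycles available per packet if work spreads evenly over all cores."""
    return clock_hz * cores / packets_per_second


if __name__ == "__main__":
    LINK_BPS = 20e9  # 20Gbps, the throughput figure from the post
    CLOCK_HZ = 3e9   # hypothetical 3 GHz cores
    CORES = 16       # hypothetical core count
    for frame_bytes in (64, 512, 1518):
        pps = max_packets_per_second(LINK_BPS, frame_bytes)
        budget = cycles_per_packet(CLOCK_HZ, CORES, pps)
        print(f"{frame_bytes:>5}-byte frames: {pps / 1e6:6.2f} Mpps, "
              f"about {budget:,.0f} cycles per packet")
```

With minimum-size frames the same 20Gbps link carries close to 30 million packets per second, versus about 1.6 million with full-size frames. Under these assumptions the per-packet cycle budget shrinks by more than an order of magnitude, and an extra 1,000 cycles in the packet path becomes a large share of it.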

 


[1] It’s worthwhile to consider why parallelization is necessary. Since 2005, increases in clock speed have plateaued while transistor counts have continued to grow according to Moore’s Law (see the graph below). During the same period, CPUs have gone from one core to two, four, six, eight, and sixteen, with processors such as the dual-core Itanium 2 arriving in 2006. To see the maximum benefit from new processors, software developers must understand how to parallelize their systems. As Herb Sutter noted in the article cited below, the free lunch is over: software no longer gets faster automatically with each new generation of hardware. As a recent Intel whitepaper put it, “The future of computing is parallel computing, and the future of programming is parallel programming.”

[Graph: CPU clock speed and transistor count over time. Source: The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software]

 

More Stories By ExtraHop Networks

ExtraHop Networks is a leading provider of network-based application performance management (APM) solutions. The ExtraHop Application Delivery Assurance system performs the fastest and deepest analysis in the industry, achieving real-time transaction monitoring at speeds up to a sustained 10Gbps in a single appliance and application-level visibility with no agents, configuration, or overhead. The ExtraHop system quickly auto-discovers and auto-classifies applications and devices, delivering immediate value out of the box. ExtraHop Networks provides award-winning solutions to companies across a wide range of industries, including ecommerce, communications, and financial services. The privately held company was founded in 2007 by Jesse Rothstein and Raja Mukerji, engineering veterans from F5 Networks and architects of the BIG-IP v9 product. Follow us on Twitter @ExtraHop. For more information, visit www.extrahop.com.
