|
PHP Authors: Liz McMillan, Carmen Gonzalez, Hovhannes Avoyan, Lori MacVittie, Trevor Parsons Related Topics: Ruby-On-Rails, Java IoT, Industrial IoT, Microservices Expo, Open Source Cloud, IoT User Interface, PHP Ruby-On-Rails: Article
Monitoring Background Jobs in Ruby’s Resque
How to get visibility into an important component of any complex system: the messaging queue
|
By TR Jordan |
Article Rating: |
|
May 15, 2013 01:45 PM EDT |
Reads: |
8,109 |
Here at AppNeta, we get to see a lot about how people build their web applications. From simple PHP scripts to heavily service-oriented Java clouds to monolithic Django apps, everybody’s product is architected a little differently. We’re still out to trace everything, and today I want to talk how to get visibility into an important component of any complex system: the messaging queue. Specifically, let’s look at how to trace a job from Rails using Resque.
Messaging Queues If you haven’t used a messaging queue in your app, the idea is simple. Instead of forcing all the work to happen during the request, while the user is waiting, you can delay some of the more time-consuming tasks. You can do anything in these tasks, ranging from a simple insert to kicking off a series of user analytics that touch all parts of your infrastructure. The advantage is that you can return a speedy response to the user, or, if they are actually waiting on the task results, give them a better loading interface than a white screen and browser loading bar.
A Quick Resque Tutorial In Ruby, Resque is a task runner, which by default stores the task descriptions in Redis (though other options are available). Resque jobs are just Ruby classes, with a single mandatory method perform . Resque will call perform with the arguments given in the task description. Let’s look at a minimal task, that takes a single argument and prints it. (Useless, I know.)

The @queue variable defines a name that a worker can bind to, in case you want to spread different types of jobs across different machines. To create a task that this worker could run, we just call it from our request:

And that’s our job! Maybe not the most interesting job, and probably not prone to performance issues, but we don’t know that yet. So let’s measure it!
Tracing a Resque Task Now that we’ve added this to our system, we should have monitoring around it. The easiest way to do this would be to just measure the time each task takes, and log that information:

Unfortunately, the data presentation here leaves a bit to be desired, so I’m going to use TraceView to log this information instead. This also has the benefit of logging any SQL queries, cache accesses, or service calls that we might do in a more complex task, as well as reporting errors. To start a trace fresh, we can wrap this call in the start_trace block:
That’s a start! We’ve now got some visibility into our Resque jobs, and we can rest easy knowing that this is running smoothly in production.
Tracing a Resque Job (with multiple tasks!) For cron-style jobs, the approach of tracing each task individually works fine. For reference, let’s look at the events we’re generating with that code:

Pretty straightforward. Now let’s consider a more complicated set of tasks: a document-processing pipeline. That code might look like this:

In this case, our first task takes a document, and the second one archives it. If we have multiple tasks, each one gets logged separately, and we can figure out same statistics for each — average, std. dev., percentiles, and the like. But what if you have a job that spans multiple tasks? We can further aggregate the stats, but we might be starting to miss things, like large inputs that cause the entire pipeline to slow down.
What we’d really like is to correlate the related tasks, so instead of timing the each task, we’re timing the entire job. Under the hood, TraceView generates a token for each request. If we pass this ID (generally stored in xtrace, after the X-Trace header it’s passed around in) to each task, we can correlate those timings before storing them, and retrieve them all together. To do this, we can modify each task to take this token, and trace using that ID. ProcessDoc then becomes:

Now we need to start the trace somewhere, but we’re not doing it in the job. We could start it in the first task, or we could link this one step further up the chain and tie it back to the web request that started it in the first place. In a default rails stack, that request generates the following events:

To add in the task queue call to the logged request, we can call the following function:

We have to force a fork in the execution path to indicate that we’re running an asynchronous task, possibly in parallel, with the rest of the web request, which is done with the call to fromString . Aside from that, this is the same underlying call as is done by start_trace above — log that we’re entering a named block of code, and start timing it.
When we put it all together, we get a secondary execution path attached to the web request, and the logged events look like this:

Now we’ve got everything: the original request, all tasks, individual timing information, and a global view of how the process performed. Not that we now have an additional timing measurement here: the delay before starting the task at all. In this case, we waited a full 500ms between queuing the job actually executing it! Once we were in the pipeline, the tasks happened much faster (only 25ms between processing and archiving).
Caveats Lest you think that everything was easy, there’s a couple things to keep in mind when you use this in your own application.
- Because we’re starting the timing in the web request and ending it in a task queue, we’re relying on those two processes to have an identical clock. If they’re on the same machine, it won’t be a problem, but on different machines, any clock skew will effect the timing.
- I’ve quietly assumed everything in this system is reliable, which is almost certainly wrong. Whatever your error handling is, make sure you always log the exit event for ‘job’, or you may never know that you have errors!
As long as I haven’t totally dissuaded you from trying this out, all the code is available in one place in this gist, and you can try in out in your application today with our free version of TraceView!
Related Articles
Ruby 2.0 Released: Let The Tracing Begin!
AppNeta Rubygems Verified
Relieve Event Binding Aches in Backbone.js
A veteran of MIT’s Lincoln Labs, TR is a reformed physicist and full-stack hacker – for some limited definition of full stack. After a few years as Software Development Lead with Thermopylae Science and Techology, he left to join Tracelytics as its first engineer. Following Tracelytics merger with AppNeta, TR was tapped to run all of its developer and market evangelism efforts. TR still harbors a not-so-secret love for Matlab-esque graphs and half-baked statistics, as well as elegant and highly-performant code. Read more of his articles at www.appneta.com/blog or visit www.appneta.com.
@ThingsExpo Stories By Pat Romanski  IoT solutions exploit operational data generated by Internet-connected smart “things” for the purpose of gaining operational insight and producing “better outcomes” (for example, create new business models, eliminate unscheduled maintenance, etc.). The explosive proliferation of IoT solutions will result in an exponential growth in the volume of IoT data, precipitating significant Information Governance issues: who owns the IoT data, what are the rights/duties of IoT solutions adopters towards t... Dec. 7, 2016 05:15 PM EST Reads: 245 | By Elizabeth White  Businesses and business units of all sizes can benefit from cloud computing, but many don't want the cost, performance and security concerns of public cloud nor the complexity of building their own private clouds. Today, some cloud vendors are using artificial intelligence (AI) to simplify cloud deployment and management. In his session at 20th Cloud Expo, Ajay Gulati, Co-founder and CEO of ZeroStack, will discuss how AI can simplify cloud operations. He will cover the following topics: why clou... Dec. 7, 2016 05:15 PM EST Reads: 925 | By Elizabeth White  As data explodes in quantity, importance and from new sources, the need for managing and protecting data residing across physical, virtual, and cloud environments grow with it. Managing data includes protecting it, indexing and classifying it for true, long-term management, compliance and E-Discovery. Commvault can ensure this with a single pane of glass solution – whether in a private cloud, a Service Provider delivered public cloud or a hybrid cloud environment – across the heterogeneous enter... Dec. 7, 2016 05:15 PM EST Reads: 1,741 | By Pat Romanski  Everyone knows that truly innovative companies learn as they go along, pushing boundaries in response to market changes and demands. What's more of a mystery is how to balance innovation on a fresh platform built from scratch with the legacy tech stack, product suite and customers that continue to serve as the business' foundation.
In his General Session at 19th Cloud Expo, Michael Chambliss, Head of Engineering at ReadyTalk, discussed why and how ReadyTalk diverted from healthy revenue and mor... Dec. 7, 2016 04:30 PM EST Reads: 1,674 | By Elizabeth White  The many IoT deployments around the world are busy integrating smart devices and sensors into their enterprise IT infrastructures. Yet all of this technology – and there are an amazing number of choices – is of no use without the software to gather, communicate, and analyze the new data flows. Without software, there is no IT.
In this power panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, Dave McCarthy, Director of Products at Bsquare Corporation; Alan Williamson, Principal... Dec. 7, 2016 04:15 PM EST Reads: 343 | By Carmen Gonzalez  The 20th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held June 6-8, 2017, at the Javits Center in New York City, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Containers, Microservices and WebRTC to one location.
With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal ... Dec. 7, 2016 03:45 PM EST Reads: 2,253 | By Pat Romanski  You have great SaaS business app ideas. You want to turn your idea quickly into a functional and engaging proof of concept. You need to be able to modify it to meet customers' needs, and you need to deliver a complete and secure SaaS application. How could you achieve all the above and yet avoid unforeseen IT requirements that add unnecessary cost and complexity? You also want your app to be responsive in any device at any time.
In his session at 19th Cloud Expo, Mark Allen, General Manager of... Dec. 7, 2016 03:30 PM EST Reads: 1,774 | By Pat Romanski  "IoT is going to be a huge industry with a lot of value for end users, for industries, for consumers, for manufacturers. How can we use cloud to effectively manage IoT applications," stated Ian Khan, Innovation & Marketing Manager at Solgeniakhela, in this SYS-CON.tv interview at @ThingsExpo, held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Dec. 7, 2016 02:30 PM EST Reads: 4,313 | By Elizabeth White  Successful digital transformation requires new organizational competencies and capabilities. Research tells us that the biggest impediment to successful transformation is human; consequently, the biggest enabler is a properly skilled and empowered workforce. In the digital age, new individual and collective competencies are required.
In his session at 19th Cloud Expo, Bob Newhouse, CEO and founder of Agilitiv, drew together recent research and lessons learned from emerging and established compa... Dec. 7, 2016 02:30 PM EST Reads: 941 | By Elizabeth White  Bert Loomis was a visionary. This general session will highlight how Bert Loomis and people like him inspire us to build great things with small inventions. In their general session at 19th Cloud Expo, Harold Hannon, Architect at IBM Bluemix, and Michael O'Neill, Strategic Business Development at Nvidia, discussed the accelerating pace of AI development and how IBM Cloud and NVIDIA are partnering to bring AI capabilities to "every day," on-demand. They also reviewed two "free infrastructure" pr... Dec. 7, 2016 02:15 PM EST Reads: 1,117 | By Carmen Gonzalez  Financial Technology has become a topic of intense interest throughout the cloud developer and enterprise IT communities.
Accordingly, attendees at the upcoming 20th Cloud Expo at the Javits Center in New York, June 6-8, 2017, will find fresh new content in a new track called FinTech. Dec. 7, 2016 02:15 PM EST Reads: 2,199 | By Liz McMillan  Information technology is an industry that has always experienced change, and the dramatic change sweeping across the industry today could not be truthfully described as the first time we've seen such widespread change impacting customer investments. However, the rate of the change, and the potential outcomes from today's digital transformation has the distinct potential to separate the industry into two camps: Organizations that see the change coming, embrace it, and successful leverage it; and... Dec. 7, 2016 02:15 PM EST Reads: 3,387 | By Liz McMillan  "Dice has been around for the last 20 years. We have been helping tech professionals find new jobs and career opportunities," explained Manish Dixit, VP of Product and Engineering at Dice, in this SYS-CON.tv interview at 19th Cloud Expo, held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Dec. 7, 2016 01:45 PM EST Reads: 1,116 | By Pat Romanski  Extracting business value from Internet of Things (IoT) data doesn’t happen overnight. There are several requirements that must be satisfied, including IoT device enablement, data analysis, real-time detection of complex events and automated orchestration of actions. Unfortunately, too many companies fall short in achieving their business goals by implementing incomplete solutions or not focusing on tangible use cases.
In his general session at @ThingsExpo, Dave McCarthy, Director of Products... Dec. 7, 2016 12:30 PM EST Reads: 847 | By Liz McMillan  "At ROHA we develop an app called Catcha. It was developed after we spent a year meeting with, talking to, interacting with senior citizens watching them use their smartphones and talking to them about how they use their smartphones so we could get to know their smartphone behavior," explained Dave Woods, Chief Innovation Officer at ROHA, in this SYS-CON.tv interview at 19th Cloud Expo, held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Dec. 7, 2016 12:30 PM EST Reads: 639 | By Liz McMillan  SYS-CON Events has announced today that Roger Strukhoff has been named conference chair of Cloud Expo and @ThingsExpo 2017 New York.
The 20th Cloud Expo and 7th @ThingsExpo will take place on June 6-8, 2017, at the Javits Center in New York City, NY.
"The Internet of Things brings trillions of dollars of opportunity to developers and enterprise IT, no matter how you measure it," stated Roger Strukhoff. "More importantly, it leverages the power of devices and the Internet to enable us all to im... Dec. 7, 2016 12:30 PM EST Reads: 736 | By Liz McMillan  We are always online. We access our data, our finances, work, and various services on the Internet. But we live in a congested world of information in which the roads were built two decades ago. The quest for better, faster Internet routing has been around for a decade, but nobody solved this problem. We’ve seen band-aid approaches like CDNs that attack a niche's slice of static content part of the Internet, but that’s it. It does not address the dynamic services-based Internet of today. It does... Dec. 7, 2016 12:00 PM EST Reads: 1,124 | By Liz McMillan  "ReadyTalk is an audio and web video conferencing provider. We've really come to embrace WebRTC as the platform for our future of technology," explained Dan Cunningham, CTO of ReadyTalk, in this SYS-CON.tv interview at WebRTC Summit at 19th Cloud Expo, held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Dec. 7, 2016 11:30 AM EST Reads: 690 | By Carmen Gonzalez  20th Cloud Expo, taking place June 6-8, 2017, at the Javits Center in New York City, NY, will feature technical sessions from a rock star conference faculty and the leading industry players in the world.
Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Dec. 7, 2016 11:15 AM EST Reads: 2,298 | By Carmen Gonzalez  WebRTC is the future of browser-to-browser communications, and continues to make inroads into the traditional, difficult, plug-in web communications world. The 6th WebRTC Summit continues our tradition of delivering the latest and greatest presentations within the world of WebRTC. Topics include voice calling, video chat, P2P file sharing, and use cases that have already leveraged the power and convenience of WebRTC. Dec. 7, 2016 10:30 AM EST Reads: 1,713 |
|
|
|
|
|
|