Cron tasks as Interactors
How OOP benefits the scripts in your application
Interactors
I’m going to start with an overview of a very simple yet deceptively powerful object-oriented design pattern: the interactor. An interactor is simply a class with one public callable method, neatly and obviously encapsulating its single responsibility.
class MyInteractor
  # The class method is a convenience: MyInteractor.call instead of MyInteractor.new.call
  def self.call
    new.call
  end

  # The One True Public Method
  def call
    # ... do the thing, call private methods, etc ...
  end
end
A nice feature of interactors is that they are a dead-simple pattern that makes it hard to violate the single responsibility principle. What in other contexts can be a confusing mess of code blocks and clashing method names becomes a set of private methods in an interactor. Applied well, this can be very freeing.
Now, let’s skip the bikeshedding about how Interactors are either The Best Thing To Happen To Code or The Stupidest Idea Because Of X Y And Z. There’s a use case for well-designed interactors that I find incredibly powerful in an increasingly large monolithic application: scheduled tasks / scripts. For example, running a backup, cleanup, or data resync task once a day, once an hour, or whatever. These might be cron tasks in a more monolithic or legacy app, or some time-triggered cloud function.
Time Distribution
Let’s talk about cron in a monolith. In particular, I’ll use syntax consistent with the whenever gem, which lets you write crontab files with a nicer Ruby DSL:
every 15.minutes do
rake 'gates:open_the_gates'
end
every 1.hour do
rake 'gates:close_the_gates'
end
I’ve already written about instrumenting rake tasks, but I just needed to take it further. Specifically, my goals included:
- Creating a pattern where cron tasks could very easily be wired over to a Resque worker instead of run locally on the machine running cron
- Making it easy to write tests for our cron-invoked rake tasks
- Adding additional instrumentation, such as count, failure, and duration reporting, without continuing my ugly hacks into the Rakefile as I did in the previous post.
- Supporting tasks that could only have a single instance running (across any number of processes and machines)
- Making it easy to mark a particular task as “safe to run outside of prod” – because of, well, legacy, we had some unsafe tasks that we guarded by simply not installing the cron schedule in staging. This, of course, increased staging-production divergence, which was causing a lot of pain.
In particular, a big reason I got buy-in for the project was the first bullet: we had a single machine running an increasingly large and memory-intensive set of cron tasks that had started to crash said machine. There wasn’t an obvious path to having these run in a more distributed way without tons of refactoring and rewriting.
OOP to the rescue
Step 1 was to create a base Interactor class, which I called ApplicationInteractor in keeping with ApplicationRecord, ApplicationController, and such. I was going to migrate code, largely cut-and-paste, from rake tasks into subclasses of ApplicationInteractor:
class ApplicationInteractor
  def self.call(*args)
    new(*args).call
  end

  # Simply a placeholder for future development of shared init behavior
  def initialize(*); end

  def call
    raise NoMethodError, "#call is not defined for #{self.class}; please override it."
  end
end
With this in place, the following rake task

task :do_the_thing => :environment do
  big_long_code_vomit
end

becomes

class DoTheThing < ApplicationInteractor
  def call
    big_long_code_vomit
  end
end
This allows you to (1) very easily write tests for DoTheThing#call and (2) start refactoring big_long_code_vomit into private methods of the interactor. Already a win for code maintainability!
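To make the testability point concrete, here’s a sketch of how small such a test gets. Everything here is a hypothetical stand-in for illustration: a trimmed-down base class and an invented FormatGreeting interactor.

```ruby
# Minimal stand-in for the real base class
class ApplicationInteractor
  def self.call(*args)
    new(*args).call
  end

  def initialize(*); end
end

# Hypothetical interactor: the old inline mess refactored into
# small, named private methods
class FormatGreeting < ApplicationInteractor
  def initialize(name)
    @name = name
  end

  def call
    "#{salutation}, #{@name}!"
  end

  private

  def salutation
    "Hello"
  end
end

FormatGreeting.call("Ada")  # => "Hello, Ada!"
```

Since the whole public surface is one method, the test is a single expectation on the return value of call.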
But we ain’t even started yet.
OOP to the Resque
The next task was aimed at our first requirement bullet point: pointing cron tasks at our background job processing system. We use Resque, but what follows could just as easily be done for Sidekiq, ActiveJob, or many others. If you know these systems and have a keen eye, you might have already noticed something – they already use interactors, usually with a method called perform. For Resque, this is a class method, but the concept is the same.
So for Resque, all we’d really have to do is alias call as perform and it would work:
class ApplicationInteractor
  def self.call(*args)
    new(*args).call
  end

  # Alias the class method on the singleton class; a bare `alias`
  # in the class body would target instance methods instead
  class << self
    alias_method :perform, :call
  end

  # ...
end
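It’s easy to get this aliasing subtly wrong, so here’s a self-contained sanity check of the pattern. PingTask is a made-up class, not from the real app:

```ruby
class PingTask
  def self.call(*args)
    new(*args).call
  end

  # Singleton-class alias: Resque-style `perform` dispatches to `call`
  class << self
    alias_method :perform, :call
  end

  def initialize(*); end

  def call
    :pong
  end
end

PingTask.perform  # => :pong
```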
However, we had already abstracted Resque worker classes (for instrumentation, db-backed tracking, and a few other more idiosyncratic reasons) to have an instance method called process:
class ApplicationResqueWorker
  def self.perform(*args)
    job = persist_job(*args)
    report_start(job)
    burn_the_phoenix if !deploy_version_correct? # A story for another time
    new.process(*args)
    report_success(job)
  rescue StandardError => err
    report_error(job, err)
  end

  # Instance method API! Results in better classes.
  def process(*args)
    raise NoMethodError, "`process` is not defined for #{self.class}; please override it."
  end
end
Thus we have to implement something more like we would for Sidekiq, with its instance method perform API. This is just as well, because I prefer the interactor classes being defined to be distinct from the Resque classes themselves; this keeps their concerns – performing the task vs. managing job / queue state – separate.
So for each interactor, e.g. DoTheThing, I’d automatically define DoTheThing::ResqueWorker using Ruby’s inherited hook:
class ApplicationInteractor
  class << self
    def inherited(klass)
      super
      build_resque_class(klass)
    end

    private

    def build_resque_class(klass)
      # Create a new class that works as a Resque worker
      resque_class = Class.new(ApplicationInteractor::BaseResqueClass)
      # Save a pointer to the interactor class as data in the resque class
      resque_class.interactor_class = klass
      # Save the resque class as self::ResqueWorker
      klass.const_set(:ResqueWorker, resque_class)
    end
  end
end
Our ApplicationInteractor::BaseResqueClass is the basic adapter around our existing Resque abstraction. It defines the requisite process method that calls our original interactor.
class ApplicationInteractor::BaseResqueClass < ApplicationResqueWorker
  # Base class that is used to construct the desired resque worker
  class << self
    # Store `interactor_class` as class data
    attr_accessor :interactor_class
  end

  def process(*args)
    self.class.interactor_class.call(*args)
  end
end
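To see the inherited-hook wiring in action without any of the Resque machinery, here’s a stripped-down, self-contained sketch. BaseInteractor and DemoTask are illustrative stand-ins, not the real classes:

```ruby
class BaseInteractor
  def self.call(*args)
    new(*args).call
  end

  def initialize(*); end

  # Stand-in for BaseResqueClass, minus all the Resque behavior
  class BaseWorker
    class << self
      attr_accessor :interactor_class
    end

    def process(*args)
      self.class.interactor_class.call(*args)
    end
  end

  class << self
    def inherited(klass)
      super
      # Build a worker class per subclass and store it as a nested constant
      worker = Class.new(BaseWorker)
      worker.interactor_class = klass
      klass.const_set(:ResqueWorker, worker)
    end
  end
end

class DemoTask < BaseInteractor
  def call
    "did the thing"
  end
end

# The nested worker class was created automatically:
DemoTask::ResqueWorker.new.process  # => "did the thing"
```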
Thus DoTheThing::ResqueWorker is a very simple class: it’s a Resque worker (inheriting from ApplicationResqueWorker, with all of our bells and whistles) which calls our interactor with the same args.
Get in line
The remaining piece for our MVP is to hook it up in our schedule.rb file. To do this with whenever, we’ll define a few new custom job types:
# For tasks that cannot withstand any queueing whatsoever
job_type :call_interactor, "cd :path && bin/rails runner ':task.call' :output"
# For tasks that can accept their default queue
job_type :enqueue_interactor, "cd :path && bin/rails runner ':task::ResqueWorker.enqueue' :output"
# For tasks where cron wants to override the queue
job_type :enqueue_interactor_to, "cd :path && bin/rails runner ':task::ResqueWorker.enqueue_to(\":queue\")' :output"
Note that enqueue and enqueue_to come from our ApplicationResqueWorker class, details omitted. Also note: I found that rails runner took a non-negligibly higher amount of memory than bundle exec rake. For our 450 cron tasks this was important, so I rewrote these as their own rake tasks, but we needn’t go into detail.
Now, we can hook up our cron schedule as follows:
every 15.minutes do
enqueue_interactor 'Gates::OpenTheGates'
end
every 1.hour do
enqueue_interactor 'Gates::CloseTheGates'
end
Music to my fingers
So far, we’ve created a pattern for interactors in our application that allows us to write tests and better encapsulate the logic. We’ve made it easy to call or enqueue these interactors. We added a hook in our cron schedule generator to make it easy to migrate cron tasks to Resque. Those are already some pretty big wins!
And now that we have this pattern, we have the full benefit of OOP at our disposal – let’s keep going! Let’s add a feature to our base interactor that emits metrics: run count, success, failure, and run time.
Ok, I’m actually going to make this a module for opt-in, but you could do this either way. As a module, we’ll use ActiveSupport::Concern to easily define class methods such as an override to call. (I’ve simplified this slightly to omit return value tracking and more.)
module ApplicationInteractor::Instrumentation
  METRIC_BASE = "application.tasks"

  extend ActiveSupport::Concern

  module ClassMethods
    def call(*args)
      emit_count("count")
      DevOps.emit_timer_metrics(METRIC_BASE, tags: metrics_tags) do
        super
      end
      emit_count("success")
    rescue StandardError
      emit_count("failure")
      raise
    end

    private

    def metrics_tags
      { task_name: self.name }
    end

    def emit_count(name)
      DevOps.emit_metric("#{METRIC_BASE}.#{name}", 1, type: "count", tags: metrics_tags)
    end
  end
end
Without going into detail about the implementation of DevOps, we now get count, success, failure, and run_time metrics in every ApplicationInteractor that includes this module, without the authoring developer needing to think about it beyond knowing they can include ApplicationInteractor::Instrumentation. Sweet!
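The reason a ClassMethods override can `super` into the base implementation is Ruby’s method lookup for extended modules. Here’s a hand-rolled, self-contained sketch of the same shape; MetricsHook, TinyBase, and TinyTask are invented names, and an events array stands in for real metric emission:

```ruby
module MetricsHook
  # Wire the hook by hand instead of via ActiveSupport::Concern
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def call(*args)
      events << "count"
      result = super # falls through to the base class's `call`
      events << "success"
      result
    rescue StandardError
      events << "failure"
      raise
    end

    # Stand-in for DevOps.emit_metric: just record what was emitted
    def events
      @events ||= []
    end
  end
end

class TinyBase
  def self.call(*args)
    new(*args).call
  end

  def initialize(*); end
end

class TinyTask < TinyBase
  include MetricsHook

  def call
    :done
  end
end

TinyTask.call    # => :done
TinyTask.events  # => ["count", "success"]
```

The extended module sits between TinyTask’s singleton class and TinyBase’s, which is exactly what lets the wrapper run first and then `super` into the real work.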
Production-ready
One of the other requirements was to mark specific tasks as safe or unsafe to run in production. We have almost 500 unique cron tasks (yikes), so deciding on a case-by-case basis was not in the cards at this time. Instead, the path forward was to mark all interactors that came from cron tasks as unsafe, and one-by-one turn them on as needed or able. With our OOP structure, this is stupid easy.
Every cron-task-based interactor gets one more module: ApplicationInteractor::LegacyCron
class DoTheCronThing < ApplicationInteractor
  include ApplicationInteractor::Instrumentation
  # Prepended so the guard runs ahead of this class's own #call
  prepend ApplicationInteractor::LegacyCron

  # ...
end
It’s just an ENV-overridable guard clause:
module ApplicationInteractor::LegacyCron
  # NB: must be prepended, not included, so that this guard sits in
  # front of the subclass's own #call in the ancestor chain
  def call(*args)
    return if !Rails.env.production? && !ENV["FORCE_LEGACY_CRON"]
    super
  end
end
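Why prepend matters: a module that is merely included sits behind the class in the ancestor chain, so a subclass that defines its own call shadows the guard entirely. A minimal self-contained sketch, where LegacyCronGuard and CronThing are made-up names and production? stands in for Rails.env.production?:

```ruby
module LegacyCronGuard
  def call(*args)
    # Guard clause runs before the class's own #call thanks to prepend
    return :skipped if !production? && !ENV["FORCE_LEGACY_CRON"]
    super
  end

  def production?
    false # stand-in; real code would ask Rails.env
  end
end

class CronThing
  prepend LegacyCronGuard

  def call
    :ran
  end
end

ENV.delete("FORCE_LEGACY_CRON")
CronThing.new.call  # => :skipped

ENV["FORCE_LEGACY_CRON"] = "1"
CronThing.new.call  # => :ran
```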
We could do something more useful in the conditional. But doing nothing preserves existing behavior, and I’m not going to spend much more time thinking about adapting legacy code for non-production environments. So that’s it!
Locked down
The last requirement was the ability to run tasks as singletons – no two instances of the same task allowed to run at the same time. This might be a destructive order-polling loop where race conditions are best avoided. In single-instance cron world, we used lockrun to do this. We’ll accomplish this with, you guessed it, another module. We’ll use redis to store our locks, backed by the redlock gem.
This is a bit simplified; in production I wrote a root-namespace LockRun class that wraps redlock without any notion of interactors, and used that class here. For simplicity, I’m smashing it all together and omitting some niceties like more detailed return values.
module ApplicationInteractor::LockRun
  extend ActiveSupport::Concern

  module ClassMethods
    def lock_expiry_in_seconds
      1.hour.to_i
    end

    # Use the class name for the lock key
    def lock_key_name
      name
    end

    def call(*args)
      lock_manager = Redlock::Client.new([REDIS_CONFIG])
      lock_manager.lock(lock_key_name, lock_expiry_in_seconds * 1000, retry_count: 0) do |locked|
        if locked
          super
        elsif ancestors.include?(ApplicationInteractor::Instrumentation)
          emit_count("locked")
        end
      end
    end
  end
end
Some really nice features of this system:
- Since we’re using redis, all of our application instances and workers respect the same lock.
- Since we’re using the class name as a key, no two instances of the same interactor will run simultaneously.
- Since we’re using expiring redis keys, we get a timeout on our lock in case a machine goes MIA. Note that any uncaught errors are handled by redlock to release the lock; it’s only if a machine goes poof that we might leave a dangling lock. And machines do go poof.
- We now get metrics for each lockout. This is a nice bonus feature that we didn’t have before, and allows us to easily track when a LockRun process is too slow for its own schedule.
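The acquire-or-skip control flow is easy to see without redis at all. Here’s an in-process sketch using a plain Set of held keys; redlock gives you the same semantics, but shared across every machine and with key expiry as the crash safety net:

```ruby
require "set"

# In-process stand-in for the redis lock: a set of held keys
HELD_LOCKS = Set.new

def with_lock(key)
  if HELD_LOCKS.add?(key) # add? returns nil if the key is already held
    begin
      yield true
    ensure
      HELD_LOCKS.delete(key)
    end
  else
    yield false
  end
end

RESULTS = []
with_lock("DoTheThing") do |locked|
  RESULTS << (locked ? :ran : :locked_out)
  # A second, overlapping invocation of the same task is locked out:
  with_lock("DoTheThing") do |inner|
    RESULTS << (inner ? :ran : :locked_out)
  end
end
RESULTS  # => [:ran, :locked_out]
```

The yield-with-a-boolean shape mirrors redlock’s block API, which is why the interactor module above can branch between running the task and emitting a lockout metric.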
That’s all the time we have
To sum up:
- We created a base class ApplicationInteractor to standardize the use of task-oriented interactors in our application.
- We moved our rake tasks into interactor classes. This allows them to be testable, refactorable, and broken into more private methods for better maintainability.
- We created an easy way to use these interactors from cron, either directly or kicked off to a background job processor such as Resque. This allows our cron instance to lighten its load, putting the heavy lifting on our more scalable job processing machines.
- We added an easy module to instrument our interactors. We chose opt-in over opt-out or required, but any of these are just as easy.
- We added a module for our existing cron tasks to avoid running them outside of prod until we’ve had time to deem them safe and/or necessary.
- We added a module to robustly lock invocations of a given interactor class across all of our machines so that two instances could not run at the same time.
Not bad for a little OOP.