Readonly Replicas

Since Rails 6, ActiveRecord has provided a great way to connect to a readonly replica database instance. In brief, with a simple database configuration, ApplicationRecord configuration, and block of code, you can execute all database connections within the block connected to a readonly replica, allowing you to spread your query load among database hosts.

database.yml:

production:
  primary:
    <<: *default_config
  replica:
    <<: *default_config
    hostname: replica.db_host.foo
    replica: true

ApplicationRecord:

class ApplicationRecord < ActiveRecord::Base
  ...
  connects_to database: { writing: :primary, reading: :replica }
  ...
end

Code:

ApplicationRecord.connected_to(role: "reading") do
  ... code here connects to your configured replica ...
end

However, there is a major limitation to this approach, especially when adapting legacy code. In particular, legacy code where it may be difficult to separate the read and write concerns, often because it’s a core engine of a product that has accumulated lots of little add-ons and features over the years, AKA a prime candidate to move over to a replica as a quick win to distributing your load.

In a simplified example (yes, really, it’s simplified), we had a market page where users could view and claim jobs. This page had lots of little features, such as filters, forecasts, summaries, and all sorts of bells and whistles the product team dreamed up over a decade of operation. As one of the most visited pages of the app with the heaviest set of queries, shifting the core query for what jobs were available became an easy target to move to a read replica.

# cartoonized example
def index
  # Get the jobs (READ)
  all_jobs = get_all_available_jobs
  # Apply filter conditions and save filter settings (READ, WRITE)
  filtered_jobs = apply_filters(jobs, filters)
  # Adjust filters based on conditions (WRITE)
  auto_adjust_filters(filters, all_jobs)
  # Store analytics (WRITE)
  log_filter_effect(all_jobs, filtered_jobs, filters)
  # Create job recommendations (READ)
  recommended = recommend_jobs(filtered_jobs)
  # Store recommendations for future reuse (WRITE)
  save_recommendations(recommended)
  # Create predictions for future jobs (READ, WRITE)
  generate_job_forecast(jobs)
  # Create a summary and index the view (READ, WRITE)
  save_and_display_job_summary(filtered_jobs)
  # Paginate the jobs (READ)
  @jobs = paginate(filtered_jobs)
end

Unfortunately, with all those bells and whistles and quick MVPs, a few writes (saving profile settings, filters, logging, and instrumentation) were intermingled with the construction of the core queries. Refactoring was going to be a slog.

Target the replica only for a single query

The problem with moving the above example to a replica is that there was no single contiguous block of code that could be executed in a replica context that isn’t littered with writes. And you can’t just dart in and out of a connected_to block, as using the jobs query outside and inside of the replica blocks now just causes ActiveRecord to do double duty, executing the same basic query on both the primary and replica database connections. That’s the opposite of what we want! Plus, when a connected_to block returns a constructed query, that query is immediately executed, which is problematic if we don’t intend to execute the query until a later pagination, for example.

But why does the replica need to be so all-or-nothing? Only because that’s the mechanism ActiveRecord gives us by default.

What if, instead of a big block where all connections are executed against a replica, we just used a single stand-in model for our jobs, one that could query the jobs table just like the real thing, but itself be configured only to query the replica? As long as that model was used in a read-only context, any other model writes could happen (user profiles, instrumentation events, forecasts and summaries, saved filters) with no problem.

Really, this idea is simple enough. Let’s say our jobs model is represented by a model Job. It’s easy enough to create a subclass of Job that reconfigures its database connection, simply by pointing both the writing and reading roles at the replica. This way, the default connection (which will be writing) for this subclass will connect to the replica

class ReadOnlyJob < Job
  connects_to database: { writing: :replica, reading: :replica }
end

So … we’re done, right?

Controlling your STIs

There’s a significant hitch. The battled Rails developer will have felt the red flag go up as soon as we created a subclass of an ActiveRecord model. Single Table Inheritance (STI) in Rails works by using subclasses of a base ActiveRecord model, so any other usage of subclasses can play poorly with STI models. In our case I’m using Job to illustrate because it was already an STI model with subclasses such as EditJob and AuditJob.

There are two issues we will have to wrestle, all dealing with how any given ReadOnly subclass is supposed to act identically to its superclass as far as query generation.

  1. Our ReadOnly subclass is not another STI type, and the ReadOnly name should not be used in the query
  2. What class should should be used to instantiate a record of a ReadOnly class?

The first problem with #1 is immediately evident in any generated query. Here, the SQL engine looks for a record with type EditJob::ReadOnly, but no such records exist.

EditJob::ReadOnly.all.to_sql
# =>  "SELECT `jobs`.* FROM `jobs` WHERE `jobs`.`type` = 'EditJob::ReadOnly'"
EditJob::ReadOnly.last
# => nil

So, that’s no good. But that’s a simple fix. We just need use the same STI name as our superclass. So, on EditJob::ReadOnly’s singleton class:

  delegate :sti_name, to: :superclass

giving us

EditJob::ReadOnly.all.to_sql
# =>  "SELECT `jobs`.* FROM `jobs` WHERE `jobs`.`type` = 'EditJob'"

Yay!

EditJob::ReadOnly.last
# !!! ActiveRecord::SubclassNotFound (Invalid single-table inheritance type: EditJob is not a subclass of EditJob::ReadOnly)

Uh oh.

What’s happening here? This is issue #2. The query is correct, and returns an EditJob record. However, ActiveRecord’s STI implementation has a check that when a record (EditJob) in initialized it is in fact an instance of the STI class being used (EditJob::ReadOnly). This allows you to initialize a subclass of EditJob, such as ReallySpecificEditJob < EditJob, when querying for an EditJob (because, after all, a ReallySpecificEditJob is_a EditJob based on the rules of OOP inheritance), but won’t let you initialize a TotallyUnrelatedJob as an EditJob.

Since EditJob is not a subclass of EditJob::ReadOnly (in fact, the reverse is true), this triggers, and we get the error above.

Not to worry! We can again use the superclass’ method so that the same class names that pass for the EditJob class will pass for our subclass. This one is a private method, so we have to be a little naughty and use send:

  def find_sti_class(type_name)
    superclass.send(:find_sti_class, type_name)
  end

Now,

EditJob::ReadOnly.last
# => #<EditJob id: ..., ...>

Yay!

But we’re not actually done with issue #1. We’ve solved it for every subclass of our base class, but not the base class itself. To illustrate, consider that a query for the base class, Job, does not insert any query conditions:

Job.all.to_sql
# => "SELECT `jobs`.* FROM `jobs`"

This is more efficient than adding in a query for all possible Job subclasses. ActiveRecord accomplishes this by excluding type conditions if this class is the top-level base class.

But Job::ReadOnly is not the top level base class.

Job::ReadOnly.all.to_sql
# => "SELECT `jobs`.* FROM `jobs` WHERE `jobs`.`type` = 'Job'"

As we illustrated above, our fix tells ActiveRecord to use Job and not Job::ReadOnly in the query constraint. However, in this case, we want no constraints, just like im our base class. The way ActiveRecord decides this is via the descends_from_active_record? method, which returns true for the base class and false for every other class.

So, once again, we will use the superclass method.

  delegate :descends_from_active_record?, to: :superclass

Now,

Job::ReadOnly.all.to_sql
# => "SELECT `jobs`.* FROM `jobs`"

There’s one more case to consider. We’ve solved for the base class and concrete subclasses. But abstract subclasses that themselves have subclasses are still unhappy:

ManualJob.all.to_sql
# => "SELECT `jobs`.* FROM `jobs` WHERE `jobs`.`type` IN ('ManualJob', 'ManualJob', 'EditJob', 'EditJob', 'AuditJob' 'AuditJob')"
ManualJob::ReadOnly.all.to_sql
=> "SELECT `jobs`.* FROM `jobs` WHERE `jobs`.`type` = 'ManualJob'"

This occurs because once ActiveRecord has decided to add a type condition, it does so by adding the types of all known descendants of the queried class as we can see in the first query. While ManualJob has two subclasses, EditJob and AuditJob, the ManualJob::ReadOnly class has no descendants. So the classes don’t make it into the query.

As a related problem, you can see the first query duplicates EditJob and AuditJob in the query. This is because EditJob::ReadOnly is a descendant of ManualJob, and its sti_class as we’ve overridden above is just EditJob. The first EditJob in the query corresponds to the true EditJob class, and the second corresponds to the EditJob::ReadOnly subclass.

Both issues are solved with the same solution: override the type_condition method. The actual list of type names to use is buried inside this method, so we have to copy over the whole thing and override the parts we need. This is definitely the dirtiest part of this work, but it’ll do:

  def type_condition(table = arel_table)
    # Same as original
    sti_column = table[inheritance_column]
    # Grab descendants from superclass instead of self, remove dups
    sti_names = [superclass] + superclass.descendants.map(&:sti_name).uniq
    # Same as original
    predicate_builder.build(sti_column, sti_names)
  end

Finally,

ManualJob::ReadOnly.all.to_sql
=> "SELECT `jobs`.* FROM `jobs` WHERE `jobs`.`type` IN ('ManualJob', 'EditJob', 'AuditJob')"

The abstraction

Putting it all together, we have four class methods and a model database configuration to override. To do this in the abstract, we’ll override the methods in a module, and extend that module as well as define the database configuration in an inherited hook of our root ApplicationModel class. This way, every model class we define gains its very own ReadOnly nested class that connects to the replica.

Note that in order to use connects_to, we need a named class. Thus we will use class_eval to create the nested class rather than const_set with Class.new.

class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  connects_to database: { writing: :primary, reading: :replica }

  # HiddenFromSTI: Allow a subclass to behave identically to its superclass from
  # the perspective of ActiveRecord STI. This module is useful for defining a
  # ReadOnly subclass of STI models
  module HiddenFromSTI
    # Use same STI name as superclass (`type` field value in the table)
    # The allows `Foo::Readonly` to load records with sty type `"Foo"`
    delegate :sti_name, to: :superclass

    # ActiveRecord will not add type conditions to the base class. It does this via
    # `descends_from_active_record`; use superclass' definition to accomplish the same.
    # I.e. Job::ReadOnly should not add and `type` constraints, as `Job` does not.
    delegate :descends_from_active_record?, to: :superclass

    # Use the superclass method of class resolution to avoid an error where a record with
    # type == "Foo" is not a subclass of Foo::ReadOnly.
    # This also instantiates a `Foo::ReadOnly` record as a regular `Foo`
    def find_sti_class(type_name)
      superclass.send(:find_sti_class, type_name)
    end

    # Override default type conditions to:
    # 1. Include subclasses of the real class in the query
    #    -> For this, we grab the descendants of the superclass instead of self
    # 2. Exclude ReadOnly classes from the query
    #    -> For this, we de-dup the names list since we've already overridden `sti_name`
    # Original source: https://github.com/rails/rails/blob/7c70791470fc517deb7c640bead9f1b47efb5539/activerecord/lib/active_record/inheritance.rb#L306
    def type_condition(table = arel_table)
      # Same as original
      sti_column = table[inheritance_column]
      # Grab descendants from superclass instead of self, remove dups
      sti_names = ([superclass] + superclass.descendants).map(&:sti_name).uniq
      # Same as original
      predicate_builder.build(sti_column, sti_names)
    end
  end

  class << self
    def inherited(model)
      super
      # Stop infinite recursion: don't trigger this inherited hook on the
      # definition of our ReadOnly class
      return if model.name&.demodulize == "ReadOnly"

      # NOTE: we must define a named class since `connects_to` does not support anonymous classes
      # Therefore, we cannot use `Class.new(...)`, and must instead use the `class` keyword
      model.class_eval <<-BLOCK, __FILE__, __LINE__ + 1
        class self::ReadOnly < self
          extend ApplicationRecord::HiddenFromSTI
          connects_to database: { writing: :replica, reading: :replica }
        end
      BLOCK
    end
  end
end

Now, our availble jobs index can construct a replica-bound jobs query while writing other models

jobs = Job::ReadOnly.available
user.profile.update!(last_seen_job_count: jobs.count)
jobs.where(name: filter).count

but we can confirm we’re connected to the replica configuration by attempting an update action:

jobs.update_all(name: "bar")
# !!! ActiveRecord::ReadOnlyError (Write query attempted while in readonly mode: UPDATE `jobs` SET `jobs`.`name` = 'bar' WHERE `jobs`.`type` IN ('Job', 'ManualJob', 'EditJob', 'AudiJob') AND `jobs`.`state` = 'available')

Copy that.