Fetching ActiveStorage attributes without N+1 queries
In this post I want to describe how to fetch attributes from files uploaded via ActiveStorage, without causing N+1 queries. I hope this is relevant to anyone running into the same issues while building a Rails application.
Setup
Users can upload multiple PDF documents. Each PDF document is uploaded via Rails ActiveStorage and stored in S3. The user can see a list of attributes of each PDF in a typical index view. The list contains data extracted from each PDF after it was analysed. We store that data in an outputs table.
# app/models/upload.rb
class Upload < ApplicationRecord
  has_many_attached :pdfs
  has_many :outputs
end

# app/models/output.rb
class Output < ApplicationRecord
  belongs_to :upload
end
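Note that the application itself contains no model files for the active_storage tables that Rails creates. The models (ActiveStorage::Attachment and ActiveStorage::Blob) ship with the framework. By default the tables look as follows in the schema: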
# db/schema.rb
create_table "active_storage_attachments", force: :cascade do |t|
  t.string "name", null: false
  t.string "record_type", null: false
  t.bigint "record_id", null: false
  t.bigint "blob_id", null: false
  t.datetime "created_at", null: false
  t.index ["blob_id"], name: "index_active_storage_attachments_on_blob_id"
  t.index ["record_type", "record_id", "name", "blob_id"], name: "index_active_storage_attachments_uniqueness", unique: true
end

create_table "active_storage_blobs", force: :cascade do |t|
  t.string "key", null: false
  t.string "filename", null: false
  t.string "content_type"
  t.text "metadata"
  t.string "service_name", null: false
  t.bigint "byte_size", null: false
  t.string "checksum"
  t.datetime "created_at", null: false
  t.index ["key"], name: "index_active_storage_blobs_on_key", unique: true
end
The outputs index view lists attributes on the outputs table, but should also show the filename of the PDF and link to it.
The filename lives on active_storage_blobs by default.
Furthermore, the index view should link to the uploaded PDF document, so that the user can download it. That requires the blob record for the uploaded PDF, so it can be used in the rails_blob_path helper:
rails_blob_path(output.blob, disposition: 'attachment')
Challenge
The challenge is how to get from the output to the active_storage_blob record to generate the link to the PDF and display the filename.
As a first step I stored the blob_id on each output record.
Outputs were created after the upload of PDFs had finished, through an after_commit callback. This allowed assigning the blob_id of the pdf_attachment to the output record.
# app/models/upload.rb
def create_outputs
  pdfs.each do |pdf|
    Output.create(upload: self, blob_id: pdf.blob_id)
  end
end
# db/schema.rb
create_table "outputs", force: :cascade do |t|
  t.bigint "blob_id"
  t.bigint "upload_id"
  t.string "title"
  t.integer "page_count"
  t.datetime "created_at", null: false
  t.datetime "updated_at", null: false
  t.index ["upload_id"], name: "index_outputs_on_upload_id"
end
The link between output and active_storage_blob records now exists. It made it possible to get the filename of the PDF for each output with a filename method on output:
# app/models/output.rb
def filename
  ActiveStorageService.find_filename_from_blob_id(blob_id)
end
The ActiveStorageService had a method which wrapped a raw SQL query to fetch the filename:
# app/services/active_storage_service.rb
def self.find_filename_from_blob_id(blob_id)
  find_filename_from_blob_id_sql = "
    SELECT filename FROM active_storage_blobs
    WHERE id = '#{blob_id}';"
  ActiveRecord::Base.connection.execute(find_filename_from_blob_id_sql).values.flatten.first
end
This worked. However, fetching the filename this way triggered one additional query per output, painfully slowing down the index page. A classic N+1 scenario. The repeated queries can be observed in the Rails server logs. The bullet gem can further help debug the issue.
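To make the cost of the per-output lookup concrete, here is a minimal sketch in plain Ruby with no Rails or database involved. FakeBlobTable, OutputRow and filenames_for_index are hypothetical names invented for this illustration; the fake table simply counts how often it is asked for a filename.

```ruby
# Hypothetical in-memory stand-in for the active_storage_blobs table.
# It counts lookups instead of talking to a real database.
class FakeBlobTable
  attr_reader :query_count

  def initialize(filenames_by_id)
    @filenames_by_id = filenames_by_id
    @query_count = 0
  end

  # Plays the role of "SELECT filename ... WHERE id = ?": one call, one query.
  def find_filename(blob_id)
    @query_count += 1
    @filenames_by_id[blob_id]
  end
end

# Hypothetical stand-in for a row of the outputs table.
OutputRow = Struct.new(:id, :blob_id)

# Mirrors the index view calling output.filename for every row.
def filenames_for_index(outputs, blob_table)
  outputs.map { |output| blob_table.find_filename(output.blob_id) }
end

blob_table = FakeBlobTable.new({ 1 => "a.pdf", 2 => "b.pdf", 3 => "c.pdf" })
outputs = [OutputRow.new(10, 1), OutputRow.new(11, 2), OutputRow.new(12, 3)]

filenames = filenames_for_index(outputs, blob_table)
puts filenames.inspect
puts "blob lookups: #{blob_table.query_count}"
```

The number of blob lookups grows with the number of outputs: three rows cause three lookups, on top of the one query that loaded the outputs in the first place.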
Solution
ActiveStorage tables are generated with a Rails command, but no model files for them show up in the application, so they fly under the radar. The models do exist inside the framework, though. In the ActiveStorage documentation, attributes such as filename and content_type are accessed through the model of the record_type, the upload model in our case.
The solution was to build a relationship between Output and ActiveStorage::Blob. That way the filename could be accessed through the output.blob relationship.
# app/models/output.rb
belongs_to :blob, class_name: 'ActiveStorage::Blob'
# ...
def filename
  blob.filename.to_s
end
We just need to make sure to eager load blob when fetching outputs. This is achieved by using includes in the ActiveRecord query.
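# app/controllers/outputs_controller.rb
class OutputsController < ApplicationController
  def index
    # Without a block, find_each returns an Enumerator; the view iterates it in batches.
    # The order: option requires Rails 6.1 or newer.
    @outputs = Output.not_exported.includes(:blob).find_each(batch_size: 100, order: :desc)
  end
end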
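Conceptually, eager loading replaces the per-row lookups with a single batched lookup whose results are matched up in memory. The following plain-Ruby sketch illustrates that idea only; it is not how Active Record implements includes internally, and BatchBlobTable, PreloadedOutput and preload_filenames are hypothetical names made up for the example.

```ruby
# Hypothetical in-memory stand-in for the active_storage_blobs table.
class BatchBlobTable
  attr_reader :query_count

  def initialize(filenames_by_id)
    @filenames_by_id = filenames_by_id
    @query_count = 0
  end

  # Plays the role of "SELECT ... WHERE id IN (...)": many ids, one query.
  def where_ids(blob_ids)
    @query_count += 1
    @filenames_by_id.slice(*blob_ids)
  end
end

# Hypothetical stand-in for an outputs row with a slot for the loaded filename.
PreloadedOutput = Struct.new(:id, :blob_id, :filename)

# Collect the foreign keys, fetch all blobs at once, then match them up in memory.
def preload_filenames(outputs, blob_table)
  filenames_by_id = blob_table.where_ids(outputs.map(&:blob_id).uniq)
  outputs.each { |output| output.filename = filenames_by_id[output.blob_id] }
end

blob_table = BatchBlobTable.new({ 1 => "a.pdf", 2 => "b.pdf", 3 => "c.pdf" })
outputs = [PreloadedOutput.new(10, 1), PreloadedOutput.new(11, 2), PreloadedOutput.new(12, 3)]

preload_filenames(outputs, blob_table)
puts outputs.map(&:filename).inspect
puts "blob lookups: #{blob_table.query_count}"
```

However many outputs are passed in, the fake table is queried once. With find_each, the real blob query would be expected to run once per batch of 100 outputs rather than once per output.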