Using EagerPreload in Pop
Introduction
We use Pop as our ORM (object-relational mapping) tool for querying the database. Pop provides the
ability to eagerly fetch associations via its Eager method. However, the Eager method is subject to
the "n+1" problem, where each association is loaded via a separate query. For Pop queries that
return records that each have an eagerly loaded tree of associated data, the number of SQL queries
executed as a result can be substantial.
Starting with version 5.1, Pop can mitigate the "n+1" problem via a new EagerPreload method.
Using EagerPreload, Pop fetches the requested associations across all parent records rather than
doing one at a time. This reduces the number of queries sent to the database at the expense of
doing more computation on the Go side. In many situations, this can be a reasonable tradeoff that
provides better overall performance.
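The difference between the two loading patterns can be sketched without Pop at all. The following is a simplified model, not Pop's implementation: `fakeDB`, `Parent`, `Child`, and both loader functions are hypothetical names invented for illustration, and the "database" is an in-memory slice that only counts how many association queries it receives.

```go
package main

// Parent and Child are hypothetical stand-ins for a model and its association.
type Parent struct {
	ID       int
	Children []Child
}

type Child struct {
	ID       int
	ParentID int
}

// fakeDB is an in-memory stand-in for a database that counts the queries it receives.
type fakeDB struct {
	children []Child
	queries  int
}

// childrenOf simulates a query filtered on a single parent ID.
func (db *fakeDB) childrenOf(parentID int) []Child {
	db.queries++
	var out []Child
	for _, c := range db.children {
		if c.ParentID == parentID {
			out = append(out, c)
		}
	}
	return out
}

// childrenOfAll simulates a single query filtered on a list of parent IDs.
func (db *fakeDB) childrenOfAll(parentIDs []int) []Child {
	db.queries++
	wanted := make(map[int]bool, len(parentIDs))
	for _, id := range parentIDs {
		wanted[id] = true
	}
	var out []Child
	for _, c := range db.children {
		if wanted[c.ParentID] {
			out = append(out, c)
		}
	}
	return out
}

// loadOneAtATime models the Eager-style pattern: one association query per parent.
func loadOneAtATime(db *fakeDB, parents []Parent) {
	for i := range parents {
		parents[i].Children = db.childrenOf(parents[i].ID)
	}
}

// loadBatched models the EagerPreload-style pattern: one query for all parents,
// followed by grouping the rows in Go.
func loadBatched(db *fakeDB, parents []Parent) {
	ids := make([]int, len(parents))
	for i, p := range parents {
		ids[i] = p.ID
	}
	byParent := make(map[int][]Child)
	for _, c := range db.childrenOfAll(ids) {
		byParent[c.ParentID] = append(byParent[c.ParentID], c)
	}
	for i := range parents {
		parents[i].Children = byParent[parents[i].ID]
	}
}
```

With three parents, `loadOneAtATime` issues three association queries while `loadBatched` issues one; the grouping loop in `loadBatched` is the extra work done on the Go side.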
When to Use
If you use Eager in a query, you should also try EagerPreload and note the difference in
the number of generated queries (which should show in the log by default in development mode).
Compare the performance of both Eager and EagerPreload with representative data.
In most cases, EagerPreload should outperform Eager. Although there is an option to turn
EagerPreload on by default, there are currently some issues in Pop's implementation that could
lead to subtle bugs in MilMove (example1, example2). In some cases, associations that loaded with
Eager do not load with EagerPreload. This may not cause a failure, but may instead result in, for
instance, missing data returned from an endpoint. For now, we should consider using and testing
EagerPreload on a case-by-case basis until we feel more confident in Pop's implementation.
How to Use
Generally speaking, you should be able to take an Eager call like so:
err := db.Q().Eager(
	"PaymentRequests.PaymentServiceItems.PaymentServiceItemParams.ServiceItemParamKey",
	"MTOServiceItems.ReService",
	"MTOShipments.DestinationAddress",
	"Orders.NewDutyStation.Address",
).All(&moveTaskOrders)
and just replace it with an EagerPreload call instead:
err := db.Q().EagerPreload(
	"PaymentRequests.PaymentServiceItems.PaymentServiceItemParams.ServiceItemParamKey",
	"MTOServiceItems.ReService",
	"MTOShipments.DestinationAddress",
	"Orders.NewDutyStation.Address",
).All(&moveTaskOrders)
In theory, the resulting slice in moveTaskOrders should be identical in both cases. In practice,
you should always verify that this is indeed the case, because of the Pop bugs noted above.
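One way to perform that verification in a test is to run the same query twice, once with Eager and once with EagerPreload, and compare the results record by record. The sketch below shows only the comparison step. `Address`, `Shipment`, and `diffResults` are hypothetical, trimmed-down names rather than MilMove code, and the helper assumes both result sets come back in the same order.

```go
package main

import (
	"fmt"
	"reflect"
)

// Address and Shipment are hypothetical, trimmed-down models used only for illustration.
type Address struct {
	City string
}

type Shipment struct {
	ID            int
	PickupAddress *Address
}

// diffResults describes every position at which the two result sets differ.
// It assumes both slices were produced by the same query in the same order.
func diffResults(eager, preload []Shipment) []string {
	var diffs []string
	if len(eager) != len(preload) {
		return append(diffs, fmt.Sprintf("length mismatch: %d vs %d", len(eager), len(preload)))
	}
	for i := range eager {
		// DeepEqual follows pointers, so a nil association and a populated one differ.
		if !reflect.DeepEqual(eager[i], preload[i]) {
			diffs = append(diffs, fmt.Sprintf("record %d differs: %+v vs %+v", i, eager[i], preload[i]))
		}
	}
	return diffs
}
```

An empty result means the two loads agree. Note that a strict comparison like this also flags a nil association on one side against a zero-valued struct on the other, so a reported difference is a prompt to investigate rather than proof of missing data.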
EagerPreload Bugs and Workarounds
Below are some of the issues we have found so far when using EagerPreload:
Foreign keys as pointers
We often represent a nullable foreign key (a belongs_to relationship in Pop terms) as a pointer. For example:
type MTOShipment struct {
	ID uuid.UUID `db:"id"`
	// ...
	PickupAddress   *Address   `belongs_to:"addresses"`
	PickupAddressID *uuid.UUID `db:"pickup_address_id"`
	// ...
}
Doing an EagerPreload on PickupAddress currently will not load the association, although Eager will.
We think we have identified the issue in Pop and have submitted a PR to fix it (update: the fix
was released in Pop 5.3.2). On earlier versions, a workaround is to use gobuffalo's nulls package
(specifically, a nulls.UUID in this case) instead of a pointer for the PickupAddressID field (note
that PickupAddress can remain a *Address). You can see examples of how this is used in Pop's
documentation. In general, if you use the nulls variants rather than pointers for optional fields,
EagerPreload correctly loads the association.
Note that moving to nulls instead of pointers will usually require changing most usages of the field
in question since it is a different type (although it's a pretty straightforward change in most cases).
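To give a feel for what that change involves, here is a minimal sketch of converting between the two representations. To stay free of third-party imports it uses local stand-ins: `uuidValue` takes the place of uuid.UUID, and `nullUUID` mirrors what we understand to be the shape of nulls.UUID (a value plus a Valid flag). The helper names `fromPointer` and `toPointer` are hypothetical.

```go
package main

// uuidValue stands in for uuid.UUID so that this sketch needs no third-party imports.
type uuidValue [16]byte

// nullUUID is a local stand-in assumed to mirror the shape of nulls.UUID:
// the value itself plus a flag saying whether it is set.
type nullUUID struct {
	UUID  uuidValue
	Valid bool
}

// fromPointer converts the old pointer representation into the nulls-style one.
func fromPointer(id *uuidValue) nullUUID {
	if id == nil {
		return nullUUID{} // Valid is false: the foreign key is NULL
	}
	return nullUUID{UUID: *id, Valid: true}
}

// toPointer converts back, for callers that still expect a pointer.
func toPointer(id nullUUID) *uuidValue {
	if !id.Valid {
		return nil
	}
	v := id.UUID // copy so the caller cannot mutate the original through the pointer
	return &v
}
```

Call sites that previously checked `id != nil` would check `id.Valid` instead, and dereferences such as `*id` become `id.UUID`.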
Foreign keys named differently from the related table
Consider this example model:
type Order struct {
	ID uuid.UUID `json:"id" db:"id"`
	// ...
	NewDutyStationID uuid.UUID   `json:"new_duty_station_id" db:"new_duty_station_id"`
	NewDutyStation   DutyStation `belongs_to:"duty_stations"`
	// ...
}
Note that the foreign key is named NewDutyStationID but it references a DutyStation model.
Depending on what other associations are on the model, this can cause Pop to fail to load the
NewDutyStation association in cases where Eager worked fine. We have developed a test case and
filed an issue with Pop about this; refer to the issue for an example that fails.
As a workaround, try putting a fk_id struct tag on the association field like so:
type Order struct {
	ID uuid.UUID `json:"id" db:"id"`
	// ...
	NewDutyStationID uuid.UUID   `json:"new_duty_station_id" db:"new_duty_station_id"`
	NewDutyStation   DutyStation `belongs_to:"duty_stations" fk_id:"new_duty_station_id"`
	// ...
}
This appears to cause the association to load because it forces Pop to use the explicit foreign key
column name rather than deducing it from the struct's field name (Pop didn't seem to need this
fk_id tag with Eager).
Associations with 3+ path elements where the first 2 path elements match
Suppose you try to do an EagerPreload that looks like this:
err := db.Q().EagerPreload(
	"Orders.OriginDutyStation.Address",
	"Orders.OriginDutyStation.TransportationOffice",
).All(&moveTaskOrders)
What seems to happen in this case is that only the last association (TransportationOffice here)
is populated. The Address is left as either a zero-valued struct or nil, depending on whether the
field is a pointer. It appears this only occurs when you have a long association path where the
first two path elements match. We have developed a test case and filed an issue with Pop about this.
Please refer to the issue for more details.
One workaround is to remove one of the associations from the EagerPreload call and load it
separately by iterating over the results and calling Load for the missing association on each
iteration. That's not as efficient as EagerPreload, of course, but it will at least populate the
models correctly.
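The loop described above might look roughly like the following sketch. It assumes that Pop's connection exposes a `Load(model, fields...)` method, which the small `associationLoader` interface captures so the loop can be exercised without a database. The model types are heavily trimmed stand-ins (real models need IDs and Pop struct tags), and `loadOriginAddresses` and `countingLoader` are hypothetical names.

```go
package main

// associationLoader captures the one method this sketch needs. Pop's connection
// type is assumed to satisfy it through its Load method.
type associationLoader interface {
	Load(model interface{}, fields ...string) error
}

// Trimmed, hypothetical stand-ins for the real models.
type Address struct{ City string }

type DutyStation struct {
	Name    string
	Address Address
}

type Order struct{ OriginDutyStation DutyStation }

type MoveTaskOrder struct{ Orders Order }

// loadOriginAddresses fills in the Address that was dropped from the EagerPreload
// call by loading it for each result individually (one query per record).
func loadOriginAddresses(db associationLoader, moves []MoveTaskOrder) error {
	for i := range moves {
		// Index into the slice so the loaded data lands on the original element.
		if err := db.Load(&moves[i].Orders.OriginDutyStation, "Address"); err != nil {
			return err
		}
	}
	return nil
}

// countingLoader is a fake loader that only records how often Load is called.
type countingLoader struct{ calls int }

func (l *countingLoader) Load(model interface{}, fields ...string) error {
	l.calls++
	return nil
}
```

In real code, the EagerPreload call would keep "Orders.OriginDutyStation.TransportationOffice" and this function would run afterwards on the resulting slice.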
Eager vs EagerPreload Inconsistency
The data loaded using EagerPreload and Eager differs when there is no joined record to load.
EagerPreload will simply return nil for the associated record, while Eager returns an empty
(zero-valued) struct as the associated record. EagerPreload is generally preferable to Eager for
performance reasons (Eager suffers from the "n+1" problem). However, the nil record behavior of
EagerPreload is somewhat limiting: it makes it impossible to distinguish between an associated
record that (correctly) did not join and one that was never requested via EagerPreload. There is
not much we can do to work around this issue other than avoiding situations in the code where we
don't know whether records have been loaded with EagerPreload. In practice, that means minimizing
codepaths where EagerPreload is expected to happen "upstream" or outside the context where the
associated records are used. For more discussion and examples of this issue, see this Slack thread.
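When code has to handle results from either method, one defensive option is a small helper that treats both the nil and the zero-valued forms as "no record". This is only a sketch: `Address` is a trimmed, hypothetical model with an integer ID standing in for the real primary key, and `hasAddress` is an invented name.

```go
package main

// Address is a trimmed, hypothetical model. A zero ID is taken to mean that the
// struct was never populated from a database row.
type Address struct {
	ID   int
	City string
}

// hasAddress reports whether an association holds a real record, treating both a
// nil pointer (EagerPreload-style) and a zero-valued struct (Eager-style) as absent.
func hasAddress(a *Address) bool {
	return a != nil && a.ID != 0
}
```

This makes the two behaviors look the same to callers, but it still cannot tell "no record exists" apart from "the association was never loaded", which is the underlying limitation described above.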