Using EagerPreload in Pop

Introduction

We use Pop as our ORM (object-relational mapping) tool for querying the database. Pop can eagerly fetch associations via its Eager method. However, Eager is subject to the "n+1" problem, in which each association is loaded via a separate query. For Pop queries that return many records, each with an eagerly loaded tree of associated data, the number of SQL queries executed can be substantial.

Starting with version 5.1, Pop can minimize the "n+1" problem via a new EagerPreload method. Using EagerPreload, Pop fetches the requested associations across all parent records at once rather than one parent at a time. This reduces the number of queries sent to the database at the expense of doing more computation on the Go side. In many situations, this is a reasonable tradeoff that provides better overall performance.
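To make the tradeoff concrete, here is a minimal sketch of the two loading strategies. It is not Pop's actual implementation: the Shipment and Address models and the fetchFunc type (which stands in for one SQL round trip) are hypothetical and exist only for illustration.

```go
package main

// Hypothetical, trimmed-down models used only for this illustration.
type Address struct {
	ID     int
	Street string
}

type Shipment struct {
	ID        int
	AddressID int
	Address   *Address
}

// fetchFunc stands in for a single SQL round trip, roughly
// "SELECT * FROM addresses WHERE id IN (ids)". It is an assumption
// of this sketch, not part of Pop.
type fetchFunc func(ids []int) []Address

// loadOneByOne mimics the "n+1" pattern: one query per parent record.
func loadOneByOne(shipments []Shipment, fetch fetchFunc) {
	for i := range shipments {
		addrs := fetch([]int{shipments[i].AddressID})
		if len(addrs) > 0 {
			shipments[i].Address = &addrs[0]
		}
	}
}

// loadBatched mimics the preload pattern: collect every foreign key,
// issue one query, then match children to parents in Go.
func loadBatched(shipments []Shipment, fetch fetchFunc) {
	ids := make([]int, 0, len(shipments))
	for i := range shipments {
		ids = append(ids, shipments[i].AddressID)
	}

	addrs := fetch(ids)
	byID := make(map[int]*Address, len(addrs))
	for j := range addrs {
		byID[addrs[j].ID] = &addrs[j]
	}

	// This matching step is the extra work done on the Go side.
	// Parents with no matching child are left with a nil Address.
	for i := range shipments {
		shipments[i].Address = byID[shipments[i].AddressID]
	}
}
```

With three shipments, loadOneByOne calls fetch three times while loadBatched calls it once; the gap widens as the number of parent records and the depth of the association tree grow.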

When to Use

If you use Eager in a query, you should also try EagerPreload and note the difference in the number of generated queries (which should show in the log by default in development mode). Compare performance of both Eager and EagerPreload with representative data.

In most cases, EagerPreload should outperform Eager. Although there is an option to turn EagerPreload on by default, there are some issues in Pop's implementation at the moment that could lead to subtle bugs in MilMove (example1, example2). In some cases, associations that load with Eager do not load with EagerPreload. This may not cause an outright failure; instead, it can result in data missing from an endpoint's response, for instance. For now, we should use and test EagerPreload on a case-by-case basis until we feel more confident in Pop's implementation.

How to Use

Generally speaking, you should be able to take an Eager call like so:

err := db.Q().Eager(
	"PaymentRequests.PaymentServiceItems.PaymentServiceItemParams.ServiceItemParamKey",
	"MTOServiceItems.ReService",
	"MTOShipments.DestinationAddress",
	"Orders.NewDutyStation.Address",
).All(&moveTaskOrders)

and just replace it with an EagerPreload call instead:

err := db.Q().EagerPreload(
	"PaymentRequests.PaymentServiceItems.PaymentServiceItemParams.ServiceItemParamKey",
	"MTOServiceItems.ReService",
	"MTOShipments.DestinationAddress",
	"Orders.NewDutyStation.Address",
).All(&moveTaskOrders)

In theory, the resulting slice in moveTaskOrders should be identical in both cases. In practice, you should always verify that is indeed the case due to bugs we've run into in Pop as noted above.
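One way to check this is to run the query both ways in a test and compare the results record by record. The helper below is a minimal sketch of that idea using only the standard library; the name diffLoaded is hypothetical, and the sketch assumes both queries return records in the same order (for example, because both use the same explicit ordering).

```go
package main

import (
	"fmt"
	"reflect"
)

// diffLoaded compares two slices of the same model type, one loaded with
// Eager and one with EagerPreload, and describes each mismatch it finds.
// It assumes both slices are in the same order.
func diffLoaded[T any](eager, preload []T) []string {
	if len(eager) != len(preload) {
		return []string{fmt.Sprintf(
			"length mismatch: Eager returned %d records, EagerPreload returned %d",
			len(eager), len(preload),
		)}
	}

	var diffs []string
	for i := range eager {
		// DeepEqual follows pointers, so nested associations are compared
		// by value rather than by address.
		if !reflect.DeepEqual(eager[i], preload[i]) {
			diffs = append(diffs, fmt.Sprintf("record %d differs between Eager and EagerPreload", i))
		}
	}
	return diffs
}
```

An empty result means the two slices match. Keep in mind that this comparison is strict: it also flags the case described under "Eager vs EagerPreload Inconsistency" below, where one side holds nil and the other a zero-valued struct.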

EagerPreload Bugs and Workarounds

Below are some of the issues we have found so far in using EagerPreload:

Foreign keys as pointers

We often represent a nullable foreign key (a belongs_to relationship in Pop terms) as a pointer. For example:

type MTOShipment struct {
	ID uuid.UUID `db:"id"`
	// ...
	PickupAddress   *Address   `belongs_to:"addresses"`
	PickupAddressID *uuid.UUID `db:"pickup_address_id"`
	// ...
}

Doing an EagerPreload on PickupAddress did not load the association, although Eager did. We identified what we believe is the cause in Pop and submitted a PR to fix it; the fix was released in Pop 5.3.2. On earlier versions, a workaround is to use gobuffalo's nulls package (specifically, a nulls.UUID in this case) instead of a pointer for the PickupAddressID field (note that PickupAddress itself can remain a *Address). You can see examples of how this is used in Pop's documentation. In general, if you use the nulls variants rather than pointers for optional fields, EagerPreload loads the association correctly. Note that moving from pointers to nulls will usually require changing most usages of the field in question, since it is a different type (although the change is straightforward in most cases).

A separate issue involves foreign keys whose names do not match the model they reference. Consider this example model:

type Order struct {
	ID uuid.UUID `json:"id" db:"id"`
	// ...
	NewDutyStationID uuid.UUID   `json:"new_duty_station_id" db:"new_duty_station_id"`
	NewDutyStation   DutyStation `belongs_to:"duty_stations"`
	// ...
}

Note that the foreign key is named NewDutyStationID but it references a DutyStation model. Depending on what other associations are on the model, this can cause Pop to fail to load the NewDutyStation association in cases where Eager worked fine. We have developed a test case and filed an issue with Pop about this -- refer to the issue for an example that fails.

As a workaround, try putting a fk_id struct tag on the association field like so:

type Order struct {
	ID uuid.UUID `json:"id" db:"id"`
	// ...
	NewDutyStationID uuid.UUID   `json:"new_duty_station_id" db:"new_duty_station_id"`
	NewDutyStation   DutyStation `belongs_to:"duty_stations" fk_id:"new_duty_station_id"`
	// ...
}

This appears to cause the association to load because it forces Pop to use the explicit foreign key column name rather than deducing it from the struct's field name (Pop didn't seem to need this fk_id with Eager).

Associations with 3+ path elements where the first 2 path elements match

Suppose you try to do an EagerPreload that looks like this:

err := db.Q().EagerPreload(
	"Orders.OriginDutyStation.Address",
	"Orders.OriginDutyStation.TransportationOffice",
).All(&moveTaskOrders)

In this case, only the last association (TransportationOffice here) appears to be populated. Address is left as either a zero-valued struct or nil, depending on whether the field is a pointer. This appears to occur only with long association paths (three or more elements) whose first two path elements match. We have developed a test case and filed an issue with Pop about this. Please refer to the issue for more details.

One workaround is to remove one of the associations from the EagerPreload and include it separately by iterating over the results and doing a Load on the missing association each time through the loop. That's not as efficient as EagerPreload of course, but it will at least populate the models correctly.
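A minimal sketch of that loop follows. The models are hypothetical, trimmed-down stand-ins for ours, and loadFunc is an assumption of this sketch: it mirrors the signature of Pop's Connection.Load (a model followed by the association names to load) so that the loop can be exercised without a database.

```go
package main

import "fmt"

// Hypothetical, trimmed-down models; the real ones have many more fields.
type Address struct {
	StreetAddress1 string
}

type DutyStation struct {
	Name    string
	Address Address
}

type Order struct {
	OriginDutyStation *DutyStation
}

type MoveTaskOrder struct {
	Orders Order
}

// loadFunc mirrors the shape of Pop's Connection.Load method, so a
// connection's Load method value can be passed in directly.
type loadFunc func(model interface{}, fields ...string) error

// loadOriginAddresses fills in the Address that was left out of the
// EagerPreload call, issuing one Load per record that has a duty station.
func loadOriginAddresses(load loadFunc, moves []MoveTaskOrder) error {
	for i := range moves {
		station := moves[i].Orders.OriginDutyStation
		if station == nil {
			// Nothing was joined for this record, so there is nothing to load.
			continue
		}
		if err := load(station, "Address"); err != nil {
			return fmt.Errorf("loading origin duty station address for record %d: %w", i, err)
		}
	}
	return nil
}
```

Assuming db is the Pop connection, the intended usage is to EagerPreload only "Orders.OriginDutyStation.TransportationOffice" and then call loadOriginAddresses(db.Load, moveTaskOrders). This reintroduces one query per record for the Address association only.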

Eager vs EagerPreload Inconsistency

The data loaded by EagerPreload and Eager differs when there is no joined record to load. EagerPreload returns nil for the associated record, while Eager returns an empty (zero-valued) struct. EagerPreload is generally preferable to Eager for performance reasons, since Eager suffers from the "n+1" problem. However, EagerPreload's nil behavior is a drawback: a nil association cannot tell you whether the record (correctly) did not join or whether no EagerPreload of that association was ever attempted. There is not much we can do to work around this issue other than avoiding situations where the code cannot know whether records have been loaded with EagerPreload. In practice, that means minimizing code paths where EagerPreload is expected to happen "up-stream" of, or outside the context of, the place where the associated records are used. For more discussion and examples of this issue, see this Slack thread.
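For code that may receive records loaded either way, one option is a small helper that treats both representations of "no joined record" the same. This is only a sketch: the Address model is a hypothetical stand-in, UUID is a local stand-in for uuid.UUID so the example needs no third-party imports, and the helper name is made up.

```go
package main

// UUID is a local stand-in for uuid.UUID (a 16-byte array) so that this
// sketch compiles with the standard library alone.
type UUID [16]byte

// Address is a hypothetical, trimmed-down model.
type Address struct {
	ID             UUID
	StreetAddress1 string
}

// addressAbsent reports whether an association holds no joined record.
// It treats nil (what EagerPreload produces) and a struct with a zero ID
// (what Eager produces) as equivalent.
func addressAbsent(a *Address) bool {
	return a == nil || a.ID == (UUID{})
}
```

This only smooths over the nil versus zero-value difference. It cannot tell whether a load was ever attempted, so the advice above about keeping the load close to where the records are used still applies.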