Rails and inner query choices

Consider that we have two models; Well and SurveryPoints . A Well represent an oil/gas/water well, each with multiple SurveyPoints. A SurveyPoint represent the well’s coordinate at a certain depth.

class Well < ActiveRecord::Base	has_many :survey_pointsendclass SurveyPoint < ActiveRecord::Base	belongs_to :wellend

Say, we’d like to implement a filtration mechanism that allows us to query the wells – and for those set of wells we’d like to conditionally present the set of survey_points for those set of filtered wells.

A method find_for_wells would returns the set of wells we’re interested in ( “ )

Using the power of AREL chaining, we could then conditionally extract the SurveyPoints using the following:

def SurveyPoint.find_for_wells(wells)	scope.where(:well_id => wells.select(:id))end

This roughly translates into the query:

SELECT * FROM SURVEY_POINTS WHERE WELL_ID IN (SELECT ID FROM WELLS WHERE ….)

The challenge here is that if both results is large, depending on the cardinality – the database engine might not be able to make use of defined indexes and the outer query might not be optimizable.

Instead – if we use pluck instead of select:

def self.find_for_wells(wells)	scope.where(:well_id => wells.pluck(:id))end

Rails would fire two queries instead of an inner one. Pluck would first extract the list of ids (val_1, val_2, …) in an array before passing it to the main query. This roughly translates into:

SELECT * FROM SURVEY_POINTS WHERE WELL_ID IN (val_1, val_2, val_3, ….)

No inner query is used in that case and the main one is easy to optimize.

Archives

You may also like...

Leave a Reply

You may also like...

Rails – multi selects for associated models

JRuby / PaperClip

[10 seconds ad] AJAX requests synchronization

Leave a Reply