Introduction
Knowing when to use #count, #length, or #size can be confusing and using the wrong method can lead to avoidable increased database usage.
Namely, these issues arises in the form of undesired database queries.
The differences between these methods are subtle and can be easily overlooked.
Let’s explore the differences between these methods and when to use each one in the context of Rails.
ActiveRecord vs Array and Hash
First of all its important to note that the behavior of these methods only differs when dealing with an ActiveRecord::Relation.
When dealing with an Array or a Hash, all three methods will effectively be the same.
For this post we will be using an example of an ActiveRecord::Relation to illustrate the differences between these methods:
class Order < ActiveRecord::Base; end
orders = Order.all
order.class
=> Order::ActiveRecord_Relation
#count, #length, and #size
Let’s explore the differences between these methods in detail:
#countwill always execute a database query to count the number of records in the relation using SQL.#lengthwill first execute the SQL query to load all matching rows into memory and then returnsArray#size.#sizeif the records have already been loaded into memory run#lengthotherwise it will execute#count.
| Method | Records not loaded | Records loaded |
|---|---|---|
#count | Executes SQL COUNT query | Executes SQL COUNT query |
#length | Loads all records into memory, then returns Array#size | Returns Array#size |
#size | Executes SQL COUNT query | Returns Array#size |
Examples
Let’s explore the differences between these methods in practice.
orders = Order.all
orders.count # SELECT COUNT(*) FROM "orders"
orders = Order.all
orders.length # SELECT COUNT(*) FROM "orders"
orders = Order.all
orders.size # SELECT COUNT(*) FROM "orders"
Note that an SQL query is executed in all cases since the relation is not loaded into memory. Now if we load the relation into memory first, the behavior of these methods will differ.
orders = Order.all.load
orders.count # SELECT COUNT(*) FROM "orders"
orders = Order.all
orders.length # no query
orders = Order.all
orders.size # no query
Note that an SQL query is not executed with #length or #size since the relation is already loaded into memory.
# BAD
orders = Order.all
orders.size # SELECT COUNT(*) FROM "orders"
orders.each { } # SELECT "orders".* FROM "orders"
# BAD
orders = Order.all
orders.each { } # SELECT "orders".* FROM "orders"
orders.count # SELECT COUNT(*) FROM "orders"
# GOOD
orders = Order.all
orders.each { } # SELECT "orders".* FROM "orders"
orders.size # no query
Note that the order of methods used matters. If you are using #each or #map to enumerate the records, this will load the records into memory first.
After then records have been loaded into memory then you can use #size or #count to avoid a second unnecessary query with #count, like in the second example above.
Notes on performance
If database usage is of concern and if records were already going to be loaded into memory eventually,
it is best to count the number of records in a relation after they have already been loaded into memory.
This effectively reduces the number of queries that will be executed.
For example, #each first then #size and not the other way around.
orders = Order.all
orders.each { } # SELECT "orders".* FROM "orders"
orders.size # no query
However, if there are many records, for example, 10'000 or more.
They should probably not loaded into memory all at once and could potentially cause other issues.
We might opt for #find_each instead, which queries the database in batches of 1'000 records at a time by default.
This will avoid the huge memory allocation of loading all records into memory at once but wil also change our strategy for counting the number of records.
We could use #count, #length or #size and all will perform an SQL query.
In this case the order of method we choose to use does not matter as the result is the same.
orders = Order.all # 10_000 records
orders.find_each { } # 10 queries
# SELECT "orders".* FROM "orders" WHERE "orders"."id" > 1000 ORDER BY "orders"."id" ASC LIMIT 1000
orders.count # 1 query
I will crate a full post on Find vs Find Each vs Find In Batches in Rails coming soon!
Summary
#count will always execute a database query.
#length will first execute the SQL query to load all matching rows into memory if they haven’t already.
#size will only execute a database query if the records have not been already been loaded into memory.
If there are less than 1'000 records, #load them into memory first and then use #size.
If there are more than 10'000 records, use #find_each and then use #count.
