R14 - Memory Quota Exceeded in Ruby (MRI)
Last updated July 22, 2022
When your Ruby application uses more memory than is available on the dyno, an R14 - Memory quota exceeded error message will be emitted to your application’s logs. This article is intended to help you understand your application’s memory use and give you the tools to run your application without memory errors.
Why memory errors matter
If you’re getting R14 - Memory quota exceeded errors, it means your application is using swap memory. Swap uses the disk to store memory instead of RAM. Disk speed is significantly slower than RAM, so page access time is greatly increased. This leads to a significant degradation in application performance. An application that is swapping will be much slower than one that is not. No one wants a slow application, so getting rid of R14 Memory quota exceeded errors on your application is very important.
Detecting a problem
Since you’re reading this article, it’s likely you’ve already spotted a problem. If not, you can view your last 24 hours of memory use with Application Metrics on your app’s dashboard. Alternatively, you can check your logs, where you will occasionally see the error emitted:
2011-05-03T17:40:11+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)
How Ruby memory works
Understanding how Ruby consumes memory can help you decrease your memory usage. For more information, see How Ruby Uses Memory and Why does my App’s Memory Use Grow Over Time?.
Ruby 2.0 upgrade
Upgrading from Ruby 2.0 to 2.1+ introduced generational garbage collection. This means that Ruby 2.1+ applications should run faster, but also use more memory. We always recommend running the latest released Ruby version; it will have the latest security, bug fix, and performance patches.
If you see a slight increase in memory, you can use the techniques below to decrease your usage to an acceptable level.
Memory leaks
A memory leak is defined as memory increasing indefinitely over time. Most applications with memory problems are described as having a “memory leak”; however, if you let those applications run for a long enough period of time, their memory use usually levels out.
If you believe your application has a memory leak, you can test this out. First, make sure you can run dynamic benchmarks with derailed. Then you can benchmark RAM use over time to determine whether your app is experiencing a memory leak, as in the example below.
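As a rough sketch, assuming your app already boots under derailed’s dynamic benchmarks, a run like the following (the TEST_COUNT value here is arbitrary) hits your app repeatedly and prints memory use over time; a true leak keeps climbing, while most apps eventually plateau:
$ TEST_COUNT=5000 bundle exec derailed exec perf:mem_over_time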
Too many workers
Modern Ruby webservers such as Puma allow you to serve requests to your users via concurrent processes. In Puma these are referred to as “worker” processes. In general increasing your workers will increase your throughput, but it will also increase your RAM use. You want to maximize the number of Puma workers that you are using without going over your RAM limit and causing your application to swap.
Too many workers at boot
If your application immediately starts to throw R14 errors as soon as it boots, it may be due to setting too many workers. You can potentially fix this by setting your WEB_CONCURRENCY config var to a lower value.
For example, if you have this in your config/puma.rb file:
# config/puma.rb
workers Integer(ENV['WEB_CONCURRENCY'] || 2)
You can lower your worker count:
$ heroku config:set WEB_CONCURRENCY=1
For some applications, two Puma workers will cause you to use more RAM than a standard-1x dyno can provide. When this happens, you can still achieve increased throughput with threads (see the sketch below). Alternatively, you can upgrade your dyno size to run more workers.
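For example, a minimal config/puma.rb sketch along these lines trades processes for threads; the RAILS_MAX_THREADS name is just a common convention and the numbers are illustrative:
# config/puma.rb
# Fewer processes, more threads per process: threads add throughput with far
# less memory overhead than additional workers.
max_threads = Integer(ENV['RAILS_MAX_THREADS'] || 5)
threads max_threads, max_threads

workers Integer(ENV['WEB_CONCURRENCY'] || 1)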
Too many workers over time
Your application’s memory use will increase over time. If it starts out fine, but gradually increases to be above your RAM limit, there are a few things you can try. If you quickly hit the limit, you likely want to decrease your total number of Puma workers. If it takes hours before you hit the limit, there is a bandaid you can try called Puma Worker Killer.
Puma Worker Killer allows you to set up a rolling restart of your Puma workers. The idea is that you figure out the interval at which your application begins using too much memory, and then schedule your workers to restart at that interval. When a process restarts, its memory use goes back to its original, lower level. Even if memory is still growing, it won’t cause problems for another few hours, at which point another restart is already scheduled.
To use this gem add it to your Gemfile:
gem "puma_worker_killer"
Then run $ bundle install and add this to an initializer such as config/initializers/puma_worker_killer.rb:
PumaWorkerKiller.enable_rolling_restart
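Per the puma_worker_killer documentation, enable_rolling_restart also accepts a restart frequency in seconds, so you can match the restart interval to the point at which your app approaches its memory limit (the 12 hours below is only an example):
# config/initializers/puma_worker_killer.rb
# Restart workers one at a time, every 12 hours (argument is in seconds).
PumaWorkerKiller.enable_rolling_restart(12 * 3600)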
It’s important to note that this won’t actually fix any memory problems, but instead will cover them up. When your workers restart, they cannot serve requests for a few seconds, so when the rolling restarts are triggered your end users may experience a slowdown as your overall application’s throughput is decreased. Once the restarts are done, throughput should go back to normal.
It is highly recommended that you only use Puma Worker Killer as a stop gap measure until you can identify and fix the memory problem. Several suggestions are covered below.
Forking behavior of Puma worker processes
Puma implements its worker processes via forking. When you fork a program, you copy a running program into a new process and then make changes, instead of starting with a blank process. Most modern operating systems allow memory to be shared between processes via a concept called “copy on write”. When Puma spins up a new worker, it requires very little memory; only when Puma needs to modify or “write” to memory does it copy a memory location from one process to another. Modern Ruby versions are optimized to be copy-on-write “friendly”, that is, they do not write to memory unnecessarily. This means that when Puma spins up a new worker, it is likely smaller than the one before it. You can observe this behavior locally in Activity Monitor on a Mac or via ps on Linux: there will be a large process consuming a lot of memory, and then smaller processes. So if Puma with one worker was consuming 300 MB of RAM, then using two workers would likely consume less than 600 MB of RAM total.
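If you want to see copy-on-write for yourself, here is a minimal, Linux-only sketch (not part of this article). It reads Private_Dirty from /proc rather than plain RSS, because RSS also counts pages still shared with the parent:
# cow_sketch.rb -- run with: ruby cow_sketch.rb (Linux only)
# Sums the current process's private (non-shared) dirty memory in MB.
def private_dirty_mb
  File.readlines("/proc/self/smaps")
      .grep(/^Private_Dirty:/)
      .sum { |line| line.split[1].to_i } / 1024.0
end

data = Array.new(500_000) { "x" * 50 } # allocate a chunk of memory in the parent

fork do
  puts "child before writes: #{private_dirty_mb.round(1)} MB private"
  data.each { |s| s << "y" }           # writing forces shared pages to be copied
  puts "child after writes:  #{private_dirty_mb.round(1)} MB private"
end
Process.wait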
Too much memory on boot
A common cause of excess memory use is libraries that are required in the Gemfile but never used. You can see how much memory your gems use at boot time with the derailed benchmarks gem.
First add the gem to your Gemfile:
gem 'derailed', group: :development
Now run $ bundle install and you’re ready to investigate memory use. You can run:
$ bundle exec derailed bundle:mem
This will output the memory use of each of your gems as they are required into memory:
$ derailed bundle:mem
TOP: 54.1836 MiB
mail: 18.9688 MiB
mime/types: 17.4453 MiB
mail/field: 0.4023 MiB
mail/message: 0.3906 MiB
action_view/view_paths: 0.4453 MiB
action_view/base: 0.4336 MiB
Remove any libraries you aren’t using. If you see a library using an unusually large amount of memory, try to upgrade to the latest version to see if any issues have been fixed. If the problem persists, open an issue with the library maintainer to see if there is something that can be done to decrease require-time memory. To help with this process you can use $ bundle exec derailed bundle:objects. See Objects created at Require time in derailed benchmarks for more information.
Too much memory used at runtime
If you’ve cleaned out your unused gems and you’re still seeing too much memory use, there may be code generating excessive amounts of Ruby objects. It is possible to use a runtime tool such as the Scout Heroku add-on; Scout has published a guide on debugging runtime memory use.
If you don’t use a tool that can track object allocations at runtime, you can try to reproduce the memory-increasing behavior locally with derailed benchmarks, as in the example below.
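For example, assuming your app boots under derailed, a run along these lines hits an endpoint repeatedly and reports where objects are allocated (TEST_COUNT and PATH_TO_HIT are documented derailed options; the path and count here are arbitrary):
$ TEST_COUNT=10 PATH_TO_HIT=/products bundle exec derailed exec perf:objects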
GC tuning
Every application behaves differently so there is no one correct set of GC (garbage collector) values that we can recommend.
When it comes to memory utilization, you can control how fast Ruby allocates memory by setting RUBY_GC_HEAP_GROWTH_FACTOR. This value differs between Ruby versions. To understand how it works, it is helpful to first understand how Ruby uses memory.
When Ruby runs out of memory and cannot free up any slots via the garbage collector, it has to tell the operating system it needs more memory. Asking the operating system for memory is an expensive (slow) process, so Ruby always wants to ask for a little more than it needs. You can control how much extra memory it asks for by setting the RUBY_GC_HEAP_GROWTH_FACTOR config var. For example, if you wanted your application to grow by 3% every time memory was allocated, you could set:
$ heroku config:set RUBY_GC_HEAP_GROWTH_FACTOR=1.03
So if your application is 100 MB in size and needs extra memory to function, with this setting it would ask the OS for an extra 3 MB of RAM, bringing the total amount of memory that Ruby can use to 103 MB. If your memory is growing too quickly, try setting this value to a smaller number. Keep in mind that setting the value too low can cause Ruby to spend a large amount of time asking the OS for more memory.
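If you want to watch this growth happen locally, a quick sketch (GC.stat key names can vary slightly between MRI versions) is to compare GC.stat before and after a burst of retained allocations:
# Watch the Ruby object heap grow as retained objects accumulate.
before = GC.stat(:heap_allocated_pages)
retained = Array.new(100_000) { "padding" * 20 } # keep references so GC cannot free them
after = GC.stat(:heap_allocated_pages)
puts "heap pages grew from #{before} to #{after}"
puts "still holding #{retained.size} strings" # keep the array referenced to the end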
Generally, tuning RUBY_GC_HEAP_GROWTH_FACTOR will only help with R14 errors if you are barely over your dyno’s memory limit, or if you’re seeing extremely large “stair-step” memory allocations after your application has been running for several hours. Individual apps are responsible for setting and maintaining their own GC tuning configuration variables.
Excess memory use due to malloc in a multi-threaded environment
Applications created after September of 2019 will have the environment variable MALLOC_ARENA_MAX=2 set.
The behavior of malloc in a multi-threaded environment can drastically increase memory use.
To work around this malloc behavior, an alternative memory allocator such as jemalloc can be used to replace malloc. To replace malloc on Heroku with jemalloc you can use a third party jemalloc buildpack.
Alternatively, if you don’t want to use a third-party buildpack, it is possible to tune glibc’s memory allocation behavior so that it consumes less memory. However, this approach may impact the performance of the application.
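For apps created before that date, the simplest form of this tuning is usually to set the same MALLOC_ARENA_MAX config var yourself (treat this as a suggestion to measure against your own app, not a guaranteed fix):
$ heroku config:set MALLOC_ARENA_MAX=2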
Dyno size and performance
Dynos come in two types: standard dynos, which run on shared infrastructure, and performance dynos, which consume an entire runtime instance. When you increase your dyno size, you increase the amount of memory you can consume. If your application cannot serve two or more requests concurrently, it is subject to request queueing. Ideally, your application should be running in a dyno that allows it to run at least two Puma worker processes. As stated before, additional Puma worker processes consume less RAM than the first process. You may be able to keep your application spend the same by upgrading to a larger dyno size, doubling your worker count, and using only half the number of total dynos, as in the example below.
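As a purely hypothetical illustration of that trade (the dyno types and counts here are examples, not recommendations), an app running four standard-1x dynos with two Puma workers each could move to two standard-2x dynos with four workers each:
$ heroku ps:type web=standard-2x
$ heroku ps:scale web=2
$ heroku config:set WEB_CONCURRENCY=4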
While this article is primarily about memory, the underlying goal is speed. On that topic, it is important to highlight that because performance dynos are isolated from “noisy neighbors”, they will see much more consistent performance. Most applications perform significantly better when run on performance dynos.
Additional resources
- Why does my App’s Memory Use Grow Over Time? - Start here, a good high level overview of what causes a system’s memory to grow that will help you develop an understanding of how Ruby allocates and uses memory at the application level.
- Complete Guide to Rails Performance (Book) - This book by Nate Berkopec is highly regarded.
- How Ruby uses memory - This is a lower-level look at precisely what “retained” and “allocated” memory means. It uses small scripts to demonstrate Ruby memory behavior. It also explains why our system’s “total max” memory rarely goes down.
- How Ruby uses memory (Video) - If you’re new to the concepts of object allocation, this might be an excellent place to start (you can skip the first story in the video; the rest are about memory). Memory explanations start at 13 minutes in.
- Debugging a memory leak on Heroku - It’s probably not a leak. Still worth reading to see how you can come to the same conclusions yourself. Content is valid for environments other than Heroku. Lots of examples of using the derailed_benchmarks tool.
- The Life-Changing Magic of Tidying Active Record Allocations (Blog + Video) - This post shows how tools were used to track down and eliminate memory allocations in real life. All of the examples are from patches submitted to Rails, but the process works the same for finding allocations caused by your application logic.
- N+1 Queries or Memory Problems: Why not Solve Both? - Goes through a tricky real-world scenario from 2017 where an N+1 query was killing performance, but using “eager loading” used too much memory. The app had to re-work the flow of the code to extract the data that was needed while avoiding extra memory allocations.
- Jumping off the Ruby Memory Cliff - Sometimes, you might see a ‘cliff’ in your memory metrics or a saw-tooth pattern. This article explores why that behavior exists and what it means.