123456789_123456789_123456789_123456789_123456789_

Welcome to Puma 5: Spoony Bard.

Spoony Bard

Note: Puma 5 now automatically uses WEB_CONCURRENCY env var if set see this post for an explanation. If your memory use goes up after upgrading to Puma 5 it indicates you're now running with multiple workers (processes). You can decrease memory use by tuning this number to be lower.

Puma 5 brings new experimental performance features, a few quality-of-life features and loads of bugfixes. Here's what you should do:

  1. Review the Upgrade section below to see if any of 5.0's breaking changes will affect you.
  2. Upgrade to version 5.0 in your Gemfile and deploy.
  3. Try the new performance experiments outlined below and report your results back to the Puma issue tracker.

Puma 5 was named Spoony Bard by our newest supercontributor, @wjordan. Will brought you one of our new perf features for this release, as well as many other fixes and refactors. If you'd like to name a Puma release in the future, take a look at CONTRIBUTING.md and get started helping us out :)

Puma 5 also welcomes @MSP-Greg as our newest committer. Greg has been instrumental in improving our CI setup and SSL features. Greg also named our 4.3.0 release: Mysterious Traveller.

What's New

Puma 5 contains three new "experimental" performance features for cluster-mode Pumas running on MRI.

If you try any of these features, please report your results to our report issue.

Part of the reason we're calling them experimental is because we're not sure if they'll actually have any benefit. People's workloads in the real world are often not what we anticipate, and synthetic benchmarks are usually not of any help in figuring out if a change will be beneficial or not.

We do not believe any of the new features will have a negative effect or impact the stability of your application. This is either a "it works" or "it does nothing" experiment.

If any of the features turn out to be particularly beneficial, we may make them defaults in future versions of Puma.

Lower latency, better throughput

From our friends at GitLab, the new experimental wait_for_less_busy_worker config option may reduce latency and improve throughput for high-load Puma apps on MRI. See the pull request for more discussion.

Users of this option should see reduced request queue latency and possibly less overall latency.

Add the following to your puma.rb to try it:

wait_for_less_busy_worker
# or
wait_for_less_busy_worker 0.001

Production testing at GitLab suggests values between 0.001 and 0.010 are best.

Better memory usage

5.0 brings two new options to your config which may improve memory usage.

nakayoshi_fork

nakayoshi_fork calls GC a handful of times and compacts the heap on Ruby 2.7+ before forking. This may reduce memory usage of Puma on MRI with preload enabled. It's inspired by Koichi Sasada's work.

To use it, you can add this to your puma.rb:

nakayoshi_fork

fork_worker

Puma 5 introduces an experimental new cluster-mode configuration option, fork_worker (--fork-worker from the CLI). This mode causes Puma to fork additional workers from worker 0, instead of directly from the master process:

10000   \_ puma 4.3 (tcp://0.0:9292) [puma]
10001       \_ puma: cluster worker 0: 10000 [puma]
10002           \_ puma: cluster worker 1: 10000 [puma]
10003           \_ puma: cluster worker 2: 10000 [puma]
10004           \_ puma: cluster worker 3: 10000 [puma]

It is compatible with phased restarts. It also may improve memory usage because the worker process loads additional code after processing requests.

To learn more about using refork and fork_worker, see 'Fork Worker'.

What else is new?

Upgrade

Then, update your Gemfile:

gem 'puma', '< 6'