File: Development — RuboCop master

This guide covers developing new cops for RuboCop. It walks through:

Scaffolding a cop from a template
Understanding the AST that RuboCop operates on
Implementing detection with node patterns and callbacks
Autocorrect to fix offenses automatically
Constraining cops by Ruby or gem version
Common patterns and conventions
Testing your cop
Documenting the cop for users

Create a new cop

Note: Clone the repository and run `bundle install` if you haven’t done so yet. The following rake task can only be run inside the rubocop project directory itself.

Use the bundled rake task `new_cop` to generate a cop template:

$ bundle exec rake 'new_cop[Department/Name]'
[create] lib/rubocop/cop/department/name.rb
[create] spec/rubocop/cop/department/name_spec.rb
[modify] lib/rubocop.rb - `require_relative 'rubocop/cop/department/name'` was injected.
[modify] A configuration for the cop is added into config/default.yml.
Do 4 steps:
  1. Modify the description of Department/Name in config/default.yml
  2. Implement your new cop in the generated file!
  3. Commit your new cop with a message such as
     e.g. "Add new `Department/Name` cop"
  4. Run `bundle exec rake changelog:new` to generate a changelog entry
     for your new cop.

Understanding the AST

RuboCop uses the parser library to create the Abstract Syntax Tree (AST) representation of the code.

You can install the parser gem and use its `ruby-parse` command-line utility to see what the AST looks like.

$ gem install parser

And then try to parse a simple integer representation with ruby-parse:

$ ruby-parse -e '1'
(int 1)

Each expression surrounded by parentheses represents a node in the AST. The first element is the node type and the tail contains the children with all information needed to represent the code.

Here’s another example - a local variable name being assigned the string value "John":

$ ruby-parse -e 'name = "John"'
(lvasgn :name
  (str "John"))
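The "type plus children" shape described above can be mimicked in plain Ruby. The following is only a sketch: `SketchNode` and `to_sexp` are hypothetical names invented for illustration and are not part of parser or RuboCop.

```ruby
# Hypothetical stand-in for an AST node: a type plus an ordered list of children.
SketchNode = Struct.new(:type, :children) do
  # Render the node as a one-line s-expression, similar to ruby-parse output.
  def to_sexp
    parts = children.map do |child|
      child.is_a?(SketchNode) ? child.to_sexp : child.inspect
    end
    "(#{[type, *parts].join(' ')})"
  end
end

int_node = SketchNode.new(:int, [1])
assignment = SketchNode.new(:lvasgn, [:name, SketchNode.new(:str, ['John'])])

puts int_node.to_sexp   # (int 1)
puts assignment.to_sexp # (lvasgn :name (str "John"))
```

Children can be other nodes or plain Ruby values (integers, symbols, strings, `nil`), which is why the renderer falls back to `inspect` for anything that is not a node.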

Inspecting the AST

Let’s imagine we want to simplify statements from `!array.empty?` to `array.any?`.

First, check how the bad code is represented in the Abstract Syntax Tree:

$ ruby-parse -e '!array.empty?'
(send
  (send
    (send nil :array) :empty?) :!)

Now, it’s time to debug our expression using the REPL from RuboCop:

$ bin/console

First, declare the code that we want to match and wrap it in a `ProcessedSource`, a thin wrapper that runs the parser on the code and builds the AST:

code = '!something.empty?'
source = RuboCop::ProcessedSource.new(code, RUBY_VERSION.to_f)
node = source.ast
# => s(:send, s(:send, s(:send, nil, :something), :empty?), :!)

The node exposes a few attributes that are useful when writing a cop:

node.type # => :send
node.children # => [s(:send, s(:send, nil, :something), :empty?), :!]
node.source # => "!something.empty?"

Implementing the cop

Writing node pattern rules

Tip: You can write cops without using `NodePattern` (and many older cops don’t use it), but it generally simplifies the code a lot, as manual node matching and destructuring can be quite verbose.

Now that you’re familiar with the AST, you can learn a bit about node patterns and use them to match the specific nodes you’re interested in.

You can learn more about Node Pattern in the rubocop-ast documentation.

Alias `NodePattern` to `RuboCop::AST::NodePattern` to make it easier to use:

NodePattern = RuboCop::AST::NodePattern

A node pattern looks very similar to the AST output shown above, so let’s start with something very generic:

NodePattern.new('send').match(node) # => true

It matches because the root is a `send` node. Now let’s match more deeply, using parentheses to describe the sub-nodes. If you don’t care what an internal node contains, you can use `...` to skip it and just consider "a node".

NodePattern.new('(send ...)').match(node) # => true
NodePattern.new('(send (send ...) :!)').match(node) # => true
NodePattern.new('(send (send (send ...) :empty?) :!)').match(node) # => true

Complex patterns can be hard to follow. If you get lost in deeply nested parentheses, use `$` to capture an internal expression and check exactly what each piece of the pattern matches:

NodePattern.new('(send (send (send $...) :empty?) :!)').match(node) # => [nil, :something]

The receiver of `empty?` doesn’t have to be a `send` node; it could also be a literal array, for example:

![].empty?

The code above has the following representation:

=> s(:send, s(:send, s(:array), :empty?), :!)

You can replace the internal node with `(...)` so that the pattern accepts any node in that position:

NodePattern.new('(send (send (...) :empty?) :!)').match(node) # => true

In other words, it says: "Match code calling !<expression>.empty?".
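As an analogy only, the same structural match can be written with Ruby's built-in pattern matching over nested arrays. This sketch does not use `NodePattern`; the arrays stand in for AST nodes, and `negated_empty_receiver` is a hypothetical helper name. It assumes Ruby 3.0 or newer, where `case`/`in` is stable.

```ruby
# Plain nested arrays standing in for AST nodes: [type, *children].
negated_empty = [:send, [:send, [:send, nil, :something], :empty?], :!]
literal_array = [:send, [:send, [:array], :empty?], :!]
plain_empty   = [:send, [:send, nil, :something], :empty?]

# Hypothetical helper: returns the receiver of `.empty?` when the tree has the
# shape `!<expression>.empty?`, or nil otherwise. The `receiver` binding plays
# a role similar to the `$` capture in a node pattern.
def negated_empty_receiver(tree)
  case tree
  in [:send, [:send, receiver, :empty?], :!]
    receiver
  else
    nil
  end
end

p negated_empty_receiver(negated_empty) # [:send, nil, :something]
p negated_empty_receiver(literal_array) # [:array]
p negated_empty_receiver(plain_empty)   # nil
```

One difference worth keeping in mind: the `receiver` binding here accepts any value, including `nil`, whereas `(...)` in a node pattern only matches an actual node.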

Great! Now, let’s implement our cop to simplify such statements:

$ bundle exec rake 'new_cop[Style/SimplifyNotEmptyWithAny]'

After the cop scaffold is generated, change the node matcher to match with the expression achieved previously:

def_node_matcher :not_empty_call?, <<~PATTERN
  (send (send $(...) :empty?) :!)
PATTERN

Note that we added a `$` sign to capture the "expression" in `!<expression>.empty?`. It will become useful later.

Get yourself familiar with the AST node hooks that parser and rubocop-ast provide.

Because the pattern starts with a `send` node, the cop needs to implement the `on_send` method, as the cop scaffold already suggests:

def on_send(node)
  return unless not_empty_call?(node)

  add_offense(node)
end

The on_send callback is the most used and can be optimized by restricting the acceptable method names with a constant RESTRICT_ON_SEND.

The final cop code will look something like this:

module RuboCop
  module Cop
    module Style
      # `array.any?` is a simplified way to say `!array.empty?`
      #
      # @example
      #   # bad
      #   !array.empty?
      #
      #   # good
      #   array.any?
      #
      class SimplifyNotEmptyWithAny < Base
        MSG = 'Use `.any?` and remove the negation part.'
        RESTRICT_ON_SEND = [:!].freeze # optimization: don't call `on_send` unless
                                       # the method name is in this list

        def_node_matcher :not_empty_call?, <<~PATTERN
          (send (send $(...) :empty?) :!)
        PATTERN

        def on_send(node)
          return unless not_empty_call?(node)

          add_offense(node)
        end
      end
    end
  end
end

Callback ordering: `on_send` is called on a node before the `on_<type>` callbacks for its children. There is also an `after_send` callback that runs after children are processed. Every node type has a corresponding `after_<type>` callback (except types that never have children).
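The ordering can be illustrated with a small depth-first walker. This is a sketch, not RuboCop's actual traversal code: `WalkNode` and `walk` are hypothetical, and the sketch decides whether to emit an `after_` callback per node instance rather than per node type.

```ruby
# Hypothetical minimal node: a type and a list of child nodes.
WalkNode = Struct.new(:type, :children)

# Records callback names in the order a depth-first walk would fire them.
def walk(node, log = [])
  log << "on_#{node.type}"
  node.children.each { |child| walk(child, log) }
  # Simplification: nodes without children get no after_ callback here.
  log << "after_#{node.type}" unless node.children.empty?
  log
end

tree = WalkNode.new(:send, [WalkNode.new(:array, [WalkNode.new(:int, [])])])

p walk(tree)
# ["on_send", "on_array", "on_int", "after_array", "after_send"]
```

The parent's `on_` callback fires first, the children are handled next, and the parent's `after_` callback fires last.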

Update the spec to cover the expected syntax:

RSpec.describe RuboCop::Cop::Style::SimplifyNotEmptyWithAny, :config do
  it 'registers an offense when using `!a.empty?`' do
    expect_offense(<<~RUBY)
      !array.empty?
      ^^^^^^^^^^^^^ Use `.any?` and remove the negation part.
    RUBY
  end

  it 'does not register an offense when using `.any?` or `.empty?`' do
    expect_no_offenses(<<~RUBY)
      array.any?
      array.empty?
    RUBY
  end
end

If your code has variables of different lengths, you can use the following markers to format your template by passing the variables as keyword arguments:

`%{foo}`: Interpolates `foo`
`^{foo}`: Inserts `'^' * foo.size` for dynamic offense range length
`_{foo}`: Inserts `' ' * foo.size` for dynamic offense range indentation

You can also abbreviate offense messages with `[...]`.

%w[raise fail].each do |keyword|
  expect_offense(<<~RUBY, keyword: keyword)
    %{keyword}(RuntimeError, msg)
    ^{keyword}^^^^^^^^^^^^^^^^^^^ Redundant `RuntimeError` argument [...]
  RUBY
end

%w[has_one has_many].each do |type|
  expect_offense(<<~RUBY, type: type)
    class Book
      %{type} :chapter, foreign_key: 'book_id'
      _{type}           ^^^^^^^^^^^^^^^^^^^^^^ Specifying the default [...]
    end
  RUBY
end
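To make the three markers concrete, here is a rough approximation of the expansion in plain Ruby. `expand_markers` is a hypothetical helper written for this illustration; it is not the code RuboCop's spec helpers use.

```ruby
# Hypothetical helper approximating how %{foo}, ^{foo} and _{foo} expand.
def expand_markers(template, **vars)
  template.gsub(/([%^_])\{(\w+)\}/) do
    marker = Regexp.last_match(1)
    value = vars.fetch(Regexp.last_match(2).to_sym).to_s
    case marker
    when '%' then value              # interpolate the value itself
    when '^' then '^' * value.size   # carets as wide as the value
    when '_' then ' ' * value.size   # spaces as wide as the value
    end
  end
end

# 19 carets cover the fixed part "(RuntimeError, msg)".
template = "%{keyword}(RuntimeError, msg)\n^{keyword}#{'^' * 19}\n"

puts expand_markers(template, keyword: 'raise')
puts expand_markers(template, keyword: 'fail')
```

For each keyword, the caret line comes out exactly as wide as the code line above it, which is what keeps the offense range aligned when the keyword length varies.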

Autocorrect

Autocorrect lets cops automatically fix the offenses they detect. To support it, a cop must `extend AutoCorrector`. The `add_offense` method then yields a corrector object, a thin wrapper around parser’s `TreeRewriter`, to which you give instructions about what to do with the offending node.

Let’s start with a simple spec to cover it:

it 'corrects `!a.empty?`' do
  expect_offense(<<~RUBY)
    !array.empty?
    ^^^^^^^^^^^^^ Use `.any?` and remove the negation part.
  RUBY

  expect_correction(<<~RUBY)
    array.any?
  RUBY
end

And then add the autocorrecting block on the cop side:

extend AutoCorrector

def on_send(node)
  expression = not_empty_call?(node)
  return unless expression

  add_offense(node) do |corrector|
    corrector.replace(node, "#{expression.source}.any?")
  end
end

The corrector lets you `insert_after`, `insert_before`, `wrap`, or `replace` a specific node or any specific range of the code.

Ranges are available through `node.location`, which exposes the range of the whole expression as well as ranges for other parts of the node.
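The idea of range-based edits can be sketched without RuboCop at all. `SketchCorrector` below is a hypothetical, heavily simplified stand-in: it works on character ranges of a string, performs no overlap detection, and does not define an order for edits that start at the same position.

```ruby
# Hypothetical corrector: collects edits on character ranges and applies them
# from the end of the source backwards so earlier offsets stay valid.
class SketchCorrector
  def initialize(source)
    @source = source
    @edits = []
  end

  def replace(range, text)
    @edits << [range.begin, range.end, text]
    self
  end

  def insert_before(range, text)
    @edits << [range.begin, range.begin, text]
    self
  end

  def insert_after(range, text)
    @edits << [range.end, range.end, text]
    self
  end

  def wrap(range, before, after)
    insert_before(range, before)
    insert_after(range, after)
  end

  def process
    result = @source.dup
    @edits.sort_by { |start, _finish, _text| -start }.each do |start, finish, text|
      result[start...finish] = text
    end
    result
  end
end

source = '!array.empty?'
whole = 0...source.size # the entire expression
receiver = 1...6        # the characters of "array"

puts SketchCorrector.new(source).replace(whole, 'array.any?').process # array.any?
puts SketchCorrector.new(source).wrap(receiver, '(', ')').process     # !(array).empty?
```

In a real cop the ranges come from the node's location rather than from hand-written offsets.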

Preventing clobbering

The corrector detects and prevents correcting overlapping nodes, to prevent one correction from clobbering another. Supporting nested corrections is done by taking multiple passes, and skipping corrections for nested nodes. This can be implemented using the IgnoredNode module:

 extend AutoCorrector
+include IgnoredNode

 def on_send(node)
   return unless some_condition?(node)

   add_offense(node) do |corrector|
+    next if part_of_ignored_node?(node)
+
     corrector.replace(node, "...")
   end
+
+  ignore_node(node)
 end

This works because file correction is implemented by repeating investigation and correction until the file no longer requires correction, meaning all nested nodes will eventually be processed.

Note that expect_correction in Cop specs asserts the result after all passes.
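The repeat-until-stable loop can be sketched in a few lines. Everything here is hypothetical: `correct_once` stands in for a single investigation and correction pass that can only fix the outermost offense, and `max_passes` is an illustrative safety limit chosen for this example.

```ruby
# Hypothetical single pass: removes one layer of doubled parentheses,
# standing in for a correction that skips nested offenses.
def correct_once(source)
  source.sub(/\(\((.*)\)\)/, '(\1)')
end

# Repeats passes until a pass leaves the source unchanged.
def correct_until_stable(source, max_passes: 10)
  passes = 0
  loop do
    corrected = correct_once(source)
    return [source, passes] if corrected == source

    source = corrected
    passes += 1
    raise 'correction did not settle' if passes >= max_passes
  end
end

result, passes = correct_until_stable('foo(((bar)))')
puts result # foo(bar)
puts passes # 2
```

Each pass fixes only the outer layer, yet the nested layer is still reached because the next pass sees it as the new outermost offense.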

Limit by Ruby or gem versions

Some cops only make sense in particular contexts, such as when the user targets a minimum Ruby version. There are helpers that let you constrain your cops automatically so that they only run where applicable.

Requiring a minimum Ruby version

If your cop uses new Ruby syntax or standard library APIs, it should only register offenses if the user has the proper target Ruby version, which you can require with `TargetRubyVersion#minimum_target_ruby_version`.

For example, the `Performance/SelectMap` cop requires Ruby 2.7, which introduced `Enumerable#filter_map`:

class RuboCop::Cop::Performance::SelectMap < Base
  extend TargetRubyVersion

  minimum_target_ruby_version 2.7

  # ...
end

This cop won’t register offenses on Ruby 2.6 or older.
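To see why the version gate matters, compare the two forms involved. The snippet below is a plain Ruby illustration with made-up sample data; it requires Ruby 2.7 or newer to run, which is exactly the constraint the cop declares.

```ruby
numbers = [1, 2, 3, 4, 5, 6]

# Two passes: filter first, then transform.
with_select_map = numbers.select(&:even?).map { |n| n * 10 }

# One pass (Ruby 2.7+): keep only the truthy results of the block.
with_filter_map = numbers.filter_map { |n| n * 10 if n.even? }

p with_select_map # [20, 40, 60]
p with_filter_map # [20, 40, 60]
```

Suggesting `filter_map` to a project that targets Ruby 2.6 would produce code that fails at runtime, so the cop stays silent there.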

Requiring a maximum Ruby version

Mirroring minimum_target_ruby_version, you can also specify a maximum Ruby version your cop should analyze.

For example, the `Lint/CircularArgumentReference` cop only runs when analyzing code for Ruby before 2.7. The code it looks for can never be written in more recent Rubies, where it would be a syntax error:

class RuboCop::Cop::Lint::CircularArgumentReference < Base
  extend TargetRubyVersion

  maximum_target_ruby_version 2.6

  # ...
end

Requiring a gem

If your cop depends on the presence of a gem, you can declare that with `RuboCop::Cop::Base.requires_gem`.

For example, to declare that `MyCop` should only apply if the bundle is using `my-gem` with a version between `1.2.3` and `4.5.6`:

class MyCop < Base
  requires_gem "my-gem", ">= 1.2.3", "< 4.5.6"

  # ...
end

You can specify any gem requirement using the same syntax as your Gemfile.
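The requirement strings follow RubyGems semantics, which you can explore directly with `Gem::Requirement` and `Gem::Version`. This sketch only demonstrates how such constraints evaluate; it makes no claim about how `requires_gem` is implemented internally, and the version numbers are sample values.

```ruby
# The same two constraints as in the example above.
requirement = Gem::Requirement.new('>= 1.2.3', '< 4.5.6')

%w[1.2.2 1.2.3 3.0.0 4.5.6].each do |version|
  satisfied = requirement.satisfied_by?(Gem::Version.new(version))
  puts "#{version}: #{satisfied}"
end
# 1.2.2: false
# 1.2.3: true
# 3.0.0: true
# 4.5.6: false
```

The lower bound is inclusive and the upper bound is exclusive, so `4.5.6` itself falls outside the range.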

You can also handle multiple versions of a gem with target_gem_version. It behaves similarly to target_ruby_version, allowing you to inspect a gem version at runtime:

class MyCop < Base
  requires_gem "my-gem"

  def on_send(node)
    if target_gem_version("my-gem") < "2.0"
      # ...
    else
      # ...
    end
  end
end

When writing tests, you can specify the gem version to run your example against through the `gem_versions` RSpec helper:

RSpec.describe RuboCop::Cop::Style::MyCop, :config do
  context 'when `my-gem` is at version `1.X`' do
    let(:gem_versions) { { 'my-gem' => '1.0.0' } }

    it 'registers no offense' do
      expect_no_offenses(<<~RUBY)
        MyGem.foo
      RUBY
    end
  end

  context 'when `my-gem` is at version `2.X`' do
    let(:gem_versions) { { 'my-gem' => '2.0.0' } }

    it 'registers an offense' do
      expect_offense(<<~RUBY)
        MyGem.foo
        ^^^^^^^^^ Instead of `foo`, use the newer `bar` method.
      RUBY
    end
  end
end

Special case: Rails

Historically, many cops in `rubocop-rails` aren’t actually specific to Rails itself, but to some of its components (e.g., Active Support). These dependencies are declared with `TargetRailsVersion.minimum_target_rails_version`.

For example, the `Rails/Pluck` cop requires Active Support 6.0, which introduces `Enumerable#pluck`:

class RuboCop::Cop::Rails::Pluck < Base
  extend TargetRailsVersion

  minimum_target_rails_version 6.0

  #...
end

Configuration

Each cop can hold its own configuration. Inside the cop, `cop_config` returns a hash with the options declared for it in the `.rubocop.yml` file.

For example, let’s imagine we want to make the replacement method configurable, so it works with a method other than `.any?`:

Style/SimplifyNotEmptyWithAny:
  Enabled: true
  ReplaceAnyWith: "size > 0"

Then, in the correction block, you just need to use `cop_config`:

def on_send(node)
  expression = not_empty_call?(node)
  return unless expression

  add_offense(node) do |corrector|
    replacement = cop_config['ReplaceAnyWith'] || 'any?'
    corrector.replace(node, "#{expression.source}.#{replacement}")
  end
end
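The fallback logic is easy to check in isolation. In this sketch a plain Hash stands in for `cop_config`, and `replacement_for` is a hypothetical helper that mirrors the string built in the correction block above.

```ruby
# Hypothetical helper: builds the corrected source from the receiver's source
# and a config hash, falling back to `any?` when the key is absent.
def replacement_for(receiver_source, cop_config)
  method = cop_config['ReplaceAnyWith'] || 'any?'
  "#{receiver_source}.#{method}"
end

puts replacement_for('array', {})                               # array.any?
puts replacement_for('array', { 'ReplaceAnyWith' => 'size > 0' }) # array.size > 0
```

Keeping a default in the code means the cop still behaves sensibly when a user's configuration omits the option.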

Common patterns

This section covers conventions and patterns you should follow when writing cops.

Handling safe navigation (`&.`)

If your cop defines on_send, you should almost always also handle the safe navigation operator by aliasing on_csend:

def on_send(node)
  # ...
end
alias on_csend on_send

Without this, your cop will silently ignore code like `foo&.bar`. Only skip this if the cop explicitly does not apply to safe navigation.

Handling numbered-parameter and `it`-parameter blocks

Similarly, if your cop defines on_block, alias the numbered-parameter and it-parameter variants:

def on_block(node)
  # ...
end
alias on_numblock on_block
alias on_itblock on_block

Using `RESTRICT_ON_SEND`

When your cop uses on_send, define a RESTRICT_ON_SEND constant listing the method names the cop cares about. This is a performance optimization — RuboCop will skip calling on_send entirely for method names not in the list:

RESTRICT_ON_SEND = %i[bad_method other_bad_method].freeze
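The effect of the restriction, combined with the `on_csend` alias from earlier in this section, can be sketched with a toy dispatcher. `SketchCop` and `dispatch` are hypothetical and do not reflect RuboCop's real `Base` class or traversal code; applying the restriction to both callbacks is an assumption of this sketch.

```ruby
# Hypothetical cop-like class; callbacks receive a method name instead of a node.
class SketchCop
  RESTRICT_ON_SEND = %i[bad_method other_bad_method].freeze

  attr_reader :seen

  def initialize
    @seen = []
  end

  def on_send(method_name)
    @seen << method_name
  end
  alias on_csend on_send
end

# Hypothetical dispatcher: skips the callback unless the method name is listed.
def dispatch(cop, callback, method_name)
  restricted = cop.class.const_get(:RESTRICT_ON_SEND)
  return unless restricted.include?(method_name)

  cop.public_send(callback, method_name)
end

cop = SketchCop.new
dispatch(cop, :on_send, :bad_method)        # regular call: foo.bad_method
dispatch(cop, :on_csend, :other_bad_method) # safe navigation: foo&.other_bad_method
dispatch(cop, :on_send, :puts)              # not listed, callback never runs

p cop.seen # [:bad_method, :other_bad_method]
```

The callback body never has to check the method name itself, because unlisted names are filtered out before the callback is invoked.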

Documenting node matchers with `@!method`

Every `def_node_matcher` and `def_node_search` should have a `@!method` YARD tag above it so that documentation tools can find the generated method:

# @!method not_empty_call?(node)
def_node_matcher :not_empty_call?, <<~PATTERN
  (send (send $(...) :empty?) :!)
PATTERN

Declaring autocorrect support

If your cop provides a corrector block inside `add_offense`, you must declare `extend AutoCorrector` at the top of the class. Without it, RuboCop won’t know the cop supports autocorrection.

Running tests

RuboCop supports two parser engines: the Parser gem and Prism. By default, tests are executed with the Parser:

$ bundle exec rake spec

To run all tests with Prism, use `bundle exec rake prism_spec`. To run a single spec file with Prism, set the `PARSER_ENGINE` environment variable:

$ PARSER_ENGINE=parser_prism bundle exec rspec spec/rubocop/cop/style/hash_syntax_spec.rb

`bundle exec rake` runs the full CI suite: specs for both parser engines, self-linting, and documentation checks. Always run this before submitting a PR.

Documentation

Every cop needs YARD documentation with examples directly in the source file. The CI `documentation_syntax_check` task parses every `@example` block, so all examples must contain valid Ruby syntax.

Key rules:

The first line of the YARD comment must be a complete sentence starting with a verb and ending with a period (e.g., "Checks for …", "Enforces …").
Every cop must have at least one `# bad` / `# good` example pair.
For each `SupportedStyle` or unique configuration key, add a separate `@example` block.
Mark the default style with `(default)`.
List config keys in alphabetical order.

module RuboCop
  module Cop
    module Style
      # Simplifies `!array.empty?` to `array.any?`.
      #
      # @example EnforcedStyle: any? (default)
      #   # bad
      #   !array.empty?
      #
      #   # good
      #   array.any?
      #
      # @example EnforcedStyle: size
      #   # bad
      #   !array.empty?
      #
      #   # good
      #   array.size > 0
      #
      class SimplifyNotEmptyWithAny < Base
        # ...

Add additional `@example` blocks following the same pattern for each style or config value.

Testing your cop in a real codebase

It’s generally good practice to check if your cop is working properly over a significant codebase (e.g. Rails or some big project you’re working on) to guarantee it’s working in a range of different syntaxes.

There are several ways to do this. Two common approaches:

From within your local rubocop repo, run `exe/rubocop ~/your/other/codebase`.
From within the other codebase’s Gemfile, set a path to your local repo like this: `gem 'rubocop', path: '/full/path/to/rubocop'`. Then run `rubocop` within your codebase.

With the second approach, you can use local versions of RuboCop extension repos such as `rubocop-rspec` as well.

Tip: Use `--only` to run just your cop and avoid noise from other cops:

$ rubocop --only Style/SimplifyNotEmptyWithAny

Custom formatters

Beyond cops, RuboCop can also be extended with custom output formatters.

Creating a custom formatter

To implement a custom formatter, you need to subclass `::RuboCop::Formatter::BaseFormatter` and override some methods, or implement all formatter API methods by duck typing.

See the API documentation for `RuboCop::Formatter::BaseFormatter` for more formatter API details.
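As a rough sketch of the duck-typing route, the class below implements the lifecycle callbacks (`started`, `file_started`, `file_finished`, `finished`) without inheriting from anything. `OffenseCountFormatter` and `FakeOffense` are hypothetical names; the struct only stands in for real offense objects so the sketch can run outside RuboCop, and a production formatter may need more than what is shown here.

```ruby
require 'stringio'

# Hypothetical duck-typed formatter that prints one line per offense.
class OffenseCountFormatter
  attr_reader :output

  def initialize(output, options = {})
    @output = output
    @options = options
    @total = 0
  end

  def started(target_files)
    output.puts "Inspecting #{target_files.size} file(s)"
  end

  def file_started(_file, _options); end

  def file_finished(file, offenses)
    @total += offenses.size
    offenses.each do |offense|
      output.puts "#{file}:#{offense.line}: #{offense.cop_name}: #{offense.message}"
    end
  end

  def finished(_inspected_files)
    output.puts "#{@total} offense(s) detected"
  end
end

# Stand-in for an offense object, used only to exercise the sketch.
FakeOffense = Struct.new(:line, :cop_name, :message)

io = StringIO.new
formatter = OffenseCountFormatter.new(io)
formatter.started(['a.rb'])
formatter.file_started('a.rb', {})
formatter.file_finished('a.rb', [FakeOffense.new(3, 'Style/Foo', 'Example message.')])
formatter.finished(['a.rb'])
puts io.string
```

Subclassing `BaseFormatter` is usually the simpler option, since it supplies no-op defaults for any callbacks you do not override.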

Using a custom formatter from the command line

You can tell RuboCop to use your custom formatter with a combination of the `--format` and `--require` options. For example, when you have defined `MyCustomFormatter` in `./path/to/my_custom_formatter.rb`, you would type this command:

$ rubocop --require ./path/to/my_custom_formatter --format MyCustomFormatter

Template support

RuboCop can also analyze Ruby embedded in templates (ERB, Haml, Slim, etc.) through a Ruby extractor API.

A template file contains multiple embedded Ruby snippets, unlike a regular Ruby file. RuboCop solves this with RuboCop::Runner.ruby_extractors — a list of callable extractors that plugins can prepend to.

A Ruby extractor takes a `RuboCop::ProcessedSource` and returns either an Array of Hashes containing the extracted Ruby source and offsets, or `nil` if the file is not relevant.

ruby_extractor.call(processed_source)

An example returned value from a Ruby extractor would be as follows:

[
  {
    offset: 2,
    processed_source: #<RuboCop::ProcessedSource>
  },
  {
    offset: 10,
    processed_source: #<RuboCop::ProcessedSource>
  }
]

On the extension side, the code would be something like this:

RuboCop::Runner.ruby_extractors.unshift(ruby_extractor)

RuboCop::Runner.ruby_extractors is processed from the beginning and ends when one of them returns a non-nil value. By default, there is a Ruby extractor that returns the given Ruby source code with offset 0, so you can unshift any Ruby extractor before it.
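The "first non-nil result wins" behavior can be sketched with plain lambdas. This is a simplified model, not the real API: the extractors here take a path and a source string instead of a `ProcessedSource`, they return a `:source` string instead of a `:processed_source` object, and `erb_extractor`, `default_extractor`, and `extract` are hypothetical names.

```ruby
# Hypothetical extractor for ERB-like templates; returns nil for other files.
erb_extractor = lambda do |path, source|
  return nil unless path.end_with?('.erb')

  snippets = []
  source.scan(/<%=?(.*?)%>/m) do
    match = Regexp.last_match
    snippets << { offset: match.begin(1), source: match[1] }
  end
  snippets
end

# Fallback: treat the whole file as Ruby, starting at offset 0.
default_extractor = ->(_path, source) { [{ offset: 0, source: source }] }

extractors = [default_extractor]
extractors.unshift(erb_extractor) # plugins go in front of the default

# Walks the list from the beginning and stops at the first non-nil result.
def extract(extractors, path, source)
  extractors.each do |extractor|
    result = extractor.call(path, source)
    return result unless result.nil?
  end
  nil
end

p extract(extractors, 'view.html.erb', '<p><%= user.name %></p>')
p extract(extractors, 'model.rb', 'puts 1')
```

Because the default extractor never returns `nil`, it has to stay last; anything placed after it would never be consulted.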

Warning

This is still an experimental feature and may change in the future.