Class: Prism::Translation::Parser

Relationships & Source Files
Namespace Children
Classes: `Builder`, `Compiler`, `Lexer`, `PrismDiagnostic`
Extension / Inclusion / Inheritance Descendants
Subclasses: Parser33, Parser34, Parser35
Super Chains via Extension / Inclusion / Inheritance
Class Chain: self, Parser::Base
Instance Chain: self, Parser::Base
Inherits:	Parser::Base (Object > Parser::Base > Prism::Translation::Parser)
Defined in:	lib/prism/translation/parser.rb, lib/prism/translation/parser/builder.rb, lib/prism/translation/parser/compiler.rb, lib/prism/translation/parser/lexer.rb

Overview

This class is the entry-point for converting a prism syntax tree into the whitequark/parser gem’s syntax tree. It inherits from the base parser for the parser gem, and overrides the parse* methods to parse with prism and then translate.

Constant Summary

Diagnostic = private Internal use only
# File 'lib/prism/translation/parser.rb', line 23
```
::Parser::Diagnostic
```
Racc_debug_parser = Internal use only
# File 'lib/prism/translation/parser.rb', line 40
```
false
```

Class Method Summary

.new(builder = Prism::Translation::Parser::Builder.new, parser: Prism) ⇒ Parser constructor

The builder argument is used to create the parser using our custom builder class by default.

Instance Method Summary

#default_encoding

The default encoding for Ruby files is UTF-8.
#parse(source_buffer)

Parses a source buffer and returns the AST.
#parse_with_comments(source_buffer)

Parses a source buffer and returns the AST and the source code comments.
#tokenize(source_buffer, recover = false)

Parses a source buffer and returns the AST, the source code comments, and the tokens emitted by the lexer.
#try_declare_numparam(node)

Since prism resolves numbered parameters for us, we don’t need to support this kind of logic here.
#build_ast(program, offset_cache) private

Build the parser gem AST from the prism AST.
#build_comments(comments, offset_cache) private

Build the parser gem comments from the prism comments.
#build_offset_cache(source) private

::Prism deals with offsets in bytes, while the parser gem deals with offsets in characters.
#build_range(location, offset_cache) private

Build a range from a prism location.
#build_tokens(tokens, offset_cache) private

Build the parser gem tokens from the prism tokens.
#convert_for_prism(version) private

Converts the version format handled by Parser to the format handled by ::Prism.
#error_diagnostic(error, offset_cache) private

Build a diagnostic from the given prism parse error.
#prism_options private

Options for how prism should parse/lex the source.
#unwrap(result, offset_cache) private

If there was an error generated during the parse, then raise an appropriate syntax error.
#valid_error?(error) ⇒ Boolean private

This is a hook to allow consumers to disable some errors if they don’t want them to block creating the syntax tree.
#valid_warning?(warning) ⇒ Boolean private

This is a hook to allow consumers to disable some warnings if they don’t want them to block creating the syntax tree.
#warning_diagnostic(warning, offset_cache) private

Build a diagnostic from the given prism parse warning.
#version Internal use only
#yyerror Internal use only

Constructor Details

.new(builder = Prism::Translation::Parser::Builder.new, parser: Prism) ⇒ `Parser`

The builder argument is used to create the parser using our custom builder class by default.

The :parser keyword argument lets you supply any parser object; its results are translated into a form compatible with the Parser gem.

For example, in RuboCop for Ruby LSP, the following approach can be used to improve performance by reusing a pre-parsed ::Prism::ParseLexResult:

class PrismPreparsed
  def initialize(prism_result)
    @prism_result = prism_result
  end

  def parse_lex(source, **options)
    @prism_result
  end
end

prism_preparsed = PrismPreparsed.new(prism_result)

Prism::Translation::Parser34.new(builder, parser: prism_preparsed)

An object passed to the :parser keyword argument should implement the `parse` and `parse_lex` methods as needed, with the same signatures as `Prism.parse` and `Prism.parse_lex`.

# File 'lib/prism/translation/parser.rb', line 67


def initialize(builder = Prism::Translation::Parser::Builder.new, parser: Prism)
  if !builder.is_a?(Prism::Translation::Parser::Builder)
    warn(<<~MSG, uplevel: 1, category: :deprecated)
      [deprecation]: The builder passed to `Prism::Translation::Parser.new` is not a \
      `Prism::Translation::Parser::Builder` subclass. This will raise in the next major version.
    MSG
  end
  @parser = parser

  super(builder)
end

Instance Method Details

#build_ast(program, offset_cache) (private)

Build the parser gem AST from the prism AST.

# File 'lib/prism/translation/parser.rb', line 306


def build_ast(program, offset_cache)
  program.accept(Compiler.new(self, offset_cache))
end

#build_comments(comments, offset_cache) (private)

Build the parser gem comments from the prism comments.

# File 'lib/prism/translation/parser.rb', line 311


def build_comments(comments, offset_cache)
  comments.map do |comment|
    ::Parser::Source::Comment.new(build_range(comment.location, offset_cache))
  end
end

#build_offset_cache(source) (private)

::Prism deals with offsets in bytes, while the parser gem deals with offsets in characters. We need to handle this conversion in order to build the parser gem AST.

If the bytesize of the source is the same as the length, then we can just use the offset directly. Otherwise, we build an array where the index is the byte offset and the value is the character offset.

# File 'lib/prism/translation/parser.rb', line 289


def build_offset_cache(source)
  if source.bytesize == source.length
    -> (offset) { offset }
  else
    offset_cache = []
    offset = 0

    source.each_char do |char|
      char.bytesize.times { offset_cache << offset }
      offset += 1
    end

    offset_cache << offset
  end
end
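
The conversion described above can be tried in isolation. The following is a minimal stand-alone sketch, not the library's own entry point: the method name `byte_to_char_offsets` is hypothetical, and the body simply mirrors the logic shown in `build_offset_cache`. It also illustrates why callers can write `offset_cache[n]` in both branches: `Proc#[]` invokes the lambda, while `Array#[]` performs an index lookup.

```ruby
# Hypothetical helper mirroring build_offset_cache; not part of the prism API.
def byte_to_char_offsets(source)
  # When every character is one byte wide, byte and character offsets agree,
  # so an identity lambda is enough.
  return ->(offset) { offset } if source.bytesize == source.length

  cache = []
  char_offset = 0

  source.each_char do |char|
    # Each byte of a multibyte character maps to that character's index.
    char.bytesize.times { cache << char_offset }
    char_offset += 1
  end

  # One extra entry so the end-of-source byte offset can be looked up too.
  cache << char_offset
end

ascii = byte_to_char_offsets("abc")
utf8  = byte_to_char_offsets("aé=1")

ascii[2] # => 2 (lambda invoked through Proc#[])
utf8[3]  # => 2 (byte 3 is "=", which is the third character)
```

The array costs one entry per source byte, which is why the identity lambda is used whenever the source contains no multibyte characters.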

#build_range(location, offset_cache) (private)

Build a range from a prism location.

# File 'lib/prism/translation/parser.rb', line 323


def build_range(location, offset_cache)
  ::Parser::Source::Range.new(
    source_buffer,
    offset_cache[location.start_offset],
    offset_cache[location.end_offset]
  )
end

#build_tokens(tokens, offset_cache) (private)

Build the parser gem tokens from the prism tokens.

# File 'lib/prism/translation/parser.rb', line 318


def build_tokens(tokens, offset_cache)
  Lexer.new(source_buffer, tokens, offset_cache).to_a
end

#convert_for_prism(version) (private)

Converts the version format handled by Parser to the format handled by ::Prism.

# File 'lib/prism/translation/parser.rb', line 346


def convert_for_prism(version)
  case version
  when 33
    "3.3.1"
  when 34
    "3.4.0"
  when 35
    "3.5.0"
  else
    "latest"
  end
end

#default_encoding

The default encoding for Ruby files is UTF-8.

# File 'lib/prism/translation/parser.rb', line 84


def default_encoding
  Encoding::UTF_8
end

#error_diagnostic(error, offset_cache) (private)

Build a diagnostic from the given prism parse error.

# File 'lib/prism/translation/parser.rb', line 167


def error_diagnostic(error, offset_cache)
  location = error.location
  diagnostic_location = build_range(location, offset_cache)

  case error.type
  when :argument_block_multi
    Diagnostic.new(:error, :block_and_blockarg, {}, diagnostic_location, [])
  when :argument_formal_constant
    Diagnostic.new(:error, :argument_const, {}, diagnostic_location, [])
  when :argument_formal_class
    Diagnostic.new(:error, :argument_cvar, {}, diagnostic_location, [])
  when :argument_formal_global
    Diagnostic.new(:error, :argument_gvar, {}, diagnostic_location, [])
  when :argument_formal_ivar
    Diagnostic.new(:error, :argument_ivar, {}, diagnostic_location, [])
  when :argument_no_forwarding_amp
    Diagnostic.new(:error, :no_anonymous_blockarg, {}, diagnostic_location, [])
  when :argument_no_forwarding_star
    Diagnostic.new(:error, :no_anonymous_restarg, {}, diagnostic_location, [])
  when :argument_no_forwarding_star_star
    Diagnostic.new(:error, :no_anonymous_kwrestarg, {}, diagnostic_location, [])
  when :begin_lonely_else
    location = location.copy(length: 4)
    diagnostic_location = build_range(location, offset_cache)
    Diagnostic.new(:error, :useless_else, {}, diagnostic_location, [])
  when :class_name, :module_name
    Diagnostic.new(:error, :module_name_const, {}, diagnostic_location, [])
  when :class_in_method
    Diagnostic.new(:error, :class_in_def, {}, diagnostic_location, [])
  when :def_endless_setter
    Diagnostic.new(:error, :endless_setter, {}, diagnostic_location, [])
  when :embdoc_term
    Diagnostic.new(:error, :embedded_document, {}, diagnostic_location, [])
  when :incomplete_variable_class, :incomplete_variable_class_3_3
    location = location.copy(length: location.length + 1)
    diagnostic_location = build_range(location, offset_cache)

    Diagnostic.new(:error, :cvar_name, { name: location.slice }, diagnostic_location, [])
  when :incomplete_variable_instance, :incomplete_variable_instance_3_3
    location = location.copy(length: location.length + 1)
    diagnostic_location = build_range(location, offset_cache)

    Diagnostic.new(:error, :ivar_name, { name: location.slice }, diagnostic_location, [])
  when :invalid_variable_global, :invalid_variable_global_3_3
    Diagnostic.new(:error, :gvar_name, { name: location.slice }, diagnostic_location, [])
  when :module_in_method
    Diagnostic.new(:error, :module_in_def, {}, diagnostic_location, [])
  when :numbered_parameter_ordinary
    Diagnostic.new(:error, :ordinary_param_defined, {}, diagnostic_location, [])
  when :numbered_parameter_outer_scope
    Diagnostic.new(:error, :numparam_used_in_outer_scope, {}, diagnostic_location, [])
  when :parameter_circular
    Diagnostic.new(:error, :circular_argument_reference, { var_name: location.slice }, diagnostic_location, [])
  when :parameter_name_repeat
    Diagnostic.new(:error, :duplicate_argument, {}, diagnostic_location, [])
  when :parameter_numbered_reserved
    Diagnostic.new(:error, :reserved_for_numparam, { name: location.slice }, diagnostic_location, [])
  when :regexp_unknown_options
    Diagnostic.new(:error, :regexp_options, { options: location.slice[1..] }, diagnostic_location, [])
  when :singleton_for_literals
    Diagnostic.new(:error, :singleton_literal, {}, diagnostic_location, [])
  when :string_literal_eof
    Diagnostic.new(:error, :string_eof, {}, diagnostic_location, [])
  when :unexpected_token_ignore
    Diagnostic.new(:error, :unexpected_token, { token: location.slice }, diagnostic_location, [])
  when :write_target_in_method
    Diagnostic.new(:error, :dynamic_const, {}, diagnostic_location, [])
  else
    PrismDiagnostic.new(error.message, :error, error.type, diagnostic_location)
  end
end

#parse(source_buffer)

Parses a source buffer and returns the AST.

# File 'lib/prism/translation/parser.rb', line 92


def parse(source_buffer)
  @source_buffer = source_buffer
  source = source_buffer.source

  offset_cache = build_offset_cache(source)
  result = unwrap(@parser.parse(source, **prism_options), offset_cache)

  build_ast(result.value, offset_cache)
ensure
  @source_buffer = nil
end

#parse_with_comments(source_buffer)

Parses a source buffer and returns the AST and the source code comments.

# File 'lib/prism/translation/parser.rb', line 105


def parse_with_comments(source_buffer)
  @source_buffer = source_buffer
  source = source_buffer.source

  offset_cache = build_offset_cache(source)
  result = unwrap(@parser.parse(source, **prism_options), offset_cache)

  [
    build_ast(result.value, offset_cache),
    build_comments(result.comments, offset_cache)
  ]
ensure
  @source_buffer = nil
end

#prism_options (private)

Options for how prism should parse/lex the source.

# File 'lib/prism/translation/parser.rb', line 332


def prism_options
  options = {
    filepath: @source_buffer.name,
    version: convert_for_prism(version),
    partial_script: true,
  }
  # The parser gem always encodes to UTF-8, unless it is binary.
  # https://github.com/whitequark/parser/blob/v3.3.6.0/lib/parser/source/buffer.rb#L80-L107
  options[:encoding] = false if @source_buffer.source.encoding != Encoding::BINARY

  options
end

#tokenize(source_buffer, recover = false)

Parses a source buffer and returns the AST, the source code comments, and the tokens emitted by the lexer.

# File 'lib/prism/translation/parser.rb', line 122


def tokenize(source_buffer, recover = false)
  @source_buffer = source_buffer
  source = source_buffer.source

  offset_cache = build_offset_cache(source)
  result =
    begin
      unwrap(@parser.parse_lex(source, **prism_options), offset_cache)
    rescue ::Parser::SyntaxError
      raise if !recover
    end

  program, tokens = result.value
  ast = build_ast(program, offset_cache) if result.success?

  [
    ast,
    build_comments(result.comments, offset_cache),
    build_tokens(tokens, offset_cache)
  ]
ensure
  @source_buffer = nil
end

#try_declare_numparam(node)

Since prism resolves numbered parameters for us, we don’t need to support this kind of logic here.

# File 'lib/prism/translation/parser.rb', line 148


def try_declare_numparam(node)
  node.children[0].match?(/\A_[1-9]\z/)
end
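
To see which names the pattern in the method body accepts, here is a small stand-alone sketch. The constant `NUMPARAM_PATTERN` and the method `numbered_parameter_name?` are hypothetical names introduced for illustration; only the regular expression is taken from the source above.

```ruby
# Hypothetical names; the regular expression is copied from try_declare_numparam.
NUMPARAM_PATTERN = /\A_[1-9]\z/

# Returns true when the name is exactly an underscore followed by one digit 1-9.
def numbered_parameter_name?(name)
  name.match?(NUMPARAM_PATTERN)
end

numbered_parameter_name?(:_1)  # => true
numbered_parameter_name?("_9") # => true
numbered_parameter_name?(:_0)  # => false (zero is excluded)
numbered_parameter_name?(:_10) # => false (anchors reject extra digits)
```

Both `Symbol#match?` and `String#match?` accept a regular expression, so the check works whichever of the two the node's first child holds.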

#unwrap(result, offset_cache) (private)

If an error was generated during the parse, then raise an appropriate syntax error. Otherwise return the result.

# File 'lib/prism/translation/parser.rb', line 267


def unwrap(result, offset_cache)
  result.errors.each do |error|
    next unless valid_error?(error)
    diagnostics.process(error_diagnostic(error, offset_cache))
  end

  result.warnings.each do |warning|
    next unless valid_warning?(warning)
    diagnostic = warning_diagnostic(warning, offset_cache)
    diagnostics.process(diagnostic) if diagnostic
  end

  result
end

#valid_error?(error) ⇒ `Boolean` (private)

This is a hook to allow consumers to disable some errors if they don’t want them to block creating the syntax tree.

# File 'lib/prism/translation/parser.rb', line 156


def valid_error?(error)
  true
end

#valid_warning?(warning) ⇒ `Boolean` (private)

This is a hook to allow consumers to disable some warnings if they don’t want them to block creating the syntax tree.

# File 'lib/prism/translation/parser.rb', line 162


def valid_warning?(warning)
  true
end
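
The two hooks above follow the same pattern: `#unwrap` asks the hook about each error or warning and skips the ones for which it returns false. The sketch below illustrates that pattern with stand-in classes so that it runs without the prism or parser gems. `FakeError`, `SketchTranslator`, `LenientTranslator`, and `reported_errors` are all hypothetical names; a real consumer would presumably subclass `Prism::Translation::Parser` (or a versioned subclass) and override `valid_error?` in the same way.

```ruby
# Stand-in for a prism parse error; only the `type` field matters here.
FakeError = Struct.new(:type)

# Stand-in base class mirroring how #unwrap consults the hook.
class SketchTranslator
  # Returns only the errors the hook allows through.
  def reported_errors(errors)
    errors.select { |error| valid_error?(error) }
  end

  private

  # Default hook: every error is reported.
  def valid_error?(error)
    true
  end
end

# A consumer subclass that chooses to ignore one error type.
class LenientTranslator < SketchTranslator
  private

  def valid_error?(error)
    error.type != :begin_lonely_else
  end
end

errors = [FakeError.new(:begin_lonely_else), FakeError.new(:string_literal_eof)]

SketchTranslator.new.reported_errors(errors).map(&:type)
# => [:begin_lonely_else, :string_literal_eof]
LenientTranslator.new.reported_errors(errors).map(&:type)
# => [:string_literal_eof]
```

Keeping the decision in a small predicate lets a consumer change which diagnostics are processed without touching the parsing or translation steps.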

#version

This method is for internal use only.

# File 'lib/prism/translation/parser.rb', line 79


def version # :nodoc:
  34
end

#warning_diagnostic(warning, offset_cache) (private)

Build a diagnostic from the given prism parse warning.

# File 'lib/prism/translation/parser.rb', line 240


def warning_diagnostic(warning, offset_cache)
  diagnostic_location = build_range(warning.location, offset_cache)

  case warning.type
  when :ambiguous_first_argument_plus
    Diagnostic.new(:warning, :ambiguous_prefix, { prefix: "+" }, diagnostic_location, [])
  when :ambiguous_first_argument_minus
    Diagnostic.new(:warning, :ambiguous_prefix, { prefix: "-" }, diagnostic_location, [])
  when :ambiguous_prefix_ampersand
    Diagnostic.new(:warning, :ambiguous_prefix, { prefix: "&" }, diagnostic_location, [])
  when :ambiguous_prefix_star
    Diagnostic.new(:warning, :ambiguous_prefix, { prefix: "*" }, diagnostic_location, [])
  when :ambiguous_prefix_star_star
    Diagnostic.new(:warning, :ambiguous_prefix, { prefix: "**" }, diagnostic_location, [])
  when :ambiguous_slash
    Diagnostic.new(:warning, :ambiguous_regexp, {}, diagnostic_location, [])
  when :dot_dot_dot_eol
    Diagnostic.new(:warning, :triple_dot_at_eol, {}, diagnostic_location, [])
  when :duplicated_hash_key
    # skip, parser does this on its own
  else
    PrismDiagnostic.new(warning.message, :warning, warning.type, diagnostic_location)
  end
end

#yyerror

This method is for internal use only.

# File 'lib/prism/translation/parser.rb', line 88


def yyerror # :nodoc:
end
