Class: YARD::Parser::SourceParser

Relationships & Source Files
Inherits: Object
Defined in: lib/yard/parser/source_parser.rb

Overview

Responsible for parsing a source file into the namespace. Parsing also invokes handlers to process the parsed statements and generate any code objects that may be recognized.

Custom Parsers

SourceParser allows custom parsers to be registered and called when a certain filetype is recognized. To register a parser and hook it up to a set of file extensions, call .register_parser_type.

Constructor Details

.new(parser_type = SourceParser.parser_type, globals = nil) ⇒ SourceParser

Creates a new parser object for code parsing with a specific parser type.

Parameters:

  • parser_type (Symbol) (defaults to: SourceParser.parser_type)

    the parser type to use

  • globals (OpenStruct) (defaults to: nil)

    global state to be re-used across separate source files

# File 'lib/yard/parser/source_parser.rb', line 406

def initialize(parser_type = SourceParser.parser_type, globals1 = nil, globals2 = nil)
  globals = [true, false].include?(globals1) ? globals2 : globals1
  @file = '(stdin)'
  @globals = globals || OpenStruct.new
  self.parser_type = parser_type
end
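
The source above takes three arguments while the documented signature lists two. A minimal standalone sketch of how that argument handling resolves, assuming the boolean in the second position is a legacy argument that is simply skipped (the page does not say what it once meant; `resolve_globals` is a hypothetical helper, not part of YARD):

```ruby
require 'ostruct'

# Hypothetical helper mirroring the first lines of SourceParser#initialize.
def resolve_globals(globals1 = nil, globals2 = nil)
  # A literal true/false in the first slot is skipped, and the
  # following argument is used as the globals object instead.
  globals = [true, false].include?(globals1) ? globals2 : globals1
  globals || OpenStruct.new
end

shared = OpenStruct.new(:count => 1)
p resolve_globals(shared).equal?(shared)       # => true
p resolve_globals(true, shared).equal?(shared) # => true
p resolve_globals.is_a?(OpenStruct)            # => true
```

Passing the same OpenStruct to several parser instances is what lets state be re-used across separate source files.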

Class Attribute Details

.parser_type ⇒ Symbol (rw)

Returns:

  • (Symbol)

    the default parser type (defaults to :ruby)

# File 'lib/yard/parser/source_parser.rb', line 85

attr_reader :parser_type

.parser_type=(value) (rw)

# File 'lib/yard/parser/source_parser.rb', line 87

def parser_type=(value)
  @parser_type = validated_parser_type(value)
end

.parser_type_extensions ⇒ Hash (rw)

This method is for internal use only.

Returns:

  • (Hash)

    a list of registered parser type extensions

Since:

  • 0.5.6

# File 'lib/yard/parser/source_parser.rb', line 163

def parser_type_extensions; @@parser_type_extensions ||= {} end

.parser_type_extensions=(value) (rw)

# File 'lib/yard/parser/source_parser.rb', line 164

def parser_type_extensions=(value) @@parser_type_extensions = value end

.parser_types ⇒ Hash{Symbol=>Object} (rw)

This method is for internal use only.

Returns:

  • (Hash{Symbol=>Object})

    a list of registered parser types

Since:

  • 0.5.6

# File 'lib/yard/parser/source_parser.rb', line 157

def parser_types; @@parser_types ||= {} end

.parser_types=(value) (rw)

# File 'lib/yard/parser/source_parser.rb', line 158

def parser_types=(value) @@parser_types = value end

Class Method Details

.after_parse_file {|parser| ... } ⇒ Proc

Registers a callback to be called after an individual file is parsed. The block passed to this method will be called on subsequent parse calls.

To register a callback that is called after the entire list of files is processed, see .after_parse_list.

Examples:

Printing the length of each file after it is parsed

SourceParser.after_parse_file do |parser|
  puts "#{parser.file} is #{parser.contents.size} characters"
end
YARD.parse('lib/**/*.rb')
# prints:
"lib/foo.rb is 1240 characters"
"lib/foo_bar.rb is 248 characters"

Yields:

  • (parser)

    the yielded block is called once after each file that is parsed. This might happen many times for a single codebase.

Yield Parameters:

  • parser (SourceParser)

    the parser object that parsed the file.

Yield Returns:

  • (void)

    the return value for the block is ignored.

Returns:

  • (Proc)

    the yielded block

Since:

  • 0.7.0

# File 'lib/yard/parser/source_parser.rb', line 324

def after_parse_file(&block)
  after_parse_file_callbacks << block
end

.after_parse_file_callbacks ⇒ Array<Proc>

Returns:

  • (Array<Proc>)

    the list of callbacks to be called after parsing a file. Should only be used for testing.

Since:

  • 0.7.0

# File 'lib/yard/parser/source_parser.rb', line 352

def after_parse_file_callbacks
  @after_parse_file_callbacks ||= []
end

.after_parse_list {|files, globals| ... } ⇒ Proc

Registers a callback to be called after a list of files is parsed via .parse. The block passed to this method will be called on subsequent parse calls.

Examples:

Printing results after parsing occurs

SourceParser.after_parse_list do
  puts "Finished parsing!"
end
YARD.parse
# Prints "Finished parsing!" after parsing files

Yields:

  • (files, globals)

    the yielded block is called once after parsing all files

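Yield Parameters:

  • files (Array<String>)

    the list of files that were parsed.

  • globals (OpenStruct)

    an open struct containing arbitrary global state shared between files and handlers.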

Yield Returns:

  • (void)

    the return value for the block is ignored.

Returns:

  • (Proc)

    the yielded block

Since:

  • 0.7.0

# File 'lib/yard/parser/source_parser.rb', line 258

def after_parse_list(&block)
  after_parse_list_callbacks << block
end

.after_parse_list_callbacks ⇒ Array<Proc>

Returns:

  • (Array<Proc>)

    the list of callbacks to be called after parsing a list of files. Should only be used for testing.

Since:

  • 0.7.0

# File 'lib/yard/parser/source_parser.rb', line 338

def after_parse_list_callbacks
  @after_parse_list_callbacks ||= []
end

.before_parse_file {|parser| ... } ⇒ Proc

Registers a callback to be called before an individual file is parsed. The block passed to this method will be called on subsequent parse calls.

To register a callback that is called before the entire list of files is processed, see .before_parse_list.

Examples:

Installing a simple callback

SourceParser.before_parse_file do |parser|
  puts "I'm parsing #{parser.file}"
end
YARD.parse('lib/**/*.rb')
# prints:
"I'm parsing lib/foo.rb"
"I'm parsing lib/foo_bar.rb"
"I'm parsing lib/last_file.rb"

Cancel parsing of any test_*.rb files

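SourceParser.before_parse_file do |parser|
  # Use next (not return) to set the block's result; false cancels the file.
  # The basename is matched so that paths such as test/test_foo.rb are caught.
  next false if File.basename(parser.file) =~ /^test_.+\.rb$/
end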

Yields:

  • (parser)

    the yielded block is called once before each file that is parsed. This might happen many times for a single codebase.

Yield Parameters:

  • parser (SourceParser)

    the parser object that will #parse the file.

Yield Returns:

  • (Boolean)

    if the block returns false, parsing for the file is cancelled.

Returns:

  • (Proc)

    the yielded block

Since:

  • 0.7.0

# File 'lib/yard/parser/source_parser.rb', line 295

def before_parse_file(&block)
  before_parse_file_callbacks << block
end
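
A standalone sketch of per-file cancellation, written without YARD. It assumes callbacks receive plain file name strings rather than parser objects, and `add_callback` is a hypothetical helper. The block uses `next` to produce its result, because a bare `return` inside a block tries to return from the enclosing method rather than from the block.

```ruby
# Hypothetical registration helper; stores the block like before_parse_file does.
def add_callback(list, &block)
  list << block
end

callbacks = []
add_callback(callbacks) do |file|
  # next sets the block's return value; false means "skip this file".
  next false if File.basename(file) =~ /\Atest_.+\.rb\z/
  true
end

# A file is skipped when any callback returns exactly false.
parsed = ['lib/foo.rb', 'test/test_foo.rb'].reject do |file|
  callbacks.any? {|cb| cb.call(file) == false }
end

p parsed # => ["lib/foo.rb"]
```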

.before_parse_file_callbacks ⇒ Array<Proc>

Returns:

  • (Array<Proc>)

    the list of callbacks to be called before parsing a file. Should only be used for testing.

Since:

  • 0.7.0

# File 'lib/yard/parser/source_parser.rb', line 345

def before_parse_file_callbacks
  @before_parse_file_callbacks ||= []
end

.before_parse_list {|files, globals| ... } ⇒ Proc

Registers a callback to be called before a list of files is parsed via .parse. The block passed to this method will be called on subsequent parse calls.

Examples:

Installing a simple callback

SourceParser.before_parse_list do |files, globals|
  puts "Starting to parse..."
end
YARD.parse('lib/**/*.rb')
# prints "Starting to parse..."

Setting global state

SourceParser.before_parse_list do |files, globals|
  globals.method_count = 0
end
SourceParser.after_parse_list do |files, globals|
  puts "Found #{globals.method_count} methods"
end
class MyCountHandler < Handlers::Ruby::Base
  handles :def, :defs
  process { globals.method_count += 1 }
end
YARD.parse
# Prints: "Found 37 methods"

Using a global callback to cancel parsing

SourceParser.before_parse_list do |files, globals|
  # Use next (not return) to set the block's result; false cancels parsing.
  next false if files.include?('foo.rb')
end

YARD.parse(['foo.rb', 'bar.rb']) # callback cancels this method
YARD.parse('bar.rb') # parses normally

Yields:

  • (files, globals)

    the yielded block is called once before parsing all files

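Yield Parameters:

  • files (Array<String>)

    the list of files that will be parsed.

  • globals (OpenStruct)

    an open struct containing arbitrary global state shared between files and handlers.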

Yield Returns:

  • (Boolean)

    if the block returns false, parsing is cancelled.

Returns:

  • (Proc)

    the yielded block

Since:

  • 0.7.0

# File 'lib/yard/parser/source_parser.rb', line 234

def before_parse_list(&block)
  before_parse_list_callbacks << block
end

.before_parse_list_callbacks ⇒ Array<Proc>

Returns:

  • (Array<Proc>)

    the list of callbacks to be called before parsing a list of files. Should only be used for testing.

Since:

  • 0.7.0

# File 'lib/yard/parser/source_parser.rb', line 331

def before_parse_list_callbacks
  @before_parse_list_callbacks ||= []
end

.parse(paths = DEFAULT_PATH_GLOB, excluded = [], level = log.level) ⇒ void

This method returns an undefined value.

Parses a path or set of paths.

Parameters:

  • paths (String, Array<String>) (defaults to: DEFAULT_PATH_GLOB)

    a path, glob, or list of paths to parse

  • excluded (Array<String, Regexp>) (defaults to: [])

    a list of excluded path matchers

  • level (Fixnum) (defaults to: log.level)

    the logger level to use during parsing. See ::YARD::Logger

# File 'lib/yard/parser/source_parser.rb', line 99

def parse(paths = DEFAULT_PATH_GLOB, excluded = [], level = log.level)
  log.debug("Parsing #{paths.inspect} with `#{parser_type}` parser")
  excluded = excluded.map do |path|
    case path
    when Regexp; path
    else Regexp.new(path.to_s, Regexp::IGNORECASE)
    end
  end
  files = [paths].flatten.
    map {|p| File.directory?(p) ? "#{p}/**/*.{rb,c,cc,cxx,cpp}" : p }.
    map {|p| p.include?("*") ? Dir[p].sort_by {|d| [d.length, d] } : p }.flatten.
    reject {|p| !File.file?(p) || excluded.any? {|re| p =~ re } }.
    map {|p| p.encoding == Encoding.default_external ? p : p.dup.force_encoding(Encoding.default_external) }

  log.enter_level(level) do
    parse_in_order(*files.uniq)
  end
end
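
A standalone sketch of two details in the method above: exclusions that are not already a Regexp are compiled into case-insensitive patterns, and glob results are ordered by path length first, then alphabetically. The file list is made up for illustration and no filesystem access is performed; `normalize_excluded` is a hypothetical helper name.

```ruby
# Normalize exclusion matchers the same way .parse does.
def normalize_excluded(excluded)
  excluded.map do |path|
    path.is_a?(Regexp) ? path : Regexp.new(path.to_s, Regexp::IGNORECASE)
  end
end

# Hypothetical glob result; .parse obtains such a list from Dir[].
found = ['lib/foo_bar.rb', 'lib/foo.rb', 'lib/Vendor/x.rb', 'lib/a.rb']

# Shorter paths sort first; ties are broken alphabetically.
ordered = found.sort_by {|d| [d.length, d] }

matchers = normalize_excluded(['vendor', /_bar\.rb$/])
kept = ordered.reject {|p| matchers.any? {|re| p =~ re } }

p ordered # => ["lib/a.rb", "lib/foo.rb", "lib/foo_bar.rb", "lib/Vendor/x.rb"]
p kept    # => ["lib/a.rb", "lib/foo.rb"]
```

Note that a string exclusion is used as regular expression source, so characters such as `.` are not escaped and match any character.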

.parse_in_order(*files) ⇒ void (private)

This method returns an undefined value.

Parses a list of files in a queue.

Parameters:

  • files (Array<String>)

    a list of files to queue for parsing

# File 'lib/yard/parser/source_parser.rb', line 364

def parse_in_order(*files)
  global_state = OpenStruct.new

  return if before_parse_list_callbacks.any? do |cb|
    cb.call(files, global_state) == false
  end

  OrderedParser.new(global_state, files).parse

  after_parse_list_callbacks.each do |cb|
    cb.call(files, global_state)
  end
end
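
The cancellation rule above can be sketched without YARD: only a callback result that is exactly `false` stops the run, so a block that returns `nil` does not cancel parsing. `run_with_callbacks` and its arguments are hypothetical names, and the block passed to it stands in for the actual parsing step.

```ruby
require 'ostruct'

# Hypothetical stand-in for parse_in_order.
def run_with_callbacks(files, before, after)
  state = OpenStruct.new
  # any? stops at the first callback that returns false.
  return :cancelled if before.any? {|cb| cb.call(files, state) == false }
  yield files, state
  after.each {|cb| cb.call(files, state) }
  :parsed
end

log = []
before = [
  proc {|_files, state| state.seen = 0; nil },      # nil: keep going
  proc {|files, _state| !files.include?('foo.rb') } # false: cancel
]
after = [proc {|_files, state| log << "seen #{state.seen}" }]

r1 = run_with_callbacks(['bar.rb'], before, after) {|fs, st| st.seen += fs.size }
r2 = run_with_callbacks(['foo.rb', 'bar.rb'], before, after) {|fs, st| st.seen += fs.size }

p r1  # => :parsed
p r2  # => :cancelled
p log # => ["seen 1"]
```

When a run is cancelled, the after callbacks are not invoked either, which matches the early `return` in the source above.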

.parse_string(content, ptype = parser_type) ⇒ Object

Parses a string of source code.

Parameters:

  • content (String)

    the block of code to parse

  • ptype (Symbol) (defaults to: parser_type)

    the parser type to use. See .parser_type.

Returns:

  • (Object)

    the parser object that was used to parse content

# File 'lib/yard/parser/source_parser.rb', line 123

def parse_string(content, ptype = parser_type)
  new(ptype).parse(StringIO.new(content))
end

.parser_type_for_extension(extension) ⇒ Symbol

Finds a parser type that is registered for the extension. If no type is found, the default Ruby type is returned.

Returns:

  • (Symbol)

    the parser type to be used for the extension

Since:

  • 0.5.6

# File 'lib/yard/parser/source_parser.rb', line 171

def parser_type_for_extension(extension)
  type = parser_type_extensions.find do |_t, exts|
    [exts].flatten.any? {|ext| ext === extension }
  end
  validated_parser_type(type ? type.first : :ruby)
end
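
Because the lookup uses `===`, a registered entry may be a single string, an array of strings, or a Regexp. A standalone sketch of that matching, using a hypothetical registry hash in place of `parser_type_extensions` (the entries are invented for illustration and the validation step is left out):

```ruby
def type_for_extension(extension, registry)
  # [exts].flatten lets a lone String or Regexp be treated like a list.
  found = registry.find do |_type, exts|
    [exts].flatten.any? {|ext| ext === extension }
  end
  found ? found.first : :ruby # unknown extensions fall back to :ruby
end

# Hypothetical registry: parser type => extension matcher(s).
registry = {
  :c       => ['c', 'cc', 'cxx', 'cpp'],
  :java    => 'java',
  :fortran => /\Af(90|95)?\z/
}

p type_for_extension('cpp', registry)  # => :c
p type_for_extension('java', registry) # => :java
p type_for_extension('f90', registry)  # => :fortran
p type_for_extension('txt', registry)  # => :ruby
```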

.register_parser_type(type, parser_klass, extensions = nil) ⇒ void

This method returns an undefined value.

Registers a new parser type.

Examples:

Registering a parser for "java" files

SourceParser.register_parser_type :java, JavaParser, 'java'

Parameters:

  • type (Symbol)

    a symbolic name for the parser type

  • parser_klass (Base)

    a class that implements parsing and tokenization

  • extensions (Array<String>, String, Regexp) (defaults to: nil)

    a list of extensions or a regex to match against the file extension

# File 'lib/yard/parser/source_parser.rb', line 146

def register_parser_type(type, parser_klass, extensions = nil)
  unless Base > parser_klass
    raise ArgumentError, "expecting parser_klass to be a subclass of YARD::Parser::Base"
  end
  parser_type_extensions[type.to_sym] = extensions if extensions
  parser_types[type.to_sym] = parser_klass
end
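
The guard `Base > parser_klass` holds only for proper subclasses, so passing the base class itself or an unrelated class raises ArgumentError. A standalone sketch of that check; `base`, `java_parser`, `unrelated` and `register` are hypothetical stand-ins, not YARD classes:

```ruby
base = Class.new              # stands in for YARD::Parser::Base
java_parser = Class.new(base) # a proper subclass
unrelated = Class.new

types = {}
extensions = {}

register = lambda do |type, klass, exts = nil|
  # Module#> is true only when klass descends from base; it is false
  # for base itself and nil for unrelated classes.
  raise ArgumentError, "expecting a subclass of base" unless base > klass
  extensions[type.to_sym] = exts if exts
  types[type.to_sym] = klass
end

register.call(:java, java_parser, 'java')

rejected = [base, unrelated].map do |klass|
  begin
    register.call(:bad, klass)
    false
  rescue ArgumentError
    true
  end
end

p types.keys # => [:java]
p rejected   # => [true, true]
```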

.tokenize(content, ptype = parser_type) ⇒ Array

Tokenizes but does not parse the block of code.

Parameters:

  • content (String)

    the block of code to tokenize

  • ptype (Symbol) (defaults to: parser_type)

    the parser type to use. See .parser_type.

Returns:

  • (Array)

    a list of tokens

# File 'lib/yard/parser/source_parser.rb', line 132

def tokenize(content, ptype = parser_type)
  new(ptype).tokenize(content)
end

.validated_parser_type(type) ⇒ Symbol

This method is for internal use only.

Returns the validated parser type. This ensures that the :ruby type is never set when the Ripper library is not available; :ruby18 is returned in its place.

Parameters:

  • type (Symbol)

    the parser type to set

Returns:

  • (Symbol)

    the validated parser type

# File 'lib/yard/parser/source_parser.rb', line 184

def validated_parser_type(type)
  !defined?(::Ripper) && type == :ruby ? :ruby18 : type
end

Instance Attribute Details

#contents ⇒ String (readonly)

Returns:

  • (String)

    the contents of the file to be parsed

Since:

  • 0.7.0

# File 'lib/yard/parser/source_parser.rb', line 399

attr_reader :contents

#file ⇒ String (rw)

Returns:

  • (String)

    the filename being parsed by the parser.

# File 'lib/yard/parser/source_parser.rb', line 386

attr_accessor :file

#globals ⇒ OpenStruct (readonly)

Returns:

  • (OpenStruct)

    an open struct containing arbitrary global state shared between files and handlers.

Since:

  • 0.7.0

# File 'lib/yard/parser/source_parser.rb', line 395

attr_reader :globals

#parser_type ⇒ Symbol (rw)

Returns:

  • (Symbol)

    the parser type associated with the parser instance. This should be set by the constructor.

# File 'lib/yard/parser/source_parser.rb', line 390

attr_reader :parser_type

#parser_type=(value) (rw, private)

# File 'lib/yard/parser/source_parser.rb', line 500

def parser_type=(value)
  @parser_type = self.class.validated_parser_type(value)
end

Instance Method Details

#convert_encoding(content) (private)

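Searches for an encoding line and forces the content to that encoding. If no encoding line is found, the content is checked for a byte order mark, and UTF-8 is assumed otherwise.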

Since:

  • 0.5.3

# File 'lib/yard/parser/source_parser.rb', line 471

def convert_encoding(content)
  return content unless content.respond_to?(:force_encoding)
  if content =~ ENCODING_LINE
    content.force_encoding($1)
  else
    content.force_encoding('binary')
    ENCODING_BYTE_ORDER_MARKS.each do |encoding, bom|
      bom.force_encoding('binary')
      if content.start_with?(bom)
        return content.sub(bom, '').force_encoding(encoding)
      end
    end
    content.force_encoding('utf-8') # UTF-8 is default encoding
    content
  end
end
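
A standalone sketch of the fallback branch above, which looks for a byte order mark when no encoding line is present. The constants `ENCODING_LINE` and `ENCODING_BYTE_ORDER_MARKS` are not shown on this page, so the sketch uses a hypothetical one-entry BOM table and skips the encoding-line check. Unlike the method above, it works on copies and leaves its arguments unmodified.

```ruby
# Hypothetical helper; boms maps an encoding name to its marker bytes.
def strip_bom(content, boms)
  content = content.dup.force_encoding('binary')
  boms.each do |encoding, bom|
    marker = bom.dup.force_encoding('binary')
    if content.start_with?(marker)
      # Drop the marker bytes, then tag the rest with the BOM's encoding.
      return content.sub(marker, '').force_encoding(encoding)
    end
  end
  content.force_encoding('utf-8') # UTF-8 is the default
end

boms = { 'utf-8' => "\xEF\xBB\xBF" } # assumed table, UTF-8 only
with_bom = "\xEF\xBB\xBFputs 1"
plain = "puts 2"

a = strip_bom(with_bom, boms)
b = strip_bom(plain, boms)

p a          # => "puts 1"
p a.encoding # => #<Encoding:UTF-8>
p b          # => "puts 2"
```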

#parse(content = __FILE__) ⇒ Object?

The main parser method. This should not be called directly. Instead, use the class methods .parse and .parse_string.

Parameters:

  • content (String, #read, Object) (defaults to: __FILE__)

    the source file to parse

Returns:

  • (Object, nil)

    the parser object used to parse the source, or nil if the file's checksum is unchanged since it was last parsed

# File 'lib/yard/parser/source_parser.rb', line 418

def parse(content = __FILE__)
  case content
  when String
    @file = File.cleanpath(content)
    content = convert_encoding(String.new(File.read_binary(file)))
    checksum = Registry.checksum_for(content)
    return if Registry.checksums[file] == checksum

    if Registry.checksums.key?(file)
      log.info "File '#{file}' was modified, re-processing..."
    end
    Registry.checksums[@file] = checksum
    self.parser_type = parser_type_for_filename(file)
  else
    content = content.read if content.respond_to? :read
  end

  @contents = content
  @parser = parser_class.new(content, file)

  self.class.before_parse_file_callbacks.each do |cb|
    return @parser if cb.call(self) == false
  end

  @parser.parse
  post_process

  self.class.after_parse_file_callbacks.each do |cb|
    cb.call(self)
  end

  @parser
rescue ArgumentError, NotImplementedError => e
  log.warn("Cannot parse `#{file}': #{e.message}")
  log.backtrace(e, :warn)
rescue ParserSyntaxError => e
  log.warn(e.message.capitalize)
  log.backtrace(e, :warn)
end
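
A standalone sketch of the checksum check at the top of the method above, which skips files whose contents have not changed. The `checksums` hash stands in for `Registry.checksums`, and the use of SHA-1 is an assumption made for the sketch, since `Registry.checksum_for` is not shown on this page.

```ruby
require 'digest/sha1'

# Hypothetical cache standing in for Registry.checksums.
checksums = {}

# Returns :skipped, :reparsed or :parsed for a (file, content) pair.
check = lambda do |file, content|
  sum = Digest::SHA1.hexdigest(content) # digest choice is an assumption
  next :skipped if checksums[file] == sum
  status = checksums.key?(file) ? :reparsed : :parsed
  checksums[file] = sum
  status
end

results = [
  check.call('lib/foo.rb', "class Foo; end"),
  check.call('lib/foo.rb', "class Foo; end"),
  check.call('lib/foo.rb', "class Foo; def x; end; end")
]

p results # => [:parsed, :skipped, :reparsed]
```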

#parser_class (private)

Since:

  • 0.5.6

# File 'lib/yard/parser/source_parser.rb', line 515

def parser_class
  klass = self.class.parser_types[parser_type]
  unless klass
    raise ArgumentError, "invalid parser type '#{parser_type}' or unrecognized file", caller[1..-1]
  end

  klass
end

#parser_type_for_filename(filename) ⇒ Symbol (private)

Guesses the parser type to use depending on the file extension.

Parameters:

  • filename (String)

    the filename to use to guess the parser type

Returns:

  • (Symbol)

    a parser type that matches the filename

# File 'lib/yard/parser/source_parser.rb', line 508

def parser_type_for_filename(filename)
  ext = (File.extname(filename)[1..-1] || "").downcase
  type = self.class.parser_type_for_extension(ext)
  parser_type == :ruby18 && type == :ruby ? :ruby18 : type
end
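
A standalone sketch of the extension extraction in the first line of the method above. `File.extname` returns an empty string for names without an extension, and slicing an empty string with `[1..-1]` yields `nil`, which is why the `|| ""` fallback is needed. `extension_of` is a hypothetical helper name.

```ruby
# Extract the lowercase extension without the leading dot.
def extension_of(filename)
  (File.extname(filename)[1..-1] || "").downcase
end

p extension_of('lib/Foo.RB')   # => "rb"
p extension_of('ext/parser.c') # => "c"
p extension_of('Rakefile')     # => ""
```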

#post_process ⇒ void (private)

This method returns an undefined value.

Runs a ::YARD::Handlers::Processor object to post process the parsed statements.

# File 'lib/yard/parser/source_parser.rb', line 490

def post_process
  return unless @parser.respond_to?(:enumerator)

  enumerator = @parser.enumerator
  if enumerator
    post = Handlers::Processor.new(self)
    post.process(enumerator)
  end
end

#tokenize(content) ⇒ Array

Tokenizes but does not parse the block of code using the current #parser_type.

Parameters:

  • content (String)

    the block of code to tokenize

Returns:

  • (Array)

    a list of tokens

# File 'lib/yard/parser/source_parser.rb', line 462

def tokenize(content)
  @parser = parser_class.new(content, file)
  @parser.tokenize
end