
Class: YARD::Parser::SourceParser

Relationships & Source Files
Inherits: Object
Defined in: lib/yard/parser/source_parser.rb

Overview

Responsible for parsing a source file into the namespace. Parsing also invokes handlers to process the parsed statements and generate any code objects that may be recognized.

Custom Parsers

SourceParser allows custom parsers to be registered and called when a certain file type is recognized. To register a parser and associate it with a set of file extensions, call .register_parser_type.

Constructor Details

.new(parser_type = SourceParser.parser_type, globals = nil) ⇒ SourceParser

Creates a new parser object for code parsing with a specific parser type.

Parameters:

  • parser_type (Symbol) (defaults to: SourceParser.parser_type)

    the parser type to use

  • globals (OpenStruct) (defaults to: nil)

    global state to be re-used across separate source files

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 406

def initialize(parser_type = SourceParser.parser_type, globals1 = nil, globals2 = nil)
  globals = [true, false].include?(globals1) ? globals2 : globals1
  @file = '(stdin)'
  @globals = globals || OpenStruct.new
  self.parser_type = parser_type
end
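
The first line of the constructor appears to tolerate a call shape in which a boolean precedes the globals object (this reading is inferred from the code alone; the source does not state the reason). A minimal sketch of that selection logic, using a hypothetical helper named `pick_globals` that is not part of YARD:

```ruby
require 'ostruct'

# Hypothetical helper mirroring the argument handling in #initialize above.
def pick_globals(globals1 = nil, globals2 = nil)
  # If the first slot holds a boolean, the globals are taken from the second slot.
  globals = [true, false].include?(globals1) ? globals2 : globals1
  # With no globals supplied, a fresh OpenStruct is created.
  globals || OpenStruct.new
end

shared = OpenStruct.new(:count => 1)
a = pick_globals(shared)         # globals in the first slot
b = pick_globals(false, shared)  # boolean first, globals second
c = pick_globals                 # nothing supplied
```

Here `a` and `b` are the same object as `shared`, while `c` is a new, empty OpenStruct.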

Class Attribute Details

.parser_type ⇒ Symbol (rw)

Returns:

  • (Symbol)

    the default parser type (defaults to :ruby)

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 86

attr_reader :parser_type

.parser_type=(value) (rw)

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 88

def parser_type=(value)
  @parser_type = validated_parser_type(value)
end

.parser_type_extensions ⇒ Hash (rw)

This method is for internal use only.

Returns:

  • (Hash)

    a list of registered parser type extensions

Since:

  • 0.5.6

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 163

def parser_type_extensions; @@parser_type_extensions ||= {} end

.parser_type_extensions=(value) (rw)

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 164

def parser_type_extensions=(value) @@parser_type_extensions = value end

.parser_types ⇒ Hash{Symbol=>Object} (rw)

This method is for internal use only.

Returns:

  • (Hash{Symbol=>Object})

    a list of registered parser types

Since:

  • 0.5.6

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 157

def parser_types; @@parser_types ||= {} end

.parser_types=(value) (rw)

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 158

def parser_types=(value) @@parser_types = value end

Class Method Details

.after_parse_file {|parser| ... } ⇒ Proc

Registers a callback to be called after an individual file is parsed. The block passed to this method will be called on subsequent parse calls.

To register a callback that is called after the entire list of files is processed, see .after_parse_list.

Examples:

Printing the length of each file after it is parsed

SourceParser.after_parse_file do |parser|
  puts "#{parser.file} is #{parser.contents.size} characters"
end
YARD.parse('lib/**/*.rb')
# prints:
#   lib/foo.rb is 1240 characters
#   lib/foo_bar.rb is 248 characters

Yields:

  • (parser)

    the yielded block is called once after each file that is parsed. This might happen many times for a single codebase.

Yield Parameters:

  • parser (SourceParser)

    the parser object that parsed the file.

Yield Returns:

  • (void)

    the return value for the block is ignored.

Returns:

  • (Proc)

    the yielded block

Since:

  • 0.7.0

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 324

def after_parse_file(&block)
  after_parse_file_callbacks << block
end

.after_parse_file_callbacks ⇒ Array<Proc>

Returns:

  • (Array<Proc>)

    the list of callbacks to be called after parsing a file. Should only be used for testing.

Since:

  • 0.7.0

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 352

def after_parse_file_callbacks
  @after_parse_file_callbacks ||= []
end

.after_parse_list {|files, globals| ... } ⇒ Proc

Registers a callback to be called after a list of files is parsed via .parse. The block passed to this method will be called on subsequent parse calls.

Examples:

Printing results after parsing occurs

SourceParser.after_parse_list do
  puts "Finished parsing!"
end
YARD.parse
# Prints "Finished parsing!" after parsing files

Yields:

  • (files, globals)

    the yielded block is called once after parsing all files

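Yield Parameters:

  • files (Array<String>)

    the list of files that were parsed.

  • globals (OpenStruct)

    an open struct containing arbitrary global state shared between files and handlers.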

Yield Returns:

  • (void)

    the return value for the block is ignored.

Returns:

  • (Proc)

    the yielded block

Since:

  • 0.7.0

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 258

def after_parse_list(&block)
  after_parse_list_callbacks << block
end

.after_parse_list_callbacks ⇒ Array<Proc>

Returns:

  • (Array<Proc>)

    the list of callbacks to be called after parsing a list of files. Should only be used for testing.

Since:

  • 0.7.0

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 338

def after_parse_list_callbacks
  @after_parse_list_callbacks ||= []
end

.before_parse_file {|parser| ... } ⇒ Proc

Registers a callback to be called before an individual file is parsed. The block passed to this method will be called on subsequent parse calls.

To register a callback that is called before the entire list of files is processed, see .before_parse_list.

Examples:

Installing a simple callback

SourceParser.before_parse_file do |parser|
  puts "I'm parsing #{parser.file}"
end
YARD.parse('lib/**/*.rb')
# prints:
#   I'm parsing lib/foo.rb
#   I'm parsing lib/foo_bar.rb
#   I'm parsing lib/last_file.rb

Cancel parsing of any test_*.rb files

SourceParser.before_parse_file do |parser|
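  # `next` returns a value from the block; `return` inside a proc would
  # typically raise LocalJumpError. The basename is tested so that paths
  # such as lib/test_foo.rb also match.
  next false if File.basename(parser.file) =~ /^test_.+\.rb$/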
end

Yields:

  • (parser)

    the yielded block is called once before each file that is parsed. This might happen many times for a single codebase.

Yield Parameters:

  • parser (SourceParser)

    the parser object that will #parse the file.

Yield Returns:

  • (Boolean)

    if the block returns false, parsing for the file is cancelled.

Returns:

  • (Proc)

    the yielded block

Since:

  • 0.7.0

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 295

def before_parse_file(&block)
  before_parse_file_callbacks << block
end

.before_parse_file_callbacks ⇒ Array<Proc>

Returns:

  • (Array<Proc>)

    the list of callbacks to be called before parsing a file. Should only be used for testing.

Since:

  • 0.7.0

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 345

def before_parse_file_callbacks
  @before_parse_file_callbacks ||= []
end

.before_parse_list {|files, globals| ... } ⇒ Proc

Registers a callback to be called before a list of files is parsed via .parse. The block passed to this method will be called on subsequent parse calls.

Examples:

Installing a simple callback

SourceParser.before_parse_list do |files, globals|
  puts "Starting to parse..."
end
YARD.parse('lib/**/*.rb')
# prints "Starting to parse..."

Setting global state

SourceParser.before_parse_list do |files, globals|
  globals.method_count = 0
end
SourceParser.after_parse_list do |files, globals|
  puts "Found #{globals.method_count} methods"
end
class MyCountHandler < Handlers::Ruby::Base
  handles :def, :defs
  process { globals.method_count += 1 }
end
YARD.parse
# Prints: "Found 37 methods"

Using a global callback to cancel parsing

SourceParser.before_parse_list do |files, globals|
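  # `next` returns a value from the block; `return` inside a proc would
  # typically raise LocalJumpError.
  next false if files.include?('foo.rb')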
end

YARD.parse(['foo.rb', 'bar.rb']) # callback cancels this parse
YARD.parse('bar.rb') # parses normally

Yields:

  • (files, globals)

    the yielded block is called once before parsing all files

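Yield Parameters:

  • files (Array<String>)

    the list of files that will be parsed.

  • globals (OpenStruct)

    an open struct containing arbitrary global state shared between files and handlers.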

Yield Returns:

  • (Boolean)

    if the block returns false, parsing is cancelled.

Returns:

  • (Proc)

    the yielded block

Since:

  • 0.7.0

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 234

def before_parse_list(&block)
  before_parse_list_callbacks << block
end

.before_parse_list_callbacks ⇒ Array<Proc>

Returns:

  • (Array<Proc>)

    the list of callbacks to be called before parsing a list of files. Should only be used for testing.

Since:

  • 0.7.0

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 331

def before_parse_list_callbacks
  @before_parse_list_callbacks ||= []
end

.parse(paths = DEFAULT_PATH_GLOB, excluded = [], level = log.level) ⇒ void

This method returns an undefined value.

Parses a path or set of paths.

Parameters:

  • paths (String, Array<String>) (defaults to: DEFAULT_PATH_GLOB)

    a path, glob, or list of paths to parse

  • excluded (Array<String, Regexp>) (defaults to: [])

    a list of excluded path matchers

  • level (Fixnum) (defaults to: log.level)

    the logger level to use during parsing. See ::YARD::Logger

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 100

def parse(paths = DEFAULT_PATH_GLOB, excluded = [], level = log.level)
  log.debug("Parsing #{paths.inspect} with `#{parser_type}` parser")
  excluded = excluded.map do |path|
    case path
    when Regexp; path
    else Regexp.new(path.to_s, Regexp::IGNORECASE)
    end
  end
  files = [paths].flatten.
    map {|p| File.directory?(p) ? "#{p}/**/*.{rb,c,cc,cxx,cpp}" : p }.
    map {|p| p.include?("*") ? Dir[p].sort_by {|d| [d.length, d] } : p }.flatten.
    reject {|p| !File.file?(p) || excluded.any? {|re| p =~ re } }

  log.enter_level(level) do
    parse_in_order(*files.uniq)
  end
end
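
The path handling in .parse can be tried in isolation. The sketch below copies the expansion steps into a hypothetical stand-alone method, `expand_paths` (not a YARD method), and runs it against a throwaway directory. It assumes nothing about YARD beyond the source shown above.

```ruby
require 'tmpdir'
require 'fileutils'

# Stand-alone copy of the expansion steps in .parse (no YARD required).
def expand_paths(paths, excluded = [])
  # Strings in the exclusion list are compiled into case-insensitive regexes.
  excluded = excluded.map do |path|
    path.is_a?(Regexp) ? path : Regexp.new(path.to_s, Regexp::IGNORECASE)
  end
  [paths].flatten.
    # A directory becomes a recursive glob over Ruby and C/C++ sources.
    map {|p| File.directory?(p) ? "#{p}/**/*.{rb,c,cc,cxx,cpp}" : p }.
    # Globs are expanded, shorter paths first, then alphabetically.
    map {|p| p.include?("*") ? Dir[p].sort_by {|d| [d.length, d] } : p }.flatten.
    # Anything that is not a regular file, or that matches an exclusion, is dropped.
    reject {|p| !File.file?(p) || excluded.any? {|re| p =~ re } }.
    uniq
end

result = Dir.mktmpdir do |dir|
  Dir.chdir(dir) do
    FileUtils.mkdir_p('lib/foo')
    %w(lib/foo.rb lib/foo/bar.rb lib/foo/Legacy.rb lib/notes.txt).each do |f|
      File.write(f, '')
    end
    expand_paths('lib', ['legacy'])
  end
end
# result => ["lib/foo.rb", "lib/foo/bar.rb"]
```

Because an excluded string is passed to Regexp.new, characters such as `.` act as regex metacharacters rather than literals; pass an escaped string or a Regexp when an exact match is needed.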

.parse_in_order(*files) ⇒ void (private)

This method returns an undefined value.

Parses a list of files in a queue.

Parameters:

  • files (Array<String>)

    a list of files to queue for parsing

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 364

def parse_in_order(*files)
  global_state = OpenStruct.new

  return if before_parse_list_callbacks.any? do |cb|
    cb.call(files, global_state) == false
  end

  OrderedParser.new(global_state, files).parse

  after_parse_list_callbacks.each do |cb|
    cb.call(files, global_state)
  end
end
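
The cancellation rule above is strict: only a callback that returns exactly `false` stops the run, so callbacks that return `nil` or any other value let parsing continue. A minimal sketch of that flow, with a hypothetical `run_list` method standing in for parse_in_order and a block standing in for the OrderedParser step:

```ruby
require 'ostruct'

# Hypothetical stand-in for the flow of parse_in_order; the real method
# hands the files to OrderedParser instead of yielding to a block.
def run_list(files, before_callbacks, after_callbacks)
  global_state = OpenStruct.new
  # Stops at the first callback whose return value is exactly false.
  return :cancelled if before_callbacks.any? {|cb| cb.call(files, global_state) == false }
  yield global_state
  after_callbacks.each {|cb| cb.call(files, global_state) }
  :parsed
end

log = []
before = [
  # Returns 0, which is not false, so parsing continues.
  proc {|_files, globals| globals.method_count = 0 },
  # Returns false only when foo.rb is in the list, nil otherwise.
  proc {|files, _globals| next false if files.include?('foo.rb') }
]
after = [proc {|_files, globals| log << "Found #{globals.method_count} methods" }]

first  = run_list(['foo.rb', 'bar.rb'], before, after) {|g| g.method_count += 2 }
second = run_list(['bar.rb'], before, after) {|g| g.method_count += 2 }
```

The first call is cancelled before any work is done. The second runs to completion and the after callback reads the state that the before callback initialised, leaving `log` as `["Found 2 methods"]`.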

.parse_string(content, ptype = parser_type) ⇒ Object

Parses a string of source code.

Parameters:

  • content (String)

    the block of code to parse

  • ptype (Symbol) (defaults to: parser_type)

    the parser type to use. See .parser_type.

Returns:

  • the parser object that was used to parse content

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 123

def parse_string(content, ptype = parser_type)
  new(ptype).parse(StringIO.new(content))
end

.parser_type_for_extension(extension) ⇒ Symbol

Finds a parser type that is registered for the extension. If no type is found, the default Ruby type is returned.

Returns:

  • (Symbol)

    the parser type to be used for the extension

Since:

  • 0.5.6

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 171

def parser_type_for_extension(extension)
  type = parser_type_extensions.find do |_t, exts|
    [exts].flatten.any? {|ext| ext === extension }
  end
  validated_parser_type(type ? type.first : :ruby)
end
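
Because the lookup compares with `===`, a registered String matches by equality and a registered Regexp matches by pattern, and a single type may carry either one value or a list. The sketch below reproduces that lookup against a hypothetical registry; the `:java`, `:c` and `:python` entries and the `type_for` name are illustrative only and are not YARD defaults.

```ruby
# Hypothetical registry contents, for illustration only.
EXTENSIONS = {
  :java   => 'java',                    # single String
  :c      => ['c', 'cc', 'cxx', 'cpp'], # list of Strings
  :python => /\Apyw?\z/                 # Regexp
}

def type_for(extension, registry = EXTENSIONS)
  type = registry.find do |_t, exts|
    # [exts].flatten lets a bare value and an array be treated alike.
    [exts].flatten.any? {|ext| ext === extension }
  end
  # Hash#find returns a [key, value] pair, or nil when nothing matched.
  type ? type.first : :ruby
end

java    = type_for('java')  # => :java
cpp     = type_for('cpp')   # => :c
pyw     = type_for('pyw')   # => :python
unknown = type_for('txt')   # => :ruby (fallback)
```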

.register_parser_type(type, parser_klass, extensions = nil) ⇒ void

This method returns an undefined value.

Registers a new parser type.

Examples:

Registering a parser for "java" files

SourceParser.register_parser_type :java, JavaParser, 'java'

Parameters:

  • type (Symbol)

    a symbolic name for the parser type

  • parser_klass (Base)

    a class that implements parsing and tokenization

  • extensions (Array<String>, String, Regexp) (defaults to: nil)

    a list of extensions or a regex to match against the file extension

See Also:

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 146

def register_parser_type(type, parser_klass, extensions = nil)
  unless Base > parser_klass
    raise ArgumentError, "expecting parser_klass to be a subclass of YARD::Parser::Base"
  end
  parser_type_extensions[type.to_sym] = extensions if extensions
  parser_types[type.to_sym] = parser_klass
end
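
The guard `Base > parser_klass` uses Ruby's Module#>, which is true only for a strict descendant. The sketch below uses stand-in classes (the local `Base`, `JavaParser` and `Unrelated` are hypothetical, not YARD classes) to show which arguments pass and which exception each rejected argument produces.

```ruby
class Base; end               # stand-in for YARD::Parser::Base
class JavaParser < Base; end  # hypothetical custom parser
class Unrelated; end

def check(klass)
  # true for a strict subclass, false for Base itself, nil for unrelated classes
  unless Base > klass
    raise ArgumentError, "expecting parser_klass to be a subclass of Base"
  end
  :ok
end

results = [JavaParser, Base, Unrelated, "JavaParser"].map do |candidate|
  begin
    check(candidate)
  rescue ArgumentError, TypeError => e
    e.class
  end
end
# results => [:ok, ArgumentError, ArgumentError, TypeError]
```

Two details follow from this: the base class itself is rejected, and a non-class argument (such as a class name given as a String) fails inside the comparison with TypeError before the ArgumentError is reached.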

.tokenize(content, ptype = parser_type) ⇒ Array

Tokenizes but does not parse the block of code.

Parameters:

  • content (String)

    the block of code to tokenize

  • ptype (Symbol) (defaults to: parser_type)

    the parser type to use. See .parser_type.

Returns:

  • (Array)

    a list of tokens

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 132

def tokenize(content, ptype = parser_type)
  new(ptype).tokenize(content)
end

.validated_parser_type(type) ⇒ Symbol

This method is for internal use only.

Returns the validated parser type. This ensures that the :ruby type is never set when the Ripper library is unavailable; :ruby18 is returned in its place.

Parameters:

  • type (Symbol)

    the parser type to set

Returns:

  • (Symbol)

    the validated parser type

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 184

def validated_parser_type(type)
  !defined?(::Ripper) && type == :ruby ? :ruby18 : type
end

Instance Attribute Details

#contents ⇒ String (readonly)

Returns:

  • (String)

    the contents of the file to be parsed

Since:

  • 0.7.0

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 399

attr_reader :contents

#file ⇒ String (rw)

Returns:

  • (String)

    the filename being parsed by the parser.

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 386

attr_accessor :file

#globals ⇒ OpenStruct (readonly)

Returns:

  • (OpenStruct)

    an open struct containing arbitrary global state shared between files and handlers.

Since:

  • 0.7.0

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 395

attr_reader :globals

#parser_type ⇒ Symbol (rw)

Returns:

  • (Symbol)

    the parser type associated with the parser instance. This should be set by the constructor.

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 390

attr_reader :parser_type

#parser_type=(value) (rw, private)

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 500

def parser_type=(value)
  @parser_type = self.class.validated_parser_type(value)
end

Instance Method Details

#convert_encoding(content) (private)

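Searches for an encoding line (magic comment) and forces the content to that encoding. If none is found, a byte order mark is looked for; otherwise UTF-8 is assumed.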

Since:

  • 0.5.3

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 471

def convert_encoding(content)
  return content unless content.respond_to?(:force_encoding)
  if content =~ ENCODING_LINE
    content.force_encoding($1)
  else
    content.force_encoding('binary')
    ENCODING_BYTE_ORDER_MARKS.each do |encoding, bom|
      bom.force_encoding('binary')
      if content.start_with?(bom)
        return content.sub(bom, '').force_encoding(encoding)
      end
    end
    content.force_encoding('utf-8') # UTF-8 is default encoding
    content
  end
end
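
The order of checks in #convert_encoding can be exercised without YARD. In the sketch below, `ENCODING_LINE` and `BOMS` are assumed, simplified stand-ins for the real ENCODING_LINE and ENCODING_BYTE_ORDER_MARKS constants, whose definitions are not shown on this page, and `detect_encoding` is a hypothetical name.

```ruby
# Assumed, simplified stand-ins for constants not shown above.
ENCODING_LINE = /\A\s*#.*coding\s*[:=]\s*([\w.-]+)/
BOMS = {
  'utf-8'    => "\xEF\xBB\xBF".b,
  'utf-16be' => "\xFE\xFF".b,
  'utf-16le' => "\xFF\xFE".b
}

def detect_encoding(content)
  # 1. A magic comment wins: relabel the bytes with the named encoding.
  return content.force_encoding($1) if content =~ ENCODING_LINE

  # 2. Otherwise look for a byte order mark and strip it.
  content.force_encoding('binary')
  BOMS.each do |encoding, bom|
    if content.start_with?(bom)
      return content.sub(bom, '').force_encoding(encoding)
    end
  end

  # 3. Fall back to UTF-8.
  content.force_encoding('utf-8')
end

# Inputs are binary strings, as they would be after a binary file read.
magic = detect_encoding("# encoding: iso-8859-1\nx = 1\n".b)
bom   = detect_encoding("\xEF\xBB\xBFputs 1\n".b)
plain = detect_encoding("puts 1\n".b)
```

Note that force_encoding only relabels the bytes; nothing is transcoded at any step.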

#parse(content = __FILE__) ⇒ Object?

The main parser method. This should not be called directly. Instead, use the class methods .parse and .parse_string.

Parameters:

  • content (String, #read, Object) (defaults to: __FILE__)

    the source file to parse

Returns:

  • (Object, nil)

    the parser object used to parse the source, or nil if the file's checksum shows it is unchanged since it was last parsed

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 418

def parse(content = __FILE__)
  case content
  when String
    @file = File.cleanpath(content)
    content = convert_encoding(String.new(File.read_binary(file)))
    checksum = Registry.checksum_for(content)
    return if Registry.checksums[file] == checksum

    if Registry.checksums.key?(file)
      log.info "File '#{file}' was modified, re-processing..."
    end
    Registry.checksums[@file] = checksum
    self.parser_type = parser_type_for_filename(file)
  else
    content = content.read if content.respond_to? :read
  end

  @contents = content
  @parser = parser_class.new(content, file)

  self.class.before_parse_file_callbacks.each do |cb|
    return @parser if cb.call(self) == false
  end

  @parser.parse
  post_process

  self.class.after_parse_file_callbacks.each do |cb|
    cb.call(self)
  end

  @parser
rescue ArgumentError, NotImplementedError => e
  log.warn("Cannot parse `#{file}': #{e.message}")
  log.backtrace(e, :warn)
rescue ParserSyntaxError => e
  log.warn(e.message.capitalize)
  log.backtrace(e, :warn)
end
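
The String branch of #parse skips files whose content has not changed since the last run. A minimal sketch of that bookkeeping follows; the `CHECKSUMS` hash and `parse_status` method are hypothetical stand-ins for Registry.checksums and the inline logic, and SHA-1 is an assumption made for the sketch, since the implementation of Registry.checksum_for is not shown here.

```ruby
require 'digest/sha1'

# Hypothetical stand-in for Registry.checksums.
CHECKSUMS = {}

def parse_status(file, content)
  # Assumption: the checksum is a SHA-1 hex digest of the file content.
  checksum = Digest::SHA1.hexdigest(content)
  return :unchanged if CHECKSUMS[file] == checksum  # nothing to do

  # A known file with a different checksum was modified; otherwise it is new.
  status = CHECKSUMS.key?(file) ? :modified : :new
  CHECKSUMS[file] = checksum
  status
end

first  = parse_status('lib/foo.rb', "class Foo; end")       # => :new
second = parse_status('lib/foo.rb', "class Foo; end")       # => :unchanged
third  = parse_status('lib/foo.rb', "class Foo; def x; end; end") # => :modified
```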

#parser_class (private)

Since:

  • 0.5.6

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 515

def parser_class
  klass = self.class.parser_types[parser_type]
  unless klass
    raise ArgumentError, "invalid parser type '#{parser_type}' or unrecognized file", caller[1..-1]
  end

  klass
end

#parser_type_for_filename(filename) ⇒ Symbol (private)

Guesses the parser type to use depending on the file extension.

Parameters:

  • filename (String)

    the filename to use to guess the parser type

Returns:

  • (Symbol)

    a parser type that matches the filename

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 508

def parser_type_for_filename(filename)
  ext = (File.extname(filename)[1..-1] || "").downcase
  type = self.class.parser_type_for_extension(ext)
  parser_type == :ruby18 && type == :ruby ? :ruby18 : type
end
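
Two details of the method above are easy to miss: a filename with no extension yields an empty string rather than nil, and a parser already set to :ruby18 is never switched back to :ruby by the lookup. The sketch below shows both, using a hypothetical `type_for_filename` method and a plain Hash (`lookup`) in place of the class-level registry.

```ruby
# Hypothetical extension table standing in for .parser_type_for_extension.
lookup = { 'c' => :c }

def type_for_filename(filename, current_type, lookup)
  # File.extname returns "" when there is no extension; ""[1..-1] is nil,
  # so the || "" guard keeps the downcase call safe.
  ext = (File.extname(filename)[1..-1] || "").downcase
  type = lookup.fetch(ext, :ruby)
  # A :ruby18 parser stays :ruby18 whenever the lookup resolves to :ruby.
  current_type == :ruby18 && type == :ruby ? :ruby18 : type
end

upper  = type_for_filename('lib/Foo.RB', :ruby, lookup)   # => :ruby
no_ext = type_for_filename('Rakefile', :ruby18, lookup)   # => :ruby18
c_file = type_for_filename('ext/foo.C', :ruby18, lookup)  # => :c
```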

#post_process ⇒ void (private)

This method returns an undefined value.

Runs a ::YARD::Handlers::Processor object to post process the parsed statements.

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 490

def post_process
  return unless @parser.respond_to?(:enumerator)

  enumerator = @parser.enumerator
  if enumerator
    post = Handlers::Processor.new(self)
    post.process(enumerator)
  end
end

#tokenize(content) ⇒ Array

Tokenizes but does not parse the block of code, using the current #parser_type.

Parameters:

  • content (String)

    the block of code to tokenize

Returns:

  • (Array)

    a list of tokens

[ GitHub ]

  
# File 'lib/yard/parser/source_parser.rb', line 462

def tokenize(content)
  @parser = parser_class.new(content, file)
  @parser.tokenize
end