Class: REXML::Source

Relationships & Source Files
Extension / Inclusion / Inheritance Descendants
Subclasses:
Super Chains via Extension / Inclusion / Inheritance
Instance Chain:
self, Encoding
Inherits: Object
Defined in: lib/rexml/source.rb

Overview

A Source wraps a buffer or other object, can be searched for patterns, and provides consumption of text.

Class Method Summary

Instance Attribute Summary

Encoding - Included

#encoding

ID —> Encoding name.

#encoding=

Instance Method Summary

Constructor Details

.new(arg, encoding = nil) ⇒ Source

Constructor.

Parameters:

  • arg

    must be a String, and should be a valid XML document

  • encoding (defaults to: nil)

    if non-null, sets the encoding of the source to this value, overriding all encoding detection

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 42

def initialize(arg, encoding=nil)
  @orig = @buffer = arg
  if encoding
    self.encoding = encoding
  else
    detect_encoding
  end
  @line = 0
end

Instance Attribute Details

#buffer (readonly)

The current buffer (what we're going to read next)

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 33

attr_reader :buffer

#empty? ⇒ Boolean (readonly)

Returns:

  • (Boolean)

    true if the Source is exhausted

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 107

def empty?
  @buffer == ""
end

#encoding (rw)

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 36

attr_reader :encoding

#encoding=(enc) (rw)

Inherited from Encoding. Overridden to support optimized en/decoding.

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 55

def encoding=(enc)
  return unless super
  encoding_updated
end

#line (readonly)

The line number of the last consumed text

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 35

attr_reader :line

Instance Method Details

#consume(pattern)

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 86

def consume( pattern )
  @buffer = $' if pattern.match( @buffer )
end
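The listing above drops everything up to and including the match by assigning the post-match string back to the buffer. The following is a minimal standalone sketch of that behavior; `consume_sketch` is a hypothetical helper, not part of REXML, and it uses `MatchData#post_match` in place of `$'`.

```ruby
# Hypothetical helper: returns the buffer that would remain after
# consuming `pattern`, mirroring the one-line method in the listing.
def consume_sketch(buffer, pattern)
  md = pattern.match(buffer)
  # On a match, keep only the text after it; otherwise leave the buffer alone.
  md ? md.post_match : buffer
end

puts consume_sketch("<?xml?><a/>", /\A<\?xml\?>/)
puts consume_sketch("junk<a/>tail", /<a\/>/)
```

Because the pattern is not anchored by the method itself, an unanchored pattern that matches mid-buffer also discards the text before the match, as the second call suggests.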

#current_line ⇒ Object

Returns:

  • the current line in the source

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 116

def current_line
  lines = @orig.split
  res = lines.grep @buffer[0..30]
  res = res[-1] if res.kind_of? Array
  lines.index( res ) if res
end

#detect_encoding (private)

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 124

def detect_encoding
  buffer_encoding = @buffer.encoding
  detected_encoding = "UTF-8"
  begin
    @buffer.force_encoding("ASCII-8BIT")
    if @buffer[0, 2] == "\xfe\xff"
      @buffer[0, 2] = ""
      detected_encoding = "UTF-16BE"
    elsif @buffer[0, 2] == "\xff\xfe"
      @buffer[0, 2] = ""
      detected_encoding = "UTF-16LE"
    elsif @buffer[0, 3] == "\xef\xbb\xbf"
      @buffer[0, 3] = ""
      detected_encoding = "UTF-8"
    end
  ensure
    @buffer.force_encoding(buffer_encoding)
  end
  self.encoding = detected_encoding
end
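The byte-order-mark checks in the listing above can be tried in isolation. This is a sketch only: `sniff_bom` is a hypothetical helper, not a REXML method, and unlike the listing it works on a binary copy instead of modifying the buffer in place.

```ruby
# Hypothetical helper mirroring the BOM checks in the listing.
# Returns [encoding_name, bytes_without_bom].
def sniff_bom(input)
  bytes = input.b # binary copy; the caller's string is left untouched
  if bytes.start_with?("\xFE\xFF".b)
    ["UTF-16BE", bytes.byteslice(2, bytes.bytesize)]
  elsif bytes.start_with?("\xFF\xFE".b)
    ["UTF-16LE", bytes.byteslice(2, bytes.bytesize)]
  elsif bytes.start_with?("\xEF\xBB\xBF".b)
    ["UTF-8", bytes.byteslice(3, bytes.bytesize)]
  else
    ["UTF-8", bytes] # no BOM: default to UTF-8, as the listing does
  end
end

name, rest = sniff_bom("\xEF\xBB\xBF<a/>")
puts name
puts rest
```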

#encoding_updated (private)

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 145

def encoding_updated
  if @encoding != 'UTF-8'
    @buffer = decode(@buffer)
    @to_utf = true
  else
    @to_utf = false
    @buffer.force_encoding ::Encoding::UTF_8
  end
end
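The two branches above either transcode the buffer to UTF-8 or simply relabel it as UTF-8. The `decode` method is not shown in this section, so the sketch below assumes it amounts to a transcoding step and stands in `String#encode` for it; `to_utf8` is a hypothetical name.

```ruby
# Hypothetical helper: bring raw bytes into UTF-8 given a detected encoding name.
# Assumption: `decode` in the listing behaves like String#encode(dst, src).
def to_utf8(bytes, encoding_name)
  if encoding_name == "UTF-8"
    # Already UTF-8 bytes: only the encoding label changes.
    bytes.dup.force_encoding(Encoding::UTF_8)
  else
    # Interpret the bytes as `encoding_name` and transcode them.
    bytes.encode(Encoding::UTF_8, encoding_name)
  end
end

puts to_utf8("\x00<\x00a".b, "UTF-16BE")
```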

#match(pattern, cons = false)

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 100

def match(pattern, cons=false)
  md = pattern.match(@buffer)
  @buffer = $' if cons and md
  return md
end

#match_to(char, pattern)

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 90

def match_to( char, pattern )
  return pattern.match(@buffer)
end

#match_to_consume(char, pattern)

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 94

def match_to_consume( char, pattern )
  md = pattern.match(@buffer)
  @buffer = $'
  return md
end
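Unlike #match, the listing above assigns `$'` to the buffer unconditionally, so a failed match leaves the buffer as nil. The sketch below reproduces just that assignment outside the class; `match_to_consume_sketch` is a hypothetical helper that returns the would-be buffer instead of storing it.

```ruby
# Hypothetical helper: returns [match_data, new_buffer] the way the listed
# method would compute them. `$'` is nil when the last match failed.
def match_to_consume_sketch(buffer, pattern)
  md = pattern.match(buffer)
  [md, $']
end

md, rest = match_to_consume_sketch("abc", /a/)
puts rest
md, rest = match_to_consume_sketch("abc", /x/)
puts rest.inspect
```

This is presumably why #scan guards against a nil buffer before searching.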

#position

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 111

def position
  @orig.index( @buffer )
end
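The listing above locates the remaining buffer inside the original string with `String#index`, which returns the first occurrence. A small sketch of what that implies when the remaining text also appears earlier in the input; both helper names are hypothetical, and the length-based variant is an alternative for comparison, not what REXML does. It assumes the buffer is always a suffix of the original string.

```ruby
# Hypothetical helper: the approach in the listing.
def position_by_index(orig, buffer)
  orig.index(buffer)
end

# Hypothetical alternative: valid only if `buffer` is a suffix of `orig`.
def position_by_length(orig, buffer)
  orig.size - buffer.size
end

orig = "abab"
buffer = orig.sub(/\Aab/, "") # "ab" remains after consuming the first "ab"
puts position_by_index(orig, buffer)
puts position_by_length(orig, buffer)
```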

#read

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 83

def read
end

#scan(pattern, cons = false) ⇒ Object

Scans the source for a given pattern. Note that this is not your usual scan() method. For one thing, the pattern argument has some requirements; for another, the source can be consumed, so this method is easy to misuse. Originally, the patterns were easier to construct and this method was more robust, because it generated search regexps on the fly; however, this was computationally expensive and slowed down the entire ::REXML package considerably, since this is by far the most commonly called method.

Parameters:

  • pattern

    must be a Regexp, and must be in the form of /^\s*(#{your pattern, with no groups})(.*)/. The first group will be returned; the second group is used if the consume flag is set.

  • consume (cons in the signature)

    if true, the pattern returned will be consumed, leaving everything after it in the Source.

Returns:

  • the pattern, if found, or nil if the Source is empty or the pattern is not found.

[ GitHub ]

  
# File 'lib/rexml/source.rb', line 76

def scan(pattern, cons=false)
  return nil if @buffer.nil?
  rv = @buffer.scan(pattern)
  @buffer = $' if cons and rv.size>0
  rv
end
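As the listing shows, this method delegates to `String#scan`, so with a grouped pattern the return value is an array of group arrays, and `$'` afterwards holds the text following the last match. The following sketch runs the same two steps on a plain string; `scan_sketch` and the sample pattern are illustrative assumptions, not REXML code.

```ruby
# Hypothetical helper: returns [scan_result, remaining_buffer] the way the
# listed method would compute them when the consume flag is set.
def scan_sketch(buffer, pattern)
  rv = buffer.scan(pattern)
  # `$'` is the post-match of the last successful match made by String#scan.
  rest = rv.size > 0 ? $' : buffer
  [rv, rest]
end

# Pattern in the documented shape: leading whitespace, one group, then the rest of the line.
rv, rest = scan_sketch("  <a> rest\nnext", /^\s*(<\w+>)(.*)/)
p rv
p rest
```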