Class: REXML::Source

Relationships & Source Files
Extension / Inclusion / Inheritance Descendants
Subclasses:
Super Chains via Extension / Inclusion / Inheritance
Instance Chain:
self, Encoding
Inherits: Object
Defined in: lib/rexml/source.rb

Overview

A Source wraps a buffer or other object, can be searched for patterns, and provides consumption of text.

Class Method Summary

Instance Attribute Summary

Encoding - Included

#encoding

ID —> Encoding name.

#encoding=

Instance Method Summary

Constructor Details

.new(arg, encoding = nil) ⇒ Source

Constructor.

Parameters:

  • arg

    must be a String, and should be a valid XML document

  • encoding (defaults to: nil)

    if non-null, sets the encoding of the source to this value, overriding all encoding detection

# File 'lib/rexml/source.rb', line 43

def initialize(arg, encoding=nil)
  @orig = @buffer = arg
  if encoding
    self.encoding = encoding
  else
    detect_encoding
  end
  @line = 0
end

Instance Attribute Details

#buffer (readonly)

The current buffer (what we’re going to read next)

# File 'lib/rexml/source.rb', line 34

attr_reader :buffer

#empty? ⇒ Boolean (readonly)

Returns:

  • (Boolean)

    true if the Source is exhausted

# File 'lib/rexml/source.rb', line 108

def empty?
  @buffer == ""
end

#encoding (rw)

# File 'lib/rexml/source.rb', line 37

attr_reader :encoding

#encoding=(enc) (rw)

Inherited from Encoding. Overridden to support optimized en/decoding.

# File 'lib/rexml/source.rb', line 56

def encoding=(enc)
  return unless super
  encoding_updated
end

#line (readonly)

The line number of the last consumed text

# File 'lib/rexml/source.rb', line 36

attr_reader :line

Instance Method Details

#consume(pattern)

# File 'lib/rexml/source.rb', line 87

def consume( pattern )
  @buffer = $' if pattern.match( @buffer )
end
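
As a rough illustration of the method above, the following sketch applies the same idea to a plain String: when the pattern matches, the buffer is replaced by the text after the match, and otherwise it is left alone. The helper name consume_from is hypothetical and not part of REXML; MatchData#post_match is used in place of $', which holds the same value after a successful match.

```ruby
# Stand-in for the consume logic using a plain String (not the REXML class).
def consume_from(buffer, pattern)
  md = pattern.match(buffer)
  # On a match, keep only the text after it; otherwise return the buffer as-is.
  md ? md.post_match : buffer
end

buffer = "<?xml version='1.0'?><root/>"
buffer = consume_from(buffer, /\A<\?xml.*?\?>/)
puts buffer            # prints "<root/>"

unchanged = consume_from(buffer, /<missing>/)
puts unchanged         # prints "<root/>"
```

Because the pattern may match anywhere in the buffer, any text before the match is discarded along with the match itself unless the pattern is anchored.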

#current_line ⇒ Object

Returns:

  • the current line in the source

# File 'lib/rexml/source.rb', line 117

def current_line
  lines = @orig.split
  res = lines.grep @buffer[0..30]
  res = res[-1] if res.kind_of? Array
  lines.index( res ) if res
end
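
The steps in the method above can be traced with plain values. This is only a sketch with made-up sample text; the local variables stand in for @orig and @buffer. Note that String#split with no argument splits on any whitespace, so the "lines" here are whitespace-separated tokens, and Array#grep with a String argument keeps only elements equal to that String.

```ruby
orig   = "<a>\n<b/>\n</a>"     # stands in for @orig
buffer = "</a>"                # stands in for @buffer (the unread remainder)

lines = orig.split             # splits on whitespace
p lines                        # prints ["<a>", "<b/>", "</a>"]

res = lines.grep(buffer[0..30])   # tokens equal to the first 31 chars of buffer
res = res[-1]                     # last such token, or nil if there were none
index = res ? lines.index(res) : nil
p index                        # prints 2
```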

#detect_encoding (private)

# File 'lib/rexml/source.rb', line 125

def detect_encoding
  buffer_encoding = @buffer.encoding
  detected_encoding = "UTF-8"
  begin
    @buffer.force_encoding("ASCII-8BIT")
    if @buffer[0, 2] == "\xfe\xff"
      @buffer[0, 2] = ""
      detected_encoding = "UTF-16BE"
    elsif @buffer[0, 2] == "\xff\xfe"
      @buffer[0, 2] = ""
      detected_encoding = "UTF-16LE"
    elsif @buffer[0, 3] == "\xef\xbb\xbf"
      @buffer[0, 3] = ""
      detected_encoding = "UTF-8"
    end
  ensure
    @buffer.force_encoding(buffer_encoding)
  end
  self.encoding = detected_encoding
end
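
The byte-order-mark checks above can be sketched as a standalone function. The name sniff_bom is hypothetical and not part of REXML, and unlike the method above it returns a stripped copy instead of modifying the buffer in place.

```ruby
# Hypothetical helper mirroring the BOM checks in detect_encoding above.
# Returns [encoding_name, bytes_without_bom]; it does not modify its argument.
def sniff_bom(data)
  bytes = data.b   # binary (ASCII-8BIT) copy, so comparisons are byte-wise
  if bytes.start_with?("\xFE\xFF".b)
    ["UTF-16BE", bytes[2..]]
  elsif bytes.start_with?("\xFF\xFE".b)
    ["UTF-16LE", bytes[2..]]
  elsif bytes.start_with?("\xEF\xBB\xBF".b)
    ["UTF-8", bytes[3..]]
  else
    ["UTF-8", bytes]   # no BOM: fall back to UTF-8, as the method above does
  end
end

name, rest = sniff_bom("\xEF\xBB\xBF<a/>")
puts name          # prints "UTF-8"
puts rest          # prints "<a/>"
```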

#encoding_updated (private)

# File 'lib/rexml/source.rb', line 146

def encoding_updated
  if @encoding != 'UTF-8'
    @buffer = decode(@buffer)
    @to_utf = true
  else
    @to_utf = false
    @buffer.force_encoding ::Encoding::UTF_8
  end
end
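
The two branches above treat the buffer differently: a non-UTF-8 buffer is passed through decode, while a UTF-8 buffer is only relabelled. The sketch below shows that difference with plain Strings. It assumes that decode (which comes from the included Encoding module and is not shown on this page) transcodes the text to UTF-8; that is an assumption, not something this page confirms.

```ruby
# Branch 1 (assumed behaviour of decode): transcode the bytes to UTF-8.
latin1 = "caf\xE9".b.force_encoding("ISO-8859-1")
utf8   = latin1.encode("UTF-8")
puts utf8                  # prints "café"
puts utf8.bytesize         # prints 5 (the é now takes two bytes)

# Branch 2: the bytes are already UTF-8, so only the encoding label changes.
raw = "caf\xC3\xA9".b      # binary-tagged, but the bytes are valid UTF-8
raw.force_encoding(Encoding::UTF_8)
puts raw == utf8           # prints true
```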

#match(pattern, cons = false)

# File 'lib/rexml/source.rb', line 101

def match(pattern, cons=false)
  md = pattern.match(@buffer)
  @buffer = $' if cons and md
  return md
end
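
The effect of the cons flag can be seen in a small stand-in. MiniSource below is a hypothetical class that copies only the match logic shown above; it is not the REXML class and omits encoding handling entirely.

```ruby
# MiniSource is a hypothetical stand-in that copies only the match logic
# shown above; it is not the REXML class.
class MiniSource
  attr_reader :buffer

  def initialize(text)
    @buffer = text
  end

  def match(pattern, cons = false)
    md = pattern.match(@buffer)
    @buffer = md.post_match if cons && md   # same effect as `@buffer = $'`
    md
  end
end

src = MiniSource.new("<root><child/></root>")

peek = src.match(/\A<(\w+)>/)          # cons defaults to false: look only
puts peek[1]                           # prints "root"
puts src.buffer                        # prints "<root><child/></root>"

src.match(/\A<(\w+)>/, true)           # cons = true: advance past the match
puts src.buffer                        # prints "<child/></root>"
```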

#match_to(char, pattern)

# File 'lib/rexml/source.rb', line 91

def match_to( char, pattern )
  return pattern.match(@buffer)
end

#match_to_consume(char, pattern)

# File 'lib/rexml/source.rb', line 95

def match_to_consume( char, pattern )
  md = pattern.match(@buffer)
  @buffer = $'
  return md
end
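
In the code shown, the char argument is not used, and the buffer is assigned from $' whether or not the pattern matched. The plain-Ruby sketch below, with made-up sample text, shows what that assignment yields after a failed match.

```ruby
# Plain-Ruby illustration of the unconditional assignment shown above.
buffer = "<a/>"

md = /<b>/.match(buffer)   # no match, so md is nil and $' is nil
buffer = $'
p md                       # prints nil
p buffer                   # prints nil
```

A caller following this pattern would therefore want to be sure the pattern matches before relying on the buffer afterwards.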

#position

# File 'lib/rexml/source.rb', line 112

def position
  @orig.index( @buffer )
end
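
The lookup above can be illustrated with plain Strings standing in for @orig and @buffer (the sample text is made up). String#index returns the offset of the first occurrence, so the result can differ from the true read offset when the remaining text also appears earlier in the original.

```ruby
first_orig   = "<a><b/></a>"
first_buffer = first_orig.sub(/\A<a>/, "")   # pretend "<a>" has been consumed
puts first_orig.index(first_buffer)          # prints 3

# Caveat: index reports the first occurrence of the remaining text.
second_orig   = "abab"
second_buffer = "ab"                         # the second "ab" is what remains
puts second_orig.index(second_buffer)        # prints 0, not 2
```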

#read

# File 'lib/rexml/source.rb', line 84

def read
end

#scan(pattern, cons = false) ⇒ Object

Scans the source for a given pattern. Note that this is not your usual scan() method. For one thing, the pattern argument has some requirements; for another, the source can be consumed. This method is easy to misuse. Originally, the patterns were easier to construct and this method was more robust, because it generated search regexps on the fly; however, this was computationally expensive and slowed down the entire ::REXML package considerably, since this is by far the most commonly called method.

Parameters:

  • pattern

    must be a Regexp, and must be in the form of /^\s*(#{your pattern, with no groups})(.*)/. The first group will be returned; the second group is used if the consume flag is set.

  • consume

    if true, the pattern returned will be consumed, leaving everything after it in the Source.

Returns:

  • the pattern, if found, or nil if the Source is empty or the pattern is not found.

# File 'lib/rexml/source.rb', line 77

def scan(pattern, cons=false)
  return nil if @buffer.nil?
  rv = @buffer.scan(pattern)
  @buffer = $' if cons and rv.size>0
  rv
end
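
The body above delegates to String#scan, so its behaviour can be tried on a plain String. In this sketch the sample text and pattern are made up, and Regexp.last_match.post_match is used in place of $' (the two hold the same value here). As written, the code returns the array produced by String#scan, so an unmatched pattern yields an empty array; nil is returned only when the buffer itself is nil.

```ruby
buffer  = "  <root>tail"
pattern = /^\s*(<\w+>)/        # one group, leading whitespace skipped

rv = buffer.scan(pattern)      # String#scan with groups returns an array of arrays
p rv                           # prints [["<root>"]]

# With cons = true and a non-empty result, the buffer becomes the post-match text.
buffer = Regexp.last_match.post_match if rv.size > 0
p buffer                       # prints "tail"

p "no tags here".scan(pattern) # prints [] (an empty array, not nil)
```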