Class: REXML::Document
| Relationships & Source Files | |
| Inherits: | REXML::Element |
| Defined in: | lib/rexml/document.rb |
Overview
Represents an XML document.
A document may have:
- A single child that may be accessed via method #root.
- An XML declaration.
- A document type.
- Processing instructions.
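As a minimal sketch of the parts listed above, the following parses a small document and reads each part back. The XML content and the variable names are made up for illustration; only the REXML methods come from the library.

```ruby
require 'rexml/document'

# Illustrative input: declaration, doctype, one processing instruction, one root.
xml = '<?xml version="1.0" encoding="UTF-8"?>' \
      '<!DOCTYPE root>' \
      '<?xml-stylesheet type="text/css" href="style.css"?>' \
      '<root><child/></root>'

doc = REXML::Document.new(xml)

root_name    = doc.root.name                     # the single root element
decl_version = doc.xml_decl.version              # the XML declaration
doctype_name = doc.doctype.name                  # the document type
pi_targets   = doc.instructions.map(&:target)    # processing instructions
```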
In a Hurry?
If you’re somewhat familiar with XML and have a particular task in mind, you may want to see the tasks pages, and in particular, the tasks page for documents.
Constant Summary
-
DECLARATION =
# File 'lib/rexml/document.rb', line 40
XMLDecl.default
XMLTokens - Included
NAME, NAMECHAR, NAME_CHAR, NAME_START_CHAR, NAME_STR, NCNAME_STR, NMTOKEN, NMTOKENS, REFERENCE
Namespace - Included
NAMESPLIT, NAME_WITHOUT_NAMESPACE
Element - Inherited
Class Attribute Summary
-
.entity_expansion_limit
rw
Get the entity expansion limit.
-
.entity_expansion_limit=(val)
rw
Set the entity expansion limit.
-
.entity_expansion_text_limit
rw
Get the entity expansion text limit.
-
.entity_expansion_text_limit=(val)
rw
Set the entity expansion text limit.
Class Method Summary
-
.new(string = nil, context = {}) ⇒ Document
constructor
Returns a new REXML::Document object.
- .parse_stream(source, listener)
Element - Inherited
| .new | Returns a new REXML::Element object. |
Parent - Inherited
| .new | Constructor. |
Child - Inherited
| .new | Constructor. |
Instance Attribute Summary
- #entity_expansion_count readonly
- #entity_expansion_limit=(value) writeonly
- #entity_expansion_text_limit rw
-
#stand_alone? ⇒ Boolean
readonly
Returns the XMLDecl standalone value of the document as a string, if it has been set, otherwise the default standalone value.
- #namespaces_cache rw private
Element - Inherited
| #attributes | Mechanisms for accessing attributes and child elements of this element. |
| #context | The context holds information about the processing environment, such as whitespace handling. |
| #elements | Mechanisms for accessing attributes and child elements of this element. |
| #has_attributes? | Returns |
| #has_elements? | Returns |
| #has_text? | Returns |
Namespace - Included
| #expanded_name | The name of the object, valid if set. |
| #local_name | Alias for Namespace#name. |
| #name | The name of the object, valid if set. |
| #name= | Sets the name and the expanded name. |
| #prefix | The expanded name of the object, valid if name is set. |
Parent - Inherited
Child - Inherited
| #next_sibling | Alias for Node#next_sibling_node. |
| #next_sibling= | Sets the next sibling of this child. |
| #parent | The Parent of this object. |
| #parent= | Sets the parent of this child to the supplied argument. |
| #previous_sibling | Alias for Node#previous_sibling_node. |
| #previous_sibling= | Sets the previous sibling of this child. |
Node - Included
Instance Method Summary
-
#<<(child)
Alias for #add.
-
#add(xml_decl) ⇒ self
(also: #<<)
Adds an object to the document; returns self.
-
#add_element(name_or_element = nil, attributes = nil) ⇒ new_element
Adds an element to the document by calling REXML::Element.add_element.
-
#clone ⇒ Document
Returns the new document resulting from executing Document.new(self).
-
#doctype ⇒ doc_type?
Returns the DocType object for the document, if it exists, otherwise nil.
- #document
-
#encoding ⇒ encoding_string
Returns the XMLDecl encoding of the document, if it has been set, otherwise the default encoding.
-
#expanded_name ⇒ empty_string
(also: #name)
Returns an empty string.
-
#name
Alias for #expanded_name.
-
#node_type ⇒ Document
Returns the symbol :document.
- #record_entity_expansion
-
#root ⇒ root_element?
Returns the root element of the document, if it exists, otherwise nil.
-
#version ⇒ version_string
Returns the XMLDecl version of this document as a string, if it has been set, otherwise the default version.
-
#write(output = $stdout, indent = -1, transitive = false, ie_hack = false, encoding = nil)
Write the XML tree out, optionally with indent.
- #xml_decl ⇒ xml_decl
- #build(source) private
-
#enable_cache
private
New document level cache is created and available in this block.
Element - Inherited
| #[] | With integer argument |
| #add_attribute | Adds an attribute to this element, overwriting any existing attribute by the same name. |
| #add_attributes | Adds zero or more attributes to the element; returns the argument. |
| #add_element | Adds a child element, optionally setting attributes on the added element; returns the added element. |
| #add_namespace | Adds a namespace to the element; returns |
| #add_text | Adds text to the element. |
| #attribute | Returns the string value for the given attribute name. |
| #cdatas | Returns a frozen array of the |
| #clone | Returns a shallow copy of the element, containing the name and attributes, but not the parent or children: |
| #comments | Returns a frozen array of the |
| #delete_attribute | Removes a named attribute if it exists; returns the removed attribute if found, otherwise |
| #delete_element | Deletes a child element. |
| #delete_namespace | Removes a namespace from the element. |
| #document | If the element is part of a document, returns that document: |
| #each_element | Calls the given block with each child element: |
| #each_element_with_attribute | Calls the given block with each child element that meets given criteria. |
| #each_element_with_text | Calls the given block with each child element that meets given criteria. |
| #get_elements | Returns an array of the elements that match the given |
| #get_text | Returns the first text node child in a specified element, if it exists, |
| #ignore_whitespace_nodes | Returns |
| #inspect | Returns a string representation of the element. |
| #instructions | Returns a frozen array of the |
| #namespace | Returns the string namespace URI for the element, possibly deriving from one of its ancestors. |
| #namespaces | Returns a hash of all defined namespaces in the element and its ancestors: |
| #next_element | Returns the next sibling that is an element if it exists, |
| #node_type | Returns symbol |
| #prefixes | Returns an array of the string prefixes (names) of all defined namespaces in the element and its ancestors: |
| #previous_element | Returns the previous sibling that is an element if it exists, |
| #raw | Returns |
| #root | Returns the most distant element (not document) ancestor of the element: |
| #root_node | Returns the most distant ancestor of |
| #text | Returns the text string from the first text node child in a specified element, if it exists, |
| #text= | Adds, replaces, or removes the first text node child in the element. |
| #texts | Returns a frozen array of the |
| #whitespace | Returns |
| #write | DEPRECATED See |
| #xpath | Returns the string xpath to the element relative to the most distant parent: |
| #__to_xpath_helper, #calculate_namespaces, | |
| #each_with_something | A private helper method. |
Namespace - Included
| #fully_expanded_name | Fully expand the name, even if the prefix wasn’t specified in the source file. |
| #has_name? | Compares names optionally WITH namespaces. |
Parent - Inherited
| #<< | Alias for Parent#push. |
| #[] | Fetches a child at a given index. |
| #[]= | Set an index entry. |
| #add, | |
| #children | Alias for Parent#to_a. |
| #deep_clone | Deeply clones this object. |
| #delete, #delete_at, #delete_if, #each, | |
| #each_child | Alias for Parent#each. |
| #each_index, | |
| #index | Fetches the index of a given child of this parent. |
| #insert_after | Inserts a child after another child: child2 will be inserted after child1 in the child list of the parent. |
| #insert_before | Inserts a child before another child: child2 will be inserted before child1 in the child list of the parent. |
| #length | Alias for Parent#size. |
| #push | Alias for Parent#add. |
| #replace_child | Replaces one child with another, making sure the nodelist is correct |
| #size, #to_a, #unshift | |
Child - Inherited
| #bytes | This doesn’t yet handle encodings. |
| #document |
|
| #remove | Removes this child from the parent. |
| #replace_with | Replaces this object with another object. |
Node - Included
| #each_recursive | Visit all subnodes of |
| #find_first_recursive | Find (and return) first subnode (recursively) for which the block evaluates to true. |
| #indent, | |
| #index_in_parent | Returns the position that |
| #next_sibling_node, #previous_sibling_node, | |
| #to_s |
|
Constructor Details
.new(string = nil, context = {}) ⇒ Document
.new(io_stream = nil, context = {}) ⇒ Document
.new(document = nil, context = {}) ⇒ Document
Returns a new REXML::Document object.
When no arguments are given, returns an empty document:
d = REXML::Document.new
d.to_s # => ""
When argument string is given, it must be a string containing a valid XML document:
xml_string = '<root><foo>Foo</foo><bar>Bar</bar></root>'
d = REXML::Document.new(xml_string)
d.to_s # => "<root><foo>Foo</foo><bar>Bar</bar></root>"
When argument io_stream is given, it must be an IO object that is opened for reading, and when read must return a valid XML document:
File.write('t.xml', xml_string)
d = File.open('t.xml', 'r') do |io|
REXML::Document.new(io)
end
d.to_s # => "<root><foo>Foo</foo><bar>Bar</bar></root>"
When argument document is given, it must be an existing document object, whose context and attributes (but not children) are cloned into the new document:
d = REXML::Document.new(xml_string)
d.children # => [<root> ... </>]
d.context = {raw: :all, compress_whitespace: :all}
d.add_attributes({'bar' => 0, 'baz' => 1})
d1 = REXML::Document.new(d)
d1.children # => []
d1.context # => {:raw=>:all, :compress_whitespace=>:all}
d1.attributes # => {"bar"=>bar='0', "baz"=>baz='1'}
When argument context is given, it must be a hash containing context entries for the document; see Element Context:
context = {raw: :all, compress_whitespace: :all}
d = REXML::Document.new(xml_string, context)
d.context # => {:raw=>:all, :compress_whitespace=>:all}
# File 'lib/rexml/document.rb', line 92
def initialize( source = nil, context = {} )
  @entity_expansion_count = 0
  @entity_expansion_limit = Security.entity_expansion_limit
  @entity_expansion_text_limit = Security.entity_expansion_text_limit
  super()
  @context = context
  # `source = ""` is an invalid usage because no root element XML is an invalid XML.
  # But we accept `""` for backward compatibility.
  return if source.nil? or source == ""
  if source.kind_of? Document
    @context = source.context
    super source
  else
    build( source )
  end
end
Class Attribute Details
.entity_expansion_limit (rw)
Get the entity expansion limit. By default the limit is set to 10000.
Deprecated. Use Security.entity_expansion_limit instead.
# File 'lib/rexml/document.rb', line 419
def Document::entity_expansion_limit
  Security.entity_expansion_limit
end
.entity_expansion_limit=(val) (rw)
Set the entity expansion limit. By default the limit is set to 10000.
Deprecated. Use Security.entity_expansion_limit= instead.
# File 'lib/rexml/document.rb', line 412
def Document::entity_expansion_limit=( val )
  Security.entity_expansion_limit = val
end
.entity_expansion_text_limit (rw)
Get the entity expansion text limit. By default the limit is set to 10240.
Deprecated. Use Security.entity_expansion_text_limit instead.
# File 'lib/rexml/document.rb', line 433
def Document::entity_expansion_text_limit
  Security.entity_expansion_text_limit
end
.entity_expansion_text_limit=(val) (rw)
Set the entity expansion text limit. By default the limit is set to 10240.
Deprecated. Use Security.entity_expansion_text_limit= instead.
# File 'lib/rexml/document.rb', line 426
def Document::entity_expansion_text_limit=( val )
  Security.entity_expansion_text_limit = val
end
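Because the class-level accessors above only delegate to Security, new code can work with REXML::Security directly. The following is a small sketch of that; the value 500 is an arbitrary illustration, and the original limit is restored afterwards because the setting is global.

```ruby
require 'rexml/document'

# Remember the current global limit so it can be restored.
original = REXML::Security.entity_expansion_limit

# Set the limit through Security (the non-deprecated path).
REXML::Security.entity_expansion_limit = 500

# The deprecated Document accessor reads the same underlying value.
via_document = REXML::Document.entity_expansion_limit

# Restore the previous global value.
REXML::Security.entity_expansion_limit = original
```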
Class Method Details
.parse_stream(source, listener)
# File 'lib/rexml/document.rb', line 405
def Document::parse_stream( source, listener )
  Parsers::StreamParser.new( source, listener ).parse
end
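A short sketch of how parse_stream might be used: the listener receives callbacks while the source is parsed, and no document tree is built. The TagCollector class and its tags attribute are hypothetical names for this example; REXML::StreamListener supplies no-op defaults for the callbacks that are not overridden.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener that records the name of every start tag.
class TagCollector
  include REXML::StreamListener

  attr_reader :tags

  def initialize
    @tags = []
  end

  # Called by the stream parser for each opening (or empty) tag.
  def tag_start(name, attrs)
    @tags << name
  end
end

listener = TagCollector.new
REXML::Document.parse_stream('<a><b/><c/></a>', listener)
collected = listener.tags
```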
Instance Attribute Details
#entity_expansion_count (readonly)
# File 'lib/rexml/document.rb', line 437
attr_reader :entity_expansion_count
#entity_expansion_limit=(value) (writeonly)
# File 'lib/rexml/document.rb', line 438
attr_writer :entity_expansion_limit
#entity_expansion_text_limit (rw)
# File 'lib/rexml/document.rb', line 439
attr_accessor :entity_expansion_text_limit
#namespaces_cache (rw, private)
# File 'lib/rexml/document.rb', line 454
attr_accessor :namespaces_cache
#stand_alone? ⇒ Boolean (readonly)
# File 'lib/rexml/document.rb', line 309
def stand_alone?
  xml_decl().stand_alone?
end
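A brief sketch of stand_alone?, assuming a document whose declaration sets standalone and one with no declaration at all. Despite the question mark in the name, the value comes from the XML declaration rather than being a strict true/false.

```ruby
require 'rexml/document'

# Declaration with an explicit standalone value.
with_decl = REXML::Document.new('<?xml version="1.0" standalone="yes"?><root/>')
declared  = with_decl.stand_alone?

# No declaration: the default XMLDecl has no standalone value.
without_decl = REXML::Document.new('<root/>')
defaulted    = without_decl.stand_alone?
```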
Instance Method Details
#<<(child)
Alias for #add.
# File 'lib/rexml/document.rb', line 205
alias :<< :add
#add(xml_decl) ⇒ self
#add(doc_type) ⇒ self
#add(object) ⇒ self
Also known as: #<<
Adds an object to the document; returns self.
When argument xml_decl is given, it must be an XMLDecl object, which becomes the XML declaration for the document, replacing the previous XML declaration, if any:
d = REXML::Document.new
d.xml_decl.to_s # => ""
d.add(REXML::XMLDecl.new('2.0'))
d.xml_decl.to_s # => "<?xml version='2.0'?>"
When argument doc_type is given, it must be a DocType object, which becomes the document type for the document, replacing the previous document type, if any:
d = REXML::Document.new
d.doctype.to_s # => ""
d.add(REXML::DocType.new('foo'))
d.doctype.to_s # => "<!DOCTYPE foo>"
When argument object (not an XMLDecl or DocType object) is given it is added as the last child:
d = REXML::Document.new
d.add(REXML::Element.new('foo'))
d.to_s # => "<foo/>"
# File 'lib/rexml/document.rb', line 174
def add( child )
  if child.kind_of? XMLDecl
    if @children[0].kind_of? XMLDecl
      @children[0] = child
    else
      @children.unshift child
    end
    child.parent = self
  elsif child.kind_of? DocType
    # Find first Element or DocType node and insert the decl right
    # before it. If there is no such node, just insert the child at the
    # end. If there is a child and it is an DocType, then replace it.
    insert_before_index = @children.find_index { |x|
      x.kind_of?(Element) || x.kind_of?(DocType)
    }
    if insert_before_index # Not null = not end of list
      if @children[ insert_before_index ].kind_of? DocType
        @children[ insert_before_index ] = child
      else
        @children[ insert_before_index-1, 0 ] = child
      end
    else # Insert at end of list
      @children << child
    end
    child.parent = self
  else
    rv = super
    raise "attempted adding second root element to document" if @elements.size > 1
    rv
  end
end
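The final branch of the source above raises when a document ends up with more than one element child. A minimal sketch of that case follows; the element names are illustrative.

```ruby
require 'rexml/document'

doc = REXML::Document.new('<root/>')

# Adding a second top-level element triggers the guard in #add.
error = nil
begin
  doc.add(REXML::Element.new('second'))
rescue RuntimeError => e
  error = e
end

message = error && error.message
```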
#add_element(name_or_element = nil, attributes = nil) ⇒ new_element
Adds an element to the document by calling REXML::Element.add_element:
REXML::Element.add_element(name_or_element, attributes)
# File 'lib/rexml/document.rb', line 213
def add_element(arg=nil, arg2=nil)
  rv = super
  raise "attempted adding second root element to document" if @elements.size > 1
  rv
end
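As a small sketch, add_element can build the root of an initially empty document; the element name and attribute here are illustrative.

```ruby
require 'rexml/document'

doc = REXML::Document.new

# Returns the newly created element, which becomes the document root.
el = doc.add_element('root', {'id' => '1'})

serialized = doc.to_s
```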
#build(source) (private)
# File 'lib/rexml/document.rb', line 467
def build( source )
  Parsers::TreeParser.new( source, self ).parse
end
#clone ⇒ Document
Returns the new document resulting from executing Document.new(self). See .new.
# File 'lib/rexml/document.rb', line 124
def clone
  Document.new self
end
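Since clone goes through Document.new(self), the copy carries over the context but not the children, as described for the constructor. A minimal sketch, with illustrative content:

```ruby
require 'rexml/document'

original = REXML::Document.new('<root><a/></root>', {raw: :all})
copy     = original.clone

copy_children = copy.children   # children are not copied
copy_context  = copy.context    # context is carried over
```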
#doctype ⇒ doc_type?
# File 'lib/rexml/document.rb', line 245
def doctype
  @children.find { |item| item.kind_of? DocType }
end
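A short sketch of doctype for a document with and without a DOCTYPE; the name foo is illustrative.

```ruby
require 'rexml/document'

with_doctype    = REXML::Document.new('<!DOCTYPE foo><foo/>')
without_doctype = REXML::Document.new('<foo/>')

found   = with_doctype.doctype      # a REXML::DocType object
missing = without_doctype.doctype   # nil when no DocType child exists
```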
#document
# File 'lib/rexml/document.rb', line 448
def document
  self
end
#enable_cache (private)
A new document-level cache is created and is available inside the block. This API is not thread-safe. Callers must not modify the document inside the block.
# File 'lib/rexml/document.rb', line 458
def enable_cache
  @namespaces_cache = {}
  begin
    yield
  ensure
    @namespaces_cache = nil
  end
end
#encoding ⇒ encoding_string
# File 'lib/rexml/document.rb', line 294
def encoding
  xml_decl().encoding
end
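A minimal sketch of encoding, assuming the declaration is attached with #add rather than parsed from text. The encoding name ISO-8859-1 is only an illustration.

```ruby
require 'rexml/document'

# No declaration: the default XMLDecl supplies the encoding.
doc = REXML::Document.new('<root/>')
default_encoding = doc.encoding

# Attach an explicit declaration and read the encoding back.
doc.add(REXML::XMLDecl.new('1.0', 'ISO-8859-1'))
declared_encoding = doc.encoding
```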
#expanded_name ⇒ empty_string Also known as: #name
Returns an empty string.
# File 'lib/rexml/document.rb', line 133
def expanded_name
  ''
  #d = doc_type
  #d ? d.name : "UNDEFINED"
end
#name
Alias for #expanded_name.
# File 'lib/rexml/document.rb', line 138
alias :name :expanded_name
#node_type ⇒ Document
Returns the symbol :document.
# File 'lib/rexml/document.rb', line 114
def node_type
  :document
end
#record_entity_expansion
# File 'lib/rexml/document.rb', line 441
def record_entity_expansion
  @entity_expansion_count += 1
  if @entity_expansion_count > @entity_expansion_limit
    raise "number of entity expansions exceeded, processing aborted."
  end
end
#root ⇒ root_element?
# File 'lib/rexml/document.rb', line 229
def root
  elements[1]
  #self
  #@children.find { |item| item.kind_of? Element }
end
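A brief sketch of root for a populated document and for an empty one; the element names are illustrative.

```ruby
require 'rexml/document'

populated = REXML::Document.new('<root><a/></root>')
empty     = REXML::Document.new

root_name  = populated.root.name   # name of the first (and only) element child
empty_root = empty.root            # nil when the document has no element
```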
#version ⇒ version_string
# File 'lib/rexml/document.rb', line 279
def version
  xml_decl().version
end
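A minimal sketch of version, again assuming the declaration is attached with #add. The version string 1.1 is only an illustration.

```ruby
require 'rexml/document'

# No declaration: the default XMLDecl supplies the version.
doc = REXML::Document.new('<root/>')
default_version = doc.version

# Replace the declaration and read the version back.
doc.add(REXML::XMLDecl.new('1.1'))
declared_version = doc.version
```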
#write(output = $stdout, indent = -1, transitive = false, ie_hack = false, encoding = nil)
#write(options = {:output => $stdout, :indent => -1, :transitive => false, :ie_hack => false, :encoding => nil})
Write the XML tree out, optionally with indent. This writes out the entire XML document, including XML declarations, doctype declarations, and processing instructions (if any are given).
A controversial point is whether Document should always write the XML declaration (<?xml version='1.0'?>), whether or not one is given by the user (or source document). REXML does not write one if one was not specified, because it adds unnecessary bandwidth to applications such as XML-RPC.
Both positional arguments and a single options Hash are accepted. The options Hash style is recommended whenever one or more arguments are given.
Examples
Document.new("<a><b/></a>").write
output = ""
Document.new("<a><b/></a>").write(output)
output = ""
Document.new("<a><b/></a>").write(:output => output, :indent => 2)
See also the classes in the rexml/formatters package for the proper way to change the default formatting of XML output.
Examples
require 'rexml/formatters/transitive'
output = ""
tr = REXML::Formatters::Transitive.new
tr.write(Document.new("<a><b/></a>"), output)
- output
An object that supports '<< string'; this is where the document will be written. Defaults to $stdout.
- indent
An integer. If -1, no indenting will be used; otherwise, the indentation will be twice this number of spaces, and children will be indented an additional amount. For a value of 3, every item will be indented 3 more levels, or 6 more spaces (2 * 3). Defaults to -1.
- transitive
If transitive is true and indent is >= 0, then the output will be pretty-printed in such a way that the added whitespace does not affect the absolute value of the document; that is, it leaves the value and number of Text nodes in the document unchanged. Defaults to false.
- ie_hack
This hack inserts a space before the /> on empty tags to address a limitation of Internet Explorer. Defaults to false.
- encoding
The encoding name as a String. The output is written in this encoding instead of the encoding given in the XML declaration. Defaults to nil, which means the encoding in the XML declaration is used.
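The two calling styles described above can be sketched side by side. The variable names are illustrative, and mutable strings (+"") are used as the output targets so the example also works when string literals are frozen.

```ruby
require 'rexml/document'

doc = REXML::Document.new('<a><b/></a>')

# Positional style: only the output target is given, so no indenting is applied.
compact = +""
doc.write(compact)

# Options Hash style: same target semantics, with indentation enabled.
pretty = +""
doc.write(output: pretty, indent: 2)
```

With indent at its default of -1 the default formatter is used, while a non-negative indent selects the pretty-printing formatter, as the source of write shows.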
# File 'lib/rexml/document.rb', line 369
def write(*arguments)
  if arguments.size == 1 and arguments[0].class == Hash
    options = arguments[0]
    output = options[:output]
    indent = options[:indent]
    transitive = options[:transitive]
    ie_hack = options[:ie_hack]
    encoding = options[:encoding]
  else
    output, indent, transitive, ie_hack, encoding, = *arguments
  end
  output ||= $stdout
  indent ||= -1
  transitive = false if transitive.nil?
  ie_hack = false if ie_hack.nil?
  encoding ||= xml_decl.encoding
  if encoding != 'UTF-8' && !output.kind_of?(Output)
    output = Output.new( output, encoding )
  end
  formatter = if indent > -1
    if transitive
      require_relative "formatters/transitive"
      REXML::Formatters::Transitive.new( indent, ie_hack )
    else
      REXML::Formatters::Pretty.new( indent, ie_hack )
    end
  else
    REXML::Formatters::Default.new( ie_hack )
  end
  formatter.write( self, output )
end
#xml_decl ⇒ xml_decl
Returns the XMLDecl object for the document, if it exists, otherwise the default XMLDecl object:
d = REXML::Document.new('<?xml version="1.0" encoding="UTF-8"?><root/>')
d.xml_decl.class # => REXML::XMLDecl
d.xml_decl.to_s # => "<?xml version='1.0' encoding='UTF-8'?>"
d = REXML::Document.new('')
d.xml_decl.class # => REXML::XMLDecl
d.xml_decl.to_s # => ""