Class: Nokogiri::XML::SAX::Parser
Relationships & Source Files | |
Inherits: | Object |
Defined in: | lib/nokogiri/xml/sax/parser.rb, ext/nokogiri/xml_sax_parser.c |
Overview
This parser is a ::Nokogiri::XML::SAX style parser that reads its input as it deems necessary. The parser takes a Document and an optional encoding; then, given XML input, it sends messages to the Document.
Here is an example of using this parser:
require "nokogiri"

# Create a subclass of Nokogiri::XML::SAX::Document and implement
# the events we care about:
class MyHandler < Nokogiri::XML::SAX::Document
  def start_element(name, attrs = [])
    puts "starting: #{name}"
  end

  def end_element(name)
    puts "ending: #{name}"
  end
end

parser = Nokogiri::XML::SAX::Parser.new(MyHandler.new)

# Hand an IO object to the parser, which will read the XML from the IO.
# path_to_xml is a placeholder for the path to your XML file.
File.open(path_to_xml) do |f|
  parser.parse(f)
end
For more information about SAX parsers, see ::Nokogiri::XML::SAX.
Also see Document for the available events.
For HTML documents, use the subclass ::Nokogiri::HTML4::SAX::Parser.
Constant Summary
-
ENCODINGS =
Internal use only
# File 'lib/nokogiri/xml/sax/parser.rb', line 46
{ # :nodoc:
  "NONE" => 0, # No char encoding detected
  "UTF-8" => 1, # UTF-8
  "UTF16LE" => 2, # UTF-16 little endian
  "UTF16BE" => 3, # UTF-16 big endian
  "UCS4LE" => 4, # UCS-4 little endian
  "UCS4BE" => 5, # UCS-4 big endian
  "EBCDIC" => 6, # EBCDIC uh!
  "UCS4-2143" => 7, # UCS-4 unusual ordering
  "UCS4-3412" => 8, # UCS-4 unusual ordering
  "UCS2" => 9, # UCS-2
  "ISO-8859-1" => 10, # ISO-8859-1 ISO Latin 1
  "ISO-8859-2" => 11, # ISO-8859-2 ISO Latin 2
  "ISO-8859-3" => 12, # ISO-8859-3
  "ISO-8859-4" => 13, # ISO-8859-4
  "ISO-8859-5" => 14, # ISO-8859-5
  "ISO-8859-6" => 15, # ISO-8859-6
  "ISO-8859-7" => 16, # ISO-8859-7
  "ISO-8859-8" => 17, # ISO-8859-8
  "ISO-8859-9" => 18, # ISO-8859-9
  "ISO-2022-JP" => 19, # ISO-2022-JP
  "SHIFT-JIS" => 20, # Shift_JIS
  "EUC-JP" => 21, # EUC-JP
  "ASCII" => 22, # pure ASCII
}
-
REVERSE_ENCODINGS =
Internal use only
ENCODINGS.invert
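REVERSE_ENCODINGS is built with Hash#invert, which swaps keys and values so that a numeric encoding id can be mapped back to its name. The following is a minimal sketch of that idea on a three-entry excerpt of the table; the EXAMPLE_ constant names are hypothetical and exist only so the sketch does not touch the internal constants.

```ruby
# Excerpt of the name-to-id table; the EXAMPLE_ names are hypothetical.
EXAMPLE_ENCODINGS = {
  "NONE" => 0,
  "UTF-8" => 1,
  "ASCII" => 22,
}.freeze

# Hash#invert swaps keys and values, giving an id-to-name lookup.
EXAMPLE_REVERSE_ENCODINGS = EXAMPLE_ENCODINGS.invert

puts EXAMPLE_REVERSE_ENCODINGS[1]   # prints "UTF-8"
puts EXAMPLE_REVERSE_ENCODINGS[22]  # prints "ASCII"
```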
::Nokogiri::ClassResolver
- Included
Class Method Summary
-
.new ⇒ SAX::Parser
constructor
Create a new Parser.
Instance Attribute Summary
Instance Method Summary
- #parse(input) {|parser_context| ... }
-
#parse_file(filename) {|parser_context| ... }
Parse a file.
-
#parse_io(io) {|parser_context| ... }
Parse an input stream.
-
#parse_memory(input) {|parser_context| ... }
Parse an input string.
- #initialize_native private
::Nokogiri::ClassResolver
- Included
#related_class | Find a class constant within the. |
Constructor Details
.new ⇒ SAX::Parser
.new(handler) ⇒ SAX::Parser
.new(handler, encoding) ⇒ SAX::Parser
Create a new Parser.
- Parameters
-
handler (optional Document) The document that will receive events. Will create a new Nokogiri::XML::SAX::Document if not given, which is accessible through the #document attribute.
-
encoding (optional Encoding, String, nil) An Encoding or encoding name to use when parsing the input. (default nil for auto-detection)
Instance Attribute Details
#document (rw)
The Document where events will be sent.
# File 'lib/nokogiri/xml/sax/parser.rb', line 75
attr_accessor :document
#encoding (rw)
The encoding being used for this document.
# File 'lib/nokogiri/xml/sax/parser.rb', line 78
attr_accessor :encoding
Instance Method Details
#initialize_native (private)
# File 'ext/nokogiri/xml_sax_parser.c', line 328
static VALUE
noko_xml_sax_parser__initialize_native(VALUE self)
{
  xmlSAXHandlerPtr handler = noko_xml_sax_parser_unwrap(self);

  handler->startDocument = noko_xml_sax_parser_start_document_callback;
  handler->endDocument = noko_xml_sax_parser_end_document_callback;
  handler->startElement = noko_xml_sax_parser_start_element_callback;
  handler->endElement = noko_xml_sax_parser_end_element_callback;
  handler->startElementNs = noko_xml_sax_parser_start_element_ns_callback;
  handler->endElementNs = noko_xml_sax_parser_end_element_ns_callback;
  handler->characters = noko_xml_sax_parser_characters_callback;
  handler->comment = noko_xml_sax_parser_comment_callback;
  handler->warning = noko_xml_sax_parser_warning_callback;
  handler->error = noko_xml_sax_parser_error_callback;
  handler->cdataBlock = noko_xml_sax_parser_cdata_block_callback;
  handler->processingInstruction = noko_xml_sax_parser_processing_instruction_callback;
  handler->reference = noko_xml_sax_parser_reference_callback;

  /* use some of libxml2's default callbacks to manage DTDs and entities */
  handler->getEntity = xmlSAX2GetEntity;
  handler->internalSubset = xmlSAX2InternalSubset;
  handler->externalSubset = xmlSAX2ExternalSubset;
  handler->isStandalone = xmlSAX2IsStandalone;
  handler->hasInternalSubset = xmlSAX2HasInternalSubset;
  handler->hasExternalSubset = xmlSAX2HasExternalSubset;
  handler->resolveEntity = xmlSAX2ResolveEntity;
  handler->getParameterEntity = xmlSAX2GetParameterEntity;
  handler->entityDecl = xmlSAX2EntityDecl;
  handler->unparsedEntityDecl = xmlSAX2UnparsedEntityDecl;

  handler->initialized = XML_SAX2_MAGIC;

  return self;
}
#parse(input) {|parser_context| ... }
Parse the input, sending events to the Document at #document.
- Parameters
-
input (String, IO) The input to parse.
If input quacks like a readable IO object, this method forwards to #parse_io, otherwise it forwards to #parse_memory.
- Yields
-
If a block is given, the underlying ParserContext object will be yielded. This can be used to set options on the parser context before parsing begins.
# File 'lib/nokogiri/xml/sax/parser.rb', line 119
def parse(input, &block)
  if input.respond_to?(:read) && input.respond_to?(:close)
    parse_io(input, &block)
  else
    parse_memory(input, &block)
  end
end
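The duck-typing check in #parse can be tried out in isolation. The sketch below uses a hypothetical helper, parse_route, that applies the same respond_to? test and reports which method #parse would forward to; it does not call Nokogiri at all.

```ruby
require "stringio"

# Hypothetical helper that mirrors the duck-typing test used by #parse.
def parse_route(input)
  if input.respond_to?(:read) && input.respond_to?(:close)
    :parse_io
  else
    :parse_memory
  end
end

puts parse_route(StringIO.new("<root/>"))  # prints "parse_io"
puts parse_route("<root/>")                # prints "parse_memory"
```

Because the test is based on methods rather than classes, any object that responds to both read and close is treated as a stream, not only instances of IO.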
#parse_file(filename) {|parser_context| ... }
#parse_file(filename, encoding) {|parser_context| ... }
Parse a file.
- Parameters
-
filename (String) The path to the file to be parsed.
-
encoding (optional Encoding, String, nil) An Encoding or encoding name to use when parsing the input, or nil for auto-detection. (default #encoding)
- Yields
-
If a block is given, the underlying ParserContext object will be yielded. This can be used to set options on the parser context before parsing begins.
# File 'lib/nokogiri/xml/sax/parser.rb', line 187
def parse_file(filename, encoding = @encoding)
  raise ArgumentError, "no filename provided" unless filename
  raise Errno::ENOENT unless File.exist?(filename)
  raise Errno::EISDIR if File.directory?(filename)

  ctx = related_class("ParserContext").file(filename, encoding)
  yield ctx if block_given?
  ctx.parse_with(self)
end
#parse_io(io) {|parser_context| ... }
#parse_io(io, encoding) {|parser_context| ... }
Parse an input stream.
- Parameters
-
io (IO) The readable IO object from which to read input.
-
encoding (optional Encoding, String, nil) An Encoding or encoding name to use when parsing the input, or nil for auto-detection. (default #encoding)
- Yields
-
If a block is given, the underlying ParserContext object will be yielded. This can be used to set options on the parser context before parsing begins.
#parse_memory(input) {|parser_context| ... }
#parse_memory(input, encoding) {|parser_context| ... }
Parse an input string.
- Parameters
-
input (String) The input string to be parsed.
-
encoding (optional Encoding, String, nil) An Encoding or encoding name to use when parsing the input, or nil for auto-detection. (default #encoding)
- Yields
-
If a block is given, the underlying ParserContext object will be yielded. This can be used to set options on the parser context before parsing begins.