Module: Nokogiri::XML::SAX
Defined in: lib/nokogiri/xml/sax/document.rb, ext/nokogiri/nokogiri.c, lib/nokogiri/xml/sax/parser.rb, lib/nokogiri/xml/sax/parser_context.rb, lib/nokogiri/xml/sax/push_parser.rb
Overview
SAX parsers are event-driven parsers. Nokogiri provides two different event-based parsers for XML. If you want to do SAX-style parsing of HTML, check out Nokogiri::HTML4::SAX.
A SAX-style parser works in three steps: you create a parser, tell it which events you are interested in, and give it some XML to process. The parser then notifies you whenever it encounters one of those events.
To register for events, subclass Nokogiri::XML::SAX::Document and implement the methods for which you would like notification.
For example, if I want to be notified when a document ends, and when an element starts, I would write a class like this:
class MyDocument < Nokogiri::XML::SAX::Document
  # Called once, after the whole document has been parsed
  def end_document
    puts "the document has ended"
  end

  # Called for every opening tag
  def start_element(name, attributes = [])
    puts "#{name} started"
  end
end
Then I would instantiate a SAX parser with this document, and feed the parser some XML:
# Create a new parser
parser = Nokogiri::XML::SAX::Parser.new(MyDocument.new)

# Feed the parser some XML, closing the file once parsing finishes
File.open(ARGV[0]) { |file| parser.parse(file) }
Now my document handler will be called when each element starts, and when the document ends. To see what kinds of events are available, take a look at Nokogiri::XML::SAX::Document.
Two SAX parsers for XML are available: a parser that reads from a string or IO object as it sees fit, and a parser that lets you spoon-feed it XML. If you want to let Nokogiri deal with reading your XML, use the Parser. If you want fine-grained control over the XML input, use the PushParser.