Class: Nokogiri::HTML4::SAX::Parser

Relationships & Source Files
Inherits: Nokogiri::XML::SAX::Parser
Defined in: lib/nokogiri/html4/sax/parser.rb

Overview

This class lets you perform SAX-style parsing on HTML with HTML error correction.

Here is a basic usage example:

require "nokogiri"

# Handler that prints the name of every element the parser opens.
class MyDoc < Nokogiri::XML::SAX::Document
  def start_element(name, attributes = [])
    puts "found a #{name}"
  end
end

parser = Nokogiri::HTML4::SAX::Parser.new(MyDoc.new)
parser.parse(File.read(ARGV[0], mode: 'rb'))

For more information on SAX parsers, see Nokogiri::XML::SAX.

Constant Summary

::Nokogiri::XML::SAX::Parser - Inherited

ENCODINGS

Class Method Summary

::Nokogiri::XML::SAX::Parser - Inherited

.new

Create a new Parser with doc and encoding

Instance Attribute Summary

::Nokogiri::XML::SAX::Parser - Inherited

#document

The ::Nokogiri::XML::SAX::Document where events will be sent.

#encoding

The encoding being used for this document.

Instance Method Summary

::Nokogiri::XML::SAX::Parser - Inherited

#parse

Parse the given input, which may be a String containing markup or an IO object.

#parse_file

Parse a file with filename

#parse_io

Parse given io

#parse_memory, #check_encoding

Constructor Details

This class inherits a constructor from Nokogiri::XML::SAX::Parser

Instance Method Details

#parse_file(filename, encoding = "UTF-8") {|ctx| ... }

Parse the file at filename using encoding.

Yields:

  • (ctx)

Raises:

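  • (ArgumentError) if filename is nil or false
  • (Errno::ENOENT) if no file exists at filename
  • (Errno::EISDIR) if filename is a directory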

  
# File 'lib/nokogiri/html4/sax/parser.rb', line 51

def parse_file(filename, encoding = "UTF-8")
  raise ArgumentError unless filename
  raise Errno::ENOENT unless File.exist?(filename)
  raise Errno::EISDIR if File.directory?(filename)

  ctx = ParserContext.file(filename, encoding)
  yield ctx if block_given?
  ctx.parse_with(self)
end

#parse_io(io, encoding = "UTF-8") {|ctx| ... }

Parse the given io using encoding.

Yields:

  • (ctx)

  
# File 'lib/nokogiri/html4/sax/parser.rb', line 41

def parse_io(io, encoding = "UTF-8")
  check_encoding(encoding)
  @encoding = encoding
  ctx = ParserContext.io(io, ENCODINGS[encoding])
  yield ctx if block_given?
  ctx.parse_with(self)
end

#parse_memory(data, encoding = "UTF-8") {|ctx| ... }

Parse html stored in data using encoding

Yields:

  • (ctx)

Raises:

  • (TypeError) if data is not a String

  
# File 'lib/nokogiri/html4/sax/parser.rb', line 30

def parse_memory(data, encoding = "UTF-8")
  raise TypeError unless String === data
  return if data.empty?

  ctx = ParserContext.memory(data, encoding)
  yield ctx if block_given?
  ctx.parse_with(self)
end