Module: LibXML::XML::Encoding
Relationships & Source Files
Defined in: ext/libxml/ruby_xml_encoding.c
Overview
The Encoding module exposes the encodings that libxml supports as constants.
LibXML converts all data sources to UTF-8 internally before processing them. By default, LibXML determines a data source's encoding using the algorithm described on its website.
However, you can override a data source's encoding by passing one of the encoding constants defined in this module.
Example 1:
io = File.open('some_file', 'rb')
parser = XML::Parser.io(io, :encoding => XML::Encoding::ISO_8859_1)
doc = parser.parse
Example 2:
parser = XML::HTMLParser.file("some_file", :encoding => XML::Encoding::ISO_8859_1)
doc = parser.parse
Example 3:
document = XML::Document.new
document.encoding = XML::Encoding::ISO_8859_1
document.root = XML::Node.new('root')  # 'root' is a placeholder element name
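To make the internal conversion to UTF-8 more concrete, here is a minimal sketch that uses only Ruby's built-in String#encode, not libxml. The sample string is an assumption chosen for illustration, and libxml performs its own conversion in C, so this shows the idea rather than the library's code path.

```ruby
# Illustration only: plain Ruby, no libxml involved.
# "caf\xE9" is "café" in ISO-8859-1, where é is the single byte 0xE9.
latin1 = "caf\xE9".b.force_encoding(Encoding::ISO_8859_1)

# Transcode to UTF-8, where é becomes the two bytes 0xC3 0xA9.
utf8 = latin1.encode(Encoding::UTF_8)

puts utf8.encoding         # UTF-8
puts latin1.bytes.length   # 4
puts utf8.bytes.length     # 5
```

If the bytes were tagged with the wrong source encoding, the transcoded text would be wrong or invalid, which is why an explicit encoding override matters when automatic detection fails.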
Constant Summary
-
ASCII =
pure ASCII.
22
-
EBCDIC =
EBCDIC uh!
6
-
ERROR =
No char encoding detected.
-1
-
EUC_JP =
EUC-JP.
21
-
ISO_2022_JP =
ISO-2022-JP.
19
-
ISO_8859_1 =
ISO-8859-1 ISO Latin 1.
10
-
ISO_8859_2 =
ISO-8859-2 ISO Latin 2.
11
-
ISO_8859_3 =
ISO-8859-3.
12
-
ISO_8859_4 =
ISO-8859-4.
13
-
ISO_8859_5 =
ISO-8859-5.
14
-
ISO_8859_6 =
ISO-8859-6.
15
-
ISO_8859_7 =
ISO-8859-7.
16
-
ISO_8859_8 =
ISO-8859-8.
17
-
ISO_8859_9 =
ISO-8859-9.
18
-
NONE =
No char encoding detected.
0
-
SHIFT_JIS =
Shift_JIS.
20
-
UCS_2 =
UCS-2.
9
-
UCS_4BE =
UCS-4 big endian.
5
-
UCS_4LE =
UCS-4 little endian.
4
-
UCS_4_2143 =
UCS-4 unusual ordering.
7
-
UCS_4_3412 =
UCS-4 unusual ordering.
8
-
UTF_16BE =
UTF-16 big endian.
3
-
UTF_16LE =
UTF-16 little endian.
2
-
UTF_8 =
UTF-8
1
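Because the constants are plain integers, a raw value such as 10 can be hard to read in logs. The sketch below is a hypothetical stand-in table (ENCODING_VALUES and encoding_constant_name are illustrative names, not part of libxml-ruby) built from the values listed above, so it runs without the library installed.

```ruby
# Hypothetical stand-in for the constants listed above; values are
# copied from the Constant Summary. Not part of libxml-ruby itself.
ENCODING_VALUES = {
  ERROR: -1, NONE: 0,
  UTF_8: 1, UTF_16LE: 2, UTF_16BE: 3,
  UCS_4LE: 4, UCS_4BE: 5, EBCDIC: 6,
  UCS_4_2143: 7, UCS_4_3412: 8, UCS_2: 9,
  ISO_8859_1: 10, ISO_8859_2: 11, ISO_8859_3: 12,
  ISO_8859_4: 13, ISO_8859_5: 14, ISO_8859_6: 15,
  ISO_8859_7: 16, ISO_8859_8: 17, ISO_8859_9: 18,
  ISO_2022_JP: 19, SHIFT_JIS: 20, EUC_JP: 21, ASCII: 22
}.freeze

# Reverse lookup: find the constant name for a raw integer value.
# Returns nil when the value matches no known constant.
def encoding_constant_name(value)
  ENCODING_VALUES.key(value)
end

puts encoding_constant_name(10)          # ISO_8859_1
puts encoding_constant_name(99).inspect  # nil
```

With the library loaded, a similar lookup could presumably be done by scanning the module's constants with Ruby's Module#constants and Module#const_get instead of a hand-written table.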
Class Method Summary
-
.from_s("UTF_8") ⇒ Encoding
mod_func
Converts an encoding string to an encoding constant defined on the Encoding class.
-
Input.encoding_to_rb_encoding(Input::ENCODING) ⇒ Encoding
mod_func
Converts an encoding constant defined on the Encoding class to a Ruby encoding object (available on Ruby 1.9.* and higher).
-
.to_s(XML::Encoding::UTF_8) ⇒ "UTF-8"
mod_func
Converts an encoding constant defined on the Encoding class to its text representation.
Class Method Details
.from_s("UTF_8") ⇒ Encoding
(mod_func)
Converts an encoding string to an encoding constant defined on the Encoding class.
# File 'ext/libxml/ruby_xml_encoding.c', line 49
static VALUE rxml_encoding_from_s(VALUE klass, VALUE encoding)
{
  xmlCharEncoding xencoding;

  if (encoding == Qnil)
    return Qnil;

  xencoding = xmlParseCharEncoding(StringValuePtr(encoding));
  return INT2NUM(xencoding);
}
Input.encoding_to_rb_encoding(Input::ENCODING) ⇒ Encoding
(mod_func)
Converts an encoding constant defined on the Encoding class to a Ruby encoding object (available on Ruby 1.9.* and higher).
# File 'ext/libxml/ruby_xml_encoding.c', line 160
VALUE rxml_encoding_to_rb_encoding(VALUE klass, VALUE encoding)
{
  xmlCharEncoding xmlEncoding = (xmlCharEncoding)NUM2INT(encoding);
  rb_encoding* rbencoding = rxml_xml_encoding_to_rb_encoding(klass, xmlEncoding);
  return rb_enc_from_encoding(rbencoding);
}
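The idea of mapping an integer constant to a Ruby encoding object can be approximated in plain Ruby. The sketch below is an assumption-laden illustration: RUBY_ENCODING_NAMES and sketch_to_rb_encoding are hypothetical names, the table covers only a few of the constants, and the fallback to a binary encoding for unmapped values is a choice made here. The real mapping is implemented in C and may pick different names or fallbacks.

```ruby
# Hypothetical pure-Ruby approximation; not the library's implementation.
# Keys are the integer values from the Constant Summary.
RUBY_ENCODING_NAMES = {
  1  => 'UTF-8',
  2  => 'UTF-16LE',
  3  => 'UTF-16BE',
  10 => 'ISO-8859-1',
  11 => 'ISO-8859-2',
  19 => 'ISO-2022-JP',
  20 => 'Shift_JIS',
  21 => 'EUC-JP',
  22 => 'US-ASCII'
}.freeze

# Returns a Ruby Encoding for a known value, or Encoding::BINARY as an
# assumed fallback for anything not in the table (such as NONE or ERROR).
def sketch_to_rb_encoding(value)
  name = RUBY_ENCODING_NAMES[value]
  name ? Encoding.find(name) : Encoding::BINARY
end

enc = sketch_to_rb_encoding(10)
puts enc.name                                       # ISO-8859-1
puts sketch_to_rb_encoding(0) == Encoding::BINARY   # true
```

A Ruby encoding object obtained this way can be passed to String#force_encoding or String#encode, which is the usual reason to want one.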
.to_s(XML::Encoding::UTF_8) ⇒ "UTF-8"
(mod_func)
Converts an encoding constant defined on the Encoding class to its text representation.
# File 'ext/libxml/ruby_xml_encoding.c', line 67
static VALUE rxml_encoding_to_s(VALUE klass, VALUE encoding)
{
  const xmlChar* xencoding = (const xmlChar*)xmlGetCharEncodingName(NUM2INT(encoding));

  if (!xencoding)
    return Qnil;
  else
    return rxml_new_cstr(xencoding, xencoding);
}