Class: RSpec::Support::EncodedString Private
Do not use. This class is for internal use only.
Relationships & Source Files
Inherits: Object
Defined in: rspec-support/lib/rspec/support/encoded_string.rb
Constant Summary
-
REPLACE =
Ruby’s default replacement string is:
U+FFFD ("\xEF\xBF\xBD"), for Unicode encoding forms, else ? ("\x3F")
"?"
-
US_ASCII =
# File 'rspec-support/lib/rspec/support/encoded_string.rb', line 9
"US-ASCII"
-
UTF_8 =
Reduce allocations by storing constants.
"UTF-8"
Class Method Summary
- .new(string, encoding = nil) ⇒ EncodedString constructor Internal use only
-
.pick_encoding(_source_a, _source_b)
See additional method definition at line 143.
Instance Attribute Summary
- #source_encoding readonly Internal use only
Instance Method Summary
- #<<(string) Internal use only
-
#split(regex_or_string)
See additional method definition at line 33.
- #to_s (also: #to_str) Internal use only
-
#to_str
Alias for #to_s.
-
#detect_source_encoding(_string)
private
See additional method definition at line 139.
-
#matching_encoding(string)
private
Internal use only
Encoding Exceptions:
- #remove_invalid_bytes(string) private Internal use only
Class Method Details
.pick_encoding(_source_a, _source_b)
See additional method definition at line 143.
# File 'rspec-support/lib/rspec/support/encoded_string.rb', line 148
def self.pick_encoding(source_a, source_b)
  Encoding.compatible?(source_a, source_b) || Encoding.default_external
end
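To illustrate the fallback in this one-liner, here is a minimal sketch that reimplements it outside RSpec. The name `pick_encoding_sketch` and the sample strings are assumptions made for illustration only; they are not part of rspec-support.

```ruby
# Hypothetical standalone copy of the documented one-liner.
def pick_encoding_sketch(source_a, source_b)
  Encoding.compatible?(source_a, source_b) || Encoding.default_external
end

ascii  = "abc".dup.force_encoding(Encoding::US_ASCII) # 7-bit content only
utf8   = "caf\u00E9".encode(Encoding::UTF_8)          # contains a non-ASCII character
binary = "\x80".b                                     # non-ASCII byte, ASCII-8BIT

# ASCII-only text is compatible with UTF-8, so the UTF-8 encoding is picked.
puts pick_encoding_sketch(ascii, utf8)

# Non-ASCII UTF-8 and non-ASCII binary are incompatible, so
# Encoding.compatible? returns nil and the default external encoding is used.
puts pick_encoding_sketch(utf8, binary) == Encoding.default_external
```

The second call shows why the `||` is needed: `Encoding.compatible?` returns `nil` rather than raising when no common encoding exists.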
Instance Attribute Details
#source_encoding (readonly)
# File 'rspec-support/lib/rspec/support/encoded_string.rb', line 21
attr_reader :source_encoding
Instance Method Details
#<<(string)
# File 'rspec-support/lib/rspec/support/encoded_string.rb', line 28
def <<(string)
  @string << matching_encoding(string)
end
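The reason the appended string is transcoded first can be sketched with plain `String` objects. The helper `append_transcoded` below is hypothetical and only approximates the documented behavior: the real method goes through `matching_encoding`, which also scrubs invalid bytes.

```ruby
# Hypothetical helper: transcode `addition` to the buffer's encoding
# (replacing anything unconvertible with "?") before appending it.
def append_transcoded(buffer, addition)
  buffer << addition.encode(buffer.encoding, invalid: :replace, undef: :replace, replace: "?")
end

buffer = "caf\u00E9 ".dup # UTF-8 with a non-ASCII character
binary = "\x80".b         # ASCII-8BIT byte with no UTF-8 mapping

begin
  buffer.dup << binary    # plain String#<< with incompatible encodings
rescue Encoding::CompatibilityError => e
  puts "plain append failed: #{e.class}"
end

puts append_transcoded(buffer, binary)
```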
#detect_source_encoding(_string) (private)
See additional method definition at line 139.
# File 'rspec-support/lib/rspec/support/encoded_string.rb', line 157
def detect_source_encoding(string)
  string.encoding
end
#matching_encoding(string) (private)
Encoding Exceptions:
Raised by Encoding and String methods:
  Encoding::UndefinedConversionError:
    when a transcoding operation fails
    if the String contains characters invalid for the target encoding
    e.g. "\x80".encode('UTF-8','ASCII-8BIT')
    vs "\x80".encode('UTF-8','ASCII-8BIT', undef: :replace, replace: '<undef>')
    # => '<undef>'
  Encoding::CompatibilityError
    when Encoding.compatible?(str1, str2) is nil
    e.g. utf_16le_emoji_string.split("\n")
    e.g. valid_unicode_string.encode(utf8_encoding) << ascii_string
  Encoding::InvalidByteSequenceError:
    when the string being transcoded contains a byte invalid for
    either the source or target encoding
    e.g. "\x80".encode('UTF-8','US-ASCII')
    vs "\x80".encode('UTF-8','US-ASCII', invalid: :replace, replace: '<byte>')
    # => '<byte>'
  ArgumentError
    when operating on a string with invalid bytes
    e.g. "\x80".split("\n")
  TypeError
    when a symbol is passed as an encoding
    Encoding.find(:"UTF-8")
    when calling force_encoding on an object
    that doesn't respond to #to_str
Raised by transcoding methods:
  Encoding::ConverterNotFoundError:
    when a named encoding does not correspond with a known converter
    e.g. 'abc'.force_encoding('UTF-8').encode('foo')
    or a converter path cannot be found
    e.g. "\x80".force_encoding('ASCII-8BIT').encode('Emacs-Mule')
Raised by byte <-> char conversions
  RangeError: out of char range
    e.g. the UTF-16LE emoji: 128169.chr
See additional method definition at line 93.
# File 'rspec-support/lib/rspec/support/encoded_string.rb', line 153
def matching_encoding(string)
  string = remove_invalid_bytes(string)
  string.encode(@encoding)
rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
  # Originally defined as a constant to avoid unneeded allocations, this hash must
  # be defined inline (without {}) to avoid warnings on Ruby 2.7
  #
  # In MRI 2.1 'invalid: :replace' changed to also replace an invalid byte sequence
  # see https://github.com/ruby/ruby/blob/v2_1_0/NEWS#L176
  # https://www.ruby-forum.com/topic/6861247
  # https://twitter.com/nalsh/status/553413844685438976
  #
  # For example, given:
  #   "\x80".force_encoding("Emacs-Mule").encode(:invalid => :replace).bytes.to_a
  #
  # On MRI 2.1 or above: 63  # '?'
  # else               : 128 # "\x80"
  #
  string.encode(@encoding, :invalid => :replace, :undef => :replace, :replace => REPLACE)
rescue Encoding::ConverterNotFoundError
  # Originally defined as a constant to avoid unneeded allocations, this hash must
  # be defined inline (without {}) to avoid warnings on Ruby 2.7
  string.dup.force_encoding(@encoding).encode(:invalid => :replace, :replace => REPLACE)
end
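The first two branches of this method (try a strict transcode, then retry with replacement) can be sketched with plain `String#encode`. The helper name `encode_with_replacement` and the sample inputs are assumptions for illustration; the sketch leaves out the initial scrub step and the `ConverterNotFoundError` branch.

```ruby
# Hypothetical helper mirroring the strict-then-lenient transcoding pattern.
def encode_with_replacement(string, target)
  string.encode(target)
rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
  string.encode(target, invalid: :replace, undef: :replace, replace: "?")
end

binary_byte = "\x80".b                                      # valid binary, no UTF-8 equivalent
bad_ascii   = "\x80".dup.force_encoding(Encoding::US_ASCII) # byte that is invalid in US-ASCII

[binary_byte, bad_ascii].each do |input|
  begin
    input.encode(Encoding::UTF_8) # strict transcode, expected to raise
  rescue EncodingError => e
    puts "#{input.encoding}: #{e.class}"
  end
  puts encode_with_replacement(input, Encoding::UTF_8).inspect
end
```

In both cases the lenient retry yields the replacement string instead of an exception, which is what lets the surrounding class keep building output from arbitrarily encoded input.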
#remove_invalid_bytes(string) (private)
# File 'rspec-support/lib/rspec/support/encoded_string.rb', line 132
def remove_invalid_bytes(string)
  string.scrub(REPLACE)
end
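A short sketch of what `String#scrub` does with the `"?"` replacement used here. The helper name `scrub_invalid` and the sample string are illustrative assumptions.

```ruby
# Hypothetical standalone equivalent of the one-line method above.
def scrub_invalid(string)
  string.scrub("?")
end

# A lone 0x80 byte is not a valid UTF-8 sequence.
broken  = "abc\x80def".dup.force_encoding(Encoding::UTF_8)
cleaned = scrub_invalid(broken)

puts broken.valid_encoding?  # false
puts cleaned                 # abc?def
puts cleaned.valid_encoding? # true
```

`scrub` returns a new string and leaves the receiver unchanged, so the original input is not mutated.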
#split(regex_or_string)
See additional method definition at line 33.
# File 'rspec-support/lib/rspec/support/encoded_string.rb', line 41
def split(regex_or_string)
  @string.split(matching_encoding(regex_or_string))
rescue ArgumentError
  # JRuby raises an ArgumentError when splitting a source string that
  # contains invalid bytes.
  remove_invalid_bytes(@string).split regex_or_string
end
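The reason the separator is passed through `matching_encoding` can be sketched with a UTF-16LE string, the case the exception notes for this class mention. The helper `split_transcoded` is hypothetical and handles string separators only.

```ruby
# Hypothetical helper: transcode the separator to the source string's
# encoding before splitting.
def split_transcoded(source, separator)
  source.split(separator.encode(source.encoding))
end

utf16 = "a\nb".encode(Encoding::UTF_16LE)

begin
  utf16.split("\n") # UTF-8 separator against a UTF-16LE source
rescue Encoding::CompatibilityError => e
  puts "plain split failed: #{e.class}"
end

parts = split_transcoded(utf16, "\n")
puts parts.map { |part| part.encode(Encoding::UTF_8) }.inspect
```

The resulting parts stay in the source encoding; they are transcoded to UTF-8 here only for printing.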
#to_s Also known as: #to_str
# File 'rspec-support/lib/rspec/support/encoded_string.rb', line 46
def to_s
  @string
end
#to_str
Alias for #to_s.
# File 'rspec-support/lib/rspec/support/encoded_string.rb', line 49
alias :to_str :to_s