Class: ActiveSupport::Multibyte::Chars
Relationships & Source Files
Super Chains via Extension / Inclusion / Inheritance
Instance Chain: self, Comparable
Inherits: Object
Defined in: activesupport/lib/active_support/multibyte/chars.rb
Overview
Chars enables you to work transparently with UTF-8 encoding in the Ruby ::String class without having extensive knowledge about the encoding. A Chars object accepts a string upon initialization and proxies ::String methods in an encoding-safe manner. All the normal ::String methods are also implemented on the proxy.
::String methods are proxied through the Chars object, and can be accessed through the mb_chars method. Methods which would normally return a ::String object now return a Chars object so methods can be chained.
'The Perfect String '.mb_chars.downcase.strip.normalize # => "the perfect string"
Chars objects are perfectly interchangeable with ::String objects as long as no explicit class checks are made. If certain methods do explicitly check the class, call #to_s before you pass Chars objects to them.
bad.explicit_checking_method 'T'.mb_chars.downcase.to_s
The default Chars implementation assumes that the encoding of the string is UTF-8. If you want to handle different encodings, you can write your own multibyte string handler and configure it through proxy_class.
class CharsForUTF32
  def size
    @wrapped_string.size / 4
  end

  def self.accepts?(string)
    string.length % 4 == 0
  end
end

ActiveSupport::Multibyte.proxy_class = CharsForUTF32
Class Method Summary
- .consumes?(string) ⇒ Boolean
  Returns true when the proxy class can handle the string.
- .new(string) ⇒ Chars (constructor)
  Creates a new Chars instance by wrapping string.
Instance Attribute Summary
- #to_s readonly
  Alias for #wrapped_string.
- #to_str readonly
  Alias for #wrapped_string.
- #wrapped_string (also: #to_s, #to_str) readonly
Instance Method Summary
- #<=>
- #=~
- #acts_like_string? ⇒ Boolean
- #capitalize
  Converts the first character to uppercase and the remainder to lowercase.
- #compose
  Performs composition on all the characters.
- #decompose
  Performs canonical decomposition on all the characters.
- #downcase
  Converts characters in the string to lowercase.
- #grapheme_length
  Returns the number of grapheme clusters in the string.
- #limit(limit)
  Limits the byte size of the string to a number of bytes without breaking characters.
- #method_missing(method, *args, &block)
  Forward all undefined methods to the wrapped string.
- #normalize(form = nil)
  Returns the KC normalization of the string by default.
- #respond_to_missing?(method, include_private) ⇒ Boolean
  Returns true if obj responds to the given method.
- #reverse
  Reverses all characters in the string.
- #slice!(*args)
  Works like String#slice!, but returns an instance of Chars, or nil if the string was not modified.
- #split(*args)
  Works just like String#split, with the exception that the items in the resulting list are Chars instances instead of ::String.
- #swapcase
  Converts characters in the string to the opposite case.
- #tidy_bytes(force = false)
  Replaces all ISO-8859-1 or CP1252 characters by their UTF-8 equivalent resulting in a valid UTF-8 string.
- #titlecase
  Alias for #titleize.
- #titleize (also: #titlecase)
  Capitalizes the first letter of every word, when possible.
- #upcase
  Converts characters in the string to uppercase.
Constructor Details
.new(string) ⇒ Chars
Creates a new Chars instance by wrapping string.
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 52
def initialize(string)
  @wrapped_string = string
  @wrapped_string.force_encoding(Encoding::UTF_8) unless @wrapped_string.frozen?
end
Dynamic Method Handling
This class handles dynamic methods through the method_missing method
#method_missing(method, *args, &block)
Forward all undefined methods to the wrapped string.
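The forwarding behaviour described above can be sketched in plain Ruby. This is a minimal illustration only: MiniChars is a hypothetical stand-in, not the ActiveSupport class, and it assumes that any String result should be wrapped again so calls can be chained.

```ruby
# Hypothetical stand-in for illustration; not ActiveSupport::Multibyte::Chars.
class MiniChars
  attr_reader :wrapped_string
  alias to_s wrapped_string
  alias to_str wrapped_string

  def initialize(string)
    @wrapped_string = string
  end

  # Forward unknown methods to the wrapped string and re-wrap String
  # results so that further calls can be chained on the proxy.
  def method_missing(method, *args, &block)
    return super unless @wrapped_string.respond_to?(method)

    result = @wrapped_string.public_send(method, *args, &block)
    result.is_a?(String) ? self.class.new(result) : result
  end

  # Keep respond_to? consistent with what method_missing accepts.
  def respond_to_missing?(method, include_private = false)
    @wrapped_string.respond_to?(method, include_private) || super
  end
end

chars = MiniChars.new("The Perfect String ")
chained = chars.downcase.strip
puts chained.class  # MiniChars
puts chained.to_s   # the perfect string
puts chained.length # 18
```

Non-String results such as the Integer from length are returned unwrapped, which is why the chain ends there.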
Class Method Details
.consumes?(string) ⇒ Boolean
Returns true when the proxy class can handle the string. Returns false otherwise.
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 76
def self.consumes?(string)
  string.encoding == Encoding::UTF_8
end
Instance Attribute Details
#to_s (readonly)
Alias for #wrapped_string.
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 46
alias to_s wrapped_string
#to_str (readonly)
Alias for #wrapped_string.
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 47
alias to_str wrapped_string
#wrapped_string (readonly) Also known as: #to_s, #to_str
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 45
attr_reader :wrapped_string
Instance Method Details
#<=>
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 49
delegate :<=>, :=~, :acts_like_string?, :to => :wrapped_string
#=~
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 49
delegate :<=>, :=~, :acts_like_string?, :to => :wrapped_string
#acts_like_string? ⇒ Boolean
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 49
delegate :<=>, :=~, :acts_like_string?, :to => :wrapped_string
#capitalize
Converts the first character to uppercase and the remainder to lowercase.
'über'.mb_chars.capitalize.to_s # => "Über"
#compose
Performs composition on all the characters.
"e\u0301".bytesize # => 3 ("e" followed by a combining acute accent)
"e\u0301".mb_chars.compose.to_s.bytesize # => 2
#decompose
Performs canonical decomposition on all the characters.
"\u00E9".bytesize # => 2 (precomposed "é")
"\u00E9".mb_chars.decompose.to_s.bytesize # => 3
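To see composition and canonical decomposition without ActiveSupport loaded, the sketch below uses Ruby's own String#unicode_normalize (Ruby 2.2 or later) as a stand-in. It is assumed here that :nfc and :nfd correspond to what #compose and #decompose produce; the calls are core Ruby, not the Chars implementation.

```ruby
decomposed = "e\u0301" # "e" followed by U+0301 COMBINING ACUTE ACCENT
composed = decomposed.unicode_normalize(:nfc)

puts decomposed.length    # 2 code points
puts composed.length      # 1 code point
puts decomposed.bytesize  # 3 bytes in UTF-8
puts composed.bytesize    # 2 bytes in UTF-8
puts composed == "\u00E9" # true

# Canonical decomposition reverses the step.
puts composed.unicode_normalize(:nfd) == decomposed # true
```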
#downcase
Converts characters in the string to lowercase.
'VĚDA A VÝZKUM'.mb_chars.downcase.to_s # => "věda a výzkum"
#grapheme_length
Returns the number of grapheme clusters in the string.
'क्षि'.mb_chars.length # => 4
'क्षि'.mb_chars.grapheme_length # => 3
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 179
def grapheme_length
  Unicode.unpack_graphemes(@wrapped_string).length
end
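The difference between code points and grapheme clusters can be sketched with core Ruby alone. The example assumes the regexp escape \X (extended grapheme cluster) as a stand-in for Unicode.unpack_graphemes. Results for complex scripts may differ between the two, so only a simple Latin case is shown.

```ruby
text = "Cafe\u0301" # "Café" written with a combining acute accent

code_points = text.length        # counts every code point
graphemes = text.scan(/\X/).length # counts user-perceived characters

puts code_points # 5
puts graphemes   # 4
```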
#limit(limit)
Limits the byte size of the string to a number of bytes without breaking characters. Usable when the storage for a string is limited for some reason.
'こんにちは'.mb_chars.limit(7).to_s # => "こん"
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 107
def limit(limit)
  slice(0...translate_offset(limit))
end
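A byte limit that never splits a character can be sketched in plain Ruby as follows. limit_bytes is a hypothetical helper written for illustration. It does not use translate_offset and assumes the input is valid UTF-8.

```ruby
# Hypothetical helper: keep whole characters while the total stays
# within max_bytes; stop before the first character that would not fit.
def limit_bytes(string, max_bytes)
  result = "".dup
  string.each_char do |char|
    break if result.bytesize + char.bytesize > max_bytes

    result << char
  end
  result
end

greeting = "こんにちは" # each character takes 3 bytes in UTF-8
puts limit_bytes(greeting, 7)          # こん
puts limit_bytes(greeting, 7).bytesize # 6
```

A limit of 7 yields 6 bytes because the third character would push the total to 9.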
#normalize(form = nil)
Returns the KC normalization of the string by default. NFKC is considered the best normalization form for passing strings to databases and validations.
- form - The form you want to normalize in. Should be one of the following: :c, :kc, :d, or :kd. Default is ActiveSupport::Multibyte::Unicode.default_normalization_form.
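The effect of compatibility normalization (the "K" forms) can be sketched with Ruby's String#unicode_normalize. The FORMS table is an assumption made for this example: it maps the short form names listed above to the symbols core Ruby expects, and it is not part of ActiveSupport.

```ruby
# Assumed mapping from the short names used by #normalize to core Ruby's names.
FORMS = { c: :nfc, kc: :nfkc, d: :nfd, kd: :nfkd }.freeze

ligature = "\uFB01"  # LATIN SMALL LIGATURE FI
fullwidth = "\uFF21" # FULLWIDTH LATIN CAPITAL LETTER A

# Canonical composition leaves compatibility characters alone.
puts ligature.unicode_normalize(FORMS[:c]) == ligature # true

# Compatibility composition folds them into their plain equivalents.
puts ligature.unicode_normalize(FORMS[:kc])  # fi
puts fullwidth.unicode_normalize(FORMS[:kc]) # A
```

Folding such variants into one representation is what makes NFKC convenient for values that will be compared or validated.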
#respond_to_missing?(method, include_private) ⇒ Boolean
Returns true if obj responds to the given method. Private methods are included in the search only if the optional second parameter evaluates to true.
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 70
def respond_to_missing?(method, include_private)
  @wrapped_string.respond_to?(method, include_private)
end
#reverse
Reverses all characters in the string.
'Café'.mb_chars.reverse.to_s # => 'éfaC'
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 98
def reverse
  chars(Unicode.unpack_graphemes(@wrapped_string).reverse.flatten.pack('U*'))
end
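Reversing by grapheme cluster rather than by code point matters once combining marks are involved. The sketch below uses the core regexp escape \X as an assumed stand-in for Unicode.unpack_graphemes and compares the result with String#reverse.

```ruby
text = "Cafe\u0301" # "Café" with a combining acute accent

naive = text.reverse                 # reverses code points
aware = text.scan(/\X/).reverse.join # reverses grapheme clusters

# The naive version moves the accent to the front, away from its base letter.
puts naive == "\u0301efaC" # true
# The grapheme-aware version keeps "e" and its accent together.
puts aware == "e\u0301faC" # true
```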
#slice!(*args)
Works like String#slice!, but returns an instance of Chars, or nil if the string was not modified.
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 91
def slice!(*args)
  chars(@wrapped_string.slice!(*args))
end
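#split(*args)
Works just like String#split, with the exception that the items in the resulting list are Chars instances instead of ::String.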
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 85
def split(*args)
  @wrapped_string.split(*args).map { |i| self.class.new(i) }
end
#swapcase
Converts characters in the string to the opposite case.
'El Cañón'.mb_chars.swapcase.to_s # => "eL cAÑÓN"
#tidy_bytes(force = false)
Replaces all ISO-8859-1 or CP1252 characters by their UTF-8 equivalent resulting in a valid UTF-8 string.
Passing true will forcibly tidy all bytes, assuming that the string's encoding is entirely CP1252 or ISO-8859-1.
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 188
def tidy_bytes(force = false)
  chars(Unicode.tidy_bytes(@wrapped_string, force))
end
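The general idea behind tidying bytes can be sketched with core Ruby: keep whatever is already valid UTF-8 and reinterpret each invalid byte as CP1252. tidy_bytes_sketch is a hypothetical helper written for illustration. It is not Unicode.tidy_bytes, it ignores the force argument, and it may treat edge cases differently.

```ruby
# Hypothetical helper: valid UTF-8 passes through untouched; every invalid
# byte sequence is re-read as Windows-1252 and converted to UTF-8.
def tidy_bytes_sketch(string)
  string.dup.force_encoding(Encoding::UTF_8).scrub do |bad|
    bad.dup.force_encoding(Encoding::Windows_1252)
       .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
  end
end

latin1 = "caf\xE9".b          # "café" as ISO-8859-1 bytes
quoted = "\x93quoted\x94".b   # CP1252 curly quotes around ASCII text

puts tidy_bytes_sketch(latin1) # café
puts tidy_bytes_sketch(quoted) # “quoted”
puts tidy_bytes_sketch(latin1).valid_encoding? # true
```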
#titlecase
Alias for #titleize.
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 146
alias_method :titlecase, :titleize
#titleize Also known as: #titlecase
Capitalizes the first letter of every word, when possible.
"ÉL QUE SE ENTERÓ".mb_chars.titleize # => "Él Que Se Enteró"
"日本語".mb_chars.titleize # => "日本語"
#upcase
Converts characters in the string to uppercase.
'Laurent, où sont les tests ?'.mb_chars.upcase.to_s # => "LAURENT, OÙ SONT LES TESTS ?"