Class: ActiveSupport::Multibyte::Chars

Relationships & Source Files
Super Chains via Extension / Inclusion / Inheritance
Instance Chain:
self, Comparable
Inherits: Object
Defined in: activesupport/lib/active_support/multibyte/chars.rb

Overview

Chars enables you to work transparently with UTF-8 encoding in the Ruby ::String class without having extensive knowledge about the encoding. A Chars object accepts a string upon initialization and proxies ::String methods in an encoding-safe manner. All the normal ::String methods are also implemented on the proxy.

::String methods are proxied through the Chars object, and can be accessed through the mb_chars method. Methods which would normally return a ::String object now return a Chars object so methods can be chained.

'The Perfect String  '.mb_chars.downcase.strip
# => #<ActiveSupport::Multibyte::Chars:0x007fdc434ccc10 @wrapped_string="the perfect string">

Chars objects are interchangeable with ::String objects as long as no explicit class checks are made. If a method does explicitly check the class, call #to_s before you pass Chars objects to it.

bad.explicit_checking_method 'T'.mb_chars.downcase.to_s

The default Chars implementation assumes that the encoding of the string is UTF-8. If you want to handle different encodings, you can write your own multibyte string handler and configure it through ActiveSupport::Multibyte.proxy_class.

class CharsForUTF32
  def size
    @wrapped_string.size / 4
  end

  def self.accepts?(string)
    string.length % 4 == 0
  end
end

ActiveSupport::Multibyte.proxy_class = CharsForUTF32

Constructor Details

.new(string, deprecation: true) ⇒ Chars

Creates a new Chars instance by wrapping string.

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 56

def initialize(string, deprecation: true)
  if deprecation
    ActiveSupport.deprecator.warn(
      "ActiveSupport::Multibyte::Chars is deprecated and will be removed in Rails 8.2. " \
      "Use normal string methods instead."
    )
  end

  @wrapped_string = string
  if string.encoding != Encoding::UTF_8
    @wrapped_string = @wrapped_string.dup
    @wrapped_string.force_encoding(Encoding::UTF_8)
  end
end
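
The encoding step in the constructor can be tried without ActiveSupport. The following is a minimal plain-Ruby sketch; `wrap_as_utf8` is a hypothetical helper name, not part of the library. It shows that a string already tagged as UTF-8 is kept as-is, while any other string is duplicated and retagged, so the caller's original is left untouched.

```ruby
# Hypothetical helper that mirrors the constructor's encoding step.
def wrap_as_utf8(string)
  return string if string.encoding == Encoding::UTF_8

  # Retag a copy so the caller's string keeps its own encoding.
  string.dup.force_encoding(Encoding::UTF_8)
end

binary  = "caf\xC3\xA9".b   # UTF-8 bytes for "cafe" with an acute accent, tagged as binary
wrapped = wrap_as_utf8(binary)

puts binary.encoding    # the original keeps its binary tag
puts wrapped.encoding   # UTF-8
puts binary.length      # 5 (counted as bytes)
puts wrapped.length     # 4 (counted as characters)
```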

Dynamic Method Handling

This class handles dynamic methods through the method_missing method.

#method_missing(method, ...)

Forward all undefined methods to the wrapped string.

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 72

def method_missing(method, ...)
  result = @wrapped_string.__send__(method, ...)
  if method.end_with?("!")
    self if result
  else
    result.kind_of?(String) ? chars(result) : result
  end
end
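
The forwarding rule can be illustrated with a small stand-alone proxy. This is a sketch under assumptions: `MiniProxy` is a hypothetical class written for this example, and it re-wraps results with `self.class.new` where Chars calls its private `chars` helper. String results are wrapped again so calls can be chained, other results pass through, and bang methods return the proxy itself, or nil when the wrapped string did not change.

```ruby
# Hypothetical stand-in for Chars, reduced to the forwarding logic.
class MiniProxy
  attr_reader :wrapped_string

  def initialize(string)
    @wrapped_string = string
  end

  # Forward anything MiniProxy does not define to the wrapped string.
  def method_missing(method, ...)
    result = @wrapped_string.__send__(method, ...)
    if method.end_with?("!")
      self if result # bang methods: the proxy itself, or nil if nothing changed
    else
      result.kind_of?(String) ? self.class.new(result) : result
    end
  end

  # Keep respond_to? consistent with what method_missing accepts.
  def respond_to_missing?(method, include_private)
    @wrapped_string.respond_to?(method, include_private)
  end
end

proxy = MiniProxy.new(+"The Perfect String  ")
puts proxy.downcase.strip.wrapped_string # chained String results stay wrapped
puts proxy.length                        # non-String results pass through
puts proxy.respond_to?(:upcase)
```

Defining respond_to_missing? alongside method_missing keeps `respond_to?` truthful for the forwarded methods.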

Instance Attribute Details

#to_s (readonly)

Alias for #wrapped_string.

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 50

alias to_s wrapped_string

#to_str (readonly)

Alias for #wrapped_string.

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 51

alias to_str wrapped_string

#wrapped_string (readonly) Also known as: #to_s, #to_str

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 49

attr_reader :wrapped_string

Instance Method Details

#<=>

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 53

delegate :<=>, :=~, :match?, :acts_like_string?, to: :wrapped_string

#=~

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 53

delegate :<=>, :=~, :match?, :acts_like_string?, to: :wrapped_string

#acts_like_string? ⇒ Boolean

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 53

delegate :<=>, :=~, :match?, :acts_like_string?, to: :wrapped_string

#as_json(options = nil)

This method is for internal use only.
  
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 171

def as_json(options = nil) # :nodoc:
  to_s.as_json(options)
end

#chars(string) (private)

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 183

def chars(string)
  self.class.new(string)
end

#compose

Performs composition on all the characters.

"e\u0301".length                       # => 2
"e\u0301".mb_chars.compose.to_s.length # => 1
  
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 150

def compose
  chars(Unicode.compose(@wrapped_string.codepoints.to_a).pack("U*"))
end
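
For readers following the deprecation notice's advice to use normal string methods, canonical composition is also available as NFC normalization through Ruby's built-in String#unicode_normalize. A short sketch, assuming that NFC is an acceptable substitute for the result of #compose:

```ruby
decomposed = "e\u0301" # "e" followed by a combining acute accent
composed   = decomposed.unicode_normalize(:nfc)

puts decomposed.length     # 2
puts composed.length       # 1
puts composed == "\u00E9"  # true: the single precomposed code point
```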

#decompose

Performs canonical decomposition on all the characters.

'é'.length                         # => 1
'é'.mb_chars.decompose.to_s.length # => 2
  
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 142

def decompose
  chars(Unicode.decompose(:canonical, @wrapped_string.codepoints.to_a).pack("U*"))
end
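
Canonical decomposition corresponds to NFD normalization, which plain Ruby strings support through String#unicode_normalize. The sketch below assumes NFD is a suitable replacement for #decompose and inspects the resulting code points:

```ruby
composed   = "\u00E9" # precomposed "e" with acute accent
decomposed = composed.unicode_normalize(:nfd)

puts decomposed.length   # 2
p decomposed.codepoints  # [101, 769]: "e" plus U+0301 combining acute accent

# Normalizing back to NFC restores the original string.
puts decomposed.unicode_normalize(:nfc) == composed # true
```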

#grapheme_length

Returns the number of grapheme clusters in the string.

'क्षि'.mb_chars.length   # => 4
'क्षि'.mb_chars.grapheme_length # => 2
  
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 158

def grapheme_length
  @wrapped_string.grapheme_clusters.length
end
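
The difference between counting code points and counting grapheme clusters can be checked with core String methods alone. In this sketch the sample strings are illustrative choices, not taken from the library:

```ruby
accented = "e\u0301"            # base letter plus combining acute accent
flag     = "\u{1F1EF}\u{1F1F5}" # two regional indicator symbols forming one flag

puts accented.length                   # 2 code points
puts accented.grapheme_clusters.length # 1 grapheme cluster
puts flag.length                       # 2 code points
puts flag.grapheme_clusters.length     # 1 grapheme cluster
```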

#limit(limit)

Limits the byte size of the string to a number of bytes without breaking characters. Usable when the storage for a string is limited for some reason.

'こんにちは'.mb_chars.limit(7).to_s # => "こん"
  
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 125

def limit(limit)
  chars(@wrapped_string.truncate_bytes(limit, omission: nil))
end
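
String#truncate_bytes comes from ActiveSupport. Without it, a similar byte limit can be approximated in plain Ruby by slicing bytes and discarding a trailing partial character. This is only a sketch: `limit_bytes` is a hypothetical helper, it also strips any invalid bytes that were already present in the input, and it makes no attempt to keep grapheme clusters together.

```ruby
# Hypothetical helper: keep at most `limit` bytes without cutting a character.
def limit_bytes(string, limit)
  # byteslice may cut through a multibyte character; scrub("") drops the
  # resulting invalid bytes.
  string.byteslice(0, limit).scrub("")
end

greeting = "\u3053\u3093\u306B\u3061\u306F" # "konnichiwa": five 3-byte characters
short    = limit_bytes(greeting, 7)

puts short.length   # 2
puts short.bytesize # 6
```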

#match? ⇒ Boolean

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 53

delegate :<=>, :=~, :match?, :acts_like_string?, to: :wrapped_string

#respond_to_missing?(method, include_private) ⇒ Boolean

Returns true if the wrapped string responds to the given method. Private methods are included in the search only if the second parameter evaluates to true.

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 84

def respond_to_missing?(method, include_private)
  @wrapped_string.respond_to?(method, include_private)
end

#reverse

Reverses all characters in the string.

'Café'.mb_chars.reverse.to_s # => 'éfaC'
  
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 116

def reverse
  chars(@wrapped_string.grapheme_clusters.reverse.join)
end
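
Reversing grapheme clusters instead of code points matters when a string contains combining marks. A minimal sketch using only core String methods, with a decomposed sample string chosen for illustration:

```ruby
word = "Cafe\u0301" # "Cafe" with a combining acute accent on the final e

by_codepoint = word.reverse
by_grapheme  = word.grapheme_clusters.reverse.join

p by_codepoint.codepoints.first  # 769: the accent is detached and now leads the string
puts by_grapheme == "e\u0301faC" # true: the accent stays on its base letter
```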

#slice!(*args)

Works like String#slice!, but returns an instance of Chars, or nil if the string was not modified. The string will not be modified if the range given is out of bounds.

string = 'Welcome'
string.mb_chars.slice!(3)    # => #<ActiveSupport::Multibyte::Chars:0x000000038109b8 @wrapped_string="c">
string # => 'Welome'
string.mb_chars.slice!(0..3) # => #<ActiveSupport::Multibyte::Chars:0x00000002eb80a0 @wrapped_string="Welo">
string # => 'me'
  
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 106

def slice!(*args)
  string_sliced = @wrapped_string.slice!(*args)
  if string_sliced
    chars(string_sliced)
  end
end
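
The return values follow core String#slice!, which can be observed without the proxy. A brief sketch on a plain String:

```ruby
string = +"Welcome"

removed = string.slice!(3)  # removes and returns the character at index 3
missing = string.slice!(10) # index out of bounds: returns nil, string untouched

puts removed         # c
puts string          # Welome
puts missing.inspect # nil
```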

#split(*args)

Works just like String#split, with the exception that the items in the resulting list are Chars instances instead of ::String. This makes chaining methods easier.

'Café périferôl'.mb_chars.split(/é/).map { |part| part.upcase.to_s } # => ["CAF", " P", "RIFERÔL"]
  
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 93

def split(*args)
  @wrapped_string.split(*args).map { |i| self.class.new(i) }
end

#tidy_bytes(force = false)

Replaces all ISO-8859-1 or CP1252 characters with their UTF-8 equivalents, resulting in a valid UTF-8 string.

Passing true will forcibly tidy all bytes, assuming that the string’s encoding is entirely CP1252 or ISO-8859-1.

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 167

def tidy_bytes(force = false)
  chars(Unicode.tidy_bytes(@wrapped_string, force))
end
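
The idea of reinterpreting stray bytes as CP1252 can be sketched with core String methods. The helper names `recode_cp1252` and `tidy_sketch` are hypothetical, and the library's own Unicode.tidy_bytes may differ in detail. In this sketch, without force only byte sequences that are invalid in UTF-8 are recoded; with force the whole string is treated as CP1252.

```ruby
# Hypothetical helper: read the bytes as Windows-1252 and transcode to UTF-8.
def recode_cp1252(bytes)
  bytes.encode(Encoding::UTF_8, Encoding::Windows_1252, invalid: :replace, undef: :replace)
end

# Hypothetical sketch of the two modes described above.
def tidy_sketch(string, force = false)
  return recode_cp1252(string) if force

  # Only the invalid byte sequences are handed to the block.
  string.scrub { |bad| recode_cp1252(bad) }
end

# Valid UTF-8 (a snowman) mixed with a stray CP1252 byte 0xE9 (e with acute).
mixed  = ("caf".b + "\xE9".b + " ".b + "\u2603".b).force_encoding(Encoding::UTF_8)
legacy = "caf\xE9".b # entirely CP1252 bytes

puts mixed.valid_encoding?              # false
puts tidy_sketch(mixed).valid_encoding? # true
puts tidy_sketch(legacy, true).length   # 4
```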

#titlecase

Alias for #titleize.

# File 'activesupport/lib/active_support/multibyte/chars.rb', line 136

alias_method :titlecase, :titleize

#titleize Also known as: #titlecase

Capitalizes the first letter of every word, when possible.

"ÉL QUE SE ENTERÓ".mb_chars.titleize.to_s    # => "Él Que Se Enteró"
"日本語".mb_chars.titleize.to_s               # => "日本語"
  
# File 'activesupport/lib/active_support/multibyte/chars.rb', line 133

def titleize
  chars(downcase.to_s.gsub(/\b('?\S)/u) { $1.upcase })
end
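
The same expression works on a plain String, which may help when moving away from the deprecated proxy. A sketch, where `titleize_plain` is a hypothetical helper name:

```ruby
# Hypothetical helper: downcase everything, then upcase the first
# non-space character after each word boundary.
def titleize_plain(string)
  string.downcase.gsub(/\b('?\S)/u) { $1.upcase }
end

spanish  = titleize_plain("\u00C9L QUE SE ENTER\u00D3") # accented capitals E and O
japanese = titleize_plain("\u65E5\u672C\u8A9E")         # no case distinction

puts spanish
puts japanese
```

Scripts without letter case, such as the Japanese sample, come back unchanged.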