
Module: URI

Relationships & Source Files
Defined in: lib/uri.rb,
lib/uri/common.rb,
lib/uri/file.rb,
lib/uri/ftp.rb,
lib/uri/generic.rb,
lib/uri/http.rb,
lib/uri/https.rb,
lib/uri/ldap.rb,
lib/uri/ldaps.rb,
lib/uri/mailto.rb,
lib/uri/rfc2396_parser.rb,
lib/uri/rfc3986_parser.rb,
lib/uri/version.rb,
lib/uri/ws.rb,
lib/uri/wss.rb

Overview

URI is a module providing classes to handle Uniform Resource Identifiers (RFC2396).

Features

  • Uniform way of handling URIs.

  • Flexibility to introduce custom URI schemes.

  • Flexibility to use an alternate Parser (or just different patterns and regexps).

Basic example

require 'uri'

uri = URI("http://foo.com/posts?id=30&limit=5#time=1305298413")
#=> #<URI::HTTP http://foo.com/posts?id=30&limit=5#time=1305298413>

uri.scheme    #=> "http"
uri.host      #=> "foo.com"
uri.path      #=> "/posts"
uri.query     #=> "id=30&limit=5"
uri.fragment  #=> "time=1305298413"

uri.to_s      #=> "http://foo.com/posts?id=30&limit=5#time=1305298413"

Adding custom URIs

module URI
  class RSYNC < Generic
    DEFAULT_PORT = 873
  end
  register_scheme 'RSYNC', RSYNC
end
#=> URI::RSYNC

URI.scheme_list
#=> {"FILE"=>URI::File, "FTP"=>URI::FTP, "HTTP"=>URI::HTTP,
#    "HTTPS"=>URI::HTTPS, "LDAP"=>URI::LDAP, "LDAPS"=>URI::LDAPS,
#    "MAILTO"=>URI::MailTo, "RSYNC"=>URI::RSYNC}

uri = URI("rsync://rsync.foo.com")
#=> #<URI::RSYNC rsync://rsync.foo.com>

RFC References

A good place to view an RFC spec is www.ietf.org/rfc.html.

Here is a list of all related RFCs:

Class tree

  • Generic (in uri/generic.rb)

    • URI::File - (in uri/file.rb)

    • URI::FTP - (in uri/ftp.rb)

    • URI::HTTP - (in uri/http.rb)

      • URI::HTTPS - (in uri/https.rb)

    • URI::LDAP - (in uri/ldap.rb)

      • URI::LDAPS - (in uri/ldaps.rb)

    • URI::MailTo - (in uri/mailto.rb)

  • Parser - (in uri/common.rb)

  • REGEXP - (in uri/common.rb)

    • URI::REGEXP::PATTERN - (in uri/common.rb)

  • Util - (in uri/common.rb)

  • Error - (in uri/common.rb)

    • URI::InvalidURIError - (in uri/common.rb)

    • URI::InvalidComponentError - (in uri/common.rb)

    • URI::BadURIError - (in uri/common.rb)

Copyright Info

Author

Akira Yamada <akira@ruby-lang.org>

Documentation

Akira Yamada <akira@ruby-lang.org>

Dmitry V. Sabanin <sdmitry@lrn.ru>

Vincent Batts <vbatts@hashbangbash.com>

License

Copyright © 2001 akira yamada <akira@ruby-lang.org>

You can redistribute it and/or modify it under the same terms as Ruby.

Constant Summary

  • DEFAULT_PARSER =

    URI::Parser.new

    # File 'lib/uri/common.rb', line 24
    Parser.new
  • INITIAL_SCHEMES = private
    # File 'lib/uri/common.rb', line 87
    scheme_list
  • Parser =
    # File 'lib/uri/common.rb', line 19
    RFC2396_Parser
  • REGEXP =
    # File 'lib/uri/common.rb', line 18
    RFC2396_REGEXP
  • RFC3986_PARSER =
    # File 'lib/uri/common.rb', line 20
    RFC3986_Parser.new
  • TBLDECWWWCOMP_ = Internal use only
    # File 'lib/uri/common.rb', line 306
    {}
  • TBLENCURICOMP_ =
    # File 'lib/uri/common.rb', line 303
    TBLENCWWWCOMP_.dup.freeze
  • TBLENCWWWCOMP_ = Internal use only
    # File 'lib/uri/common.rb', line 299
    {}
  • VERSION = Internal use only
    # File 'lib/uri/version.rb', line 4
    VERSION_CODE.scan(/../).collect{|n| n.to_i}.join('.').freeze
  • VERSION_CODE = Internal use only
    # File 'lib/uri/version.rb', line 3
    '001202'.freeze
  • WEB_ENCODINGS_ = Internal use only

    curl encoding.spec.whatwg.org/encodings.json|

    ruby -rjson -e 'H={}
    h={
      "shift_jis"=>"Windows-31J",
      "euc-jp"=>"cp51932",
      "iso-2022-jp"=>"cp50221",
      "x-mac-cyrillic"=>"macCyrillic",
    }
    JSON($<.read).map{|x|x["encodings"]}.flatten.each{|x|
      Encoding.find(n=h.fetch(n=x["name"].downcase,n))rescue next
      x["labels"].each{|y|H[y]=n}
    }
    puts "{"
    H.each{|k,v|puts %[  #{k.dump}=>#{v.dump},]}
    puts "}"
    '

    # File 'lib/uri/common.rb', line 496
    {
      "unicode-1-1-utf-8"=>"utf-8",
      "utf-8"=>"utf-8",
      "utf8"=>"utf-8",
      "866"=>"ibm866",
      "cp866"=>"ibm866",
      "csibm866"=>"ibm866",
      "ibm866"=>"ibm866",
      "csisolatin2"=>"iso-8859-2",
      "iso-8859-2"=>"iso-8859-2",
      "iso-ir-101"=>"iso-8859-2",
      "iso8859-2"=>"iso-8859-2",
      "iso88592"=>"iso-8859-2",
      "iso_8859-2"=>"iso-8859-2",
      "iso_8859-2:1987"=>"iso-8859-2",
      "l2"=>"iso-8859-2",
      "latin2"=>"iso-8859-2",
      "csisolatin3"=>"iso-8859-3",
      "iso-8859-3"=>"iso-8859-3",
      "iso-ir-109"=>"iso-8859-3",
      "iso8859-3"=>"iso-8859-3",
      "iso88593"=>"iso-8859-3",
      "iso_8859-3"=>"iso-8859-3",
      "iso_8859-3:1988"=>"iso-8859-3",
      "l3"=>"iso-8859-3",
      "latin3"=>"iso-8859-3",
      "csisolatin4"=>"iso-8859-4",
      "iso-8859-4"=>"iso-8859-4",
      "iso-ir-110"=>"iso-8859-4",
      "iso8859-4"=>"iso-8859-4",
      "iso88594"=>"iso-8859-4",
      "iso_8859-4"=>"iso-8859-4",
      "iso_8859-4:1988"=>"iso-8859-4",
      "l4"=>"iso-8859-4",
      "latin4"=>"iso-8859-4",
      "csisolatincyrillic"=>"iso-8859-5",
      "cyrillic"=>"iso-8859-5",
      "iso-8859-5"=>"iso-8859-5",
      "iso-ir-144"=>"iso-8859-5",
      "iso8859-5"=>"iso-8859-5",
      "iso88595"=>"iso-8859-5",
      "iso_8859-5"=>"iso-8859-5",
      "iso_8859-5:1988"=>"iso-8859-5",
      "arabic"=>"iso-8859-6",
      "asmo-708"=>"iso-8859-6",
      "csiso88596e"=>"iso-8859-6",
      "csiso88596i"=>"iso-8859-6",
      "csisolatinarabic"=>"iso-8859-6",
      "ecma-114"=>"iso-8859-6",
      "iso-8859-6"=>"iso-8859-6",
      "iso-8859-6-e"=>"iso-8859-6",
      "iso-8859-6-i"=>"iso-8859-6",
      "iso-ir-127"=>"iso-8859-6",
      "iso8859-6"=>"iso-8859-6",
      "iso88596"=>"iso-8859-6",
      "iso_8859-6"=>"iso-8859-6",
      "iso_8859-6:1987"=>"iso-8859-6",
      "csisolatingreek"=>"iso-8859-7",
      "ecma-118"=>"iso-8859-7",
      "elot_928"=>"iso-8859-7",
      "greek"=>"iso-8859-7",
      "greek8"=>"iso-8859-7",
      "iso-8859-7"=>"iso-8859-7",
      "iso-ir-126"=>"iso-8859-7",
      "iso8859-7"=>"iso-8859-7",
      "iso88597"=>"iso-8859-7",
      "iso_8859-7"=>"iso-8859-7",
      "iso_8859-7:1987"=>"iso-8859-7",
      "sun_eu_greek"=>"iso-8859-7",
      "csiso88598e"=>"iso-8859-8",
      "csisolatinhebrew"=>"iso-8859-8",
      "hebrew"=>"iso-8859-8",
      "iso-8859-8"=>"iso-8859-8",
      "iso-8859-8-e"=>"iso-8859-8",
      "iso-ir-138"=>"iso-8859-8",
      "iso8859-8"=>"iso-8859-8",
      "iso88598"=>"iso-8859-8",
      "iso_8859-8"=>"iso-8859-8",
      "iso_8859-8:1988"=>"iso-8859-8",
      "visual"=>"iso-8859-8",
      "csisolatin6"=>"iso-8859-10",
      "iso-8859-10"=>"iso-8859-10",
      "iso-ir-157"=>"iso-8859-10",
      "iso8859-10"=>"iso-8859-10",
      "iso885910"=>"iso-8859-10",
      "l6"=>"iso-8859-10",
      "latin6"=>"iso-8859-10",
      "iso-8859-13"=>"iso-8859-13",
      "iso8859-13"=>"iso-8859-13",
      "iso885913"=>"iso-8859-13",
      "iso-8859-14"=>"iso-8859-14",
      "iso8859-14"=>"iso-8859-14",
      "iso885914"=>"iso-8859-14",
      "csisolatin9"=>"iso-8859-15",
      "iso-8859-15"=>"iso-8859-15",
      "iso8859-15"=>"iso-8859-15",
      "iso885915"=>"iso-8859-15",
      "iso_8859-15"=>"iso-8859-15",
      "l9"=>"iso-8859-15",
      "iso-8859-16"=>"iso-8859-16",
      "cskoi8r"=>"koi8-r",
      "koi"=>"koi8-r",
      "koi8"=>"koi8-r",
      "koi8-r"=>"koi8-r",
      "koi8_r"=>"koi8-r",
      "koi8-ru"=>"koi8-u",
      "koi8-u"=>"koi8-u",
      "dos-874"=>"windows-874",
      "iso-8859-11"=>"windows-874",
      "iso8859-11"=>"windows-874",
      "iso885911"=>"windows-874",
      "tis-620"=>"windows-874",
      "windows-874"=>"windows-874",
      "cp1250"=>"windows-1250",
      "windows-1250"=>"windows-1250",
      "x-cp1250"=>"windows-1250",
      "cp1251"=>"windows-1251",
      "windows-1251"=>"windows-1251",
      "x-cp1251"=>"windows-1251",
      "ansi_x3.4-1968"=>"windows-1252",
      "ascii"=>"windows-1252",
      "cp1252"=>"windows-1252",
      "cp819"=>"windows-1252",
      "csisolatin1"=>"windows-1252",
      "ibm819"=>"windows-1252",
      "iso-8859-1"=>"windows-1252",
      "iso-ir-100"=>"windows-1252",
      "iso8859-1"=>"windows-1252",
      "iso88591"=>"windows-1252",
      "iso_8859-1"=>"windows-1252",
      "iso_8859-1:1987"=>"windows-1252",
      "l1"=>"windows-1252",
      "latin1"=>"windows-1252",
      "us-ascii"=>"windows-1252",
      "windows-1252"=>"windows-1252",
      "x-cp1252"=>"windows-1252",
      "cp1253"=>"windows-1253",
      "windows-1253"=>"windows-1253",
      "x-cp1253"=>"windows-1253",
      "cp1254"=>"windows-1254",
      "csisolatin5"=>"windows-1254",
      "iso-8859-9"=>"windows-1254",
      "iso-ir-148"=>"windows-1254",
      "iso8859-9"=>"windows-1254",
      "iso88599"=>"windows-1254",
      "iso_8859-9"=>"windows-1254",
      "iso_8859-9:1989"=>"windows-1254",
      "l5"=>"windows-1254",
      "latin5"=>"windows-1254",
      "windows-1254"=>"windows-1254",
      "x-cp1254"=>"windows-1254",
      "cp1255"=>"windows-1255",
      "windows-1255"=>"windows-1255",
      "x-cp1255"=>"windows-1255",
      "cp1256"=>"windows-1256",
      "windows-1256"=>"windows-1256",
      "x-cp1256"=>"windows-1256",
      "cp1257"=>"windows-1257",
      "windows-1257"=>"windows-1257",
      "x-cp1257"=>"windows-1257",
      "cp1258"=>"windows-1258",
      "windows-1258"=>"windows-1258",
      "x-cp1258"=>"windows-1258",
      "x-mac-cyrillic"=>"macCyrillic",
      "x-mac-ukrainian"=>"macCyrillic",
      "chinese"=>"gbk",
      "csgb2312"=>"gbk",
      "csiso58gb231280"=>"gbk",
      "gb2312"=>"gbk",
      "gb_2312"=>"gbk",
      "gb_2312-80"=>"gbk",
      "gbk"=>"gbk",
      "iso-ir-58"=>"gbk",
      "x-gbk"=>"gbk",
      "gb18030"=>"gb18030",
      "big5"=>"big5",
      "big5-hkscs"=>"big5",
      "cn-big5"=>"big5",
      "csbig5"=>"big5",
      "x-x-big5"=>"big5",
      "cseucpkdfmtjapanese"=>"cp51932",
      "euc-jp"=>"cp51932",
      "x-euc-jp"=>"cp51932",
      "csiso2022jp"=>"cp50221",
      "iso-2022-jp"=>"cp50221",
      "csshiftjis"=>"Windows-31J",
      "ms932"=>"Windows-31J",
      "ms_kanji"=>"Windows-31J",
      "shift-jis"=>"Windows-31J",
      "shift_jis"=>"Windows-31J",
      "sjis"=>"Windows-31J",
      "windows-31j"=>"Windows-31J",
      "x-sjis"=>"Windows-31J",
      "cseuckr"=>"euc-kr",
      "csksc56011987"=>"euc-kr",
      "euc-kr"=>"euc-kr",
      "iso-ir-149"=>"euc-kr",
      "korean"=>"euc-kr",
      "ks_c_5601-1987"=>"euc-kr",
      "ks_c_5601-1989"=>"euc-kr",
      "ksc5601"=>"euc-kr",
      "ksc_5601"=>"euc-kr",
      "windows-949"=>"euc-kr",
      "utf-16be"=>"utf-16be",
      "utf-16"=>"utf-16le",
      "utf-16le"=>"utf-16le",
    }

Class Method Details

._decode_uri_component(regexp, str, enc) (private)

Raises:

  • (ArgumentError)

  
# File 'lib/uri/common.rb', line 369

def self._decode_uri_component(regexp, str, enc)
  raise ArgumentError, "invalid %-encoding (#{str})" if /%(?!\h\h)/.match?(str)
  str.b.gsub(regexp, TBLDECWWWCOMP_).force_encoding(enc)
end

._encode_uri_component(regexp, table, str, enc) (private)


  
# File 'lib/uri/common.rb', line 355

def self._encode_uri_component(regexp, table, str, enc)
  str = str.to_s.dup
  if str.encoding != Encoding::ASCII_8BIT
    if enc && enc != Encoding::ASCII_8BIT
      str.encode!(Encoding::UTF_8, invalid: :replace, undef: :replace)
      str.encode!(enc, fallback: ->(x){"&##{x.ord};"})
    end
    str.force_encoding(Encoding::ASCII_8BIT)
  end
  str.gsub!(regexp, table)
  str.force_encoding(Encoding::US_ASCII)
end

.decode_uri_component(str, enc = Encoding::UTF_8)

Decodes given str of URL-encoded data.

This does not decode + to SP.


  
# File 'lib/uri/common.rb', line 351

def self.decode_uri_component(str, enc=Encoding::UTF_8)
  _decode_uri_component(/%\h\h/, str, enc)
end
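
The difference from form decoding can be seen in a short sketch. It assumes a release of the library that may or may not provide .decode_uri_component, so the call is guarded; the variable names are illustrative only.

```ruby
require 'uri'

encoded = "a%20b+c"

# Form decoding turns both %20 and + into a space.
form_decoded = URI.decode_www_form_component(encoded)

# Component decoding leaves + untouched. The method is checked for first
# because older releases may not define it (an assumption of this sketch).
component_decoded =
  URI.respond_to?(:decode_uri_component) ? URI.decode_uri_component(encoded) : nil

puts form_decoded.inspect
puts component_decoded.inspect
```

Use the component variant for path segments and similar parts where a literal + must survive decoding.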

.decode_www_form(str, enc = Encoding::UTF_8, separator: '&', use__charset_: false, isindex: false)

Decodes URL-encoded form data from given str.

This decodes application/x-www-form-urlencoded data and returns an array of key-value arrays.

This follows the parser described at url.spec.whatwg.org/#concept-urlencoded-parser, so by default it splits only on & and does not treat ; as a separator.

ary = URI.decode_www_form("a=1&a=2&b=3")
ary                   #=> [['a', '1'], ['a', '2'], ['b', '3']]
ary.assoc('a').last   #=> '1'
ary.assoc('b').last   #=> '3'
ary.rassoc('2').first #=> 'a'
Hash[ary]             #=> {"a"=>"2", "b"=>"3"}

See .decode_www_form_component, .encode_www_form.

Raises:

  • (ArgumentError)

  
# File 'lib/uri/common.rb', line 438

def self.decode_www_form(str, enc=Encoding::UTF_8, separator: '&', use__charset_: false, isindex: false)
  raise ArgumentError, "the input of #{self.name}.#{__method__} must be ASCII only string" unless str.ascii_only?
  ary = []
  return ary if str.empty?
  enc = Encoding.find(enc)
  str.b.each_line(separator) do |string|
    string.chomp!(separator)
    key, sep, val = string.partition('=')
    if isindex
      if sep.empty?
        val = key
        key = +''
      end
      isindex = false
    end

    if use__charset_ and key == '_charset_' and e = get_encoding(val)
      enc = e
      use__charset_ = false
    end

    key.gsub!(/\+|%\h\h/, TBLDECWWWCOMP_)
    if val
      val.gsub!(/\+|%\h\h/, TBLDECWWWCOMP_)
    else
      val = +''
    end

    ary << [key, val]
  end
  ary.each do |k, v|
    k.force_encoding(enc)
    k.scrub!
    v.force_encoding(enc)
    v.scrub!
  end
  ary
end
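
As a minimal sketch of the behavior visible in the source above: a key without = gets an empty value, repeated keys are kept in order, and non-ASCII input is rejected. The grouping step is just one possible way to collect repeated keys and is not part of the library.

```ruby
require 'uri'

pairs = URI.decode_www_form("q=ruby+lang&tag=a%26b&tag=c&flag")
# pairs is an array of [key, value] arrays; "flag" has no "=", so its value is "".

# Collect every value per key instead of keeping only the last one,
# which is what Hash[pairs] would do.
grouped = pairs.group_by(&:first).transform_values { |kv| kv.map(&:last) }

# Input containing non-ASCII characters raises ArgumentError.
begin
  URI.decode_www_form("q=\u00e9")
  rejected = false
rescue ArgumentError
  rejected = true
end

p pairs
p grouped
p rejected
```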

.decode_www_form_component(str, enc = Encoding::UTF_8)

Decodes given str of URL-encoded form data.

This decodes + to SP.

See .encode_www_form_component, .decode_www_form.


  
# File 'lib/uri/common.rb', line 337

def self.decode_www_form_component(str, enc=Encoding::UTF_8)
  _decode_uri_component(/\+|%\h\h/, str, enc)
end
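
A short sketch of typical use follows. It also shows that a stray % raises ArgumentError, as the private helper above indicates; safe_decode is a hypothetical wrapper introduced only for this example.

```ruby
require 'uri'

decoded = URI.decode_www_form_component("Ruby+%26+Rails%21")

# A "%" that is not followed by two hex digits is rejected, not passed through.
# safe_decode is a hypothetical helper, not part of the library.
def safe_decode(str)
  URI.decode_www_form_component(str)
rescue ArgumentError
  nil
end

p decoded
p safe_decode("100%")
p safe_decode("100%25")
```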

.encode_uri_component(str, enc = nil)

Encodes str using URL encoding.

This encodes SP to %20 instead of +.


  
# File 'lib/uri/common.rb', line 344

def self.encode_uri_component(str, enc=nil)
  _encode_uri_component(/[^*\-.0-9A-Z_a-z]/, TBLENCURICOMP_, str, enc)
end
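
To illustrate how the two encoders treat a space, here is a small sketch. The call to .encode_uri_component is guarded because older releases may not define it; that guard is an assumption of this example, not something the library requires.

```ruby
require 'uri'

text = "a b&c=d"

# Form encoding writes a space as "+".
form_encoded = URI.encode_www_form_component(text)

# Component encoding writes a space as "%20". Guarded in case the method
# is missing from the installed release (an assumption of this sketch).
component_encoded =
  URI.respond_to?(:encode_uri_component) ? URI.encode_uri_component(text) : nil

puts form_encoded
puts component_encoded.inspect
```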

.encode_www_form(enum, enc = nil)

Generates URL-encoded form data from given enum.

This generates application/x-www-form-urlencoded data, as defined in HTML5, from a given Enumerable object.

This internally uses .encode_www_form_component(str).

This method doesn’t convert the encoding of the given items, so convert them before calling it if you want to send the data in an encoding other than the original, or if the items have mixed encodings. (Strings encoded in an HTML5 ASCII-incompatible encoding are converted to UTF-8.)

This method doesn’t handle files. When you send a file, use multipart/form-data.

This follows the serializer described at url.spec.whatwg.org/#concept-urlencoded-serializer.

URI.encode_www_form([["q", "ruby"], ["lang", "en"]])
#=> "q=ruby&lang=en"
URI.encode_www_form("q" => "ruby", "lang" => "en")
#=> "q=ruby&lang=en"
URI.encode_www_form("q" => ["ruby", "perl"], "lang" => "en")
#=> "q=ruby&q=perl&lang=en"
URI.encode_www_form([["q", "ruby"], ["q", "perl"], ["lang", "en"]])
#=> "q=ruby&q=perl&lang=en"

See .encode_www_form_component, .decode_www_form.


  
# File 'lib/uri/common.rb', line 402

def self.encode_www_form(enum, enc=nil)
  enum.map do |k,v|
    if v.nil?
      encode_www_form_component(k, enc)
    elsif v.respond_to?(:to_ary)
      v.to_ary.map do |w|
        str = encode_www_form_component(k, enc)
        unless w.nil?
          str << '='
          str << encode_www_form_component(w, enc)
        end
      end.join('&')
    else
      str = encode_www_form_component(k, enc)
      str << '='
      str << encode_www_form_component(v, enc)
    end
  end.join('&')
end
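
The nil branch in the source above means a nil value yields a bare key. A minimal sketch, using made-up parameter names, that builds a query string, attaches it to a URI, and decodes it again:

```ruby
require 'uri'

# A nil value produces a bare key; non-string values are converted with to_s.
query = URI.encode_www_form([["debug", nil], ["q", "a b"], ["page", 2]])

uri = URI("http://example.com/search")
uri.query = query

# Decoding gives strings back; the bare key comes back with an empty value.
round_trip = URI.decode_www_form(uri.query)

puts query
puts uri
p round_trip
```

Note that the round trip is not lossless: nil becomes "" and the integer 2 becomes "2".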

.encode_www_form_component(str, enc = nil)

Encodes given str to URL-encoded form data.

This method doesn’t convert *, -, ., 0-9, A-Z, _, a-z, but does convert SP (ASCII space) to + and converts others to %XX.

If enc is given, str is converted to that encoding before percent-encoding.

This is an implementation of www.w3.org/TR/2013/CR-html5-20130806/forms.html#url-encoded-form-data.

See .decode_www_form_component, .encode_www_form.


  
# File 'lib/uri/common.rb', line 328

def self.encode_www_form_component(str, enc=nil)
  _encode_uri_component(/[^*\-.0-9A-Z_a-z]/, TBLENCWWWCOMP_, str, enc)
end
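
The following sketch shows which characters pass through unchanged and how the optional enc argument changes the percent-encoded bytes. The sample strings are arbitrary.

```ruby
require 'uri'

# "*", "-", "." and "_" are kept; space becomes "+"; "~" is percent-encoded.
plain = URI.encode_www_form_component("a b*c-d.e_f~g")

# Without enc, the UTF-8 bytes of the string are percent-encoded as they are.
utf8 = URI.encode_www_form_component("caf\u00e9")

# With enc, the string is transcoded first, so a single byte is emitted here.
latin1 = URI.encode_www_form_component("caf\u00e9", Encoding::ISO_8859_1)

puts plain
puts utf8
puts latin1
```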

.extract(str, schemes = nil, &block)

Synopsis

URI::extract(str[, schemes][,&blk])

Args

str

String to extract URIs from.

schemes

Limit URI matching to specific schemes.

Description

Extracts URIs from a string. If a block is given, it is called with each matched URI and the method returns nil; otherwise the method returns an array of the matches.

Usage

require "uri"

URI.extract("text here http://foo.example.org/bla and here mailto:test@example.com and here also.")
# => ["http://foo.example.org/bla", "mailto:test@example.com"]

  
# File 'lib/uri/common.rb', line 257

def self.extract(str, schemes = nil, &block)
  warn "URI.extract is obsolete", uplevel: 1 if $VERBOSE
  DEFAULT_PARSER.extract(str, schemes, &block)
end
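
A brief sketch of the schemes argument and the block form, using made-up hosts. As the source above shows, the method is marked obsolete and prints a warning when $VERBOSE is set, so treat this as an illustration rather than a recommendation.

```ruby
require 'uri'

text = "see http://a.example/x and ftp://b.example/y for details"

# Without schemes, every URI-like match is returned.
all_uris = URI.extract(text)

# With schemes, only URIs using one of the listed schemes are returned.
http_only = URI.extract(text, ["http"])

# With a block, each match is yielded and the return value is nil.
collected = []
result = URI.extract(text) { |u| collected << u }

p all_uris
p http_only
p result
```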

.for(scheme, *arguments, default: Generic)

Construct a URI instance, using the scheme to detect the appropriate class from .scheme_list.


  
# File 'lib/uri/common.rb', line 95

def self.for(scheme, *arguments, default: Generic)
  const_name = scheme.to_s.upcase

  uri_class = INITIAL_SCHEMES[const_name]
  uri_class ||= if /\A[A-Z]\w*\z/.match?(const_name) && Schemes.const_defined?(const_name, false)
    Schemes.const_get(const_name, false)
  end
  uri_class ||= default

  return uri_class.new(scheme, *arguments)
end
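
As a rough sketch of how the class lookup behaves: a registered scheme selects its class, and anything else falls back to the default. The positional arguments after the scheme are assumed to follow the order taken by URI::Generic.new; the host and path values are made up.

```ruby
require 'uri'

# Arguments after the scheme are passed on to the class constructor:
# userinfo, host, port, registry, path, opaque, query, fragment.
https = URI.for("https", nil, "example.com", nil, nil, "/index.html", nil, "q=1", nil)

# An unregistered scheme falls back to the default class, URI::Generic.
other = URI.for("x-unknown", nil, "example.com", nil, nil, "/", nil, nil, nil)

p https.class
puts https
p other.class
```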

.get_encoding(label) (private)

This method is for internal use only.

  
# File 'lib/uri/common.rb', line 708

def self.get_encoding(label)
  Encoding.find(WEB_ENCODINGS_[label.to_str.strip.downcase]) rescue nil
end

.join(*str)

Synopsis

URI::join(str[, str, ...])

Args

str

String(s) to work with, will be converted to RFC3986 URIs before merging.

Description

Joins URIs.

Usage

require 'uri'

URI.join("http://example.com/","main.rbx")
# => #<URI::HTTP http://example.com/main.rbx>

URI.join('http://example.com', 'foo')
# => #<URI::HTTP http://example.com/foo>

URI.join('http://example.com', '/foo', '/bar')
# => #<URI::HTTP http://example.com/bar>

URI.join('http://example.com', '/foo', 'bar')
# => #<URI::HTTP http://example.com/bar>

URI.join('http://example.com', '/foo/', 'bar')
# => #<URI::HTTP http://example.com/foo/bar>

  
# File 'lib/uri/common.rb', line 229

def self.join(*str)
  RFC3986_PARSER.join(*str)
end

.parse(uri)

Synopsis

URI::parse(uri_str)

Args

uri_str

String with URI.

Description

Creates an instance of one of URI’s subclasses from the string.

Raises

URI::InvalidURIError

Raised if the given URI is not valid.

Usage

require 'uri'

uri = URI.parse("http://www.ruby-lang.org/")
# => #<URI::HTTP http://www.ruby-lang.org/>
uri.scheme
# => "http"
uri.host
# => "www.ruby-lang.org"

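If uri_str contains characters that are not valid in a URI, percent-encode the affected components before parsing (for example with .encode_www_form_component); otherwise URI::InvalidURIError is raised.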


  
# File 'lib/uri/common.rb', line 192

def self.parse(uri)
  RFC3986_PARSER.parse(uri)
end
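
One way to handle untrusted input is to rescue the error and encode the offending component before retrying. This is a minimal sketch; parse_or_nil is a hypothetical helper and the URL is made up.

```ruby
require 'uri'

# parse_or_nil is a hypothetical helper, not part of the library.
def parse_or_nil(str)
  URI.parse(str)
rescue URI::InvalidURIError
  nil
end

# A raw space is not allowed in a URI, so parsing fails.
bad = parse_or_nil("http://example.com/a b")

# Encoding the segment first gives a string that parses.
segment = URI.encode_www_form_component("a b")
good = parse_or_nil("http://example.com/#{segment}")

p bad
p good.path
```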

.regexp(schemes = nil)

Synopsis

URI::regexp([match_schemes])

Args

match_schemes

Array of schemes. If given, the resulting regexp matches only URIs whose scheme is one of match_schemes.

Description

Returns a Regexp object that matches URI-like strings. The returned Regexp contains an unspecified number of capture groups (parentheses); never rely on how many there are.

Usage

require 'uri'

# extract first URI from html_string
html_string.slice(URI.regexp)

# remove ftp URIs
html_string.sub(URI.regexp(['ftp']), '')

# You should not rely on the number of parentheses
html_string.scan(URI.regexp) do |*matches|
  p $&
end

  
# File 'lib/uri/common.rb', line 294

def self.regexp(schemes = nil)
  warn "URI.regexp is obsolete", uplevel: 1 if $VERBOSE
  DEFAULT_PARSER.make_regexp(schemes)
end
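
The usage above relies on an html_string defined elsewhere. A self-contained sketch with a made-up sample string is shown here; it reads only the whole match and ignores the capture groups. The method is marked obsolete in the source above and warns when $VERBOSE is set.

```ruby
require 'uri'

text = "mirror at ftp://files.example/pub and http://www.example/"

# String#[] with a Regexp returns the whole match, independent of how many
# capture groups the pattern contains.
first_ftp = text[URI.regexp(["ftp"])]

# Replace only the ftp URI and leave the http one in place.
without_ftp = text.sub(URI.regexp(["ftp"]), "[removed]")

puts first_ftp
puts without_ftp
```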

.register_scheme(scheme, klass)

Registers the given klass to be instantiated when parsing URLs with the given scheme. Note that currently only schemes that are valid constant names after .upcase can be registered (no -, +, or . allowed).


  
# File 'lib/uri/common.rb', line 76

def self.register_scheme(scheme, klass)
  Schemes.const_set(scheme.to_s.upcase, klass)
end

.scheme_list

Returns a Hash of the defined schemes.


  
# File 'lib/uri/common.rb', line 81

def self.scheme_list
  Schemes.constants.map { |name|
    [name.to_s.upcase, Schemes.const_get(name)]
  }.to_h
end

.split(uri)

Synopsis

URI::split(uri)

Args

uri

String with URI.

Description

Splits the string into the following parts and returns them as an array, in this order:

  • Scheme

  • Userinfo

  • Host

  • Port

  • Registry

  • Path

  • Opaque

  • Query

  • Fragment

Usage

require 'uri'

URI.split("http://www.ruby-lang.org/")
# => ["http", nil, "www.ruby-lang.org", nil, nil, "/", nil, nil, nil]

  
# File 'lib/uri/common.rb', line 155

def self.split(uri)
  RFC3986_PARSER.split(uri)
end
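
Because the result is positional, pairing it with names can make it easier to read. A small sketch follows; PART_NAMES is a constant introduced for this example (it is not defined by the library), and the URL is made up.

```ruby
require 'uri'

# Names in the order documented above. PART_NAMES is hypothetical.
PART_NAMES = %i[scheme userinfo host port registry path opaque query fragment].freeze

values = URI.split("https://user@example.com:8443/a/b?x=1#top")
parts = PART_NAMES.zip(values).to_h

# Every part is a String or nil; the port is not converted to an Integer.
p parts
```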