
Class: StringScanner

Relationships & Source Files
Namespace Children
Exceptions:
Inherits: Object
Defined in: ext/strscan/strscan.c,
ext/strscan/lib/strscan/strscan.rb

Overview

Class StringScanner supports processing a stored string as a stream. This code creates a new StringScanner object with the string 'foobarbaz':

require 'strscan'
scanner = StringScanner.new('foobarbaz')

About the Examples

All examples here assume that StringScanner has been required:

require 'strscan'

Some examples here assume that these constants are defined:

MULTILINE_TEXT = <<~EOT
Go placidly amid the noise and haste,
and remember what peace there may be in silence.
EOT

HIRAGANA_TEXT = 'こんにちは'

ENGLISH_TEXT = 'Hello'

Some examples here assume that certain helper methods are defined:

  • put_situation(scanner): Displays the values of the scanner's methods #pos, #charpos, #rest, and #rest_size.
  • put_match_values(scanner): Displays the scanner's match values.
  • match_values_cleared?(scanner): Returns whether the scanner's match values are cleared.

See examples at helper methods.
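
The helper definitions are not reproduced in this section. As a rough sketch only, assuming put_situation simply prints the four values in the layout used by the examples here, it could be written as follows; situation_lines is a hypothetical name introduced for this sketch, and the real helper may differ in detail:

```ruby
require 'strscan'

# Hypothetical helper (not part of StringScanner): builds the report lines.
def situation_lines(scanner)
  [
    '# Situation:',
    "#   pos:       #{scanner.pos}",
    "#   charpos:   #{scanner.charpos}",
    "#   rest:      #{scanner.rest.inspect}",
    "#   rest_size: #{scanner.rest_size}",
  ]
end

# Sketch of put_situation, assuming it only prints the lines above.
def put_situation(scanner)
  puts situation_lines(scanner)
end

put_situation(StringScanner.new('foobarbaz'))
```

Building the lines separately from printing them keeps the sketch easy to check without capturing output.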

The StringScanner Object

This code creates a StringScanner object (we'll call it simply a scanner), and shows some of its basic properties:

scanner = StringScanner.new('foobarbaz')
scanner.string # => "foobarbaz"
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "foobarbaz"
#   rest_size: 9

The scanner has:

  • A stored string, which is:

    • Initially set by StringScanner.new(string) to the given string ('foobarbaz' in the example above).
    • Modifiable by methods #string=(new_string) and #concat(more_string).
    • Returned by method #string.

    More at Stored String below.

  • A position; a zero-based index into the bytes of the stored string (not into its characters):

    • Initially set by StringScanner.new to 0.
    • Returned by method #pos.
    • Modifiable explicitly by methods #reset, #terminate, and #pos=(new_pos).
    • Modifiable implicitly (various traversing methods, among others).

    More at Byte Position below.

  • A target substring, which is a trailing substring of the stored string; it extends from the current position to the end of the stored string:

    • Initially set by StringScanner.new(string) to the given string ('foobarbaz' in the example above).
    • Returned by method #rest.
    • Modified by any modification to either the stored string or the position.

    Most importantly: the searching and traversing methods operate on the target substring, which may be (and often is) less than the entire stored string.

    More at Target Substring below.

Stored String

The stored string is the string stored in the StringScanner object.

Each of these methods sets, modifies, or returns the stored string:

Method                Effect
.new(string)          Creates a new scanner for the given string.
#string=(new_string)  Replaces the existing stored string.
#concat(more_string)  Appends a string to the existing stored string.
#string               Returns the stored string.

Positions

A StringScanner object maintains a zero-based byte position and a zero-based character position.

Each of these methods explicitly sets positions:

Method                    Effect
#reset                    Sets both positions to zero (beginning of stored string).
#terminate                Sets both positions to the end of the stored string.
#pos=(new_byte_position)  Sets byte position; adjusts character position.
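
As a small illustration of the three methods in the table, the following sketch uses a string of multi-byte characters so that the byte and character positions differ; the variable names are chosen only for this example:

```ruby
require 'strscan'

scanner = StringScanner.new('こんにちは') # Five 3-byte characters.

scanner.pos = 6                          # Set the byte position explicitly.
after_set = [scanner.pos, scanner.charpos]

scanner.terminate                        # Move both positions to the end.
after_terminate = [scanner.pos, scanner.charpos]

scanner.reset                            # Move both positions back to zero.
after_reset = [scanner.pos, scanner.charpos]

p after_set       # => [6, 2]
p after_terminate # => [15, 5]
p after_reset     # => [0, 0]
```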

Byte Position (Position)

The byte position (or simply position) is a zero-based index into the bytes in the scanner's stored string; for a new StringScanner object, the byte position is zero.

When the byte position is:

  • Zero (at the beginning), the target substring is the entire stored string.
  • Equal to the size of the stored string (at the end), the target substring is the empty string ''.

To get or set the byte position:

  • #pos: returns the byte position.
  • #pos=(new_pos): sets the byte position.

Many methods use the byte position as the basis for finding matches; many others set, increment, or decrement the byte position:

scanner = StringScanner.new('foobar')
scanner.pos # => 0
scanner.scan(/foo/) # => "foo" # Match found.
scanner.pos         # => 3     # Byte position incremented.
scanner.scan(/foo/) # => nil   # Match not found.
scanner.pos # => 3             # Byte position not changed.

Some methods implicitly modify the byte position; see:

The values of these methods are derived directly from the values of #pos and #string:

Character Position

The character position is a zero-based index into the characters in the stored string; for a new StringScanner object, the character position is zero.

Method #charpos returns the character position; its value cannot be set directly.

Some methods change (increment or reset) the character position; see:

Example (string includes multi-byte characters):

scanner = StringScanner.new(ENGLISH_TEXT) # Five 1-byte characters.
scanner.concat(HIRAGANA_TEXT)             # Five 3-byte characters
scanner.string # => "Helloこんにちは"       # Twenty bytes in all.
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "Helloこんにちは"
#   rest_size: 20
scanner.scan(/Hello/) # => "Hello" # Five 1-byte characters.
put_situation(scanner)
# Situation:
#   pos:       5
#   charpos:   5
#   rest:      "こんにちは"
#   rest_size: 15
scanner.getch         # => "こ"    # One 3-byte character.
put_situation(scanner)
# Situation:
#   pos:       8
#   charpos:   6
#   rest:      "んにちは"
#   rest_size: 12

Target Substring

The target substring is the part of the stored string that extends from the current byte position to the end of the stored string; it is always either:

  • The entire stored string (byte position is zero).
  • A trailing substring of the stored string (byte position positive).

The target substring is returned by method #rest, and its size is returned by method #rest_size.

Examples:

scanner = StringScanner.new('foobarbaz')
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "foobarbaz"
#   rest_size: 9
scanner.pos = 3
put_situation(scanner)
# Situation:
#   pos:       3
#   charpos:   3
#   rest:      "barbaz"
#   rest_size: 6
scanner.pos = 9
put_situation(scanner)
# Situation:
#   pos:       9
#   charpos:   9
#   rest:      ""
#   rest_size: 0
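
The relationship between the byte position, the stored string, and the target substring can be checked with plain String methods. This is a sketch under the assumption that the byte position falls on a character boundary; the derived_* names are hypothetical and used only here:

```ruby
require 'strscan'

scanner = StringScanner.new('Helloこんにちは')
scanner.pos = 8 # After "Hello" (5 bytes) and "こ" (3 bytes).

string = scanner.string
pos    = scanner.pos

# Target substring: the bytes from the byte position to the end.
derived_rest = string.byteslice(pos, string.bytesize - pos)
# Size of the target substring, in bytes.
derived_rest_size = string.bytesize - pos
# Character position: the number of characters before the byte position.
derived_charpos = string.byteslice(0, pos).length

p scanner.rest == derived_rest           # => true
p scanner.rest_size == derived_rest_size # => true
p scanner.charpos == derived_charpos     # => true
```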

Setting the Target Substring

The target substring is set whenever:

  • The stored string is set (position reset to zero; target substring set to stored string).
  • The byte position is set (target substring adjusted accordingly).

Querying the Target Substring

This table summarizes (details and examples at the links):

Method      Returns
#rest       Target substring.
#rest_size  Size (bytes) of target substring.

Searching the Target Substring

A search method examines the target substring, but does not advance the positions or (by implication) shorten the target substring.

This table summarizes (details and examples at the links):

Method                 Returns                                        Sets Match Values?
#check(pattern)        Matched leading substring or nil.              Yes.
#check_until(pattern)  Matched substring (anywhere) or nil.           Yes.
#exist?(pattern)       Matched substring (anywhere) end index.        Yes.
#match?(pattern)       Size of matched leading substring or nil.      Yes.
#peek(size)            Leading substring of given length (bytes).     No.
#peek_byte             Integer leading byte or nil.                   No.
#rest                  Target substring (from byte position to end).  No.

Traversing the Target Substring

A traversal method examines the target substring, and, if successful:

  • Advances the positions.
  • Shortens the target substring.

This table summarizes (details and examples at links):

Method                Returns                                             Sets Match Values?
#get_byte             Leading byte or nil.                                No.
#getch                Leading character or nil.                           No.
#scan(pattern)        Matched leading substring or nil.                   Yes.
#scan_byte            Integer leading byte or nil.                        No.
#scan_until(pattern)  Matched substring (anywhere) or nil.                Yes.
#skip(pattern)        Matched leading substring size or nil.              Yes.
#skip_until(pattern)  Position delta to end-of-matched-substring or nil.  Yes.
#unscan               self.                                               No.
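
To show how the traversal methods combine in practice, here is a minimal tokenizer sketch. The tokenize method and its token categories are hypothetical, invented for this illustration; only #eos?, #skip, #scan, and #getch come from StringScanner:

```ruby
require 'strscan'

# Hypothetical helper: splits text into word, number, and other tokens.
def tokenize(text)
  scanner = StringScanner.new(text)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next                               # Whitespace consumed; nothing to record.
    elsif (word = scanner.scan(/[a-z]+/))
      tokens << [:word, word]            # Leading run of lowercase letters.
    elsif (number = scanner.scan(/\d+/))
      tokens << [:number, number]        # Leading run of digits.
    else
      tokens << [:other, scanner.getch]  # Any other single character.
    end
  end
  tokens
end

p tokenize('foo 42 bar!')
# => [[:word, "foo"], [:number, "42"], [:word, "bar"], [:other, "!"]]
```

Each successful call advances the position, so the loop always makes progress and ends when #eos? returns true.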

Querying the Scanner

Each of these methods queries the scanner object without modifying it (details and examples at links):

Method               Returns
#beginning_of_line?  true or false.
#charpos             Character position.
#eos?                true or false.
#fixed_anchor?       true or false.
#inspect             String representation of self.
#pos                 Byte position.
#rest                Target substring.
#rest_size           Size of target substring.
#string              Stored string.

Matching

StringScanner implements pattern matching via Ruby class Regexp, and its matching behaviors are the same as Ruby's except for the fixed-anchor property.

Matcher Methods

Each matcher method takes a single argument pattern, and attempts to find a matching substring in the target substring.

Method        Pattern Type       Matches Target Substring  Success Return      May Update Positions?
#check        Regexp or String.  At beginning.             Matched substring.  No.
#check_until  Regexp or String.  Anywhere.                 Substring.          No.
#match?       Regexp or String.  At beginning.             Match size.         No.
#exist?       Regexp or String.  Anywhere.                 Substring size.     No.
#scan         Regexp or String.  At beginning.             Matched substring.  Yes.
#scan_until   Regexp or String.  Anywhere.                 Substring.          Yes.
#skip         Regexp or String.  At beginning.             Match size.         Yes.
#skip_until   Regexp or String.  Anywhere.                 Substring size.     Yes.


Which matcher you choose will depend on:

  • Where you want to find a match:

    • Only at the beginning of the target substring: #check, #match?, #scan, #skip.
    • Anywhere in the target substring: #check_until, #exist?, #scan_until, #skip_until.
  • Whether you want to:

    • Traverse, by advancing the positions: #scan, #scan_until, #skip, #skip_until.
    • Keep the positions unchanged: #check, #check_until, #match?, #exist?.
  • What you want for the return value:

    • The matched substring: #check, #scan.
    • The substring: #check_until, #scan_until.
    • The match size: #match?, #skip.
    • The substring size: #exist?, #skip_until.
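
The three choices above can be compared side by side. The following sketch runs each matcher on a fresh scanner and records the return value and the resulting byte position; the attempts and rows names are introduced only for this example:

```ruby
require 'strscan'

# [method, pattern] pairs; each attempt runs on a fresh scanner at position 0.
attempts = [
  [:check,       /foo/], # Beginning only; positions unchanged.
  [:match?,      /foo/],
  [:scan,        /foo/], # Beginning only; positions advanced.
  [:skip,        /foo/],
  [:check_until, /bar/], # Anywhere; positions unchanged.
  [:exist?,      /bar/],
  [:scan_until,  /bar/], # Anywhere; positions advanced.
  [:skip_until,  /bar/],
]

rows = attempts.map do |name, pattern|
  scanner = StringScanner.new('foobarbaz')
  [name, scanner.public_send(name, pattern), scanner.pos]
end

rows.each { |row| p row }
# [:check, "foo", 0]
# [:match?, 3, 0]
# [:scan, "foo", 3]
# [:skip, 3, 3]
# [:check_until, "foobar", 0]
# [:exist?, 6, 0]
# [:scan_until, "foobar", 6]
# [:skip_until, 6, 6]
```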

Match Values

The match values in a StringScanner object generally contain the results of the most recent attempted match.

Each match value may be thought of as:

  • Clear: Initially, or after an unsuccessful match attempt: usually, false, nil, or {}.
  • Set: After a successful match attempt: true, string, array, or hash.

Each of these methods clears match values:

Each of these methods attempts a match based on a pattern, and either sets match values (if successful) or clears them (if not);

Basic Match Values

Basic match values are those not related to captures.

Each of these methods returns a basic match value:

Method         Return After Match                      Return After No Match
#matched?      true.                                   false.
#matched_size  Size of matched substring.              nil.
#matched       Matched substring.                      nil.
#pre_match     Substring preceding matched substring.  nil.
#post_match    Substring following matched substring.  nil.


See examples below.
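
One way to see how the basic match values relate to each other: after a successful match, the three substrings partition the stored string. A short sketch (the parts variable is just for illustration):

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')
scanner.exist?(/bar/)

parts = [scanner.pre_match, scanner.matched, scanner.post_match]
p parts                                            # => ["foo", "bar", "baz"]
p parts.join == scanner.string                     # => true
p scanner.matched_size == scanner.matched.bytesize # => true
```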

Captured Match Values

Captured match values are those related to captures.

Each of these methods returns a captured match value:

Method           Return After Match                       Return After No Match
#size            Count of captured substrings.            nil.
#[](n)           nth captured substring.                  nil.
#captures        Array of all captured substrings.        nil.
#values_at(*n)   Array of specified captured substrings.  nil.
#named_captures  Hash of named captures.                  {}.


See examples below.

Match Values Examples

Successful basic match attempt (no captures):

scanner = StringScanner.new('foobarbaz')
scanner.exist?(/bar/)
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "foo"
#   matched  :      "bar"
#   post_match:     "baz"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["bar", nil]
#   []:
#     [0]:          "bar"
#     [1]:          nil

Failed basic match attempt (no captures):

scanner = StringScanner.new('foobarbaz')
scanner.exist?(/nope/)
match_values_cleared?(scanner) # => true

Successful unnamed capture match attempt:

scanner = StringScanner.new('foobarbazbatbam')
scanner.exist?(/(foo)bar(baz)bat(bam)/)
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   15
#   pre_match:      ""
#   matched  :      "foobarbazbatbam"
#   post_match:     ""
# Captured match values:
#   size:           4
#   captures:       ["foo", "baz", "bam"]
#   named_captures: {}
#   values_at:      ["foobarbazbatbam", "foo", "baz", "bam", nil]
#   []:
#     [0]:          "foobarbazbatbam"
#     [1]:          "foo"
#     [2]:          "baz"
#     [3]:          "bam"
#     [4]:          nil

Successful named capture match attempt; same as unnamed above, except for #named_captures:

scanner = StringScanner.new('foobarbazbatbam')
scanner.exist?(/(?<x>foo)bar(?<y>baz)bat(?<z>bam)/)
scanner.named_captures # => {"x"=>"foo", "y"=>"baz", "z"=>"bam"}

Failed unnamed capture match attempt:

scanner = StringScanner.new('somestring')
scanner.exist?(/(foo)bar(baz)bat(bam)/)
match_values_cleared?(scanner) # => true

Failed named capture match attempt; same as unnamed above, except for #named_captures:

scanner = StringScanner.new('somestring')
scanner.exist?(/(?<x>foo)bar(?<y>baz)bat(?<z>bam)/)
match_values_cleared?(scanner) # => false
scanner.named_captures # => {"x"=>nil, "y"=>nil, "z"=>nil}

Fixed-Anchor Property

Pattern matching in StringScanner is the same as Ruby's, except for the fixed-anchor property, which determines the meaning of '\A':

  • false (the default): matches the current byte position:

    scanner = StringScanner.new('foobar')
    scanner.scan(/\A./) # => "f"
    scanner.scan(/\A./) # => "o"
    scanner.scan(/\A./) # => "o"
    scanner.scan(/\A./) # => "b"
  • true: matches the beginning of the stored string; never matches unless the byte position is zero:

    scanner = StringScanner.new('foobar', fixed_anchor: true)
    scanner.scan(/\A./) # => "f"
    scanner.scan(/\A./) # => nil
    scanner.reset
    scanner.scan(/\A./) # => "f"

The fixed-anchor property is set when the StringScanner object is created, and may not be modified (see .new); method #fixed_anchor? returns the setting.

Class Method Summary

Instance Attribute Summary

Instance Method Summary

Constructor Details

.new(string, fixed_anchor: false) ⇒ string_scanner (private)

Returns a new StringScanner object whose stored string is the given string; sets the fixed-anchor property:

scanner = StringScanner.new('foobarbaz')
scanner.string        # => "foobarbaz"
scanner.fixed_anchor? # => false
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "foobarbaz"
#   rest_size: 9

  
# File 'ext/strscan/strscan.c', line 254

static VALUE
strscan_initialize(int argc, VALUE *argv, VALUE self)
{
    struct strscanner *p;
    VALUE str, options;

    p = check_strscan(self);
    rb_scan_args(argc, argv, "11", &str, &options);
    options = rb_check_hash_type(options);
    if (!NIL_P(options)) {
        VALUE fixed_anchor;
        ID keyword_ids[1];
        keyword_ids[0] = rb_intern("fixed_anchor");
        rb_get_kwargs(options, keyword_ids, 0, 1, &fixed_anchor);
        if (fixed_anchor == Qundef) {
            p->fixed_anchor_p = false;
        }
        else {
            p->fixed_anchor_p = RTEST(fixed_anchor);
        }
    }
    else {
        p->fixed_anchor_p = false;
    }
    StringValue(str);
    RB_OBJ_WRITE(self, &p->str, str);

    return self;
}

Class Method Details

.must_C_version

This method is for internal use only.

  
# File 'ext/strscan/strscan.c', line 332

static VALUE
strscan_s_mustc(VALUE self)
{
    return self;
}

Instance Attribute Details

#beginning_of_line? ⇒ Boolean (readonly)

Returns whether the position is at the beginning of a line; that is, at the beginning of the stored string or immediately after a newline:

scanner = StringScanner.new(MULTILINE_TEXT)
scanner.string
# => "Go placidly amid the noise and haste,\nand remember what peace there may be in silence.\n"
scanner.pos                # => 0
scanner.beginning_of_line? # => true

scanner.scan_until(/,/)    # => "Go placidly amid the noise and haste,"
scanner.beginning_of_line? # => false

scanner.scan(/\n/)         # => "\n"
scanner.beginning_of_line? # => true

scanner.terminate
scanner.beginning_of_line? # => true

scanner.concat('x')
scanner.terminate
scanner.beginning_of_line? # => false

StringScanner#bol? is an alias for beginning_of_line?.


  
# File 'ext/strscan/strscan.c', line 1445

static VALUE
strscan_bol_p(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    if (CURPTR(p) > S_PEND(p)) return Qnil;
    if (p->curr == 0) return Qtrue;
    return (*(CURPTR(p) - 1) == '\n') ? Qtrue : Qfalse;
}

#eos? ⇒ Boolean (readonly)

Returns whether the position is at the end of the stored string:

scanner = StringScanner.new('foobarbaz')
scanner.eos? # => false
scanner.pos = 3
scanner.eos? # => false
scanner.terminate
scanner.eos? # => true

  
# File 'ext/strscan/strscan.c', line 1476

static VALUE
strscan_eos_p(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    return EOS_P(p) ? Qtrue : Qfalse;
}

#fixed_anchor? ⇒ Boolean (readonly)

Returns whether the fixed-anchor property is set.
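
A brief illustration, using only the constructor option described under Fixed-Anchor Property (the variable names are chosen for this example):

```ruby
require 'strscan'

default_scanner = StringScanner.new('foobar')
fixed_scanner   = StringScanner.new('foobar', fixed_anchor: true)

p default_scanner.fixed_anchor? # => false
p fixed_scanner.fixed_anchor?   # => true
```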


  
# File 'ext/strscan/strscan.c', line 2112

static VALUE
strscan_fixed_anchor_p(VALUE self)
{
    struct strscanner *p;
    p = check_strscan(self);
    return p->fixed_anchor_p ? Qtrue : Qfalse;
}

#matched ⇒ matched_substring? (readonly)

Returns the matched substring from the most recent match attempt if it was successful, or nil otherwise; see Basic Match Values:

scanner = StringScanner.new('foobarbaz')
scanner.matched        # => nil
scanner.pos = 3
scanner.match?(/bar/)  # => 3
scanner.matched        # => "bar"
scanner.match?(/nope/) # => nil
scanner.matched        # => nil

  
# File 'ext/strscan/strscan.c', line 1561

static VALUE
strscan_matched(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    if (! MATCHED_P(p)) return Qnil;
    return extract_range(p,
                         adjust_register_position(p, p->regs.beg[0]),
                         adjust_register_position(p, p->regs.end[0]));
}

#matched? ⇒ Boolean (readonly)

Returns true if the most recent match attempt was successful, false otherwise; see Basic Match Values:

scanner = StringScanner.new('foobarbaz')
scanner.matched?       # => false
scanner.pos = 3
scanner.exist?(/baz/)  # => 6
scanner.matched?       # => true
scanner.exist?(/nope/) # => nil
scanner.matched?       # => false

  
# File 'ext/strscan/strscan.c', line 1529

static VALUE
strscan_matched_p(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    return MATCHED_P(p) ? Qtrue : Qfalse;
}

#pos ⇒ byte_position (rw) #pointer ⇒ byte_position

Alias for #pos.

#pos ⇒ byte_position (rw) Also known as: #pointer

Returns the integer byte position, which may be different from the character position:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string  # => "こんにちは"
scanner.pos     # => 0
scanner.getch   # => "こ" # 3-byte character.
scanner.charpos # => 1
scanner.pos     # => 3

  
# File 'ext/strscan/strscan.c', line 509

static VALUE
strscan_get_pos(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    return LONG2NUM(p->curr);
}

#pos=(n) ⇒ n (rw) #pointer=(n) ⇒ n
Also known as: #pointer=

Sets the byte position and the character position; returns n.

Does not affect match values.

For non-negative n, sets the position to n:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string  # => "こんにちは"
scanner.pos = 3 # => 3
scanner.rest    # => "んにちは"
scanner.charpos # => 1

For negative n, counts from the end of the stored string:

scanner.pos = -9 # => -9
scanner.pos      # => 6
scanner.rest     # => "にちは"
scanner.charpos  # => 2

  
# File 'ext/strscan/strscan.c', line 538

static VALUE
strscan_set_pos(VALUE self, VALUE v)
{
    struct strscanner *p;
    long i;

    GET_SCANNER(self, p);
    i = NUM2LONG(v);
    if (i < 0) i += S_LEN(p);
    if (i < 0) rb_raise(rb_eRangeError, "index out of range");
    if (i > S_LEN(p)) rb_raise(rb_eRangeError, "index out of range");
    p->curr = i;
    return LONG2NUM(i);
}
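
The range checks in the C source above imply that an out-of-range position raises RangeError. The following sketch probes the boundaries; position_accepted? is a hypothetical helper written only for this example:

```ruby
require 'strscan'

# Hypothetical helper: reports whether pos= accepts the given value.
def position_accepted?(scanner, new_pos)
  scanner.pos = new_pos
  true
rescue RangeError
  false
end

scanner = StringScanner.new('foobar') # Six bytes.

p position_accepted?(scanner, 6)  # => true  (end of the stored string)
p position_accepted?(scanner, 7)  # => false (past the end)
p position_accepted?(scanner, -6) # => true  (counts from the end: position 0)
p position_accepted?(scanner, -7) # => false (before the beginning)
```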

#rest ⇒ target_substring (readonly)

Returns the 'rest' of the stored string (all after the current position), which is the target substring:

scanner = StringScanner.new('foobarbaz')
scanner.rest # => "foobarbaz"
scanner.pos = 3
scanner.rest # => "barbaz"
scanner.terminate
scanner.rest # => ""

  
# File 'ext/strscan/strscan.c', line 1949

static VALUE
strscan_rest(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    if (EOS_P(p)) {
        return str_new(p, "", 0);
    }
    return extract_range(p, p->curr, S_LEN(p));
}

#rest? ⇒ Boolean (readonly)

This method is for internal use only.

  
# File 'ext/strscan/strscan.c', line 1498

static VALUE
strscan_rest_p(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    return EOS_P(p) ? Qfalse : Qtrue;
}

#string ⇒ stored_string (rw)

Returns the stored string:

scanner = StringScanner.new('foobar')
scanner.string # => "foobar"
scanner.concat('baz')
scanner.string # => "foobarbaz"

  
# File 'ext/strscan/strscan.c', line 408

static VALUE
strscan_get_string(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    return p->str;
}

#string=(other_string) ⇒ other_string (rw)

Replaces the stored string with the given other_string:

scanner = StringScanner.new('foobar')
scanner.scan(/foo/)
put_situation(scanner)
# Situation:
#   pos:       3
#   charpos:   3
#   rest:      "bar"
#   rest_size: 3
match_values_cleared?(scanner) # => false

scanner.string = 'baz'         # => "baz"
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "baz"
#   rest_size: 3
match_values_cleared?(scanner) # => true

  
# File 'ext/strscan/strscan.c', line 452

static VALUE
strscan_set_string(VALUE self, VALUE str)
{
    struct strscanner *p = check_strscan(self);

    StringValue(str);
    RB_OBJ_WRITE(self, &p->str, str);
    p->curr = 0;
    CLEAR_MATCH_STATUS(p);
    return str;
}

Instance Method Details

#<<(more_string) ⇒ self Also known as: #concat

Appends the given more_string to the stored string; returns self:

scanner = StringScanner.new('foo')
scanner.string           # => "foo"
scanner.terminate
scanner.concat('barbaz') # => #<StringScanner 3/9 "foo" @ "barba...">
scanner.string           # => "foobarbaz"
put_situation(scanner)
# Situation:
#   pos:       3
#   charpos:   3
#   rest:      "barbaz"
#   rest_size: 6

  
# File 'ext/strscan/strscan.c', line 493

static VALUE
strscan_concat(VALUE self, VALUE str)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    StringValue(str);
    rb_str_append(p->str, str);
    return self;
}

#[](specifier) ⇒ substring?

Returns a captured substring or nil; see Captured Match Values.

When there are captures:

scanner = StringScanner.new('Fri Dec 12 1975 14:39')
scanner.scan(/(?<wday>\w+) (?<month>\w+) (?<day>\d+) /)
  • specifier zero: returns the entire matched substring:

    scanner[0]         # => "Fri Dec 12 "
    scanner.pre_match  # => ""
    scanner.post_match # => "1975 14:39"
  • specifier positive integer: returns the nth capture, or nil if out of range:

    scanner[1] # => "Fri"
    scanner[2] # => "Dec"
    scanner[3] # => "12"
    scanner[4] # => nil
  • specifier negative integer: counts backward from the last subgroup:

    scanner[-1] # => "12"
    scanner[-4] # => "Fri Dec 12 "
    scanner[-5] # => nil
  • specifier symbol or string: returns the named subgroup, or nil if there is none:

    scanner[:wday]  # => "Fri"
    scanner['wday'] # => "Fri"
    scanner[:month] # => "Dec"
    scanner[:day]   # => "12"
    scanner[:nope]  # => nil

When there are no captures, only [0] returns non-nil:

scanner = StringScanner.new('foobarbaz')
scanner.exist?(/bar/)
scanner[0] # => "bar"
scanner[1] # => nil

For a failed match, even [0] returns nil:

scanner.scan(/nope/) # => nil
scanner[0]           # => nil
scanner[1]           # => nil

  
# File 'ext/strscan/strscan.c', line 1695

static VALUE
strscan_aref(VALUE self, VALUE idx)
{
    const char *name;
    struct strscanner *p;
    long i;

    GET_SCANNER(self, p);
    if (! MATCHED_P(p))        return Qnil;

    switch (TYPE(idx)) {
        case T_SYMBOL:
            idx = rb_sym2str(idx);
            /* fall through */
        case T_STRING:
            RSTRING_GETMEM(idx, name, i);
            i = name_to_backref_number(&(p->regs), p->regex, name, name + i, rb_enc_get(idx));
            break;
        default:
            i = NUM2LONG(idx);
    }

    if (i < 0)
        i += p->regs.num_regs;
    if (i < 0)                 return Qnil;
    if (i >= p->regs.num_regs) return Qnil;
    if (p->regs.beg[i] == -1)  return Qnil;

    return extract_range(p,
                         adjust_register_position(p, p->regs.beg[i]),
                         adjust_register_position(p, p->regs.end[i]));
}

#captures ⇒ substring_array?

Returns the array of captured match values at indexes (1..) if the most recent match attempt succeeded, or nil otherwise:

scanner = StringScanner.new('Fri Dec 12 1975 14:39')
scanner.captures         # => nil

scanner.exist?(/(?<wday>\w+) (?<month>\w+) (?<day>\d+) /)
scanner.captures         # => ["Fri", "Dec", "12"]
scanner.values_at(*0..4) # => ["Fri Dec 12 ", "Fri", "Dec", "12", nil]

scanner.exist?(/Fri/)
scanner.captures         # => []

scanner.scan(/nope/)
scanner.captures         # => nil

  
# File 'ext/strscan/strscan.c', line 1788

static VALUE
strscan_captures(VALUE self)
{
    struct strscanner *p;
    int   i, num_regs;
    VALUE new_ary;

    GET_SCANNER(self, p);
    if (! MATCHED_P(p))        return Qnil;

    num_regs = p->regs.num_regs;
    new_ary  = rb_ary_new2(num_regs);

    for (i = 1; i < num_regs; i++) {
        VALUE str;
        if (p->regs.beg[i] == -1)
            str = Qnil;
        else
            str = extract_range(p,
                                adjust_register_position(p, p->regs.beg[i]),
                                adjust_register_position(p, p->regs.end[i]));
        rb_ary_push(new_ary, str);
    }

    return new_ary;
}

#charpos ⇒ character_position

Returns the character position (initially zero), which may be different from the byte position given by method #pos:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string # => "こんにちは"
scanner.getch  # => "こ" # 3-byte character.
scanner.getch  # => "ん" # 3-byte character.
put_situation(scanner)
# Situation:
#   pos:       6
#   charpos:   2
#   rest:      "にちは"
#   rest_size: 9

  
# File 'ext/strscan/strscan.c', line 523

static VALUE
strscan_get_charpos(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);

    return LONG2NUM(rb_enc_strlen(S_PBEG(p), CURPTR(p), rb_enc_get(p->str)));
}

#check(pattern) ⇒ matched_substring?

Attempts to match the given pattern at the beginning of the target substring; does not modify the positions.

If the match succeeds:

scanner = StringScanner.new('foobarbaz')
scanner.pos = 3
scanner.check('bar') # => "bar"
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "foo"
#   matched  :      "bar"
#   post_match:     "baz"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["bar", nil]
#   []:
#     [0]:          "bar"
#     [1]:          nil
# => 0..1
put_situation(scanner)
# Situation:
#   pos:       3
#   charpos:   3
#   rest:      "barbaz"
#   rest_size: 6

If the match fails:

scanner.check(/nope/)          # => nil
match_values_cleared?(scanner) # => true

  
# File 'ext/strscan/strscan.c', line 896

static VALUE
strscan_check(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 0, 1, 1);
}

#check_until(pattern) ⇒ substring?

Attempts to match the given pattern anywhere (at any position) in the target substring; does not modify the positions.

If the match succeeds:

  • Sets all match values.
  • Returns the substring that extends from the current position through the end of the matched substring.
scanner = StringScanner.new('foobarbazbatbam')
scanner.pos = 6
scanner.check_until(/bat/) # => "bazbat"
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "foobarbaz"
#   matched  :      "bat"
#   post_match:     "bam"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["bat", nil]
#   []:
#     [0]:          "bat"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       6
#   charpos:   6
#   rest:      "bazbatbam"
#   rest_size: 9

If the match fails:

scanner.check_until(/nope/)    # => nil
match_values_cleared?(scanner) # => true

  
# File 'ext/strscan/strscan.c', line 1069

static VALUE
strscan_check_until(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 0, 1, 0);
}

#<<(more_string) ⇒ self #concat(more_string) ⇒ self

Alias for #<<.

#exist?(pattern) ⇒ byte_offset?

Attempts to match the given pattern anywhere (at any position) in the target substring; does not modify the positions.

If the match succeeds:

  • Returns a byte offset: the distance in bytes between the current position and the end of the matched substring.
  • Sets all match values.
scanner = StringScanner.new('foobarbazbatbam')
scanner.pos = 6
scanner.exist?(/bat/) # => 6
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "foobarbaz"
#   matched  :      "bat"
#   post_match:     "bam"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["bat", nil]
#   []:
#     [0]:          "bat"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       6
#   charpos:   6
#   rest:      "bazbatbam"
#   rest_size: 9

If the match fails:

scanner.exist?(/nope/)         # => nil
match_values_cleared?(scanner) # => true

  
# File 'ext/strscan/strscan.c', line 995

static VALUE
strscan_exist_p(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 0, 0, 0);
}

#get_byte ⇒ byte_as_character?

Returns the next byte, if available:

  • If the position is not at the end of the stored string:

    scanner = StringScanner.new(HIRAGANA_TEXT)
    # => #<StringScanner 0/15 @ "\xE3\x81\x93\xE3\x82...">
    scanner.string                                   # => "こんにちは"
    [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\xE3", 1, 1]
    [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x81", 2, 2]
    [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x93", 3, 1]
    [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\xE3", 4, 2]
    [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x82", 5, 3]
    [scanner.get_byte, scanner.pos, scanner.charpos] # => ["\x93", 6, 2]
  • Otherwise, returns nil, and does not change the positions.

    scanner.terminate
    [scanner.get_byte, scanner.pos, scanner.charpos] # => [nil, 15, 5]

  
# File 'ext/strscan/strscan.c', line 1190

static VALUE
strscan_get_byte(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    CLEAR_MATCH_STATUS(p);
    if (EOS_P(p))
        return Qnil;

    p->prev = p->curr;
    p->curr++;
    MATCHED(p);
    adjust_registers_to_matched(p);
    return extract_range(p,
                         adjust_register_position(p, p->regs.beg[0]),
                         adjust_register_position(p, p->regs.end[0]));
}

#getch ⇒ character?

Returns the next (possibly multibyte) character, if available:

  • If the position is at the beginning of a character:

    scanner = StringScanner.new(HIRAGANA_TEXT)
    scanner.string                                # => "こんにちは"
    [scanner.getch, scanner.pos, scanner.charpos] # => ["こ", 3, 1]
    [scanner.getch, scanner.pos, scanner.charpos] # => ["ん", 6, 2]
    [scanner.getch, scanner.pos, scanner.charpos] # => ["に", 9, 3]
    [scanner.getch, scanner.pos, scanner.charpos] # => ["ち", 12, 4]
    [scanner.getch, scanner.pos, scanner.charpos] # => ["は", 15, 5]
    [scanner.getch, scanner.pos, scanner.charpos] # => [nil, 15, 5]
  • If the position is within a multi-byte character (that is, not at its beginning), behaves like #get_byte (returns a 1-byte character):

    scanner.pos = 1
    [scanner.getch, scanner.pos, scanner.charpos] # => ["\x81", 2, 2]
    [scanner.getch, scanner.pos, scanner.charpos] # => ["\x93", 3, 1]
    [scanner.getch, scanner.pos, scanner.charpos] # => ["ん", 6, 2]
  • If the position is at the end of the stored string, returns nil and does not modify the positions:

    scanner.terminate
    [scanner.getch, scanner.pos, scanner.charpos] # => [nil, 15, 5]

  
# File 'ext/strscan/strscan.c', line 1117

static VALUE
strscan_getch(VALUE self)
{
    struct strscanner *p;
    long len;

    GET_SCANNER(self, p);
    CLEAR_MATCH_STATUS(p);
    if (EOS_P(p))
        return Qnil;

    len = rb_enc_mbclen(CURPTR(p), S_PEND(p), rb_enc_get(p->str));
    len = minl(len, S_RESTLEN(p));
    p->prev = p->curr;
    p->curr += len;
    MATCHED(p);
    adjust_registers_to_matched(p);
    return extract_range(p,
                         adjust_register_position(p, p->regs.beg[0]),
                         adjust_register_position(p, p->regs.end[0]));
}

#dup ⇒ shallow_copy (private)

Returns a shallow copy of self; the stored string in the copy is the same string as in self.
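
A minimal sketch of what "shallow" means here, assuming the copy is made with the public #dup: the copy and the original share one stored string object, while each keeps its own position. The variable names are illustrative only.

```ruby
require 'strscan'

# String.new gives an unfrozen, mutable string to share between the scanners.
scanner = StringScanner.new(String.new('foobarbaz'))
copy = scanner.dup

# Both scanners refer to the very same String object.
same_string = copy.string.equal?(scanner.string) # => true

# Positions are independent: advancing the copy leaves the original alone.
copy.scan(/foo/)
positions = [scanner.pos, copy.pos] # => [0, 3]

# Because the string is shared, appending through one is visible in the other.
scanner.string << 'qux'
copy_rest = copy.rest # => "barbazqux"
```

If independent strings are needed, one option is to build a new scanner from a duplicate of the string instead.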


  
# File 'ext/strscan/strscan.c', line 300

static VALUE
strscan_init_copy(VALUE vself, VALUE vorig)
{
    struct strscanner *self, *orig;

    self = check_strscan(vself);
    orig = check_strscan(vorig);
    if (self != orig) {
	self->flags = orig->flags;
	RB_OBJ_WRITE(vself, &self->str, orig->str);
	self->prev = orig->prev;
	self->curr = orig->curr;
	if (rb_reg_region_copy(&self->regs, &orig->regs))
	    rb_memerror();
	RB_GC_GUARD(vorig);
    }

    return vself;
}

#inspect ⇒ String

Returns a string representation of self that may show:

  1. The current position.
  2. The size (in bytes) of the stored string.
  3. The substring preceding the current position.
  4. The substring following the current position (which is also the target substring).
scanner = StringScanner.new("Fri Dec 12 1975 14:39")
scanner.pos = 11
scanner.inspect # => "#<StringScanner 11/21 \"...c 12 \" @ \"1975 ...\">"

If at beginning-of-string, item 3 above (preceding substring) is omitted:

scanner.reset
scanner.inspect # => "#<StringScanner 0/21 @ \"Fri D...\">"

If at end-of-string, all items above are omitted:

scanner.terminate
scanner.inspect # => "#<StringScanner fin>"

  
# File 'ext/strscan/strscan.c', line 2034

static VALUE
strscan_inspect(VALUE self)
{
    struct strscanner *p;
    VALUE a, b;

    p = check_strscan(self);
    if (NIL_P(p->str)) {
	a = rb_sprintf("#<%"PRIsVALUE" (uninitialized)>", rb_obj_class(self));
	return a;
    }
    if (EOS_P(p)) {
	a = rb_sprintf("#<%"PRIsVALUE" fin>", rb_obj_class(self));
	return a;
    }
    if (p->curr == 0) {
	b = inspect2(p);
	a = rb_sprintf("#<%"PRIsVALUE" %ld/%ld @ %"PRIsVALUE">",
		       rb_obj_class(self),
		       p->curr, S_LEN(p),
		       b);
	return a;
    }
    a = inspect1(p);
    b = inspect2(p);
    a = rb_sprintf("#<%"PRIsVALUE" %ld/%ld %"PRIsVALUE" @ %"PRIsVALUE">",
		   rb_obj_class(self),
		   p->curr, S_LEN(p),
		   a, b);
    return a;
}

#match?(pattern) ⇒ updated_position?

Attempts to match the given pattern at the beginning of the target substring; does not modify the positions.

If the match succeeds:

  • Sets match values.
  • Returns the size in bytes of the matched substring.
scanner = StringScanner.new('foobarbaz')
scanner.pos = 3
scanner.match?(/bar/) # => 3
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "foo"
#   matched  :      "bar"
#   post_match:     "baz"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["bar", nil]
#   []:
#     [0]:          "bar"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       3
#   charpos:   3
#   rest:      "barbaz"
#   rest_size: 6

If the match fails:

  • Clears match values.
  • Returns nil.
  • Does not increment positions.
scanner.match?(/nope/)         # => nil
match_values_cleared?(scanner) # => true

  
# File 'ext/strscan/strscan.c', line 824

static VALUE
strscan_match_p(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 0, 0, 1);
}

#matched_size ⇒ substring_size?

Returns the size (in bytes) of the matched substring from the most recent match attempt if it was successful, or nil otherwise; see Basic Match Values:

scanner = StringScanner.new('foobarbaz')
scanner.matched_size   # => nil

scanner.pos = 3
scanner.exist?(/baz/)  # => 6
scanner.matched_size   # => 3

scanner.exist?(/nope/) # => nil
scanner.matched_size   # => nil
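
Because the size is counted in bytes, it can differ from the character count of the matched substring when the stored string contains multibyte characters. A short sketch, assuming a UTF-8 source file:

```ruby
require 'strscan'

scanner = StringScanner.new('こんにちは')
scanner.scan(/こん/)                 # => "こん"

# Each of these hiragana characters occupies 3 bytes in UTF-8.
byte_count = scanner.matched_size   # => 6
char_count = scanner.matched.size   # => 2
```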

  
# File 'ext/strscan/strscan.c', line 1598

static VALUE
strscan_matched_size(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    if (! MATCHED_P(p)) return Qnil;
    return LONG2NUM(p->regs.end[0] - p->regs.beg[0]);
}

#named_captures ⇒ Hash

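Returns a hash of the named captures from the most recent match attempt, with capture names as keys; each value is the captured substring, or nil if that group did not match (for example, because the attempt failed). Returns an empty hash if no match has been attempted or the pattern has no named captures; see Captured Match Values: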

scanner = StringScanner.new('Fri Dec 12 1975 14:39')
scanner.named_captures # => {}

pattern = /(?<wday>\w+) (?<month>\w+) (?<day>\d+) /
scanner.match?(pattern)
scanner.named_captures # => {"wday"=>"Fri", "month"=>"Dec", "day"=>"12"}

scanner.string = 'nope'
scanner.match?(pattern)
scanner.named_captures # => {"wday"=>nil, "month"=>nil, "day"=>nil}

scanner.match?(/nosuch/)
scanner.named_captures # => {}

  
# File 'ext/strscan/strscan.c', line 2176

static VALUE
strscan_named_captures(VALUE self)
{
    struct strscanner *p;
    named_captures_data data;
    GET_SCANNER(self, p);
    data.self = self;
    data.captures = rb_hash_new();
    if (!RB_NIL_P(p->regex)) {
        onig_foreach_name(RREGEXP_PTR(p->regex), named_captures_iter, &data);
    }

    return data.captures;
}

#peek(length) ⇒ substring

Returns the substring of at most length bytes that begins at the current byte position (that is, string.byteslice(pos, length)); does not update match values or positions:

scanner = StringScanner.new('foobarbaz')
scanner.pos = 3
scanner.peek(3)   # => "bar"
scanner.terminate
scanner.peek(3)   # => ""
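
Since the length is measured in bytes, a peek into multibyte text can end in the middle of a character. The sketch below assumes a UTF-8 source file and illustrates that case:

```ruby
require 'strscan'

scanner = StringScanner.new('こんにちは')

whole   = scanner.peek(3)  # => "こ" (exactly one 3-byte character)
partial = scanner.peek(4)  # "こ" followed by the first byte of "ん"

partial_bytes = partial.bytesize        # => 4
partial_valid = partial.valid_encoding? # => false
pos_after     = scanner.pos             # => 0 (peeking never moves the position)
```

When whole characters are required, peeking a generous number of bytes and then calling String#scrub, or matching with #check, may be a safer choice.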

  
# File 'ext/strscan/strscan.c', line 1228

static VALUE
strscan_peek(VALUE self, VALUE vlen)
{
    struct strscanner *p;
    long len;

    GET_SCANNER(self, p);

    len = NUM2LONG(vlen);
    if (EOS_P(p))
        return str_new(p, "", 0);

    len = minl(len, S_RESTLEN(p));
    return extract_beg_len(p, p->curr, len);
}

#peek_byte

Peeks at the current byte and returns it as an integer, or returns nil if the position is at the end of the stored string; does not change the position.

s = StringScanner.new('ab')
s.peek_byte         # => 97

  
# File 'ext/strscan/strscan.c', line 1173

static VALUE
strscan_peek_byte(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    if (EOS_P(p))
        return Qnil;

    return INT2FIX((unsigned char)*CURPTR(p));
}

#post_match ⇒ substring

Returns the substring that follows the matched substring from the most recent match attempt if it was successful, or nil otherwise; see Basic Match Values:

scanner = StringScanner.new('foobarbaz')
scanner.post_match     # => nil

scanner.pos = 3
scanner.match?(/bar/)  # => 3
scanner.post_match     # => "baz"

scanner.match?(/nope/) # => nil
scanner.post_match     # => nil

  
# File 'ext/strscan/strscan.c', line 1917

static VALUE
strscan_post_match(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    if (! MATCHED_P(p)) return Qnil;
    return extract_range(p,
                         adjust_register_position(p, p->regs.end[0]),
                         S_LEN(p));
}

#pre_match ⇒ substring

Returns the substring that precedes the matched substring from the most recent match attempt if it was successful, or nil otherwise; see Basic Match Values:

scanner = StringScanner.new('foobarbaz')
scanner.pre_match      # => nil

scanner.pos = 3
scanner.exist?(/baz/)  # => 6
scanner.pre_match      # => "foobar" # Substring of entire string, not just target string.

scanner.exist?(/nope/) # => nil
scanner.pre_match      # => nil

  
# File 'ext/strscan/strscan.c', line 1880

static VALUE
strscan_pre_match(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    if (! MATCHED_P(p)) return Qnil;
    return extract_range(p,
                         0,
                         adjust_register_position(p, p->regs.beg[0]));
}

#reset ⇒ self

Sets both byte position and character position to zero, and clears match values; returns self:

scanner = StringScanner.new('foobarbaz')
scanner.exist?(/bar/)          # => 6
scanner.reset                  # => #<StringScanner 0/9 @ "fooba...">
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "foobarbaz"
#   rest_size: 9
# => nil
match_values_cleared?(scanner) # => true

  
# File 'ext/strscan/strscan.c', line 364

static VALUE
strscan_reset(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    p->curr = 0;
    CLEAR_MATCH_STATUS(p);
    return self;
}

#rest_size ⇒ Integer

Returns the size (in bytes) of the #rest of the stored string:

scanner = StringScanner.new('foobarbaz')
scanner.rest      # => "foobarbaz"
scanner.rest_size # => 9
scanner.pos = 3
scanner.rest      # => "barbaz"
scanner.rest_size # => 6
scanner.terminate
scanner.rest      # => ""
scanner.rest_size # => 0

  
# File 'ext/strscan/strscan.c', line 1983

static VALUE
strscan_rest_size(VALUE self)
{
    struct strscanner *p;
    long i;

    GET_SCANNER(self, p);
    if (EOS_P(p)) {
        return INT2FIX(0);
    }
    i = S_RESTLEN(p);
    return INT2FIX(i);
}

#scan(pattern) ⇒ substring?

Attempts to match the given pattern at the beginning of the target substring.

If the match succeeds:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string     # => "こんにちは"
scanner.pos = 6
scanner.scan(/に/) # => "に"
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "こん"
#   matched  :      "に"
#   post_match:     "ちは"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["に", nil]
#   []:
#     [0]:          "に"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       9
#   charpos:   3
#   rest:      "ちは"
#   rest_size: 6

If the match fails:

  • Returns nil.
  • Does not increment byte and character positions.
  • Clears match values.
scanner.scan(/nope/)           # => nil
match_values_cleared?(scanner) # => true

  
# File 'ext/strscan/strscan.c', line 762

static VALUE
strscan_scan(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 1, 1, 1);
}

#scan_base10_integer (private)

This method is for internal use only.

  
# File 'ext/strscan/strscan.c', line 1282

static VALUE
strscan_scan_base10_integer(VALUE self)
{
    char *ptr;
    long len = 0, remaining_len;
    struct strscanner *p;

    GET_SCANNER(self, p);
    CLEAR_MATCH_STATUS(p);

    strscan_must_ascii_compat(p->str);

    ptr = CURPTR(p);

    remaining_len = S_RESTLEN(p);

    if (remaining_len <= 0) {
        return Qnil;
    }

    if (ptr[len] == '-' || ptr[len] == '+') {
        len++;
    }

    if (!rb_isdigit(ptr[len])) {
        return Qnil;
    }

    p->prev = p->curr;

    while (len < remaining_len && rb_isdigit(ptr[len])) {
        len++;
    }

    return strscan_parse_integer(p, 10, len);
}

#scan_base16_integer (private)

This method is for internal use only.

  
# File 'ext/strscan/strscan.c', line 1320

static VALUE
strscan_scan_base16_integer(VALUE self)
{
    char *ptr;
    long len = 0, remaining_len;
    struct strscanner *p;

    GET_SCANNER(self, p);
    CLEAR_MATCH_STATUS(p);

    strscan_must_ascii_compat(p->str);

    ptr = CURPTR(p);

    remaining_len = S_RESTLEN(p);

    if (remaining_len <= 0) {
        return Qnil;
    }

    if (ptr[len] == '-' || ptr[len] == '+') {
        len++;
    }

    if ((remaining_len >= (len + 3)) && ptr[len] == '0' && ptr[len + 1] == 'x' && rb_isxdigit(ptr[len + 2])) {
        len += 2;
    }

    if (len >= remaining_len || !rb_isxdigit(ptr[len])) {
        return Qnil;
    }

    p->prev = p->curr;

    while (len < remaining_len && rb_isxdigit(ptr[len])) {
        len++;
    }

    return strscan_parse_integer(p, 16, len);
}

#scan_byte ⇒ integer_byte

Scans one byte and returns it as an integer. This method is not multibyte character sensitive. See also: #getch.
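
A small sketch of reading a multibyte character one byte at a time. Not every strscan release provides #scan_byte, so the hypothetical helper next_byte_value (not part of StringScanner) falls back to #get_byte and converts its 1-byte string to an integer; a UTF-8 source file is assumed.

```ruby
require 'strscan'

# Hypothetical helper: prefers #scan_byte when the installed strscan has it,
# otherwise converts the 1-byte string from #get_byte into an integer.
def next_byte_value(scanner)
  if scanner.respond_to?(:scan_byte)
    scanner.scan_byte
  else
    scanner.get_byte&.getbyte(0)
  end
end

scanner = StringScanner.new('こ') # 3 bytes in UTF-8: E3 81 93
bytes = []
while (byte = next_byte_value(scanner))
  bytes << byte
end

bytes              # => [227, 129, 147]
final_pos = scanner.pos # => 3
```

The loop ends because both methods return nil once the position reaches the end of the stored string.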


  
# File 'ext/strscan/strscan.c', line 1148

static VALUE
strscan_scan_byte(VALUE self)
{
    struct strscanner *p;
    VALUE byte;

    GET_SCANNER(self, p);
    CLEAR_MATCH_STATUS(p);
    if (EOS_P(p))
        return Qnil;

    byte = INT2FIX((unsigned char)*CURPTR(p));
    p->prev = p->curr;
    p->curr++;
    MATCHED(p);
    adjust_registers_to_matched(p);
    return byte;
}

#scan_full(re, s, f)

This method is for internal use only.

  
# File 'ext/strscan/strscan.c', line 921

static VALUE
strscan_scan_full(VALUE self, VALUE re, VALUE s, VALUE f)
{
    return strscan_do_scan(self, re, RTEST(s), RTEST(f), 1);
}

#scan_integer(base: 10)

If base isn’t provided or is 10, then it is equivalent to calling #scan with a [+-]?\d+ pattern, and returns an Integer or nil.

If base is 16, then it is equivalent to calling #scan with a [+-]?(0x)?[0-9a-fA-F]+ pattern, and returns an Integer or nil.

The scanned string must be encoded with an ASCII compatible encoding, otherwise Encoding::CompatibilityError will be raised.
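
The equivalence described above can be sketched in plain Ruby using only #scan and Kernel#Integer. The helper scan_integer_sketch is hypothetical and is not how the library implements #scan_integer; in particular, it does not reproduce the encoding check.

```ruby
require 'strscan'

# Hypothetical helper illustrating the documented patterns; returns an
# Integer on success, or nil (leaving the position unchanged) on failure.
def scan_integer_sketch(scanner, base: 10)
  pattern =
    case base
    when 10 then /[+-]?\d+/
    when 16 then /[+-]?(0x)?[0-9a-fA-F]+/
    else raise ArgumentError, "Unsupported integer base: #{base.inspect}, expected 10 or 16"
    end
  digits = scanner.scan(pattern)
  digits && Integer(digits, base)
end

scanner = StringScanner.new('-42 0x1F nope')
first  = scan_integer_sketch(scanner)           # => -42
scanner.skip(/ /)
second = scan_integer_sketch(scanner, base: 16) # => 31
scanner.skip(/ /)
third  = scan_integer_sketch(scanner)           # => nil
pos_after_failure = scanner.pos                 # => 9
```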


  
# File 'ext/strscan/lib/strscan/strscan.rb', line 15

def scan_integer(base: 10)
  case base
  when 10
    scan_base10_integer
  when 16
    scan_base16_integer
  else
    raise ArgumentError, "Unsupported integer base: #{base.inspect}, expected 10 or 16"
  end
end

#scan_until(pattern) ⇒ substring?

Attempts to match the given pattern anywhere (at any position) in the target substring.

If the match attempt succeeds:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string           # => "こんにちは"
scanner.pos = 6
scanner.scan_until(/ち/) # => "にち"
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "こんに"
#   matched  :      "ち"
#   post_match:     "は"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["ち", nil]
#   []:
#     [0]:          "ち"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       12
#   charpos:   4
#   rest:      "は"
#   rest_size: 3

If the match attempt fails:

  • Clears match values.
  • Returns nil.
  • Does not update positions.
scanner.scan_until(/nope/)     # => nil
match_values_cleared?(scanner) # => true

  
# File 'ext/strscan/strscan.c', line 932

static VALUE
strscan_scan_until(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 1, 1, 0);
}

#search_full(re, s, f)

This method is for internal use only.

  
# File 'ext/strscan/strscan.c', line 1094

static VALUE
strscan_search_full(VALUE self, VALUE re, VALUE s, VALUE f)
{
    return strscan_do_scan(self, re, RTEST(s), RTEST(f), 0);
}

#sizecaptures_count

Returns the count of captures if the most recent match attempt succeeded, nil otherwise; see Captured Match Values:

scanner = StringScanner.new('Fri Dec 12 1975 14:39')
scanner.size                        # => nil

pattern = /(?<wday>\w+) (?<month>\w+) (?<day>\d+) /
scanner.match?(pattern)
scanner.values_at(*0..scanner.size) # => ["Fri Dec 12 ", "Fri", "Dec", "12", nil]
scanner.size                        # => 4

scanner.match?(/nope/)              # => nil
scanner.size                        # => nil

  
# File 'ext/strscan/strscan.c', line 1752

static VALUE
strscan_size(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    if (! MATCHED_P(p))        return Qnil;
    return INT2FIX(p->regs.num_regs);
}

#skip(pattern) ⇒ match_size or nil

Attempts to match the given pattern at the beginning of the target substring.

If the match succeeds:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string                  # => "こんにちは"
scanner.pos = 6
scanner.skip(/に/)              # => 3
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "こん"
#   matched  :      "に"
#   post_match:     "ちは"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["に", nil]
#   []:
#     [0]:          "に"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       9
#   charpos:   3
#   rest:      "ちは"
#   rest_size: 6

If the match fails:

scanner.skip(/nope/)            # => nil
match_values_cleared?(scanner)  # => true

  
# File 'ext/strscan/strscan.c', line 835

static VALUE
strscan_skip(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 1, 0, 1);
}

#skip_until(pattern) ⇒ matched_substring_size?

Attempts to match the given pattern anywhere (at any position) in the target substring.

If the match attempt succeeds:

  • Sets match values.
  • Advances the position to the end of the matched substring.
  • Returns the number of bytes from the starting position through the end of the matched substring.
scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string           # => "こんにちは"
scanner.pos = 6
scanner.skip_until(/ち/) # => 6
put_match_values(scanner)
# Basic match values:
#   matched?:       true
#   matched_size:   3
#   pre_match:      "こんに"
#   matched  :      "ち"
#   post_match:     "は"
# Captured match values:
#   size:           1
#   captures:       []
#   named_captures: {}
#   values_at:      ["ち", nil]
#   []:
#     [0]:          "ち"
#     [1]:          nil
put_situation(scanner)
# Situation:
#   pos:       12
#   charpos:   4
#   rest:      "は"
#   rest_size: 3

If the match attempt fails:

  • Clears match values.
  • Returns nil.
scanner.skip_until(/nope/)     # => nil
match_values_cleared?(scanner) # => true
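
The matching methods differ in three ways: whether the match must begin at the current position, whether a successful match advances the position, and whether the return value is a substring or a byte count. The sketch below runs each method from the same starting point for comparison; fresh_scanner is a hypothetical helper used only to reset the state.

```ruby
require 'strscan'

# Hypothetical helper: a new scanner over 'foobarbaz', positioned at byte 3.
def fresh_scanner
  scanner = StringScanner.new('foobarbaz')
  scanner.pos = 3
  scanner
end

# Anchored at the current position; advance on success.
s = fresh_scanner
scan_result = [s.scan(/bar/), s.pos]             # => ["bar", 6]
s = fresh_scanner
skip_result = [s.skip(/bar/), s.pos]             # => [3, 6]

# Search anywhere in the target substring; advance on success.
s = fresh_scanner
scan_until_result = [s.scan_until(/baz/), s.pos] # => ["barbaz", 9]
s = fresh_scanner
skip_until_result = [s.skip_until(/baz/), s.pos] # => [6, 9]

# Same matching rules, but the position is left unchanged.
s = fresh_scanner
match_result = [s.match?(/bar/), s.pos]          # => [3, 3]
exist_result = [s.exist?(/baz/), s.pos]          # => [6, 3]
```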

  
# File 'ext/strscan/strscan.c', line 1006

static VALUE
strscan_skip_until(VALUE self, VALUE re)
{
    return strscan_do_scan(self, re, 1, 0, 0);
}

#terminate ⇒ self

Sets the scanner to end-of-string and clears match values; returns self:

scanner = StringScanner.new(HIRAGANA_TEXT)
scanner.string                 # => "こんにちは"
scanner.scan_until(/に/)
put_situation(scanner)
# Situation:
#   pos:       9
#   charpos:   3
#   rest:      "ちは"
#   rest_size: 6
match_values_cleared?(scanner) # => false

scanner.terminate              # => #<StringScanner fin>
put_situation(scanner)
# Situation:
#   pos:       15
#   charpos:   5
#   rest:      ""
#   rest_size: 0
match_values_cleared?(scanner) # => true

  
# File 'ext/strscan/strscan.c', line 380

static VALUE
strscan_terminate(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    p->curr = S_LEN(p);
    CLEAR_MATCH_STATUS(p);
    return self;
}

#unscan ⇒ self

Sets the position to its value before the most recent successful match attempt, and clears match values; returns self:

scanner = StringScanner.new('foobarbaz')
scanner.scan(/foo/)
put_situation(scanner)
# Situation:
#   pos:       3
#   charpos:   3
#   rest:      "barbaz"
#   rest_size: 6
scanner.unscan
# => #<StringScanner 0/9 @ "fooba...">
put_situation(scanner)
# Situation:
#   pos:       0
#   charpos:   0
#   rest:      "foobarbaz"
#   rest_size: 9

Raises an exception if match values are clear:

scanner.scan(/nope/)           # => nil
match_values_cleared?(scanner) # => true
scanner.unscan                 # Raises StringScanner::Error.
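
One way to handle the failure case is to rescue the exception, as in the sketch below; checking #matched? before calling #unscan avoids the exception altogether. The variable names are illustrative only.

```ruby
require 'strscan'

scanner = StringScanner.new('foobarbaz')
scanner.scan(/nope/) # => nil; match values are now clear

# Capture the exception instead of letting it propagate.
error =
  begin
    scanner.unscan
    nil
  rescue StringScanner::Error => e
    e
  end

error_class = error.class # => StringScanner::Error
pos_after   = scanner.pos # => 0 (unchanged)

# Guarded form: only rewind when there is a match to undo.
scanner.unscan if scanner.matched?
```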

  
# File 'ext/strscan/strscan.c', line 1399

static VALUE
strscan_unscan(VALUE self)
{
    struct strscanner *p;

    GET_SCANNER(self, p);
    if (! MATCHED_P(p))
        rb_raise(ScanError, "unscan failed: previous match record not exist");
    p->curr = p->prev;
    CLEAR_MATCH_STATUS(p);
    return self;
}

#values_at(*specifiers) ⇒ array_of_captures?

Returns an array of captured substrings, or nil if there is no current match.

For each specifier, the returned substring is [specifier]; see #[].

scanner = StringScanner.new('Fri Dec 12 1975 14:39')
pattern = /(?<wday>\w+) (?<month>\w+) (?<day>\d+) /
scanner.match?(pattern)
scanner.values_at(*0..3)               # => ["Fri Dec 12 ", "Fri", "Dec", "12"]
scanner.values_at(*%i[wday month day]) # => ["Fri", "Dec", "12"]
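
Specifiers of different kinds can be mixed in one call, and the method returns nil whenever there is no current match. A brief sketch of both behaviors:

```ruby
require 'strscan'

scanner = StringScanner.new('Fri Dec 12 1975 14:39')

# No match has been attempted yet, so there is nothing to return.
before = scanner.values_at(0, 1) # => nil

scanner.scan(/(?<wday>\w+) (?<month>\w+) (?<day>\d+) /)

# Integer, Symbol, String, and negative Integer specifiers in one call.
mixed = scanner.values_at(0, :day, 'wday', -1)
# => ["Fri Dec 12 ", "12", "Fri", "12"]

# A failed match attempt clears the match values again.
scanner.scan(/nope/)
after = scanner.values_at(0, 1) # => nil
```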

  
# File 'ext/strscan/strscan.c', line 1837

static VALUE
strscan_values_at(int argc, VALUE *argv, VALUE self)
{
    struct strscanner *p;
    long i;
    VALUE new_ary;

    GET_SCANNER(self, p);
    if (! MATCHED_P(p))        return Qnil;

    new_ary = rb_ary_new2(argc);
    for (i = 0; i<argc; i++) {
        rb_ary_push(new_ary, strscan_aref(self, argv[i]));
    }

    return new_ary;
}