
Class: Ripper

Relationships & Source Files
Inherits: Object
Defined in: ext/ripper/lib/ripper.rb,
parse.y,
ext/ripper/lib/ripper/core.rb,
ext/ripper/lib/ripper/filter.rb,
ext/ripper/lib/ripper/lexer.rb,
ext/ripper/lib/ripper/sexp.rb

Overview

Ripper is a Ruby script parser.

You can get information from the parser in an event-based style, such as abstract syntax trees or a simple lexical analysis of the Ruby program.
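
As a minimal sketch of the event-based style, the subclass below overrides one parser event handler. MethodNameCollector and its names method are hypothetical names chosen for illustration; the sketch assumes the default scanner handlers, which pass the token string through, so on_def receives the method name as a String.

```ruby
require 'ripper'

# Hypothetical subclass for illustration: records the name of every
# method defined with `def`.
class MethodNameCollector < Ripper
  def names
    @names ||= []
  end

  private

  # Parser event fired once per `def`. With the default scanner handlers,
  # `name` is the identifier token string.
  def on_def(name, _params, _body)
    names << name
    name
  end
end

collector = MethodNameCollector.new("def hello(world) end\ndef bye; end\n")
collector.parse
p collector.names  # ["hello", "bye"]
```

Nothing is collected until parse is called; the handlers run while the source is being parsed.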

Usage

Ripper provides an easy interface for parsing your program into a symbolic expression tree (or S-expression).

Understanding the output of the parser can be a challenge, so it’s recommended that you use PP to format the output for legibility.

require 'ripper'
require 'pp'

pp Ripper.sexp('def hello(world) "Hello, #{world}!"; end')
  #=> [:program,
       [[:def,
         [:@ident, "hello", [1, 4]],
         [:paren,
          [:params, [[:@ident, "world", [1, 10]]], nil, nil, nil, nil, nil, nil]],
         [:bodystmt,
          [[:string_literal,
            [:string_content,
             [:@tstring_content, "Hello, ", [1, 18]],
             [:string_embexpr, [[:var_ref, [:@ident, "world", [1, 27]]]]],
             [:@tstring_content, "!", [1, 33]]]]],
          nil,
          nil,
          nil]]]]

As the example above shows, the expression starts with :program.

From here comes a method definition at :def, followed by the method’s identifier :@ident. After the method’s identifier come the parentheses :paren and the method parameters under :params.

Next is the method body, starting at :bodystmt (stmt meaning statement), which contains the body of the method.

In our case, we’re simply returning a String, so next we have the :string_literal expression.

Within our :string_literal you’ll notice two :@tstring_content nodes; these are the literal parts, Hello, and !. Between the two :@tstring_content nodes is a :string_embexpr, where embexpr stands for embedded expression. Our expression consists of a local variable reference, or :var_ref, with the identifier (:@ident) world.
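
The walk-through above can be followed in code by indexing into the nested arrays. This is only a sketch: the index positions are read off the example output shown earlier and may shift if the tree shape changes between Ruby versions. The variable names are illustrative.

```ruby
require 'ripper'

sexp = Ripper.sexp('def hello(world) "Hello, #{world}!"; end')

# sexp is [:program, [statement, ...]]
_program, statements = sexp
def_node = statements.first   # [:def, ident, paren, bodystmt]

# [:@ident, "hello", [1, 4]] -> "hello"
name = def_node[1][1]

# [:paren, [:params, [[:@ident, "world", [1, 10]]], ...]] -> ["world"]
params = def_node[2][1][1].map { |ident| ident[1] }

p name    # "hello"
p params  # ["world"]
```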

Resources

Requirements

  • ruby 1.9 (support CVS HEAD only)

  • bison 1.28 or later (Other yaccs do not work)

License

Ruby License.

Constructor Details

.new(src, filename = "(ripper)", lineno = 1) ⇒ Ripper

Creates a new Ripper object. src must be a String, an IO, or an Object that has a #gets method.

This method does not start parsing. See also #parse and .parse.
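
A short sketch of the constructor with a non-String source. StringIO stands in for "an object that has a #gets method"; the file name and line number are arbitrary values chosen for illustration.

```ruby
require 'ripper'
require 'stringio'

# Any object that responds to #gets can serve as the source.
io = StringIO.new("x = 1\n")
parser = Ripper.new(io, "example.rb", 10)

p parser.filename  # "example.rb"
p parser.error?    # false (nothing has been parsed yet)

parser.parse       # parsing happens only here
p parser.error?    # false (the source is valid)
```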


  
# File 'parse.y', line 14425

static VALUE
ripper_initialize(int argc, VALUE *argv, VALUE self)
{
    struct parser_params *p;
    VALUE src, fname, lineno;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    rb_scan_args(argc, argv, "12", &src, &fname, &lineno);
    if (RB_TYPE_P(src, T_FILE)) {
        p->lex.gets = ripper_lex_io_get;
    }
    else if (rb_respond_to(src, id_gets)) {
        p->lex.gets = ripper_lex_get_generic;
    }
    else {
        StringValue(src);
        p->lex.gets = lex_get_str;
    }
    p->lex.input = src;
    p->eofp = 0;
    if (NIL_P(fname)) {
        fname = STR_NEW2("(ripper)");
	OBJ_FREEZE(fname);
    }
    else {
	StringValueCStr(fname);
	fname = rb_str_new_frozen(fname);
    }
    parser_initialize(p);

    p->ruby_sourcefile_string = fname;
    p->ruby_sourcefile = RSTRING_PTR(fname);
    p->ruby_sourceline = NIL_P(lineno) ? 0 : NUM2INT(lineno) - 1;

    return Qnil;
}

Class Method Details

.dedent_string(input, width) ⇒ Integer (private) Also known as: #dedent_string

USE OF RIPPER LIBRARY ONLY.

Strips up to width leading whitespace characters from input, and returns the stripped column width.


  
# File 'parse.y', line 8256

static VALUE
parser_dedent_string(VALUE self, VALUE input, VALUE width)
{
    int wid, col;

    StringValue(input);
    wid = NUM2UINT(width);
    col = dedent_string(input, wid);
    return INT2NUM(col);
}

.lex(src, filename = '-', lineno = 1, **kw)

Tokenizes the Ruby program and returns an array of arrays, each formatted like [[lineno, column], type, token, state]. The filename argument is mostly ignored. By default, this method does not handle syntax errors in src; use the raise_errors keyword to raise a SyntaxError for an error in src.

require 'ripper'
require 'pp'

pp Ripper.lex("def m(a) nil end")
#=> [[[1,  0], :on_kw,     "def", FNAME    ],
     [[1,  3], :on_sp,     " ",   FNAME    ],
     [[1,  4], :on_ident,  "m",   ENDFN    ],
     [[1,  5], :on_lparen, "(",   BEG|LABEL],
     [[1,  6], :on_ident,  "a",   ARG      ],
     [[1,  7], :on_rparen, ")",   ENDFN    ],
     [[1,  8], :on_sp,     " ",   BEG      ],
     [[1,  9], :on_kw,     "nil", END      ],
     [[1, 12], :on_sp,     " ",   END      ],
     [[1, 13], :on_kw,     "end", END      ]]
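
Because each element is a plain array, the result can be filtered with ordinary Enumerable methods. The following is a sketch of picking out tokens by type and looking up a position; the variable names are illustrative.

```ruby
require 'ripper'

tokens = Ripper.lex("def m(a) nil end")

# Each element is [[lineno, column], type, token, state].
identifiers = tokens.select { |_pos, type, _tok, _state| type == :on_ident }
                    .map { |_pos, _type, tok, _state| tok }
p identifiers  # ["m", "a"]

# Find where a given token starts.
pos, = tokens.find { |_pos, _type, tok, _state| tok == "nil" }
p pos  # [1, 9]
```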

  
# File 'ext/ripper/lib/ripper/lexer.rb', line 51

def Ripper.lex(src, filename = '-', lineno = 1, **kw)
  Lexer.new(src, filename, lineno).lex(**kw)
end

.lex_state_name(integer) ⇒ String

Returns a string representation of lex_state.
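
A brief sketch, assuming the Ripper::EXPR_BEG and Ripper::EXPR_LABEL integer constants are available. The outputs shown follow the state names that appear in the .lex example output; the exact strings may differ between Ruby versions.

```ruby
require 'ripper'

beg_name = Ripper.lex_state_name(Ripper::EXPR_BEG)
combined = Ripper.lex_state_name(Ripper::EXPR_BEG | Ripper::EXPR_LABEL)

p beg_name  # "BEG"
p combined  # "BEG|LABEL"
```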


  
# File 'parse.y', line 14640

static VALUE
ripper_lex_state_name(VALUE self, VALUE state)
{
    return rb_parser_lex_state_name(NUM2INT(state));
}

.parse(src, filename = '(ripper)', lineno = 1)

Parses the given Ruby program read from src. src must be a String, an IO, or an object with a #gets method.


  
# File 'ext/ripper/lib/ripper/core.rb', line 18

def Ripper.parse(src, filename = '(ripper)', lineno = 1)
  new(src, filename, lineno).parse
end

.sexp(src, filename = '-', lineno = 1, raise_errors: false)

EXPERIMENTAL

Parses src and creates an S-expression tree. Returns a more readable tree than .sexp_raw. This method is mainly for developer use. The filename argument is mostly ignored. By default, this method does not handle syntax errors in src and returns nil in such cases. Use the raise_errors keyword to raise a SyntaxError for an error in src.

require 'ripper'
require 'pp'

pp Ripper.sexp("def m(a) nil end")
  #=> [:program,
       [[:def,
        [:@ident, "m", [1, 4]],
        [:paren, [:params, [[:@ident, "a", [1, 6]]], nil, nil, nil, nil, nil, nil]],
        [:bodystmt, [[:var_ref, [:@kw, "nil", [1, 9]]]], nil, nil, nil]]]]
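
The two error-handling modes described above can be sketched as follows. The broken source string is an illustrative example that is missing its closing end.

```ruby
require 'ripper'

broken = "def m(a) nil"   # missing `end`

# Default: a syntax error yields nil instead of a tree.
result = Ripper.sexp(broken)
p result  # nil

# With raise_errors: true the same input raises SyntaxError.
message = nil
begin
  Ripper.sexp(broken, raise_errors: true)
rescue SyntaxError => e
  message = e.message
end
puts "SyntaxError: #{message}"
```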

  
# File 'ext/ripper/lib/ripper/sexp.rb', line 35

def Ripper.sexp(src, filename = '-', lineno = 1, raise_errors: false)
  builder = SexpBuilderPP.new(src, filename, lineno)
  sexp = builder.parse
  if builder.error?
    if raise_errors
      raise SyntaxError, builder.error
    end
  else
    sexp
  end
end

.sexp_raw(src, filename = '-', lineno = 1, raise_errors: false)

EXPERIMENTAL

Parses src and creates an S-expression tree. This method is mainly for developer use. The filename argument is mostly ignored. By default, this method does not handle syntax errors in src and returns nil in such cases. Use the raise_errors keyword to raise a SyntaxError for an error in src.

require 'ripper'
require 'pp'

pp Ripper.sexp_raw("def m(a) nil end")
  #=> [:program,
       [:stmts_add,
        [:stmts_new],
        [:def,
         [:@ident, "m", [1, 4]],
         [:paren, [:params, [[:@ident, "a", [1, 6]]], nil, nil, nil, nil, nil, nil]],
         [:bodystmt,
          [:stmts_add, [:stmts_new], [:var_ref, [:@kw, "nil", [1, 9]]]],
          nil,
          nil,
          nil]]]]
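
To see how the raw tree differs from the one returned by .sexp, it can help to compare both on a very small input. This is a sketch using a single integer literal as the source.

```ruby
require 'ripper'
require 'pp'

raw    = Ripper.sexp_raw("1")
pretty = Ripper.sexp("1")

# The raw tree keeps the :stmts_new / :stmts_add construction events.
pp raw     # [:program, [:stmts_add, [:stmts_new], [:@int, "1", [1, 0]]]]

# .sexp flattens them into a plain array of statements.
pp pretty  # [:program, [[:@int, "1", [1, 0]]]]
```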

  
# File 'ext/ripper/lib/ripper/sexp.rb', line 71

def Ripper.sexp_raw(src, filename = '-', lineno = 1, raise_errors: false)
  builder = SexpBuilder.new(src, filename, lineno)
  sexp = builder.parse
  if builder.error?
    if raise_errors
      raise SyntaxError, builder.error
    end
  else
    sexp
  end
end

.slice(src, pattern, n = 0)

EXPERIMENTAL

Parses src and returns the string matched by pattern. pattern should be written in a Regexp-like syntax.

require 'ripper'

p Ripper.slice('def m(a) nil end', 'ident')                   #=> "m"
p Ripper.slice('def m(a) nil end', '[ident lparen rparen]+')  #=> "m(a)"
p Ripper.slice("<<EOS\nstring\nEOS",
               'heredoc_beg nl $(tstring_content*) heredoc_end', 1)
    #=> "string\n"

  
# File 'ext/ripper/lib/ripper/lexer.rb', line 275

def Ripper.slice(src, pattern, n = 0)
  if m = token_match(src, pattern)
  then m.string(n)
  else nil
  end
end

.token_match(src, pattern)

This method is for internal use only.

  
# File 'ext/ripper/lib/ripper/lexer.rb', line 282

def Ripper.token_match(src, pattern)   #:nodoc:
  TokenPattern.compile(pattern).match(src)
end

.tokenize(src, filename = '-', lineno = 1, **kw)

Tokenizes the Ruby program and returns an array of strings. The filename and lineno arguments are mostly ignored, since the return value is just the tokenized input. By default, this method does not handle syntax errors in src; use the raise_errors keyword to raise a SyntaxError for an error in src.

p Ripper.tokenize("def m(a) nil end")
   # => ["def", " ", "m", "(", "a", ")", " ", "nil", " ", "end"]
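
Since the return value is just the tokenized input, joining the tokens gives back the original source for valid input. A small sketch, which also drops whitespace-only tokens:

```ruby
require 'ripper'

src = "def m(a) nil end"
tokens = Ripper.tokenize(src)

# The tokens cover the whole input, whitespace included.
p tokens.join == src  # true

# Discard whitespace-only tokens.
significant = tokens.reject { |tok| tok.strip.empty? }
p significant  # ["def", "m", "(", "a", ")", "nil", "end"]
```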

  
# File 'ext/ripper/lib/ripper/lexer.rb', line 25

def Ripper.tokenize(src, filename = '-', lineno = 1, **kw)
  Lexer.new(src, filename, lineno).tokenize(**kw)
end

Instance Attribute Details

#debug_output ⇒ Object (rw)

Get debug output.


  
# File 'parse.y', line 14039

VALUE
rb_parser_get_debug_output(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    return p->debug_output;
}

#debug_output=(obj) (rw)

Set debug output.


  
# File 'parse.y', line 14054

VALUE
rb_parser_set_debug_output(VALUE self, VALUE output)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    return p->debug_output = output;
}

#end_seen? ⇒ Boolean (readonly)

Return true if the parsed source ended with __END__.
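
A minimal sketch of checking for __END__ after a parse. The source strings and variable names are illustrative.

```ruby
require 'ripper'

with_end = Ripper.new("puts 1\n__END__\ndata\n")
with_end.parse
p with_end.end_seen?     # true

without_end = Ripper.new("puts 1\n")
without_end.parse
p without_end.end_seen?  # false
```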


  
# File 'parse.y', line 13976

VALUE
rb_parser_end_seen_p(VALUE vparser)
{
    struct parser_params *p;

    TypedData_Get_Struct(vparser, struct parser_params, &parser_data_type, p);
    return RBOOL(p->ruby__end__seen);
}

#error? ⇒ Boolean (readonly)

Return true if parsed source has errors.


  
# File 'parse.y', line 13960

static VALUE
ripper_error_p(VALUE vparser)
{
    struct parser_params *p;

    TypedData_Get_Struct(vparser, struct parser_params, &parser_data_type, p);
    return RBOOL(p->error_p);
}

#yydebug ⇒ Boolean (rw)

Get yydebug.


  
# File 'parse.y', line 14007

VALUE
rb_parser_get_yydebug(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    return RBOOL(p->debug);
}

#yydebug=(flag) (rw)

Set yydebug.


  
# File 'parse.y', line 14023

VALUE
rb_parser_set_yydebug(VALUE self, VALUE flag)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    p->debug = RTEST(flag);
    return flag;
}

Instance Method Details

#_dispatch_0 (private)

This method is for internal use only.

  
# File 'ext/ripper/lib/ripper/core.rb', line 34

def _dispatch_0() nil end

#_dispatch_1(a) (private)

This method is for internal use only.

  
# File 'ext/ripper/lib/ripper/core.rb', line 35

def _dispatch_1(a) a end

#_dispatch_2(a, b) (private)

This method is for internal use only.

  
# File 'ext/ripper/lib/ripper/core.rb', line 36

def _dispatch_2(a, b) a end

#_dispatch_3(a, b, c) (private)

This method is for internal use only.

  
# File 'ext/ripper/lib/ripper/core.rb', line 37

def _dispatch_3(a, b, c) a end

#_dispatch_4(a, b, c, d) (private)

This method is for internal use only.

  
# File 'ext/ripper/lib/ripper/core.rb', line 38

def _dispatch_4(a, b, c, d) a end

#_dispatch_5(a, b, c, d, e) (private)

This method is for internal use only.

  
# File 'ext/ripper/lib/ripper/core.rb', line 39

def _dispatch_5(a, b, c, d, e) a end

#_dispatch_6(a, b, c, d, e, f) (private)

This method is for internal use only.

  
# File 'ext/ripper/lib/ripper/core.rb', line 40

def _dispatch_6(a, b, c, d, e, f) a end

#_dispatch_7(a, b, c, d, e, f, g) (private)

This method is for internal use only.

  
# File 'ext/ripper/lib/ripper/core.rb', line 41

def _dispatch_7(a, b, c, d, e, f, g) a end

#assert_Qundef(obj, msg)

This method is for internal use only.

  
# File 'parse.y', line 14616

static VALUE
ripper_assert_Qundef(VALUE self, VALUE obj, VALUE msg)
{
    StringValue(msg);
    if (UNDEF_P(obj)) {
        rb_raise(rb_eArgError, "%"PRIsVALUE, msg);
    }
    return Qnil;
}

#column ⇒ Integer

Return column number of current parsing line. This number starts from 0.


  
# File 'parse.y', line 14520

static VALUE
ripper_column(VALUE self)
{
    struct parser_params *p;
    long col;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(p->parsing_thread)) return Qnil;
    col = p->lex.ptok - p->lex.pbeg;
    return LONG2NUM(col);
}

#compile_error(msg) (private)

This method is called when the parser finds a syntax error.
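
A sketch of collecting error messages in a subclass. ErrorCollector and its errors method are hypothetical names. The sketch also overrides on_parse_error, a parser event handler that is not described in this section; treating it as the hook for ordinary syntax errors is an assumption here, and both handlers are routed to the same list so that either path is captured.

```ruby
require 'ripper'

# Hypothetical subclass for illustration: gathers error messages
# reported while parsing.
class ErrorCollector < Ripper
  def errors
    @errors ||= []
  end

  private

  # Assumption: parser event dispatched for syntax errors.
  def on_parse_error(msg)
    errors << msg
  end

  # Hook documented above.
  def compile_error(msg)
    errors << msg
  end
end

collector = ErrorCollector.new("def m(a) nil")  # missing `end`
collector.parse

p collector.error?         # true
p collector.errors.empty?  # false
```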


  
# File 'ext/ripper/lib/ripper/core.rb', line 63

def compile_error(msg)
end

#dedent_string(input, width) ⇒ Integer (private)

Alias for .dedent_string.

#encoding ⇒ Encoding

Return encoding of the source.


  
# File 'parse.y', line 13991

VALUE
rb_parser_encoding(VALUE vparser)
{
    struct parser_params *p;

    TypedData_Get_Struct(vparser, struct parser_params, &parser_data_type, p);
    return rb_enc_from_encoding(p->enc);
}

#filename ⇒ String

Return current parsing filename.


  
# File 'parse.y', line 14541

static VALUE
ripper_filename(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    return p->ruby_sourcefile_string;
}

#lineno ⇒ Integer

Return line number of current parsing line. This number starts from 1.
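
#lineno and #column are most useful inside an event handler, where they describe the token being scanned. The sketch below records the position of each keyword; KeywordLocator and its locations method are hypothetical names.

```ruby
require 'ripper'

# Hypothetical subclass for illustration: records where each keyword starts.
class KeywordLocator < Ripper
  def locations
    @locations ||= []
  end

  private

  # Scanner event for keywords; lineno is 1-based, column is 0-based.
  def on_kw(tok)
    locations << [tok, lineno, column]
    tok
  end
end

locator = KeywordLocator.new("def m\n  nil\nend\n")
locator.parse
p locator.locations
# [["def", 1, 0], ["nil", 2, 2], ["end", 3, 0]]
```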


  
# File 'parse.y', line 14560

static VALUE
ripper_lineno(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(p->parsing_thread)) return Qnil;
    return INT2NUM(p->ruby_sourceline);
}

#parse

Starts parsing and returns the value of the root action.
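
To illustrate "the value of the root action", the sketch below threads a counter through the statement-list events and returns it from on_program, so that #parse returns the number of top-level statements. StatementCounter is a hypothetical name, and the sketch assumes source without nested statement lists, which share the same events.

```ruby
require 'ripper'

# Hypothetical subclass for illustration: counts top-level statements.
class StatementCounter < Ripper
  private

  # An empty statement list starts the count at zero.
  def on_stmts_new
    0
  end

  # Each statement appended to the list increments the count.
  def on_stmts_add(count, _stmt)
    count + 1
  end

  # Root action: its return value becomes the result of #parse.
  def on_program(count)
    count
  end
end

count = StatementCounter.new("a = 1\nb = 2\n").parse
p count  # 2
```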


  
# File 'parse.y', line 14492

static VALUE
ripper_parse(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (!NIL_P(p->parsing_thread)) {
        if (p->parsing_thread == rb_thread_current())
            rb_raise(rb_eArgError, "Ripper#parse is not reentrant");
        else
            rb_raise(rb_eArgError, "Ripper#parse is not multithread-safe");
    }
    p->parsing_thread = rb_thread_current();
    rb_ensure(ripper_parse0, self, ripper_ensure, self);

    return p->result;
}

#rawVALUE(obj)

This method is for internal use only.

  
# File 'parse.y', line 14627

static VALUE
ripper_value(VALUE self, VALUE obj)
{
    return ULONG2NUM(obj);
}

#state ⇒ Integer

Return scanner state of current token.


  
# File 'parse.y', line 14579

static VALUE
ripper_state(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
	rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(p->parsing_thread)) return Qnil;
    return INT2NUM(p->lex.state);
}

#token ⇒ String

Return the current token string.


  
# File 'parse.y', line 14598

static VALUE
ripper_token(VALUE self)
{
    struct parser_params *p;
    long pos, len;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(p->parsing_thread)) return Qnil;
    pos = p->lex.ptok - p->lex.pbeg;
    len = p->lex.pcur - p->lex.ptok;
    return rb_str_subseq(p->lex.lastline, pos, len);
}

#validate_object(x)

This method is for internal use only.

  
# File 'parse.y', line 14242

static VALUE
ripper_validate_object(VALUE self, VALUE x)
{
    if (x == Qfalse) return x;
    if (x == Qtrue) return x;
    if (NIL_P(x)) return x;
    if (UNDEF_P(x))
	rb_raise(rb_eArgError, "Qundef given");
    if (FIXNUM_P(x)) return x;
    if (SYMBOL_P(x)) return x;
    switch (BUILTIN_TYPE(x)) {
      case T_STRING:
      case T_OBJECT:
      case T_ARRAY:
      case T_BIGNUM:
      case T_FLOAT:
      case T_COMPLEX:
      case T_RATIONAL:
	break;
      case T_NODE:
	if (!nd_type_p((NODE *)x, NODE_RIPPER)) {
	    rb_raise(rb_eArgError, "NODE given: %p", (void *)x);
	}
	x = ((NODE *)x)->nd_rval;
	break;
      default:
	rb_raise(rb_eArgError, "wrong type of ruby object: %p (%s)",
		 (void *)x, rb_obj_classname(x));
    }
    if (!RBASIC_CLASS(x)) {
	rb_raise(rb_eArgError, "hidden ruby object: %p (%s)",
		 (void *)x, rb_builtin_type_name(TYPE(x)));
    }
    return x;
}

#warn(fmt, *args) (private)

This method is called when a weak warning is produced by the parser. fmt and args are printf-style.


  
# File 'ext/ripper/lib/ripper/core.rb', line 54

def warn(fmt, *args)
end

#warning(fmt, *args) (private)

This method is called when a strong warning is produced by the parser. fmt and args are printf-style.


  
# File 'ext/ripper/lib/ripper/core.rb', line 59

def warning(fmt, *args)
end