Class: Ripper
Relationships & Source Files | |
Inherits: | Object |
Defined in: | ext/ripper/lib/ripper.rb, parse.y, ext/ripper/lib/ripper/core.rb, ext/ripper/lib/ripper/filter.rb, ext/ripper/lib/ripper/lexer.rb, ext/ripper/lib/ripper/sexp.rb |
Overview
Ripper is a Ruby script parser.
You can get information from the parser in an event-based style, such as abstract syntax trees or a simple lexical analysis of the Ruby program.
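As a minimal sketch of the event-based style: a subclass defines an on_<event> method and Ripper calls it while parsing. The class name MethodNameCollector and its names accessor are hypothetical, introduced only for illustration; the sketch assumes the default scanner handlers pass the token text through to on_def.

```ruby
require 'ripper'

# Hypothetical subclass: records the name of every method definition.
class MethodNameCollector < Ripper
  def names
    @names ||= []
  end

  # Called once per :def parser event with the results of the child events.
  def on_def(name, _params, _body)
    names << name
    nil
  end
end

collector = MethodNameCollector.new("def a; end\ndef b(x); x; end")
collector.parse
p collector.names
```

Handlers that are not overridden keep their default behaviour, so a subclass only needs to define the events it cares about.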
Usage
Ripper provides an easy interface for parsing your program into a symbolic expression tree (or S-expression).
Understanding the output of the parser may come as a challenge, so it's recommended you use PP to format the output for legibility.
require 'ripper'
require 'pp'
pp Ripper.sexp('def hello(world) "Hello, #{world}!"; end')
#=> [:program,
     [[:def,
       [:@ident, "hello", [1, 4]],
       [:paren,
        [:params, [[:@ident, "world", [1, 10]]], nil, nil, nil, nil, nil, nil]],
       [:bodystmt,
        [[:string_literal,
          [:string_content,
           [:@tstring_content, "Hello, ", [1, 18]],
           [:string_embexpr, [[:var_ref, [:@ident, "world", [1, 27]]]]],
           [:@tstring_content, "!", [1, 33]]]]],
        nil,
        nil,
        nil]]]]
You can see in the example above that the expression starts with :program.
From here, there is a method definition at :def, followed by the method's identifier :@ident. After the method's identifier come the parentheses :paren and the method parameters under :params.
Next is the method body, starting at :bodystmt (stmt meaning statement), which contains the full definition of the method.
In our case, we're simply returning a String, so next we have the :string_literal expression.
Within our :string_literal you'll notice two :@tstring_content nodes; these are the literal parts "Hello, " and "!". Between the two :@tstring_content nodes is a :string_embexpr, where embexpr means embedded expression. Our expression consists of a local variable reference, or :var_ref, with the identifier (:@ident) of world.
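Because the S-expression is made of plain nested arrays, it can be walked with ordinary recursion. The sketch below collects the text of every :@ident node from the tree shown above; the helper name collect_idents is hypothetical.

```ruby
require 'ripper'

# Hypothetical helper: gather the text of every :@ident node, depth first.
def collect_idents(node, found = [])
  return found unless node.is_a?(Array)

  if node[0] == :@ident
    found << node[1]          # node is [:@ident, "name", [line, column]]
  else
    node.each { |child| collect_idents(child, found) }
  end
  found
end

tree = Ripper.sexp('def hello(world) "Hello, #{world}!"; end')
p collect_idents(tree)  # => ["hello", "world", "world"]
```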
Resources
Requirements
- ruby 1.9 (support CVS HEAD only)
- bison 1.28 or later (Other yaccs do not work)
License
Ruby License.
- Minero Aoki
- aamine@loveruby.net
Constant Summary
- EVENTS = PARSER_EVENTS + SCANNER_EVENTS
  This array contains the names of all Ripper events.
- PARSER_EVENTS = PARSER_EVENT_TABLE.keys
  This array contains the names of the parser events.
- SCANNER_EVENTS = SCANNER_EVENT_TABLE.keys
  This array contains the names of the scanner events.
- Version = rb_usascii_str_new2(RIPPER_VERSION)
  Version of Ripper.
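A short sketch of how the event constants relate to each other. The event names are symbols without the on_ prefix; the corresponding handler method is named on_<event>.

```ruby
require 'ripper'

# EVENTS is the concatenation of the two more specific lists.
p Ripper::EVENTS == Ripper::PARSER_EVENTS + Ripper::SCANNER_EVENTS  # => true

# :def is a parser event, :ident is a scanner event.
p Ripper::PARSER_EVENTS.include?(:def)     # => true
p Ripper::SCANNER_EVENTS.include?(:ident)  # => true
```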
Class Method Summary
- .dedent_string(input, width) ⇒ Integer (also: #dedent_string)
  USE OF RIPPER LIBRARY ONLY.
- .lex(src, filename = '-', lineno = 1, **kw)
  Tokenizes the Ruby program and returns an array of arrays, each formatted like [[lineno, column], type, token, state].
- .lex_state_name(integer) ⇒ String
  Returns a string representation of lex_state.
- .new(src, filename = "(ripper)", lineno = 1) ⇒ Ripper (constructor)
  Create a new Ripper object.
- .parse(src, filename = '(ripper)', lineno = 1)
  Parses the given Ruby program read from src.
- .sexp(src, filename = '-', lineno = 1, raise_errors: false) (EXPERIMENTAL)
  Parses src and creates an S-exp tree.
- .sexp_raw(src, filename = '-', lineno = 1, raise_errors: false) (EXPERIMENTAL)
  Parses src and creates an S-exp tree.
- .slice(src, pattern, n = 0) (EXPERIMENTAL)
  Parses src and returns the string that matched pattern.
- .tokenize(src, filename = '-', lineno = 1, **kw)
  Tokenizes the Ruby program and returns an array of strings.
- .token_match(src, pattern) Internal use only
Instance Attribute Summary
- #debug_output ⇒ Object (rw)
  Get debug output.
- #debug_output=(obj) (rw)
  Set debug output.
- #end_seen? ⇒ Boolean (readonly)
  Return true if parsed source ended by __END__.
- #error? ⇒ Boolean (readonly)
  Return true if parsed source has errors.
- #yydebug ⇒ Boolean (rw)
  Get yydebug.
- #yydebug=(flag) (rw)
  Set yydebug.
Instance Method Summary
- #column ⇒ Integer
  Return column number of current parsing line.
- #encoding ⇒ Encoding
  Return encoding of the source.
- #filename ⇒ String
  Return current parsing filename.
- #lineno ⇒ Integer
  Return line number of current parsing line.
- #parse
  Starts parsing and returns the value of the root action.
- #state ⇒ Integer
  Return scanner state of current token.
- #token ⇒ String
  Return the current token string.
- #compile_error(msg) (private)
  This method is called when the parser finds a syntax error.
- #dedent_string(input, width) ⇒ Integer (private)
  Alias for .dedent_string.
- #warn(fmt, *args) (private)
  This method is called when a weak warning is produced by the parser.
- #warning(fmt, *args) (private)
  This method is called when a strong warning is produced by the parser.
- #assert_Qundef(obj, msg) Internal use only
- #rawVALUE(obj) Internal use only
- #validate_object(x) Internal use only
- #_dispatch_0 private Internal use only
- #_dispatch_1(a) private Internal use only
- #_dispatch_2(a, b) private Internal use only
- #_dispatch_3(a, b, c) private Internal use only
- #_dispatch_4(a, b, c, d) private Internal use only
- #_dispatch_5(a, b, c, d, e) private Internal use only
- #_dispatch_6(a, b, c, d, e, f) private Internal use only
- #_dispatch_7(a, b, c, d, e, f, g) private Internal use only
Constructor Details
.new(src, filename = "(ripper)", lineno = 1) ⇒ Ripper
# File 'parse.y', line 13610
static VALUE
ripper_initialize(int argc, VALUE *argv, VALUE self)
{
    struct parser_params *p;
    VALUE src, fname, lineno;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    rb_scan_args(argc, argv, "12", &src, &fname, &lineno);
    if (RB_TYPE_P(src, T_FILE)) {
        p->lex.gets = ripper_lex_io_get;
    }
    else if (rb_respond_to(src, id_gets)) {
        p->lex.gets = ripper_lex_get_generic;
    }
    else {
        StringValue(src);
        p->lex.gets = lex_get_str;
    }
    p->lex.input = src;
    p->eofp = 0;
    if (NIL_P(fname)) {
        fname = STR_NEW2("(ripper)");
        OBJ_FREEZE(fname);
    }
    else {
        StringValueCStr(fname);
        fname = rb_str_new_frozen(fname);
    }
    parser_initialize(p);

    p->ruby_sourcefile_string = fname;
    p->ruby_sourcefile = RSTRING_PTR(fname);
    p->ruby_sourceline = NIL_P(lineno) ? 0 : NUM2INT(lineno) - 1;

    return Qnil;
}
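A brief sketch of the constructor arguments, following the C source above: when no filename is given it defaults to "(ripper)", and a given filename is what #filename reports. The file name example.rb is made up for illustration.

```ruby
require 'ripper'

unnamed = Ripper.new("1 + 1")
p unnamed.filename  # => "(ripper)"

# Optional second and third arguments: file name and starting line number.
named = Ripper.new("1 + 1", "example.rb", 10)
p named.filename    # => "example.rb"
```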
Class Method Details
.dedent_string(input, width) ⇒ Integer
(private) Also known as: #dedent_string
USE OF RIPPER LIBRARY ONLY.
Strips up to width leading whitespace characters from input, and returns the stripped column width.
# File 'parse.y', line 7594
static VALUE
parser_dedent_string(VALUE self, VALUE input, VALUE width)
{
    int wid, col;

    StringValue(input);
    wid = NUM2UINT(width);
    col = dedent_string(input, wid);
    return INT2NUM(col);
}
.lex(src, filename = '-', lineno = 1, **kw)
Tokenizes the Ruby program and returns an array of arrays, each formatted like [[lineno, column], type, token, state]. The filename argument is mostly ignored. By default, this method does not handle syntax errors in src; use the raise_errors keyword to raise a SyntaxError for an error in src.
require 'ripper'
require 'pp'
pp Ripper.lex("def m(a) nil end")
#=> [[[1,  0], :on_kw,     "def", FNAME    ],
#    [[1,  3], :on_sp,     " ",   FNAME    ],
#    [[1,  4], :on_ident,  "m",   ENDFN    ],
#    [[1,  5], :on_lparen, "(",   BEG|LABEL],
#    [[1,  6], :on_ident,  "a",   ARG      ],
#    [[1,  7], :on_rparen, ")",   ENDFN    ],
#    [[1,  8], :on_sp,     " ",   BEG      ],
#    [[1,  9], :on_kw,     "nil", END      ],
#    [[1, 12], :on_sp,     " ",   END      ],
#    [[1, 13], :on_kw,     "end", END      ]]
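Since each entry is a plain four-element array, the result can be filtered with ordinary Enumerable methods. A small sketch that keeps only the identifier tokens from the example above:

```ruby
require 'ripper'

tokens = Ripper.lex("def m(a) nil end")

# Each entry is [[lineno, column], type, token, state].
idents = tokens.select { |_pos, type, _tok, _state| type == :on_ident }
               .map { |_pos, _type, tok, _state| tok }
p idents  # => ["m", "a"]
```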
.lex_state_name(integer) ⇒ String
Returns a string representation of lex_state.
# File 'parse.y', line 13825
static VALUE
ripper_lex_state_name(VALUE self, VALUE state)
{
    return rb_parser_lex_state_name(NUM2INT(state));
}
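A minimal sketch of calling this method. It assumes the integer state constants Ripper::EXPR_BEG and Ripper::EXPR_LABEL, which are not listed on this page; the exact strings returned may differ between Ruby versions, so no output is shown.

```ruby
require 'ripper'

# Assumption: Ripper exposes the lexer states as integer EXPR_* constants.
single   = Ripper.lex_state_name(Ripper::EXPR_BEG)
combined = Ripper.lex_state_name(Ripper::EXPR_BEG | Ripper::EXPR_LABEL)

puts single
puts combined
```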
.parse(src, filename = '(ripper)', lineno = 1)
Parses the given Ruby program read from src. src must be a String, an IO, or an object with a #gets method.
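The following sketch illustrates the accepted source types by parsing the same program from a String and from a StringIO (an object with a gets method). It calls parse on Ripper::SexpBuilderPP, a Ripper subclass that is not described on this page and is used here only because it returns a tree that is easy to compare.

```ruby
require 'ripper'
require 'stringio'

src = "x = 1\n"

# .parse is inherited by subclasses, so it builds an instance of the receiver.
from_string = Ripper::SexpBuilderPP.parse(src)
from_io     = Ripper::SexpBuilderPP.parse(StringIO.new(src))

p from_string == from_io  # => true
```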
.sexp(src, filename = '-', lineno = 1, raise_errors: false)
- EXPERIMENTAL
Parses src and creates an S-exp tree. Returns a more readable tree than .sexp_raw. This method is mainly for developer use. The filename argument is mostly ignored. By default, this method does not handle syntax errors in src, returning nil in such cases. Use the raise_errors keyword to raise a SyntaxError for an error in src.
require 'ripper'
require 'pp'
pp Ripper.sexp("def m(a) nil end")
#=> [:program,
     [[:def,
       [:@ident, "m", [1, 4]],
       [:paren, [:params, [[:@ident, "a", [1, 6]]], nil, nil, nil, nil, nil, nil]],
       [:bodystmt, [[:var_ref, [:@kw, "nil", [1, 9]]]], nil, nil, nil]]]]
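A short sketch of the two error-handling modes described above, using an intentionally incomplete method definition as the invalid input. The wording of the error message depends on the Ruby version, so it is not reproduced here.

```ruby
require 'ripper'

broken = "def m(a"

# Default: a syntax error makes the method return nil.
p Ripper.sexp(broken)  # => nil

# With raise_errors: true the error is raised instead.
begin
  Ripper.sexp(broken, raise_errors: true)
rescue SyntaxError => e
  puts "syntax error: #{e.message}"
end
```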
.sexp_raw(src, filename = '-', lineno = 1, raise_errors: false)
- EXPERIMENTAL
Parses src and creates an S-exp tree. This method is mainly for developer use. The filename argument is mostly ignored. By default, this method does not handle syntax errors in src, returning nil in such cases. Use the raise_errors keyword to raise a SyntaxError for an error in src.
require 'ripper'
require 'pp'
pp Ripper.sexp_raw("def m(a) nil end")
#=> [:program,
     [:stmts_add,
      [:stmts_new],
      [:def,
       [:@ident, "m", [1, 4]],
       [:paren, [:params, [[:@ident, "a", [1, 6]]], nil, nil, nil]],
       [:bodystmt,
        [:stmts_add, [:stmts_new], [:var_ref, [:@kw, "nil", [1, 9]]]],
        nil,
        nil,
        nil]]]]
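To make the difference between the two tree shapes concrete, this sketch parses the same one-token program with both methods. The raw tree keeps the :stmts_new / :stmts_add events, while .sexp folds them into a plain array of statements.

```ruby
require 'ripper'

p Ripper.sexp("nil")
# => [:program, [[:var_ref, [:@kw, "nil", [1, 0]]]]]

p Ripper.sexp_raw("nil")
# => [:program, [:stmts_add, [:stmts_new], [:var_ref, [:@kw, "nil", [1, 0]]]]]
```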
.slice(src, pattern, n = 0)
- EXPERIMENTAL
Parses src and returns the string that matched pattern. pattern should be described as a Regexp.
require 'ripper'
p Ripper.slice('def m(a) nil end', 'ident') #=> "m"
p Ripper.slice('def m(a) nil end', '[ident lparen rparen]+') #=> "m(a)"
p Ripper.slice("<<EOS\nstring\nEOS", 'heredoc_beg nl $(tstring_content*) heredoc_end', 1) #=> "string\n"
# File 'ext/ripper/lib/ripper/lexer.rb', line 224
def Ripper.slice(src, pattern, n = 0)
  if m = token_match(src, pattern)
  then m.string(n)
  else nil
  end
end
.token_match(src, pattern)
# File 'ext/ripper/lib/ripper/lexer.rb', line 231
def Ripper.token_match(src, pattern) #:nodoc:
  TokenPattern.compile(pattern).match(src)
end
.tokenize(src, filename = '-', lineno = 1, **kw)
Tokenizes the Ruby program and returns an array of strings. The filename and lineno arguments are mostly ignored, since the return value is just the tokenized input. By default, this method does not handle syntax errors in src; use the raise_errors keyword to raise a SyntaxError for an error in src.
p Ripper.tokenize("def m(a) nil end")
# => ["def", " ", "m", "(", "a", ")", " ", "nil", " ", "end"]
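Because the return value is just the tokenized input, joining the tokens of a valid program should reproduce the source text. A minimal sketch:

```ruby
require 'ripper'

src = "def m(a) nil end"
tokens = Ripper.tokenize(src)

# Whitespace is kept as tokens too, so nothing is lost in the round trip.
p tokens.join == src  # => true
```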
Instance Attribute Details
#debug_output ⇒ Object (rw)
Get debug output.
# File 'parse.y', line 13222
VALUE
rb_parser_get_debug_output(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    return p->debug_output;
}
#debug_output=(obj) (rw)
Set debug output.
# File 'parse.y', line 13237
VALUE
rb_parser_set_debug_output(VALUE self, VALUE output)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    return p->debug_output = output;
}
#end_seen? ⇒ Boolean (readonly)
Return true if parsed source ended by __END__.
# File 'parse.y', line 13159
VALUE
rb_parser_end_seen_p(VALUE vparser)
{
    struct parser_params *p;

    TypedData_Get_Struct(vparser, struct parser_params, &parser_data_type, p);
    return p->ruby__end__seen ? Qtrue : Qfalse;
}
#error? ⇒ Boolean (readonly)
Return true if parsed source has errors.
# File 'parse.y', line 13143
static VALUE
ripper_error_p(VALUE vparser)
{
    struct parser_params *p;

    TypedData_Get_Struct(vparser, struct parser_params, &parser_data_type, p);
    return p->error_p ? Qtrue : Qfalse;
}
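A small sketch of reading the two read-only flags after a parse. The first source is valid and contains an __END__ marker; the second is an intentionally incomplete method definition.

```ruby
require 'ripper'

ok = Ripper.new("x = 1\n__END__\nraw data\n")
ok.parse
p ok.error?     # => false
p ok.end_seen?  # => true

bad = Ripper.new("def m(a")
bad.parse
p bad.error?    # => true
```

Both flags describe the most recent parse, so they are only meaningful after #parse has been called.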
#yydebug ⇒ Boolean (rw)
Get yydebug.
# File 'parse.y', line 13190
VALUE
rb_parser_get_yydebug(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    return p->debug ? Qtrue : Qfalse;
}
#yydebug=(flag) (rw)
Set yydebug.
# File 'parse.y', line 13206
VALUE
rb_parser_set_yydebug(VALUE self, VALUE flag)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    p->debug = RTEST(flag);
    return flag;
}
Instance Method Details
#_dispatch_0 (private)
# File 'ext/ripper/lib/ripper/core.rb', line 34
def _dispatch_0() nil end
#_dispatch_1(a) (private)
# File 'ext/ripper/lib/ripper/core.rb', line 35
def _dispatch_1(a) a end
#_dispatch_2(a, b) (private)
# File 'ext/ripper/lib/ripper/core.rb', line 36
def _dispatch_2(a, b) a end
#_dispatch_3(a, b, c) (private)
# File 'ext/ripper/lib/ripper/core.rb', line 37
def _dispatch_3(a, b, c) a end
#_dispatch_4(a, b, c, d) (private)
# File 'ext/ripper/lib/ripper/core.rb', line 38
def _dispatch_4(a, b, c, d) a end
#_dispatch_5(a, b, c, d, e) (private)
# File 'ext/ripper/lib/ripper/core.rb', line 39
def _dispatch_5(a, b, c, d, e) a end
#_dispatch_6(a, b, c, d, e, f) (private)
# File 'ext/ripper/lib/ripper/core.rb', line 40
def _dispatch_6(a, b, c, d, e, f) a end
#_dispatch_7(a, b, c, d, e, f, g) (private)
# File 'ext/ripper/lib/ripper/core.rb', line 41
def _dispatch_7(a, b, c, d, e, f, g) a end
#assert_Qundef(obj, msg)
# File 'parse.y', line 13801
static VALUE
ripper_assert_Qundef(VALUE self, VALUE obj, VALUE msg)
{
    StringValue(msg);
    if (obj == Qundef) {
        rb_raise(rb_eArgError, "%"PRIsVALUE, msg);
    }
    return Qnil;
}
#column ⇒ Integer
Return column number of current parsing line. This number starts from 0.
# File 'parse.y', line 13705
static VALUE
ripper_column(VALUE self)
{
    struct parser_params *p;
    long col;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(p->parsing_thread)) return Qnil;
    col = p->lex.ptok - p->lex.pbeg;
    return LONG2NUM(col);
}
#compile_error(msg) (private)
This method is called when the parser finds a syntax error.
# File 'ext/ripper/lib/ripper/core.rb', line 63
def compile_error(msg) end
#dedent_string(input, width) ⇒ Integer (private)
Alias for .dedent_string.
#encoding ⇒ Encoding
Return encoding of the source.
# File 'parse.y', line 13174
VALUE
rb_parser_encoding(VALUE vparser)
{
    struct parser_params *p;

    TypedData_Get_Struct(vparser, struct parser_params, &parser_data_type, p);
    return rb_enc_from_encoding(p->enc);
}
#filename ⇒ String
Return current parsing filename.
# File 'parse.y', line 13726
static VALUE
ripper_filename(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    return p->ruby_sourcefile_string;
}
#lineno ⇒ Integer
Return line number of current parsing line. This number starts from 1.
# File 'parse.y', line 13745
static VALUE
ripper_lineno(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(p->parsing_thread)) return Qnil;
    return INT2NUM(p->ruby_sourceline);
}
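As the C source shows, #lineno and #column return nil unless a parse is in progress, so they are normally read from inside an event handler. The sketch below records the position of each identifier; the class name IdentLocator and its locations accessor are hypothetical.

```ruby
require 'ripper'

# Hypothetical subclass: remembers where each identifier token was seen.
class IdentLocator < Ripper
  def locations
    @locations ||= []
  end

  # Scanner event handler; lineno is 1-based, column is 0-based.
  def on_ident(tok)
    locations << [tok, lineno, column]
    tok
  end
end

locator = IdentLocator.new("foo = 1\nbar = foo\n")
p locator.lineno     # => nil (no parse in progress)
locator.parse
p locator.locations  # => [["foo", 1, 0], ["bar", 2, 0], ["foo", 2, 6]]
```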
#parse
Starts parsing and returns the value of the root action.
# File 'parse.y', line 13677
static VALUE
ripper_parse(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (!NIL_P(p->parsing_thread)) {
        if (p->parsing_thread == rb_thread_current())
            rb_raise(rb_eArgError, "Ripper#parse is not reentrant");
        else
            rb_raise(rb_eArgError, "Ripper#parse is not multithread-safe");
    }
    p->parsing_thread = rb_thread_current();
    rb_ensure(ripper_parse0, self, ripper_ensure, self);

    return p->result;
}
#rawVALUE(obj)
# File 'parse.y', line 13812
static VALUE
ripper_value(VALUE self, VALUE obj)
{
    return ULONG2NUM(obj);
}
#state ⇒ Integer
Return scanner state of current token.
# File 'parse.y', line 13764
static VALUE
ripper_state(VALUE self)
{
    struct parser_params *p;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(p->parsing_thread)) return Qnil;
    return INT2NUM(p->lex.state);
}
#token ⇒ String
Return the current token string.
# File 'parse.y', line 13783
static VALUE
ripper_token(VALUE self)
{
    struct parser_params *p;
    long pos, len;

    TypedData_Get_Struct(self, struct parser_params, &parser_data_type, p);
    if (!ripper_initialized_p(p)) {
        rb_raise(rb_eArgError, "method called for uninitialized object");
    }
    if (NIL_P(p->parsing_thread)) return Qnil;
    pos = p->lex.ptok - p->lex.pbeg;
    len = p->lex.pcur - p->lex.ptok;
    return rb_str_subseq(p->lex.lastline, pos, len);
}
#validate_object(x)
# File 'parse.y', line 13427
static VALUE
ripper_validate_object(VALUE self, VALUE x)
{
    if (x == Qfalse) return x;
    if (x == Qtrue) return x;
    if (x == Qnil) return x;
    if (x == Qundef)
        rb_raise(rb_eArgError, "Qundef given");
    if (FIXNUM_P(x)) return x;
    if (SYMBOL_P(x)) return x;
    switch (BUILTIN_TYPE(x)) {
      case T_STRING:
      case T_OBJECT:
      case T_ARRAY:
      case T_BIGNUM:
      case T_FLOAT:
      case T_COMPLEX:
      case T_RATIONAL:
        break;
      case T_NODE:
        if (nd_type((NODE *)x) != NODE_RIPPER) {
            rb_raise(rb_eArgError, "NODE given: %p", (void *)x);
        }
        x = ((NODE *)x)->nd_rval;
        break;
      default:
        rb_raise(rb_eArgError, "wrong type of ruby object: %p (%s)",
                 (void *)x, rb_obj_classname(x));
    }
    if (!RBASIC_CLASS(x)) {
        rb_raise(rb_eArgError, "hidden ruby object: %p (%s)",
                 (void *)x, rb_builtin_type_name(TYPE(x)));
    }
    return x;
}
#warn(fmt, *args) (private)
This method is called when a weak warning is produced by the parser. fmt and args are printf style.
# File 'ext/ripper/lib/ripper/core.rb', line 54
def warn(fmt, *args) end
#warning(fmt, *args) (private)
This method is called when a strong warning is produced by the parser. fmt and args are printf style.
# File 'ext/ripper/lib/ripper/core.rb', line 59
def warning(fmt, *args) end
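The default implementations of warn, warning and compile_error do nothing, so a subclass has to override them to see any diagnostics. The sketch below is one possible way to collect them; the class name DiagnosticCollector and its collected accessor are hypothetical, and which of these hooks fires for a given source depends on the Ruby version, so no output is shown.

```ruby
require 'ripper'

# Hypothetical subclass: stores every diagnostic the parser reports.
class DiagnosticCollector < Ripper
  def collected
    @collected ||= []
  end

  # fmt and args are printf style, so format them into one message.
  def warn(fmt, *args)
    collected << [:warn, format(fmt, *args)]
  end

  def warning(fmt, *args)
    collected << [:warning, format(fmt, *args)]
  end

  def compile_error(msg)
    collected << [:error, msg]
  end
end

collector = DiagnosticCollector.new("def m(a) nil end")
collector.parse
collector.collected.each { |kind, message| puts "#{kind}: #{message}" }
```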