Module: Marshal
Relationships & Source Files | |
Defined in: | marshal.c, marshal.rb |
Overview
The marshaling library converts collections of ::Ruby
objects into a byte stream, allowing them to be stored outside the currently active script. This data may subsequently be read and the original objects reconstituted.
Marshaled data has major and minor version numbers stored along with the object information. In normal use, marshaling can only load data written with the same major version number and an equal or lower minor version number. If Ruby’s “verbose” flag is set (normally using -d, -v, -w, or –verbose) the major and minor numbers must match exactly. Marshal
versioning is independent of Ruby’s version numbers. You can extract the version by reading the first two bytes of marshaled data.
str = Marshal.dump("thing")
RUBY_VERSION #=> "1.9.0"
str[0].ord #=> 4
str[1].ord #=> 8
Some objects cannot be dumped: if the objects to be dumped include bindings, procedure or method objects, instances of class ::IO
, or singleton objects, a ::TypeError
will be raised.
If your class has special serialization needs (for example, if you want to serialize in some specific format), or if it contains objects that would otherwise not be serializable, you can implement your own serialization strategy.
There are two methods of doing this, your object can define either marshal_dump and marshal_load or _dump and _load. marshal_dump will take precedence over _dump if both are defined. marshal_dump may result in smaller Marshal
strings.
Security considerations
By design, .load can deserialize almost any class loaded into the ::Ruby
process. In many cases this can lead to remote code execution if the Marshal
data is loaded from an untrusted source.
As a result, .load is not suitable as a general purpose serialization format and you should never unmarshal user supplied input or other untrusted data.
If you need to deserialize untrusted data, use JSON or another serialization format that is only able to load simple, ‘primitive’ types such as ::String
, ::Array
, ::Hash
, etc. Never allow user input to specify arbitrary types to deserialize into.
marshal_dump and marshal_load
When dumping an object the method marshal_dump will be called. marshal_dump must return a result containing the information necessary for marshal_load to reconstitute the object. The result can be any object.
When loading an object dumped using marshal_dump the object is first allocated then marshal_load is called with the result from marshal_dump. marshal_load must recreate the object from the information in the result.
Example:
class MyObj
def initialize name, version, data
@name = name
@version = version
@data = data
end
def marshal_dump
[@name, @version]
end
def marshal_load array
@name, @version = array
end
end
_dump and _load
Use _dump and _load when you need to allocate the object you’re restoring yourself.
When dumping an object the instance method _dump is called with an ::Integer
which indicates the maximum depth of objects to dump (a value of -1 implies that you should disable depth checking). _dump must return a ::String
containing the information necessary to reconstitute the object.
The class method _load should take a ::String
and use it to return an object of the same class.
Example:
class MyObj
def initialize name, version, data
@name = name
@version = version
@data = data
end
def _dump level
[@name, @version].join ':'
end
def self._load args
new(*args.split(':'))
end
end
Since .dump outputs a string you can have _dump return a Marshal
string which is Marshal.loaded
in _load for complex objects.
Constant Summary
-
MAJOR_VERSION =
major version
INT2FIX(MARSHAL_MAJOR)
-
MINOR_VERSION =
minor version
INT2FIX(MARSHAL_MINOR)
Class Method Summary
-
.load(source, proc = nil, freeze: false) ⇒ Object
(also: .restore)
Returns the result of converting the serialized data in source into a
::Ruby
object (possibly with associated subordinate objects). -
.restore(source, proc = nil, freeze: false)
Alias for .load.
-
.dump(obj [, anIO] , limit=-1 ) ⇒ anIO
mod_func
Serializes obj and all descendant objects.
Class Method Details
.dump(obj [, anIO] , limit=-1 ) ⇒ anIO
(mod_func)
Serializes obj and all descendant objects. If anIO is specified, the serialized data will be written to it, otherwise the data will be returned as a ::String
. If limit is specified, the traversal of subobjects will be limited to that depth. If limit is negative, no checking of depth will be performed.
class Klass
def initialize(str)
@str = str
end
def say_hello
@str
end
end
(produces no output)
o = Klass.new("hello\n")
data = Marshal.dump(o)
obj = Marshal.load(data)
obj.say_hello #=> "hello\n"
Marshal
can’t dump following objects:
-
anonymous Class/Module.
-
objects which are related to system (ex:
::Dir
,::File::Stat
,::IO
,::File
, Socket and so on) -
an instance of
::MatchData
,::Data
,::Method
,::UnboundMethod
,::Proc
,::Thread
, ThreadGroup, Continuation -
objects which define singleton methods
# File 'marshal.c', line 1156
static VALUE marshal_dump(int argc, VALUE *argv, VALUE _) { VALUE obj, port, a1, a2; int limit = -1; port = Qnil; rb_scan_args(argc, argv, "12", &obj, &a1, &a2); if (argc == 3) { if (!NIL_P(a2)) limit = NUM2INT(a2); if (NIL_P(a1)) io_needed(); port = a1; } else if (argc == 2) { if (FIXNUM_P(a1)) limit = FIX2INT(a1); else if (NIL_P(a1)) io_needed(); else port = a1; } return rb_marshal_dump_limited(obj, port, limit); }
.load(source, proc = nil, freeze: false) ⇒ Object
.restore(source, proc = nil, freeze: false) ⇒ Object
Also known as: .restore
Returns the result of converting the serialized data in source into a ::Ruby
object (possibly with associated subordinate objects). source may be either an instance of ::IO
or an object that responds to to_str. If proc is specified, each object will be passed to the proc, as the object is being deserialized.
Never pass untrusted data (including user supplied input) to this method. Please see the overview for further details.
If the freeze: true
argument is passed, deserialized object would be deeply frozen. Note that it may lead to more efficient memory usage due to frozen strings deduplication:
serialized = Marshal.dump(['value1', 'value2', 'value1', 'value2'])
deserialized = Marshal.load(serialized)
deserialized.map(&:frozen?)
# => [false, false, false, false]
deserialized.map(&:object_id)
# => [1023900, 1023920, 1023940, 1023960] -- 4 different objects
deserialized = Marshal.load(serialized, freeze: true)
deserialized.map(&:frozen?)
# => [true, true, true, true]
deserialized.map(&:object_id)
# => [1039360, 1039380, 1039360, 1039380] -- only 2 different objects, object_ids repeating
# File 'marshal.rb', line 33
def self.load(source, proc = nil, freeze: false) Primitive.marshal_load(source, proc, freeze) end
.restore(source, proc = nil, freeze: false)
Alias for .load.
# File 'marshal.rb', line 38
alias restore load