Class: Prism::Source
Inherits: Object
Defined in: lib/prism/parse_result.rb, prism/extension.c
Overview
This represents a source of Ruby code that has been parsed. It is used in conjunction with locations to allow them to resolve line numbers and source ranges.
Class Method Summary
- .for(source, start_line, offsets): Create a new source object with the given source code.
- .new(source, start_line, offsets) ⇒ Source (constructor): Create a new source object with the given source code.
Instance Attribute Summary
- #offsets (readonly): The list of newline byte offsets in the source code.
- #source (readonly): The source code that this source object represents.
- #start_line (readonly): The line number where this source starts.
Instance Method Summary
- #byte_offset(line, column): Converts the line number and column in bytes to a byte offset.
- #character_column(byte_offset): Return the column in characters for the given byte offset.
- #character_offset(byte_offset): Return the character offset for the given byte offset.
- #code_units_cache(encoding): Generate a cache that targets a specific encoding for calculating code unit offsets.
- #code_units_column(byte_offset, encoding): Returns the column in code units for the given encoding for the given byte offset.
- #code_units_offset(byte_offset, encoding): Returns the offset from the start of the file for the given byte offset counting in code units for the given encoding.
- #column(byte_offset): Return the column in bytes for the given byte offset.
- #deep_freeze: Freeze this object and the objects it contains.
- #encoding: Returns the encoding of the source code, which is set by parameters to the parser or by the encoding magic comment.
- #line(byte_offset): Binary search through the offsets to find the line number for the given byte offset.
- #line_end(byte_offset): Returns the byte offset of the end of the line corresponding to the given byte offset.
- #line_start(byte_offset): Return the byte offset of the start of the line corresponding to the given byte offset.
- #lines: Returns the lines of the source code as an array of strings.
- #replace_offsets(offsets): Replace the value of offsets with the given value.
- #replace_start_line(start_line): Replace the value of start_line with the given value.
- #slice(byte_offset, length): Perform a byteslice on the source code using the given byte offset and byte length.
- #find_line(byte_offset) (private, internal use only): Binary search through the offsets to find the line number for the given byte offset.
Constructor Details
.new(source, start_line, offsets) ⇒ Source
Create a new source object with the given source code.
# File 'lib/prism/parse_result.rb', line 67
def initialize(source, start_line, offsets)
  @source = source
  @start_line = start_line # set after parsing is done
  @offsets = offsets # set after parsing is done
end
Class Method Details
.for(source, start_line, offsets)
Create a new source object with the given source code. Use this method instead of .new: it returns a specialized, more performant ASCIISource when the source code contains no multibyte characters, and a Source otherwise. A binary-encoded source that is not valid UTF-8 is also handled as an ASCIISource.
If you call this method manually, you must supply the start_line and offsets parameters. start_line is the line number on which the source starts; it is typically 1, but can differ if this source is a subset of a larger source or comes from an eval. offsets is an array of the byte offsets at which each line in the source code starts. It can be calculated by iterating through the source code and recording a byte offset whenever a newline character is encountered.
# File 'lib/prism/parse_result.rb', line 32
def self.for(source, start_line, offsets)
  if source.ascii_only?
    ASCIISource.new(source, start_line, offsets)
  elsif source.encoding == Encoding::BINARY
    source.force_encoding(Encoding::UTF_8)

    if source.valid_encoding?
      new(source, start_line, offsets)
    else
      # This is an extremely niche use case where the file is marked as
      # binary, contains multi-byte characters, and those characters are not
      # valid UTF-8. In this case we'll mark it as binary and fall back to
      # treating everything as a single-byte character. This _may_ cause
      # problems when asking for code units, but it appears to be the
      # cleanest solution at the moment.
      source.force_encoding(Encoding::BINARY)
      ASCIISource.new(source, start_line, offsets)
    end
  else
    new(source, start_line, offsets)
  end
end
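The offsets calculation described above can be sketched in plain Ruby. This is a minimal sketch, not Prism's own implementation: line_start_offsets is a hypothetical helper name, and the sketch assumes that "\n" is the only line terminator and that the first line starts at byte 0.

```ruby
# Hypothetical helper (not part of Prism): collect the byte offset at which
# each line starts. The first line starts at byte 0, and every "\n" byte
# means a new line starts at the byte that follows it.
def line_start_offsets(source)
  offsets = [0]
  source.each_byte.with_index do |byte, index|
    offsets << index + 1 if byte == 0x0A # 0x0A is "\n"
  end
  offsets
end

p line_start_offsets("foo\nbar\n") # => [0, 4, 8]
p line_start_offsets("é\nx")       # => [0, 3] ("é" is two bytes in UTF-8)
```

An array built this way could, in principle, be passed as the offsets argument of .for together with a start_line such as 1. Whether it matches the offsets the parser itself would produce in every edge case is an assumption.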
Instance Attribute Details
#offsets (readonly)
The list of newline byte offsets in the source code.
# File 'lib/prism/parse_result.rb', line 62
attr_reader :offsets #: Array[Integer]
#source (readonly)
The source code that this source object represents.
# File 'lib/prism/parse_result.rb', line 56
attr_reader :source #: String
#start_line (readonly)
The line number where this source starts.
# File 'lib/prism/parse_result.rb', line 59
attr_reader :start_line #: Integer
Instance Method Details
#byte_offset(line, column)
Converts the line number and column in bytes to a byte offset.
#character_column(byte_offset)
Return the column in characters for the given byte offset.
# File 'lib/prism/parse_result.rb', line 162
def character_column(byte_offset)
  character_offset(byte_offset) - character_offset(line_start(byte_offset))
end
#character_offset(byte_offset)
Return the character offset for the given byte offset.
# File 'lib/prism/parse_result.rb', line 155
def character_offset(byte_offset)
  (source.byteslice(0, byte_offset) or raise).length
end
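To illustrate how byte offsets and character offsets diverge once a multibyte character appears, here is a small sketch that applies the same byteslice-then-length logic to a plain String. The helper name character_offset_of is hypothetical and exists only for this example.

```ruby
# Hypothetical helper mirroring the logic of #character_offset on a String:
# take the first byte_offset bytes and count the characters they contain.
def character_offset_of(source, byte_offset)
  source.byteslice(0, byte_offset).length
end

source = "é = 1" # "é" occupies two bytes in UTF-8

p source.bytesize                # => 6
p source.length                  # => 5
p character_offset_of(source, 2) # => 1 (just past "é")
p character_offset_of(source, 5) # => 4 (just before "1")
```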
#code_units_cache(encoding)
Generate a cache that targets a specific encoding for calculating code unit offsets.
# File 'lib/prism/parse_result.rb', line 194
def code_units_cache(encoding)
  CodeUnitsCache.new(source, encoding)
end
#code_units_column(byte_offset, encoding)
Returns the column in code units for the given encoding for the given byte offset.
# File 'lib/prism/parse_result.rb', line 202
def code_units_column(byte_offset, encoding)
  code_units_offset(byte_offset, encoding) - code_units_offset(line_start(byte_offset), encoding)
end
#code_units_offset(byte_offset, encoding)
Returns the offset from the start of the file for the given byte offset counting in code units for the given encoding.
This method is tested with UTF-8, UTF-16, and UTF-32. If another encoding has a notion of code units that differs from its number of characters, that difference is not captured here.
We purposefully replace invalid and undefined characters with replacement characters in this conversion. This happens for two reasons. First, it’s possible that the given byte offset will not occur on a character boundary. Second, it’s possible that the source code will contain a character that has no equivalent in the given encoding.
# File 'lib/prism/parse_result.rb', line 180
def code_units_offset(byte_offset, encoding)
  byteslice = (source.byteslice(0, byte_offset) or raise).encode(encoding, invalid: :replace, undef: :replace)

  if encoding == Encoding::UTF_16LE || encoding == Encoding::UTF_16BE
    byteslice.bytesize / 2
  else
    byteslice.length
  end
end
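The effect of the encoding argument is easiest to see with a character outside the Basic Multilingual Plane. The sketch below repeats the logic shown above on a plain String; code_units_before is a hypothetical name, and the byte offsets are chosen to fall on character boundaries so that no replacement characters come into play.

```ruby
# Hypothetical helper repeating the logic of #code_units_offset on a String.
def code_units_before(source, byte_offset, encoding)
  prefix = source.byteslice(0, byte_offset).encode(encoding, invalid: :replace, undef: :replace)

  if encoding == Encoding::UTF_16LE || encoding == Encoding::UTF_16BE
    prefix.bytesize / 2 # each UTF-16 code unit is two bytes
  else
    prefix.length
  end
end

source = "😀 = 1" # the emoji is four bytes in UTF-8 and a surrogate pair in UTF-16

p code_units_before(source, 4, Encoding::UTF_8)    # => 1
p code_units_before(source, 4, Encoding::UTF_16LE) # => 2
p code_units_before(source, 4, Encoding::UTF_32LE) # => 1
```

An editor protocol that counts columns in UTF-16 code units would need the second figure, which is the kind of situation this method appears to be designed for.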
#column(byte_offset)
Return the column in bytes for the given byte offset.
# File 'lib/prism/parse_result.rb', line 148
def column(byte_offset)
  byte_offset - line_start(byte_offset)
end
#deep_freeze
Freeze this object and the objects it contains.
#encoding
Returns the encoding of the source code, which is set by parameters to the parser or by the encoding magic comment.
# File 'lib/prism/parse_result.rb', line 91
def encoding
  source.encoding
end
#find_line(byte_offset) (private)
Binary search through the offsets to find the line number for the given byte offset.
# File 'lib/prism/parse_result.rb', line 221
def find_line(byte_offset) # :nodoc:
  index = offsets.bsearch_index { |offset| offset > byte_offset } || offsets.length
  index - 1
end
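The binary search can be traced on a small offsets array. In this sketch, find_line_index is a hypothetical standalone version of the private method, and the offsets are assumed to be the line starts of "foo\nbar\nbaz".

```ruby
# Hypothetical standalone version of the search: find the first line start
# that lies beyond byte_offset, then step back one entry. When no line start
# is beyond byte_offset, the offset belongs to the last line.
def find_line_index(offsets, byte_offset)
  index = offsets.bsearch_index { |offset| offset > byte_offset } || offsets.length
  index - 1
end

offsets = [0, 4, 8] # assumed line starts for "foo\nbar\nbaz"

p find_line_index(offsets, 0) # => 0
p find_line_index(offsets, 5) # => 1
p find_line_index(offsets, 9) # => 2

# #line adds start_line to this zero-based index.
start_line = 1
p start_line + find_line_index(offsets, 5) # => 2
```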
#line(byte_offset)
Binary search through the offsets to find the line number for the given byte offset.
# File 'lib/prism/parse_result.rb', line 125
def line(byte_offset)
  start_line + find_line(byte_offset)
end
#line_end(byte_offset)
Returns the byte offset of the end of the line corresponding to the given byte offset.
# File 'lib/prism/parse_result.rb', line 141
def line_end(byte_offset)
  offsets[find_line(byte_offset) + 1] || source.bytesize
end
#line_start(byte_offset)
Return the byte offset of the start of the line corresponding to the given byte offset.
# File 'lib/prism/parse_result.rb', line 133
def line_start(byte_offset)
  offsets[find_line(byte_offset)]
end
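The line_start, line_end, and column calculations all build on the same line lookup. The following sketch combines them for a plain String; line_bounds is a hypothetical helper, and the offsets array is assumed to hold the line starts of the sample source.

```ruby
# Hypothetical helper: return [line_start, line_end, column] for byte_offset,
# following the logic of #line_start, #line_end, and #column.
def line_bounds(source, offsets, byte_offset)
  index = (offsets.bsearch_index { |offset| offset > byte_offset } || offsets.length) - 1
  line_start = offsets[index]
  line_end = offsets[index + 1] || source.bytesize # the last line ends at the end of the source
  [line_start, line_end, byte_offset - line_start]
end

source = "foo\nbar\nbaz"
offsets = [0, 4, 8] # assumed line starts for the source above

p line_bounds(source, offsets, 5) # => [4, 8, 1]
p line_bounds(source, offsets, 9) # => [8, 11, 1]
```

With offsets holding line starts, the line_end of every line except the last is the start of the next line, so the range from line_start to line_end includes the trailing newline.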
#lines
Returns the lines of the source code as an array of strings.
# File 'lib/prism/parse_result.rb', line 98
def lines
  source.lines
end
#replace_offsets(offsets)
Replace the value of offsets with the given value.
#replace_start_line(start_line)
Replace the value of start_line with the given value.
# File 'lib/prism/parse_result.rb', line 76
def replace_start_line(start_line)
  @start_line = start_line
end
#slice(byte_offset, length)
Perform a byteslice on the source code using the given byte offset and byte length.
# File 'lib/prism/parse_result.rb', line 106
def slice(byte_offset, length)
  source.byteslice(byte_offset, length) or raise
end
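Because the arguments are counted in bytes rather than characters, the result can differ from ordinary String indexing. The sketch below shows the difference on a plain String; byte_slice is a hypothetical stand-in for the method above.

```ruby
# Hypothetical stand-in for #slice: byteslice returns nil when the offset is
# past the end of the string, and "or raise" turns that nil into an exception.
def byte_slice(source, byte_offset, length)
  source.byteslice(byte_offset, length) or raise
end

source = "é = 1" # "é" occupies two bytes in UTF-8

p byte_slice(source, 0, 2) # => "é"
p source[0, 2]             # => "é " (character-based indexing, for contrast)

begin
  byte_slice(source, 10, 1) # offset 10 is beyond the 6-byte source
rescue RuntimeError
  puts "offset out of range"
end
```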