Class: SyntaxSuggest::CleanDocument
| Relationships & Source Files | |
| Inherits: | Object |
| Defined in: | lib/syntax_suggest/clean_document.rb |
Overview
Parses and sanitizes source into a lexically aware document
Internally the document is represented by an array with each
index containing a CodeLine correlating to a line from the source code.
There are three main phases in the algorithm:
- Sanitize/format input source
- Search for invalid blocks
- Format invalid blocks into something meaningful
This class handles the first part.
The reason this class exists is to format input source for better/easier/cleaner exploration.
The CodeSearch class operates at the line level so we must be careful to not introduce lines that look valid by themselves, but when removed will trigger syntax errors or strange behavior.
Join Trailing slashes
Code that ends a line with a trailing slash (the backslash character, \) is logically treated as a single line:
1 it "code can be split" \
2 "across multiple lines" do
In this case removing line 2 would add a syntax error. We get around this by internally joining the two lines into a single "line" object.
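A minimal sketch of this joining step, using plain strings rather than CodeLine objects. The helper name join_trailing_backslashes is hypothetical and is not part of SyntaxSuggest; it only illustrates how the absorbed lines can be replaced with empty placeholders so that the line count stays the same.

```ruby
# Hypothetical helper, not the SyntaxSuggest implementation.
# Joins every line ending in a backslash with the line(s) that follow it,
# leaving empty strings behind so the array keeps its original size.
def join_trailing_backslashes(lines)
  result = lines.dup
  index = 0
  while index < result.length
    start = index
    # Walk forward while the current line ends with a backslash
    index += 1 while index < result.length - 1 && result[index].rstrip.end_with?("\\")
    if index > start
      # Fold the whole run into its first line and blank out the rest
      result[start] = result[start..index].join
      ((start + 1)..index).each { |i| result[i] = "" }
    end
    index += 1
  end
  result
end

source = [
  "it \"code can be split\" \\\n",
  "   \"across multiple lines\" do\n",
  "end\n"
]
joined = join_trailing_backslashes(source)
```

Here joined still has three entries: the first holds both physical lines, the second is an empty placeholder, and the third is untouched, so joined.join reproduces the original source.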
Logically Consecutive lines
Code such as a chain of method calls can be broken over multiple lines:
1 User.
2 where(name: "schneems").
3 first
Removing line 2 can introduce a syntax error. To fix this, all lines are joined into one.
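As a rough illustration, the grouping can be sketched on plain strings with Enumerable#slice_when. This sketch assumes that a "consecutive" line is simply one ending in a dot; the real class decides this from lexer tokens via CodeLine#consecutive?, so treat the string check below as a stand-in.

```ruby
lines = [
  "User.\n",
  "  where(name: \"schneems\").\n",
  "  first\n",
  "puts 1\n"
]

# Start a new group after any line that does NOT end with a dot,
# so a method chain stays together with the line that finishes it.
groups = lines.slice_when { |previous, _following| !previous.rstrip.end_with?(".") }.to_a

# Join each group into its first line and pad with empty strings
# so the number of entries matches the original line count.
joined = groups.flat_map { |group| [group.join] + [""] * (group.length - 1) }
```

The chain ends up as one logical line followed by two empty placeholders, and the unrelated puts line is left alone.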
Heredocs
A heredoc is a way of defining a multi-line string. They can cause many problems. If left as separate lines, the parser would try to parse the contents as ruby code rather than as a string. Even without this problem, we still hit an issue with indentation:
1 foo =<<~HEREDOC
2   "Be yourself; everyone else is already taken.""
3     ― Oscar Wilde
4       puts "I look like ruby code" # but i'm still a heredoc
5 HEREDOC
If we didn't join these lines, the algorithm would treat line 4 as separate from the rest. Because it has a higher indentation, the search would look at it first and remove it.
If the code evaluates line 5 by itself, it will think line 5 is a constant, remove it, and introduce a syntax error.
All of these problems are fixed by joining the whole heredoc into a single line.
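The pairing of heredoc start and end lines can be sketched as follows. This is a simplified, regex-based stand-in: the constant SKETCH_HEREDOC_START and the method heredoc_ranges are hypothetical names, the real class relies on lexer tokens instead of a regex, and this version does not handle nested heredocs or several heredocs opened on one line.

```ruby
# Hypothetical pattern: matches <<ID, <<-ID and <<~ID with an upper-case identifier.
SKETCH_HEREDOC_START = /<<[~-]?(?<id>[A-Z_]+)/

# Returns the index ranges (start line..end line) of each heredoc found.
def heredoc_ranges(lines)
  ranges = []
  open_id = nil      # identifier of the heredoc currently being read
  start_index = nil  # line index where that heredoc began
  lines.each_with_index do |line, index|
    if open_id
      # Inside a heredoc: only the closing identifier matters
      if line.strip == open_id
        ranges << (start_index..index)
        open_id = nil
      end
    elsif (match = SKETCH_HEREDOC_START.match(line))
      open_id = match[:id]
      start_index = index
    end
  end
  ranges
end

source = [
  "foo = <<~HEREDOC\n",
  "  \"Be yourself; everyone else is already taken.\"\n",
  "    puts \"I look like ruby code\"\n",
  "HEREDOC\n",
  "puts foo\n"
]
ranges = heredoc_ranges(source)
heredoc_as_one_line = source[ranges.first].join
```

Once the range is known, the whole heredoc can be treated as a single logical line, so the indented puts inside it is never examined on its own.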
Class Method Summary
- .new(source:) ⇒ CleanDocument constructor
Instance Method Summary
- #call: Call all of the document "cleaners" and return self.
- #join_consecutive!: Smushes logically "consecutive" lines.
- #join_groups(groups): Helper method for joining "groups" of lines.
- #join_heredoc!: Smushes all heredoc lines into one line.
- #join_trailing_slash!: Join lines with a trailing slash.
- #lines: Return an array of CodeLines in the document.
- #take_while_including(range = 0..): Helper method for grabbing elements from document.
- #to_s: Renders the document back to a string.
Constructor Details
.new(source:) ⇒ CleanDocument
# File 'lib/syntax_suggest/clean_document.rb', line 70
def initialize(source:)
  @document = CodeLine.from_source(source)
end
Instance Method Details
#call
Call all of the document "cleaners" and return self
# File 'lib/syntax_suggest/clean_document.rb', line 76
def call
  join_trailing_slash!
  join_consecutive!
  join_heredoc!

  self
end
#join_consecutive!
# File 'lib/syntax_suggest/clean_document.rb', line 141
def join_consecutive!
  consecutive_groups = @document.select(&:consecutive?).map do |code_line|
    take_while_including(code_line.index..) do |line|
      line.consecutive?
    end
  end

  join_groups(consecutive_groups)
  self
end
#join_groups(groups)
Helper method for joining "groups" of lines
Input is expected to be an array of arrays, i.e. type Array<Array<CodeLine>>.
The outer array holds the various "groups" while the inner array holds code lines.
All code lines are "joined" into the first line in their group.
To preserve document size, empty lines are put in place of the lines that were "joined".
# File 'lib/syntax_suggest/clean_document.rb', line 182
def join_groups(groups)
  groups.each do |lines|
    line = lines.first

    # Handle the case of multiple groups in a row
    # if one is already replaced, move on
    next if @document[line.index].empty?

    # Join group into the first line
    @document[line.index] = CodeLine.new(
      tokens: lines.map(&:tokens).flatten,
      line: lines.join,
      index: line.index,
      consecutive: false
    )

    # Hide the rest of the lines
    lines[1..].each do |line|
      # The above lines already have newlines in them, if add more
      # then there will be double newline, use an empty line instead
      @document[line.index] = CodeLine.new(line: "", index: line.index, tokens: [], consecutive: false)
    end
  end
  self
end
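The effect of joining overlapping groups can be tried out with a small stand-in. SketchLine and sketch_join_groups below are hypothetical names that mimic only the parts of CodeLine used here (index, empty?, to_s); they are not the library's classes. The example passes two overlapping groups, which is what selecting every consecutive line and collecting from each one would produce.

```ruby
# Hypothetical stand-in for CodeLine: just text plus its position.
SketchLine = Struct.new(:text, :index) do
  def empty?
    text.empty?
  end

  def to_s
    text
  end
end

def sketch_join_groups(document, groups)
  groups.each do |lines|
    first = lines.first
    # An earlier, overlapping group already absorbed this line
    next if document[first.index].empty?

    # Join the group into its first line
    document[first.index] = SketchLine.new(lines.join, first.index)
    # Replace the remaining lines with empty placeholders
    lines[1..].each { |line| document[line.index] = SketchLine.new("", line.index) }
  end
  document
end

texts = ["User.\n", "  where(name: \"schneems\").\n", "  first\n"]
document = texts.each_with_index.map { |text, index| SketchLine.new(text, index) }

# Two overlapping groups: one starting at line 0, one starting at line 1
groups = [document[0..2], document[1..2]]
sketch_join_groups(document, groups)
```

The second group is skipped because its first line has already been emptied, so the content is joined exactly once and the document still has three entries.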
#join_heredoc!
# File 'lib/syntax_suggest/clean_document.rb', line 107
def join_heredoc!
  start_index_stack = []
  heredoc_beg_end_index = []
  lines.each do |line|
    line.tokens.each do |token|
      case token.type
      when :HEREDOC_START
        start_index_stack << line.index
      when :HEREDOC_END
        start_index = start_index_stack.pop
        end_index = line.index
        heredoc_beg_end_index << [start_index, end_index]
      end
    end
  end

  heredoc_groups = heredoc_beg_end_index.map { |start_index, end_index| @document[start_index..end_index] }

  join_groups(heredoc_groups)
  self
end
#join_trailing_slash!
# File 'lib/syntax_suggest/clean_document.rb', line 162
def join_trailing_slash!
  trailing_groups = @document.select(&:trailing_slash?).map do |code_line|
    take_while_including(code_line.index..) { |x| x.trailing_slash? }
  end
  join_groups(trailing_groups)
  self
end
#lines
Return an array of CodeLines in the document
# File 'lib/syntax_suggest/clean_document.rb', line 86
def lines
  @document
end
#take_while_including(range = 0..)
Helper method for grabbing elements from document
Like take_while, except that when it stops iterating it also returns the line that caused it to stop
# File 'lib/syntax_suggest/clean_document.rb', line 213
def take_while_including(range = 0..)
  take_next_and_stop = false
  @document[range].take_while do |line|
    next if take_next_and_stop

    take_next_and_stop = !(yield line)
    true
  end
end
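To see how this differs from a plain take_while, here is the same logic as a standalone sketch over an ordinary array. The method name sketch_take_while_including and the numeric input are assumptions made for illustration only.

```ruby
# Same control flow as above, but operating on any array passed in.
def sketch_take_while_including(items)
  take_next_and_stop = false
  items.take_while do |item|
    # The previous item failed the check: stop now (next returns nil)
    next if take_next_and_stop

    # Remember whether this item failed, but still include it
    take_next_and_stop = !(yield item)
    true
  end
end

numbers = [1, 2, 3, 10, 4]
plain = numbers.take_while { |n| n < 3 }
including = sketch_take_while_including(numbers) { |n| n < 3 }
```

Here plain stops before the first failing element, while including also keeps that element. This is what lets a group of trailing-slash or consecutive lines pick up the final line that completes the statement.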
#to_s
Renders the document back to a string
# File 'lib/syntax_suggest/clean_document.rb', line 91
def to_s
  @document.join
end