crates/parser/src/statement.rs (+9, −11)
```diff
@@ -5,9 +5,8 @@ use crate::{parser::Parser, syntax_kind_codegen::SyntaxKind};
 /// A super simple lexer for sql statements.
 ///
-/// One weakness of pg_query.rs is that it does not parse whitespace or newlines. To circumvent
-/// this, we use a very simple lexer that just knows what kind of characters are being used. It
-/// does not know anything about postgres syntax or keywords. For example, all words such as `select` and `from` are put into the `Word` type.
+/// One weakness of pg_query.rs is that it does not parse whitespace or newlines. We use a very
+/// simple lexer to fill the gaps.
 #[derive(Logos, Debug, PartialEq)]
 pub enum StatementToken {
     // comments and whitespaces
```
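The removed doc comment describes the idea well: the lexer knows only character classes, not postgres syntax. A minimal hand-rolled sketch of that idea (the real code derives `logos::Logos`; the token set and grouping rules below are illustrative assumptions, not the crate's actual definitions):

```rust
// Hypothetical sketch of a character-class lexer like `StatementToken`.
// It knows nothing about postgres keywords: `select` and `from` are just `Word`s.
#[derive(Debug, PartialEq)]
enum StatementToken {
    Whitespace, // spaces and tabs
    Newline,    // \n and \r
    Word,       // runs of alphanumerics, e.g. `select`, `from`
    Other,      // punctuation such as `,` or `;`
}

// Lex `input` into (kind, byte range) pairs by grouping runs of the same class.
fn lex(input: &str) -> Vec<(StatementToken, std::ops::Range<usize>)> {
    let mut tokens = Vec::new();
    let mut chars = input.char_indices().peekable();
    while let Some(&(start, c)) = chars.peek() {
        let kind = match c {
            ' ' | '\t' => StatementToken::Whitespace,
            '\n' | '\r' => StatementToken::Newline,
            c if c.is_alphanumeric() || c == '_' => StatementToken::Word,
            _ => StatementToken::Other,
        };
        let mut end = start;
        while let Some(&(i, c2)) = chars.peek() {
            let same_class = match kind {
                StatementToken::Whitespace => c2 == ' ' || c2 == '\t',
                StatementToken::Newline => c2 == '\n' || c2 == '\r',
                StatementToken::Word => c2.is_alphanumeric() || c2 == '_',
                StatementToken::Other => {
                    !(c2.is_alphanumeric() || c2 == '_' || c2.is_whitespace())
                }
            };
            if !same_class {
                break;
            }
            end = i + c2.len_utf8();
            chars.next();
        }
        tokens.push((kind, start..end));
    }
    tokens
}

fn main() {
    // `select` → Word, ` ` → Whitespace, `1` → Word, `;` → Other
    let tokens = lex("select 1;");
    for (kind, range) in &tokens {
        println!("{:?} {:?}", kind, range);
    }
    assert_eq!(tokens.len(), 4);
    assert_eq!(tokens[1].0, StatementToken::Whitespace);
}
```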
```diff
@@ -38,18 +37,17 @@ impl Parser {
     /// The main entry point for parsing a statement `text`. `at_offset` is the offset of the statement in the source file.
     ///
     /// On a high level, the algorithm works as follows:
-    /// 1. Parse the statement with pg_query.rs and order nodes by their position. If the
-    ///    statement contains syntax errors, the parser will report the error and continue to work without information
+    /// 1. Parse the statement with pg_query.rs. If the statement contains syntax errors, the parser will report the error and continue to work without information
     ///    about the nodes. The result will be a flat list of tokens under the generic `Stmt` node.
     ///    If successful, the first node in the ordered list will be the main node of the statement,
     ///    and serves as a root node.
     /// 2. Scan the statements for tokens with pg_query.rs. This will never fail, even if the statement contains syntax errors.
-    /// 3. Parse the statement with the `StatementToken` lexer. The lexer will be the main vehicle
-    ///    while walking the statement.
-    /// 4. Walk the statement with the `StatementToken` lexer.
-    ///    - at every token, consume all nodes that are within the token's range.
-    ///    - if there is a pg_query token within the token's range, consume it. if not, fallback to
-    ///      the StatementToken. This is the case for e.g. whitespace.
+    /// 3. Parse the statement with the `StatementToken` lexer. The lexer only contains the tokens
+    ///    that are not parsed by pg_query.rs, such as whitespace.
+    /// 4. Define a pointer that starts at 0 and move it along the statement.
+    ///    - first, check if the current pointer is within a pg_query token. If so, consume the
+    ///      token.
+    ///    - if not, consume the next token from the `StatementToken` lexer.
```
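The pointer-walk in step 4 can be sketched as a merge of the two token streams: pg_query tokens take priority, and the lexer fills the gaps (whitespace) that pg_query skips. Everything below (the `Token` struct, `merge` function, and the sample ranges) is a hypothetical illustration, not the crate's actual API:

```rust
// Illustrative token: its text plus its byte range in the statement.
#[derive(Debug, Clone, PartialEq)]
struct Token {
    text: String,
    range: std::ops::Range<usize>,
}

/// Walk `input` with a pointer starting at 0. At each position, prefer the
/// pg_query token starting there; otherwise fall back to the lexer token.
fn merge(input: &str, pg_tokens: &[Token], lexer_tokens: &[Token]) -> Vec<Token> {
    let mut out = Vec::new();
    let mut pointer = 0;
    while pointer < input.len() {
        if let Some(t) = pg_tokens.iter().find(|t| t.range.start == pointer) {
            pointer = t.range.end;
            out.push(t.clone());
        } else if let Some(t) = lexer_tokens.iter().find(|t| t.range.start == pointer) {
            pointer = t.range.end;
            out.push(t.clone());
        } else {
            break; // no token covers this position; defensive stop
        }
    }
    out
}

fn main() {
    let input = "select 1";
    // pg_query reports `select` and `1`, but nothing for the space between.
    let pg = vec![
        Token { text: "select".into(), range: 0..6 },
        Token { text: "1".into(), range: 7..8 },
    ];
    // The fallback lexer covers every span, including the whitespace.
    let lexer = vec![
        Token { text: "select".into(), range: 0..6 },
        Token { text: " ".into(), range: 6..7 },
        Token { text: "1".into(), range: 7..8 },
    ];
    let merged = merge(input, &pg, &lexer);
    let texts: Vec<&str> = merged.iter().map(|t| t.text.as_str()).collect();
    assert_eq!(texts, ["select", " ", "1"]);
}
```

The whitespace token comes from the lexer stream only because no pg_query token starts at that offset, which is exactly the fallback behaviour the doc comment describes.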