This repository was archived by the owner on Jan 28, 2021. It is now read-only.

@erizocosmico
Contributor

@erizocosmico erizocosmico commented May 10, 2019

The qualify_columns rule has been a source of bugs for quite a long time
because of the way it used to look for columns. Before, it gathered
all available schemas across the whole tree of a query (excluding
subqueries). This required many exceptions and special-case treatments
that were added over time to patch the bugs that kept appearing.
It had special cases for aliases, for GroupBy, etc. that kept complicating
the code and made the rule confusing and harder to follow.

This refactor simplifies the logic of the rule and treats all nodes
in exactly the same way, so it is simpler, more obvious and easier to
reason about.
Now, a node only has knowledge of the columns (aliased or not) defined
down to the first Project, GroupBy, ResolvedTable or subquery in each
branch of the tree. This way, we can gather all the available columns
and infer the schema (we cannot simply call the Schema method, because
the tree is not resolved yet). Once the schema is known, qualifying
columns becomes a trivial job.
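
To illustrate the idea, here is a minimal, self-contained sketch. It does not
use the real go-mysql-server types: `Node`, `Column`, `isBoundary`,
`availableColumns` and `qualify` are hypothetical names made up for this
example, and the node kinds are plain strings. It only shows the shape of the
approach, assuming that each branch is searched until the first boundary node
and that the columns found there form the inferred schema.

```go
package main

import "fmt"

// Node is a hypothetical, simplified stand-in for a plan node.
// It is NOT the go-mysql-server sql.Node interface.
type Node struct {
	Kind     string   // e.g. "Filter", "Join", "Project", "GroupBy", "ResolvedTable", "SubqueryAlias"
	Name     string   // table or alias name, when the node has one
	Columns  []string // columns the node defines (table columns, projections, aliases)
	Children []*Node
}

// Column is a column name together with the table (or alias) it comes from.
type Column struct {
	Table, Name string
}

// isBoundary reports whether the column search stops at this node.
func isBoundary(n *Node) bool {
	switch n.Kind {
	case "Project", "GroupBy", "ResolvedTable", "SubqueryAlias":
		return true
	}
	return false
}

// availableColumns descends every branch below n until it reaches the
// first boundary node and collects the columns defined there.
func availableColumns(n *Node) []Column {
	var cols []Column
	for _, child := range n.Children {
		if isBoundary(child) {
			for _, c := range child.Columns {
				cols = append(cols, Column{Table: child.Name, Name: c})
			}
			continue // do not look below a boundary
		}
		cols = append(cols, availableColumns(child)...)
	}
	return cols
}

// qualify resolves an unqualified column name against the inferred schema.
func qualify(name string, schema []Column) (Column, error) {
	var found []Column
	for _, c := range schema {
		if c.Name == name {
			found = append(found, c)
		}
	}
	switch len(found) {
	case 0:
		return Column{}, fmt.Errorf("column %q not found", name)
	case 1:
		return found[0], nil
	default:
		return Column{}, fmt.Errorf("ambiguous column %q", name)
	}
}

func main() {
	// Filter -> Join -> (ResolvedTable a, ResolvedTable b)
	tree := &Node{Kind: "Filter", Children: []*Node{
		{Kind: "Join", Children: []*Node{
			{Kind: "ResolvedTable", Name: "a", Columns: []string{"id", "name"}},
			{Kind: "ResolvedTable", Name: "b", Columns: []string{"id", "email"}},
		}},
	}}

	schema := availableColumns(tree)
	if col, err := qualify("email", schema); err == nil {
		fmt.Println(col.Table + "." + col.Name)
	}
	if _, err := qualify("id", schema); err != nil {
		fmt.Println(err)
	}
}
```

In this sketch a Project hides everything below it: a Filter on top of a
Project only sees the Project's own columns, which is the property that removes
the need for alias and GroupBy special cases.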

All the tests of go-mysql-server and gitbase pass with this new
implementation of the rule.

TL;DR: got sick of this rule while debugging a gitbase issue and rewrote it so we don't have to get sick of it anymore while debugging

Signed-off-by: Miguel Molina <[email protected]>
@erizocosmico erizocosmico added the enhancement New feature or request label May 10, 2019
@erizocosmico erizocosmico requested a review from a team May 10, 2019 14:57
@erizocosmico erizocosmico self-assigned this May 10, 2019
@ajnavarro ajnavarro merged commit 33c1da4 into src-d:master May 13, 2019

4 participants