Commit cbc788a

Merge pull request #136 from dbt-labs/feature/allow-exceptions-to-rules
✨ Allow users to exclude given models/rows
2 parents 665f68d + 93111d8 commit cbc788a

25 files changed, +165 -53 lines changed

README.md

Lines changed: 80 additions & 34 deletions
Original file line number | Diff line number | Diff line change
@@ -45,46 +45,44 @@ Once you've installed the package, all you have to do is run a `dbt build --sele
4545
----
4646
## Package Documentation
4747

48-
__[DAG Issues](#dag-issues)__
49-
- [Direct Join to Source](#direct-join-to-source)
50-
- [Downstream Models Dependent on Source](#downstream-models-dependent-on-source)
51-
- [Model Fanout](#model-fanout)
52-
- [Multiple Sources Joined](#multiple-sources-joined)
53-
- [Rejoining of Upstream Concepts](#rejoining-of-upstream-concepts)
54-
- [Root Models](#root-models)
55-
- [Source Fanout](#source-fanout)
56-
- [Staging Models Dependent on Downstream Models](#staging-models-dependent-on-downstream-models)
57-
- [Staging Models Dependent on Other Staging Models](#staging-models-dependent-on-other-staging-models)
58-
- [Unused Sources](#unused-sources)
59-
60-
__[Testing](#testing)__
61-
- [Models without Primary Key Tests](#models-without-primary-key-tests)
62-
- [Test Coverage](#test-coverage)
63-
64-
__[Documentation](#documentation)__
65-
- [Documentation Coverage](#documentation-coverage)
66-
- [Undocumented Models](#undocumented-models)
67-
68-
__[Structure](#structure)__
69-
- [Model Naming Conventions](#model-naming-conventions)
70-
- [Model Directories](#model-directories)
71-
- [Source Directories](#model-directories)
72-
- [Test Directories](#test-directories)
73-
74-
__[Performance](#performance)__
75-
- [Chained View Dependencies](#chained-view-dependencies)
76-
- [Exposure Parents Materializations](#exposure-parents-materializations)
77-
78-
__[Customization](#customization)__
48+
### Rules
49+
- __[DAG Issues](#dag-issues)__
50+
- [Direct Join to Source](#direct-join-to-source)
51+
- [Downstream Models Dependent on Source](#downstream-models-dependent-on-source)
52+
- [Model Fanout](#model-fanout)
53+
- [Multiple Sources Joined](#multiple-sources-joined)
54+
- [Rejoining of Upstream Concepts](#rejoining-of-upstream-concepts)
55+
- [Root Models](#root-models)
56+
- [Source Fanout](#source-fanout)
57+
- [Staging Models Dependent on Downstream Models](#staging-models-dependent-on-downstream-models)
58+
- [Staging Models Dependent on Other Staging Models](#staging-models-dependent-on-other-staging-models)
59+
- [Unused Sources](#unused-sources)
60+
- __[Testing](#testing)__
61+
- [Models without Primary Key Tests](#models-without-primary-key-tests)
62+
- [Test Coverage](#test-coverage)
63+
- __[Documentation](#documentation)__
64+
- [Documentation Coverage](#documentation-coverage)
65+
- [Undocumented Models](#undocumented-models)
66+
- __[Structure](#structure)__
67+
- [Model Naming Conventions](#model-naming-conventions)
68+
- [Model Directories](#model-directories)
69+
- [Source Directories](#source-directories)
70+
- [Test Directories](#test-directories)
71+
- __[Performance](#performance)__
72+
- [Chained View Dependencies](#chained-view-dependencies)
73+
- [Exposure Parents Materializations](#exposure-parents-materializations)
74+
75+
### [Customization](#customization)
7976
- [Disabling Models](#disabling-models)
8077
- [Overriding Variables](#overriding-variables)
78+
- [Configuring exceptions to the rules](#configuring-exceptions-to-the-rules)
8179

82-
__[Querying the DAG with SQL](#querying-the-dag-with-sql)__
80+
### [Querying the DAG with SQL](#querying-the-dag-with-sql)
8381

84-
__[Limitations](#limitations)__
82+
### [Limitations](#limitations)
8583
- [BigQuery and Databricks](#bigquery-and-databricks)
8684

87-
__[Contributing](#contributing)__
85+
### [Contributing](#contributing)
8886

8987
----
9088

@@ -859,6 +857,54 @@ vars:
859857

860858
Setting `max_depth_dag` to a higher number might prevent the package from running properly on BigQuery and Databricks/Spark.
861859

860+
861+
### Configuring exceptions to the rules
862+
863+
While the rules defined in this package are considered best practices, a project may have legitimate exceptions to them. In those cases, you might want to exclude specific results so that the tests pass even though not every recommendation is followed.
864+
865+
For example, we could exclude all models whose names match `stg_..._unioned` from `fct_multiple_sources_joined`: some staging models may legitimately union 2 different tables that represent the same data, and we don't want the test to fail for those models.
866+
867+
The package lets you define a seed called `dbt_project_evaluator_exceptions.csv` that lists the exceptions we don't want reported. This seed must contain the following columns:
868+
- `fct_name`: the name of the fact table for which we want to define exceptions (note that specific models cannot be excluded from any of the `coverage` tests; variables are available to configure those to each user's needs instead)
869+
- `column_name`: the column name from `fct_name` we will be looking at to define exceptions
870+
- `id_to_exclude`: the values (or `like` pattern) we want to exclude for `column_name`
871+
- `comment`: a field where people can document why a given exception is legitimate
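
To make the `like` semantics of `id_to_exclude` more concrete, here is a minimal Python sketch (not part of the package) that emulates how a SQL `LIKE` pattern such as `stg_%_unioned` matches model names. The helpers `like_to_regex` and `is_excluded` and the sample model names are hypothetical, and the sketch assumes case-sensitive matching with no escape character, which may differ from your warehouse.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regular expression.

    Assumption: '%' matches any run of characters (including none) and
    '_' matches exactly one character; no ESCAPE clause is handled.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def is_excluded(value, pattern):
    """Return True when `value` matches the LIKE pattern in full."""
    return like_to_regex(pattern).fullmatch(value) is not None


# Hypothetical model names checked against the pattern used in this section.
for name in ["stg_orders_unioned", "stg_orders", "int_orders_unioned"]:
    print(name, is_excluded(name, "stg_%_unioned"))
```

Keep in mind that `_` is itself a single-character wildcard in `LIKE`, so the pattern is slightly broader than it looks.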
872+
873+
The following section describes the steps to follow to configure exceptions.
874+
875+
#### 1. Create a new seed
876+
877+
With our previous example, the seed `dbt_project_evaluator_exceptions.csv` would look like:
878+
```
879+
fct_name,column_name,id_to_exclude,comment
880+
fct_multiple_sources_joined,child,stg_%_unioned,Models called _unioned can union multiple sources
881+
```
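
As a quick sanity check before loading the seed, the header of the CSV can be verified with a few lines of Python. This is only an illustrative sketch: `missing_columns` is a hypothetical helper, not something the package provides, and the CSV text is inlined here rather than read from your seeds directory.

```python
import csv
import io

# The four columns the seed is documented to require.
REQUIRED_COLUMNS = ["fct_name", "column_name", "id_to_exclude", "comment"]


def missing_columns(csv_text):
    """Return the required columns that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return [column for column in REQUIRED_COLUMNS if column not in header]


# The example seed from this section, inlined for illustration.
seed_text = (
    "fct_name,column_name,id_to_exclude,comment\n"
    "fct_multiple_sources_joined,child,stg_%_unioned,"
    "Models called _unioned can union multiple sources\n"
)

print(missing_columns(seed_text))
```

An empty list means every expected column is present.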
882+
883+
which looks like the following when loaded in the warehouse:
884+
885+
|fct_name |column_name|id_to_exclude |comment |
886+
|---------------------------|-----------|----------------|--------------------------------------------------|
887+
|fct_multiple_sources_joined|child |stg\_%\_unioned |Models called \_unioned can union multiple sources|
888+
889+
890+
#### 2. Deactivate the seed from the original package
891+
892+
Only a single seed can exist with a given name. When using a custom seed, we need to deactivate the one shipped with the package by adding the following to our `dbt_project.yml`:
893+
```
894+
seeds:
895+
  dbt_project_evaluator:
896+
    dbt_project_evaluator_exceptions:
897+
      +enabled: false
898+
```
899+
900+
#### 3. Run the seed and the package
901+
902+
We then run both the seed and the package by executing the following command:
903+
```
904+
dbt build --select package:dbt_project_evaluator dbt_project_evaluator_exceptions
905+
```
906+
907+
862908
----
863909
864910
## Querying the DAG with SQL

integration_tests/dbt_project.yml

Lines changed: 6 additions & 1 deletion
@@ -40,4 +40,9 @@ tests:
4040
dbt_project_evaluator_schema_tests:
4141
unique_int_all_dag_relationships_path:
4242
# Grouping by expressions of type ARRAY is not allowed for BigQuery
43-
+enabled: "{{ false if target.type in ['bigquery'] else true }}"
43+
+enabled: "{{ false if target.type in ['bigquery'] else true }}"
44+
45+
seeds:
46+
  dbt_project_evaluator:
47+
    dbt_project_evaluator_exceptions:
48+
      +enabled: false
Lines changed: 1 addition & 2 deletions
@@ -1,3 +1,2 @@
11
parent,parent_resource_type,child,child_resource_type,distance
2-
source_1.table_2,source,int_model_4,model,1
3-
stg_model_1,model,int_model_4,model,1
2+
source_1.table_2,source,int_model_4,model,1
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
1+
fct_name,column_name,id_to_exclude,comment
2+
fct_direct_join_to_source,parent_id,model.dbt_project_evaluator_integration_tests.stg_model_1,This is actually OK because...

macros/filter_exceptions.sql

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
1+
{% macro filter_exceptions(model_ref) %}
2+
3+
    {% set query_filters %}
4+
    select
5+
        column_name,
6+
        id_to_exclude
7+
    from {{ ref('dbt_project_evaluator_exceptions') }}
8+
    where fct_name = '{{ model_ref.name }}'
9+
    {% endset %}
10+
11+
    {% if execute %}
12+
    where 1 = 1
13+
        {% for row_filter in run_query(query_filters) %}
14+
            and {{ row_filter[0] }} not like '{{ row_filter[1] }}'
15+
        {% endfor %}
16+
    {% endif %}
17+
18+
{% endmacro %}
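
To see what this macro contributes to a model, the following Python sketch mimics the SQL fragment it renders at execution time. It is an illustration under assumptions, not the package's code: `render_filter_exceptions` is a hypothetical stand-in for the Jinja loop, and the exception rows are passed in as tuples instead of being queried from the seed.

```python
def render_filter_exceptions(exception_rows, fct_name):
    """Build the `where` fragment appended to one fact model.

    exception_rows: iterable of (fct_name, column_name, id_to_exclude)
    tuples, standing in for the rows of the exceptions seed.
    """
    lines = ["where 1 = 1"]
    for row_fct_name, column_name, id_to_exclude in exception_rows:
        # Only rows registered for this fact model become filters.
        if row_fct_name == fct_name:
            lines.append(f"    and {column_name} not like '{id_to_exclude}'")
    return "\n".join(lines)


# Rows taken from the two seed examples shown in this commit.
rows = [
    ("fct_multiple_sources_joined", "child", "stg_%_unioned"),
    (
        "fct_direct_join_to_source",
        "parent_id",
        "model.dbt_project_evaluator_integration_tests.stg_model_1",
    ),
]

print(render_filter_exceptions(rows, "fct_multiple_sources_joined"))
```

Because the fragment starts with `where`, it can only be appended to a statement that ends in a bare `select ... from ...`, which is presumably why the trailing `where is_loop_independent` in `fct_rejoining_of_upstream_concepts` moves into a `final_filtered` CTE. The values are interpolated as-is, so a single quote inside `id_to_exclude` would likely break the generated SQL.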

models/marts/dag/fct_direct_join_to_source.sql

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -33,4 +33,6 @@ final as (
3333
order by direct_model_relationships.child
3434
)
3535

36-
select * from final
36+
select * from final
37+
38+
{{ filter_exceptions(this) }}

models/marts/dag/fct_marts_or_intermediate_dependent_on_source.sql

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,4 +15,6 @@ final as (
1515
where parent_resource_type = 'source'
1616
and child_model_type in ('marts', 'intermediate')
1717
)
18-
select * from final
18+
select * from final
19+
20+
{{ filter_exceptions(this) }}

models/marts/dag/fct_model_fanout.sql

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -29,3 +29,5 @@ model_fanout as (
2929
)
3030

3131
select * from model_fanout
32+
33+
{{ filter_exceptions(this) }}

models/marts/dag/fct_multiple_sources_joined.sql

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -16,4 +16,6 @@ multiple_sources_joined as (
1616
having count(*) > 1
1717
)
1818

19-
select * from multiple_sources_joined
19+
select * from multiple_sources_joined
20+
21+
{{ filter_exceptions(this) }}

models/marts/dag/fct_rejoining_of_upstream_concepts.sql

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -55,7 +55,13 @@ final as (
5555
from triad_relationships
5656
left join single_use_resources
5757
on triad_relationships.parent_and_child = single_use_resources.parent
58+
),
59+
60+
final_filtered as (
61+
select * from final
62+
where is_loop_independent
5863
)
5964

60-
select * from final
61-
where is_loop_independent
65+
select * from final_filtered
66+
67+
{{ filter_exceptions(this) }}
