Skip to content

Commit b40b1b2

Browse files
authored
Simplify lakeformation shares - single shared_db and single resource link table (#1016)
### Feature or Bugfix - Feature ### Detail Implements design specified in #746. Multiple share requests share a single shared Glue database instead of a shared database per share request. Same for resource link tables, multiple shares in the same target account get permissions to the same resource link table. In addition, this PR includes: - Unified LakeFormation processor for same account and cross account shares (remove duplicated code) - Code clean-up of Lake Formation manager and client: remove duplicates, added more error handling - Ensure "no-failures" if a resource already exists or is already deleted or if a permission is already granted or revoked - Redesigned tests and increased coverage #### Upgrade LakeFormation version LakeFormation v3 ([docs](https://docs.aws.amazon.com/lake-formation/latest/dg/optimize-ram.html)) simplifies cross-account sharing requests. We want to see if there are advantages of implementing the capability of LakeFormation to define the external IAM role directly, instead of using first an AWS account and then the role. In addition, v4 introduces LF hybrid mode. We are not using it at the moment, but it can be a great addition in future features. Because we already distinguish between "old" and "new" shares, it is relatively easy to upgrade to a new LF version without affecting "old" shares. **Does it make sense to upgrade the sharing mechanism?** - Is it possible technically for all principals? --> No, which is a blocker ```Could not grant principal arn:aws:quicksight:us-east-1:11111111111:group/default/dataall permissions ['DESCRIBE', 'SELECT'] to {'Table': {'DatabaseName': 'dataall_imported_c3_tvnews81', 'Name': 'children_books_raw', 'CatalogId': '2222222222222'}} due to: An error occurred (InvalidInputException) when calling the GrantPermissions operation: Cross account requests are only allowed for AWS Accounts, Organizations, IAM Principals and All IAMPrincipals``` - What actions are simplified? In this [commit ](f5b9bee) we can see how the final implementation would look like. **How does the migration look like?** 1. Upgrade LF version: It is possible to change the LF cross account settings programmatically using [put_data_lake_settings](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/lakeformation/client/put_data_lake_settings.html) API. We already use this API in a custom resource of the environment stack. We could introduce the LF version as part of that call and make sure that it is executed on update (not only on create). 2. Modify code in sharing task: for old_shares use the AWS account and for new shares use the IAM roles as principals of the grant. for old_shares execute additional grant for principals in target, for new shares skip. See [commit ](f5b9bee). 3. Backwards compatibility for shares done with V2: - At the moment the code is meant to work for versions >= v2. v4 is very recent and most AWS accounts use v3 - Grant: If a RAM request is v2 and the LF version is upgraded to v3, new tables added to the RAM invitation can use v3 underlying code (I tested one table shared with old v2 procedure and added one table to the request with new code changes and it succeeds. In source the principals are the target IAM roles not the target account) - Revoke: not tested **Conclusion** - We cannot implement LF v3 direct sharing with principals because it does not support Quicksight groups. - I think that setting the LF version in the environment stack is a good idea regardless of it being explicitly used It gives us control to set the latest version in the accounts we use. This is a feature that should be part of a separate PR. - v2 RAM invitations still work with v3 direct-IAM role code. Therefore we could keep using v2 shares without the need to inspect shares and upgrade them. This means that the upgrade is pretty seamless, and we can take it at a later time when the Quicksight blocker is resolved. Issues resolved: - [X] Issue: deletion of new "shared" database does not happen when other old shares are still present in environment account. - [X] ~[Pre-existing issue]: In cross-account shares revoke ALL does not revoke permissions to QS group (by-design)~ - [X] [Pre-existing issue]: in same account requesters have access to original table. It is needed! If we remove access to the original table the resource link access to data does not work. - [X] ~Issue: when revoking, permissions to the shared db are not revoked for principal~ - [X] ~Issue: shared db is not deleted when all shares are revoked~ - [X] ~Issue: duplication of LF permissions. Pivot role and QS group were granted permissions on the original db and on the shared db with every new share~ - [ ] [Pre-existing issue]: orphan LF resources on failed shares. If a share fails it can be deleted from data.all UI but there are resources that are created in LF that require manual clean-up (ugly operation assuming pivot role) - [X] ~Changes in UI: data consumption details~ ### Next steps and remarks - Because there are shared resources between shares, it is possible that parallel shares on the same tables have conflicts: e.g. 2 shares trying to create a resource link simultaneously. I have tried testing with shares approved in a matter of seconds and did not run into issues (at the end a look-up is way faster than updating a policy). But we should keep an eye on this. - Upgrade Lake Formation version. Not possible at this moment, because it would not work for Quicksight groups. To be re-evaluated in the future. - I think the orphan LF resources on failed shares can be done in a separate PR to limit the scope of this PR. - I tried to revoke the requester access to the original table in same account sharing, but it is not possible. For the requester to be able to access the underlying data of the resource link, it needs to have access to the table. - The issue "deletion of new "shared" database does not happen when other old shares are still present in environment account." has not a big impact, but I implemented a solution nevertheless. ### Relates - #746 ### Security Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/). - Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)? - Is the input sanitized? - What precautions are you taking before deserializing the data you consume? - Is injection prevented by parametrizing queries? - Have you ensured no `eval` or similar functions are used? - Does this PR introduce any functionality or component that requires authorization? - How have you ensured it respects the existing AuthN/AuthZ mechanisms? - Are you logging failed auth attempts? - Are you using or adding any cryptographic features? - Do you use a standard proven implementations? - Are the used keys controlled by the customer? Where are they stored? - Are you introducing any new policies/roles/users? - Have you used the least-privilege principle? How? By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
1 parent 752e824 commit b40b1b2

File tree

16 files changed

+1285
-1559
lines changed

16 files changed

+1285
-1559
lines changed

backend/dataall/base/utils/alarm_service.py

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -74,4 +74,3 @@ def publish_message_to_alarms_topic(self, subject, message):
7474
return response
7575
except ClientError as e:
7676
logger.error(f'Failed to deliver message due to: {e} ')
77-
raise e

backend/dataall/base/utils/naming_convention.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ class NamingConventionPattern(Enum):
77

88
S3 = {'regex': '[^a-zA-Z0-9-]', 'separator': '-', 'max_length': 63}
99
IAM = {'regex': '[^a-zA-Z0-9-_]', 'separator': '-', 'max_length': 63}
10-
GLUE = {'regex': '[^a-zA-Z0-9_]', 'separator': '_', 'max_length': 63}
10+
GLUE = {'regex': '[^a-zA-Z0-9_]', 'separator': '_', 'max_length': 240} # Limit 255 - 15 extra chars buffer
1111
GLUE_ETL = {'regex': '[^a-zA-Z0-9-]', 'separator': '-', 'max_length': 52}
1212
NOTEBOOK = {'regex': '[^a-zA-Z0-9-]', 'separator': '-', 'max_length': 63}
1313
MLSTUDIO_DOMAIN = {'regex': '[^a-zA-Z0-9-]', 'separator': '-', 'max_length': 63}

backend/dataall/modules/dataset_sharing/api/resolvers.py

Lines changed: 6 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -182,18 +182,12 @@ def resolve_group(context: Context, source: ShareObject, **kwargs):
182182
def resolve_consumption_data(context: Context, source: ShareObject, **kwargs):
183183
if not source:
184184
return None
185-
with context.engine.scoped_session() as session:
186-
ds: Dataset = DatasetRepository.get_dataset_by_uri(session, source.datasetUri)
187-
if ds:
188-
S3AccessPointName = utils.slugify(
189-
source.datasetUri + '-' + source.principalId,
190-
max_length=50, lowercase=True, regex_pattern='[^a-zA-Z0-9-]', separator='-'
191-
)
192-
return {
193-
's3AccessPointName': S3AccessPointName,
194-
'sharedGlueDatabase': (ds.GlueDatabaseName + '_shared_' + source.shareUri)[:254] if ds else 'Not created',
195-
's3bucketName': ds.S3BucketName,
196-
}
185+
return ShareObjectService.resolve_share_object_consumption_data(
186+
uri=source.shareUri,
187+
datasetUri=source.datasetUri,
188+
principalId=source.principalId,
189+
environmentUri=source.environmentUri
190+
)
197191

198192

199193
def resolve_share_object_statistics(context: Context, source: ShareObject, **kwargs):

backend/dataall/modules/dataset_sharing/aws/glue_client.py

Lines changed: 95 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -16,16 +16,26 @@ def __init__(self, account_id, region, database):
1616

1717
def create_database(self, location):
1818
try:
19-
existing_database = self.database_exists()
19+
log.info(
20+
f'Creating database {self._database} '
21+
f'in account {self._account_id}...'
22+
)
23+
existing_database = self.get_glue_database()
2024
if existing_database:
2125
glue_database_created = True
2226
else:
2327
self._create_glue_database(location)
2428
glue_database_created = True
29+
log.info(
30+
f'Successfully created database {self._database}'
31+
f'in account {self._account_id}'
32+
)
2533
return glue_database_created
2634
except ClientError as e:
2735
log.error(
28-
f'Failed to create database {self._database} on account {self._account_id} due to {e}'
36+
f'Failed to create database {self._database} '
37+
f'in account {self._account_id} '
38+
f'due to {e}'
2939
)
3040
raise e
3141

@@ -39,89 +49,141 @@ def _create_glue_database(self, location):
3949
}
4050
if location:
4151
db_input['LocationUri'] = location
42-
log.info(f'Creating Glue database with input: {db_input}')
4352
response = self._client.create_database(CatalogId=self._account_id, DatabaseInput=db_input)
44-
log.info(f'response Create Database: {response}')
4553
return response
4654
except ClientError as e:
47-
log.debug(f'Failed to create database {database}', e)
4855
raise e
4956

50-
def database_exists(self):
57+
def get_glue_database(self):
58+
try:
59+
log.info(
60+
f'Getting database {self._database} '
61+
f'in account {self._account_id}...'
62+
)
63+
database = self._client.get_database(CatalogId=self._account_id, Name=self._database)
64+
return database
65+
except ClientError:
66+
log.info(f'Database {self._database} not found in account {self._account_id}')
67+
return False
68+
69+
def database_exists(self, database_name):
5170
try:
52-
self._client.get_database(CatalogId=self._account_id, Name=self._database)
71+
log.info(
72+
f'Check database exists {self._database} '
73+
f'in account {self._account_id}...'
74+
)
75+
self._client.get_database(CatalogId=self._account_id, Name=database_name)
5376
return True
5477
except ClientError:
55-
log.info(f'Database {self._database} does not exist on account {self._account_id}...')
78+
log.info(f'Database {database_name} not found in account {self._account_id}')
5679
return False
5780

5881
def table_exists(self, table_name):
5982
try:
83+
log.info(
84+
f'Check table exists {table_name} '
85+
f'in database {self._database} '
86+
f'in account {self._account_id}...'
87+
)
6088
table = (
6189
self._client.get_table(
6290
CatalogId=self._account_id, DatabaseName=self._database, Name=table_name
6391
)
6492
)
65-
log.info(f'Glue table found: {table_name}')
93+
log.info(f'Glue table {table_name} not found in account {self._account_id} in database {self._database}')
6694
return table
6795
except ClientError:
6896
log.info(f'Glue table not found: {table_name}')
6997
return None
7098

7199
def delete_table(self, table_name):
72100
database = self._database
73-
log.info(
74-
'Deleting table {} in database {}'.format(
75-
table_name, database
76-
)
77-
)
78-
response = self._client.delete_table(
79-
CatalogId=self._account_id,
80-
DatabaseName=database,
81-
Name=table_name
82-
)
83-
84-
return response
101+
try:
102+
log.info(
103+
f'Deleting table {table_name} '
104+
f'in database {self._database} '
105+
f'in catalog {self._account_id}...'
106+
)
107+
response = self._client.delete_table(
108+
CatalogId=self._account_id,
109+
DatabaseName=database,
110+
Name=table_name
111+
)
112+
log.info(
113+
f'Successfully deleted table {table_name} '
114+
f'in database {database} '
115+
f'in catalog {self._account_id} '
116+
f'response: {response}'
117+
)
118+
return response
119+
except ClientError as e:
120+
log.error(
121+
f'Could not delete table {table_name} '
122+
f'in database {database} '
123+
f'in catalog {self._account_id} '
124+
f'due to: {e}'
125+
)
126+
raise e
85127

86-
def create_resource_link(self, resource_link_name, resource_link_input):
128+
def create_resource_link(self, resource_link_name, table, catalog_id):
87129
account_id = self._account_id
88-
database = self._database
130+
shared_database = self._database
131+
resource_link_input = {
132+
'Name': table.GlueTableName,
133+
'TargetTable': {
134+
'CatalogId': catalog_id,
135+
'DatabaseName': table.GlueDatabaseName,
136+
'Name': table.GlueTableName,
137+
},
138+
}
89139

90-
log.info(
91-
f'Creating ResourceLink {resource_link_name} in database {account_id}://{database}'
92-
)
93140
try:
141+
log.info(
142+
f'Creating ResourceLink {resource_link_name} '
143+
f'in database {shared_database}...'
144+
)
94145
resource_link = self.table_exists(resource_link_name)
95146
if resource_link:
96147
log.info(
97-
f'ResourceLink {resource_link_name} already exists in database {account_id}://{database}'
148+
f'ResourceLink {resource_link_name} '
149+
f'in database {account_id}://{shared_database}'
150+
f'already exists: {resource_link}'
98151
)
99152
else:
100153
resource_link = self._client.create_table(
101154
CatalogId=account_id,
102-
DatabaseName=database,
155+
DatabaseName=shared_database,
103156
TableInput=resource_link_input,
104157
)
105158
log.info(
106-
f'Successfully created ResourceLink {resource_link_name} in database {account_id}://{database}'
159+
f'Successfully created ResourceLink {resource_link_name} '
160+
f'in database {account_id}://{shared_database} '
161+
f'response: {resource_link}'
107162
)
108163
return resource_link
109164
except ClientError as e:
110165
log.error(
111166
f'Could not create ResourceLink {resource_link_name} '
112-
f'in database {account_id}://{database} '
167+
f'in database {account_id}://{shared_database} '
113168
f'due to: {e}'
114169
)
115170
raise e
116171

117172
def delete_database(self):
118173
account_id = self._account_id
119174
database = self._database
120-
121-
log.info(f'Deleting database {account_id}://{database} ...')
122175
try:
123-
if self.database_exists():
176+
log.info(
177+
f'Deleting database {self._database} '
178+
f'in account {self._account_id}...'
179+
)
180+
existing_database = self.get_glue_database()
181+
if existing_database:
124182
self._client.delete_database(CatalogId=account_id, Name=database)
183+
log.info(
184+
f'Successfully deleted database {database} '
185+
f'in account {account_id}'
186+
)
125187
return True
126188
except ClientError as e:
127189
log.error(

0 commit comments

Comments
 (0)