Container support for AWS #205

prajinkhadka · 2024-06-11T16:43:41Z

Added Support for container deployment in AWS.

Build the Image Locally,
Push the tagged Image to ECR
Check the container_deployment flag, and deploy accordingly ( the Image URI or the code directly )
Maintain cache ( added a parameter, container_uri in the cache config JSON )

Summary by CodeRabbit

Release Notes

New Features
- Introduced a setup script for the Python package to streamline installation.
- Added support for container deployment across multiple cloud platforms, enhancing flexibility in function management.
- Updated configuration options to support AWS deployment with Python runtime.
- Enhanced command-line interface with a new flag for container deployment.
- Expanded the functionality of the AWS, Azure, GCP, and OpenWhisk classes to manage Docker images and container deployments.
- Added a new ECRContainer class for managing Docker images in AWS Elastic Container Registry.
- Updated documentation to clarify deployment processes and architecture options.
Bug Fixes
- Improved error handling for unsupported deployments and clarified error messages.
Documentation
- Updated documentation to reflect new features and usage instructions for container deployment.
Chores
- Updated dependency lists to include necessary libraries for enhanced functionality.

mcopik

Very nice progress! I left some comments.

The important issues are:

CLI option and documentation
Testing Node.js
Moving resource allocation to proper resource classes.
Checking the "update" of container-based function on AWS.
Regression testing

benchmarks/wrappers/aws/python/setup.py

config/example.json

sebs/aws/aws.py

sebs/faas/system.py

sebs/aws/aws.py

prajinkhadka · 2024-06-24T00:05:47Z

Very nice progress! I left some comments.

The important issues are:

* CLI option and documentation

* Testing Node.js

* Moving resource allocation to proper resource classes.

* Checking the "update" of container-based function on AWS.

* Regression testing

Hi, Marcin, I have reviewed the changes. Also, I performed the regression tests for both Node.js and Python.
I also added the logic to Fail early if container deployment flag is set as True for other than AWS.

regressionlogs_nodejs.txt
regressionlogs_python.txt

config/example.json

docs/usage.md

sebs/aws/config.py

mcopik · 2024-07-02T16:25:55Z

@coderabbitai review

coderabbitai · 2024-07-02T16:26:03Z

Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2024-07-02T16:26:06Z

Caution

Review failed

The pull request is closed.

Walkthrough

This pull request introduces a variety of changes across multiple files to enhance support for containerized deployment in serverless benchmarks. Key modifications include the addition of a new setup script for a Python package, updates to configuration files to reflect AWS deployment settings, and enhancements to Dockerfiles for both Python and Node.js functions. Additionally, several classes and methods across the codebase have been updated to accommodate new parameters related to container deployment, improving the functionality and flexibility of the deployment process.

Changes

File Path	Change Summary
`benchmarks/wrappers/aws/python/setup.py`	New setup script added for Python package, specifying dependencies and package structure.
`config/example.json`	Updated deployment settings from "openwhisk" to "aws", changed runtime from "nodejs" to "python" 3.8, added "container_deployment" field.
`config/systems.json`	Added "container_deployment" key, updated deployment files for AWS Python, included new package dependency for Node.js.
`dockerfiles/aws/nodejs/Dockerfile.function`	New Dockerfile for Node.js function, includes build argument for base image and sets up function handler.
`dockerfiles/aws/python/Dockerfile.function`	New Dockerfile for Python function, includes conditional installation of dependencies based on version.
`docs/modularity.md`	Updated documentation for clarity on adding benchmarks and extending the SeBS framework.
`docs/usage.md`	Added `--container-deployment` flag to CLI documentation and refined local deployment instructions.
`requirements.aws.txt`	Updated `boto3-stubs` to include `ecr` support.
`requirements.txt`	Added `rich` package for terminal formatting.
`sebs.py`	Enhanced CLI with new `--container-deployment` option, updated function signatures to include new parameters.
`sebs/aws/aws.py`	Updated methods for Docker image management and Lambda function creation, added new methods for ECR interactions.
`sebs/aws/config.py`	Enhanced `AWSResources` class with Docker registry management methods and properties.
`sebs/azure/azure.py`	Updated methods to handle container deployments, added error handling for unsupported features.
`sebs/benchmark.py`	Added properties for managing container-related information.
`sebs/cache.py`	Updated methods to include `container_uri` for code package management.
`sebs/config.py`	Added method for retrieving supported container deployment options.
`sebs/experiments/config.py`	Introduced `_container_deployment` attribute for configuration management.
`sebs/experiments/invocation_overhead.py`	Updated `before_sample` method to include new parameters for function updates.
`sebs/faas/system.py`	Enhanced methods for packaging and creating functions with container deployment support.
`sebs/gcp/gcp.py`	Updated methods to accommodate container deployment parameters.
`sebs/local/local.py`	Enhanced methods to manage container deployment parameters.
`sebs/openwhisk/openwhisk.py`	Updated methods for Docker image management and function creation with container deployment parameters.
`sebs/sebs.py`	Modified `get_deployment` method to validate container deployment configuration.

Possibly related issues

Container Support for AWS #228: This PR addresses container support for AWS, which aligns with the changes made to enhance Docker and container deployment capabilities in this pull request.

Possibly related PRs

updated support for benchmark language versions #198: Updates support for different language versions, directly related to the new setup script for the Python package.
Minor updates to PR #199 #226: Modifications to the command-line interface in sebs.py, which may interact with the new deployment configurations.

🐰 In a world of clouds and code,
Where functions hop and data flowed,
We build with care, with Docker's might,
Container dreams take joyful flight!
So let us cheer, with paws so spry,
For benchmarks soar, oh me, oh my! 🐇

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 13

Outside diff range and nitpick comments (4)

sebs/openwhisk/openwhisk.py (2)

Line range hint 123-206: Optimize Docker Image Build and Push Logic.

The method includes redundant checks and can be optimized for clarity and performance.

-        if registry_name is not None and registry_name != "":
-            repository_name = f"{registry_name}/{repository_name}"
-        else:
-            registry_name = "Docker Hub"
+        repository_name = f"{registry_name}/{repository_name}" if registry_name else "Docker Hub"

        # Check if the image is already in the registry.
        if is_cached and self.find_image(self.docker_client, repository_name, image_tag):
            self.logging.info(
                f"Skipping building OpenWhisk Docker package for {benchmark}, using "
                f"Docker image {repository_name}:{image_tag} from registry: "
                f"{registry_name}."
            )
            return False, repository_name

        self.logging.info(
            f"Image {repository_name}:{image_tag} doesn't exist in the registry, "
            f"building OpenWhisk package for {benchmark}."
        )

        build_dir = os.path.join(directory, "docker")
        os.makedirs(build_dir, exist_ok=True)
        shutil.copy(
            os.path.join(DOCKER_DIR, self.name(), language_name, "Dockerfile.function"),
            os.path.join(build_dir, "Dockerfile"),
        )

        for fn in os.listdir(directory):
            if fn not in ("index.js", "__main__.py"):
                file = os.path.join(directory, fn)
                shutil.move(file, build_dir)

        with open(os.path.join(build_dir, ".dockerignore"), "w") as f:
            f.write("Dockerfile")

        builder_image = self.system_config.benchmark_base_images(self.name(), language_name)[
            language_version
        ]
        self.logging.info(f"Build the benchmark base image {repository_name}:{image_tag}.")

        buildargs = {"VERSION": language_version, "BASE_IMAGE": builder_image}
        image, _ = self.docker_client.images.build(
            tag=f"{repository_name}:{image_tag}", path=build_dir, buildargs=buildargs
        )

        # Now push the image to the registry
        self.logging.info(
            f"Push the benchmark base image {repository_name}:{image_tag} "
            f"to registry: {registry_name}."
        )
        self.push_image_to_repository(self.docker_client, repository_name, image_tag)

        return True, repository_name

Line range hint 255-335: Optimize Function Creation Logic.

The method includes redundant checks and can be optimized for clarity and performance.

-            function_found = False
-            docker_image = ""
+            function_found = any(
+                func_name in line.split()[0] for line in actions.stdout.decode().split("\n") if line
+            )

            function_cfg = OpenWhiskFunctionConfig.from_benchmark(code_package)
            function_cfg.storage = cast(Minio, self.get_storage()).config
            if function_found:
                # docker image is overwritten by the update
                res = OpenWhiskFunction(
                    func_name, code_package.benchmark, code_package.hash, function_cfg
                )
                # Update function - we don't know what version is stored
                self.logging.info(f"Retrieved existing OpenWhisk action {func_name}.")
                self.update_function(res, code_package, container_deployment, container_uri)
            else:
                try:
                    self.logging.info(f"Creating new OpenWhisk action {func_name}")
                    docker_image = self.system_config.benchmark_image_name(
                        self.name(),
                        code_package.benchmark,
                        code_package.language_name,
                        code_package.language_version,
                    )
                    subprocess.run(
                        [
                            *self.get_wsk_cmd(),
                            "action",
                            "create",
                            func_name,
                            "--web",
                            "true",
                            "--docker",
                            docker_image,
                            "--memory",
                            str(code_package.benchmark_config.memory),
                            "--timeout",
                            str(code_package.benchmark_config.timeout * 1000),
                            *self.storage_arguments(),
                            code_package.code_location,
                        ],
                        stderr=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        check=True,
                    )
                    function_cfg.docker_image = docker_image
                    res = OpenWhiskFunction(
                        func_name, code_package.benchmark, code_package.hash, function_cfg
                    )
                except subprocess.CalledProcessError as e:
                    self.logging.error(f"Cannot create action {func_name}.")
                    self.logging.error(f"Output: {e.stderr.decode('utf-8')}")
                    raise RuntimeError(e)

        except FileNotFoundError:
            self.logging.error("Could not retrieve OpenWhisk functions - is path to wsk correct?")
            raise RuntimeError("Failed to access wsk binary")

        # Add LibraryTrigger to a new function
        trigger = LibraryTrigger(func_name, self.get_wsk_cmd())
        trigger.logging_handlers = self.logging_handlers
        res.add_trigger(trigger)

        return res

sebs.py (1)

Line range hint 541-568: Optimize Resource Listing Logic.

The function includes redundant variables and can be optimized for clarity and performance.

-        for idx, bucket in enumerate(buckets):
+        for bucket in buckets:
            sebs_client.logging.info(f"({idx}) {bucket}")

    elif resource == "resource-groups":

        if deployment_client.name() != "azure":
            sebs_client.logging.error("Resource groups are only supported on Azure!")
            return

-        groups = deployment_client.config.resources.list_resource_groups(
-            deployment_client.cli_instance
-        )
-        sebs_client.logging.info("Resource grup:")
-        for idx, bucket in enumerate(groups):
-            sebs_client.logging.info(f"({idx}) {bucket}")
+        groups = deployment_client.config.resources.list_resource_groups(deployment_client.cli_instance)
+        sebs_client.logging.info("Resource group:")
+        for group in groups:
+            sebs_client.logging.info(f"{group}")

sebs/aws/aws.py (1)

Line range hint 150-188: Use f-strings for improved readability.

Convert format calls to f-strings for better readability.

-        execute("zip -qu -r9 {}.zip * .".format(benchmark), shell=True, cwd=directory)
-        benchmark_archive = "{}.zip".format(os.path.join(directory, benchmark))
-        self.logging.info("Created {} archive".format(benchmark_archive))
+        execute(f"zip -qu -r9 {benchmark}.zip * .", shell=True, cwd=directory)
+        benchmark_archive = f"{os.path.join(directory, benchmark)}.zip"
+        self.logging.info(f"Created {benchmark_archive} archive")

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 0b8d777 and df21c86.

Files selected for processing (22)

benchmarks/wrappers/aws/python/setup.py (1 hunks)
config/example.json (2 hunks)
config/systems.json (2 hunks)
dockerfiles/aws/nodejs/Dockerfile.function (1 hunks)
dockerfiles/aws/python/Dockerfile.function (1 hunks)
docs/modularity.md (1 hunks)
docs/usage.md (1 hunks)
sebs.py (14 hunks)
sebs/aws/aws.py (10 hunks)
sebs/aws/config.py (5 hunks)
sebs/azure/azure.py (6 hunks)
sebs/benchmark.py (6 hunks)
sebs/cache.py (4 hunks)
sebs/config.py (2 hunks)
sebs/experiments/config.py (4 hunks)
sebs/experiments/invocation_overhead.py (1 hunks)
sebs/faas/system.py (4 hunks)
sebs/gcp/gcp.py (4 hunks)
sebs/local/local.py (3 hunks)
sebs/openwhisk/openwhisk.py (9 hunks)
sebs/regression.py (1 hunks)
sebs/sebs.py (2 hunks)

Files not reviewed due to errors (1)

sebs/gcp/gcp.py (no review received)

Files skipped from review due to trivial changes (3)

benchmarks/wrappers/aws/python/setup.py
dockerfiles/aws/nodejs/Dockerfile.function
dockerfiles/aws/python/Dockerfile.function

Additional context used

Ruff

sebs/experiments/config.py

64-64: Use config.get("flags", {}) instead of an if block

Replace with config.get("flags", {})

(SIM401)

sebs/sebs.py

110-110: Use f-string instead of format call

Convert to f-string

(UP032)

115-115: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/local/local.py

154-154: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/experiments/invocation_overhead.py

41-41: Avoid extraneous parentheses

Remove extraneous parentheses

(UP034)

sebs/faas/system.py

263-263: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/aws/config.py

3-3: uuid imported but unused

Remove unused import: uuid

(F401)

243-243: Use f-string instead of format call

Convert to f-string

(UP032)

266-266: Use dct.get("lambda-role", "") instead of an if block

Replace with dct.get("lambda-role", "")

(SIM401)

sebs/openwhisk/openwhisk.py

238-238: Use f-string instead of format call

Convert to f-string

(UP032)

sebs.py

139-139: Use f-string instead of format call

Convert to f-string

(UP032)

370-370: Local variable config is assigned to but never used

Remove assignment to unused variable config

(F841)

618-618: Loop control variable idx not used within loop body

Rename unused idx to _idx

(B007)

sebs/azure/azure.py

217-217: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/benchmark.py

180-180: Use f-string instead of format call

Convert to f-string

(UP032)

490-490: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/gcp/gcp.py

199-199: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/aws/aws.py

80-80: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

176-176: Use f-string instead of format call

Convert to f-string

(UP032)

182-182: Use f-string instead of format call

Convert to f-string

(UP032)

185-185: Use f-string instead of format call

Convert to f-string

(UP032)

378-378: Use f-string instead of format call

Convert to f-string

(UP032)

393-393: Use context handler for opening files

(SIM115)

401-401: Use f-string instead of format call

Convert to f-string

(UP032)

408-410: Use f-string instead of format call

Convert to f-string

(UP032)

516-520: Use f-string instead of format call

Convert to f-string

(UP032)

Additional comments not posted (48)

config/example.json (2)

7-7: Reminder: Ensure the new parameter container_deployment is documented.

The parameter container_deployment is a new addition. Ensure it is documented as an optional feature.

50-54: Clarification on docker_registry key.

The docker_registry key includes registry name, username, password, and container repository. Ensure these details are correctly configured.

sebs/experiments/config.py (4)

10-10: New attribute _container_deployment added.

The attribute _container_deployment is a new addition to the Config class.

35-38: New property container_deployment added.

The property container_deployment is a new addition to the Config class.

50-50: Include container_deployment in serialization.

The container_deployment attribute is now included in the serialized output.

62-62: Include container_deployment in deserialization.

The container_deployment attribute is now included in the deserialized input.

sebs/config.py (2)

42-44: New method supported_container_deployment added.

The method supported_container_deployment is a new addition to the SeBSConfig class.

60-62: Update benchmark_image_name method.

The method benchmark_image_name now includes an optional registry parameter.

config/systems.json (2)

3-4: New parameter container_deployment added.

The parameter container_deployment is a new addition under the general section.

74-75: Include storage.py and setup.py in deployment configuration.

The files storage.py and setup.py are now included in the aws deployment configuration for Python.
docs/usage.md (1)
7-7: Clarify the scope of the --container-deployment flag.

The sentence "To enforce redeployment of code, benchmark inputs, container deployment (supported in AWS) please use flags --update-code, --update-storage and --container-deployment respectively." could be clearer. Consider rephrasing for better readability.
- To enforce redeployment of code, benchmark inputs, container deployment (supported in AWS) please use flags `--update-code`, `--update-storage` and `--container-deployment` respectively.
+ To enforce redeployment of code, benchmark inputs, and container deployment (supported in AWS), please use the flags `--update-code`, `--update-storage`, and `--container-deployment` respectively.
sebs/sebs.py (1)
88-89: Use f-string instead of format call.

Convert to f-string for better readability and performance.
- raise RuntimeError("Deployment {name} not supported!".format(name=name))
+ raise RuntimeError(f"Deployment {name} not supported!")
Likely invalid or redundant comment.
sebs/local/local.py (1)
135-136: Document the new parameter container_deployment.

The container_deployment parameter is added but not documented. Add a docstring to explain its purpose.
    """
        It would be sufficient to just pack the code and ship it as zip to AWS.
        However, to have a compatible function implementation across providers,
        we create a small module.
        Issue: relative imports in Python when using storage wrapper.
        Azure expects a relative import inside a module.

        Structure:
        function
        - function.py
        - storage.py
        - resources
        handler.py

        dir: directory where code is located
        benchmark: benchmark name
        container_deployment: flag indicating if container deployment is used
    """
sebs/faas/system.py (7)

170-171: LGTM!

The addition of the container_deployment parameter in the abstract method signature is consistent with the changes required for container deployment support.

175-183: LGTM!

The abstract method build_base_image with the new signature is consistent with the changes required for container deployment support.

186-187: LGTM!

The abstract method find_image with the new signature is consistent with the changes required for container deployment support.

190-191: LGTM!

The abstract method repository_authorization with the new signature is consistent with the changes required for container deployment support.

194-195: LGTM!

The abstract method push_image_to_repository with the new signature is consistent with the changes required for container deployment support.

198-204: LGTM!

The addition of the container_deployment and container_uri parameters in the abstract method create_function is consistent with the changes required for container deployment support.

212-218: LGTM!

The addition of the container_deployment and container_uri parameters in the abstract method update_function is consistent with the changes required for container deployment support.

sebs/cache.py (2)

165-171: LGTM!

The addition of the container_uri parameter in the add_code_package function is consistent with the changes required for container deployment support.

241-245: LGTM!

The addition of the container_uri parameter in the update_code_package function is consistent with the changes required for container deployment support.

sebs/aws/config.py (4)

229-238: LGTM!

The function check_ecr_repository_exists correctly checks if an ECR repository exists and handles exceptions appropriately.

239-255: LGTM!

The function create_ecr_repository correctly creates an ECR repository if it does not exist and handles exceptions appropriately.

Tools

Ruff

243-243: Use f-string instead of format call

Convert to f-string

(UP032)

256-265: LGTM!

The initialize function correctly initializes the Docker registry details.

287-298: LGTM!

The update_cache function correctly updates the cache with Docker registry details.

sebs/regression.py (1)

346-364: LGTM!

The regression_suite function correctly includes assertions for deployment keys.

docs/modularity.md (1)

343-343: Update Documentation for New Parameters in update_function.

The update_function method now accepts container_deployment and container_uri parameters, which should be documented to explain their purpose and usage.
sebs/openwhisk/openwhisk.py (5)
114-122: LGTM! Proper Error Handling in push_image_to_repository.

The method correctly handles errors by logging and raising a RuntimeError if an error is found in the response.

Line range hint 215-239: LGTM! Convert Format Call to f-String.

The method correctly handles packaging the code for deployment. Convert the format call to an f-string for consistency.
-        self.logging.info("Zip archive size {:2f} MB".format(bytes_size / 1024.0 / 1024.0))
+        self.logging.info(f"Zip archive size {bytes_size / 1024.0 / 1024.0:.2f} MB")
Line range hint 337-365: LGTM! Proper Function Update Handling.

The method correctly updates the function and handles errors.

Line range hint 518-518: LGTM! Proper Cold Start Enforcement.

The method correctly updates the function configuration to enforce a cold start.

Line range hint 97-112: Ensure Proper Handling of Docker Manifest Inspection.

The find_image method attempts to inspect a Docker manifest. Ensure that the experimentalManifest configuration is correctly set and that the execute function handles errors gracefully.

Verification successful

Ensure Proper Handling of Docker Manifest Inspection.

The execute function in sebs/utils.py raises a RuntimeError if the command fails, which ensures that errors are handled gracefully. The experimentalManifest configuration is correctly set and used in the codebase.

sebs/utils.py: The execute function raises a RuntimeError on failure, ensuring proper error handling.

sebs/openwhisk/openwhisk.py: Usage of experimentalManifest configuration is appropriate.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the experimentalManifest configuration and error handling in the execute function.

# Test: Search for the experimentalManifest configuration usage. Expect: Proper error handling and configuration.
rg 'experimentalManifest'
Length of output: 596

Script:
#!/bin/bash
# Description: Verify the error handling in the execute function.

# Search for the definition of the execute function
rg 'def execute'
Length of output: 292

Script:
#!/bin/bash
# Description: Inspect the implementation of the execute function in sebs/utils.py for error handling.

# Display the implementation of the execute function in sebs/utils.py
rg --context 10 'def execute' sebs/utils.py
Length of output: 719
sebs.py (4)
100-104: LGTM! Proper Addition of Container Deployment Option.

The new option for container deployment is correctly added and documented.

Line range hint 125-151: LGTM! Convert Format Call to f-String.

The function correctly updates the configuration and handles the new option. Convert the format call to an f-string for consistency.
-    sebs_client.logging.info("Created experiment output at {}".format(output_dir))
+    sebs_client.logging.info(f"Created experiment output at {output_dir}")
Line range hint 231-243: LGTM! Proper Invocation Logic Update.

The function correctly updates the invocation logic and handles the new container deployment option.

347-347: LGTM! Proper Regression Logic Update.

The function correctly updates the regression logic and handles the new container deployment option.
sebs/azure/azure.py (2)
Line range hint 163-218: LGTM! Convert Format Call to f-String.

The method correctly handles packaging the code for deployment and raises an error for unsupported container deployment. Convert the format call to an f-string for consistency.
-        execute("zip -qu -r9 {}.zip * .".format(benchmark), shell=True, cwd=directory)
+        execute(f"zip -qu -r9 {benchmark}.zip * .", shell=True, cwd=directory)
292-298: LGTM! Proper Function Update Handling.

The method correctly updates the function and handles errors.
sebs/benchmark.py (5)
127-131: LGTM!

The container_uri property is correctly implemented.

144-147: LGTM!

The container_deployment property is correctly implemented.

177-180: LGTM! Consider using an f-string.

The _container_deployment attribute is initialized correctly. You can optimize the string formatting.
- raise RuntimeError("Benchmark {benchmark} not found!".format(benchmark=self._benchmark))
+ raise RuntimeError(f"Benchmark {self._benchmark} not found!")
Tools

Ruff

180-180: Use f-string instead of format call

Convert to f-string

(UP032)

483-495: LGTM!

The build method handles the container deployment flag and URI correctly.

Also applies to: 516-547

Tools

Ruff

490-490: Use f-string instead of format call

Convert to f-string

(UP032)

538-544: LGTM!

The cache update logic correctly includes the container URI.
sebs/aws/aws.py (4)
202-208: LGTM!

The code changes are approved.

494-496: Use f-strings for improved readability.

Convert format calls to f-strings for better readability.
-        self.client.update_function_configuration(
-            FunctionName=name,
-            Timeout=function.config.timeout,
-            MemorySize=function.config.memory,
-        )
+        self.client.update_function_configuration(FunctionName=name, Timeout=function.config.timeout, MemorySize=function.config.memory)
Likely invalid or redundant comment.

185-185: Use context manager for file operations.

Use a context manager to open files to ensure they are properly closed after use.
-                    package_body = open(package, "rb").read()
+                    with open(package, "rb") as file:
+                        package_body = file.read()
Likely invalid or redundant comment.

Tools

Ruff

185-185: Use f-string instead of format call

Convert to f-string

(UP032)

67-79: Avoid using mutable default arguments.

Replace mutable default arguments with None and initialize them within the function.
-        if self.config.resources.docker_username:
-            if self.config.resources.docker_registry:
-                docker_client.login(
-                    username=self.config.resources.docker_username,
-                    password=self.config.resources.docker_password,
-                    registry=self.config.resources.docker_registry,
-                )
-            else:
-                docker_client.login(
-                    username=self.config.resources.docker_username,
-                    password=self.config.resources.docker_password,
-                )
+        docker_username = self.config.resources.docker_username
+        docker_password = self.config.resources.docker_password
+        docker_registry = self.config.resources.docker_registry
+        if docker_username and docker_password:
+            if docker_registry:
+                docker_client.login(
+                    username=docker_username,
+                    password=docker_password,
+                    registry=docker_registry,
+                )
+            else:
+                docker_client.login(
+                    username=docker_username,
+                    password=docker_password,
+                )
Likely invalid or redundant comment.

coderabbitai · 2024-07-02T16:31:10Z

sebs/experiments/config.py

        cfg._download_results = config["download_results"]
+        cfg._container_deployment = config["container_deployment"]
        cfg._runtime = Runtime.deserialize(config["runtime"])
        cfg._flags = config["flags"] if "flags" in config else {}


Optimize dictionary access.

Use config.get("flags", {}) instead of an if block.

- cfg._flags = config["flags"] if "flags" in config else {} + cfg._flags = config.get("flags", {})

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

cfg._flags = config["flags"] if "flags" in config else {}

cfg._flags = config.get("flags", {})

Tools

Ruff

64-64: Use config.get("flags", {}) instead of an if block

Replace with config.get("flags", {})

(SIM401)

sebs/sebs.py

sebs/local/local.py

sebs/aws/aws.py

coderabbitai · 2024-07-02T16:31:11Z

sebs/local/local.py


        bytes_size = os.path.getsize(directory)
        mbytes = bytes_size / 1024.0 / 1024.0
        self.logging.info("Function size {:2f} MB".format(mbytes))


Use f-string instead of format call.

Convert to f-string for better readability and performance.

- self.logging.info("Function size {:2f} MB".format(mbytes)) + self.logging.info(f"Function size {mbytes:.2f} MB")

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

self.logging.info("Function size {:2f} MB".format(mbytes))

self.logging.info(f"Function size {mbytes:.2f} MB")

Tools

Ruff

154-154: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/aws/aws.py

coderabbitai · 2024-07-02T16:31:11Z

sebs/aws/aws.py

+            code_package.benchmark,
+            code_package.language_name,
+            code_package.language_version,


Use f-strings for improved readability.

Convert format calls to f-strings for better readability.

- func_name = "{}-{}-{}".format( - code_package.benchmark, - code_package.language_name, - code_package.language_version, - ) + func_name = f"{code_package.benchmark}-{code_package.language_name}-{code_package.language_version}"

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

code_package.benchmark,

code_package.language_name,

code_package.language_version,

func_name = f"{code_package.benchmark}-{code_package.language_name}-{code_package.language_version}"

coderabbitai · 2024-07-02T16:31:11Z

sebs/experiments/invocation_overhead.py

        self._benchmark = benchmark

    def before_sample(self, size: int, input_benchmark: dict):
        arr = bytearray((random.getrandbits(8) for i in range(size)))


Avoid extraneous parentheses.

Remove the extraneous parentheses for better readability.

- arr = bytearray((random.getrandbits(8) for i in range(size))) + arr = bytearray(random.getrandbits(8) for i in range(size))

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

arr = bytearray((random.getrandbits(8) for i in range(size)))

arr = bytearray(random.getrandbits(8) for i in range(size))

Tools

Ruff

41-41: Avoid extraneous parentheses

Remove extraneous parentheses

(UP034)

sebs/aws/aws.py

mcopik · 2024-11-06T15:07:45Z

Remaining issues:

Rebase with master.
Add abstractions in the faas/system class.
Add documentation on deployment, configuration, and container support.
Extend CLI to list & delete all images.
Resolve coderabbit issues.
Verify caching works correctly (no cache/replace existing image/create image).
Extend regression and rerun x86.
Extend regression and rerun arm64.

…iders that support container deployment]

…ate function with container URI

… aws.

…uthorization, and rase NotImplementedError for GCP and Azure

…ndomly

…tectures

coderabbitai

Actionable comments posted: 11

🧹 Outside diff range and nitpick comments (8)

sebs/faas/system.py (1)

168-172: LGTM! Consider adding docstring updates.

The new parameters architecture and container_deployment align well with the container support objectives. Consider updating the method's docstring to document:

The new parameters and their purpose

The updated return type that includes container deployment info
sebs/benchmark.py (1)
512-516: Improve type hints and add docstring for the build method

The method would benefit from:

Type aliases for better readability

A docstring explaining the return tuple values

Consider applying this improvement:
+# At the top of the file with other imports
+from typing import TypeAlias

+# Type aliases for better readability
+BuildStepCallable: TypeAlias = Callable[[str, str, str, str, str, bool, bool], Tuple[str, int, str]]
+BuildResult: TypeAlias = Tuple[bool, str, bool, str]

 def build(
     self,
-    deployment_build_step: Callable[[str, str, str, str, str, bool, bool], Tuple[str, int, str]],
-) -> Tuple[bool, str, bool, str]:
+    deployment_build_step: BuildStepCallable,
+) -> BuildResult:
+    """Build the benchmark package.
+
+    Args:
+        deployment_build_step: Callback for building the deployment package
+
+    Returns:
+        Tuple containing:
+        - bool: Whether a new build was performed
+        - str: Location of the code package
+        - bool: Whether this is a container deployment
+        - str: Container URI if applicable, empty string otherwise
+    """
Also applies to: 523-526
sebs/aws/aws.py (2)
139-142: Initialize container_uri with None instead of empty string.

For better type safety and explicit handling of the unset case, consider using None as the default value.
-        container_uri = ""
+        container_uri: Optional[str] = None
333-341: Consider early architecture validation.

The architecture mismatch is detected during the build process. Consider validating this earlier to fail fast and provide better user experience.
+    def _validate_architecture(self, target_arch: str) -> None:
+        """Validate if the target architecture is supported on the current system."""
+        isa = platform.processor()
+        if (isa == "x86_64" and target_arch != "x64") or (
+            isa == "arm64" and target_arch != "arm64"
+        ):
+            raise ValueError(
+                f"Unsupported target architecture: {target_arch} on CPU architecture: {isa}. "
+                "Please ensure QEMU is configured for cross-architecture builds."
+            )

     def build_base_image(
         self,
         directory: str,
         language_name: str,
         language_version: str,
         architecture: str,
         benchmark: str,
         is_cached: bool,
     ) -> Tuple[bool, str]:
+        self._validate_architecture(architecture)
sebs/cache.py (4)
Line range hint 127-131: Ensure consistency in get_code_package method parameters

The get_code_package method signature has been updated to include language_version and architecture, but the docstring and parameter descriptions have not been updated accordingly. Please update the docstring to reflect the new parameters for clarity and maintainability.

Apply this diff to update the docstring:
     """
         Access cached version of benchmark code.

         :param deployment: allowed deployment clouds or local
         :param benchmark:
-        :param  language:
+        :param language: programming language of the benchmark
+        :param language_version: version of the programming language
+        :param architecture: system architecture (e.g., x86_64)
264-272: Ensure consistent updates to cached configuration

In the block updating cached_config within add_code_package, consider combining the duplicate code for updating containers and code_package to reduce redundancy and improve maintainability.

Refactor the code to unify the update logic:
# Determine the main key based on the deployment type
main_key = "containers" if code_package.container_deployment else "code_package"
# Update the cached configuration
cached_config[deployment_name][language][main_key][key] = language_config
182-186: Update method docstring for add_code_package

The add_code_package method has been updated, but its docstring is missing or outdated. Adding a docstring with parameter descriptions will enhance code readability and maintenance.

Add the following docstring:
def add_code_package(
    self,
    deployment_name: str,
    code_package: "Benchmark",
):
    """
    Add a new code package or container image to the cache.

    :param deployment_name: Name of the deployment platform (e.g., 'aws')
    :param code_package: Instance of Benchmark containing code package information
    """
    # ... rest of the method ...
293-295: Update method docstring for update_code_package

Similarly, the update_code_package method lacks a docstring. Adding a docstring will clarify the purpose and usage of the method.

Add the following docstring:
def update_code_package(
    self,
    deployment_name: str,
    code_package: "Benchmark",
):
    """
    Update an existing code package or container image in the cache.

    :param deployment_name: Name of the deployment platform (e.g., 'aws')
    :param code_package: Instance of Benchmark containing updated code package information
    """
    # ... rest of the method ...

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 5b3717a and e4a5b26.

📒 Files selected for processing (8)

config/systems.json (2 hunks)
dockerfiles/aws/python/Dockerfile.function (1 hunks)
sebs/aws/aws.py (8 hunks)
sebs/benchmark.py (9 hunks)
sebs/cache.py (9 hunks)
sebs/config.py (1 hunks)
sebs/faas/system.py (5 hunks)
sebs/sebs.py (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

dockerfiles/aws/python/Dockerfile.function
sebs/sebs.py

🧰 Additional context used

🪛 Ruff

sebs/aws/aws.py

263-263: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

423-423: Use a context manager for opening files

(SIM115)

🔇 Additional comments (13)

sebs/config.py (3)

50-55: LGTM! Verify base image configurations.

The method correctly incorporates architecture-specific base images. The implementation maintains consistency with other architecture-related changes.

Let's verify the base image configurations:

#!/bin/bash
# Description: Check base image configurations
# Test: Extract and validate base image structure for different architectures
jq -r '.[] | select(.languages != null) | .languages[] | select(.base_images != null) | .base_images | keys[]' config/systems.json

47-48: Add docstring and verify config structure.

While the implementation is straightforward, please add a docstring to document the purpose and expected return values. Also, let's verify the config structure matches the return type hint.

#!/bin/bash
# Description: Verify container_deployment config structure
# Test: Check if the value is a list of strings
jq -r '.general.container_deployment | if type=="array" then "Valid array" else "Invalid type: " + type end' config/systems.json

63-69: LGTM! Image naming follows Docker conventions.

The implementation correctly:

Incorporates architecture in image tags
Follows Docker naming conventions
Fixed the previous issue with spaces in tags

Let's verify the image tag format:

Also applies to: 77-84

✅ Verification successful

Image tag format follows Docker conventions and is correctly implemented

The implementation in benchmark_image_tag correctly:

Uses only allowed characters (alphanumeric, dots, hyphens)
Follows a consistent format: function.<system>.<benchmark>.<language>-<version>-<architecture>[-prefix]
Properly incorporates the architecture parameter
Has no spaces or invalid characters in the tag format

The code meets all Docker naming conventions and the previous review's concerns have been addressed.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify image tag format
# Test: Check for any invalid characters or formats in image tags
rg -g '*.py' 'benchmark_image_tag.*return.*tag' -A 1 | grep -v 'benchmark_image_tag' | grep -v '\-\-' | grep -v '^$' | while read -r line; do
    if echo "$line" | grep -q '[^a-zA-Z0-9_.-]'; then
        echo "Invalid characters found in tag format: $line"
    fi
done

Length of output: 131

Script:

#!/bin/bash
# Let's first see the actual implementation of benchmark_image_tag
rg -g '*.py' -A 10 'def benchmark_image_tag'

Length of output: 581

config/systems.json (4)

84-85: Verify setup.py integration with container builds.

The addition of setup.py to deployment files suggests Python package installation during container builds.

#!/bin/bash
# Check if setup.py exists and contains necessary configuration
fd -t f "setup.py" -X grep -l "install_requires\|packages"

# Verify Dockerfile references setup.py
fd -t f "Dockerfile" -X grep -l "setup.py"

Line range hint 1-276: Verify regression test coverage for all configurations.

The PR comments mention regression testing was performed, but please ensure:

All architecture combinations (x64/arm64) are tested
Container deployment works with both Python and Node.js
Cache mechanism works as expected with container_uri

#!/bin/bash
# Check for test files covering container deployment
fd -t f "test_.*\.py" -X grep -l "container_deployment\|container_uri"

# Look for cache configuration handling
rg "container_uri.*cache"

92-97: ⚠️ Potential issue

Review Node.js version support and version tagging.

Limited Version Support:
- Only Node.js 16 is supported
- Consider adding support for Node.js 18 and 20 as they are widely used
Version Tag Inconsistency:
- x64: Generic tag (amazon/aws-lambda-nodejs:16)
- arm64: Specific version tag (amazon/aws-lambda-nodejs:16.2024.09.06.13)

#!/bin/bash
# Check for Node.js version compatibility in codebase
rg -l "engine.*node" package.json

# Verify Node.js function compatibility
fd -t f "handler.js" -X grep -l "async.*exports.handler"

3-4: Consider making docker_repository configurable and document container deployment.

The hardcoded docker repository might limit flexibility. Consider:

Making it configurable through environment variables
Adding documentation about:
- Required repository permissions
- How to set up custom repositories
- AWS-specific container deployment limitations

sebs/benchmark.py (3)

182-182: LGTM!

The initialization of container-related fields is properly typed and follows the class's initialization pattern.

Also applies to: 205-205

246-265: LGTM!

The cache query logic properly handles both container and non-container deployments, with safe dictionary access for the container URI.

182-182: Consider early validation of container deployment support

From the PR objectives, we know that container deployment is only supported for AWS. Consider adding an early validation in the constructor to fail fast if container deployment is requested for unsupported platforms.

Let's verify the current platform support:

Also applies to: 246-265

sebs/aws/aws.py (2)

185-196: 🛠️ Refactor suggestion

Improve error handling in find_image method.

The method silently catches ClientError. Consider logging the error for better debugging.

-        except ClientError:
+        except ClientError as e:
+            self.logging.debug(f"Image not found: {e}")
             return False

Likely invalid or redundant comment.

423-424: ⚠️ Potential issue

Use context manager for file operations.

Use a context manager to ensure proper file handling and resource cleanup.

-                    package_body = open(package, "rb").read()
-                    create_function_params["Code"] = {"ZipFile": package_body}
+                    with open(package, "rb") as f:
+                        create_function_params["Code"] = {"ZipFile": f.read()}

Likely invalid or redundant comment.

🧰 Tools

🪛 Ruff

423-423: Use a context manager for opening files

(SIM115)

sebs/cache.py (1)

303-309: Handle missing cache directory in update_code_package

In the update_code_package method, if the cached_dir does not exist, the method calls self.add_code_package. Ensure that this behavior is intentional and does not lead to any unintended side effects. Consider logging a message to indicate that the code package was not found and is being added.

Please confirm that this fallback to add_code_package is the desired behavior. If so, consider adding a log message.

config/systems.json

coderabbitai · 2024-11-06T21:38:34Z

sebs/faas/system.py

+    def build_base_image(
+        self,
+        directory: str,
+        language_name: str,
+        language_version: str,
+        architecture: str,
+        benchmark: str,
+        is_cached: bool,
+    ) -> Tuple[bool, str]:
+        pass
+
+    @abstractmethod
+    def find_image(self, repository_client, repository_name, image_tag) -> bool:
+        pass
+
+    @abstractmethod
+    def push_image_to_repository(self, repository_client, repository_uri, image_tag):
+        pass
+


🛠️ Refactor suggestion

Add docstrings and error handling for container-related methods.

The new abstract methods for container support are well-structured, but they need:

Docstrings describing:

Purpose and responsibilities

Parameters and return values

Expected behavior and error conditions

Error handling guidance for implementations:

Network issues during image push

Repository authentication failures

Image build failures

coderabbitai · 2024-11-06T21:38:34Z

sebs/benchmark.py

+    @property
+    def container_uri(self) -> str:
+        assert self._container_uri is not None
+        return self._container_uri
+


⚠️ Potential issue

Replace assertion with proper None check in container_uri property

The assertion could be disabled in production Python environments (using the -O flag). Consider using a proper None check with a descriptive error message.

Apply this diff:

@property def container_uri(self) -> str: - assert self._container_uri is not None + if self._container_uri is None: + raise ValueError("Container URI is not set") return self._container_uri

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@property

def container_uri(self) -> str:

assert self._container_uri is not None

return self._container_uri

@property

def container_uri(self) -> str:

if self._container_uri is None:

raise ValueError("Container URI is not set")

return self._container_uri

coderabbitai · 2024-11-06T21:38:34Z

sebs/aws/aws.py

+        try:
+            self.docker_client.login(username=username, password=password, registry=registry_url)
+            push_image(repository_uri, image_tag)
+            self.logging.info(f"Successfully pushed the image to registry {repository_uri}.")
+        except docker.errors.APIError as e:
+            self.logging.error(f"Failed to push the image to registry {repository_uri}.")
+            self.logging.error(f"Error: {str(e)}")
+            raise RuntimeError("Couldn't push to Docker registry")


🛠️ Refactor suggestion

Improve exception chaining in push_image_to_repository.

Use proper exception chaining to preserve the original error context.

- raise RuntimeError("Couldn't push to Docker registry") + raise RuntimeError("Couldn't push to Docker registry") from e

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

try:

self.docker_client.login(username=username, password=password, registry=registry_url)

push_image(repository_uri, image_tag)

self.logging.info(f"Successfully pushed the image to registry {repository_uri}.")

except docker.errors.APIError as e:

self.logging.error(f"Failed to push the image to registry {repository_uri}.")

self.logging.error(f"Error: {str(e)}")

raise RuntimeError("Couldn't push to Docker registry")

try:

self.docker_client.login(username=username, password=password, registry=registry_url)

push_image(repository_uri, image_tag)

self.logging.info(f"Successfully pushed the image to registry {repository_uri}.")

except docker.errors.APIError as e:

self.logging.error(f"Failed to push the image to registry {repository_uri}.")

self.logging.error(f"Error: {str(e)}")

raise RuntimeError("Couldn't push to Docker registry") from e

🧰 Tools

🪛 Ruff

263-263: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

coderabbitai · 2024-11-06T21:38:34Z

sebs/cache.py

+    def __init__(self, cache_dir: str, docker_client: docker.DockerClient):
        super().__init__()
+        self.docker_client = docker_client


🛠️ Refactor suggestion

Consider making 'docker_client' an optional parameter

Adding the required docker_client parameter to the __init__ method may break existing code that instantiates the Cache class without providing this parameter. To maintain backward compatibility, consider making docker_client an optional parameter with a default value of None. This way, existing code will continue to function, and you can handle docker_client being None where necessary.

Apply this diff to make docker_client optional:

+from typing import Optional -def __init__(self, cache_dir: str, docker_client: docker.DockerClient): +def __init__(self, cache_dir: str, docker_client: Optional[docker.DockerClient] = None): super().__init__() self.docker_client = docker_client

Ensure that you handle the case where self.docker_client might be None in methods that use it.

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-06T21:38:35Z

sebs/cache.py

+    def get_container(
+        self, deployment: str, benchmark: str, language: str,
+        language_version: str, architecture: str
+    ) -> Optional[Dict[str, Any]]:
+        cfg = self.get_benchmark_config(deployment, benchmark)
+
+        key = f"{language_version}-{architecture}"
+        if cfg and language in cfg and key in cfg[language]["containers"]:
+            return cfg[language]["containers"][key]
+        else:
+            return None
+


⚠️ Potential issue

Handle missing 'containers' key in get_container method

In the get_container method, if cfg[language] does not contain the key "containers", accessing it will raise a KeyError. To prevent this, consider using the dict.get() method with a default empty dictionary.

Apply this diff to safely access the "containers" key:

key = f"{language_version}-{architecture}" - if cfg and language in cfg and key in cfg[language]["containers"]: + if cfg and language in cfg and key in cfg[language].get("containers", {}): return cfg[language]["containers"][key] else: return None

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-06T21:38:35Z

sebs/cache.py

+                    image = self.docker_client.images.get(code_package.container_uri)
+                    language_config["image-uri"] = code_package.container_uri
+                    language_config["image-id"] = image.id
+


🛠️ Refactor suggestion

Add exception handling when retrieving Docker images

When retrieving the Docker image with self.docker_client.images.get(code_package.container_uri), an exception may be raised if the image is not found or if there is a connection issue with the Docker daemon. To make the code more robust, add exception handling to catch potential exceptions and handle them gracefully.

Apply this diff to add exception handling:

if code_package.container_deployment: - image = self.docker_client.images.get(code_package.container_uri) + try: + image = self.docker_client.images.get(code_package.container_uri) + except docker.errors.ImageNotFound: + raise RuntimeError(f"Image '{code_package.container_uri}' not found.") + except docker.errors.APIError as e: + raise RuntimeError(f"Docker API error: {e}") language_config["image-uri"] = code_package.container_uri language_config["image-id"] = image.id

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

image = self.docker_client.images.get(code_package.container_uri)

language_config["image-uri"] = code_package.container_uri

language_config["image-id"] = image.id

if code_package.container_deployment:

try:

image = self.docker_client.images.get(code_package.container_uri)

except docker.errors.ImageNotFound:

raise RuntimeError(f"Image '{code_package.container_uri}' not found.")

except docker.errors.APIError as e:

raise RuntimeError(f"Docker API error: {e}")

language_config["image-uri"] = code_package.container_uri

language_config["image-id"] = image.id

coderabbitai · 2024-11-06T21:38:35Z

sebs/cache.py

+
                    key = f"{language_version}-{architecture}"
-                    config[deployment_name][language]["code_package"][key]["date"][
+                    if code_package.container_deployment:
+                        main_key = "containers"
+                    else:
+                        main_key = "code_package"
+
+                    config[deployment_name][language][main_key][key]["date"][
                        "modified"
                    ] = date
-                    config[deployment_name][language]["code_package"][key][
+                    config[deployment_name][language][main_key][key][
                        "hash"
                    ] = code_package.hash
-                    config[deployment_name][language]["code_package"][key][
+                    config[deployment_name][language][main_key][key][
                        "size"
                    ] = code_package.code_size
+
+                    if code_package.container_deployment:
+
+                        image = self.docker_client.images.get(code_package.container_uri)
+                        config[deployment_name][language][main_key][key][
+                            "image-id"
+                        ] = image.id
+                        config[deployment_name][language][main_key][key][
+                            "image-uri"
+                        ] = code_package.container_uri
+


🛠️ Refactor suggestion

Add exception handling when updating Docker image information

Similar to the addition in add_code_package, when updating the Docker image information in update_code_package, add exception handling around the self.docker_client.images.get call to handle potential errors gracefully.

Apply this diff to add exception handling:

if code_package.container_deployment: main_key = "containers" else: main_key = "code_package" config[deployment_name][language][main_key][key]["date"]["modified"] = date config[deployment_name][language][main_key][key]["hash"] = code_package.hash config[deployment_name][language][main_key][key]["size"] = code_package.code_size if code_package.container_deployment: - image = self.docker_client.images.get(code_package.container_uri) + try: + image = self.docker_client.images.get(code_package.container_uri) + except docker.errors.ImageNotFound: + raise RuntimeError(f"Image '{code_package.container_uri}' not found.") + except docker.errors.APIError as e: + raise RuntimeError(f"Docker API error: {e}") config[deployment_name][language][main_key][key]["image-id"] = image.id config[deployment_name][language][main_key][key]["image-uri"] = code_package.container_uri

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

key = f"{language_version}-{architecture}"

config[deployment_name][language]["code_package"][key]["date"][

if code_package.container_deployment:

main_key = "containers"

else:

main_key = "code_package"

config[deployment_name][language][main_key][key]["date"][

"modified"

] = date

config[deployment_name][language]["code_package"][key][

config[deployment_name][language][main_key][key][

"hash"

] = code_package.hash

config[deployment_name][language]["code_package"][key][

config[deployment_name][language][main_key][key][

"size"

] = code_package.code_size

if code_package.container_deployment:

image = self.docker_client.images.get(code_package.container_uri)

config[deployment_name][language][main_key][key][

"image-id"

] = image.id

config[deployment_name][language][main_key][key][

"image-uri"

] = code_package.container_uri

key = f"{language_version}-{architecture}"

if code_package.container_deployment:

main_key = "containers"

else:

main_key = "code_package"

config[deployment_name][language][main_key][key]["date"][

"modified"

] = date

config[deployment_name][language][main_key][key][

"hash"

] = code_package.hash

config[deployment_name][language][main_key][key][

"size"

] = code_package.code_size

if code_package.container_deployment:

try:

image = self.docker_client.images.get(code_package.container_uri)

except docker.errors.ImageNotFound:

raise RuntimeError(f"Image '{code_package.container_uri}' not found.")

except docker.errors.APIError as e:

raise RuntimeError(f"Docker API error: {e}")

config[deployment_name][language][main_key][key][

"image-id"

] = image.id

config[deployment_name][language][main_key][key][

"image-uri"

] = code_package.container_uri

coderabbitai

Actionable comments posted: 11

🧹 Outside diff range and nitpick comments (13)

sebs/aws/config.py (1)

257-282: Add retry mechanism for ECR API calls

The repository creation could fail due to transient AWS API issues. Consider implementing a retry mechanism similar to the one used in the HTTP API creation.

 def get_ecr_repository(self, ecr_client: ECRClient) -> str:
     if self._container_repository is not None:
         return self._container_repository

     self._container_repository = "sebs-benchmarks-{}".format(self._resources_id)
     self._docker_registry = self.check_ecr_repository_exists(
         ecr_client, self._container_repository
     )

     if self._docker_registry is None:
+        retries = 0
+        while retries < 5:
             try:
                 resp = ecr_client.create_repository(repositoryName=self._container_repository)
                 self.logging.info(f"Created ECR repository: {self._container_repository}")
                 self._docker_registry = resp["repository"]["repositoryUri"]
+                break
             except ecr_client.exceptions.RepositoryAlreadyExistsException:
                 self.logging.info(f"ECR repository {self._container_repository} already exists.")
                 self._docker_registry = self.check_ecr_repository_exists(
                     ecr_client, self._container_repository
                 )
+            except ecr_client.exceptions.ServerException:
+                retries += 1
+                if retries == 5:
+                    raise
+                time.sleep(retries)

     return self._container_repository

sebs.py (3)

450-459: Enhance function documentation.

The docstring should include details about the new measure_interval parameter and its impact on memory measurements.

 def start(
     benchmark,
     benchmark_input_size,
     output,
     deployments,
     storage_configuration,
     measure_interval,
     remove_containers,
     **kwargs,
 ):
     """
     Start a given number of function instances and a storage instance.
+
+    Args:
+        measure_interval: Duration between memory measurements in milliseconds.
+                         Use -1 to disable measurements.
+        remove_containers: If True, containers are removed after stopping.
     """

596-596: Fix typo in log message.

There's a typo in the log message.

-        sebs_client.logging.info("Resource grup:")
+        sebs_client.logging.info("Resource groups:")

643-654: Improve resource group management implementation.

Fix the unused loop variable as suggested by static analysis.
Add more descriptive error messages for resource group operations.

-        groups = deployment_client.config.resources.list_resource_groups(
-            deployment_client.cli_instance
-        )
-        for idx, group in enumerate(groups):
+        groups = deployment_client.config.resources.list_resource_groups(
+            deployment_client.cli_instance
+        )
+        for _idx, group in enumerate(groups):
             if len(prefix) > 0 and not group.startswith(prefix):
                 continue

-            sebs_client.logging.info(f"Removing resource group: {group}")
+            sebs_client.logging.info(f"Attempting to remove resource group: {group}")
             deployment_client.config.resources.delete_resource_group(
                 deployment_client.cli_instance, group, wait
             )
+            sebs_client.logging.info(f"Successfully removed resource group: {group}")

🧰 Tools

🪛 Ruff

646-646: Loop control variable idx not used within loop body

Rename unused idx to _idx

(B007)

sebs/azure/azure.py (1)

162-171: LGTM! Consider adding a docstring.

The changes correctly handle container deployment parameters and properly raise NotImplementedError for unsupported functionality. The return type has been updated to include container_uri.

Consider adding a docstring to document the new parameters and return value:

def package_code(
    self,
    directory: str,
    language_name: str,
    language_version: str,
    architecture: str,
    benchmark: str,
    is_cached: bool,
    container_deployment: bool,
) -> Tuple[str, int, str]:
    """Package the code for deployment.

    Args:
        directory: Source directory containing the code
        language_name: Programming language name
        language_version: Language version
        architecture: Target architecture
        benchmark: Benchmark name
        is_cached: Whether to use cached package
        container_deployment: Whether to use container deployment

    Returns:
        Tuple containing:
        - Directory path
        - Code size in bytes
        - Container URI (empty for non-container deployments)

    Raises:
        NotImplementedError: If container deployment is True as it's not supported
    """

sebs/aws/aws.py (3)

138-151: Add type hints for improved code maintainability.

Consider adding type hints for the new parameters and return type:

    def package_code(
        self,
        directory: str,
        language_name: str,
        language_version: str,
-       architecture: str,
-       benchmark: str,
-       is_cached: bool,
-       container_deployment: bool,
+       architecture: str,
+       benchmark: str,
+       is_cached: bool,
+       container_deployment: bool,
-    ) -> Tuple[str, int, str]:
+    ) -> Tuple[str, int, str]:  # (package_path, size_in_bytes, container_uri)

238-278: Consider extracting zip deployment logic to a separate method.

The create_function method handles both container and zip deployments. Consider extracting the zip deployment logic to improve readability and maintainability:

+    def _create_zip_function(
+        self,
+        create_function_params: Dict[str, Any],
+        package: str,
+        code_size: int,
+        benchmark: str,
+        architecture: str,
+        language: str,
+        language_runtime: str,
+    ) -> None:
+        if code_size < 50 * 1024 * 1024:
+            with open(package, "rb") as f:
+                create_function_params["Code"] = {"ZipFile": f.read()}
+        else:
+            storage_client = self.get_storage()
+            code_bucket = storage_client.get_bucket(
+                Resources.StorageBucketType.DEPLOYMENT
+            )
+            code_prefix = os.path.join(
+                benchmark, architecture, os.path.basename(package)
+            )
+            storage_client.upload(code_bucket, package, code_prefix)
+            create_function_params["Code"] = {
+                "S3Bucket": code_bucket,
+                "S3Key": code_prefix,
+            }
+        create_function_params["Runtime"] = f"{language}{self._map_language_runtime(language, language_runtime)}"
+        create_function_params["Handler"] = "handler.handler"

    def create_function(
        self,
        code_package: Benchmark,
        func_name: str,
        container_deployment: bool,
        container_uri: str,
    ) -> "LambdaFunction":
        # ... existing code ...
        if container_deployment:
            create_function_params["PackageType"] = "Image"
            create_function_params["Code"] = {"ImageUri": container_uri}
        else:
            create_function_params["PackageType"] = "Zip"
            self._create_zip_function(
                create_function_params,
                package,
                code_size,
                benchmark,
                architecture,
                language,
                language_runtime,
            )

🧰 Tools

🪛 Ruff

253-253: Use a context manager for opening files

(SIM115)

323-367: Consider extracting zip deployment update logic to a separate method.

Similar to the create_function refactoring suggestion, consider extracting the zip deployment update logic to improve readability and maintainability.

+    def _update_zip_function(
+        self,
+        name: str,
+        package: str,
+        code_size: int,
+        benchmark: str,
+        architecture: str,
+        function: LambdaFunction,
+    ) -> None:
+        if code_size < 50 * 1024 * 1024:
+            with open(package, "rb") as code_body:
+                self.client.update_function_code(
+                    FunctionName=name,
+                    ZipFile=code_body.read(),
+                    Architectures=[self._map_architecture(architecture)],
+                )
+        else:
+            storage = cast(S3, self.get_storage())
+            bucket = function.code_bucket(benchmark, storage)
+            code_prefix = os.path.join(
+                benchmark,
+                architecture,
+                os.path.basename(package)
+            )
+            storage.upload(bucket, package, code_prefix)
+            self.client.update_function_code(
+                FunctionName=name,
+                S3Bucket=bucket,
+                S3Key=code_prefix,
+                Architectures=[self._map_architecture(architecture)],
+            )

    def update_function(
        self,
        function: Function,
        code_package: Benchmark,
        container_deployment: bool,
        container_uri: str,
    ):
        name = function.name
        function = cast(LambdaFunction, function)
        
        if container_deployment:
            self.client.update_function_code(
                FunctionName=name,
                ImageUri=container_uri
            )
        else:
            self._update_zip_function(
                name,
                code_package.code_location,
                code_package.code_size,
                code_package.benchmark,
                function_cfg.architecture.value,
                function,
            )

sebs/faas/container.py (2)

40-43: Simplify code by using a ternary operator

You can make the code more concise by replacing the if-else block with a ternary operator.
-        if isinstance(txt, str):
-            line = json.loads(txt)
-        else:
-            line = txt
+        line = json.loads(txt) if isinstance(txt, str) else txt
🧰 Tools

🪛 Ruff

40-43: Use ternary operator line = json.loads(txt) if isinstance(txt, str) else txt instead of if-else-block

Replace if-else-block with line = json.loads(txt) if isinstance(txt, str) else txt

(SIM108)

66-67: Raise a more specific exception for better error handling

Currently, a generic Exception is raised when an error occurs. Consider raising a more specific exception, such as docker.errors.APIError or defining a custom exception class. This improves clarity and allows for more precise exception handling.
-            raise Exception(line["error"])
+            raise docker.errors.APIError(line["error"])

sebs/aws/container.py (2)

69-159: Consider refactoring 'build_base_image' method for improved readability

The build_base_image method is lengthy and encompasses multiple logical steps, such as checking the image cache, preparing the build context, building the Docker image, and pushing the image to the repository. Breaking it down into smaller, well-named helper methods would enhance readability and maintainability.

148-150: Add error handling for the Docker image build process

To ensure issues during the Docker build are properly managed, consider adding exception handling around the image build step. This allows for graceful error reporting and recovery.

Example modification:
+            try:
                 image, _ = self.docker_client.images.build(
                     tag=repository_uri, path=build_dir, buildargs=buildargs
                 )
+            except docker.errors.BuildError as e:
+                self.logging.error(f"Failed to build Docker image: {e}")
+                raise RuntimeError("Docker image build failed") from e

sebs/gcp/gcp.py (1)

200-200: Consider removing container_uri from the return value if unused.

Since container_deployment is not supported and container_uri is initialized but not utilized within the package_code method, consider removing it from the return statement to simplify the code.

Apply this diff to adjust the return statement:
-return os.path.join(directory, "{}.zip".format(benchmark)), bytes_size, container_uri
+return os.path.join(directory, "{}.zip".format(benchmark)), bytes_size
And update the method signature accordingly.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between e4a5b26 and ce64f7d.

📒 Files selected for processing (13)

dockerfiles/aws/python/Dockerfile.function (1 hunks)
requirements.aws.txt (1 hunks)
sebs.py (12 hunks)
sebs/aws/aws.py (8 hunks)
sebs/aws/config.py (4 hunks)
sebs/aws/container.py (1 hunks)
sebs/azure/azure.py (6 hunks)
sebs/faas/container.py (1 hunks)
sebs/faas/system.py (5 hunks)
sebs/gcp/gcp.py (4 hunks)
sebs/local/local.py (3 hunks)
sebs/openwhisk/openwhisk.py (9 hunks)
sebs/sebs.py (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (5)

dockerfiles/aws/python/Dockerfile.function
requirements.aws.txt
sebs/local/local.py
sebs/openwhisk/openwhisk.py
sebs/sebs.py

🧰 Additional context used

🪛 Ruff

sebs.py

646-646: Loop control variable idx not used within loop body

Rename unused idx to _idx

(B007)

sebs/aws/aws.py

253-253: Use a context manager for opening files

(SIM115)

sebs/aws/container.py

67-67: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

sebs/faas/container.py

40-43: Use ternary operator line = json.loads(txt) if isinstance(txt, str) else txt instead of if-else-block

Replace if-else-block with line = json.loads(txt) if isinstance(txt, str) else txt

(SIM108)

🔇 Additional comments (17)

sebs/aws/config.py (4)

1-8: LGTM: Appropriate imports for ECR support

The added imports and type hints are well-chosen for ECR integration.

136-151: LGTM: Well-structured property definitions

The property getters are properly implemented with correct type hints and follow Python best practices.

330-338: LGTM: Proper cache management

The cache updates are well-structured and maintain consistency with the existing caching mechanism.

303-307: 🛠️ Refactor suggestion

Use safe dictionary access

The current implementation assumes dictionary keys exist, which could raise KeyError.

 if "docker" in dct:
-    ret._docker_registry = dct["docker"]["registry"]
-    ret._docker_username = dct["docker"]["username"]
-    ret._container_repository = dct["container_repository"]
+    docker_config = dct["docker"]
+    ret._docker_registry = docker_config.get("registry")
+    ret._docker_username = docker_config.get("username")
+    ret._container_repository = dct.get("container_repository")

Likely invalid or redundant comment.

sebs/azure/azure.py (3)

295-304: Changes look good!

The implementation correctly handles container deployment parameters and raises NotImplementedError for unsupported functionality.

341-350: Changes look good!

The implementation correctly handles container deployment parameters and raises NotImplementedError for unsupported functionality.

530-530: LGTM!

The update_function call correctly passes False for container_deployment and empty string for container_uri, which is consistent with Azure not supporting container deployment.

sebs/aws/aws.py (3)

13-13: LGTM: Clean ECR client integration.

The ECR client integration follows the existing pattern of client initialization in the AWS class.

Also applies to: 76-78

400-401: LGTM: Clean function name update for container deployments.

The addition of the "-docker" suffix for container deployments is consistent with the existing naming pattern.

253-253: ⚠️ Potential issue

Use context manager for file operations.

Use a context manager to ensure proper file handling:

-                    package_body = open(package, "rb").read()
+                    with open(package, "rb") as f:
+                        package_body = f.read()

Likely invalid or redundant comment.

🧰 Tools

🪛 Ruff

253-253: Use a context manager for opening files

(SIM115)

sebs/faas/container.py (2)

11-19: Well-defined constructor initializes class attributes correctly

The __init__ method properly sets up the Docker client, experimental manifest flag, and system configuration for the DockerContainer class.

69-87: push_image method handles image pushing effectively

The push_image method correctly manages the image push process, including progress tracking and error handling with proper logging.

sebs/aws/container.py (1)

66-66: 🛠️ Refactor suggestion

Enhance exception handling by chaining exceptions

To preserve the original exception traceback and provide better error context, use exception chaining by adding from e when raising a new exception in an except block.

Apply this diff to improve exception handling:

-            raise RuntimeError("Couldn't push to Docker registry")
+            raise RuntimeError("Couldn't push to Docker registry") from e

Likely invalid or redundant comment.

sebs/gcp/gcp.py (4)

141-145: Method signature updated to include container deployment parameters.

The package_code method now accepts architecture and container_deployment parameters, aligning with the changes made for container support. This update ensures consistency across different cloud providers.

149-150: Proper handling of unsupported container deployment in GCP.

Raising a NotImplementedError when container_deployment is True correctly indicates that container deployment is not supported in GCP, preventing unintended usage.

202-208: Method signature updated to include container deployment parameters.

The create_function method now includes container_deployment and container_uri parameters, ensuring consistency with the overall interface and preparing for potential future support.

210-211: Correctly raising an error for unsupported container deployment.

The NotImplementedError provides clear feedback that container deployment is not yet implemented for GCP, which helps maintain correct usage.

sebs/faas/system.py

sebs/aws/config.py

sebs.py

sebs/faas/container.py

sebs/aws/container.py

sebs/gcp/gcp.py

… OpenWhisk support

…nd container

coderabbitai

Actionable comments posted: 15

🧹 Outside diff range and nitpick comments (20)

sebs/openwhisk/container.py (3)

1-9: LGTM with minor style suggestion.

The imports and class declaration are well-organized with proper type hints.

Remove the extra blank line at line 8 to maintain consistent spacing.

10-17: Add docstrings to static methods.

While the methods are clear and concise, adding docstrings would improve code documentation and maintainability.

Consider adding docstrings:

 @staticmethod
 def name() -> str:
+    """Return the identifier for OpenWhisk container.
+    
+    Returns:
+        str: The string 'openwhisk'
+    """
     return "openwhisk"

 @staticmethod
 def typename() -> str:
+    """Return the full type name for OpenWhisk container.
+    
+    Returns:
+        str: The string 'OpenWhisk.Container'
+    """
     return "OpenWhisk.Container"

18-27: Add constructor documentation and parameter validation.

The constructor implementation is correct but could benefit from additional documentation and validation.

Consider these improvements:

 def __init__(
     self,
     system_config: SeBSConfig,
     config: OpenWhiskConfig,
     docker_client: docker.client,
     experimental_manifest: bool,
 ):
+    """Initialize OpenWhisk container manager.
+    
+    Args:
+        system_config (SeBSConfig): System-wide configuration
+        config (OpenWhiskConfig): OpenWhisk-specific configuration
+        docker_client (docker.client): Docker client instance
+        experimental_manifest (bool): Flag for experimental features
+    
+    Raises:
+        ValueError: If required configurations are missing
+    """
+    if not system_config:
+        raise ValueError("system_config is required")
+    if not config:
+        raise ValueError("config is required")
+    if not docker_client:
+        raise ValueError("docker_client is required")
     super().__init__(system_config, docker_client, experimental_manifest)
     self.config = config

sebs/aws/container.py (2)

14-21: Add return type annotation to name() method.

For consistency with typename(), add a return type annotation to the name() method.

    @staticmethod
-   def name():
+   def name() -> str:
        return "aws"

38-53: Add docstring to document the registry_name method.

The method logic is correct, but adding a docstring would improve code documentation by explaining the purpose and return values.

    def registry_name(
        self, benchmark: str, language_name: str, language_version: str, architecture: str
    ) -> Tuple[str, str, str, str]:
+       """
+       Constructs AWS ECR registry details for a given benchmark and runtime.
+       
+       Args:
+           benchmark: Name of the benchmark
+           language_name: Programming language name
+           language_version: Language version
+           architecture: Target architecture
+       
+       Returns:
+           Tuple containing (registry_name, repository_name, image_tag, image_uri)
+       """

config/systems.json (1)

Line range hint 1-285: Consider adding JSON schema validation.

To ensure configuration consistency and catch potential issues early, consider:

Creating a JSON schema for this configuration file

Adding validation in the code that reads this configuration

Including schema documentation

This would help prevent configuration errors and improve maintainability.

sebs/utils.py (1)

Line range hint 103-138: Remove commented-out logging code.

The commented-out configure_logging implementation should be removed to maintain code cleanliness. If this alternative implementation is needed in the future, it can be retrieved from version control history.

sebs/faas/container.py (4)

29-41: Add parameter validation in constructor.

Consider adding validation for docker_client to ensure it's a valid Docker client instance, as its methods are used throughout the class.

 def __init__(
     self,
     system_config: SeBSConfig,
     docker_client,
     experimental_manifest: bool = False,
 ):
     super().__init__()
+    if not hasattr(docker_client, 'images'):
+        raise ValueError("Invalid docker_client provided")
 
     self.docker_client = docker_client
     self.experimental_manifest = experimental_manifest
     self.system_config = system_config
     self._disable_rich_output = False

60-90: Optimize code and improve error handling.

Two suggestions for improvement:

Use ternary operator for better readability
Add more specific error handling for JSON parsing

-        if isinstance(txt, str):
-            line = json.loads(txt)
-        else:
-            line = txt
+        try:
+            line = json.loads(txt) if isinstance(txt, str) else txt
+        except json.JSONDecodeError as e:
+            self.logging.error(f"Failed to parse progress update: {e}")
+            raise

🧰 Tools

🪛 Ruff

62-65: Use ternary operator line = json.loads(txt) if isinstance(txt, str) else txt instead of if-else-block

Replace if-else-block with line = json.loads(txt) if isinstance(txt, str) else txt

(SIM108)

91-122: Refactor to reduce code duplication.

The image push logic is duplicated between the progress and non-progress paths. Consider extracting the common functionality.

     def push_image(self, repository_uri, image_tag):
+        def handle_push_stream(stream):
+            for line in stream:
+                if "error" in line:
+                    self.logging.error(f"Failed to push the image to registry {repository_uri}")
+                    raise RuntimeError(line)
+                yield line
+
         try:
+            self.logging.info(f"Pushing image {image_tag} to {repository_uri}")
+            ret = self.docker_client.images.push(
+                repository=repository_uri, tag=image_tag, stream=True, decode=True
+            )
 
             if not self.disable_rich_output:
-
                 layer_tasks = {}
                 with Progress() as progress:
-
-                    self.logging.info(f"Pushing image {image_tag} to {repository_uri}")
-                    ret = self.docker_client.images.push(
-                        repository=repository_uri, tag=image_tag, stream=True, decode=True
-                    )
-                    for line in ret:
+                    for line in handle_push_stream(ret):
                         self.show_progress(line, progress, layer_tasks)
-
             else:
-                self.logging.info(f"Pushing image {image_tag} to {repository_uri}")
-                ret = self.docker_client.images.push(
-                    repository=repository_uri, tag=image_tag, stream=True, decode=True
-                )
-
-                for val in ret:
-                    if "error" in val:
-                        self.logging.error(f"Failed to push the image to registry {repository_uri}")
-                        raise RuntimeError(val)
+                for _ in handle_push_stream(ret):
+                    pass

1-211: Consider adding timeout and retry mechanisms.

For improved reliability, consider:

Adding configurable timeouts for Docker operations
Implementing retry mechanisms for transient failures in image push/pull operations
Adding circuit breakers for Docker daemon communication

This would make the container operations more resilient in production environments.

🧰 Tools

🪛 Ruff

62-65: Use ternary operator line = json.loads(txt) if isinstance(txt, str) else txt instead of if-else-block

Replace if-else-block with line = json.loads(txt) if isinstance(txt, str) else txt

(SIM108)

sebs/openwhisk/openwhisk.py (1)

376-377: Consider adding enable_rich_output method for symmetry.

For better API design, consider adding a corresponding method to enable rich output.
 def disable_rich_output(self):
     self.container_client.disable_rich_output = True
+
+def enable_rich_output(self):
+    self.container_client.disable_rich_output = False

sebs/faas/system.py (2)

158-171: Improve docstring accuracy and completeness.

The docstring has a few minor issues:

Architecture example uses 'x64' but the code uses 'x86_64' (line 162).

Return tuple description is missing the second element (size of package).
-            architecture: Target architecture (e.g., 'x64', 'arm64')
+            architecture: Target architecture (e.g., 'x86_64', 'arm64')

-            - Size of the package
+            - Size of the package in bytes
401-403: Add docstring for disable_rich_output method.

The purpose and usage of this method are unclear. Please add a docstring explaining:

The method's purpose

Whether it should be abstract

Expected behavior in subclasses
     def disable_rich_output(self):
+        """
+        Disable rich output formatting in the FaaS system.
+        
+        This method can be overridden by subclasses to implement
+        system-specific rich output disabling behavior.
+        """
         pass

sebs.py (3)

228-228: Add validation for storage configuration.

Consider adding validation to ensure the storage configuration file is properly formatted and contains required fields.

@click.option("--storage-configuration", default=None, type=str, help="JSON configuration of deployed storage.")
+def validate_storage_config(ctx, param, value):
+    if value is None:
+        return None
+    try:
+        with open(value, 'r') as f:
+            config = json.load(f)
+            # Add validation for required fields
+            return value
+    except (json.JSONDecodeError, IOError) as e:
+        raise click.BadParameter(f"Invalid storage configuration: {str(e)}")
+
+@click.option("--storage-configuration",
+    default=None,
+    type=str,
+    callback=validate_storage_config,
+    help="JSON configuration of deployed storage.")

439-460: Enhance documentation for measurement interval parameter.

Consider adding more detailed documentation about:

The purpose and impact of the measurement interval
Recommended values for different use cases
Performance implications of different intervals

644-655: Fix unused loop variable and enhance error handling.

Address the unused loop variable.
Consider adding error handling for the resource group deletion operation.

-        for idx, group in enumerate(groups):
+        for _idx, group in enumerate(groups):
             if len(prefix) > 0 and not group.startswith(prefix):
                 continue

             sebs_client.logging.info(f"Removing resource group: {group}")
-            deployment_client.config.resources.delete_resource_group(
-                deployment_client.cli_instance, group, wait
-            )
+            try:
+                deployment_client.config.resources.delete_resource_group(
+                    deployment_client.cli_instance, group, wait
+                )
+            except Exception as e:
+                sebs_client.logging.error(f"Failed to delete resource group {group}: {str(e)}")
+                if not wait:
+                    sebs_client.logging.warning("Resource group deletion might still be in progress")

🧰 Tools

🪛 Ruff

647-647: Loop control variable idx not used within loop body

Rename unused idx to _idx

(B007)

sebs/aws/aws.py (1)

238-278: Consider extracting zip deployment logic.

The create_function method is handling both container and zip deployments, making it quite long. Consider extracting the zip deployment logic into a separate method for better maintainability.

+def _create_zip_function(self, create_function_params: dict, package: str, code_size: int,
+                        benchmark: str, architecture: str, language: str,
+                        language_runtime: str) -> dict:
+    if code_size < 50 * 1024 * 1024:
+        with open(package, "rb") as f:
+            package_body = f.read()
+        create_function_params["Code"] = {"ZipFile": package_body}
+    else:
+        code_package_name = os.path.basename(package)
+        storage_client = self.get_storage()
+        code_bucket = storage_client.get_bucket(Resources.StorageBucketType.DEPLOYMENT)
+        code_prefix = os.path.join(benchmark, architecture, code_package_name)
+        storage_client.upload(code_bucket, package, code_prefix)
+        create_function_params["Code"] = {
+            "S3Bucket": code_bucket,
+            "S3Key": code_prefix,
+        }
+    create_function_params["Runtime"] = f"{language}{self._map_language_runtime(language, language_runtime)}"
+    create_function_params["Handler"] = "handler.handler"
+    return create_function_params

 def create_function(self, code_package: Benchmark, func_name: str,
                    container_deployment: bool, container_uri: str) -> "LambdaFunction":
     # ... existing code ...
     if container_deployment:
         create_function_params["PackageType"] = "Image"
         create_function_params["Code"] = {"ImageUri": container_uri}
     else:
         create_function_params["PackageType"] = "Zip"
-        if code_size < 50 * 1024 * 1024:
-            # ... existing zip handling code ...
+        create_function_params = self._create_zip_function(
+            create_function_params, package, code_size, benchmark,
+            architecture, language, language_runtime
+        )

🧰 Tools

🪛 Ruff

253-253: Use a context manager for opening files

(SIM115)

sebs/gcp/gcp.py (1)

141-150: LGTM! Consider improving the error message.

The changes to add container deployment support are well-structured. However, the error message could be more informative.

Consider updating the error message to be more specific:
-            raise NotImplementedError("Container Deployment is not supported in GCP")
+            raise NotImplementedError("Container deployment is not supported in GCP. Use AWS for container-based deployments.")

sebs/regression.py (1)

131-131: Remove Unnecessary Commented Code

The line:
# for trigger in triggers:
is commented out. If this code is no longer needed, consider removing it to keep the codebase clean. If it's temporarily commented out for debugging or development purposes, please add a comment explaining its intended future use.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ce64f7d and 18cc75b.

📒 Files selected for processing (14)

config/systems.json (8 hunks)
sebs.py (13 hunks)
sebs/aws/aws.py (9 hunks)
sebs/aws/container.py (1 hunks)
sebs/benchmark.py (9 hunks)
sebs/config.py (1 hunks)
sebs/faas/container.py (1 hunks)
sebs/faas/system.py (7 hunks)
sebs/gcp/gcp.py (4 hunks)
sebs/openwhisk/container.py (1 hunks)
sebs/openwhisk/openwhisk.py (9 hunks)
sebs/regression.py (11 hunks)
sebs/sebs.py (2 hunks)
sebs/utils.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

sebs/benchmark.py
sebs/sebs.py

🧰 Additional context used

🪛 Ruff

sebs.py

647-647: Loop control variable idx not used within loop body

Rename unused idx to _idx

(B007)

sebs/aws/aws.py

253-253: Use a context manager for opening files

(SIM115)

sebs/aws/container.py

79-79: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

sebs/faas/container.py

62-65: Use ternary operator line = json.loads(txt) if isinstance(txt, str) else txt instead of if-else-block

Replace if-else-block with line = json.loads(txt) if isinstance(txt, str) else txt

(SIM108)

🔇 Additional comments (26)

sebs/openwhisk/container.py (1)

28-46: 🛠️ Refactor suggestion

Add input validation and improve registry handling.

While the method correctly constructs registry information, it could benefit from additional validation and error handling.

Consider these improvements:

 def registry_name(
     self, benchmark: str, language_name: str, language_version: str, architecture: str
 ) -> Tuple[str, str, str, str]:
+    """Construct registry information for container image.
+    
+    Args:
+        benchmark (str): Benchmark identifier
+        language_name (str): Programming language name
+        language_version (str): Language version
+        architecture (str): Target architecture
+    
+    Returns:
+        Tuple[str, str, str, str]: Registry name, repository name, image tag, and URI
+    
+    Raises:
+        ValueError: If input parameters are invalid
+    """
+    if not all([benchmark, language_name, language_version, architecture]):
+        raise ValueError("All parameters must be non-empty strings")
+
+    # Sanitize inputs to ensure valid Docker names
+    safe_benchmark = benchmark.lower().replace(' ', '-')
+    safe_lang_name = language_name.lower().replace(' ', '-')
 
     registry_name = self.config.resources.docker_registry
 
     # We need to retag created images when pushing to registry other
     # than default
     repository_name = self.system_config.docker_repository()
     image_tag = self.system_config.benchmark_image_tag(
-        self.name(), benchmark, language_name, language_version, architecture
+        self.name(), safe_benchmark, safe_lang_name, language_version, architecture
     )
     if registry_name is not None and registry_name != "":
+        # Validate registry name format
+        if not registry_name.lower().replace('.', '').replace('-', '').isalnum():
+            raise ValueError(f"Invalid registry name: {registry_name}")
         repository_name = f"{registry_name}/{repository_name}"
     else:
-        registry_name = "Docker Hub"
+        registry_name = "docker.io"  # Use official Docker Hub registry name
     image_uri = f"{repository_name}:{image_tag}"
 
     return registry_name, repository_name, image_tag, image_uri

Key improvements:

Added input validation
Added parameter sanitization for Docker compatibility
Added registry name validation
Used official Docker Hub registry name

Let's verify the Docker registry naming conventions:

sebs/aws/container.py (4)

1-13: LGTM! Well-structured imports and class definition.

The imports are properly organized and the class inheritance is appropriate for implementing AWS ECR container support.

22-33: LGTM! Well-structured constructor.

The constructor properly initializes the ECR client and maintains parent class functionality.

34-37: LGTM! Well-typed client property.

54-65: 🛠️ Refactor suggestion

Simplify the method and improve error handling.

The method has a redundant return False at the end.
The past review comment about logging exceptions is still valid.

    def find_image(self, repository_name, image_tag) -> bool:
        try:
            response = self.ecr_client.describe_images(
                repositoryName=repository_name, imageIds=[{"imageTag": image_tag}]
            )
            if response["imageDetails"]:
                return True
-        except ClientError:
+        except ClientError as e:
+            self.logging.error(f"Error checking image existence: {e}")
            return False
-
-        return False

config/systems.json (6)

58-59: LGTM: Local configuration changes are consistent.

The addition of architecture and deployment specifications for local development is well-structured.

113-114: LGTM: AWS container deployment configuration is well-structured.

The changes properly support:

Multiple architectures (x64, arm64)
Multiple deployment methods (package, container)

This aligns well with the PR objectives for container deployment support.

120-126: LGTM: Azure configuration changes maintain consistency.

The restructuring of base images and addition of architecture/deployment specifications align with the overall configuration structure.

Also applies to: 171-172

178-185: LGTM: GCP configuration uses secure base images.

Good practice observed:

Using latest Ubuntu LTS (22.04) for Python runtimes
Consistent architecture specification
Clear deployment method specification

Also applies to: 228-229

84-85: Document the setup.py requirements and usage.

The addition of setup.py to deployment files is a significant change that affects the packaging process. Please ensure this is documented in:

Deployment documentation
Configuration guide
Any relevant README files

282-283: Verify container-only deployment for OpenWhisk.

OpenWhisk is configured with "deployments": ["container"] while other providers support both package and container deployments. Please verify if this is intentional and document the reasoning.

✅ Verification successful

Based on the extensive search results, I can now provide a definitive response about the container-only deployment configuration for OpenWhisk:

Container-only deployment is intentional and required for OpenWhisk

OpenWhisk is specifically configured to use container-only deployment due to its inherent 48MB code package size limitation. This is well-documented in the codebase:

The container-only approach is explicitly mentioned in docs/platforms.md: "OpenWhisk has a very low code package size limit of only 48 megabytes. So, to circumvent this limit, we deploy functions using pre-built Docker images."
The implementation in sebs/openwhisk/ consistently uses Docker containers for function deployment
All official OpenWhisk base images are properly configured in config/systems.json

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if this limitation is documented
fd -t f "README|DEPLOY|CONFIGURATION" -X grep -l -A 5 "OpenWhisk.*container"

Length of output: 78

Script:

#!/bin/bash
# Let's check the systems.json file to understand deployment patterns across providers
cat config/systems.json | jq -r 'to_entries[] | select(.value.deployments != null) | "\(.key): \(.value.deployments | join(", "))"'

# Also search for any OpenWhisk related documentation or configuration
rg -i "openwhisk" -A 5 -B 5

Length of output: 69282

sebs/utils.py (1)

57-57: Verify the impact of the condition change.

The change from if value: to if value is not None: allows falsy values (0, '', [], {}, False) to be stored in the nested dictionary. While this might be intentional for container configurations, please verify that this doesn't break existing functionality.

✅ Verification successful

Let me analyze the usage of update_nested_dict further by checking the values being passed to it.

Based on the gathered information, I can now provide the final response:

The condition change is safe and intentional

The change from if value: to if value is not None: is actually a safer implementation because:

All calls to update_nested_dict pass configuration values that should be explicitly checked for None rather than falsy values
The function is used to update configuration dictionaries where empty strings, zeros, or False values might be valid configuration options
The change aligns with the function's type hint Optional[str], making the implementation more consistent with its interface

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for calls to update_nested_dict to understand the impact
# Test: Look for potential issues with falsy values being stored

# Find all direct calls to update_nested_dict
echo "Direct calls to update_nested_dict:"
rg "update_nested_dict\(" -A 2

# Find potential configuration files that might be affected
echo -e "\nPotential configuration files:"
fd -e json -e yaml -e yml

Length of output: 2993

Script:

#!/bin/bash
# Check the values being passed to update_nested_dict
echo "Checking update_nested_dict usage in sebs.py:"
rg -B 2 "update_nested_dict\(" sebs.py

# Check the implementation of update_nested_dict
echo -e "\nChecking update_nested_dict implementation:"
rg -B 5 -A 10 "def update_nested_dict" sebs/utils.py

# Check if there are any tests for update_nested_dict
echo -e "\nChecking for tests:"
fd -e py test

Length of output: 1949

sebs/faas/container.py (4)

1-14: LGTM!

The imports are well-organized and the class hierarchy is appropriate.

15-28: LGTM!

The abstract method and property implementations follow Python best practices.

123-128: LGTM!

The abstract method is well-defined with clear parameters and return type annotation.

42-59: ⚠️ Potential issue

Security: Fix command injection vulnerability.

The execute() call with f-string interpolation is vulnerable to command injection. This is still a valid issue from the previous review.

Apply this fix:

-                execute(f"docker manifest inspect {repository_name}:{image_tag}")
+                execute(["docker", "manifest", "inspect", f"{repository_name}:{image_tag}"])

sebs/openwhisk/openwhisk.py (2)

35-37: LGTM: Container client initialization is well-structured.

The OpenWhiskContainer is properly initialized with all necessary parameters.

114-116: Verify container image building across architectures.

The image building process needs to be tested across different architectures to ensure compatibility.

sebs/faas/system.py (2)

188-212: LGTM!

The method signature and docstring are well-structured and complete. The documentation properly describes the new container-related parameters and potential errors.

220-238: LGTM!

The method signature and docstring are well-structured and complete. The documentation properly describes the new container-related parameters and potential errors.

sebs.py (2)

100-104: LGTM! Well-documented CLI option.

The --container-deployment flag is well-implemented with clear, descriptive help text that properly indicates AWS-only support.

133-133: Verify container deployment configuration usage.

The container deployment configuration is properly added. Let's verify its usage across the codebase.

Also applies to: 157-157

✅ Verification successful

Based on the shell script results, I can generate the final response:

Container deployment configuration is properly integrated and verified

The container deployment configuration is correctly used across the codebase:

Properly propagated through function parameters in deployment systems
Correctly handled in AWS Lambda with container/zip package type switching
Appropriately rejected with clear error messages in unsupported platforms (Azure, GCP, Local)
Well-integrated with the configuration system and cache validation

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify container deployment configuration usage
# Test: Search for container_deployment usage in deployment implementations
rg -A 5 "container_deployment"

# Test: Search for container deployment related configuration
ast-grep --pattern 'container_deployment = $_'

Length of output: 19665

sebs/aws/aws.py (2)

13-13: LGTM: Clean ECR client integration.

The ECR client integration follows the existing pattern of client initialization in the AWS class.

Also applies to: 76-78

400-401: LGTM: Clean helper method implementations.

The changes to default_function_name and the addition of disable_rich_output are straightforward and well-implemented.

Also applies to: 629-630

sebs/gcp/gcp.py (2)

202-211: LGTM! The changes are consistent.

The method signature update and error handling for container deployment are consistent with the package_code method.

345-354: LGTM! The changes maintain consistency.

The method signature update and error handling for container deployment are consistent with other methods in the class.

coderabbitai · 2024-11-07T01:34:25Z

sebs/aws/container.py

+    def push_image(self, repository_uri, image_tag):
+
+        username, password, registry_url = self.config.resources.ecr_repository_authorization(
+            self.client
+        )
+
+        try:
+            self.docker_client.login(username=username, password=password, registry=registry_url)
+            super().push_image(repository_uri, image_tag)
+            self.logging.info(f"Successfully pushed the image to registry {repository_uri}.")
+        except docker.errors.APIError as e:
+            self.logging.error(f"Failed to push the image to registry {repository_uri}.")
+            self.logging.error(f"Error: {str(e)}")
+            raise RuntimeError("Couldn't push to Docker registry")


🛠️ Refactor suggestion

Implement proper exception chaining in error handling.

The error handling is good, but the exception should be chained to preserve the original error context.

try: self.docker_client.login(username=username, password=password, registry=registry_url) super().push_image(repository_uri, image_tag) self.logging.info(f"Successfully pushed the image to registry {repository_uri}.") except docker.errors.APIError as e: self.logging.error(f"Failed to push the image to registry {repository_uri}.") self.logging.error(f"Error: {str(e)}") - raise RuntimeError("Couldn't push to Docker registry") + raise RuntimeError("Couldn't push to Docker registry") from e

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def push_image(self, repository_uri, image_tag):

username, password, registry_url = self.config.resources.ecr_repository_authorization(

self.client

)

try:

self.docker_client.login(username=username, password=password, registry=registry_url)

super().push_image(repository_uri, image_tag)

self.logging.info(f"Successfully pushed the image to registry {repository_uri}.")

except docker.errors.APIError as e:

self.logging.error(f"Failed to push the image to registry {repository_uri}.")

self.logging.error(f"Error: {str(e)}")

raise RuntimeError("Couldn't push to Docker registry")

def push_image(self, repository_uri, image_tag):

username, password, registry_url = self.config.resources.ecr_repository_authorization(

self.client

)

try:

self.docker_client.login(username=username, password=password, registry=registry_url)

super().push_image(repository_uri, image_tag)

self.logging.info(f"Successfully pushed the image to registry {repository_uri}.")

except docker.errors.APIError as e:

self.logging.error(f"Failed to push the image to registry {repository_uri}.")

self.logging.error(f"Error: {str(e)}")

raise RuntimeError("Couldn't push to Docker registry") from e

🧰 Tools

🪛 Ruff

79-79: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

coderabbitai · 2024-11-07T01:34:25Z

sebs/config.py

+    def benchmark_base_images(
+        self, deployment_name: str, language_name: str, architecture: str
+    ) -> Dict[str, str]:
+        return self._system_config[deployment_name]["languages"][language_name]["base_images"][
+            architecture
+        ]


🛠️ Refactor suggestion

Add parameter validation for benchmark_base_images.

Similar to supported_language_versions, this method should validate its parameters.

Consider this implementation:

def benchmark_base_images( self, deployment_name: str, language_name: str, architecture: str ) -> Dict[str, str]: + if deployment_name not in self._system_config: + raise ValueError(f"Unknown deployment: {deployment_name}") + if language_name not in self._system_config[deployment_name]["languages"]: + raise ValueError(f"Unsupported language: {language_name}") + if architecture not in self.supported_architecture(deployment_name): + raise ValueError(f"Unsupported architecture: {architecture}") return self._system_config[deployment_name]["languages"][language_name]["base_images"][ architecture ]

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-07T01:34:25Z

sebs/config.py

+    def supported_package_deployment(self, deployment_name: str) -> bool:
+        return "package" in self._system_config[deployment_name]["deployments"]
+
+    def supported_container_deployment(self, deployment_name: str) -> bool:
+        return "container" in self._system_config[deployment_name]["deployments"]
+


🛠️ Refactor suggestion

Add deployment name validation.

Both methods should validate the deployment_name parameter to prevent KeyError exceptions.

Consider this implementation for both methods:

def supported_package_deployment(self, deployment_name: str) -> bool: + if deployment_name not in self._system_config: + raise ValueError(f"Unknown deployment: {deployment_name}") return "package" in self._system_config[deployment_name]["deployments"] def supported_container_deployment(self, deployment_name: str) -> bool: + if deployment_name not in self._system_config: + raise ValueError(f"Unknown deployment: {deployment_name}") return "container" in self._system_config[deployment_name]["deployments"]

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def supported_package_deployment(self, deployment_name: str) -> bool:

return "package" in self._system_config[deployment_name]["deployments"]

def supported_container_deployment(self, deployment_name: str) -> bool:

return "container" in self._system_config[deployment_name]["deployments"]

def supported_package_deployment(self, deployment_name: str) -> bool:

if deployment_name not in self._system_config:

raise ValueError(f"Unknown deployment: {deployment_name}")

return "package" in self._system_config[deployment_name]["deployments"]

def supported_container_deployment(self, deployment_name: str) -> bool:

if deployment_name not in self._system_config:

raise ValueError(f"Unknown deployment: {deployment_name}")

return "container" in self._system_config[deployment_name]["deployments"]

coderabbitai · 2024-11-07T01:34:25Z

sebs/config.py

+    def supported_language_versions(
+        self, deployment_name: str, language_name: str, architecture: str
+    ) -> List[str]:
+        return self._system_config[deployment_name]["languages"][language_name]["base_images"][
+            architecture


🛠️ Refactor suggestion

Add architecture parameter validation and error handling.

The method should validate the architecture parameter and handle missing architectures gracefully to prevent runtime errors.

Consider this implementation:

def supported_language_versions( self, deployment_name: str, language_name: str, architecture: str ) -> List[str]: + if architecture not in self.supported_architecture(deployment_name): + raise ValueError(f"Unsupported architecture: {architecture}") return self._system_config[deployment_name]["languages"][language_name]["base_images"][ architecture ].keys()

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-07T01:34:25Z

sebs/config.py

+        tag = self.benchmark_image_tag(
+            system, benchmark, language_name, language_version, architecture
+        )


💡 Codebase verification

Based on the investigation of the codebase, I can now provide the final response:

Parameter validation is needed, but tag format is Docker-compliant

The current implementation of benchmark_image_tag follows Docker's tag naming conventions correctly:

Uses only allowed characters (alphanumeric, dots, dashes)

Has a consistent format: function.<system>.<benchmark>.<language>-<version>-<architecture>[-prefix]

However, there's no parameter validation, which could lead to runtime issues. The suggested improvements in the review comment are valid and would enhance code reliability.

Add parameter validation in benchmark_image_tag method

Add docstring to document the tag format

Consider adding test cases for the validation

🔗 Analysis chain

Verify image tag format and add parameter validation.

The tag format looks good, but consider adding parameter validation and documenting the format.

Let's verify the tag format meets Docker requirements:

Consider adding validation and documentation:

def benchmark_image_tag( self, system: str, benchmark: str, language_name: str, language_version: str, architecture: str, ) -> str: + """Generate a Docker image tag following the format: function.<system>.<benchmark>.<language>-<version>-<arch>[-prefix] + + Args: + system: The deployment system (e.g., 'aws') + benchmark: The benchmark name + language_name: The programming language + language_version: The language version + architecture: The target architecture + + Returns: + str: A valid Docker image tag + """ + if not all([system, benchmark, language_name, language_version, architecture]): + raise ValueError("All parameters must be non-empty strings") tag = f"function.{system}.{benchmark}.{language_name}-{language_version}-{architecture}" if self.image_tag_prefix: tag = f"{tag}-{self.image_tag_prefix}" return tag

Also applies to: 87-87

Let me gather more information about the image tag format by checking the actual implementation and test cases.

Let me check for any test cases that might validate the image tag format and any existing validation in the config class.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Description: Verify image tag format in the codebase # Test: Search for any potential invalid characters in image tags rg -g '*.py' 'tag\s*=.*[^a-zA-Z0-9_.-]'

Length of output: 998

Script:

#!/bin/bash # Check the implementation of benchmark_image_tag and its test cases rg -g '*.py' -A 10 'def benchmark_image_tag' # Check for any existing parameter validation rg -g '*.py' 'raise ValueError.*tag' # Look for test cases related to image tags rg -g '*test*.py' 'test.*image.*tag'

Length of output: 659

Script:

#!/bin/bash # Check for test files related to config fd test_config.py # Check the content of test files for image tag testing rg -g '*test*.py' -B 2 -A 5 'benchmark_image_tag' # Check for any validation in Config class rg -g '*.py' -B 2 -A 5 'class Config'

Length of output: 820

coderabbitai · 2024-11-07T01:34:26Z

sebs/faas/system.py

+        architecture: str,
        benchmark: str,
        is_cached: bool,
-    ) -> Tuple[str, int]:
+        container_deployment: bool,
+    ) -> Tuple[str, int, str]:


⚠️ Potential issue

Fix return type annotation inconsistency.

The return type annotation doesn't match the docstring. The tuple should include the container URI as the fourth element.

- ) -> Tuple[str, int, str]: + ) -> Tuple[str, int, bool, str]:

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

architecture: str,

benchmark: str,

is_cached: bool,

) -> Tuple[str, int]:

container_deployment: bool,

) -> Tuple[str, int, str]:

architecture: str,

benchmark: str,

is_cached: bool,

container_deployment: bool,

) -> Tuple[str, int, bool, str]:

coderabbitai · 2024-11-07T01:34:26Z

sebs/aws/aws.py

-                Runtime="{}{}".format(
+                create_function_params["PackageType"] = "Zip"
+                if code_size < 50 * 1024 * 1024:
+                    package_body = open(package, "rb").read()


⚠️ Potential issue

Use context manager for file operations.

Use a context manager to ensure proper file cleanup.

-package_body = open(package, "rb").read() +with open(package, "rb") as f: + package_body = f.read()

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

package_body = open(package, "rb").read()

with open(package, "rb") as f:

package_body = f.read()

🧰 Tools

🪛 Ruff

253-253: Use a context manager for opening files

(SIM115)

coderabbitai · 2024-11-07T01:34:26Z

sebs/aws/aws.py

+        if container_deployment:
+            self.client.update_function_code(FunctionName=name, ImageUri=container_uri)
+        else:
+            code_size = code_package.code_size
+            package = code_package.code_location
+            benchmark = code_package.benchmark
+
+            function_cfg = FunctionConfig.from_benchmark(code_package)
+            architecture = function_cfg.architecture.value
+
+            # Run AWS update
+            # AWS Lambda limit on zip deployment
+            if code_size < 50 * 1024 * 1024:
+                with open(package, "rb") as code_body:
+                    self.client.update_function_code(
+                        FunctionName=name,
+                        ZipFile=code_body.read(),
+                        Architectures=[self._map_architecture(architecture)],
+                    )
+            # Upload code package to S3, then update
+            else:
+                code_package_name = os.path.basename(package)
+
+                storage = cast(S3, self.get_storage())
+                bucket = function.code_bucket(code_package.benchmark, storage)
+                code_prefix = os.path.join(benchmark, architecture, code_package_name)
+                storage.upload(bucket, package, code_prefix)

-        # Run AWS update
-        # AWS Lambda limit on zip deployment
-        if code_size < 50 * 1024 * 1024:
-            with open(package, "rb") as code_body:
                self.client.update_function_code(
                    FunctionName=name,
-                    ZipFile=code_body.read(),
+                    S3Bucket=bucket,
+                    S3Key=code_prefix,
                    Architectures=[self._map_architecture(architecture)],
                )
-        # Upload code package to S3, then update
-        else:
-            code_package_name = os.path.basename(package)
-
-            storage = cast(S3, self.get_storage())
-            bucket = function.code_bucket(code_package.benchmark, storage)
-            code_prefix = os.path.join(benchmark, architecture, code_package_name)
-            storage.upload(bucket, package, code_prefix)
-
-            self.client.update_function_code(
-                FunctionName=name,
-                S3Bucket=bucket,
-                S3Key=code_prefix,
-                Architectures=[self._map_architecture(architecture)],
-            )
+


🛠️ Refactor suggestion

Reduce code duplication in deployment logic.

The zip deployment logic is duplicated between create_function and update_function. Consider extracting common code into a shared helper method.

+def _handle_zip_deployment(self, package: str, code_size: int, benchmark: str, + architecture: str) -> Tuple[str, str, str]: + if code_size < 50 * 1024 * 1024: + return None, None, None + code_package_name = os.path.basename(package) + storage = cast(S3, self.get_storage()) + bucket = storage.get_bucket(Resources.StorageBucketType.DEPLOYMENT) + code_prefix = os.path.join(benchmark, architecture, code_package_name) + storage.upload(bucket, package, code_prefix) + return bucket, code_prefix, code_package_name def update_function(self, function: Function, code_package: Benchmark, container_deployment: bool, container_uri: str): # ... existing code ... if container_deployment: self.client.update_function_code(FunctionName=name, ImageUri=container_uri) else: - # ... existing zip handling code ... + bucket, code_prefix, _ = self._handle_zip_deployment( + package, code_size, benchmark, architecture + ) + if bucket is None: + with open(package, "rb") as code_body: + self.client.update_function_code( + FunctionName=name, + ZipFile=code_body.read(), + Architectures=[self._map_architecture(architecture)], + ) + else: + self.client.update_function_code( + FunctionName=name, + S3Bucket=bucket, + S3Key=code_prefix, + Architectures=[self._map_architecture(architecture)], + )

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-07T01:34:27Z

sebs/aws/aws.py

+        # if the containerized deployment is set to True
+        if container_deployment:
+            # build base image and upload to ECR
+            _, container_uri = self.ecr_client.build_base_image(
+                directory, language_name, language_version, architecture, benchmark, is_cached
+            )


💡 Codebase verification

🛠️ Refactor suggestion

Based on the gathered information, I can now provide the final response. The codebase shows consistent error handling patterns in container-related operations (in container.py) and other AWS services, but the specific build_base_image call is not wrapped in error handling. This confirms the original review comment's concern.

Add error handling for container image building

The container image building process should follow the established error handling patterns found in the codebase, particularly since other ECR and container operations already implement proper error handling (as seen in container.py with docker.errors.APIError handling).

if container_deployment: - # build base image and upload to ECR - _, container_uri = self.ecr_client.build_base_image( - directory, language_name, language_version, architecture, benchmark, is_cached - ) + try: + # build base image and upload to ECR + _, container_uri = self.ecr_client.build_base_image( + directory, language_name, language_version, architecture, benchmark, is_cached + ) + except Exception as e: + self.logging.error(f"Failed to build container image: {e}") + raise RuntimeError("Container image build failed") from e

🔗 Analysis chain

Add error handling for container image building.

The container image building process could fail for various reasons (network issues, invalid Dockerfile, etc.). Consider adding error handling to gracefully handle failures.

if container_deployment: - # build base image and upload to ECR - _, container_uri = self.ecr_client.build_base_image( - directory, language_name, language_version, architecture, benchmark, is_cached - ) + try: + # build base image and upload to ECR + _, container_uri = self.ecr_client.build_base_image( + directory, language_name, language_version, architecture, benchmark, is_cached + ) + except Exception as e: + self.logging.error(f"Failed to build container image: {e}") + raise RuntimeError("Container image build failed") from e

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Check if there are any error handling patterns in the codebase rg -A 2 "try:.*?(build|push).*?except"

Length of output: 40

Script:

#!/bin/bash # Let's check for any existing error handling patterns in the ECR client and related files rg -A 5 "try:" sebs/aws/ # Also check for any error handling in the aws.py file specifically rg -A 5 "except" sebs/aws/aws.py # Look for the ECR client implementation to understand current error handling ast-grep --pattern 'class ECRClient { $$$ }'

Length of output: 5853

Script:

#!/bin/bash # Let's check the ECR client's build_base_image method implementation ast-grep --pattern 'build_base_image($$$) { $$$ }' # Also check for any existing error handling in container.py since it seems to handle ECR operations rg -A 10 "build_base_image" sebs/aws/

Length of output: 770

coderabbitai · 2024-11-07T01:34:27Z

sebs/regression.py

+        bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:None]
+        test_name = f"{bench}, {arch}, {deployment_type}"


⚠️ Potential issue

Correct Slice Notation in Test ID Parsing

The current code:

bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:None]

uses incorrect slice notation. In Python, slicing to the end of a list should omit the stop index or use None without brackets. The [-3:None] slice does not work as intended. The correct way is to use [-3:].

Apply this diff to fix the slice notation:

-bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:None] +bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:]

mcopik requested changes Jun 14, 2024

View reviewed changes

mcopik requested changes Jun 28, 2024

View reviewed changes

config/example.json Outdated Show resolved Hide resolved

docs/usage.md Outdated Show resolved Hide resolved

sebs/aws/config.py Outdated Show resolved Hide resolved

coderabbitai bot reviewed Jul 2, 2024

View reviewed changes

[Feat] Add build base docker image as abstract method.

b81d30e

prajinkhadka added 21 commits November 6, 2024 16:08

[Feat] Update aws to build the base docker image.

1aaf601

[Feat] Add Container deployment flag. To Do [Check for the valid prov…

8b0881a

…iders that support container deployment]

[Feat] Add Container deployment support.

014c786

[Feat] Addd support of the base docker image AWS (python)

1bfc272

[Feat] Code Refacotring.

abf2724

[Feat] Added Refactoring to do tasks.

ab23f73

[Feat] 1. Return container deployment flag, container URI. 2. Add cre…

dbada8f

…ate function with container URI

[Feat] Refactored the code. To Do: Need to add entrrypoint for docker…

bdca27e

… aws.

Add entry point in the Dockerfile

3b248c6

Maintain cache for the container deployment.

b486fcb

Add correct entrypoint in dockerfile for lambda

e8e5c58

Remove Whitespaces

5565b8e

Docler client login through user provided credentials, refactor ecr a…

33f72b8

…uthorization, and rase NotImplementedError for GCP and Azure

Fix Mistyped

4f61e00

Minor fixes: Getting docker username, passwoord from config

1f65c2a

Minor fixes: Only show the push image error once

5a59ff4

Minor fixes: Refactor some print debugging statements.

2955bd8

Get repository name from Config.

3950429

Get repository name from Config if user has provided else generate ra…

29fae4e

…ndomly

Refactoring

2577aa6

Refactored the create function with simple if/else

b7015ad

mcopik added 7 commits November 6, 2024 20:04

[aws] remove debug output

8175ab3

[aws] [system] Extend cache system to support containers

8f39932

[aws] Fix typo in the Docker image

37f167f

[aws] Build arm64 Docker images

32b4646

[system] Supporting proper caching of containers with different archi…

e4a5b26

…tectures

[aws] Prune pip cache after building the image

8a91679

[aws] Remove unnecessary package

03561be

coderabbitai bot reviewed Nov 6, 2024

View reviewed changes

mcopik added 3 commits November 6, 2024 22:43

[aws] CodeRabbit fixes

bee420a

[aws] Adjust other implementations to a new interface

00c5d59

[aws] Move container implementation to a separate class

ce64f7d

coderabbitai bot reviewed Nov 6, 2024

View reviewed changes

mcopik added 8 commits November 7, 2024 00:46

[aws] Fix CodeRabbit issue

279c66f

[aws] [whisk] Fuse two container implementations

8e3d1f6

[tools] Fix incorrect handling of False values

00765dd

[system] Reorient container definition to distinguish between AWS and…

5d1bc6a

… OpenWhisk support

[system] Add storage configuration option for benchmark invoke

eeb365f

[system] Disable rich output for regression

2df6828

[benchmark] Separate directory for concurrent build of code package a…

df08ad3

…nd container

[system] Extend regression to support containers

18cc75b

coderabbitai bot reviewed Nov 7, 2024

View reviewed changes

mcopik added 2 commits November 7, 2024 02:38

[system] Provide documentation on multi-platform builds

7da8d56

[aws] Update docs

ff6359a

mcopik force-pushed the container-support branch from 18cc75b to 9df1aa0 Compare November 7, 2024 01:38

mcopik merged commit bf5bc35 into master Nov 7, 2024
1 check was pending

mcopik deleted the container-support branch November 7, 2024 01:42

coderabbitai bot mentioned this pull request Mar 18, 2025

Fix perf-cost and network-ping-pong experiments #246

Closed

coderabbitai bot mentioned this pull request Mar 26, 2025

New CLI options #243

Open

2 tasks

This was referenced Jun 18, 2025

Add support for C++ benchmarks. #99

Open

Improve SeBS documentation #244

Open

	cfg._flags = config["flags"] if "flags" in config else {}
	cfg._flags = config.get("flags", {})

	self.logging.info("Function size {:2f} MB".format(mbytes))
	self.logging.info(f"Function size {mbytes:.2f} MB")

	arr = bytearray((random.getrandbits(8) for i in range(size)))
	arr = bytearray(random.getrandbits(8) for i in range(size))

	package_body = open(package, "rb").read()
	with open(package, "rb") as f:
	package_body = f.read()

		bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:None]
		test_name = f"{bench}, {arch}, {deployment_type}"

Container support for AWS #205

Container support for AWS #205

Uh oh!

Conversation

prajinkhadka commented Jun 11, 2024 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

mcopik left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

prajinkhadka commented Jun 24, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mcopik commented Jul 2, 2024

Uh oh!

coderabbitai bot commented Jul 2, 2024

Uh oh!

coderabbitai bot commented Jul 2, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Possibly related issues

Possibly related PRs

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jul 2, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Jul 2, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Jul 2, 2024

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jul 2, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

mcopik commented Nov 6, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Nov 6, 2024

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 6, 2024

Choose a reason for hiding this comment

Uh oh!

prajinkhadka commented Jun 11, 2024 •

edited by coderabbitai bot

Loading

mcopik left a comment •

edited

Loading

prajinkhadka commented Jun 24, 2024 •

edited

Loading

coderabbitai bot commented Jul 2, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

mcopik commented Nov 6, 2024 •

edited

Loading