@prajinkhadka prajinkhadka commented Jun 11, 2024

Added support for container deployment on AWS:

  • Build the image locally.
  • Push the tagged image to ECR.
  • Check the container_deployment flag and deploy accordingly (either the image URI or the code package directly).
  • Maintain the cache (added a container_uri parameter to the cache config JSON).
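
The branch on the container_deployment flag can be pictured as choosing between two shapes of the AWS Lambda create_function request. The sketch below is an illustration, not the code from this PR: `build_create_function_kwargs` and its parameters are hypothetical names, and the default runtime and handler are assumptions. `PackageType`, `Code.ImageUri`, `Code.ZipFile`, `Runtime`, and `Handler` are request fields of the Lambda API as exposed by boto3.

```python
from typing import Any, Dict, Optional


def build_create_function_kwargs(
    name: str,
    role_arn: str,
    container_deployment: bool,
    container_uri: Optional[str] = None,
    zip_bytes: Optional[bytes] = None,
    runtime: str = "python3.8",  # assumed default, for illustration only
    handler: str = "handler.handler",  # assumed default, for illustration only
) -> Dict[str, Any]:
    """Return keyword arguments for a Lambda create_function call (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"FunctionName": name, "Role": role_arn}
    if container_deployment:
        if not container_uri:
            raise ValueError("container deployment requires an image URI")
        # Image-based functions are described by the image URI; Runtime and
        # Handler are not set in the request.
        kwargs["PackageType"] = "Image"
        kwargs["Code"] = {"ImageUri": container_uri}
    else:
        if zip_bytes is None:
            raise ValueError("code deployment requires a zip archive")
        # Zip-based functions ship the archive and name the runtime and handler.
        kwargs["PackageType"] = "Zip"
        kwargs["Runtime"] = runtime
        kwargs["Handler"] = handler
        kwargs["Code"] = {"ZipFile": zip_bytes}
    return kwargs
```

The returned dictionary would be passed as `client.create_function(**kwargs)`; keeping the decision in one pure function makes the flag easy to test without touching AWS.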

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a setup script for the Python package to streamline installation.
    • Added support for container deployment across multiple cloud platforms, enhancing flexibility in function management.
    • Updated configuration options to support AWS deployment with Python runtime.
    • Enhanced command-line interface with a new flag for container deployment.
    • Expanded the functionality of the AWS, Azure, GCP, and OpenWhisk classes to manage Docker images and container deployments.
    • Added a new ECRContainer class for managing Docker images in AWS Elastic Container Registry.
    • Updated documentation to clarify deployment processes and architecture options.
  • Bug Fixes

    • Improved error handling for unsupported deployments and clarified error messages.
  • Documentation

    • Updated documentation to reflect new features and usage instructions for container deployment.
  • Chores

    • Updated dependency lists to include necessary libraries for enhanced functionality.
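
The summary above does not show what the ECRContainer class looks like, so the following is only a sketch of two details any ECR integration has to handle: composing the image URI and decoding the registry credentials. The helper names are hypothetical. The registry host pattern and the base64-encoded `user:password` token format follow Amazon ECR's documented behavior.

```python
import base64
from typing import Tuple


def ecr_registry_host(account_id: str, region: str) -> str:
    """Host name of the private ECR registry for an account and region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Full image reference that can be pushed with Docker and passed to Lambda."""
    return f"{ecr_registry_host(account_id, region)}/{repository}:{tag}"


def decode_ecr_token(authorization_token: str) -> Tuple[str, str]:
    """Split the base64 token returned by ECR into (username, password)."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    username, password = decoded.split(":", 1)
    return username, password
```

The decoded pair is what a Docker client login would use against the registry host.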


@mcopik mcopik left a comment


Very nice progress! I left some comments.

The important issues are:

  • CLI option and documentation
  • Testing Node.js
  • Moving resource allocation to proper resource classes.
  • Checking the "update" of container-based function on AWS.
  • Regression testing


prajinkhadka commented Jun 24, 2024

Very nice progress! I left some comments.

The important issues are:

* CLI option and documentation

* Testing Node.js

* Moving resource allocation to proper resource classes.

* Checking the "update" of container-based function on AWS.

* Regression testing

Hi Marcin, I have reviewed the changes and run the regression tests for both Node.js and Python.
I also added logic to fail early if the container deployment flag is set to true on any platform other than AWS.
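
A fail-early check of this kind could look roughly like the sketch below. The function name, its parameters, and the default tuple of supported platforms are assumptions made for illustration; the PR's actual validation may be structured differently.

```python
from typing import Sequence


def check_container_deployment(
    deployment: str,
    container_deployment: bool,
    supported: Sequence[str] = ("aws",),  # assumed list of supporting platforms
) -> None:
    """Raise before any resources are created if the flag cannot be honored."""
    if container_deployment and deployment not in supported:
        raise RuntimeError(
            f"Container deployment is not supported on {deployment}; "
            f"supported platforms: {', '.join(supported)}."
        )
```

Running the check before packaging or uploading anything keeps an unsupported configuration from leaving partial state behind.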

regressionlogs_nodejs.txt
regressionlogs_python.txt


mcopik commented Jul 2, 2024

@coderabbitai review


coderabbitai bot commented Jul 2, 2024

Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


coderabbitai bot commented Jul 2, 2024

Caution

Review failed

The pull request is closed.

Walkthrough

This pull request introduces a variety of changes across multiple files to enhance support for containerized deployment in serverless benchmarks. Key modifications include the addition of a new setup script for a Python package, updates to configuration files to reflect AWS deployment settings, and enhancements to Dockerfiles for both Python and Node.js functions. Additionally, several classes and methods across the codebase have been updated to accommodate new parameters related to container deployment, improving the functionality and flexibility of the deployment process.

Changes

File Path | Change Summary
benchmarks/wrappers/aws/python/setup.py | New setup script added for Python package, specifying dependencies and package structure.
config/example.json | Updated deployment settings from "openwhisk" to "aws", changed runtime from "nodejs" to "python" 3.8, added "container_deployment" field.
config/systems.json | Added "container_deployment" key, updated deployment files for AWS Python, included new package dependency for Node.js.
dockerfiles/aws/nodejs/Dockerfile.function | New Dockerfile for Node.js function, includes build argument for base image and sets up function handler.
dockerfiles/aws/python/Dockerfile.function | New Dockerfile for Python function, includes conditional installation of dependencies based on version.
docs/modularity.md | Updated documentation for clarity on adding benchmarks and extending the SeBS framework.
docs/usage.md | Added --container-deployment flag to CLI documentation and refined local deployment instructions.
requirements.aws.txt | Updated boto3-stubs to include ecr support.
requirements.txt | Added rich package for terminal formatting.
sebs.py | Enhanced CLI with new --container-deployment option, updated function signatures to include new parameters.
sebs/aws/aws.py | Updated methods for Docker image management and Lambda function creation, added new methods for ECR interactions.
sebs/aws/config.py | Enhanced AWSResources class with Docker registry management methods and properties.
sebs/azure/azure.py | Updated methods to handle container deployments, added error handling for unsupported features.
sebs/benchmark.py | Added properties for managing container-related information.
sebs/cache.py | Updated methods to include container_uri for code package management.
sebs/config.py | Added method for retrieving supported container deployment options.
sebs/experiments/config.py | Introduced _container_deployment attribute for configuration management.
sebs/experiments/invocation_overhead.py | Updated before_sample method to include new parameters for function updates.
sebs/faas/system.py | Enhanced methods for packaging and creating functions with container deployment support.
sebs/gcp/gcp.py | Updated methods to accommodate container deployment parameters.
sebs/local/local.py | Enhanced methods to manage container deployment parameters.
sebs/openwhisk/openwhisk.py | Updated methods for Docker image management and function creation with container deployment parameters.
sebs/sebs.py | Modified get_deployment method to validate container deployment configuration.

Possibly related issues

  • Container Support for AWS #228: This PR addresses container support for AWS, which aligns with the changes made to enhance Docker and container deployment capabilities in this pull request.

Possibly related PRs

🐰 In a world of clouds and code,
Where functions hop and data flowed,
We build with care, with Docker's might,
Container dreams take joyful flight!
So let us cheer, with paws so spry,
For benchmarks soar, oh me, oh my! 🐇



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 13

Outside diff range and nitpick comments (4)
sebs/openwhisk/openwhisk.py (2)

Line range hint 123-206: Optimize Docker Image Build and Push Logic.

The method includes redundant checks and can be optimized for clarity and performance.

-        if registry_name is not None and registry_name != "":
-            repository_name = f"{registry_name}/{repository_name}"
-        else:
-            registry_name = "Docker Hub"
+        if registry_name:
+            repository_name = f"{registry_name}/{repository_name}"
+        else:
+            registry_name = "Docker Hub"

        # Check if the image is already in the registry.
        if is_cached and self.find_image(self.docker_client, repository_name, image_tag):
            self.logging.info(
                f"Skipping building OpenWhisk Docker package for {benchmark}, using "
                f"Docker image {repository_name}:{image_tag} from registry: "
                f"{registry_name}."
            )
            return False, repository_name

        self.logging.info(
            f"Image {repository_name}:{image_tag} doesn't exist in the registry, "
            f"building OpenWhisk package for {benchmark}."
        )

        build_dir = os.path.join(directory, "docker")
        os.makedirs(build_dir, exist_ok=True)
        shutil.copy(
            os.path.join(DOCKER_DIR, self.name(), language_name, "Dockerfile.function"),
            os.path.join(build_dir, "Dockerfile"),
        )

        for fn in os.listdir(directory):
            if fn not in ("index.js", "__main__.py"):
                file = os.path.join(directory, fn)
                shutil.move(file, build_dir)

        with open(os.path.join(build_dir, ".dockerignore"), "w") as f:
            f.write("Dockerfile")

        builder_image = self.system_config.benchmark_base_images(self.name(), language_name)[
            language_version
        ]
        self.logging.info(f"Build the benchmark base image {repository_name}:{image_tag}.")

        buildargs = {"VERSION": language_version, "BASE_IMAGE": builder_image}
        image, _ = self.docker_client.images.build(
            tag=f"{repository_name}:{image_tag}", path=build_dir, buildargs=buildargs
        )

        # Now push the image to the registry
        self.logging.info(
            f"Push the benchmark base image {repository_name}:{image_tag} "
            f"to registry: {registry_name}."
        )
        self.push_image_to_repository(self.docker_client, repository_name, image_tag)

        return True, repository_name
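
The registry check at the top of this snippet touches two variables with different roles: registry_name is a label used in log messages, while repository_name is the reference that is actually built and pushed. A minimal sketch of that logic as a standalone helper, with `resolve_repository` being a hypothetical name not present in the codebase:

```python
from typing import Optional, Tuple


def resolve_repository(
    registry_name: Optional[str], repository_name: str
) -> Tuple[str, str]:
    """Return (registry label for logs, repository reference to build and push)."""
    if registry_name:
        # A custom registry prefixes the repository reference.
        return registry_name, f"{registry_name}/{repository_name}"
    # No registry configured: the reference is unchanged and only the label defaults.
    return "Docker Hub", repository_name
```

Returning both values makes it explicit that the "Docker Hub" default applies to the label only and must never end up in the repository reference.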

Line range hint 255-335: Optimize Function Creation Logic.

The method includes redundant checks and can be optimized for clarity and performance.

-            function_found = False
-            docker_image = ""
+            function_found = any(
+                func_name in line.split()[0] for line in actions.stdout.decode().split("\n") if line
+            )

            function_cfg = OpenWhiskFunctionConfig.from_benchmark(code_package)
            function_cfg.storage = cast(Minio, self.get_storage()).config
            if function_found:
                # docker image is overwritten by the update
                res = OpenWhiskFunction(
                    func_name, code_package.benchmark, code_package.hash, function_cfg
                )
                # Update function - we don't know what version is stored
                self.logging.info(f"Retrieved existing OpenWhisk action {func_name}.")
                self.update_function(res, code_package, container_deployment, container_uri)
            else:
                try:
                    self.logging.info(f"Creating new OpenWhisk action {func_name}")
                    docker_image = self.system_config.benchmark_image_name(
                        self.name(),
                        code_package.benchmark,
                        code_package.language_name,
                        code_package.language_version,
                    )
                    subprocess.run(
                        [
                            *self.get_wsk_cmd(),
                            "action",
                            "create",
                            func_name,
                            "--web",
                            "true",
                            "--docker",
                            docker_image,
                            "--memory",
                            str(code_package.benchmark_config.memory),
                            "--timeout",
                            str(code_package.benchmark_config.timeout * 1000),
                            *self.storage_arguments(),
                            code_package.code_location,
                        ],
                        stderr=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        check=True,
                    )
                    function_cfg.docker_image = docker_image
                    res = OpenWhiskFunction(
                        func_name, code_package.benchmark, code_package.hash, function_cfg
                    )
                except subprocess.CalledProcessError as e:
                    self.logging.error(f"Cannot create action {func_name}.")
                    self.logging.error(f"Output: {e.stderr.decode('utf-8')}")
                    raise RuntimeError(e)

        except FileNotFoundError:
            self.logging.error("Could not retrieve OpenWhisk functions - is path to wsk correct?")
            raise RuntimeError("Failed to access wsk binary")

        # Add LibraryTrigger to a new function
        trigger = LibraryTrigger(func_name, self.get_wsk_cmd())
        trigger.logging_handlers = self.logging_handlers
        res.add_trigger(trigger)

        return res
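
The lookup in the snippet above uses a substring test on the first column of the action listing, which also matches an action whose name merely contains func_name. A stricter variant, sketched below, compares the last path component exactly. The helper name is hypothetical, and the listing layout (a qualified name such as /guest/name in the first column) is an assumption about the CLI output, not something this review confirms.

```python
def action_exists(listing: str, func_name: str) -> bool:
    """Check whether an action with exactly this name appears in the listing text."""
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Assumed layout: first column is a qualified name like /namespace/name.
        if fields[0].rsplit("/", 1)[-1] == func_name:
            return True
    return False
```

Whether the exact match is preferable depends on how function names are generated; with unique, fixed-format names the substring test may be sufficient.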
sebs.py (1)

Line range hint 541-568: Optimize Resource Listing Logic.

The function includes redundant variables and can be optimized for clarity and performance.

-        for idx, bucket in enumerate(buckets):
-            sebs_client.logging.info(f"({idx}) {bucket}")
+        for bucket in buckets:
+            sebs_client.logging.info(f"{bucket}")

    elif resource == "resource-groups":

        if deployment_client.name() != "azure":
            sebs_client.logging.error("Resource groups are only supported on Azure!")
            return

-        groups = deployment_client.config.resources.list_resource_groups(
-            deployment_client.cli_instance
-        )
-        sebs_client.logging.info("Resource grup:")
-        for idx, bucket in enumerate(groups):
-            sebs_client.logging.info(f"({idx}) {bucket}")
+        groups = deployment_client.config.resources.list_resource_groups(deployment_client.cli_instance)
+        sebs_client.logging.info("Resource group:")
+        for group in groups:
+            sebs_client.logging.info(f"{group}")
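
A listing loop of this kind is consistent in one of two forms: the index is produced by enumerate and used in the message, or it is dropped from both the loop header and the message. The sketch below shows the indexed form as a small pure function; `format_listing` is a hypothetical helper, not part of sebs.py.

```python
from typing import Iterable, List


def format_listing(items: Iterable[str]) -> List[str]:
    """Render items as '(index) name' lines, keeping the index in use."""
    return [f"({idx}) {item}" for idx, item in enumerate(items)]
```

Each returned line could then be handed to the logger, which keeps formatting separate from output.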
sebs/aws/aws.py (1)

Line range hint 150-188: Use f-strings for improved readability.

Convert format calls to f-strings for better readability.

-        execute("zip -qu -r9 {}.zip * .".format(benchmark), shell=True, cwd=directory)
-        benchmark_archive = "{}.zip".format(os.path.join(directory, benchmark))
-        self.logging.info("Created {} archive".format(benchmark_archive))
+        execute(f"zip -qu -r9 {benchmark}.zip * .", shell=True, cwd=directory)
+        benchmark_archive = f"{os.path.join(directory, benchmark)}.zip"
+        self.logging.info(f"Created {benchmark_archive} archive")
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 0b8d777 and df21c86.

Files selected for processing (22)
  • benchmarks/wrappers/aws/python/setup.py (1 hunks)
  • config/example.json (2 hunks)
  • config/systems.json (2 hunks)
  • dockerfiles/aws/nodejs/Dockerfile.function (1 hunks)
  • dockerfiles/aws/python/Dockerfile.function (1 hunks)
  • docs/modularity.md (1 hunks)
  • docs/usage.md (1 hunks)
  • sebs.py (14 hunks)
  • sebs/aws/aws.py (10 hunks)
  • sebs/aws/config.py (5 hunks)
  • sebs/azure/azure.py (6 hunks)
  • sebs/benchmark.py (6 hunks)
  • sebs/cache.py (4 hunks)
  • sebs/config.py (2 hunks)
  • sebs/experiments/config.py (4 hunks)
  • sebs/experiments/invocation_overhead.py (1 hunks)
  • sebs/faas/system.py (4 hunks)
  • sebs/gcp/gcp.py (4 hunks)
  • sebs/local/local.py (3 hunks)
  • sebs/openwhisk/openwhisk.py (9 hunks)
  • sebs/regression.py (1 hunks)
  • sebs/sebs.py (2 hunks)
Files not reviewed due to errors (1)
  • sebs/gcp/gcp.py (no review received)
Files skipped from review due to trivial changes (3)
  • benchmarks/wrappers/aws/python/setup.py
  • dockerfiles/aws/nodejs/Dockerfile.function
  • dockerfiles/aws/python/Dockerfile.function
Additional context used
Ruff
sebs/experiments/config.py

64-64: Use config.get("flags", {}) instead of an if block

Replace with config.get("flags", {})

(SIM401)

sebs/sebs.py

110-110: Use f-string instead of format call

Convert to f-string

(UP032)


115-115: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/local/local.py

154-154: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/experiments/invocation_overhead.py

41-41: Avoid extraneous parentheses

Remove extraneous parentheses

(UP034)

sebs/faas/system.py

263-263: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/aws/config.py

3-3: uuid imported but unused

Remove unused import: uuid

(F401)


243-243: Use f-string instead of format call

Convert to f-string

(UP032)


266-266: Use dct.get("lambda-role", "") instead of an if block

Replace with dct.get("lambda-role", "")

(SIM401)

sebs/openwhisk/openwhisk.py

238-238: Use f-string instead of format call

Convert to f-string

(UP032)

sebs.py

139-139: Use f-string instead of format call

Convert to f-string

(UP032)


370-370: Local variable config is assigned to but never used

Remove assignment to unused variable config

(F841)


618-618: Loop control variable idx not used within loop body

Rename unused idx to _idx

(B007)

sebs/azure/azure.py

217-217: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/benchmark.py

180-180: Use f-string instead of format call

Convert to f-string

(UP032)


490-490: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/gcp/gcp.py

199-199: Use f-string instead of format call

Convert to f-string

(UP032)

sebs/aws/aws.py

80-80: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


176-176: Use f-string instead of format call

Convert to f-string

(UP032)


182-182: Use f-string instead of format call

Convert to f-string

(UP032)


185-185: Use f-string instead of format call

Convert to f-string

(UP032)


378-378: Use f-string instead of format call

Convert to f-string

(UP032)


393-393: Use context handler for opening files

(SIM115)


401-401: Use f-string instead of format call

Convert to f-string

(UP032)


408-410: Use f-string instead of format call

Convert to f-string

(UP032)


516-520: Use f-string instead of format call

Convert to f-string

(UP032)
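
Of the Ruff findings above, B006 is the one that can change behavior rather than style. The sketch below uses made-up function names to show why a mutable default is shared across calls and how the None-default pattern avoids it.

```python
from typing import List, Optional


def append_shared(item: str, bucket: List[str] = []) -> List[str]:  # noqa: B006
    # The default list is created once, at function definition time,
    # so every call that omits `bucket` mutates the same object.
    bucket.append(item)
    return bucket


def append_fresh(item: str, bucket: Optional[List[str]] = None) -> List[str]:
    # A None default lets each call create its own list.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_shared` twice without a second argument returns the same list object both times, with both items in it; `append_fresh` returns a new one-element list on each call.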

Additional comments not posted (48)
config/example.json (2)

7-7: Reminder: Ensure the new parameter container_deployment is documented.

The parameter container_deployment is a new addition. Ensure it is documented as an optional feature.


50-54: Clarification on docker_registry key.

The docker_registry key includes registry name, username, password, and container repository. Ensure these details are correctly configured.

sebs/experiments/config.py (4)

10-10: New attribute _container_deployment added.

The attribute _container_deployment is a new addition to the Config class.


35-38: New property container_deployment added.

The property container_deployment is a new addition to the Config class.


50-50: Include container_deployment in serialization.

The container_deployment attribute is now included in the serialized output.


62-62: Include container_deployment in deserialization.

The container_deployment attribute is now included in the deserialized input.
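
Taken together, the four notes above describe a private attribute, a read-only property, and a round trip through serialization. A stand-in class illustrating that pattern follows; it is not the real Config class, which has more fields, and the default of False for a missing key is an assumption. The `config.get(...)` calls also show the form Ruff's SIM401 asks for.

```python
from typing import Any, Dict


class ExperimentConfig:
    """Stand-in for the experiment configuration (illustrative only)."""

    def __init__(self) -> None:
        self._container_deployment: bool = False
        self._flags: Dict[str, bool] = {}

    @property
    def container_deployment(self) -> bool:
        return self._container_deployment

    def serialize(self) -> Dict[str, Any]:
        return {
            "container_deployment": self._container_deployment,
            "flags": self._flags,
        }

    @staticmethod
    def deserialize(config: Dict[str, Any]) -> "ExperimentConfig":
        cfg = ExperimentConfig()
        # dict.get with a default replaces an explicit `if key in config` block.
        cfg._container_deployment = config.get("container_deployment", False)
        cfg._flags = config.get("flags", {})
        return cfg
```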

sebs/config.py (2)

42-44: New method supported_container_deployment added.

The method supported_container_deployment is a new addition to the SeBSConfig class.


60-62: Update benchmark_image_name method.

The method benchmark_image_name now includes an optional registry parameter.

config/systems.json (2)

3-4: New parameter container_deployment added.

The parameter container_deployment is a new addition under the general section.


74-75: Include storage.py and setup.py in deployment configuration.

The files storage.py and setup.py are now included in the aws deployment configuration for Python.

docs/usage.md (1)

7-7: Clarify the scope of the --container-deployment flag.

The sentence "To enforce redeployment of code, benchmark inputs, container deployment (supported in AWS) please use flags --update-code, --update-storage and --container-deployment respectively." could be clearer. Consider rephrasing for better readability.

- To enforce redeployment of code, benchmark inputs, container deployment (supported in AWS) please use flags `--update-code`, `--update-storage` and `--container-deployment` respectively.
+ To enforce redeployment of code, benchmark inputs, and container deployment (supported in AWS), please use the flags `--update-code`, `--update-storage`, and `--container-deployment` respectively.
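
For readers unfamiliar with the three flags, the sketch below wires them up as boolean switches. It uses argparse only to stay self-contained; the project's CLI may be built with a different library, and the program name and help texts here are invented. Only the flag names come from the documentation sentence.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser with the three flags named in docs/usage.md (illustrative)."""
    parser = argparse.ArgumentParser(prog="sebs-sketch")
    parser.add_argument("--update-code", action="store_true",
                        help="redeploy the benchmark code")
    parser.add_argument("--update-storage", action="store_true",
                        help="re-upload the benchmark inputs")
    parser.add_argument("--container-deployment", action="store_true",
                        help="deploy the function as a container image")
    return parser
```

All three flags default to false, so a plain invocation leaves existing code, inputs, and deployment type untouched.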
sebs/sebs.py (1)

88-89: Use f-string instead of format call.

Convert to f-string for better readability and performance.

- raise RuntimeError("Deployment {name} not supported!".format(name=name))
+ raise RuntimeError(f"Deployment {name} not supported!")

Likely invalid or redundant comment.

sebs/local/local.py (1)

135-136: Document the new parameter container_deployment.

The container_deployment parameter is added but not documented. Add a docstring to explain its purpose.

    """
        It would be sufficient to just pack the code and ship it as zip to AWS.
        However, to have a compatible function implementation across providers,
        we create a small module.
        Issue: relative imports in Python when using storage wrapper.
        Azure expects a relative import inside a module.

        Structure:
        function
        - function.py
        - storage.py
        - resources
        handler.py

        dir: directory where code is located
        benchmark: benchmark name
        container_deployment: flag indicating if container deployment is used
    """
sebs/faas/system.py (7)

170-171: LGTM!

The addition of the container_deployment parameter in the abstract method signature is consistent with the changes required for container deployment support.


175-183: LGTM!

The abstract method build_base_image with the new signature is consistent with the changes required for container deployment support.


186-187: LGTM!

The abstract method find_image with the new signature is consistent with the changes required for container deployment support.


190-191: LGTM!

The abstract method repository_authorization with the new signature is consistent with the changes required for container deployment support.


194-195: LGTM!

The abstract method push_image_to_repository with the new signature is consistent with the changes required for container deployment support.


198-204: LGTM!

The addition of the container_deployment and container_uri parameters in the abstract method create_function is consistent with the changes required for container deployment support.


212-218: LGTM!

The addition of the container_deployment and container_uri parameters in the abstract method update_function is consistent with the changes required for container deployment support.

sebs/cache.py (2)

165-171: LGTM!

The addition of the container_uri parameter in the add_code_package function is consistent with the changes required for container deployment support.


241-245: LGTM!

The addition of the container_uri parameter in the update_code_package function is consistent with the changes required for container deployment support.
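
How a container_uri might sit next to the existing code-package data in the cache can be sketched as follows. The nesting by deployment, language, and benchmark and the `location` and `hash` keys are assumptions for illustration; only the container_uri name comes from this PR, and the helper is hypothetical.

```python
from typing import Any, Dict, Optional


def add_code_package_entry(
    cache: Dict[str, Any],
    deployment: str,
    language: str,
    benchmark: str,
    code_location: str,
    code_hash: str,
    container_uri: Optional[str] = None,
) -> Dict[str, Any]:
    """Store or overwrite a cached package entry (assumed layout)."""
    entry = {
        "location": code_location,
        "hash": code_hash,
        # None for zip-based deployments, the image URI for containers.
        "container_uri": container_uri,
    }
    cache.setdefault(deployment, {}).setdefault(language, {})[benchmark] = entry
    return cache
```

Storing the URI alongside the hash lets a later run decide whether the cached image still matches the code without querying the registry.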

sebs/aws/config.py (4)

229-238: LGTM!

The function check_ecr_repository_exists correctly checks if an ECR repository exists and handles exceptions appropriately.


239-255: LGTM!

The function create_ecr_repository correctly creates an ECR repository if it does not exist and handles exceptions appropriately.

Tools
Ruff

243-243: Use f-string instead of format call

Convert to f-string

(UP032)
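
The check-then-create pattern described in the two comments above can be sketched with an injected client, which keeps it testable without AWS access. `ensure_ecr_repository` is a hypothetical helper; `describe_repositories`, `create_repository`, and `exceptions.RepositoryNotFoundException` are the names exposed by the boto3 ECR client.

```python
from typing import Any


def ensure_ecr_repository(ecr_client: Any, repository_name: str) -> bool:
    """Create the repository if it is missing; return True when it was created."""
    try:
        ecr_client.describe_repositories(repositoryNames=[repository_name])
        return False  # already present, nothing to do
    except ecr_client.exceptions.RepositoryNotFoundException:
        ecr_client.create_repository(repositoryName=repository_name)
        return True
```

In real use the client would come from `boto3.client("ecr")`; other errors, such as missing permissions, propagate unchanged.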


256-265: LGTM!

The initialize function correctly initializes the Docker registry details.


287-298: LGTM!

The update_cache function correctly updates the cache with Docker registry details.

sebs/regression.py (1)

346-364: LGTM!

The regression_suite function correctly includes assertions for deployment keys.

docs/modularity.md (1)

343-343: Update Documentation for New Parameters in update_function.

The update_function method now accepts container_deployment and container_uri parameters, which should be documented to explain their purpose and usage.

sebs/openwhisk/openwhisk.py (5)

114-122: LGTM! Proper Error Handling in push_image_to_repository.

The method correctly handles errors by logging and raising a RuntimeError if an error is found in the response.
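
This check matters because a streamed push in the Docker SDK for Python reports failures as entries in the response stream, not as exceptions. The sketch below assumes the stream yields decoded dictionaries, as with `stream=True, decode=True`; the helper name is hypothetical.

```python
from typing import Any, Dict, Iterable


def check_push_stream(stream: Iterable[Dict[str, Any]]) -> None:
    """Raise RuntimeError on the first error entry in a push progress stream."""
    for event in stream:
        if "error" in event:
            raise RuntimeError(f"Failed to push image: {event['error']}")
```

Consuming the whole stream is also what makes the call wait until the push has finished.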


Line range hint 215-239: LGTM! Convert Format Call to f-String.

The method correctly handles packaging the code for deployment. Convert the format call to an f-string for consistency.

-        self.logging.info("Zip archive size {:2f} MB".format(bytes_size / 1024.0 / 1024.0))
+        self.logging.info(f"Zip archive size {bytes_size / 1024.0 / 1024.0:.2f} MB")
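
This suggestion changes the output, not just the syntax: the original spec `{:2f}` sets a minimum width of 2 with the default six decimal places, while `:.2f` sets a precision of two. A small sketch with a hypothetical helper makes the difference visible.

```python
def format_size_mb(bytes_size: int) -> str:
    """Format a byte count as megabytes with two decimal places."""
    return f"Zip archive size {bytes_size / 1024.0 / 1024.0:.2f} MB"


# Width-2 spec from the original line: precision stays at the default of six.
legacy = "Zip archive size {:2f} MB".format(1572864 / 1024.0 / 1024.0)
# legacy == "Zip archive size 1.500000 MB"
# format_size_mb(1572864) == "Zip archive size 1.50 MB"
```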

Line range hint 337-365: LGTM! Proper Function Update Handling.

The method correctly updates the function and handles errors.


Line range hint 518-518: LGTM! Proper Cold Start Enforcement.

The method correctly updates the function configuration to enforce a cold start.


Line range hint 97-112: Ensure Proper Handling of Docker Manifest Inspection.

The find_image method attempts to inspect a Docker manifest. Ensure that the experimentalManifest configuration is correctly set and that the execute function handles errors gracefully.

Verification successful

Ensure Proper Handling of Docker Manifest Inspection.

The execute function in sebs/utils.py raises a RuntimeError if the command fails, which ensures that errors are handled gracefully. The experimentalManifest configuration is correctly set and used in the codebase.

  • sebs/utils.py: The execute function raises a RuntimeError on failure, ensuring proper error handling.
  • sebs/openwhisk/openwhisk.py: Usage of experimentalManifest configuration is appropriate.
Scripts executed

The following scripts were executed for the analysis:


Script:

#!/bin/bash
# Description: Verify the experimentalManifest configuration and error handling in the execute function.

# Test: Search for the experimentalManifest configuration usage. Expect: Proper error handling and configuration.
rg 'experimentalManifest'

Length of output: 596



Script:

#!/bin/bash
# Description: Verify the error handling in the execute function.

# Search for the definition of the execute function
rg 'def execute'

Length of output: 292



Script:

#!/bin/bash
# Description: Inspect the implementation of the execute function in sebs/utils.py for error handling.

# Display the implementation of the execute function in sebs/utils.py
rg --context 10 'def execute' sebs/utils.py

Length of output: 719

sebs.py (4)

100-104: LGTM! Proper Addition of Container Deployment Option.

The new option for container deployment is correctly added and documented.


Line range hint 125-151: LGTM! Convert Format Call to f-String.

The function correctly updates the configuration and handles the new option. Convert the format call to an f-string for consistency.

-    sebs_client.logging.info("Created experiment output at {}".format(output_dir))
+    sebs_client.logging.info(f"Created experiment output at {output_dir}")

Line range hint 231-243: LGTM! Proper Invocation Logic Update.

The function correctly updates the invocation logic and handles the new container deployment option.


347-347: LGTM! Proper Regression Logic Update.

The function correctly updates the regression logic and handles the new container deployment option.

sebs/azure/azure.py (2)

Line range hint 163-218: LGTM! Convert Format Call to f-String.

The method correctly handles packaging the code for deployment and raises an error for unsupported container deployment. Convert the format call to an f-string for consistency.

-        execute("zip -qu -r9 {}.zip * .".format(benchmark), shell=True, cwd=directory)
+        execute(f"zip -qu -r9 {benchmark}.zip * .", shell=True, cwd=directory)

292-298: LGTM! Proper Function Update Handling.

The method correctly updates the function and handles errors.

sebs/benchmark.py (5)

127-131: LGTM!

The container_uri property is correctly implemented.


144-147: LGTM!

The container_deployment property is correctly implemented.


177-180: LGTM! Consider using an f-string.

The _container_deployment attribute is initialized correctly. You can optimize the string formatting.

- raise RuntimeError("Benchmark {benchmark} not found!".format(benchmark=self._benchmark))
+ raise RuntimeError(f"Benchmark {self._benchmark} not found!")
Tools
Ruff

180-180: Use f-string instead of format call

Convert to f-string

(UP032)


483-495: LGTM!

The build method handles the container deployment flag and URI correctly.

Also applies to: 516-547

Tools
Ruff

490-490: Use f-string instead of format call

Convert to f-string

(UP032)


538-544: LGTM!

The cache update logic correctly includes the container URI.

sebs/aws/aws.py (4)

202-208: LGTM!

The code changes are approved.


494-496: Condense the update_function_configuration call.

The suggestion below only joins the keyword arguments onto a single line; no format calls are involved.

-        self.client.update_function_configuration(
-            FunctionName=name,
-            Timeout=function.config.timeout,
-            MemorySize=function.config.memory,
-        )
+        self.client.update_function_configuration(FunctionName=name, Timeout=function.config.timeout, MemorySize=function.config.memory)

Likely invalid or redundant comment.


185-185: Use context manager for file operations.

Use a context manager to open files to ensure they are properly closed after use.

-                    package_body = open(package, "rb").read()
+                    with open(package, "rb") as file:
+                        package_body = file.read()

Likely invalid or redundant comment.

Tools
Ruff

185-185: Use f-string instead of format call

Convert to f-string

(UP032)


67-79: Avoid using mutable default arguments.

Replace mutable default arguments with None and initialize them within the function.

-        if self.config.resources.docker_username:
-            if self.config.resources.docker_registry:
-                docker_client.login(
-                    username=self.config.resources.docker_username,
-                    password=self.config.resources.docker_password,
-                    registry=self.config.resources.docker_registry,
-                )
-            else:
-                docker_client.login(
-                    username=self.config.resources.docker_username,
-                    password=self.config.resources.docker_password,
-                )
+        docker_username = self.config.resources.docker_username
+        docker_password = self.config.resources.docker_password
+        docker_registry = self.config.resources.docker_registry
+        if docker_username and docker_password:
+            if docker_registry:
+                docker_client.login(
+                    username=docker_username,
+                    password=docker_password,
+                    registry=docker_registry,
+                )
+            else:
+                docker_client.login(
+                    username=docker_username,
+                    password=docker_password,
+                )

Likely invalid or redundant comment.

cfg._download_results = config["download_results"]
cfg._container_deployment = config["container_deployment"]
cfg._runtime = Runtime.deserialize(config["runtime"])
cfg._flags = config["flags"] if "flags" in config else {}

Optimize dictionary access.

Use config.get("flags", {}) instead of an if block.

-  cfg._flags = config["flags"] if "flags" in config else {}
+  cfg._flags = config.get("flags", {})
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
cfg._flags = config["flags"] if "flags" in config else {}
cfg._flags = config.get("flags", {})
Tools
Ruff

64-64: Use config.get("flags", {}) instead of an if block

Replace with config.get("flags", {})

(SIM401)


bytes_size = os.path.getsize(directory)
mbytes = bytes_size / 1024.0 / 1024.0
self.logging.info("Function size {:2f} MB".format(mbytes))

Use f-string instead of format call.

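Convert to an f-string for better readability. Note that the suggestion also changes the format spec from :2f (minimum width 2, default six decimal places) to :.2f (two decimal places), which is presumably what was intended.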

- self.logging.info("Function size {:2f} MB".format(mbytes))
+ self.logging.info(f"Function size {mbytes:.2f} MB")
Tools
Ruff

154-154: Use f-string instead of format call

Convert to f-string

(UP032)

Comment on lines 517 to 549
code_package.benchmark,
code_package.language_name,
code_package.language_version,

Use f-strings for improved readability.

Convert format calls to f-strings for better readability.

-        func_name = "{}-{}-{}".format(
-            code_package.benchmark,
-            code_package.language_name,
-            code_package.language_version,
-        )
+        func_name = f"{code_package.benchmark}-{code_package.language_name}-{code_package.language_version}"

        self._benchmark = benchmark

    def before_sample(self, size: int, input_benchmark: dict):
        arr = bytearray((random.getrandbits(8) for i in range(size)))

Avoid extraneous parentheses.

Remove the extraneous parentheses for better readability.

- arr = bytearray((random.getrandbits(8) for i in range(size)))
+ arr = bytearray(random.getrandbits(8) for i in range(size))
Tools
Ruff

41-41: Avoid extraneous parentheses

Remove extraneous parentheses

(UP034)

@mcopik
Collaborator

mcopik commented Nov 6, 2024

Remaining issues:

  • Rebase with master.
  • Add abstractions in the faas/system class.
  • Add documentation on deployment, configuration, and container support.
  • Extend CLI to list & delete all images.
  • Resolve coderabbit issues.
  • Verify caching works correctly (no cache/replace existing image/create image).
  • Extend regression and rerun x86.
  • Extend regression and rerun arm64.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

🧹 Outside diff range and nitpick comments (8)
sebs/faas/system.py (1)

168-172: LGTM! Consider adding docstring updates.

The new parameters architecture and container_deployment align well with the container support objectives. Consider updating the method's docstring to document:

  • The new parameters and their purpose
  • The updated return type that includes container deployment info
sebs/benchmark.py (1)

512-516: Improve type hints and add docstring for the build method

The method would benefit from:

  1. Type aliases for better readability
  2. A docstring explaining the return tuple values

Consider applying this improvement:

+# At the top of the file with other imports
+from typing import TypeAlias

+# Type aliases for better readability
+BuildStepCallable: TypeAlias = Callable[[str, str, str, str, str, bool, bool], Tuple[str, int, str]]
+BuildResult: TypeAlias = Tuple[bool, str, bool, str]

 def build(
     self,
-    deployment_build_step: Callable[[str, str, str, str, str, bool, bool], Tuple[str, int, str]],
-) -> Tuple[bool, str, bool, str]:
+    deployment_build_step: BuildStepCallable,
+) -> BuildResult:
+    """Build the benchmark package.
+
+    Args:
+        deployment_build_step: Callback for building the deployment package
+
+    Returns:
+        Tuple containing:
+        - bool: Whether a new build was performed
+        - str: Location of the code package
+        - bool: Whether this is a container deployment
+        - str: Container URI if applicable, empty string otherwise
+    """

Also applies to: 523-526

sebs/aws/aws.py (2)

139-142: Initialize container_uri with None instead of empty string.

For better type safety and explicit handling of the unset case, consider using None as the default value.

-        container_uri = ""
+        container_uri: Optional[str] = None

333-341: Consider early architecture validation.

The architecture mismatch is detected during the build process. Consider validating this earlier to fail fast and provide better user experience.

+    def _validate_architecture(self, target_arch: str) -> None:
+        """Validate if the target architecture is supported on the current system."""
+        isa = platform.processor()
+        if (isa == "x86_64" and target_arch != "x64") or (
+            isa == "arm64" and target_arch != "arm64"
+        ):
+            raise ValueError(
+                f"Unsupported target architecture: {target_arch} on CPU architecture: {isa}. "
+                "Please ensure QEMU is configured for cross-architecture builds."
+            )

     def build_base_image(
         self,
         directory: str,
         language_name: str,
         language_version: str,
         architecture: str,
         benchmark: str,
         is_cached: bool,
     ) -> Tuple[bool, str]:
+        self._validate_architecture(architecture)
sebs/cache.py (4)

Line range hint 127-131: Ensure consistency in get_code_package method parameters

The get_code_package method signature has been updated to include language_version and architecture, but the docstring and parameter descriptions have not been updated accordingly. Please update the docstring to reflect the new parameters for clarity and maintainability.

Apply this diff to update the docstring:

     """
         Access cached version of benchmark code.

         :param deployment: allowed deployment clouds or local
         :param benchmark:
-        :param  language:
+        :param language: programming language of the benchmark
+        :param language_version: version of the programming language
+        :param architecture: target architecture (e.g., x64 or arm64)

264-272: Ensure consistent updates to cached configuration

In the block updating cached_config within add_code_package, consider combining the duplicate code for updating containers and code_package to reduce redundancy and improve maintainability.

Refactor the code to unify the update logic:

# Determine the main key based on the deployment type
main_key = "containers" if code_package.container_deployment else "code_package"
# Update the cached configuration
cached_config[deployment_name][language][main_key][key] = language_config

182-186: Update method docstring for add_code_package

The add_code_package method has been updated, but its docstring is missing or outdated. Adding a docstring with parameter descriptions will enhance code readability and maintenance.

Add the following docstring:

def add_code_package(
    self,
    deployment_name: str,
    code_package: "Benchmark",
):
    """
    Add a new code package or container image to the cache.

    :param deployment_name: Name of the deployment platform (e.g., 'aws')
    :param code_package: Instance of Benchmark containing code package information
    """
    # ... rest of the method ...

293-295: Update method docstring for update_code_package

Similarly, the update_code_package method lacks a docstring. Adding a docstring will clarify the purpose and usage of the method.

Add the following docstring:

def update_code_package(
    self,
    deployment_name: str,
    code_package: "Benchmark",
):
    """
    Update an existing code package or container image in the cache.

    :param deployment_name: Name of the deployment platform (e.g., 'aws')
    :param code_package: Instance of Benchmark containing updated code package information
    """
    # ... rest of the method ...
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 5b3717a and e4a5b26.

📒 Files selected for processing (8)
  • config/systems.json (2 hunks)
  • dockerfiles/aws/python/Dockerfile.function (1 hunks)
  • sebs/aws/aws.py (8 hunks)
  • sebs/benchmark.py (9 hunks)
  • sebs/cache.py (9 hunks)
  • sebs/config.py (1 hunks)
  • sebs/faas/system.py (5 hunks)
  • sebs/sebs.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • dockerfiles/aws/python/Dockerfile.function
  • sebs/sebs.py
🧰 Additional context used
🪛 Ruff
sebs/aws/aws.py

263-263: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


423-423: Use a context manager for opening files

(SIM115)

🔇 Additional comments (13)
sebs/config.py (3)

50-55: LGTM! Verify base image configurations.

The method correctly incorporates architecture-specific base images. The implementation maintains consistency with other architecture-related changes.

Let's verify the base image configurations:

#!/bin/bash
# Description: Check base image configurations
# Test: Extract and validate base image structure for different architectures
jq -r '.[] | select(.languages != null) | .languages[] | select(.base_images != null) | .base_images | keys[]' config/systems.json

47-48: Add docstring and verify config structure.

While the implementation is straightforward, please add a docstring to document the purpose and expected return values. Also, let's verify the config structure matches the return type hint.

#!/bin/bash
# Description: Verify container_deployment config structure
# Test: Check if the value is a list of strings
jq -r '.general.container_deployment | if type=="array" then "Valid array" else "Invalid type: " + type end' config/systems.json

63-69: LGTM! Image naming follows Docker conventions.

The implementation correctly:

  • Incorporates architecture in image tags
  • Follows Docker naming conventions
  • Fixed the previous issue with spaces in tags

Let's verify the image tag format:

Also applies to: 77-84

✅ Verification successful

Image tag format follows Docker conventions and is correctly implemented

The implementation in benchmark_image_tag correctly:

  • Uses only allowed characters (alphanumeric, dots, hyphens)
  • Follows a consistent format: function.<system>.<benchmark>.<language>-<version>-<architecture>[-prefix]
  • Properly incorporates the architecture parameter
  • Has no spaces or invalid characters in the tag format

The code meets all Docker naming conventions and the previous review's concerns have been addressed.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify image tag format
# Test: Check for any invalid characters or formats in image tags
rg -g '*.py' 'benchmark_image_tag.*return.*tag' -A 1 | grep -v 'benchmark_image_tag' | grep -v '\-\-' | grep -v '^$' | while read -r line; do
    if echo "$line" | grep -q '[^a-zA-Z0-9_.-]'; then
        echo "Invalid characters found in tag format: $line"
    fi
done

Length of output: 131


Script:

#!/bin/bash
# Let's first see the actual implementation of benchmark_image_tag
rg -g '*.py' -A 10 'def benchmark_image_tag'

Length of output: 581

config/systems.json (4)

84-85: Verify setup.py integration with container builds.

The addition of setup.py to deployment files suggests Python package installation during container builds.

#!/bin/bash
# Check if setup.py exists and contains necessary configuration
fd -t f "setup.py" -X grep -l "install_requires\|packages"

# Verify Dockerfile references setup.py
fd -t f "Dockerfile" -X grep -l "setup.py"

Line range hint 1-276: Verify regression test coverage for all configurations.

The PR comments mention regression testing was performed, but please ensure:

  1. All architecture combinations (x64/arm64) are tested
  2. Container deployment works with both Python and Node.js
  3. Cache mechanism works as expected with container_uri

#!/bin/bash
# Check for test files covering container deployment
fd -t f "test_.*\.py" -X grep -l "container_deployment\|container_uri"

# Look for cache configuration handling
rg "container_uri.*cache" 

92-97: ⚠️ Potential issue

Review Node.js version support and version tagging.

  1. Limited Version Support:

    • Only Node.js 16 is supported
    • Consider adding support for Node.js 18 and 20 as they are widely used
  2. Version Tag Inconsistency:

    • x64: Generic tag (amazon/aws-lambda-nodejs:16)
    • arm64: Specific version tag (amazon/aws-lambda-nodejs:16.2024.09.06.13)

#!/bin/bash
# Check for Node.js version compatibility in codebase
rg -l "engine.*node" package.json

# Verify Node.js function compatibility
fd -t f "handler.js" -X grep -l "async.*exports.handler"

3-4: Consider making docker_repository configurable and document container deployment.

The hardcoded docker repository might limit flexibility. Consider:

  1. Making it configurable through environment variables
  2. Adding documentation about:
    • Required repository permissions
    • How to set up custom repositories
    • AWS-specific container deployment limitations
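
One way to read option 1 is an environment override that falls back to the value in config/systems.json. The sketch below is only an illustration: the variable name SEBS_DOCKER_REPOSITORY, the helper docker_repository, and the assumed general.docker_repository layout are hypothetical and not part of this PR.

```python
import os
from typing import Any, Dict, Mapping, Optional

# Hypothetical variable name; nothing in the PR defines it.
ENV_VAR = "SEBS_DOCKER_REPOSITORY"


def docker_repository(
    systems_config: Dict[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the repository from the environment, else the configured default."""
    env = os.environ if env is None else env
    override = env.get(ENV_VAR, "").strip()
    if override:
        return override
    # Assumed layout: {"general": {"docker_repository": "..."}}
    return systems_config["general"]["docker_repository"]
```

An empty or whitespace-only override is treated as unset, so an accidentally exported blank variable does not produce an invalid image name.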
sebs/benchmark.py (3)

182-182: LGTM!

The initialization of container-related fields is properly typed and follows the class's initialization pattern.

Also applies to: 205-205


246-265: LGTM!

The cache query logic properly handles both container and non-container deployments, with safe dictionary access for the container URI.


182-182: Consider early validation of container deployment support

From the PR objectives, we know that container deployment is only supported for AWS. Consider adding an early validation in the constructor to fail fast if container deployment is requested for unsupported platforms.

Let's verify the current platform support:

Also applies to: 246-265
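
A fail-fast check along these lines could look like the following sketch. It assumes the platforms that support containers are available as a plain list of names, for example from the container_deployment entry in config/systems.json; the function name check_container_support is hypothetical and does not appear in the PR.

```python
from typing import Iterable


def check_container_support(
    deployment_name: str,
    container_deployment: bool,
    supported: Iterable[str],
) -> None:
    """Raise early if containers are requested on an unsupported platform."""
    supported_set = set(supported)
    if container_deployment and deployment_name not in supported_set:
        raise RuntimeError(
            f"Container deployment is not supported on {deployment_name}; "
            f"supported platforms: {sorted(supported_set)}"
        )
```

Calling this from the constructor would surface the problem before any code is packaged or any image is built.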

sebs/aws/aws.py (2)

185-196: 🛠️ Refactor suggestion

Improve error handling in find_image method.

The method silently catches ClientError. Consider logging the error for better debugging.

-        except ClientError:
+        except ClientError as e:
+            self.logging.debug(f"Image not found: {e}")
             return False

Likely invalid or redundant comment.


423-424: ⚠️ Potential issue

Use context manager for file operations.

Use a context manager to ensure proper file handling and resource cleanup.

-                    package_body = open(package, "rb").read()
-                    create_function_params["Code"] = {"ZipFile": package_body}
+                    with open(package, "rb") as f:
+                        create_function_params["Code"] = {"ZipFile": f.read()}

Likely invalid or redundant comment.

🧰 Tools
🪛 Ruff

423-423: Use a context manager for opening files

(SIM115)

sebs/cache.py (1)

303-309: Handle missing cache directory in update_code_package

In the update_code_package method, if the cached_dir does not exist, the method calls self.add_code_package. Ensure that this behavior is intentional and does not lead to any unintended side effects. Consider logging a message to indicate that the code package was not found and is being added.

Please confirm that this fallback to add_code_package is the desired behavior. If so, consider adding a log message.
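
If the fallback is intentional, the logging could be as small as the sketch below. The helper update_or_add and the logger name are hypothetical; the update and add callables stand in for the real update and add_code_package logic, which is not reproduced here.

```python
import logging
import os
from typing import Callable

logger = logging.getLogger("cache")  # hypothetical logger name


def update_or_add(
    cached_dir: str,
    update: Callable[[], None],
    add: Callable[[], None],
) -> str:
    """Run update() if the cache directory exists, else log and run add()."""
    if os.path.exists(cached_dir):
        update()
        return "updated"
    logger.info("No cached code package at %s; adding a new entry instead.", cached_dir)
    add()
    return "added"
```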

Comment on lines 176 to 196
    def build_base_image(
        self,
        directory: str,
        language_name: str,
        language_version: str,
        architecture: str,
        benchmark: str,
        is_cached: bool,
    ) -> Tuple[bool, str]:
        pass

    @abstractmethod
    def find_image(self, repository_client, repository_name, image_tag) -> bool:
        pass

    @abstractmethod
    def push_image_to_repository(self, repository_client, repository_uri, image_tag):
        pass

🛠️ Refactor suggestion

Add docstrings and error handling for container-related methods.

The new abstract methods for container support are well-structured, but they need:

  1. Docstrings describing:
    • Purpose and responsibilities
    • Parameters and return values
    • Expected behavior and error conditions
  2. Error handling guidance for implementations:
    • Network issues during image push
    • Repository authentication failures
    • Image build failures
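
As a rough illustration of what such docstrings might contain, here is a sketch on a hypothetical stand-in class (ContainerSupport is not a name from the PR). The meaning of the returned tuple and the choice of RuntimeError for build failures are assumptions; only the RuntimeError on a failed push matches the AWS implementation quoted in this review.

```python
from abc import ABC, abstractmethod
from typing import Any, Tuple


class ContainerSupport(ABC):
    """Hypothetical stand-in for the abstract class in sebs/faas/system.py."""

    @abstractmethod
    def build_base_image(
        self,
        directory: str,
        language_name: str,
        language_version: str,
        architecture: str,
        benchmark: str,
        is_cached: bool,
    ) -> Tuple[bool, str]:
        """Build (or reuse) the container image for a benchmark.

        :param directory: directory with the packaged benchmark code
        :param architecture: target architecture, e.g. x64 or arm64
        :param is_cached: whether a cached image entry already exists
        :return: assumed to be (image was rebuilt, image URI); the PR does
            not document the tuple, so treat this as a guess
        :raises RuntimeError: if the image build fails (assumed convention)
        """

    @abstractmethod
    def find_image(self, repository_client: Any, repository_name: str, image_tag: str) -> bool:
        """Return True if image_tag exists in repository_name, False otherwise."""

    @abstractmethod
    def push_image_to_repository(
        self, repository_client: Any, repository_uri: str, image_tag: str
    ) -> None:
        """Push image_tag to repository_uri.

        :raises RuntimeError: on authentication, network, or registry errors
        """
```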

Comment on lines +127 to +131
    @property
    def container_uri(self) -> str:
        assert self._container_uri is not None
        return self._container_uri

⚠️ Potential issue

Replace assertion with proper None check in container_uri property

The assertion could be disabled in production Python environments (using the -O flag). Consider using a proper None check with a descriptive error message.

Apply this diff:

@property
def container_uri(self) -> str:
-    assert self._container_uri is not None
+    if self._container_uri is None:
+        raise ValueError("Container URI is not set")
    return self._container_uri

sebs/aws/aws.py Outdated
Comment on lines 256 to 263
        try:
            self.docker_client.login(username=username, password=password, registry=registry_url)
            push_image(repository_uri, image_tag)
            self.logging.info(f"Successfully pushed the image to registry {repository_uri}.")
        except docker.errors.APIError as e:
            self.logging.error(f"Failed to push the image to registry {repository_uri}.")
            self.logging.error(f"Error: {str(e)}")
            raise RuntimeError("Couldn't push to Docker registry")

🛠️ Refactor suggestion

Improve exception chaining in push_image_to_repository.

Use proper exception chaining to preserve the original error context.

-            raise RuntimeError("Couldn't push to Docker registry")
+            raise RuntimeError("Couldn't push to Docker registry") from e
🧰 Tools
🪛 Ruff

263-263: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

Comment on lines +46 to +48
    def __init__(self, cache_dir: str, docker_client: docker.DockerClient):
        super().__init__()
        self.docker_client = docker_client

🛠️ Refactor suggestion

Consider making 'docker_client' an optional parameter

Adding the required docker_client parameter to the __init__ method may break existing code that instantiates the Cache class without providing this parameter. To maintain backward compatibility, consider making docker_client an optional parameter with a default value of None. This way, existing code will continue to function, and you can handle docker_client being None where necessary.

Apply this diff to make docker_client optional:

+from typing import Optional

-def __init__(self, cache_dir: str, docker_client: docker.DockerClient):
+def __init__(self, cache_dir: str, docker_client: Optional[docker.DockerClient] = None):
     super().__init__()
     self.docker_client = docker_client

Ensure that you handle the case where self.docker_client might be None in methods that use it.
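
One possible shape for that handling is a small guard that every Docker-dependent method goes through. The class below is a trimmed, hypothetical stand-in for the cache class; _require_docker and image_id are illustrative names, and docker_client is typed as Any so the sketch runs without the Docker SDK installed.

```python
from typing import Any, Optional


class Cache:
    """Trimmed, hypothetical stand-in for the cache class in sebs/cache.py."""

    def __init__(self, cache_dir: str, docker_client: Optional[Any] = None):
        self.cache_dir = cache_dir
        self.docker_client = docker_client

    def _require_docker(self) -> Any:
        # Single place that turns a missing client into a clear error.
        if self.docker_client is None:
            raise RuntimeError("A Docker client is required to cache container images.")
        return self.docker_client

    def image_id(self, container_uri: str) -> str:
        # images.get(name) is the Docker SDK call already used in this PR.
        return self._require_docker().images.get(container_uri).id
```

Code paths that never touch containers keep working without a client, while container paths fail with an explicit message instead of an AttributeError on None.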

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +137 to +148
    def get_container(
        self, deployment: str, benchmark: str, language: str,
        language_version: str, architecture: str
    ) -> Optional[Dict[str, Any]]:
        cfg = self.get_benchmark_config(deployment, benchmark)

        key = f"{language_version}-{architecture}"
        if cfg and language in cfg and key in cfg[language]["containers"]:
            return cfg[language]["containers"][key]
        else:
            return None

⚠️ Potential issue

Handle missing 'containers' key in get_container method

In the get_container method, if cfg[language] does not contain the key "containers", accessing it will raise a KeyError. To prevent this, consider using the dict.get() method with a default empty dictionary.

Apply this diff to safely access the "containers" key:

         key = f"{language_version}-{architecture}"
-        if cfg and language in cfg and key in cfg[language]["containers"]:
+        if cfg and language in cfg and key in cfg[language].get("containers", {}):
             return cfg[language]["containers"][key]
         else:
             return None

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +228 to +231
                image = self.docker_client.images.get(code_package.container_uri)
                language_config["image-uri"] = code_package.container_uri
                language_config["image-id"] = image.id

🛠️ Refactor suggestion

Add exception handling when retrieving Docker images

When retrieving the Docker image with self.docker_client.images.get(code_package.container_uri), an exception may be raised if the image is not found or if there is a connection issue with the Docker daemon. To make the code more robust, add exception handling to catch potential exceptions and handle them gracefully.

Apply this diff to add exception handling:

             if code_package.container_deployment:

-                image = self.docker_client.images.get(code_package.container_uri)
+                try:
+                    image = self.docker_client.images.get(code_package.container_uri)
+                except docker.errors.ImageNotFound as e:
+                    raise RuntimeError(f"Image '{code_package.container_uri}' not found.") from e
+                except docker.errors.APIError as e:
+                    raise RuntimeError(f"Docker API error: {e}") from e
                 language_config["image-uri"] = code_package.container_uri
                 language_config["image-id"] = image.id

Comment on lines +329 to +355

             key = f"{language_version}-{architecture}"
-            config[deployment_name][language]["code_package"][key]["date"][
+            if code_package.container_deployment:
+                main_key = "containers"
+            else:
+                main_key = "code_package"
+
+            config[deployment_name][language][main_key][key]["date"][
                 "modified"
             ] = date
-            config[deployment_name][language]["code_package"][key][
+            config[deployment_name][language][main_key][key][
                 "hash"
             ] = code_package.hash
-            config[deployment_name][language]["code_package"][key][
+            config[deployment_name][language][main_key][key][
                 "size"
             ] = code_package.code_size
+
+            if code_package.container_deployment:
+
+                image = self.docker_client.images.get(code_package.container_uri)
+                config[deployment_name][language][main_key][key][
+                    "image-id"
+                ] = image.id
+                config[deployment_name][language][main_key][key][
+                    "image-uri"
+                ] = code_package.container_uri

🛠️ Refactor suggestion

Add exception handling when updating Docker image information

Similar to the addition in add_code_package, when updating the Docker image information in update_code_package, add exception handling around the self.docker_client.images.get call to handle potential errors gracefully.

Apply this diff to add exception handling:

             if code_package.container_deployment:
                 main_key = "containers"
             else:
                 main_key = "code_package"

             config[deployment_name][language][main_key][key]["date"]["modified"] = date
             config[deployment_name][language][main_key][key]["hash"] = code_package.hash
             config[deployment_name][language][main_key][key]["size"] = code_package.code_size

             if code_package.container_deployment:
-                image = self.docker_client.images.get(code_package.container_uri)
+                try:
+                    image = self.docker_client.images.get(code_package.container_uri)
+                except docker.errors.ImageNotFound as e:
+                    raise RuntimeError(f"Image '{code_package.container_uri}' not found.") from e
+                except docker.errors.APIError as e:
+                    raise RuntimeError(f"Docker API error: {e}") from e
                 config[deployment_name][language][main_key][key]["image-id"] = image.id
                 config[deployment_name][language][main_key][key]["image-uri"] = code_package.container_uri

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 11

🧹 Outside diff range and nitpick comments (13)
sebs/aws/config.py (1)

257-282: Add retry mechanism for ECR API calls

The repository creation could fail due to transient AWS API issues. Consider implementing a retry mechanism similar to the one used in the HTTP API creation.

 def get_ecr_repository(self, ecr_client: ECRClient) -> str:
     if self._container_repository is not None:
         return self._container_repository

     self._container_repository = "sebs-benchmarks-{}".format(self._resources_id)
     self._docker_registry = self.check_ecr_repository_exists(
         ecr_client, self._container_repository
     )

     if self._docker_registry is None:
+        retries = 0
+        while retries < 5:
             try:
                 resp = ecr_client.create_repository(repositoryName=self._container_repository)
                 self.logging.info(f"Created ECR repository: {self._container_repository}")
                 self._docker_registry = resp["repository"]["repositoryUri"]
+                break
             except ecr_client.exceptions.RepositoryAlreadyExistsException:
                 self.logging.info(f"ECR repository {self._container_repository} already exists.")
                 self._docker_registry = self.check_ecr_repository_exists(
                     ecr_client, self._container_repository
                 )
+                break
+            except ecr_client.exceptions.ServerException:
+                retries += 1
+                if retries == 5:
+                    raise
+                time.sleep(retries)  # requires `import time` at module level

     return self._container_repository
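The retry loop in the diff above can also be factored into a small reusable helper. The following is a minimal sketch that assumes nothing about SeBS itself: `call_with_retries` and its parameters are hypothetical names, and the caller decides which exception types count as transient (for ECR, presumably `ServerException`).

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    transient: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying only on the exception types in `transient`."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except transient:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last transient error
            sleep(attempt)  # linear backoff: 1 s, 2 s, 3 s, ...
    raise ValueError("max_attempts must be at least 1")
```

Any exception outside `transient` propagates immediately, so a non-retryable failure is not masked by the loop. The `sleep` parameter exists only so that tests can replace the real delay.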
sebs.py (3)

450-459: Enhance function documentation.

The docstring should include details about the new measure_interval parameter and its impact on memory measurements.

 def start(
     benchmark,
     benchmark_input_size,
     output,
     deployments,
     storage_configuration,
     measure_interval,
     remove_containers,
     **kwargs,
 ):
     """
     Start a given number of function instances and a storage instance.
+
+    Args:
+        measure_interval: Duration between memory measurements in milliseconds.
+                         Use -1 to disable measurements.
+        remove_containers: If True, containers are removed after stopping.
     """

596-596: Fix typo in log message.

There's a typo in the log message.

-        sebs_client.logging.info("Resource grup:")
+        sebs_client.logging.info("Resource groups:")

643-654: Improve resource group management implementation.

  1. Fix the unused loop variable as suggested by static analysis.
  2. Add more descriptive log messages for resource group operations.

         groups = deployment_client.config.resources.list_resource_groups(
             deployment_client.cli_instance
         )
-        for idx, group in enumerate(groups):
+        for _idx, group in enumerate(groups):
             if len(prefix) > 0 and not group.startswith(prefix):
                 continue

-            sebs_client.logging.info(f"Removing resource group: {group}")
+            sebs_client.logging.info(f"Attempting to remove resource group: {group}")
             deployment_client.config.resources.delete_resource_group(
                 deployment_client.cli_instance, group, wait
             )
+            sebs_client.logging.info(f"Successfully removed resource group: {group}")
🧰 Tools
🪛 Ruff

646-646: Loop control variable idx not used within loop body

Rename unused idx to _idx

(B007)

sebs/azure/azure.py (1)

162-171: LGTM! Consider adding a docstring.

The changes correctly handle container deployment parameters and properly raise NotImplementedError for unsupported functionality. The return type has been updated to include container_uri.

Consider adding a docstring to document the new parameters and return value:

def package_code(
    self,
    directory: str,
    language_name: str,
    language_version: str,
    architecture: str,
    benchmark: str,
    is_cached: bool,
    container_deployment: bool,
) -> Tuple[str, int, str]:
    """Package the code for deployment.

    Args:
        directory: Source directory containing the code
        language_name: Programming language name
        language_version: Language version
        architecture: Target architecture
        benchmark: Benchmark name
        is_cached: Whether to use cached package
        container_deployment: Whether to use container deployment

    Returns:
        Tuple containing:
        - Directory path
        - Code size in bytes
        - Container URI (empty for non-container deployments)

    Raises:
        NotImplementedError: If container deployment is True as it's not supported
    """
sebs/aws/aws.py (3)

138-151: Add type hints for improved code maintainability.

The new parameters are already annotated; the only effective change in this suggestion is a comment documenting the elements of the returned tuple:

     def package_code(
         self,
         directory: str,
         language_name: str,
         language_version: str,
         architecture: str,
         benchmark: str,
         is_cached: bool,
         container_deployment: bool,
-    ) -> Tuple[str, int, str]:
+    ) -> Tuple[str, int, str]:  # (package_path, size_in_bytes, container_uri)

238-278: Consider extracting zip deployment logic to a separate method.

The create_function method handles both container and zip deployments. Consider extracting the zip deployment logic to improve readability and maintainability:

+    def _create_zip_function(
+        self,
+        create_function_params: Dict[str, Any],
+        package: str,
+        code_size: int,
+        benchmark: str,
+        architecture: str,
+        language: str,
+        language_runtime: str,
+    ) -> None:
+        if code_size < 50 * 1024 * 1024:
+            with open(package, "rb") as f:
+                create_function_params["Code"] = {"ZipFile": f.read()}
+        else:
+            storage_client = self.get_storage()
+            code_bucket = storage_client.get_bucket(
+                Resources.StorageBucketType.DEPLOYMENT
+            )
+            code_prefix = os.path.join(
+                benchmark, architecture, os.path.basename(package)
+            )
+            storage_client.upload(code_bucket, package, code_prefix)
+            create_function_params["Code"] = {
+                "S3Bucket": code_bucket,
+                "S3Key": code_prefix,
+            }
+        create_function_params["Runtime"] = f"{language}{self._map_language_runtime(language, language_runtime)}"
+        create_function_params["Handler"] = "handler.handler"

    def create_function(
        self,
        code_package: Benchmark,
        func_name: str,
        container_deployment: bool,
        container_uri: str,
    ) -> "LambdaFunction":
        # ... existing code ...
        if container_deployment:
            create_function_params["PackageType"] = "Image"
            create_function_params["Code"] = {"ImageUri": container_uri}
        else:
            create_function_params["PackageType"] = "Zip"
            self._create_zip_function(
                create_function_params,
                package,
                code_size,
                benchmark,
                architecture,
                language,
                language_runtime,
            )
🧰 Tools
🪛 Ruff

253-253: Use a context manager for opening files

(SIM115)


323-367: Consider extracting zip deployment update logic to a separate method.

Similar to the create_function refactoring suggestion, consider extracting the zip deployment update logic to improve readability and maintainability.

+    def _update_zip_function(
+        self,
+        name: str,
+        package: str,
+        code_size: int,
+        benchmark: str,
+        architecture: str,
+        function: LambdaFunction,
+    ) -> None:
+        if code_size < 50 * 1024 * 1024:
+            with open(package, "rb") as code_body:
+                self.client.update_function_code(
+                    FunctionName=name,
+                    ZipFile=code_body.read(),
+                    Architectures=[self._map_architecture(architecture)],
+                )
+        else:
+            storage = cast(S3, self.get_storage())
+            bucket = function.code_bucket(benchmark, storage)
+            code_prefix = os.path.join(
+                benchmark,
+                architecture,
+                os.path.basename(package)
+            )
+            storage.upload(bucket, package, code_prefix)
+            self.client.update_function_code(
+                FunctionName=name,
+                S3Bucket=bucket,
+                S3Key=code_prefix,
+                Architectures=[self._map_architecture(architecture)],
+            )

    def update_function(
        self,
        function: Function,
        code_package: Benchmark,
        container_deployment: bool,
        container_uri: str,
    ):
        name = function.name
        function = cast(LambdaFunction, function)
        # ... existing code that defines function_cfg ...

        if container_deployment:
            self.client.update_function_code(
                FunctionName=name,
                ImageUri=container_uri
            )
        else:
            self._update_zip_function(
                name,
                code_package.code_location,
                code_package.code_size,
                code_package.benchmark,
                function_cfg.architecture.value,
                function,
            )
sebs/faas/container.py (2)

40-43: Simplify code by using a ternary operator

You can make the code more concise by replacing the if-else block with a ternary operator.

-        if isinstance(txt, str):
-            line = json.loads(txt)
-        else:
-            line = txt
+        line = json.loads(txt) if isinstance(txt, str) else txt
🧰 Tools
🪛 Ruff

40-43: Use ternary operator line = json.loads(txt) if isinstance(txt, str) else txt instead of if-else-block

Replace if-else-block with line = json.loads(txt) if isinstance(txt, str) else txt

(SIM108)


66-67: Raise a more specific exception for better error handling

Currently, a generic Exception is raised when an error occurs. Consider raising a more specific exception, such as docker.errors.APIError or defining a custom exception class. This improves clarity and allows for more precise exception handling.

-            raise Exception(line["error"])
+            raise docker.errors.APIError(line["error"])
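The alternative mentioned above, a custom exception class, could look roughly like the sketch below. It combines the ternary parsing from the previous comment with a dedicated error type; `ImageOperationError` and `parse_progress_line` are hypothetical names, not part of SeBS or the Docker SDK.

```python
import json
from typing import Any, Dict, Union


class ImageOperationError(RuntimeError):
    """Hypothetical error type for failures reported in a Docker progress stream."""


def parse_progress_line(txt: Union[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Decode one progress entry and raise if it reports an error."""
    # Entries arrive either as raw JSON text or as already decoded dictionaries.
    line = json.loads(txt) if isinstance(txt, str) else txt
    if "error" in line:
        raise ImageOperationError(line["error"])
    return line
```

Callers can then catch `ImageOperationError` specifically instead of a bare `Exception`, while unrelated failures such as malformed JSON keep their own exception type.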
sebs/aws/container.py (2)

69-159: Consider refactoring 'build_base_image' method for improved readability

The build_base_image method is lengthy and encompasses multiple logical steps, such as checking the image cache, preparing the build context, building the Docker image, and pushing the image to the repository. Breaking it down into smaller, well-named helper methods would enhance readability and maintainability.


148-150: Add error handling for the Docker image build process

To ensure issues during the Docker build are properly managed, consider adding exception handling around the image build step. This allows for graceful error reporting and recovery.

Example modification:

+            try:
                 image, _ = self.docker_client.images.build(
                     tag=repository_uri, path=build_dir, buildargs=buildargs
                 )
+            except docker.errors.BuildError as e:
+                self.logging.error(f"Failed to build Docker image: {e}")
+                raise RuntimeError("Docker image build failed") from e
sebs/gcp/gcp.py (1)

200-200: Consider removing container_uri from the return value if unused.

Since container_deployment is not supported and container_uri is initialized but not utilized within the package_code method, consider removing it from the return statement to simplify the code.

Apply this diff to adjust the return statement:

-return os.path.join(directory, "{}.zip".format(benchmark)), bytes_size, container_uri
+return os.path.join(directory, "{}.zip".format(benchmark)), bytes_size

And update the method signature accordingly.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between e4a5b26 and ce64f7d.

📒 Files selected for processing (13)
  • dockerfiles/aws/python/Dockerfile.function (1 hunks)
  • requirements.aws.txt (1 hunks)
  • sebs.py (12 hunks)
  • sebs/aws/aws.py (8 hunks)
  • sebs/aws/config.py (4 hunks)
  • sebs/aws/container.py (1 hunks)
  • sebs/azure/azure.py (6 hunks)
  • sebs/faas/container.py (1 hunks)
  • sebs/faas/system.py (5 hunks)
  • sebs/gcp/gcp.py (4 hunks)
  • sebs/local/local.py (3 hunks)
  • sebs/openwhisk/openwhisk.py (9 hunks)
  • sebs/sebs.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • dockerfiles/aws/python/Dockerfile.function
  • requirements.aws.txt
  • sebs/local/local.py
  • sebs/openwhisk/openwhisk.py
  • sebs/sebs.py
🧰 Additional context used
🪛 Ruff
sebs.py

646-646: Loop control variable idx not used within loop body

Rename unused idx to _idx

(B007)

sebs/aws/aws.py

253-253: Use a context manager for opening files

(SIM115)

sebs/aws/container.py

67-67: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

sebs/faas/container.py

40-43: Use ternary operator line = json.loads(txt) if isinstance(txt, str) else txt instead of if-else-block

Replace if-else-block with line = json.loads(txt) if isinstance(txt, str) else txt

(SIM108)

🔇 Additional comments (17)
sebs/aws/config.py (4)

1-8: LGTM: Appropriate imports for ECR support

The added imports and type hints are well-chosen for ECR integration.


136-151: LGTM: Well-structured property definitions

The property getters are properly implemented with correct type hints and follow Python best practices.


330-338: LGTM: Proper cache management

The cache updates are well-structured and maintain consistency with the existing caching mechanism.


303-307: 🛠️ Refactor suggestion

Use safe dictionary access

The current implementation assumes dictionary keys exist, which could raise KeyError.

 if "docker" in dct:
-    ret._docker_registry = dct["docker"]["registry"]
-    ret._docker_username = dct["docker"]["username"]
-    ret._container_repository = dct["container_repository"]
+    docker_config = dct["docker"]
+    ret._docker_registry = docker_config.get("registry")
+    ret._docker_username = docker_config.get("username")
+    ret._container_repository = dct.get("container_repository")

Likely invalid or redundant comment.

sebs/azure/azure.py (3)

295-304: Changes look good!

The implementation correctly handles container deployment parameters and raises NotImplementedError for unsupported functionality.


341-350: Changes look good!

The implementation correctly handles container deployment parameters and raises NotImplementedError for unsupported functionality.


530-530: LGTM!

The update_function call correctly passes False for container_deployment and empty string for container_uri, which is consistent with Azure not supporting container deployment.

sebs/aws/aws.py (3)

13-13: LGTM: Clean ECR client integration.

The ECR client integration follows the existing pattern of client initialization in the AWS class.

Also applies to: 76-78


400-401: LGTM: Clean function name update for container deployments.

The addition of the "-docker" suffix for container deployments is consistent with the existing naming pattern.


253-253: ⚠️ Potential issue

Use context manager for file operations.

Use a context manager to ensure proper file handling:

-                    package_body = open(package, "rb").read()
+                    with open(package, "rb") as f:
+                        package_body = f.read()

Likely invalid or redundant comment.

🧰 Tools
🪛 Ruff

253-253: Use a context manager for opening files

(SIM115)

sebs/faas/container.py (2)

11-19: Well-defined constructor initializes class attributes correctly

The __init__ method properly sets up the Docker client, experimental manifest flag, and system configuration for the DockerContainer class.


69-87: push_image method handles image pushing effectively

The push_image method correctly manages the image push process, including progress tracking and error handling with proper logging.

sebs/aws/container.py (1)

66-66: 🛠️ Refactor suggestion

Enhance exception handling by chaining exceptions

To preserve the original exception traceback and provide better error context, use exception chaining by adding from e when raising a new exception in an except block.

Apply this diff to improve exception handling:

-            raise RuntimeError("Couldn't push to Docker registry")
+            raise RuntimeError("Couldn't push to Docker registry") from e

Likely invalid or redundant comment.

sebs/gcp/gcp.py (4)

141-145: Method signature updated to include container deployment parameters.

The package_code method now accepts architecture and container_deployment parameters, aligning with the changes made for container support. This update ensures consistency across different cloud providers.


149-150: Proper handling of unsupported container deployment in GCP.

Raising a NotImplementedError when container_deployment is True correctly indicates that container deployment is not supported in GCP, preventing unintended usage.


202-208: Method signature updated to include container deployment parameters.

The create_function method now includes container_deployment and container_uri parameters, ensuring consistency with the overall interface and preparing for potential future support.


210-211: Correctly raising an error for unsupported container deployment.

The NotImplementedError provides clear feedback that container deployment is not yet implemented for GCP, which helps maintain correct usage.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 15

🧹 Outside diff range and nitpick comments (20)
sebs/openwhisk/container.py (3)

1-9: LGTM with minor style suggestion.

The imports and class declaration are well-organized with proper type hints.

Remove the extra blank line at line 8 to maintain consistent spacing.


10-17: Add docstrings to static methods.

While the methods are clear and concise, adding docstrings would improve code documentation and maintainability.

Consider adding docstrings:

 @staticmethod
 def name() -> str:
+    """Return the identifier for OpenWhisk container.
+    
+    Returns:
+        str: The string 'openwhisk'
+    """
     return "openwhisk"

 @staticmethod
 def typename() -> str:
+    """Return the full type name for OpenWhisk container.
+    
+    Returns:
+        str: The string 'OpenWhisk.Container'
+    """
     return "OpenWhisk.Container"

18-27: Add constructor documentation and parameter validation.

The constructor implementation is correct but could benefit from additional documentation and validation.

Consider these improvements:

 def __init__(
     self,
     system_config: SeBSConfig,
     config: OpenWhiskConfig,
     docker_client: docker.client,
     experimental_manifest: bool,
 ):
+    """Initialize OpenWhisk container manager.
+    
+    Args:
+        system_config (SeBSConfig): System-wide configuration
+        config (OpenWhiskConfig): OpenWhisk-specific configuration
+        docker_client (docker.client): Docker client instance
+        experimental_manifest (bool): Flag for experimental features
+    
+    Raises:
+        ValueError: If required configurations are missing
+    """
+    if not system_config:
+        raise ValueError("system_config is required")
+    if not config:
+        raise ValueError("config is required")
+    if not docker_client:
+        raise ValueError("docker_client is required")
     super().__init__(system_config, docker_client, experimental_manifest)
     self.config = config
sebs/aws/container.py (2)

14-21: Add return type annotation to name() method.

For consistency with typename(), add a return type annotation to the name() method.

    @staticmethod
-   def name():
+   def name() -> str:
        return "aws"

38-53: Add docstring to document the registry_name method.

The method logic is correct, but adding a docstring would improve code documentation by explaining the purpose and return values.

    def registry_name(
        self, benchmark: str, language_name: str, language_version: str, architecture: str
    ) -> Tuple[str, str, str, str]:
+       """
+       Constructs AWS ECR registry details for a given benchmark and runtime.
+       
+       Args:
+           benchmark: Name of the benchmark
+           language_name: Programming language name
+           language_version: Language version
+           architecture: Target architecture
+       
+       Returns:
+           Tuple containing (registry_name, repository_name, image_tag, image_uri)
+       """
config/systems.json (1)

Line range hint 1-285: Consider adding JSON schema validation.

To ensure configuration consistency and catch potential issues early, consider:

  1. Creating a JSON schema for this configuration file
  2. Adding validation in the code that reads this configuration
  3. Including schema documentation

This would help prevent configuration errors and improve maintainability.
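Short of a full JSON schema, a structural check using only the standard library already catches missing sections. The sketch below is an illustration under stated assumptions: the required key `languages` and the function name `find_config_errors` are hypothetical and not taken from the actual layout of `config/systems.json`.

```python
from typing import Any, Dict, List

# Hypothetical requirement table: each deployment entry must define these keys.
REQUIRED_KEYS = ("languages",)


def find_config_errors(config: Dict[str, Any], deployments: List[str]) -> List[str]:
    """Return human-readable problems; an empty list means the check passed."""
    errors: List[str] = []
    for name in deployments:
        entry = config.get(name)
        if not isinstance(entry, dict):
            errors.append(f"{name}: missing or not an object")
            continue
        for key in REQUIRED_KEYS:
            if key not in entry:
                errors.append(f"{name}: missing required key '{key}'")
    return errors
```

Collecting all problems before reporting lets a user fix the configuration in one pass rather than one error at a time.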

sebs/utils.py (1)

Line range hint 103-138: Remove commented-out logging code.

The commented-out configure_logging implementation should be removed to maintain code cleanliness. If this alternative implementation is needed in the future, it can be retrieved from version control history.

sebs/faas/container.py (4)

29-41: Add parameter validation in constructor.

Consider adding validation for docker_client to ensure it's a valid Docker client instance, as its methods are used throughout the class.

 def __init__(
     self,
     system_config: SeBSConfig,
     docker_client,
     experimental_manifest: bool = False,
 ):
     super().__init__()
+    if not hasattr(docker_client, 'images'):
+        raise ValueError("Invalid docker_client provided")
 
     self.docker_client = docker_client
     self.experimental_manifest = experimental_manifest
     self.system_config = system_config
     self._disable_rich_output = False

60-90: Optimize code and improve error handling.

Two suggestions for improvement:

  1. Use ternary operator for better readability
  2. Add more specific error handling for JSON parsing

-        if isinstance(txt, str):
-            line = json.loads(txt)
-        else:
-            line = txt
+        try:
+            line = json.loads(txt) if isinstance(txt, str) else txt
+        except json.JSONDecodeError as e:
+            self.logging.error(f"Failed to parse progress update: {e}")
+            raise
🧰 Tools
🪛 Ruff

62-65: Use ternary operator line = json.loads(txt) if isinstance(txt, str) else txt instead of if-else-block

Replace if-else-block with line = json.loads(txt) if isinstance(txt, str) else txt

(SIM108)


91-122: Refactor to reduce code duplication.

The image push logic is duplicated between the progress and non-progress paths. Consider extracting the common functionality.

     def push_image(self, repository_uri, image_tag):
+        def handle_push_stream(stream):
+            for line in stream:
+                if "error" in line:
+                    self.logging.error(f"Failed to push the image to registry {repository_uri}")
+                    raise RuntimeError(line)
+                yield line
+
         try:
+            self.logging.info(f"Pushing image {image_tag} to {repository_uri}")
+            ret = self.docker_client.images.push(
+                repository=repository_uri, tag=image_tag, stream=True, decode=True
+            )
 
             if not self.disable_rich_output:
-
                 layer_tasks = {}
                 with Progress() as progress:
-
-                    self.logging.info(f"Pushing image {image_tag} to {repository_uri}")
-                    ret = self.docker_client.images.push(
-                        repository=repository_uri, tag=image_tag, stream=True, decode=True
-                    )
-                    for line in ret:
+                    for line in handle_push_stream(ret):
                         self.show_progress(line, progress, layer_tasks)
-
             else:
-                self.logging.info(f"Pushing image {image_tag} to {repository_uri}")
-                ret = self.docker_client.images.push(
-                    repository=repository_uri, tag=image_tag, stream=True, decode=True
-                )
-
-                for val in ret:
-                    if "error" in val:
-                        self.logging.error(f"Failed to push the image to registry {repository_uri}")
-                        raise RuntimeError(val)
+                for _ in handle_push_stream(ret):
+                    pass
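The generator in the diff above can be tried in isolation with a fake stream. This is a minimal standalone sketch that assumes the push stream yields decoded dictionaries, as it does with `decode=True`; `checked_push_stream` is a hypothetical name.

```python
from typing import Any, Dict, Iterable, Iterator


def checked_push_stream(
    stream: Iterable[Dict[str, Any]], repository_uri: str
) -> Iterator[Dict[str, Any]]:
    """Yield progress entries, stopping with an error on the first failure."""
    for line in stream:
        if "error" in line:
            raise RuntimeError(f"Push to {repository_uri} failed: {line['error']}")
        yield line
```

Because the check lives in the generator, the progress-bar path and the quiet path consume the same stream and therefore fail in the same way.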

1-211: Consider adding timeout and retry mechanisms.

For improved reliability, consider:

  1. Adding configurable timeouts for Docker operations
  2. Implementing retry mechanisms for transient failures in image push/pull operations
  3. Adding circuit breakers for Docker daemon communication

This would make the container operations more resilient in production environments.
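As an illustration of the third point, a circuit breaker can be sketched in a few lines. Everything here is hypothetical (`CircuitBreaker`, `CircuitOpenError`, the thresholds) and independent of SeBS and the Docker SDK; the wrapped operation is any zero-argument callable.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when calls are being skipped because of recent failures."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, operation: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            # Cool-down elapsed: allow one trial call; a failure reopens at once.
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success closes the circuit fully
        return result
```

After `failure_threshold` consecutive failures the breaker rejects calls without contacting the daemon until `reset_after` seconds have passed, which keeps a dead daemon from stalling every subsequent operation.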

🧰 Tools
🪛 Ruff

62-65: Use ternary operator line = json.loads(txt) if isinstance(txt, str) else txt instead of if-else-block

Replace if-else-block with line = json.loads(txt) if isinstance(txt, str) else txt

(SIM108)

sebs/openwhisk/openwhisk.py (1)

376-377: Consider adding enable_rich_output method for symmetry.

For better API design, consider adding a corresponding method to enable rich output.

 def disable_rich_output(self):
     self.container_client.disable_rich_output = True
+
+def enable_rich_output(self):
+    self.container_client.disable_rich_output = False
sebs/faas/system.py (2)

158-171: Improve docstring accuracy and completeness.

The docstring has a few minor issues:

  1. Architecture example uses 'x64' but the code uses 'x86_64' (line 162).
  2. Return tuple description does not state the unit of the package size.

-            architecture: Target architecture (e.g., 'x64', 'arm64')
+            architecture: Target architecture (e.g., 'x86_64', 'arm64')

-            - Size of the package
+            - Size of the package in bytes

401-403: Add docstring for disable_rich_output method.

The purpose and usage of this method are unclear. Please add a docstring explaining:

  • The method's purpose
  • Whether it should be abstract
  • Expected behavior in subclasses

     def disable_rich_output(self):
+        """
+        Disable rich output formatting in the FaaS system.
+        
+        This method can be overridden by subclasses to implement
+        system-specific rich output disabling behavior.
+        """
         pass
sebs.py (3)

228-228: Add validation for storage configuration.

Consider adding validation to ensure the storage configuration file is properly formatted and contains required fields.

-@click.option("--storage-configuration", default=None, type=str, help="JSON configuration of deployed storage.")
+def validate_storage_config(ctx, param, value):
+    if value is None:
+        return None
+    try:
+        with open(value, 'r') as f:
+            config = json.load(f)
+            # Add validation for required fields
+            return value
+    except (json.JSONDecodeError, IOError) as e:
+        raise click.BadParameter(f"Invalid storage configuration: {str(e)}")
+
+@click.option("--storage-configuration",
+    default=None,
+    type=str,
+    callback=validate_storage_config,
+    help="JSON configuration of deployed storage.")

439-460: Enhance documentation for measurement interval parameter.

Consider adding more detailed documentation about:

  • The purpose and impact of the measurement interval
  • Recommended values for different use cases
  • Performance implications of different intervals

644-655: Fix unused loop variable and enhance error handling.

  1. Address the unused loop variable.
  2. Consider adding error handling for the resource group deletion operation.

-        for idx, group in enumerate(groups):
+        for _idx, group in enumerate(groups):
             if len(prefix) > 0 and not group.startswith(prefix):
                 continue

             sebs_client.logging.info(f"Removing resource group: {group}")
-            deployment_client.config.resources.delete_resource_group(
-                deployment_client.cli_instance, group, wait
-            )
+            try:
+                deployment_client.config.resources.delete_resource_group(
+                    deployment_client.cli_instance, group, wait
+                )
+            except Exception as e:
+                sebs_client.logging.error(f"Failed to delete resource group {group}: {str(e)}")
+                if not wait:
+                    sebs_client.logging.warning("Resource group deletion might still be in progress")
🧰 Tools
🪛 Ruff

647-647: Loop control variable idx not used within loop body

Rename unused idx to _idx

(B007)

sebs/aws/aws.py (1)

238-278: Consider extracting zip deployment logic.

The create_function method is handling both container and zip deployments, making it quite long. Consider extracting the zip deployment logic into a separate method for better maintainability.

+def _create_zip_function(self, create_function_params: dict, package: str, code_size: int,
+                        benchmark: str, architecture: str, language: str,
+                        language_runtime: str) -> dict:
+    if code_size < 50 * 1024 * 1024:
+        with open(package, "rb") as f:
+            package_body = f.read()
+        create_function_params["Code"] = {"ZipFile": package_body}
+    else:
+        code_package_name = os.path.basename(package)
+        storage_client = self.get_storage()
+        code_bucket = storage_client.get_bucket(Resources.StorageBucketType.DEPLOYMENT)
+        code_prefix = os.path.join(benchmark, architecture, code_package_name)
+        storage_client.upload(code_bucket, package, code_prefix)
+        create_function_params["Code"] = {
+            "S3Bucket": code_bucket,
+            "S3Key": code_prefix,
+        }
+    create_function_params["Runtime"] = f"{language}{self._map_language_runtime(language, language_runtime)}"
+    create_function_params["Handler"] = "handler.handler"
+    return create_function_params

 def create_function(self, code_package: Benchmark, func_name: str,
                    container_deployment: bool, container_uri: str) -> "LambdaFunction":
     # ... existing code ...
     if container_deployment:
         create_function_params["PackageType"] = "Image"
         create_function_params["Code"] = {"ImageUri": container_uri}
     else:
         create_function_params["PackageType"] = "Zip"
-        if code_size < 50 * 1024 * 1024:
-            # ... existing zip handling code ...
+        create_function_params = self._create_zip_function(
+            create_function_params, package, code_size, benchmark,
+            architecture, language, language_runtime
+        )
🧰 Tools
🪛 Ruff

253-253: Use a context manager for opening files

(SIM115)

sebs/gcp/gcp.py (1)

141-150: LGTM! Consider improving the error message.

The changes to add container deployment support are well-structured. However, the error message could be more informative.

Consider updating the error message to be more specific:

-            raise NotImplementedError("Container Deployment is not supported in GCP")
+            raise NotImplementedError("Container deployment is not supported in GCP. Use AWS for container-based deployments.")
sebs/regression.py (1)

131-131: Remove Unnecessary Commented Code

The line:

# for trigger in triggers:

is commented out. If this code is no longer needed, consider removing it to keep the codebase clean. If it's temporarily commented out for debugging or development purposes, please add a comment explaining its intended future use.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ce64f7d and 18cc75b.

📒 Files selected for processing (14)
  • config/systems.json (8 hunks)
  • sebs.py (13 hunks)
  • sebs/aws/aws.py (9 hunks)
  • sebs/aws/container.py (1 hunks)
  • sebs/benchmark.py (9 hunks)
  • sebs/config.py (1 hunks)
  • sebs/faas/container.py (1 hunks)
  • sebs/faas/system.py (7 hunks)
  • sebs/gcp/gcp.py (4 hunks)
  • sebs/openwhisk/container.py (1 hunks)
  • sebs/openwhisk/openwhisk.py (9 hunks)
  • sebs/regression.py (11 hunks)
  • sebs/sebs.py (2 hunks)
  • sebs/utils.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • sebs/benchmark.py
  • sebs/sebs.py
🧰 Additional context used
🪛 Ruff
sebs.py

647-647: Loop control variable idx not used within loop body

Rename unused idx to _idx

(B007)

sebs/aws/aws.py

253-253: Use a context manager for opening files

(SIM115)

sebs/aws/container.py

79-79: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

sebs/faas/container.py

62-65: Use ternary operator line = json.loads(txt) if isinstance(txt, str) else txt instead of if-else-block

Replace if-else-block with line = json.loads(txt) if isinstance(txt, str) else txt

(SIM108)

🔇 Additional comments (26)
sebs/openwhisk/container.py (1)

28-46: 🛠️ Refactor suggestion

Add input validation and improve registry handling.

While the method correctly constructs registry information, it could benefit from additional validation and error handling.

Consider these improvements:

 def registry_name(
     self, benchmark: str, language_name: str, language_version: str, architecture: str
 ) -> Tuple[str, str, str, str]:
+    """Construct registry information for container image.
+    
+    Args:
+        benchmark (str): Benchmark identifier
+        language_name (str): Programming language name
+        language_version (str): Language version
+        architecture (str): Target architecture
+    
+    Returns:
+        Tuple[str, str, str, str]: Registry name, repository name, image tag, and URI
+    
+    Raises:
+        ValueError: If input parameters are invalid
+    """
+    if not all([benchmark, language_name, language_version, architecture]):
+        raise ValueError("All parameters must be non-empty strings")
+
+    # Sanitize inputs to ensure valid Docker names
+    safe_benchmark = benchmark.lower().replace(' ', '-')
+    safe_lang_name = language_name.lower().replace(' ', '-')
 
     registry_name = self.config.resources.docker_registry
 
     # We need to retag created images when pushing to registry other
     # than default
     repository_name = self.system_config.docker_repository()
     image_tag = self.system_config.benchmark_image_tag(
-        self.name(), benchmark, language_name, language_version, architecture
+        self.name(), safe_benchmark, safe_lang_name, language_version, architecture
     )
     if registry_name is not None and registry_name != "":
+        # Validate registry name format
+        if not registry_name.lower().replace('.', '').replace('-', '').isalnum():
+            raise ValueError(f"Invalid registry name: {registry_name}")
         repository_name = f"{registry_name}/{repository_name}"
     else:
-        registry_name = "Docker Hub"
+        registry_name = "docker.io"  # Use official Docker Hub registry name
     image_uri = f"{repository_name}:{image_tag}"
 
     return registry_name, repository_name, image_tag, image_uri

Key improvements:

  1. Added input validation
  2. Added parameter sanitization for Docker compatibility
  3. Added registry name validation
  4. Used official Docker Hub registry name
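
As a rough illustration of the registry-name check, and not the project's code: the `isalnum` test in the suggestion above would reject a registry that carries a port, such as `localhost:5000`. The sketch below assumes a hypothetical helper, `is_valid_registry_host`, that accepts dot-separated hostname labels with an optional port.

```python
import re

# Hypothetical helper, not part of SeBS: accepts dot-separated hostname
# labels (letters, digits, inner hyphens) with an optional ":port" suffix.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
_REGISTRY_RE = re.compile(rf"{_LABEL}(?:\.{_LABEL})*(?::[0-9]{{1,5}})?")


def is_valid_registry_host(registry_name: str) -> bool:
    """Return True if registry_name looks like host[:port]."""
    return _REGISTRY_RE.fullmatch(registry_name) is not None
```

A check like this is deliberately loose: it only filters out obviously malformed names and leaves the final verdict to the registry itself.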


sebs/aws/container.py (4)

1-13: LGTM! Well-structured imports and class definition.

The imports are properly organized and the class inheritance is appropriate for implementing AWS ECR container support.


22-33: LGTM! Well-structured constructor.

The constructor properly initializes the ECR client and maintains parent class functionality.


34-37: LGTM! Well-typed client property.


54-65: 🛠️ Refactor suggestion

Simplify the method and improve error handling.

  1. The if/return True block and the trailing return False can collapse into a single boolean return, so every path still returns a bool.
  2. The past review comment about logging exceptions is still valid.
    def find_image(self, repository_name, image_tag) -> bool:
        try:
            response = self.ecr_client.describe_images(
                repositoryName=repository_name, imageIds=[{"imageTag": image_tag}]
            )
-            if response["imageDetails"]:
-                return True
-        except ClientError:
+            return bool(response["imageDetails"])
+        except ClientError as e:
+            self.logging.error(f"Error checking image existence: {e}")
            return False
-
-        return False
config/systems.json (6)

58-59: LGTM: Local configuration changes are consistent.

The addition of architecture and deployment specifications for local development is well-structured.


113-114: LGTM: AWS container deployment configuration is well-structured.

The changes properly support:

  • Multiple architectures (x64, arm64)
  • Multiple deployment methods (package, container)

This aligns well with the PR objectives for container deployment support.


120-126: LGTM: Azure configuration changes maintain consistency.

The restructuring of base images and addition of architecture/deployment specifications align with the overall configuration structure.

Also applies to: 171-172


178-185: LGTM: GCP configuration uses secure base images.

Good practice observed:

  • Using latest Ubuntu LTS (22.04) for Python runtimes
  • Consistent architecture specification
  • Clear deployment method specification

Also applies to: 228-229


84-85: Document the setup.py requirements and usage.

The addition of setup.py to deployment files is a significant change that affects the packaging process. Please ensure this is documented in:

  • Deployment documentation
  • Configuration guide
  • Any relevant README files

282-283: Verify container-only deployment for OpenWhisk.

OpenWhisk is configured with "deployments": ["container"] while other providers support both package and container deployments. Please verify if this is intentional and document the reasoning.

✅ Verification successful

Based on the extensive search results, I can now provide a definitive response about the container-only deployment configuration for OpenWhisk:

Container-only deployment is intentional and required for OpenWhisk

OpenWhisk is specifically configured to use container-only deployment due to its inherent 48MB code package size limitation. This is well-documented in the codebase:

  • The container-only approach is explicitly mentioned in docs/platforms.md: "OpenWhisk has a very low code package size limit of only 48 megabytes. So, to circumvent this limit, we deploy functions using pre-built Docker images."
  • The implementation in sebs/openwhisk/ consistently uses Docker containers for function deployment
  • All official OpenWhisk base images are properly configured in config/systems.json
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if this limitation is documented
fd -t f "README|DEPLOY|CONFIGURATION" -X grep -l -A 5 "OpenWhisk.*container"

Length of output: 78


Script:

#!/bin/bash
# Let's check the systems.json file to understand deployment patterns across providers
cat config/systems.json | jq -r 'to_entries[] | select(.value.deployments != null) | "\(.key): \(.value.deployments | join(", "))"'

# Also search for any OpenWhisk related documentation or configuration
rg -i "openwhisk" -A 5 -B 5

Length of output: 69282

sebs/utils.py (1)

57-57: Verify the impact of the condition change.

The change from if value: to if value is not None: allows falsy values (0, '', [], {}, False) to be stored in the nested dictionary. While this might be intentional for container configurations, please verify that this doesn't break existing functionality.

✅ Verification successful

Let me analyze the usage of update_nested_dict further by checking the values being passed to it.


Based on the gathered information, I can now provide the final response:

The condition change is safe and intentional

The change from if value: to if value is not None: is actually a safer implementation because:

  • All calls to update_nested_dict pass configuration values that should be explicitly checked for None rather than falsy values
  • The function is used to update configuration dictionaries where empty strings, zeros, or False values might be valid configuration options
  • The change aligns with the function's type hint Optional[str], making the implementation more consistent with its interface
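
To make the difference concrete, here is a minimal sketch of a nested-dictionary update that skips only `None`. It is a simplified, assumed reimplementation for illustration; the real `update_nested_dict` in sebs/utils.py may differ in signature and details.

```python
from typing import Any, List, Optional


def update_nested_dict(cfg: dict, keys: List[str], value: Optional[Any]) -> None:
    """Set cfg[k1][k2]...[kn] = value, skipping only when value is None."""
    if value is None:
        return
    node = cfg
    # Walk (and create) intermediate dictionaries for all but the last key.
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


config: dict = {}
# False is a legitimate setting and is stored.
update_nested_dict(config, ["experiments", "container_deployment"], False)
# None means "not provided" and is ignored.
update_nested_dict(config, ["deployment", "name"], None)
print(config)
```

With the earlier `if value:` check, the `False` value in the first call would have been silently dropped.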
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for calls to update_nested_dict to understand the impact
# Test: Look for potential issues with falsy values being stored

# Find all direct calls to update_nested_dict
echo "Direct calls to update_nested_dict:"
rg "update_nested_dict\(" -A 2

# Find potential configuration files that might be affected
echo -e "\nPotential configuration files:"
fd -e json -e yaml -e yml

Length of output: 2993


Script:

#!/bin/bash
# Check the values being passed to update_nested_dict
echo "Checking update_nested_dict usage in sebs.py:"
rg -B 2 "update_nested_dict\(" sebs.py

# Check the implementation of update_nested_dict
echo -e "\nChecking update_nested_dict implementation:"
rg -B 5 -A 10 "def update_nested_dict" sebs/utils.py

# Check if there are any tests for update_nested_dict
echo -e "\nChecking for tests:"
fd -e py test

Length of output: 1949

sebs/faas/container.py (4)

1-14: LGTM!

The imports are well-organized and the class hierarchy is appropriate.


15-28: LGTM!

The abstract method and property implementations follow Python best practices.


123-128: LGTM!

The abstract method is well-defined with clear parameters and return type annotation.


42-59: ⚠️ Potential issue

Security: Fix command injection vulnerability.

The execute() call with f-string interpolation is vulnerable to command injection. This is still a valid issue from the previous review.

Apply this fix:

-                execute(f"docker manifest inspect {repository_name}:{image_tag}")
+                execute(["docker", "manifest", "inspect", f"{repository_name}:{image_tag}"])
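
Whether the project's `execute` helper accepts a list is not shown in this thread, so the following sketch uses the standard library's `subprocess.run` directly. The helper names `manifest_inspect_argv` and `image_exists` are hypothetical.

```python
import subprocess
from typing import List


def manifest_inspect_argv(repository_name: str, image_tag: str) -> List[str]:
    """Build the argument vector for `docker manifest inspect`.

    Each list element reaches the process as exactly one argument, so spaces
    or shell metacharacters in the inputs are never interpreted by a shell.
    """
    return ["docker", "manifest", "inspect", f"{repository_name}:{image_tag}"]


def image_exists(repository_name: str, image_tag: str) -> bool:
    """Return True if the manifest can be inspected.

    Requires the Docker CLI; subprocess.run raises FileNotFoundError if the
    `docker` executable is not installed.
    """
    result = subprocess.run(
        manifest_inspect_argv(repository_name, image_tag),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        check=False,
    )
    return result.returncode == 0
```

Passing a list with the default `shell=False` is what removes the injection risk; the f-string only joins the repository and tag into a single argument.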
sebs/openwhisk/openwhisk.py (2)

35-37: LGTM: Container client initialization is well-structured.

The OpenWhiskContainer is properly initialized with all necessary parameters.


114-116: Verify container image building across architectures.

The image building process needs to be tested across different architectures to ensure compatibility.

sebs/faas/system.py (2)

188-212: LGTM!

The method signature and docstring are well-structured and complete. The documentation properly describes the new container-related parameters and potential errors.


220-238: LGTM!

The method signature and docstring are well-structured and complete. The documentation properly describes the new container-related parameters and potential errors.

sebs.py (2)

100-104: LGTM! Well-documented CLI option.

The --container-deployment flag is well-implemented with clear, descriptive help text that properly indicates AWS-only support.


133-133: Verify container deployment configuration usage.

The container deployment configuration is properly added. Let's verify its usage across the codebase.

Also applies to: 157-157

✅ Verification successful

Based on the shell script results, I can generate the final response:

Container deployment configuration is properly integrated and verified

The container deployment configuration is correctly used across the codebase:

  • Properly propagated through function parameters in deployment systems
  • Correctly handled in AWS Lambda with container/zip package type switching
  • Appropriately rejected with clear error messages in unsupported platforms (Azure, GCP, Local)
  • Well-integrated with the configuration system and cache validation
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify container deployment configuration usage
# Test: Search for container_deployment usage in deployment implementations
rg -A 5 "container_deployment"

# Test: Search for container deployment related configuration
ast-grep --pattern 'container_deployment = $_'

Length of output: 19665

sebs/aws/aws.py (2)

13-13: LGTM: Clean ECR client integration.

The ECR client integration follows the existing pattern of client initialization in the AWS class.

Also applies to: 76-78


400-401: LGTM: Clean helper method implementations.

The changes to default_function_name and the addition of disable_rich_output are straightforward and well-implemented.

Also applies to: 629-630

sebs/gcp/gcp.py (2)

202-211: LGTM! The changes are consistent.

The method signature update and error handling for container deployment are consistent with the package_code method.


345-354: LGTM! The changes maintain consistency.

The method signature update and error handling for container deployment are consistent with other methods in the class.

Comment on lines +66 to +79
    def push_image(self, repository_uri, image_tag):

        username, password, registry_url = self.config.resources.ecr_repository_authorization(
            self.client
        )

        try:
            self.docker_client.login(username=username, password=password, registry=registry_url)
            super().push_image(repository_uri, image_tag)
            self.logging.info(f"Successfully pushed the image to registry {repository_uri}.")
        except docker.errors.APIError as e:
            self.logging.error(f"Failed to push the image to registry {repository_uri}.")
            self.logging.error(f"Error: {str(e)}")
            raise RuntimeError("Couldn't push to Docker registry")

🛠️ Refactor suggestion

Implement proper exception chaining in error handling.

The error handling is good, but the exception should be chained to preserve the original error context.

        try:
            self.docker_client.login(username=username, password=password, registry=registry_url)
            super().push_image(repository_uri, image_tag)
            self.logging.info(f"Successfully pushed the image to registry {repository_uri}.")
        except docker.errors.APIError as e:
            self.logging.error(f"Failed to push the image to registry {repository_uri}.")
            self.logging.error(f"Error: {str(e)}")
-           raise RuntimeError("Couldn't push to Docker registry")
+           raise RuntimeError("Couldn't push to Docker registry") from e
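
The effect of `from e` can be seen in a small standalone sketch. `PushError` and `push` below are stand-ins invented for this example (in the reviewed code the caught exception is `docker.errors.APIError`).

```python
class PushError(Exception):
    """Stand-in for docker.errors.APIError in this sketch."""


def push(fail: bool) -> None:
    """Pretend to push an image; fail on request."""
    if fail:
        raise PushError("denied: not authorized")


def push_image(fail: bool) -> None:
    try:
        push(fail)
    except PushError as e:
        # "from e" stores the original exception on __cause__, so the
        # traceback shows both errors and marks the first as the direct cause.
        raise RuntimeError("Couldn't push to Docker registry") from e


try:
    push_image(fail=True)
except RuntimeError as err:
    print(type(err.__cause__).__name__, "-", err.__cause__)
```

Callers that catch the `RuntimeError` can still reach the underlying registry error through `err.__cause__`.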
🧰 Tools
🪛 Ruff

79-79: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

Comment on lines +53 to +58
    def benchmark_base_images(
        self, deployment_name: str, language_name: str, architecture: str
    ) -> Dict[str, str]:
        return self._system_config[deployment_name]["languages"][language_name]["base_images"][
            architecture
        ]

🛠️ Refactor suggestion

Add parameter validation for benchmark_base_images.

Similar to supported_language_versions, this method should validate its parameters.

Consider this implementation:

 def benchmark_base_images(
     self, deployment_name: str, language_name: str, architecture: str
 ) -> Dict[str, str]:
+    if deployment_name not in self._system_config:
+        raise ValueError(f"Unknown deployment: {deployment_name}")
+    if language_name not in self._system_config[deployment_name]["languages"]:
+        raise ValueError(f"Unsupported language: {language_name}")
+    if architecture not in self.supported_architecture(deployment_name):
+        raise ValueError(f"Unsupported architecture: {architecture}")
     return self._system_config[deployment_name]["languages"][language_name]["base_images"][
         architecture
     ]

Committable suggestion skipped: line range outside the PR's diff.
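
A standalone sketch of the same validation, written against a made-up configuration fragment rather than the real config/systems.json. It checks the architecture against the keys of `base_images` for the chosen language, on the assumption that a platform may list an architecture for which a given language has no base images.

```python
from typing import Dict

# Made-up configuration fragment that mirrors the nesting used above.
SYSTEM_CONFIG = {
    "aws": {
        "architecture": ["x64", "arm64"],
        "languages": {
            "python": {"base_images": {"x64": {"3.9": "example/image:3.9"}}}
        },
    }
}


def benchmark_base_images(
    deployment_name: str, language_name: str, architecture: str
) -> Dict[str, str]:
    """Look up base images, raising ValueError instead of a bare KeyError."""
    if deployment_name not in SYSTEM_CONFIG:
        raise ValueError(f"Unknown deployment: {deployment_name}")
    languages = SYSTEM_CONFIG[deployment_name]["languages"]
    if language_name not in languages:
        raise ValueError(f"Unsupported language: {language_name}")
    base_images = languages[language_name]["base_images"]
    if architecture not in base_images:
        raise ValueError(f"Unsupported architecture: {architecture}")
    return base_images[architecture]
```

In this fragment, `arm64` is listed for the platform but has no Python base images, so the lookup fails with a clear message rather than a `KeyError` deep in the dictionary access.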

Comment on lines +47 to +52
    def supported_package_deployment(self, deployment_name: str) -> bool:
        return "package" in self._system_config[deployment_name]["deployments"]

    def supported_container_deployment(self, deployment_name: str) -> bool:
        return "container" in self._system_config[deployment_name]["deployments"]

🛠️ Refactor suggestion

Add deployment name validation.

Both methods should validate the deployment_name parameter to prevent KeyError exceptions.

Consider this implementation for both methods:

 def supported_package_deployment(self, deployment_name: str) -> bool:
+    if deployment_name not in self._system_config:
+        raise ValueError(f"Unknown deployment: {deployment_name}")
     return "package" in self._system_config[deployment_name]["deployments"]

 def supported_container_deployment(self, deployment_name: str) -> bool:
+    if deployment_name not in self._system_config:
+        raise ValueError(f"Unknown deployment: {deployment_name}")
     return "container" in self._system_config[deployment_name]["deployments"]

Comment on lines +37 to +41
    def supported_language_versions(
        self, deployment_name: str, language_name: str, architecture: str
    ) -> List[str]:
        return self._system_config[deployment_name]["languages"][language_name]["base_images"][
            architecture

🛠️ Refactor suggestion

Add architecture parameter validation and error handling.

The method should validate the architecture parameter and handle missing architectures gracefully to prevent runtime errors.

Consider this implementation:

 def supported_language_versions(
     self, deployment_name: str, language_name: str, architecture: str
 ) -> List[str]:
+    if architecture not in self.supported_architecture(deployment_name):
+        raise ValueError(f"Unsupported architecture: {architecture}")
     return self._system_config[deployment_name]["languages"][language_name]["base_images"][
         architecture
     ].keys()

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +70 to +72
        tag = self.benchmark_image_tag(
            system, benchmark, language_name, language_version, architecture
        )

💡 Codebase verification

Based on the investigation of the codebase, I can now provide the final response:

Parameter validation is needed, but tag format is Docker-compliant

The current implementation of benchmark_image_tag follows Docker's tag naming conventions correctly:

  • Uses only allowed characters (alphanumeric, dots, dashes)
  • Has a consistent format: function.<system>.<benchmark>.<language>-<version>-<architecture>[-prefix]

However, there's no parameter validation, which could lead to runtime issues. The suggested improvements in the review comment are valid and would enhance code reliability.

  • Add parameter validation in benchmark_image_tag method
  • Add docstring to document the tag format
  • Consider adding test cases for the validation
🔗 Analysis chain

Verify image tag format and add parameter validation.

The tag format looks good, but consider adding parameter validation and documenting the format.

Let's verify the tag format meets Docker requirements:

Consider adding validation and documentation:

 def benchmark_image_tag(
     self,
     system: str,
     benchmark: str,
     language_name: str,
     language_version: str,
     architecture: str,
 ) -> str:
+    """Generate a Docker image tag following the format: function.<system>.<benchmark>.<language>-<version>-<arch>[-prefix]
+    
+    Args:
+        system: The deployment system (e.g., 'aws')
+        benchmark: The benchmark name
+        language_name: The programming language
+        language_version: The language version
+        architecture: The target architecture
+    
+    Returns:
+        str: A valid Docker image tag
+    """
+    if not all([system, benchmark, language_name, language_version, architecture]):
+        raise ValueError("All parameters must be non-empty strings")
     tag = f"function.{system}.{benchmark}.{language_name}-{language_version}-{architecture}"
     if self.image_tag_prefix:
         tag = f"{tag}-{self.image_tag_prefix}"
     return tag

Also applies to: 87-87
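
Going one step beyond the non-empty check, the assembled tag can also be tested against Docker's tag rules (at most 128 characters from letters, digits, underscores, periods and dashes, not starting with a period or dash). The sketch below is a standalone function variant written for illustration; the `image_tag_prefix` parameter replaces the attribute used in the method above.

```python
import re

# Docker tag rules: up to 128 characters from [A-Za-z0-9_.-],
# and the first character may not be a period or a dash.
_TAG_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}")


def benchmark_image_tag(
    system: str,
    benchmark: str,
    language_name: str,
    language_version: str,
    architecture: str,
    image_tag_prefix: str = "",
) -> str:
    """Build function.<system>.<benchmark>.<language>-<version>-<arch>[-prefix]."""
    if not all([system, benchmark, language_name, language_version, architecture]):
        raise ValueError("All parameters must be non-empty strings")
    tag = f"function.{system}.{benchmark}.{language_name}-{language_version}-{architecture}"
    if image_tag_prefix:
        tag = f"{tag}-{image_tag_prefix}"
    # Reject tags Docker would refuse, e.g. ones containing spaces or slashes.
    if _TAG_RE.fullmatch(tag) is None:
        raise ValueError(f"Invalid Docker image tag: {tag!r}")
    return tag
```

Validating the final string rather than each part keeps the rule in one place and also catches tags that grow past the length limit once a prefix is appended.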


Let me gather more information about the image tag format by checking the actual implementation and test cases.


Let me check for any test cases that might validate the image tag format and any existing validation in the config class.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify image tag format in the codebase
# Test: Search for any potential invalid characters in image tags
rg -g '*.py' 'tag\s*=.*[^a-zA-Z0-9_.-]'

Length of output: 998


Script:

#!/bin/bash
# Check the implementation of benchmark_image_tag and its test cases
rg -g '*.py' -A 10 'def benchmark_image_tag'

# Check for any existing parameter validation
rg -g '*.py' 'raise ValueError.*tag'

# Look for test cases related to image tags
rg -g '*test*.py' 'test.*image.*tag'

Length of output: 659


Script:

#!/bin/bash
# Check for test files related to config
fd test_config.py

# Check the content of test files for image tag testing
rg -g '*test*.py' -B 2 -A 5 'benchmark_image_tag'

# Check for any validation in Config class
rg -g '*.py' -B 2 -A 5 'class Config'

Length of output: 820

Comment on lines +180 to +171
architecture: str,
benchmark: str,
is_cached: bool,
) -> Tuple[str, int]:
container_deployment: bool,
) -> Tuple[str, int, str]:

⚠️ Potential issue

Fix return type annotation inconsistency.

The return type annotation doesn't match the docstring. The tuple should include the container URI as the fourth element.

-    ) -> Tuple[str, int, str]:
+    ) -> Tuple[str, int, bool, str]:

Runtime="{}{}".format(
create_function_params["PackageType"] = "Zip"
if code_size < 50 * 1024 * 1024:
package_body = open(package, "rb").read()

⚠️ Potential issue

Use context manager for file operations.

Use a context manager to ensure proper file cleanup.

-package_body = open(package, "rb").read()
+with open(package, "rb") as f:
+    package_body = f.read()
🧰 Tools
🪛 Ruff

253-253: Use a context manager for opening files

(SIM115)

Comment on lines +333 to +519
if container_deployment:
self.client.update_function_code(FunctionName=name, ImageUri=container_uri)
else:
code_size = code_package.code_size
package = code_package.code_location
benchmark = code_package.benchmark

function_cfg = FunctionConfig.from_benchmark(code_package)
architecture = function_cfg.architecture.value

# Run AWS update
# AWS Lambda limit on zip deployment
if code_size < 50 * 1024 * 1024:
with open(package, "rb") as code_body:
self.client.update_function_code(
FunctionName=name,
ZipFile=code_body.read(),
Architectures=[self._map_architecture(architecture)],
)
# Upload code package to S3, then update
else:
code_package_name = os.path.basename(package)

storage = cast(S3, self.get_storage())
bucket = function.code_bucket(code_package.benchmark, storage)
code_prefix = os.path.join(benchmark, architecture, code_package_name)
storage.upload(bucket, package, code_prefix)

# Run AWS update
# AWS Lambda limit on zip deployment
if code_size < 50 * 1024 * 1024:
with open(package, "rb") as code_body:
self.client.update_function_code(
FunctionName=name,
ZipFile=code_body.read(),
S3Bucket=bucket,
S3Key=code_prefix,
Architectures=[self._map_architecture(architecture)],
)
# Upload code package to S3, then update
else:
code_package_name = os.path.basename(package)

storage = cast(S3, self.get_storage())
bucket = function.code_bucket(code_package.benchmark, storage)
code_prefix = os.path.join(benchmark, architecture, code_package_name)
storage.upload(bucket, package, code_prefix)

self.client.update_function_code(
FunctionName=name,
S3Bucket=bucket,
S3Key=code_prefix,
Architectures=[self._map_architecture(architecture)],
)

🛠️ Refactor suggestion

Reduce code duplication in deployment logic.

The zip deployment logic is duplicated between create_function and update_function. Consider extracting common code into a shared helper method.

+def _handle_zip_deployment(
+    self, package: str, code_size: int, benchmark: str, architecture: str
+) -> Tuple[Optional[str], Optional[str], Optional[str]]:
+    if code_size < 50 * 1024 * 1024:
+        return None, None, None
+    code_package_name = os.path.basename(package)
+    storage = cast(S3, self.get_storage())
+    bucket = storage.get_bucket(Resources.StorageBucketType.DEPLOYMENT)
+    code_prefix = os.path.join(benchmark, architecture, code_package_name)
+    storage.upload(bucket, package, code_prefix)
+    return bucket, code_prefix, code_package_name

 def update_function(self, function: Function, code_package: Benchmark,
                    container_deployment: bool, container_uri: str):
     # ... existing code ...
     if container_deployment:
         self.client.update_function_code(FunctionName=name, ImageUri=container_uri)
     else:
-        # ... existing zip handling code ...
+        bucket, code_prefix, _ = self._handle_zip_deployment(
+            package, code_size, benchmark, architecture
+        )
+        if bucket is None:
+            with open(package, "rb") as code_body:
+                self.client.update_function_code(
+                    FunctionName=name,
+                    ZipFile=code_body.read(),
+                    Architectures=[self._map_architecture(architecture)],
+                )
+        else:
+            self.client.update_function_code(
+                FunctionName=name,
+                S3Bucket=bucket,
+                S3Key=code_prefix,
+                Architectures=[self._map_architecture(architecture)],
+            )

Committable suggestion skipped: line range outside the PR's diff.
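
The branching that the suggested helper centralizes can be sketched in isolation. This is a minimal illustration, not the project's code: `build_update_kwargs` and `DIRECT_UPLOAD_LIMIT` are hypothetical names, and the 50 MB threshold simply mirrors the check in the suggestion above.

```python
from typing import Any, Dict, Optional

# Hypothetical constant mirroring the 50 MB check in the suggestion above.
DIRECT_UPLOAD_LIMIT = 50 * 1024 * 1024


def build_update_kwargs(
    name: str,
    architecture: str,
    code_size: int,
    zip_bytes: Optional[bytes] = None,
    bucket: Optional[str] = None,
    key: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a code update (illustrative only)."""
    kwargs: Dict[str, Any] = {"FunctionName": name, "Architectures": [architecture]}
    if code_size < DIRECT_UPLOAD_LIMIT:
        # Small package: send the archive bytes inline.
        if zip_bytes is None:
            raise ValueError("zip_bytes is required for a direct upload")
        kwargs["ZipFile"] = zip_bytes
    else:
        # Large package: reference an object already uploaded to S3.
        if bucket is None or key is None:
            raise ValueError("bucket and key are required for an S3 upload")
        kwargs["S3Bucket"] = bucket
        kwargs["S3Key"] = key
    return kwargs


small = build_update_kwargs("fn", "x86_64", 1024, zip_bytes=b"PK")
large = build_update_kwargs("fn", "arm64", 60 * 1024 * 1024, bucket="b", key="k.zip")
```

Under these assumptions, both create and update paths could build the arguments once and pass them on, for example as `client.update_function_code(**kwargs)`.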

Comment on lines +146 to +160
# if the containerized deployment is set to True
if container_deployment:
    # build base image and upload to ECR
    _, container_uri = self.ecr_client.build_base_image(
        directory, language_name, language_version, architecture, benchmark, is_cached
    )
💡 Codebase verification

🛠️ Refactor suggestion

Container-related operations in container.py and other AWS services handle errors consistently, but the build_base_image call here is not wrapped in any error handling. This confirms the original review comment's concern.

Add error handling for container image building

The container image building process should follow the established error handling patterns found in the codebase, particularly since other ECR and container operations already implement proper error handling (as seen in container.py with docker.errors.APIError handling).

 if container_deployment:
-    # build base image and upload to ECR
-    _, container_uri = self.ecr_client.build_base_image(
-        directory, language_name, language_version, architecture, benchmark, is_cached
-    )
+    try:
+        # build base image and upload to ECR
+        _, container_uri = self.ecr_client.build_base_image(
+            directory, language_name, language_version, architecture, benchmark, is_cached
+        )
+    except Exception as e:
+        self.logging.error(f"Failed to build container image: {e}")
+        raise RuntimeError("Container image build failed") from e
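
The log-and-reraise pattern in this suggestion can be tried on its own. The sketch below is only an illustration: `build_image` is a hypothetical stand-in for the real build step, and the image name is made up.

```python
import logging

logger = logging.getLogger("deploy")


def build_image(should_fail: bool) -> str:
    """Hypothetical stand-in for the image build step."""
    if should_fail:
        raise OSError("registry unreachable")
    return "example-registry/benchmark:latest"


def build_or_raise(should_fail: bool) -> str:
    try:
        return build_image(should_fail)
    except Exception as e:
        # Log the low-level failure, then raise a domain-level error.
        logger.error("Failed to build container image: %s", e)
        # "from e" keeps the original exception available as __cause__.
        raise RuntimeError("Container image build failed") from e
```

Chaining with `from e` means the caller sees one clear failure message while the traceback still shows the underlying cause.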
🔗 Analysis chain

Add error handling for container image building.

The container image building process could fail for various reasons (network issues, invalid Dockerfile, etc.). Consider adding error handling to gracefully handle failures.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there are any error handling patterns in the codebase
rg -A 2 "try:.*?(build|push).*?except" 

Length of output: 40


Script:

#!/bin/bash
# Let's check for any existing error handling patterns in the ECR client and related files
rg -A 5 "try:" sebs/aws/

# Also check for any error handling in the aws.py file specifically
rg -A 5 "except" sebs/aws/aws.py

# Look for the ECR client implementation to understand current error handling
ast-grep --pattern 'class ECRClient { $$$ }'

Length of output: 5853


Script:

#!/bin/bash
# Let's check the ECR client's build_base_image method implementation
ast-grep --pattern 'build_base_image($$$) { $$$ }'

# Also check for any existing error handling in container.py since it seems to handle ECR operations
rg -A 10 "build_base_image" sebs/aws/

Length of output: 770

Comment on lines +382 to +383
bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:None]
test_name = f"{bench}, {arch}, {deployment_type}"
⚠️ Potential issue

Simplify Slice Notation in Test ID Parsing

The current code:

bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:None]

uses redundant slice notation. In Python, a stop index of None means "through the end of the sequence", so [-3:None] is valid and returns the same last three elements as [-3:]. The explicit None adds nothing and obscures the intent; the idiomatic form is [-3:].

Apply this diff to simplify the slice notation:

-bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:None]
+bench, arch, deployment_type = kwargs["test_id"].split("_")[-3:]
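
How the two slice forms behave can be checked directly. In this small sketch the `parse_test_id` helper and the test ID string are hypothetical; only the slicing expressions come from the code under review.

```python
from typing import Tuple


def parse_test_id(test_id: str) -> Tuple[str, str, str]:
    """Return the last three underscore-separated fields of a test ID."""
    parts = test_id.split("_")
    # Compare the explicit-None stop with the omitted stop on the same list.
    assert parts[-3:None] == parts[-3:]
    bench, arch, deployment_type = parts[-3:]
    return bench, arch, deployment_type


# Hypothetical test ID; the real format may differ.
result = parse_test_id("suite_110.dynamic-html_x64_container")
```

Note that unpacking into three names raises a ValueError if the ID has fewer than three fields, whichever slice form is used.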

@mcopik mcopik force-pushed the container-support branch from 18cc75b to 9df1aa0 Compare November 7, 2024 01:38
@mcopik mcopik merged commit bf5bc35 into master Nov 7, 2024
1 check was pending
@mcopik mcopik deleted the container-support branch November 7, 2024 01:42
@coderabbitai coderabbitai bot mentioned this pull request Mar 26, 2025
2 tasks
2 participants