Skip to content

Conversation

@CTY-git
Copy link
Contributor

@CTY-git CTY-git commented Mar 5, 2025

Adds GithubAgent step which uses AgenticLLMV2.
AgenticLLMV2 now uses our AioLlmClient which allows its model to be changed.
Add logging to tools.

PR Checklist

  • The commit message follows our guidelines: Code of conduct
  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • Does this PR introduce a breaking change?
  • Include PR in release notes?

PR Type

  • Bugfix
  • Feature
  • Refactoring
  • Build /CI
  • Documentation
  • Others

What is the current behavior?

Issue Number: N/A

What is the new behavior?

Other information

@CTY-git CTY-git requested a review from whoisarpit March 5, 2025 02:39
@CTY-git CTY-git changed the title Add github agent step; Allow AgenticLLMV2 to select model; Log all to… Add GithubAgent step Mar 5, 2025
from patchwork.common.tools.tool import Tool


class GitHubTool(Tool, tool_name="github_tool"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

abc_register=False?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its alright since Tool.get_tools will exclude it when gh_token is not provided

@patched-admin
Copy link
Contributor

File Changed: patchwork/common/client/llm/utils.py

Rule 1: Potential Bugs
Details: The code introduces a potential bug by not setting a fixed seed for the random number generator. This could lead to unpredictable model names across different runs of the same code, potentially causing issues with caching or serialization mechanisms that rely on consistent type names.

Affected Code Snippet:

random_suffix = "".join(random.choice(string.ascii_lowercase) for _ in range(4))
return create_model(f"ResponseFormat_{random_suffix}", **base_model_field_defs)

Start Line: 119
End Line: 120


Rule 3: Coding Standards
Details: The modification appears to deviate from the original coding standards by introducing randomness in type naming. This could make debugging and testing more difficult as the model names will be different in each execution. The original code used a consistent, predictable naming scheme which is generally preferred for type definitions.

Affected Code Snippet:

# Original:
return create_model("ResponseFormat", **base_model_field_defs)

# Modified:
random_suffix = "".join(random.choice(string.ascii_lowercase) for _ in range(4))
return create_model(f"ResponseFormat_{random_suffix}", **base_model_field_defs)

Start Line: 119
End Line: 120

File Changed: patchwork/common/multiturn_strategy/agentic_strategy_v2.py

Rule 1: Do not ignore potential bugs in the code

Details: Potential bug identified in model_post_init method where json parsing could fail silently

Affected Code Snippet:

def model_post_init(self, __context: Any) -> None:
    if self.example_json == DEFAULT_AGENT_EXAMPLE_JSON:
        return

    wanted = json.loads(self.example_json)
    default_wanted = json.loads(DEFAULT_AGENT_EXAMPLE_JSON)
    default_wanted.update(wanted)
    self.example_json = json.dumps(default_wanted)

Start Line: 32
End Line: 39

Details: The execute method lacks proper error handling for loop.run_until_complete

Affected Code Snippet:

agent_output: AgentRunResult[Any] = loop.run_until_complete(
    agent.run(user_message, message_history=message_history)
)

Start Line: 106
End Line: 108

Rule 2: Do not overlook possible security vulnerabilities

Details: Potential security vulnerability in mustache_render with untrusted template_data

Affected Code Snippet:

system_prompt=mustache_render(system_prompt_template, self.__template_data)

Start Line: 59
End Line: 59

Details: Same vulnerability in agent configuration initialization

Affected Code Snippet:

system_prompt=mustache_render(agent_config.system_prompt, self.__template_data)

Start Line: 73
End Line: 73

File Changed: patchwork/common/tools/github_tool.py

Rule 1: Do not ignore potential bugs in the code

Details: The code has a potential bug in error handling. The subprocess.run() call doesn't handle exceptions and doesn't check the return code. Failed commands will still return stdout without any error indication.

Affected Code Snippet:

p = subprocess.run(
    ["gh", *args],
    env=env,
    cwd=self.path,
    text=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)
return p.stdout

Start Line: 37

End Line: 46


Rule 2: Do not overlook possible security vulnerabilities

Details: The code has multiple security concerns:

  1. Direct command execution with user-provided arguments without sanitization
  2. No validation of the path parameter which could lead to path traversal
  3. Environment variables are copied without filtering sensitive information

Affected Code Snippet:

def execute(self, args: list[str]) -> str:
    env = os.environ.copy()
    env["GH_TOKEN"] = self.gh_token
    p = subprocess.run(
        ["gh", *args],
        env=env,
        cwd=self.path,
        text=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    return p.stdout

Start Line: 35

End Line: 46

File Changed: patchwork/common/tools/grep_tool.py

Rule 1: Do not ignore potential bugs in the code

Details: The change increases the character limit from 200 to 400 which could potentially lead to memory management issues if not properly handled by the rest of the codebase. However, without additional context about the implementation of the FindTextTool class and its usage, this cannot be definitively classified as a bug.

Affected Code Snippet:

class FindTextTool(Tool, tool_name="find_text"):
    __CHAR_LIMIT = 200  # Changed to 400
    __CHAR_LIMIT_TEXT = "<Too many characters>"

Start Line: 105
End Line: 106


Rule 2: Do not overlook possible security vulnerabilities

Details: Increasing the character limit from 200 to 400 could potentially expose the system to denial of service attacks if the tool is used in a public-facing interface, as it allows for larger text processing. However, since this is a private class variable (denoted by double underscore), the risk is minimal and depends on how the tool is exposed to external input.

Affected Code Snippet:

class FindTextTool(Tool, tool_name="find_text"):
    __CHAR_LIMIT = 200  # Changed to 400
    __CHAR_LIMIT_TEXT = "<Too many characters>"

Start Line: 105
End Line: 106

File Changed: patchwork/common/tools/tool.py

Rule 1: Do not ignore potential bugs in the code

Details: Potential bug found in __execute_logging_wrapper. The function doesn't handle async/await patterns which could break async tool execution. Additionally, the arg_text concatenation logic could lead to malformed output when both args and kwargs are present.

Affected Code Snippet:

@staticmethod
def __execute_logging_wrapper(func):
    @functools.wraps(func)
    def execute_logging_wrapper(self, *args, **kwargs):
        arg_text = ""
        if len(args) > 0:
            arg_text += f"args: {args}"
        if len(kwargs) > 0:
            arg_text += f"kwargs: {kwargs}"

        logger.info(f"Executing Tool: {self.name} with {arg_text}")
        return func(self, *args, **kwargs)

    return execute_logging_wrapper

Start Line: 66
End Line: 79


Rule 2: Do not overlook possible security vulnerabilities

Details: Security vulnerability identified in logging implementation. The code logs all arguments and keyword arguments without sanitization, which could potentially expose sensitive information in logs.

Affected Code Snippet:

logger.info(f"Executing Tool: {self.name} with {arg_text}")

Start Line: 75
End Line: 75

File Changed: patchwork/logger.py

Rule 1: Do not ignore potential bugs in the code

Details: Potential bug detected in race condition handling. The code introduces a global state modification through evict_null_handler() which could lead to threading issues if the logger is accessed concurrently.

Affected Code Snippet:

def evict_null_handler():
    global logger, __noop

    warnings.simplefilter("ignore")
    logger.removeHandler(__noop)

Start Line: 34
End Line: 38


Rule 2: Do not overlook possible security vulnerabilities

Details: Potential security vulnerability in directory creation. The code creates directories without checking for directory traversal attacks or proper permissions in the HOME_FOLDER path.

Affected Code Snippet:

if not os.path.exists(HOME_FOLDER):  # Check if HOME_FOLDER exists at this point
    os.makedirs(HOME_FOLDER)

Start Line: 151
End Line: 152

File Changed: patchwork/steps/AgenticLLMV2/AgenticLLMV2.py

Rule 1: Do not ignore potential bugs in the code

Details: Potential bug identified. The code removes the api_key parameter but doesn't ensure the inputs dictionary contains necessary credentials for AioLlmClient.create_aio_client(). This could lead to runtime errors if credentials are not properly passed.

Affected Code Snippet:

self.agentic_strategy = AgenticStrategyV2(
    model="claude-3-7-sonnet-latest",
    llm_client=AioLlmClient.create_aio_client(inputs),
    template_data=inputs.get("prompt_value", {}),
    system_prompt_template=inputs.get("system_prompt", "Summarise from our previous conversation"),
    user_prompt_template=inputs.get("user_prompt"),

Start Line: 19
End Line: 24


Rule 2: Do not overlook possible security vulnerabilities

Details: Security concern identified. The code passes the entire inputs dictionary to create_aio_client() without sanitization. This could potentially expose sensitive information if the inputs contain more data than needed for client creation.

Affected Code Snippet:

llm_client=AioLlmClient.create_aio_client(inputs),

Start Line: 21
End Line: 21

File Changed: patchwork/steps/BrowserUse/BrowserUse.py

Rule 2: Do not overlook possible security vulnerabilities

Details: The code contains a potential security vulnerability with disable_security=True in the browser configuration. While this was present in the original code and not introduced by this change, it's worth noting as it could expose the application to security risks by disabling browser security features.

Affected Code Snippet:

config = BrowserConfig(headless=True, disable_security=True, new_context_config=context_config)

Start Line: 43

End Line: 43


Rule 3: Do not deviate from original coding standards

Details: The changes actually deviate from the apparent coding standards in the original code. The original format used multi-line formatting for long function calls and class instantiations, which improves readability. The changes compress these into single lines, making them potentially harder to read and maintain.

Affected Code Snippet:

config = BrowserConfig(headless=True, disable_security=True, new_context_config=context_config)
self.llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", google_api_key=self.inputs["google_api_key"])

Start Line: 43, 144

End Line: 43, 144

These should maintain the original multi-line format for consistency and readability:

config = BrowserConfig(
    headless=True,
    disable_security=True,
    new_context_config=context_config
)
self.llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    google_api_key=self.inputs["google_api_key"]
)

File Changed: patchwork/steps/BrowserUse/typed.py

Rule 3: Do not deviate from original coding standards

Details: Minor formatting deviation detected. The original code used multi-line formatting for type annotations, which appears to be the established standard based on the surrounding code. The modification changes this to a single line, breaking consistency with other similar annotations in the file.

Affected Code Snippet:

anthropic_api_key: Annotated[str, StepTypeConfig(is_config=True, or_op=["google_api_key", "openai_api_key"])]

Start Line: 13

End Line: 13

Recommendation: Maintain consistent multi-line formatting for type annotations to match the established code style:

anthropic_api_key: Annotated[
    str, StepTypeConfig(is_config=True, or_op=["google_api_key", "openai_api_key"])
]

File Changed: patchwork/steps/GitHubAgent/GitHubAgent.py

Rule 1: Do not ignore potential bugs in the code

Details: The code contains potential bugs related to error handling and input validation:

  1. No validation for required "task" and "github_api_token" inputs
  2. No error handling for AioLlmClient creation
  3. No validation of execution result before returning

Affected Code Snippet:

def __init__(self, inputs):
    super().__init__(inputs)
    base_path = inputs.get("base_path", str(Path.cwd()))
    task = inputs["task"]  # Could raise KeyError
    self.agentic_strategy = AgenticStrategyV2(
        model="claude-3-7-sonnet-latest",
        llm_client=AioLlmClient.create_aio_client(inputs),  # No error handling
        ...
    )

def run(self) -> dict:
    result = self.agentic_strategy.execute(limit=10)  # No error handling
    return {**result, **self.agentic_strategy.usage()}  # No result validation

Start Line: 14

End Line: 46


Rule 2: Do not overlook possible security vulnerabilities

Details: Several security concerns are present:

  1. GitHub API token is passed directly without sanitization
  2. User input (task) is directly interpolated into prompt template
  3. No rate limiting protection for API calls

Affected Code Snippet:

task = inputs["task"]
self.agentic_strategy = AgenticStrategyV2(
    ...
    user_prompt_template=f"""\
Please help me with the following task using the GitHub CLI. You should not do anything extra.
Please take note of any requirements to the data required to fetch.

{task}
""",
    agent_configs=[
        AgentConfig(
            name="Assistant",
            tool_set=dict(github_tool=GitHubTool(base_path, inputs["github_api_token"])),
            ...
        )
    ],

Start Line: 17

End Line: 37

File Changed: patchwork/steps/GitHubAgent/typed.py

Rule 1: Do not ignore potential bugs in the code

Details: The code introduces a potential bug in the TypedDict definition where total=False is used. While this makes all fields optional, it could lead to runtime errors since some fields like max_llm_calls and API keys might be required for the agent to function properly. Consider separating required and optional fields.

Affected Code Snippet:

class GitHubAgentInputs(TypedDict, total=False):
    base_path: str
    prompt_value: Dict[str, Any]
    system_prompt: str
    user_prompt: str
    max_llm_calls: Annotated[int, StepTypeConfig(is_config=True)]

Start Line: 6
End Line: 11


Rule 2: Do not overlook possible security vulnerabilities

Details: The code handles sensitive API keys (openai_api_key, anthropic_api_key, google_api_key) as plain string fields. While typing is correct, there's no indication of secure handling or encryption of these sensitive credentials. Consider adding type hints that enforce secure string handling.

Affected Code Snippet:

    openai_api_key: Annotated[
        str, StepTypeConfig(is_config=True, or_op=["patched_api_key", "google_api_key", "anthropic_api_key"])
    ]
    anthropic_api_key: Annotated[
        str, StepTypeConfig(is_config=True, or_op=["patched_api_key", "google_api_key", "openai_api_key"])
    ]
    google_api_key: Annotated[
        str, StepTypeConfig(is_config=True, or_op=["patched_api_key", "openai_api_key", "anthropic_api_key"])
    ]

Start Line: 11
End Line: 19

File Changed: patchwork/steps/__init__.py

Rule 2: Do not overlook possible security vulnerabilities

Details: The addition of GitHubAgent and BrowserUse modules could introduce security vulnerabilities if not properly configured. These modules typically interact with external systems and may require careful security review.

Affected Code Snippet:

from patchwork.steps.BrowserUse.BrowserUse import BrowserUse
from patchwork.steps.GitHubAgent.GitHubAgent import GitHubAgent

Start Line: 4, 33
End Line: 4, 33

File Changed: pyproject.toml

Rule 1: Do not ignore potential bugs in the code

Details: Potential version compatibility issue detected. The update to pydantic-ai from ^0.0.23 to ^0.0.32 and anthropic from ^0.45.2 to ^0.49.0 represents significant version jumps in development-stage packages (indicated by 0.x.x versioning). This could introduce breaking changes or unstable behavior.

Affected Code Snippet:

pydantic-ai = "^0.0.23"
anthropic = "^0.45.2"

Start Line: 36, 47
End Line: 36, 47

@CTY-git CTY-git merged commit 767c1cd into main Mar 5, 2025
4 checks passed
@CTY-git CTY-git deleted the github-agent-experimental branch March 5, 2025 05:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants