base: main
Feature/nl parser improvements #293
Conversation
Walkthrough
A new natural language parsing module is introduced to convert user installation requests into structured intents, including normalization, spelling correction, rule-based and semantic intent detection, confidence scoring, slot extraction, and clarification suggestions, accompanied by comprehensive tests and documentation.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
Actionable comments posted: 15
🧹 Nitpick comments (13)
test/test_installation_history.py (1)
10-16: Confirm capitalized module name and consider PEP 8 alignment
Switching to `from Installation_history import ...` is fine if the module is actually named with a capital `I`. If the module is under your control, consider renaming it to a lowercase, underscore-separated name and adjusting imports for PEP 8 compliance; otherwise, just ensure this import matches the real filename in the package layout.
src/intent/detector.py (1)
6-10: Tighten public API typing/docstrings for Intent and detect()
The detection logic itself looks solid and matches the intended flows. To better align with the project's typing/docstring guidelines, consider:
- Making `details` more explicit than `Optional[dict]`, e.g. `Optional[dict[str, object]]` or a typed alias if you expect a fixed structure.
- Adding a short docstring to `Intent` (or at least to `detect`) describing expected input text and the shape/ordering of returned intents.
This will make the new intent surface clearer to downstream callers and type checkers. As per coding guidelines, docstrings and strong type hints are expected on public APIs.
Also applies to: 26-49
src/intent/context.py (1)
13-69: Add explicit `-> None` return types on mutating SessionContext methods
`SessionContext` behavior looks fine, but several public methods (`__init__`, `set_gpu`, `add_intents`, `add_installed`, `add_clarification`, `reset`) currently omit their `-> None` return annotations. Adding these will fully satisfy the "type hints required" guideline and make intent clearer to type checkers:
```diff
- def set_gpu(self, gpu_name: str):
+ def set_gpu(self, gpu_name: str) -> None:
@@
- def add_intents(self, intents: List[Intent]):
+ def add_intents(self, intents: List[Intent]) -> None:
@@
- def add_installed(self, pkg: str):
+ def add_installed(self, pkg: str) -> None:
@@
- def add_clarification(self, question: str):
+ def add_clarification(self, question: str) -> None:
@@
- def reset(self):
+ def reset(self) -> None:
```
(You can similarly annotate `__init__` if you want full consistency.) As per coding guidelines, type hints are required.
src/intent/planner.py (1)
6-49: Use `GPU_PACKAGES` in `has_gpu` and annotate it as `ClassVar`
Right now `GPU_PACKAGES` is never used, and `has_gpu` only checks for a `gpu` target, so a plan with only `cuda`/`pytorch` intents won't trigger GPU detection/config steps. You can both fix this and satisfy the Ruff ClassVar warning by:
```diff
-from typing import List
+from typing import List, ClassVar
@@
-    GPU_PACKAGES = ["cuda", "cudnn", "pytorch", "tensorflow"]
+    GPU_PACKAGES: ClassVar[List[str]] = ["cuda", "cudnn", "pytorch", "tensorflow"]
@@
-        # 1. If GPU-related intents exist → add GPU detection
-        has_gpu = any(i.target == "gpu" for i in intents)
+        # 1. If GPU-related intents exist → add GPU detection
+        has_gpu = any(
+            i.target == "gpu" or i.target in self.GPU_PACKAGES
+            for i in intents
+        )
```
This keeps the behavior aligned with the intent of "GPU-related intents" while cleaning up the static-analysis complaint.
src/test_intent_detection.py (1)
1-15: Tests cover core detector behavior; type hints optional for consistency
These tests nicely exercise the happy path and an empty-input case for `IntentDetector`. If you want full alignment with the "type hints required" policy, you could add return annotations like `def test_detector_basic() -> None`, but that's purely consistency polish on test code.
src/test_clarifier.py (1)
1-12: Clarifier integration test looks good; consider optional type annotations
The test correctly validates that a clarification question is produced for an underspecified ML request. As with the other tests, you could optionally add `-> None` on the test function to mirror the project's typing convention, but the current test is functionally solid.
src/test_llm_agent.py (2)
3-9: Consider using a factory method or instance attribute for mutable content.
The `content` class attribute on line 8 is a mutable list. While this works for the current test, it could cause unexpected behavior if multiple `Response` instances share state. Since this is a simple mock, it's low risk, but using an instance attribute or property would be safer.
```diff
 class MockMessages:
     def create(self, **kwargs):
         class Response:
             class Content:
                 text = "install: tensorflow\ninstall: jupyter"
-            content = [Content()]
+            @property
+            def content(self):
+                return [self.Content()]
         return Response()
```
15-28: Test file location inconsistent with project structure.
This test file is in `src/` while other NL parser tests are in `tests/test_nl_parser.py`. Consider moving to `tests/test_llm_agent.py` for consistency. Also, the test could benefit from additional assertions to verify the structure of the plan (e.g., that plan items are strings) and that suggestions is a list.
```diff
 assert "plan" in result
 assert len(result["plan"]) > 0
 assert "suggestions" in result
+assert isinstance(result["plan"], list)
+assert isinstance(result["suggestions"], list)
```
src/test_planner.py (1)
4-16: Test validates planner correctly; consider moving to `tests/` and using constants.
The test properly validates the CUDA/PyTorch/GPU pipeline. Assertions against hardcoded strings (e.g., `"Install CUDA 12.3 + drivers"`) may become brittle if planner output changes. Consider either:
- Moving to `tests/test_planner.py` for consistency
- Optionally defining expected plan steps as constants shared with the planner
Current test logic is correct.
src/intent/llm_agent.py (2)
67-70: Consider logging swallowed exceptions and using narrower exception types.
Bare `except Exception` catches can mask unexpected errors. While the fallback behavior is appropriate for resilience, logging the exception would aid debugging in production.
```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 # 4. Improve intents using LLM (safe)
 try:
     improved_intents = self.enhance_intents_with_llm(text, intents)
-except Exception:
+except Exception as e:
+    logger.warning("LLM enhancement failed, using rule-based intents: %s", e)
     improved_intents = intents
```
Also applies to: 79-82
100-106: LLM prompt may produce inconsistent output format.
The prompt at lines 96-106 asks for "install: package" format but doesn't strongly constrain the LLM output. Consider adding examples or using a more structured prompt to improve reliability of parsing.
```diff
 Return improvements or extra intents.
-Format: "install: package" or "configure: component"
+Format each intent on its own line using exactly one of:
+- install: <package_name>
+- configure: <component>
+- verify: <target>
+
+Example:
+install: tensorflow
+configure: gpu
 """
```
nl_parser.py (2)
142-142: Clarify deduplication idiom with a comment.
The `dict.fromkeys()` pattern for deduplication is clever but not immediately obvious to all readers. Apply this diff:
```diff
-    slots["packages"] = list(dict.fromkeys(pkgs))  # unique preserve order
+    # Deduplicate while preserving order (dict.fromkeys maintains insertion order in Python 3.7+)
+    slots["packages"] = list(dict.fromkeys(pkgs))
```
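For readers new to the idiom, a quick standalone illustration (not from the PR) of what ordered deduplication via `dict.fromkeys()` produces:
```python
# dict preserves insertion order (Python 3.7+), so this deduplicates while keeping order
pkgs = ["torch", "numpy", "torch", "pandas", "numpy"]
print(list(dict.fromkeys(pkgs)))  # ['torch', 'numpy', 'pandas']
```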
173-220: Consider adding input length validation.
The function processes arbitrary-length text without validation. While not likely to be a practical issue for typical use cases, extremely long inputs could cause performance degradation.
Consider adding a length check at the start of the function:
```diff
 def parse_request(text: str) -> Dict[str, Any]:
     """Main function used by tests and demo."""
+    # Prevent potential DoS from extremely long inputs
+    if len(text) > 10000:
+        return {
+            "intent": "unknown",
+            "confidence": 0.0,
+            "explanation": "Input too long",
+            "slots": {},
+            "corrections": [],
+            "clarifications": ["Please provide a shorter request (max 10000 characters)"],
+        }
+
     norm = normalize(text)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- .gitignore (0 hunks)
- nl_parser.py (1 hunks)
- requirements.txt (1 hunks)
- src/intent/clarifier.py (1 hunks)
- src/intent/context.py (1 hunks)
- src/intent/detector.py (1 hunks)
- src/intent/llm_agent.py (1 hunks)
- src/intent/planner.py (1 hunks)
- src/test_clarifier.py (1 hunks)
- src/test_context.py (1 hunks)
- src/test_intent_detection.py (1 hunks)
- src/test_llm_agent.py (1 hunks)
- src/test_planner.py (1 hunks)
- test/test_installation_history.py (1 hunks)
- tests/test_nl_parser.py (1 hunks)
💤 Files with no reviewable changes (1)
- .gitignore
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
src/intent/detector.py, src/intent/context.py, src/test_clarifier.py, tests/test_nl_parser.py, src/intent/planner.py, src/intent/llm_agent.py, src/test_llm_agent.py, src/test_intent_detection.py, src/intent/clarifier.py, test/test_installation_history.py, src/test_context.py, src/test_planner.py, nl_parser.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Maintain >80% test coverage for pull requests
Files:
tests/test_nl_parser.py
**/*install*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*install*.py: Dry-run by default for all installations in command execution
No silent sudo execution - require explicit user confirmation
Implement audit logging to ~/.cortex/history.db for all package operations
Files:
test/test_installation_history.py
🧠 Learnings (1)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*.py : Type hints required in Python code
Applied to files:
requirements.txt
🧬 Code graph analysis (2)
tests/test_nl_parser.py (1)
nl_parser.py (1)
parse_request(173-220)
src/intent/llm_agent.py (1)
src/intent/detector.py (2)
Intent (7-10), detect (26-49)
🪛 Ruff (0.14.8)
src/intent/planner.py
8-8: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
src/intent/llm_agent.py
69-69: Do not catch blind exception: Exception
(BLE001)
81-81: Do not catch blind exception: Exception
(BLE001)
src/test_llm_agent.py
4-4: Unused method argument: kwargs
(ARG002)
8-8: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
22-22: Unused lambda argument: a
(ARG005)
22-22: Unused lambda argument: k
(ARG005)
🔇 Additional comments (8)
requirements.txt (1)
12-12: Validate PyYAML pin against supported Python versions and security policy
Adding `PyYAML==6.0.3` looks fine; just confirm this exact pin is required (vs `>=`) for your NL parsing/config use cases and that it's compatible with your minimum Python version and current vulnerability guidance.
src/intent/clarifier.py (1)
6-33: Clarification logic is coherent and matches intended ambiguity handling
The branching in `needs_clarification` cleanly captures the key ambiguous cases (GPU presence, generic ML tools, CUDA without GPU mention, PyTorch version choice) with clear questions, and types/docstrings are in good shape. No changes needed here.
src/test_context.py (1)
4-13: Test logic is sound; consider moving to `tests/` directory.
The test correctly validates SessionContext's core functionality: GPU setting, intent addition, and installed package tracking. For consistency with `tests/test_nl_parser.py`, consider relocating this file to `tests/test_context.py`.
1-37: Comprehensive test coverage aligns with PR requirements.
The test suite covers:
- 12 parametrized intent cases (exceeds 10+ requirement)
- Typo tolerance (`"kubernets"`, `"pyhton"`)
- Slot extraction (python_version, platform)
- Ambiguity handling with clarification checks
- Confidence scoring (>= 0.5 threshold)
This satisfies the acceptance criteria from issue #248.
src/intent/llm_agent.py (1)
119-119: The `.lower()` call is appropriate and correctly designed for this codebase.
The downstream systems (detector.py and planner.py) consistently use lowercase for all internal package matching. The COMMON_PACKAGES dictionary in detector.py (line 17) and GPU_PACKAGES list in planner.py (line 8) define packages as lowercase: `"pytorch"`, `"tensorflow"`, `"cuda"`, etc. All package comparisons in planner.py (lines 29, 32) check against lowercase strings. The lowercasing at line 119 is intentional and necessary; user-facing display strings with proper casing (e.g., "Install PyTorch") are generated separately in the output messages.
60-67: LGTM! The spell correction logic is sound, with a reasonable fuzzy match cutoff of 0.75 to avoid over-correction.
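For context, a minimal sketch of the kind of `difflib.get_close_matches`-based correction step described here; the vocabulary is hypothetical, and only the 0.75 cutoff and the `(corrected_token, was_corrected)` return shape come from this review:
```python
from difflib import get_close_matches

# Hypothetical vocabulary for illustration; the real module maintains its own known-word list.
KNOWN_WORDS = ["kubernetes", "python", "docker", "tensorflow", "pytorch", "nginx"]

def correct_token(token: str) -> tuple[str, bool]:
    """Return (corrected_token, was_corrected) using a 0.75 fuzzy-match cutoff."""
    matches = get_close_matches(token, KNOWN_WORDS, n=1, cutoff=0.75)
    if matches and matches[0] != token:
        return matches[0], True
    return token, False

print(correct_token("kubernets"))  # ('kubernetes', True)
print(correct_token("install"))    # ('install', False) -- no close match above the cutoff
```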
85-97: LGTM! The semantic matching logic correctly finds the best intent by comparing against all examples. The O(n*m) complexity is acceptable given the small, fixed set of intents and examples.
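A rough sketch of that O(n*m) comparison, assuming a small example table (the intents and example phrases below are placeholders, not the module's real data):
```python
from difflib import SequenceMatcher

# Placeholder canonical examples; cortex/nl_parser.py defines its own intent/example table.
INTENT_EXAMPLES = {
    "install_docker": ["install docker", "set up docker containers"],
    "setup_python_env": ["set up a python environment", "create a python 3.11 venv"],
    "install_ml": ["install pytorch for machine learning", "set up tensorflow with gpu"],
}

def semantic_intent(text: str) -> tuple[str, float]:
    """Return the best-matching intent and its similarity score."""
    best_intent, best_score = "unknown", 0.0
    for intent, examples in INTENT_EXAMPLES.items():
        for example in examples:
            score = SequenceMatcher(None, text, example).ratio()
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent, best_score

print(semantic_intent("please set up docker containers"))  # ('install_docker', <score>)
```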
173-220: LGTM on the overall parsing logic! The main parsing workflow is well-structured: normalization → spell correction → dual intent detection (rule + semantic) → slot extraction → confidence aggregation → clarification decisions. The cascading intent selection logic (lines 184-195) appropriately prioritizes rule-based matches, validates with semantic agreement, and falls back gracefully to semantic-only or unknown.
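For reference, a condensed sketch of that cascade; the 0.95 consensus bonus matches the later review discussion, while the 0.6 semantic fallback cutoff is an assumption:
```python
def choose_intent(rule_int: str, sem_int: str, c_sem: float) -> tuple[str, float]:
    """Return (chosen_intent, c_classifier) following the cascade described above."""
    if rule_int != "unknown" and rule_int == sem_int:
        return rule_int, 0.95   # both methods agree -> consensus bonus on the classifier term
    if rule_int != "unknown":
        return rule_int, 0.0    # trust the rule-based match on its own
    if sem_int != "unknown" and c_sem >= 0.6:
        return sem_int, 0.0     # fall back to semantic-only when it is reasonably strong
    return "unknown", 0.0
```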
mikejmorgan-ai
left a comment
Code Review: Request Changes
Great work @pavanimanchala53! The NL parsing logic is solid and meets all acceptance criteria. A few structural fixes needed before merge:
Issues to Fix
1. Code Location (High Priority)
- `nl_parser.py` is in the root directory but should be in the `cortex/` package
- The `src/intent/` module creates confusion - Cortex uses `cortex/` not `src/`
- Please consolidate into `cortex/nl_parser.py` and `cortex/intent/` if needed
2. Tests Location (Medium)
- Tests in `src/test_*.py` should be in the `tests/` directory
- Move all test files to `tests/` to match project convention
3. Duplicate Dependency (Low)
- `requirements.txt` now has both `pyyaml>=6.0.0` and `PyYAML==6.0.3`
- Remove the duplicate line
4. Missing Documentation (Medium)
- Add `docs/NL_PARSER.md` with:
  - Feature overview
  - Usage examples
  - API reference
5. Missing Newline
- Add final newline to `tests/test_nl_parser.py`
What's Working Well ✅
- Typo correction (kubernets → kubernetes)
- Confidence scoring
- Clarification prompts for ambiguous requests
- 12+ test cases
- Clean intent detection logic
Once these structural fixes are done, this is ready to merge. Happy to help if you have questions!
Bounty: $75 upon merge (+ $75 bonus after funding)
Force-pushed from 5dabcd6 to 397ca8a
Actionable comments posted: 2
🧹 Nitpick comments (6)
cortex/nl_parser.py (1)
184-195: Intent selection logic may suppress valid semantic matches.
When `rule_int != "unknown"` (lines 187-189), the semantic intent is ignored even if `sem_int` has a higher confidence score. This could lead to suboptimal intent selection when rules produce lower-confidence matches than semantic matching. Consider incorporating semantic confidence into the decision:
```diff
-    elif rule_int != "unknown":
-        chosen_intent = rule_int
-        c_classifier = 0.0
+    elif rule_int != "unknown":
+        # Prefer semantic if it has significantly higher confidence
+        if c_sem > c_rule + 0.2 and sem_int != "unknown":
+            chosen_intent = sem_int
+            c_classifier = 0.0
+        else:
+            chosen_intent = rule_int
+            c_classifier = 0.0
```
tests/test_clarifier.py (1)
4-12: Expand test coverage for `Clarifier`.
The test only verifies that a clarification question is returned for one input. Consider adding:
- A test where clarification is NOT needed (high-confidence, unambiguous intent)
- A test validating the question content is relevant to GPU/ML context
```python
def test_clarifier_no_clarification_needed():
    """Test that specific requests don't trigger clarification."""
    d = IntentDetector()
    c = Clarifier()
    text = "install pytorch with CUDA 12.3"
    intents = d.detect(text)
    question = c.needs_clarification(intents, text)
    # Verify behavior for specific requests
    # (assertion depends on expected Clarifier logic)
```
13-16: Assertions are tightly coupled to exact output strings.
The assertions depend on exact wording like `"Install CUDA 12.3 + drivers"`. If the planner's output format changes slightly, these tests will fail even if the functionality is correct. Consider asserting on key characteristics rather than exact strings:
```python
# More resilient assertions
assert any("CUDA" in step for step in plan)
assert any("PyTorch" in step for step in plan)
assert any("GPU" in step and "environment" in step.lower() for step in plan)
assert "Verify" in plan[-1]
```
4-16: Add test cases for non-GPU and edge scenarios.
The test only covers the GPU/CUDA pipeline. Consider adding tests for:
- Non-GPU intents (e.g., web server installation)
- Empty intent list
- Mixed intents (some GPU, some not)
tests/test_llm_agent.py (1)
3-9: Mock response content is not verified in assertions.
The mock returns `"install: tensorflow\ninstall: jupyter"` but the test only verifies that `plan` exists and is non-empty. Consider asserting that the mocked intents actually influence the plan:
```python
# Verify mock intents are reflected in the plan
assert any("tensorflow" in step.lower() or "TensorFlow" in step for step in result["plan"])
```
4-13: Consider splitting into focused test cases and adding coverage for untested methods.
The single test combines GPU, intents, and installed tracking. Per test best practices, consider:
- Splitting into `test_context_gpu`, `test_context_intents`, `test_context_installed`
- Adding tests for `reset()`, `add_clarification()`, and `get_clarifications()` methods
```python
def test_context_reset():
    ctx = SessionContext()
    ctx.set_gpu("NVIDIA RTX 4090")
    ctx.add_installed("cuda")
    ctx.reset()
    assert ctx.get_gpu() is None
    assert ctx.is_installed("cuda") is False
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- cortex/nl_parser.py (1 hunks)
- docs/NL_PARSER.md (1 hunks)
- tests/test_clarifier.py (1 hunks)
- tests/test_context.py (1 hunks)
- tests/test_intent_detection.py (1 hunks)
- tests/test_llm_agent.py (1 hunks)
- tests/test_nl_parser.py (1 hunks)
- tests/test_planner.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/NL_PARSER.md
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_nl_parser.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
tests/test_context.py, tests/test_llm_agent.py, tests/test_planner.py, tests/test_intent_detection.py, cortex/nl_parser.py, tests/test_clarifier.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Maintain >80% test coverage for pull requests
Files:
tests/test_context.py, tests/test_llm_agent.py, tests/test_planner.py, tests/test_intent_detection.py, tests/test_clarifier.py
🧠 Learnings (2)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*.py : Docstrings required for all public APIs
Applied to files:
cortex/nl_parser.py
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*.py : Type hints required in Python code
Applied to files:
cortex/nl_parser.py
🧬 Code graph analysis (5)
tests/test_context.py (1)
src/intent/detector.py (1)
Intent(7-10)
tests/test_llm_agent.py (1)
src/intent/llm_agent.py (1)
process(46-89)
tests/test_planner.py (1)
src/intent/detector.py (1)
Intent(7-10)
tests/test_intent_detection.py (1)
src/intent/detector.py (3)
IntentDetector (12-49), Intent (7-10), detect (26-49)
tests/test_clarifier.py (1)
src/intent/detector.py (1)
detect(26-49)
🪛 Ruff (0.14.8)
tests/test_llm_agent.py
4-4: Unused method argument: kwargs
(ARG002)
8-8: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
22-22: Unused lambda argument: a
(ARG005)
22-22: Unused lambda argument: k
(ARG005)
🔇 Additional comments (5)
cortex/nl_parser.py (2)
173-220: LGTM: `parse_request` orchestrates the parsing pipeline correctly.
The function properly sequences normalization, tokenization, spell correction, dual-path intent detection, slot extraction, confidence aggregation, and clarification generation. The explanation output fulfills the PR requirement to show AI reasoning.
147-157: The original review identifies a mathematical scenario that does not actually occur in the code. When examining the parse_request function (lines 184-195), c_classifier is set to 0.95 only when both rule_int and sem_int agree. In this case, both c_rule and c_sem would typically be high (not 0.95 and 1.0 as claimed), resulting in maximum confidence approaching 0.99, not 0.78.
The scenario described in the review ("0.4 * 0.95 + 0.4 * 1.0 + 0.2 * 0.0") incorrectly mixes the c_classifier=0.95 case (agreement) with the c_classifier=0.0 weighting pattern, which is inconsistent with how the code executes.
The actual design is sound: clarification thresholds (< 0.6 at line 161, < 0.75 at line 168) appropriately trigger when confidence is lower, and test expectations (>= 0.5) are consistent with this behavior.
Likely an incorrect or invalid review comment.
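To make those thresholds concrete, an illustrative trigger function; only the 0.6 and 0.75 cutoffs come from this discussion, while the questions and slot checks are assumptions:
```python
def clarification_questions(intent: str, confidence: float, slots: dict) -> list[str]:
    """Return follow-up questions when the parsed request looks ambiguous."""
    questions: list[str] = []
    if intent == "unknown" or confidence < 0.6:
        questions.append("Could you rephrase what you want to install or set up?")
    elif confidence < 0.75 and not slots.get("packages"):
        questions.append("Which specific packages or versions do you need?")
    return questions
```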
tests/test_llm_agent.py (1)
15-28: LGTM: Mock-based LLM testing approach is appropriate. The test correctly isolates LLM behavior by injecting a mock, preventing real API calls during testing. The lambda to disable clarification is a valid pattern for controlling test behavior.
tests/test_context.py (1)
1-13: LGTM: Core `SessionContext` functionality is verified. The test validates the essential state management: GPU context, intent history, and installation tracking work as expected.
tests/test_intent_detection.py (1)
1-1: LGTM! The import statement correctly imports the necessary classes from the intent detector module.
…ications, and tests
Force-pushed from 36ad2fc to 3630728
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/test_nl_parser.py (1)
14-14: This issue was already flagged in a previous review.
The expected intent value should be `"unknown"` instead of `"ambiguous"` to match the actual return value from `parse_request`.
🧹 Nitpick comments (2)
.github/workflows/codeql.yml (1)
85-94: Manual build placeholder will fail if enabled.
The manual build step exits with code 1 as a placeholder. Since the current `build-mode` is "none", this step won't execute. However, if the build mode is changed to "manual" in the future, the workflow will fail at this step.
tests/test_nl_parser.py (1)
34-37: Consider tightening the platform assertion.
The test accepts both "mac" and "macos" for the input "on mac", but the regex in `nl_parser.py` (line 125) will capture "mac" exactly as written. The assertion could be more precise:
```diff
- assert r["slots"].get("platform") in ("mac", "macos")
+ assert r["slots"].get("platform") == "mac"
```
However, the current approach provides flexibility for future implementation changes, so this is optional.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- .github/workflows/codeql.yml (1 hunks)
- nl_parser.py (1 hunks)
- tests/test_nl_parser.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
tests/test_nl_parser.py, nl_parser.py
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Maintain >80% test coverage for pull requests
Files:
tests/test_nl_parser.py
🧠 Learnings (2)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*.py : Docstrings required for all public APIs
Applied to files:
nl_parser.py
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*.py : Type hints required in Python code
Applied to files:
nl_parser.py
🧬 Code graph analysis (1)
tests/test_nl_parser.py (1)
nl_parser.py (1)
parse_request(173-220)
🔇 Additional comments (9)
.github/workflows/codeql.yml (1)
14-56: LGTM! The CodeQL workflow configuration is appropriate for the Python project. Triggers are correctly set for push/PR to main with scheduled scans, permissions are properly scoped, and the matrix configuration with `build-mode: none` is correct for Python (no compilation needed).
tests/test_nl_parser.py (2)
29-32: LGTM! The test correctly verifies spell correction behavior by checking that the misspelled token appears in the corrections list and that the final intent is correctly determined after correction.
1-37: No issues found. Test coverage is 95%, which exceeds the >80% requirement specified in the coding guidelines.nl_parser.py (6)
60-67: LGTM! The spell correction logic correctly uses `difflib.get_close_matches` with a reasonable cutoff (0.75) and properly tracks whether a correction was made. The approach of returning a tuple `(corrected_token, was_corrected)` provides clear semantics.
85-97: LGTM! The semantic intent matching logic correctly compares the input text against all canonical examples using fuzzy matching and selects the intent with the highest similarity score. The approach is straightforward and appropriate for the lightweight NLP requirements.
100-121: Rule-based intent detection provides good coverage.
The keyword-based rules provide a solid fallback with reasonable priority ordering (Docker+K8s > Docker > K8s > web > Python > ML). The confidence scores (0.9-0.95) are appropriately high for exact keyword matches.
Note: The kubernetes-only case (lines 109-110) was flagged in a previous review for clarification.
124-144: Slot extraction logic is well-structured.
The regex patterns correctly use word boundaries and capture groups for version numbers, platforms, and packages. The deduplication technique using `dict.fromkeys()` (line 142) efficiently preserves order while removing duplicates.
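As a rough illustration of that approach (the patterns and package list below are assumptions, not the exact regexes in `cortex/nl_parser.py`):
```python
import re
from typing import Any, Dict

def extract_slots(text: str) -> Dict[str, Any]:
    """Illustrative slot extraction using word-boundary regexes and ordered deduplication."""
    slots: Dict[str, Any] = {}
    version = re.search(r"\bpython\s*(\d+(?:\.\d+)?)\b", text)
    if version:
        slots["python_version"] = version.group(1)
    platform = re.search(r"\bon\s+(ubuntu|debian|fedora|mac|windows)\b", text)
    if platform:
        slots["platform"] = platform.group(1)
    pkgs = re.findall(r"\b(tensorflow|pytorch|torch|numpy|jupyter|docker|nginx)\b", text)
    if pkgs:
        # dict.fromkeys() deduplicates while preserving first-seen order
        slots["packages"] = list(dict.fromkeys(pkgs))
    return slots

print(extract_slots("install pytorch and jupyter with python 3.11 on ubuntu"))
# {'python_version': '3.11', 'platform': 'ubuntu', 'packages': ['pytorch', 'jupyter']}
```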
147-157: Confidence aggregation formula is sound.
The weighted average (40% rule, 40% semantic, 20% classifier) with spell-correction penalty (10% per correction) provides reasonable confidence scoring. The penalty clamping ensures graceful degradation with multiple corrections, and the final clamping to [0.0, 1.0] prevents edge cases.
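A minimal sketch of that aggregation, using only the weights and penalty named above:
```python
def aggregate_confidence(c_rule: float, c_sem: float, c_classifier: float,
                         num_corrections: int) -> float:
    """40% rule + 40% semantic + 20% classifier, minus 10% per correction, clamped to [0, 1]."""
    score = 0.4 * c_rule + 0.4 * c_sem + 0.2 * c_classifier
    score -= 0.10 * num_corrections
    return max(0.0, min(1.0, score))

print(aggregate_confidence(0.9, 0.8, 0.95, 1))  # approximately 0.77
```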
173-220: Main parse_request workflow is well-orchestrated.
The function correctly sequences normalization, spell correction, dual intent detection (rule + semantic), intent selection logic, slot extraction, confidence aggregation, and clarifications. The intent selection logic (lines 184-195) appropriately prioritizes agreement between methods and provides sensible fallbacks.
Actionable comments posted: 1
♻️ Duplicate comments (12)
cortex/nl_parser.py (12)
1-4: Redundant import already flagged in previous review.
7-13: Missing type annotation already flagged in previous review.
16-45: Missing type annotation and logic issues already flagged in previous reviews.
48-53: Missing docstring already flagged in previous review.
56-57: Missing docstring already flagged in previous review.
70-78: Missing docstring already flagged in previous review.
81-82: Missing docstring already flagged in previous review.
100-121: Kubernetes-only intent handling already flagged; minor: unnecessary variable alias.
Line 102 (`t = text`) creates an alias that doesn't add clarity. Consider using `text` directly.
```diff
 def rule_intent(text: str) -> Tuple[str, float]:
     """Simple keyword/rule-based detection."""
-    t = text
-
-    if "docker" in t:
-        if "kubernetes" in t or "k8s" in t or "kubectl" in t:
+    if "docker" in text:
+        if "kubernetes" in text or "k8s" in text or "kubectl" in text:
             return "install_docker_k8s", 0.95
         return "install_docker", 0.9
-
-    if "kubernetes" in t or "k8s" in t or "kubectl" in t:
+    if "kubernetes" in text or "k8s" in text or "kubectl" in text:
         return "install_docker_k8s", 0.9
-
-    if "nginx" in t or "apache" in t or "httpd" in t or "web server" in t:
+    if "nginx" in text or "apache" in text or "httpd" in text or "web server" in text:
         return "install_web_server", 0.9
-
-    if "python" in t or "venv" in t or "conda" in t or "anaconda" in t:
+    if "python" in text or "venv" in text or "conda" in text or "anaconda" in text:
         return "setup_python_env", 0.9
-
-    if any(word in t for word in ("tensorflow", "pytorch", "torch", "machine learning", "ml")):
+    if any(word in text for word in ("tensorflow", "pytorch", "torch", "machine learning", "ml")):
         return "install_ml", 0.9
-
     return "unknown", 0.0
```
124-126: Missing type annotations already flagged in previous review.
129-144: Missing docstring already flagged in previous review.
147-157: Missing type hints and docstring already flagged in previous review.
160-170: Missing type hints and docstring already flagged in previous review.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- cortex/nl_parser.py (1 hunks)
- docs/NL_PARSER.md (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docs/NL_PARSER.md
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Follow PEP 8 style guide
Type hints required in Python code
Docstrings required for all public APIs
Files:
cortex/nl_parser.py
🧠 Learnings (2)
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*.py : Docstrings required for all public APIs
Applied to files:
cortex/nl_parser.py
📚 Learning: 2025-12-11T12:03:24.071Z
Learnt from: CR
Repo: cortexlinux/cortex PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-11T12:03:24.071Z
Learning: Applies to **/*.py : Type hints required in Python code
Applied to files:
cortex/nl_parser.py
🔇 Additional comments (5)
cortex/nl_parser.py (5)
60-67: LGTM! The function has type hints and a brief docstring. The spell correction logic using `get_close_matches` with a 0.75 cutoff is reasonable for typo tolerance.
85-97: LGTM! The semantic matching logic is straightforward and correct. The function properly returns the best-matching intent with its score, allowing callers to apply their own thresholds.
184-195: LGTM with note on design choice. The intent resolution prioritizes rule-based detection over semantic matching, which is a reasonable design for reliability. When both methods agree, the confidence bonus (c_classifier=0.95) effectively rewards consensus.
205-211: LGTM! The explanation string provides good transparency into the parser's reasoning, showing spell corrections and both rule-based and semantic scores. This meets the PR objective of displaying reasoning to the user.
213-220: LGTM! The return structure is well-organized with all required fields for the cortex install flow: intent, confidence, explanation, slots, corrections, and clarifications.
Hi @mikejmorgan-ai, this branch is ready for review. Please let me know if everything looks good so we can proceed with the merge. Thanks!
PR: Add NL Parser With Typo Tolerance, Confidence Scoring, Clarifications & Unit Tests
This PR adds a standalone natural language parser for the cortex install flow, improving demo reliability and addressing all acceptance criteria in the bounty request.
✔ Features Implemented
Typo tolerance (e.g., kubernets → kubernetes, pyhton → python)
Rule-based + fuzzy semantic intent detection
Confidence scoring (0.0–1.0)
Ambiguity handling with clarification prompts
Short explanation showing AI reasoning
Slot extraction:
Python version
Platform
Packages mentioned
12+ unit tests validating NL behavior
✔ Files Added
nl_parser.py
tests/test_nl_parser.py
✔ How to Test
Run:
python -m pytest tests/test_nl_parser.py -q
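For a quick manual check beyond the test suite, a hypothetical interactive usage (the values shown in comments are illustrative, not exact output):
```python
from nl_parser import parse_request

result = parse_request("instal pyhton 3.11 with tensorflow on ubuntu")
print(result["intent"])          # e.g. "setup_python_env" or "install_ml"
print(result["confidence"])      # float between 0.0 and 1.0
print(result["corrections"])     # spell fixes applied, e.g. for "pyhton" -> "python"
print(result["slots"])           # e.g. {"python_version": "3.11", "platform": "ubuntu", ...}
print(result["explanation"])     # short reasoning string behind the chosen intent
print(result["clarifications"])  # follow-up questions when the request is ambiguous
```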
✔ Bounty Requirement Status
All acceptance criteria met:
10+ NL test cases — ✔
Ambiguous handling — ✔
Typo tolerance — ✔
Explanation output — ✔
Confidence scoring — ✔
Unit tests — ✔
Closes: #248