Conversation

@rockleona

I have started researching computer vision. As a first step, this PR introduces Ultralytics YOLO as the object detection tool and uses the YOLO11 model (yolo11n) as the base model.

Usage is simple: load an image and enable detection, and the result appears in the Console Widget. See the screenshot below:
Screenshot 2025-10-29 23 16 42

logger.setLevel(logging.DEBUG)

if 'model' not in globals():
    model = YOLO('./modmesh/pilot/yolo11n.pt')  # Please check the model path
Collaborator

How do you get the model? Maybe add runtime download logic?

Author

@rockleona Oct 30, 2025

Yes. At runtime it checks whether the model exists at the given path; if it does not, it downloads the model directly to the specified path.

Author

I just found that there is a directory called thirdparty. Maybe I should point the path there instead?

Collaborator

thirdparty is for third-party libraries. In this case, I think you can put the model file in the same directory as the pilot runtime. By the way, it seems that the download logic is not implemented yet, right?

Author

The ultralytics library already implements the download logic, so there is no need to do it again. Presumably it checks whether the model name is available on their file server and downloads the file when the YOLO class is initialized.
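The behaviour discussed in this thread (use the local weights file if it is present, otherwise rely on ultralytics to fetch it when YOLO is constructed) could be made explicit with a small helper. This is only a sketch: `resolve_model_path`, `load_model`, and the `loader` parameter are hypothetical names that are not part of this PR, and the download-on-missing behaviour is taken from the comment above rather than verified here.

```python
import logging
from pathlib import Path
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


def resolve_model_path(name: str, search_dir: Path) -> Path:
    """Return where the weights file is expected (it may not exist yet)."""
    return Path(search_dir) / name


def load_model(name: str, search_dir: Path,
               loader: Optional[Callable[[str], Any]] = None) -> Any:
    """Load a model, logging whether the weights are already on disk.

    `loader` is injectable so the helper can be exercised without
    ultralytics installed; by default it falls back to ultralytics.YOLO.
    """
    path = resolve_model_path(name, search_dir)
    if loader is None:
        # Imported lazily: ultralytics is a heavy, optional dependency.
        from ultralytics import YOLO
        loader = YOLO
    if not path.exists():
        # Assumption from the thread: the loader downloads missing weights.
        logger.info("%s not found; relying on the loader to fetch it", path)
    return loader(str(path))
```

Passing a stub as `loader` keeps the path logic testable without network access or the real weights.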

@@ -0,0 +1,99 @@
"""
Collaborator

Is it possible to write tests to validate the implementation?

Author

Where would you recommend placing the tests? Perhaps in tests/test_pilot.py?

Collaborator

I think you can put them in tests/test_vision.py?
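A test module along these lines could validate the detection wrapper without downloading any weights, by substituting a stub for the model. The sketch below is illustrative only: `FakeModel`, `detect`, and the dictionary layout of a detection are hypothetical and do not reflect the actual interface in this PR.

```python
import unittest


class FakeModel:
    """Stand-in for a YOLO model: returns canned detections, no weights."""

    def __init__(self, detections):
        self.detections = detections
        self.calls = 0

    def __call__(self, image):
        self.calls += 1
        return self.detections


def detect(model, image, min_conf=0.5):
    """Hypothetical wrapper: keep detections at or above a threshold."""
    return [d for d in model(image) if d["conf"] >= min_conf]


class DetectTC(unittest.TestCase):

    def setUp(self):
        self.model = FakeModel([
            {"label": "cat", "conf": 0.9, "box": (10, 20, 110, 220)},
            {"label": "dog", "conf": 0.2, "box": (0, 0, 5, 5)},
        ])

    def test_low_confidence_is_dropped(self):
        result = detect(self.model, image=None)
        self.assertEqual([d["label"] for d in result], ["cat"])

    def test_model_called_once(self):
        detect(self.model, image=None)
        self.assertEqual(self.model.calls, 1)
```

Because the stub never touches the network or the GPU, tests of this shape can run in CI on any platform.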

@yungyuc added the pilot GUI and visualization label Oct 30, 2025
@yungyuc marked this pull request as draft October 30, 2025 14:42
Member

@yungyuc left a comment

Please make sure CI passes before requesting review.

  • Correct copyright headers.
  • Remove unnecessary code like WrapRVisionDockWidget, which looks like a placeholder.
  • The pimpl class RVisionDockWidget::Impl is not necessary. Do not use pimpl.
  • Do not create a symbol named modmesh::BoundingBox.
  • Always add an end marker to classes and namespaces.

I see you are using pybind11 to call back into Python to use YOLO. Why don't you just use PySide6 to do it?

const uchar *data = rgbImg.bits();
py::array_t<uint8_t> np_img({height, width, channels}, data);
py::object vision_mod = py::module_::import("modmesh.pilot._vision");
py::object yolo_func = vision_mod.attr("yolo_detect");
Member

If you call YOLO from Python, why not use PySide?

@rockleona
Author

I thought all GUI components had to be written in C++ with Qt. I will change it to PySide6, since these functions are only executed from Python.

@yungyuc
Member

yungyuc commented Nov 17, 2025

@rockleona The code base has changed a lot. Please rebase to refresh the CI status.

@rockleona marked this pull request as ready for review November 24, 2025 14:01
@rockleona
Author

I've made a lot of changes. Please find the items below:

  • Change all GUI components to PySide6
  • Write unit test cases for _yolo_detector

The latest GUI looks like this. I didn't change the layout, but some details differ slightly, such as the model status and the logging messages in the pycon widget.

Screenshot 2025-11-24 22 04 07

@rockleona
Author

"Please make sure CI passes before requesting review."

I cannot find a button to run the CI process. Maybe it has to be triggered by a repo maintainer?

@yungyuc
Member

yungyuc commented Nov 29, 2025

You should use your own fork to test for CI.

Can you move the image preview away from the widget window (lower left) to the central sub-window, like other windows for 2D and 3D plots?

Member

@yungyuc left a comment

There will be some discussions before we can merge any code about YOLO.

  • Include Unsplash license text.
  • Clean up image source link.
  • Evaluate using Qt instead of PIL.
  • Discuss why ultralytics is included.

you may not use this file except in compliance with the License.
You may obtain a copy of the License at

https://unsplash.com/license
Member

Make a copy of the license text here in this file.

Test Files
==========

- cat.jpg (original source: https://unsplash.com/photos/orange-and-white-cat-on-yellow-surface-sR0cTmQHPug?utm_source=unsplash&utm_medium=referral&utm_content=creditShareLink)
Member

Clean up link.
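One way to read "clean up" here is to drop the referral parameters from the URL. The sketch below assumes that the whole query string is tracking data (true for the `utm_*` parameters in the link above, but not for URLs in general); `strip_tracking` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_tracking(url: str) -> str:
    """Drop the query string and fragment; keep scheme, host, and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


original = (
    "https://unsplash.com/photos/"
    "orange-and-white-cat-on-yellow-surface-sR0cTmQHPug"
    "?utm_source=unsplash&utm_medium=referral&utm_content=creditShareLink"
)
cleaned = strip_tracking(original)
```

For this link, `cleaned` ends at the photo slug, which is all the README entry needs.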


import numpy as np
import requests
from PIL import Image
Member

If possible, I do not want to have PIL. modmesh is already using Qt which should include all image handling features that PIL provides. Please evaluate if you can simply use Qt/PySide for processing images.
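One detail to keep in mind if image handling moves to Qt: QImage pads each scanline to a 32-bit boundary, so the raw buffer cannot always be reshaped to (height, width, channels) directly. The sketch below uses only the standard library to show the unpadding step. `strip_row_padding` is a hypothetical helper; with PySide6 its inputs would presumably come from `QImage.constBits()` and `QImage.bytesPerLine()` after converting the image to an RGB888 format.

```python
def strip_row_padding(buf: bytes, width: int, height: int,
                      bytes_per_line: int, channels: int = 3) -> bytes:
    """Return tightly packed pixel bytes from a row-padded image buffer."""
    row_bytes = width * channels
    if bytes_per_line < row_bytes:
        raise ValueError("bytes_per_line is smaller than one row of pixels")
    if len(buf) < bytes_per_line * height:
        raise ValueError("buffer is too short for the given dimensions")
    view = memoryview(buf)
    # Copy only the meaningful bytes of each row, skipping trailing padding.
    rows = (bytes(view[y * bytes_per_line: y * bytes_per_line + row_bytes])
            for y in range(height))
    return b"".join(rows)


# A 1x2 RGB image: each 3-byte row is padded to 4 bytes.
padded = b"\x01\x02\x03\x00" + b"\x04\x05\x06\x00"
packed = strip_row_padding(padded, width=1, height=2, bytes_per_line=4)
```

The packed bytes can then be handed to `numpy.frombuffer` and reshaped, which is the same array layout the detector would receive from PIL.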

import numpy as np
import requests
from PIL import Image
from ultralytics import YOLO
Member

I feel that including a huge third-party library like ultralytics defeats the principle of "doing it ourselves" in modmesh. @rockleona please elaborate on why you include ultralytics.
