Conversation

@raedle
Contributor

@raedle raedle commented Jan 6, 2025

Summary:

  • Added support for multiple positive/negative clicks
  • Visualize model download time, image encoder time, and decoder time
  • URL input rather than image upload
  • Temporarily disabled analytics
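The timing item above can be illustrated with a minimal sketch. It is not the code from this PR (the app runs in the browser); it is a language-agnostic illustration in Python, and `StageTimer`, `measure`, and the stage names are hypothetical. The `time.sleep` calls stand in for the real download, encoder, and decoder work.

```python
import time
from contextlib import contextmanager


class StageTimer:
    """Collects wall-clock durations (in milliseconds) per named stage."""

    def __init__(self):
        self.durations_ms = {}

    @contextmanager
    def measure(self, stage):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            # Accumulate, so a stage that runs once per click (the decoder) sums up.
            self.durations_ms[stage] = self.durations_ms.get(stage, 0.0) + elapsed_ms


timer = StageTimer()
with timer.measure("model_download"):
    time.sleep(0.01)  # placeholder for fetching the model files
with timer.measure("image_encoder"):
    time.sleep(0.01)  # placeholder for the one-off image embedding
with timer.measure("decoder"):
    time.sleep(0.01)  # placeholder for one decoder run

for stage, ms in timer.durations_ms.items():
    print(f"{stage}: {ms:.1f} ms")
```

Keeping the three stages separate makes it visible that the encoder cost is paid once per image, while the decoder cost is paid on every click.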

Test Plan:

Tested locally

@vercel

vercel bot commented Jan 6, 2025

The latest updates on your projects.

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| sam2 | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Jan 26, 2025 4:30pm |

@raedle
Contributor Author

raedle commented Jan 6, 2025

Great project, @geronimi73!

No need to act on the PR at the moment. I am experimenting with runtime and model quality.

I made some changes to support positive/negative clicks (a quick hack to see if it works). This breaks multi-object segmentation, so the UI will need a way to add multiple objects.
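For readers unfamiliar with the click convention, here is a minimal sketch of how positive and negative clicks are usually encoded for SAM-style decoders: label 1 marks foreground, label 0 marks background. The `Click` type and `encode_clicks` helper are hypothetical, and the names `point_coords` and `point_labels` follow the SAM convention; the decoder used in this PR may name its inputs differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Click:
    x: float
    y: float
    positive: bool  # True = "include this", False = "exclude this"


def encode_clicks(clicks):
    """Return (point_coords, point_labels) for a single object."""
    if not clicks:
        raise ValueError("at least one click is required")
    point_coords = [[c.x, c.y] for c in clicks]
    # SAM convention: 1 = foreground point, 0 = background point.
    point_labels = [1 if c.positive else 0 for c in clicks]
    return point_coords, point_labels


coords, labels = encode_clicks([
    Click(120, 80, positive=True),    # on the object
    Click(200, 150, positive=False),  # on the background to carve away
])
print(coords, labels)
```

Because all clicks end up in one prompt, they describe one object. Supporting several objects would need one such click list per object, which is the UI gap mentioned above.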

@geronimi73
Owner

great idea for a PR! thanks
and, nice, I see you're cleaning my ugly coding style 😆 embarrassing

@geronimi73
Owner

checked the preview, beautiful. thank you! i'll take a closer look later today

@geronimi73
Owner

LGTM. @raedle, are you done?
If so, I'll restore the OPFS storage, remove the debug logs, and add back the download/crop button.

@raedle
Contributor Author

raedle commented Jan 8, 2025

@geronimi73, it's not quite ready yet. There is a potential way to improve predictions for refinement clicks (i.e., inputs other than the first click). To achieve this, a small code change is needed. Specifically, the output mask scores from the previous step should be fed as mask_input to the next step, and has_mask_input should be set to 1.
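A minimal sketch of that change, under stated assumptions: `mask_input` and `has_mask_input` are the names used above, while `build_decoder_inputs`, the `point_coords`/`point_labels` keys, and the 256x256 default mask size are assumptions borrowed from common SAM decoder exports, not taken from this PR.

```python
def build_decoder_inputs(point_coords, point_labels, previous_mask=None, mask_size=256):
    """Assemble decoder inputs, feeding back the previous mask scores if present.

    previous_mask is a mask_size x mask_size nested list of raw mask scores
    from the previous decoder run, or None for the very first click.
    """
    if previous_mask is None:
        # First click: no prior prediction, so send zeros and flag them as unused.
        mask_input = [[0.0] * mask_size for _ in range(mask_size)]
        has_mask_input = 0
    else:
        if len(previous_mask) != mask_size or any(len(row) != mask_size for row in previous_mask):
            raise ValueError("previous_mask must be mask_size x mask_size")
        # Refinement click: reuse the last prediction as a prior.
        mask_input = previous_mask
        has_mask_input = 1
    return {
        "point_coords": point_coords,
        "point_labels": point_labels,
        "mask_input": mask_input,
        "has_mask_input": has_mask_input,
    }


first = build_decoder_inputs([[120, 80]], [1], mask_size=4)
fake_scores = [[0.5] * 4 for _ in range(4)]  # stand-in for the decoder's output
second = build_decoder_inputs([[120, 80], [200, 150]], [1, 0], previous_mask=fake_scores, mask_size=4)
print(first["has_mask_input"], second["has_mask_input"])
```

The stored mask should be cleared whenever a new image is loaded or the clicks are reset, otherwise a stale prior would be fed into an unrelated prediction.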

Additionally, another detail worth trying out is using multimask_output=False for refinement clicks, which could help obtain a single, consistent mask prediction.
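One way to express that policy is sketched below. `wants_multimask` and `pick_mask` are hypothetical helpers, and the sketch assumes the decoder returns one score per candidate mask; it only shows the selection logic, not how `multimask_output` is wired into a particular model export.

```python
def wants_multimask(num_clicks):
    """A single click is ambiguous, so ask for several candidate masks.

    Once refinement clicks exist, request one mask so the result stays stable.
    """
    return num_clicks <= 1


def pick_mask(masks, scores):
    """Return the candidate with the highest predicted score."""
    if not masks or len(masks) != len(scores):
        raise ValueError("masks and scores must be non-empty and the same length")
    best = max(range(len(scores)), key=scores.__getitem__)
    return masks[best]


# First click: several candidates, keep the best-scoring one.
print(wants_multimask(1), pick_mask(["whole", "part", "subpart"], [0.2, 0.9, 0.5]))
# Refinement click: a single mask comes back and is used directly.
print(wants_multimask(3), pick_mask(["refined"], [0.7]))
```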

@geronimi73
Owner

> @geronimi73, it's not quite ready yet. There is a potential way to improve predictions for refinement clicks (i.e., inputs other than the first click). To achieve this, a small code change is needed. Specifically, the output mask scores from the previous step should be fed as mask_input to the next step, and has_mask_input should be set to 1.
>
> Additionally, another detail worth trying out is using multimask_output=False for refinement clicks, which could help obtain a single consistent mask prediction

That sounds good. I'm pretty busy till the weekend but I'll try to put these things in

@geronimi73
Owner

geronimi73 commented Jan 26, 2025

Added refinement clicks. The previous mask tensor is stored and sent to the decoder for the subsequent clicks.

A test case would be nice. My feeling is that the segmentation is a bit better now, but I'm not 100% sure everything's correct. Do you have any images where passing a mask will make a huge difference?
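One possible shape for such a test case, as a sketch only: compare the predicted mask against a hand-labelled reference with and without the previous mask fed back, using intersection over union (IoU). `mask_iou` is a hypothetical helper and the 3x3 masks are toy data made up for illustration, not results from the model.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two equally sized binary masks (nested lists)."""
    if len(mask_a) != len(mask_b) or any(len(a) != len(b) for a, b in zip(mask_a, mask_b)):
        raise ValueError("masks must have the same shape")
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Two empty masks agree perfectly by convention.
    return 1.0 if union == 0 else intersection / union


# Toy data: a reference mask and two hypothetical predictions.
reference = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
without_previous_mask = [[1, 1, 1], [0, 1, 0], [0, 0, 0]]
with_previous_mask = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]

print("IoU without mask input:", mask_iou(reference, without_previous_mask))
print("IoU with mask input:", mask_iou(reference, with_previous_mask))
```

Running the same click sequence on a fixed image in both modes and comparing the two IoU values would turn the "a bit better" impression into a number.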

ps: sorry for the delay, first got sick and then overwhelmed by work

@geronimi73
Owner

I guess you're busy @raedle, anyway, thank you for this contribution!

@geronimi73 geronimi73 marked this pull request as ready for review February 10, 2025 11:01
@geronimi73 geronimi73 merged commit 52f70cd into geronimi73:main Feb 10, 2025
2 checks passed