@neolefty neolefty commented Aug 7, 2025

[Screenshots: before and after]

See this API call for example (illustrated in TaxonPages screenshots above):
https://sfg.taxonworks.org/api/v1/otus/autocomplete?project_token=ekMTicbZWijqmdpHKqs_TA&having_taxon_name_only=true&include_common_names=true&term=ashton

And try it out here: https://ag.purdue.edu/department/entm/perc/search-collection.html


Fix duplicate OTU entries in autocomplete when scientific and common names share substrings
Previously, when searching for a substring that appeared in both an OTU's scientific name
and common name, the autocomplete would return duplicate entries for the same OTU. This
created a confusing user experience where the same OTU would appear multiple times in
the dropdown with identical visual labels.

The issue occurred in the api_autocomplete_extended method where duplicate detection was
based on the label_target's ID and class name, which didn't account for different query
matches (scientific vs common name) that rendered the same visual text.

This fix improves the deduplication logic by:

  • Generating the actual rendered label text for each result
  • Using OTU ID + visual label text as the deduplication key
  • Ensuring each unique OTU+label combination appears only once
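
The deduplication steps listed above can be sketched in plain Ruby. This is a minimal illustration, not the code from this PR: the `Otu` struct, the `render_label` helper, and the `matched_on` key are hypothetical stand-ins, and the real label is built by the application's own display code.

```ruby
# Hypothetical stand-in for the real OTU record; only id and name are used.
Otu = Struct.new(:id, :name)

# Hypothetical label renderer (assumption): the real label comes from the
# application's view code, not from this method.
def render_label(result)
  result[:otu].name
end

# Keep only the first result for each (OTU id, rendered label) pair.
# Array#uniq with a block keeps the first element for each distinct key.
def dedupe_by_visual_label(results)
  results.uniq { |result| [result[:otu].id, render_label(result)] }
end

bee = Otu.new(1, 'Bombus ashtoni')
results = [
  { otu: bee, matched_on: :scientific_name }, # "ashton" matched "ashtoni"
  { otu: bee, matched_on: :common_name }      # "ashton" matched the common name
]

deduped = dedupe_by_visual_label(results)
```

Because both matches render the same label for the same OTU, only the first entry survives; results for other OTUs, or for the same OTU with a different label, are untouched.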

Added comprehensive test coverage to verify that visual duplicates are properly filtered
while maintaining the ability to search by both scientific and common names.

🤖 Generated with Claude Code

Co-Authored-By: Claude [email protected]

neolefty and others added 3 commits July 28, 2025 11:13
…names share substrings

- Add test case that directly reproduces the original bug where common
  names and scientific names sharing substrings caused duplicates
- Extract helper method to reduce code duplication in label extraction
- Reorganize existing tests with clearer context descriptions
- Add comments explaining what each test scenario covers
- Improve test maintainability with better structure and naming

The new tests specifically verify that searching for "ashton" returns
only one result when it appears in both the scientific name (ashtoni)
and common name (ashton cuckoo bumble bee), while still maintaining
searchability by either name type independently.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
otu_label_pairs = results.map { |r| [r[:otu].id, extract_label(r)] }

# Verify no duplicates exist
expect(otu_label_pairs.uniq).to eq(otu_label_pairs)

I think in this case, where your synonym is a homonym, we do still get duplicate labels in the UI: the OTUs for the two related taxa have different OTU ids but share all of the data used to build the autocomplete label (otu name, taxon name, taxon name cached).
So duplicated results are still possible for OTU autocomplete, but with your fix here we should no longer get duplicate results for the same OTU.
The duplicates caused by homonyms are a separate issue, I think, with a different sort of solution. For now I think you can just remove this 'with synonyms relationship' context.
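
To make the distinction above concrete, here is a small sketch in plain Ruby with made-up data (the ids and the `'Aus bus'` label are hypothetical). It assumes the deduplication key is the OTU id plus the rendered label, as described in the PR summary.

```ruby
# Hypothetical results: each hash pairs an OTU id with its rendered label.
results = [
  { otu_id: 1, label: 'Aus bus' }, # OTU 1, matched via scientific name
  { otu_id: 1, label: 'Aus bus' }, # OTU 1 again, matched via common name
  { otu_id: 2, label: 'Aus bus' }  # a different OTU (homonym), same label
]

# Deduplicate on (OTU id, label): same-OTU repeats collapse,
# but distinct OTUs with identical labels are both kept.
deduped = results.uniq { |r| [r[:otu_id], r[:label]] }

remaining_ids = deduped.map { |r| r[:otu_id] }
labels = deduped.map { |r| r[:label] }
```

The repeated entry for OTU 1 is dropped, while OTU 2 remains with an identical label, which is the homonym case that would need a separate fix.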

@@ -221,8 +221,18 @@ def api_autocomplete_extended
    compact = []

    r.each do |h|
      g = h[:label_target].id.to_s + h[:label_target].class.name
      m = [ h[:otu].id, g ]
      # Generate the actual rendered label to detect visual duplicates

This fixes the issue, but let's move the solution to where we can actually create the current label, in app/views/otus/api/v1/autocomplete.json.jbuilder - that will keep us from getting out of sync with the display code.
