Member

nicobao commented Aug 20, 2025

Fixed:

  • representative opinions used to be formatted with the wrong data (for example, repful_for=disagree data instead of agree), especially for "best-agree"
  • representative opinions were selected one way and then picked for formatting another way, which led to errors
  • representative opinions are now sorted by repness-test before pick-max is applied, so we're sure we keep the best ones
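
A minimal sketch of the sort-then-truncate step described in the last bullet, assuming each candidate is a plain dict with a `repness-test` key (as in the dumps below). The function name `pick_top_by_repness_test` and the `pick_max` default are illustrative, not the names used in this PR, and the real ordering may use additional keys.

```python
from typing import Any


def pick_top_by_repness_test(stats: list[dict[str, Any]], pick_max: int = 5) -> list[dict[str, Any]]:
    """Return at most pick_max entries, highest 'repness-test' first."""
    # Sort first so that truncation keeps the strongest candidates,
    # regardless of the order in which they were collected.
    ranked = sorted(stats, key=lambda s: s["repness-test"], reverse=True)
    return ranked[:pick_max]


# Values abridged from group 0 of the test data, deliberately shuffled.
candidates = [
    {"tid": 2464, "repness-test": 3.638873891389283, "repful-for": "disagree"},
    {"tid": 2451, "repness-test": 5.244952713874628, "repful-for": "disagree"},
    {"tid": 2444, "repness-test": 2.098245801940926, "repful-for": "disagree"},
    {"tid": 2443, "repness-test": 4.506763538720049, "repful-for": "disagree"},
]

top = pick_top_by_repness_test(candidates, pick_max=3)
print([s["tid"] for s in top])  # [2451, 2443, 2464]
```

Without the sort, truncating to `pick_max` would keep whichever candidates happened to come first, which is one way a weaker statement could displace a stronger one.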

TODO:

  • best-agree support was temporarily removed, as the implementation was flawed
  • instead of the previous implementation, we should select the best "agree" among the representative opinions that remain after the pick_max filter, if any, and then update it to add "best-agree: true". There may be no best-agree at all when there are no "agree" representative opinions; in my experience that is expected (it also makes sense in general, I think, but I may be wrong)
  • add unit tests!

Test data

Votes.json:
votes.json

Before:

{
  0: [
    {'tid': 2460, 'n-success': 0, 'n-trials': 11, 'p-success': 0.07692307692307693, 'p-test': -2.8867513459481287, 'repness': 1.9615384615384617, 'repness-test': 0.6282130217953903, 'repful-for': 'disagree', 'n-agree': 0, 'best-agree': True
    },
    {'tid': 2451, 'n-success': 17, 'n-trials': 21, 'p-success': 0.782608695652174, 'p-test': 2.9848100289785466, 'repness': 3.8086956521739133, 'repness-test': 5.244952713874628, 'repful-for': 'disagree'
    },
    {'tid': 2443, 'n-success': 18, 'n-trials': 22, 'p-success': 0.7916666666666666, 'p-test': 3.1277162108561214, 'repness': 2.7708333333333335, 'repness-test': 4.506763538720049, 'repful-for': 'disagree'
    },
    {'tid': 2453, 'n-success': 15, 'n-trials': 22, 'p-success': 0.6666666666666666, 'p-test': 1.8766297265136724, 'repness': 3.4761904761904763, 'repness-test': 4.501864540706584, 'repful-for': 'disagree'
    },
    {'tid': 2464, 'n-success': 9, 'n-trials': 11, 'p-success': 0.7692307692307693, 'p-test': 2.3094010767585034, 'repness': 2.9585798816568047, 'repness-test': 3.638873891389283, 'repful-for': 'disagree'
    }
  ],
  1: [
    {'tid': 2439, 'n-success': 22, 'n-trials': 26, 'p-success': 0.8214285714285714, 'p-test': 3.65655170486763, 'repness': 1.5945378151260503, 'repness-test': 2.9577381047491125, 'repful-for': 'agree', 'n-agree': 22, 'best-agree': True
    },
    {'tid': 2446, 'n-success': 23, 'n-trials': 29, 'p-success': 0.7741935483870968, 'p-test': 3.2863353450309973, 'repness': 1.6774193548387095, 'repness-test': 3.0279141151771283, 'repful-for': 'agree'
    },
    {'tid': 2453, 'n-success': 22, 'n-trials': 29, 'p-success': 0.7419354838709677, 'p-test': 2.9211869733608866, 'repness': 1.632258064516129, 'repness-test': 2.7835490763429362, 'repful-for': 'agree'
    },
    {'tid': 2443, 'n-success': 18, 'n-trials': 27, 'p-success': 0.6551724137931034, 'p-test': 1.8898223650461365, 'repness': 2.129310344827586, 'repness-test': 3.2693281361911914, 'repful-for': 'disagree'
    },
    {'tid': 2444, 'n-success': 20, 'n-trials': 29, 'p-success': 0.6774193548387096, 'p-test': 2.190890230020664, 'repness': 1.518353726362625, 'repness-test': 2.2360438379992784, 'repful-for': 'disagree'
    }
  ],
  2: [
    {'tid': 2443, 'n-success': 18, 'n-trials': 21, 'p-success': 0.8260869565217391, 'p-test': 3.411211461689767, 'repness': 2.2558528428093645, 'repness-test': 4.028539006090745, 'repful-for': 'agree', 'n-agree': 18, 'best-agree': True
    },
    {'tid': 2436, 'n-success': 18, 'n-trials': 23, 'p-success': 0.76, 'p-test': 2.8577380332470406, 'repness': 1.4378378378378378, 'repness-test': 2.202200210889867, 'repful-for': 'agree'
    }
  ],
  3: [
    {'tid': 2444, 'n-success': 17, 'n-trials': 20, 'p-success': 0.8181818181818182, 'p-test': 3.273268353539885, 'repness': 3.784090909090909, 'repness-test': 5.361874200918158, 'repful-for': 'agree', 'n-agree': 17, 'best-agree': True
    },
    {'tid': 2443, 'n-success': 19, 'n-trials': 20, 'p-success': 0.9090909090909091, 'p-test': 4.146139914483855, 'repness': 2.618181818181818, 'repness-test': 4.8341813818288, 'repful-for': 'agree'
    },
    {'tid': 2446, 'n-success': 19, 'n-trials': 19, 'p-success': 0.9523809523809523, 'p-test': 4.47213595499958, 'repness': 2.100840336134454, 'repness-test': 4.338066253392037, 'repful-for': 'agree'
    },
    {'tid': 2451, 'n-success': 19, 'n-trials': 20, 'p-success': 0.9090909090909091, 'p-test': 4.146139914483855, 'repness': 2.1700879765395893, 'repness-test': 4.27781555325104, 'repful-for': 'agree'
    },
    {'tid': 2453, 'n-success': 18, 'n-trials': 19, 'p-success': 0.9047619047619048, 'p-test': 4.024922359499621, 'repness': 2.022408963585434, 'repness-test': 3.973835323186045, 'repful-for': 'agree'
    }
  ]
}

After:

{
  0: [
    {'tid': 2451, 'n-success': 17, 'n-trials': 21, 'p-success': 0.782608695652174, 'p-test': 2.9848100289785466, 'repness': 3.8086956521739133, 'repness-test': 5.244952713874628, 'repful-for': 'disagree'
    },
    {'tid': 2443, 'n-success': 18, 'n-trials': 22, 'p-success': 0.7916666666666666, 'p-test': 3.1277162108561214, 'repness': 2.7708333333333335, 'repness-test': 4.506763538720049, 'repful-for': 'disagree'
    },
    {'tid': 2453, 'n-success': 15, 'n-trials': 22, 'p-success': 0.6666666666666666, 'p-test': 1.8766297265136724, 'repness': 3.4761904761904763, 'repness-test': 4.501864540706584, 'repful-for': 'disagree'
    },
    {'tid': 2464, 'n-success': 9, 'n-trials': 11, 'p-success': 0.7692307692307693, 'p-test': 2.3094010767585034, 'repness': 2.9585798816568047, 'repness-test': 3.638873891389283, 'repful-for': 'disagree'
    },
    {'tid': 2444, 'n-success': 15, 'n-trials': 21, 'p-success': 0.6956521739130435, 'p-test': 2.1320071635561044, 'repness': 1.4936061381074168, 'repness-test': 2.098245801940926, 'repful-for': 'disagree'
    }
  ],
  1: [
    {'tid': 2446, 'n-success': 23, 'n-trials': 29, 'p-success': 0.7741935483870968, 'p-test': 3.2863353450309973, 'repness': 1.6774193548387095, 'repness-test': 3.0279141151771283, 'repful-for': 'agree'
    },
    {'tid': 2439, 'n-success': 22, 'n-trials': 26, 'p-success': 0.8214285714285714, 'p-test': 3.65655170486763, 'repness': 1.5945378151260503, 'repness-test': 2.9577381047491125, 'repful-for': 'agree'
    },
    {'tid': 2453, 'n-success': 22, 'n-trials': 29, 'p-success': 0.7419354838709677, 'p-test': 2.9211869733608866, 'repness': 1.632258064516129, 'repness-test': 2.7835490763429362, 'repful-for': 'agree'
    },
    {'tid': 2443, 'n-success': 18, 'n-trials': 27, 'p-success': 0.6551724137931034, 'p-test': 1.8898223650461365, 'repness': 2.129310344827586, 'repness-test': 3.2693281361911914, 'repful-for': 'disagree'
    },
    {'tid': 2444, 'n-success': 20, 'n-trials': 29, 'p-success': 0.6774193548387096, 'p-test': 2.190890230020664, 'repness': 1.518353726362625, 'repness-test': 2.2360438379992784, 'repful-for': 'disagree'
    }
  ],
  2: [
    {'tid': 2443, 'n-success': 18, 'n-trials': 21, 'p-success': 0.8260869565217391, 'p-test': 3.411211461689767, 'repness': 2.2558528428093645, 'repness-test': 4.028539006090745, 'repful-for': 'agree'
    },
    {'tid': 2436, 'n-success': 18, 'n-trials': 23, 'p-success': 0.76, 'p-test': 2.8577380332470406, 'repness': 1.4378378378378378, 'repness-test': 2.202200210889867, 'repful-for': 'agree'
    }
  ],
  3: [
    {'tid': 2444, 'n-success': 17, 'n-trials': 20, 'p-success': 0.8181818181818182, 'p-test': 3.273268353539885, 'repness': 3.784090909090909, 'repness-test': 5.361874200918158, 'repful-for': 'agree'
    },
    {'tid': 2443, 'n-success': 19, 'n-trials': 20, 'p-success': 0.9090909090909091, 'p-test': 4.146139914483855, 'repness': 2.618181818181818, 'repness-test': 4.8341813818288, 'repful-for': 'agree'
    },
    {'tid': 2446, 'n-success': 19, 'n-trials': 19, 'p-success': 0.9523809523809523, 'p-test': 4.47213595499958, 'repness': 2.100840336134454, 'repness-test': 4.338066253392037, 'repful-for': 'agree'
    },
    {'tid': 2451, 'n-success': 19, 'n-trials': 20, 'p-success': 0.9090909090909091, 'p-test': 4.146139914483855, 'repness': 2.1700879765395893, 'repness-test': 4.27781555325104, 'repful-for': 'agree'
    },
    {'tid': 2453, 'n-success': 18, 'n-trials': 19, 'p-success': 0.9047619047619048, 'p-test': 4.024922359499621, 'repness': 2.022408963585434, 'repness-test': 3.973835323186045, 'repful-for': 'agree'
    }
  ]
}
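
The main difference between the two dumps can be checked mechanically. The sketch below is a hypothetical helper (not part of this PR) that reports entries flagged `best-agree` whose `repful-for` is not `agree`; the sample data is abridged from the "Before" output above.

```python
from typing import Any


def best_agree_violations(rep_opinions: dict[int, list[dict[str, Any]]]) -> list[tuple[int, int]]:
    """Return (group_id, tid) pairs where 'best-agree' is set on a non-agree entry."""
    return [
        (gid, stat["tid"])
        for gid, stats in rep_opinions.items()
        for stat in stats
        if stat.get("best-agree") and stat.get("repful-for") != "agree"
    ]


# Abridged from the "Before" output: group 0 carries a best-agree flag
# on a statement that is representative for "disagree".
before = {
    0: [
        {"tid": 2460, "repful-for": "disagree", "n-agree": 0, "best-agree": True},
        {"tid": 2451, "repful-for": "disagree"},
    ],
    1: [
        {"tid": 2439, "repful-for": "agree", "n-agree": 22, "best-agree": True},
        {"tid": 2446, "repful-for": "agree"},
    ],
}

print(best_agree_violations(before))  # [(0, 2460)]
```

A check like this could serve as a starting point for the unit tests listed in the TODO.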

Member Author

nicobao commented Aug 20, 2025

Test fails because best-agree was removed

nicobao changed the title from "fix: representative opinions selection and stat formatting" to "fix: representative opinions selection and comment stats formatting" on Aug 20, 2025
Member Author

nicobao commented Aug 25, 2025

Hi @patcon, any feedback, so we can merge?

Member Author

nicobao commented Sep 5, 2025

Hey @patcon, could you tell me how to go through sufficient_statements to add the following:

best_agree.update({"n-agree": best_agree["n-success"], "best-agree": True})

to the one statement, among the sufficient_statements with repful_for=agree (if any), that has the maximum repness_test?

And if we fall back to best_overall (when sufficient_statements is empty), then do the same for best_overall.

(I am a noob when it comes to working with tf data objects)

(As I said earlier, it's possible we don't have any best-agree at all with that method if we only have disagree representative opinions, but that seems fine to me?)
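
One possible shape for this, as a sketch only: it assumes `sufficient_statements` and `best_overall` can both be treated as lists of plain dicts keyed like the output above (`repful-for`, `repness-test`, `n-success`). If they are pandas DataFrames, `DataFrame.to_dict("records")` would produce such a list. The function name `mark_best_agree` is hypothetical.

```python
from typing import Any

Stat = dict[str, Any]


def mark_best_agree(sufficient_statements: list[Stat], best_overall: list[Stat]) -> list[Stat]:
    """Flag the agree statement with the highest repness-test, if there is one.

    Falls back to best_overall when sufficient_statements is empty.
    The chosen dict is updated in place; the selected list is returned.
    """
    selected = sufficient_statements if sufficient_statements else best_overall
    agree = [s for s in selected if s.get("repful-for") == "agree"]
    if agree:
        best_agree = max(agree, key=lambda s: s["repness-test"])
        best_agree.update({"n-agree": best_agree["n-success"], "best-agree": True})
    # With only "disagree" statements, nothing is flagged.
    return selected


# Values abridged from group 1 of the test data.
group_1 = [
    {"tid": 2446, "n-success": 23, "repness-test": 3.0279141151771283, "repful-for": "agree"},
    {"tid": 2439, "n-success": 22, "repness-test": 2.9577381047491125, "repful-for": "agree"},
    {"tid": 2443, "n-success": 18, "repness-test": 3.2693281361911914, "repful-for": "disagree"},
]

marked = mark_best_agree(group_1, best_overall=[])
print([(s["tid"], s.get("best-agree", False)) for s in marked])
# [(2446, True), (2439, False), (2443, False)]
```

Note that tid 2443 has the highest repness-test in this sample but is skipped because it is representative for "disagree".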

Member Author

nicobao commented Sep 23, 2025

I need your help @patcon to understand why there are so many test errors

- Remove unused 'stat' import
- Simplify significance checks by removing redundant vote count validations
- Streamline repful_for calculation by removing nested conditionals
- Lower minimum confidence threshold from 0.7 to 0.6 for statement selection
- Improve confidence selection to prefer exact pick_max matches over near-misses
- Remove best-agree flag assignment logic