fix: representative opinions selection and comment stats formatting #99

nicobao · 2025-08-20T15:38:57Z

Fixed:

Representative opinions used to be formatted incorrectly (for example, using repful_for=disagree data where agree data was expected), especially for "best-agree".
The logic that selected representative opinions differed from the logic that picked them for formatting, which led to errors.
Representative opinions are now sorted by repness-test before pick-max is applied, so the best ones are always kept.
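
The sort-then-truncate step described above can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: the function name `pick_representative`, the default `pick_max=5`, and the plain list-of-dicts input are assumptions, and the sketch ignores any grouping by repful-for.

```python
from typing import Any

Statement = dict[str, Any]


def pick_representative(statements: list[Statement], pick_max: int = 5) -> list[Statement]:
    """Return at most pick_max statements, highest repness-test first.

    Hypothetical helper: the name and the default pick_max are assumptions.
    """
    ranked = sorted(statements, key=lambda s: s["repness-test"], reverse=True)
    return ranked[:pick_max]


# Three entries taken from group 0 of the test data, deliberately shuffled.
candidates = [
    {"tid": 2464, "repness-test": 3.638873891389283, "repful-for": "disagree"},
    {"tid": 2451, "repness-test": 5.244952713874628, "repful-for": "disagree"},
    {"tid": 2443, "repness-test": 4.506763538720049, "repful-for": "disagree"},
]

top_two = pick_representative(candidates, pick_max=2)
print([s["tid"] for s in top_two])  # [2451, 2443]
```

Sorting before truncating is what guarantees that the kept entries are the strongest ones; truncating an unsorted list would keep whatever happened to come first.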

TODO:

best-agree support has been temporarily removed because the implementation was flawed.
Instead of the previous implementation, we should select the best "agree" among the representative opinions that remain after the pick_max filter, if any, and then set "best-agree: true" on it. There may be no best-agree when none of the representative opinions is an "agree"; in my experience that is expected (it also makes sense in general, I think, but I may be wrong).
Add unit tests!

Test data

Before:

{
  0: [
    {'tid': 2460, 'n-success': 0, 'n-trials': 11, 'p-success': 0.07692307692307693, 'p-test': -2.8867513459481287, 'repness': 1.9615384615384617, 'repness-test': 0.6282130217953903, 'repful-for': 'disagree', 'n-agree': 0, 'best-agree': True
    },
    {'tid': 2451, 'n-success': 17, 'n-trials': 21, 'p-success': 0.782608695652174, 'p-test': 2.9848100289785466, 'repness': 3.8086956521739133, 'repness-test': 5.244952713874628, 'repful-for': 'disagree'
    },
    {'tid': 2443, 'n-success': 18, 'n-trials': 22, 'p-success': 0.7916666666666666, 'p-test': 3.1277162108561214, 'repness': 2.7708333333333335, 'repness-test': 4.506763538720049, 'repful-for': 'disagree'
    },
    {'tid': 2453, 'n-success': 15, 'n-trials': 22, 'p-success': 0.6666666666666666, 'p-test': 1.8766297265136724, 'repness': 3.4761904761904763, 'repness-test': 4.501864540706584, 'repful-for': 'disagree'
    },
    {'tid': 2464, 'n-success': 9, 'n-trials': 11, 'p-success': 0.7692307692307693, 'p-test': 2.3094010767585034, 'repness': 2.9585798816568047, 'repness-test': 3.638873891389283, 'repful-for': 'disagree'
    }
  ],
  1: [
    {'tid': 2439, 'n-success': 22, 'n-trials': 26, 'p-success': 0.8214285714285714, 'p-test': 3.65655170486763, 'repness': 1.5945378151260503, 'repness-test': 2.9577381047491125, 'repful-for': 'agree', 'n-agree': 22, 'best-agree': True
    },
    {'tid': 2446, 'n-success': 23, 'n-trials': 29, 'p-success': 0.7741935483870968, 'p-test': 3.2863353450309973, 'repness': 1.6774193548387095, 'repness-test': 3.0279141151771283, 'repful-for': 'agree'
    },
    {'tid': 2453, 'n-success': 22, 'n-trials': 29, 'p-success': 0.7419354838709677, 'p-test': 2.9211869733608866, 'repness': 1.632258064516129, 'repness-test': 2.7835490763429362, 'repful-for': 'agree'
    },
    {'tid': 2443, 'n-success': 18, 'n-trials': 27, 'p-success': 0.6551724137931034, 'p-test': 1.8898223650461365, 'repness': 2.129310344827586, 'repness-test': 3.2693281361911914, 'repful-for': 'disagree'
    },
    {'tid': 2444, 'n-success': 20, 'n-trials': 29, 'p-success': 0.6774193548387096, 'p-test': 2.190890230020664, 'repness': 1.518353726362625, 'repness-test': 2.2360438379992784, 'repful-for': 'disagree'
    }
  ],
  2: [
    {'tid': 2443, 'n-success': 18, 'n-trials': 21, 'p-success': 0.8260869565217391, 'p-test': 3.411211461689767, 'repness': 2.2558528428093645, 'repness-test': 4.028539006090745, 'repful-for': 'agree', 'n-agree': 18, 'best-agree': True
    },
    {'tid': 2436, 'n-success': 18, 'n-trials': 23, 'p-success': 0.76, 'p-test': 2.8577380332470406, 'repness': 1.4378378378378378, 'repness-test': 2.202200210889867, 'repful-for': 'agree'
    }
  ],
  3: [
    {'tid': 2444, 'n-success': 17, 'n-trials': 20, 'p-success': 0.8181818181818182, 'p-test': 3.273268353539885, 'repness': 3.784090909090909, 'repness-test': 5.361874200918158, 'repful-for': 'agree', 'n-agree': 17, 'best-agree': True
    },
    {'tid': 2443, 'n-success': 19, 'n-trials': 20, 'p-success': 0.9090909090909091, 'p-test': 4.146139914483855, 'repness': 2.618181818181818, 'repness-test': 4.8341813818288, 'repful-for': 'agree'
    },
    {'tid': 2446, 'n-success': 19, 'n-trials': 19, 'p-success': 0.9523809523809523, 'p-test': 4.47213595499958, 'repness': 2.100840336134454, 'repness-test': 4.338066253392037, 'repful-for': 'agree'
    },
    {'tid': 2451, 'n-success': 19, 'n-trials': 20, 'p-success': 0.9090909090909091, 'p-test': 4.146139914483855, 'repness': 2.1700879765395893, 'repness-test': 4.27781555325104, 'repful-for': 'agree'
    },
    {'tid': 2453, 'n-success': 18, 'n-trials': 19, 'p-success': 0.9047619047619048, 'p-test': 4.024922359499621, 'repness': 2.022408963585434, 'repness-test': 3.973835323186045, 'repful-for': 'agree'
    }
  ]
}

After:

{
  0: [
    {'tid': 2451, 'n-success': 17, 'n-trials': 21, 'p-success': 0.782608695652174, 'p-test': 2.9848100289785466, 'repness': 3.8086956521739133, 'repness-test': 5.244952713874628, 'repful-for': 'disagree'
    },
    {'tid': 2443, 'n-success': 18, 'n-trials': 22, 'p-success': 0.7916666666666666, 'p-test': 3.1277162108561214, 'repness': 2.7708333333333335, 'repness-test': 4.506763538720049, 'repful-for': 'disagree'
    },
    {'tid': 2453, 'n-success': 15, 'n-trials': 22, 'p-success': 0.6666666666666666, 'p-test': 1.8766297265136724, 'repness': 3.4761904761904763, 'repness-test': 4.501864540706584, 'repful-for': 'disagree'
    },
    {'tid': 2464, 'n-success': 9, 'n-trials': 11, 'p-success': 0.7692307692307693, 'p-test': 2.3094010767585034, 'repness': 2.9585798816568047, 'repness-test': 3.638873891389283, 'repful-for': 'disagree'
    },
    {'tid': 2444, 'n-success': 15, 'n-trials': 21, 'p-success': 0.6956521739130435, 'p-test': 2.1320071635561044, 'repness': 1.4936061381074168, 'repness-test': 2.098245801940926, 'repful-for': 'disagree'
    }
  ],
  1: [
    {'tid': 2446, 'n-success': 23, 'n-trials': 29, 'p-success': 0.7741935483870968, 'p-test': 3.2863353450309973, 'repness': 1.6774193548387095, 'repness-test': 3.0279141151771283, 'repful-for': 'agree'
    },
    {'tid': 2439, 'n-success': 22, 'n-trials': 26, 'p-success': 0.8214285714285714, 'p-test': 3.65655170486763, 'repness': 1.5945378151260503, 'repness-test': 2.9577381047491125, 'repful-for': 'agree'
    },
    {'tid': 2453, 'n-success': 22, 'n-trials': 29, 'p-success': 0.7419354838709677, 'p-test': 2.9211869733608866, 'repness': 1.632258064516129, 'repness-test': 2.7835490763429362, 'repful-for': 'agree'
    },
    {'tid': 2443, 'n-success': 18, 'n-trials': 27, 'p-success': 0.6551724137931034, 'p-test': 1.8898223650461365, 'repness': 2.129310344827586, 'repness-test': 3.2693281361911914, 'repful-for': 'disagree'
    },
    {'tid': 2444, 'n-success': 20, 'n-trials': 29, 'p-success': 0.6774193548387096, 'p-test': 2.190890230020664, 'repness': 1.518353726362625, 'repness-test': 2.2360438379992784, 'repful-for': 'disagree'
    }
  ],
  2: [
    {'tid': 2443, 'n-success': 18, 'n-trials': 21, 'p-success': 0.8260869565217391, 'p-test': 3.411211461689767, 'repness': 2.2558528428093645, 'repness-test': 4.028539006090745, 'repful-for': 'agree'
    },
    {'tid': 2436, 'n-success': 18, 'n-trials': 23, 'p-success': 0.76, 'p-test': 2.8577380332470406, 'repness': 1.4378378378378378, 'repness-test': 2.202200210889867, 'repful-for': 'agree'
    }
  ],
  3: [
    {'tid': 2444, 'n-success': 17, 'n-trials': 20, 'p-success': 0.8181818181818182, 'p-test': 3.273268353539885, 'repness': 3.784090909090909, 'repness-test': 5.361874200918158, 'repful-for': 'agree'
    },
    {'tid': 2443, 'n-success': 19, 'n-trials': 20, 'p-success': 0.9090909090909091, 'p-test': 4.146139914483855, 'repness': 2.618181818181818, 'repness-test': 4.8341813818288, 'repful-for': 'agree'
    },
    {'tid': 2446, 'n-success': 19, 'n-trials': 19, 'p-success': 0.9523809523809523, 'p-test': 4.47213595499958, 'repness': 2.100840336134454, 'repness-test': 4.338066253392037, 'repful-for': 'agree'
    },
    {'tid': 2451, 'n-success': 19, 'n-trials': 20, 'p-success': 0.9090909090909091, 'p-test': 4.146139914483855, 'repness': 2.1700879765395893, 'repness-test': 4.27781555325104, 'repful-for': 'agree'
    },
    {'tid': 2453, 'n-success': 18, 'n-trials': 19, 'p-success': 0.9047619047619048, 'p-test': 4.024922359499621, 'repness': 2.022408963585434, 'repness-test': 3.973835323186045, 'repful-for': 'agree'
    }
  ]
}
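
As a sanity check on the numbers above, the p-success and p-test values appear consistent with a smoothed proportion and a one-proportion z-score computed from n-success and n-trials. The formulas below are inferred from the test data, not copied from the implementation, and the function names are hypothetical.

```python
import math


def p_success(n_success: int, n_trials: int) -> float:
    """Smoothed success proportion (inferred formula, an assumption)."""
    return (n_success + 1) / (n_trials + 2)


def p_test(n_success: int, n_trials: int) -> float:
    """z-score of the proportion against 0.5 (inferred formula, an assumption)."""
    return 2 * math.sqrt(n_trials + 1) * ((n_success + 1) / (n_trials + 1) - 0.5)


# tid 2451 in group 0: n-success=17, n-trials=21
assert math.isclose(p_success(17, 21), 0.782608695652174, rel_tol=1e-9)
assert math.isclose(p_test(17, 21), 2.9848100289785466, rel_tol=1e-9)

# tid 2460 in the "Before" data: n-success=0, n-trials=11
assert math.isclose(p_success(0, 11), 0.07692307692307693, rel_tol=1e-9)
assert math.isclose(p_test(0, 11), -2.8867513459481287, rel_tol=1e-9)
```

The tid 2460 entry in the "Before" data has n-success 0 and a negative p-test while being flagged best-agree, which illustrates the mis-attribution this PR addresses.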

nicobao · 2025-08-20T15:42:49Z

Test fails because best-agree was removed

nicobao · 2025-08-25T23:20:36Z

Hi @patcon, any feedback, so we can merge?

nicobao · 2025-09-05T22:10:49Z

Hey @patcon, could you tell me how to go through sufficient_statements to add the following:

best_agree.update({"n-agree": best_agree["n-success"], "best-agree": True})

to the single statement in sufficient_statements that has the highest repness_test among those with repful_for=agree (if any)?

And if sufficient_statements is empty and we fall back to best_overall, do the same for best_overall.

(I am a noob when it comes to working with tf data objects)

(As I said earlier, it's possible we don't have any best-agree at all with that method if we only have disagree representative opinions, but that seems fine to me?)
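
For illustration, the intended behaviour could look roughly like this if the statements were plain Python dicts. This is a sketch under that assumption; the real sufficient_statements object may be a different data structure, and the function name `mark_best_agree` is hypothetical.

```python
from typing import Any, Optional

Statement = dict[str, Any]


def mark_best_agree(
    sufficient_statements: list[Statement],
    best_overall: Optional[Statement] = None,
) -> Optional[Statement]:
    """Flag the 'agree' statement with the highest repness-test, in place.

    Falls back to best_overall when sufficient_statements is empty.
    Returns the flagged statement, or None when there is no 'agree' candidate.
    """
    if sufficient_statements:
        pool = sufficient_statements
    elif best_overall is not None:
        pool = [best_overall]
    else:
        pool = []

    agrees = [s for s in pool if s.get("repful-for") == "agree"]
    if not agrees:
        return None  # only 'disagree' statements: no best-agree at all

    best_agree = max(agrees, key=lambda s: s["repness-test"])
    best_agree.update({"n-agree": best_agree["n-success"], "best-agree": True})
    return best_agree


# A subset of group 1 from the "After" test data.
group_1 = [
    {"tid": 2446, "n-success": 23, "repness-test": 3.0279141151771283, "repful-for": "agree"},
    {"tid": 2439, "n-success": 22, "repness-test": 2.9577381047491125, "repful-for": "agree"},
    {"tid": 2443, "n-success": 18, "repness-test": 3.2693281361911914, "repful-for": "disagree"},
]
flagged = mark_best_agree(group_1)
print(flagged["tid"], flagged["n-agree"])  # 2446 23
```

With this approach tid 2443 is never flagged even though its repness-test is the highest in the group, because it is representative for "disagree".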

nicobao · 2025-09-23T19:26:25Z

I need your help @patcon to understand why there are so many test errors


- Remove unused 'stat' import
- Simplify significance checks by removing redundant vote count validations
- Streamline repful_for calculation by removing nested conditionals
- Lower minimum confidence threshold from 0.7 to 0.6 for statement selection
- Improve confidence selection to prefer exact pick_max matches over near-misses
- Remove best-agree flag assignment logic

nicobao changed the title ~~fix: representative opinions selection and stat formatting~~ fix: representative opinions selection and comment stats formatting Aug 20, 2025

nicobao mentioned this pull request Sep 8, 2025

feat(stats): allow users to rank all opinions by representativeness for each group #105

Open

nicobao added 18 commits September 23, 2025 21:42

fix: repful_for was wrongly attributed and the wrong fields were filled

1301ce8

fix: use absolute for comparing rat/rdt as it can be negative

1255087

fix: try to compare just probabilities

ad81762

fix: try comparing prob without abs

651dc54

fix: testing always agree

e4b002d

fix: removing format_comment_stats unused argument

d794f7c

fix: attempt to understand values

7cae672

fix: unify repful_for calc and get rid of bugged best-agree for now

8c57996

fix: use pat and pdt correctly

070ded2

fix: always pick representative opinons up to pick_max, best first

5c065a4

fix: typo in accessing repness-test

0ffc4e2

fix(repness): lower confidence instead of directly relying on best_overall

8c2c6f9

feat: try to add representative opinions till pick_max & max 0.6 confidence

185cc4a

fix: remove unecessary sort

25b3cea

fix(repness): actually choose all repness for a given confidence

93e1434

feat(repness): improve algorithm to find the best repness

93ab314

fix: add back best-agree

15f3ccf

feat: what if both are significant

c67cb6e

nicobao force-pushed the fix-repful-for branch from a06b6d8 to c67cb6e Compare September 23, 2025 19:43

nicobao added 4 commits October 1, 2025 16:37

feat: improve alg

c9af699

fix: don't take "rep" statemnts with no agree or disagree

75f8d51

fix: syntax error

e6fd28d
