Goodness of fit for density visualizations #383
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #383 +/- ##
==========================================
+ Coverage 85.04% 85.35% +0.30%
==========================================
Files 56 58 +2
Lines 6661 6827 +166
==========================================
+ Hits 5665 5827 +162
- Misses 996 1000 +4
I'm not sure I can add many additional comments about the implementation itself, as I'm not very familiar with the details. As for the layout of the plot, I agree the two-column plot would be useful in most cases.

Thanks for the comment and discussion. The easy thing to do is to have some utility function

Sounds good to me!
@OriolAbril do you have suggestions for the name of these functions?
This shows the basic computations described in https://arxiv.org/abs/2503.01509 for assessing the quality of a density representation (histogram, KDE, or quantile dot plot). The method uses the probability integral transform (PIT) of the sample, evaluated with the computed density.
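To make the idea concrete, here is a minimal sketch of the PIT for a histogram, assuming plain NumPy rather than the code in this PR. The names `histogram_pit` and `max_ecdf_deviation` are hypothetical, and the maximum ECDF deviation is only a simple stand-in for whatever uniformity check the paper and the PR actually use. If the histogram represents the sample well, the PIT values should look approximately uniform on [0, 1].

```python
import numpy as np


def histogram_pit(sample, bins=10):
    """PIT of `sample` under the density implied by its own histogram."""
    sample = np.asarray(sample, dtype=float)
    counts, edges = np.histogram(sample, bins=bins)
    # CDF of the histogram density at the bin edges: 0 at the first edge, 1 at the last.
    cdf_at_edges = np.concatenate([[0.0], np.cumsum(counts) / counts.sum()])
    # A piecewise-constant density has a piecewise-linear CDF,
    # so linear interpolation between the edges is exact.
    return np.interp(sample, edges, cdf_at_edges)


def max_ecdf_deviation(pit):
    """Largest gap between the ECDF of the PIT values and the uniform CDF."""
    pit_sorted = np.sort(pit)
    ecdf = np.arange(1, pit_sorted.size + 1) / pit_sorted.size
    return float(np.max(np.abs(ecdf - pit_sorted)))


rng = np.random.default_rng(0)
sample = rng.normal(size=2000)
pit = histogram_pit(sample, bins=30)
deviation = max_ecdf_deviation(pit)
print(f"max |ECDF(PIT) - uniform| = {deviation:.3f}")
```

A KDE or quantile dot plot would need its own CDF evaluation in place of the interpolation step; the rest of the check stays the same.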
EDIT
This implements the functions `plot_dgof` and `plot_dgof_dist`. The first one only includes the diagnostics and the second also includes the density in the left column.

Not happy with the `dgof` name.

Depends on arviz-devs/arviz-stats#273
For the visualization part, I guess we would like to have a two-column plot with the dist on the left and the diagnostic on the right. Something like `plot_trace_dist` and `plot_rank_dist`. To avoid duplicating the computation, we should compute the density once and use that information both to plot it on the left and to calculate the diagnostic on the right.

For the computation part, I am not sure whether we want the PIT to be included with the output of histogram/kde/qds, or whether we want a separate function that uses that output.

While working on this, I realized we currently don't support `bins="auto"` (for more than one variable), and hence this diagnostic is usually very bad for the default histogram we compute, arviz-devs/arviz-stats#262
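One way to picture the "compute the density once" option is sketched below, assuming plain NumPy. `HistogramDensity` is a hypothetical container, not an arviz-stats or arviz-plots object: it stores the result of a single histogram computation and derives the PIT from it, so both columns share the same density.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class HistogramDensity:
    """Hypothetical container holding one histogram computation."""

    edges: np.ndarray    # bin edges, length n_bins + 1
    heights: np.ndarray  # density height per bin

    @classmethod
    def from_sample(cls, sample, bins=10):
        sample = np.asarray(sample, dtype=float)
        heights, edges = np.histogram(sample, bins=bins, density=True)
        return cls(edges=edges, heights=heights)

    def pit(self, sample):
        """Evaluate the CDF of the stored density at `sample`."""
        mass = self.heights * np.diff(self.edges)
        cdf_at_edges = np.concatenate([[0.0], np.cumsum(mass)])
        cdf_at_edges /= cdf_at_edges[-1]  # guard against rounding drift
        return np.interp(sample, self.edges, cdf_at_edges)


rng = np.random.default_rng(1)
sample = rng.gamma(shape=2.0, size=1000)

density = HistogramDensity.from_sample(sample, bins=20)  # computed once
left_column = (density.edges, density.heights)  # what the density panel would draw
right_column = density.pit(sample)              # what the diagnostic panel would use
```

The alternative raised above, a separate function that takes the histogram/kde/qds output and returns the PIT, would amount to moving `pit` out of the container; the sketch does not favor either choice.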