@jycor jycor commented Aug 29, 2025

Changes:

  • use the commitMetadata and commitHeight (if available) in the iterator
  • reduce string and regex operations when resolving a commit spec that is a known hash
  • directly access the parent rather than allocating a slice

Together, these changes give a roughly 1.7x speedup in select * from dolt_log() queries.
Addresses: #9743
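
The second and third bullets can be illustrated with a minimal Go sketch. This is not the code from this PR: isHashLike, commit, and parent are hypothetical names, and the sketch assumes an encoded commit hash is a 32-character string over the characters 0-9 and a-v. It shows a length check plus a byte loop standing in for a regex match, and an accessor that indexes a parent directly instead of building a new slice.

```go
package main

import "fmt"

// hashLen is the assumed length of an encoded commit hash (an assumption of this sketch).
const hashLen = 32

// isHashLike reports whether s has the shape of an encoded hash, using a
// length check and a byte loop instead of a regular expression.
func isHashLike(s string) bool {
	if len(s) != hashLen {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		isDigit := c >= '0' && c <= '9'
		isLetter := c >= 'a' && c <= 'v'
		if !isDigit && !isLetter {
			return false
		}
	}
	return true
}

// commit is a hypothetical stand-in for a commit that knows its parent hashes.
type commit struct {
	parents []string
}

// parent returns the i-th parent hash directly, without allocating a new
// slice of parents for the caller to index into.
func (c *commit) parent(i int) (string, bool) {
	if i < 0 || i >= len(c.parents) {
		return "", false
	}
	return c.parents[i], true
}

func main() {
	fmt.Println(isHashLike("0123456789abcdefghijklmnopqrstuv")) // true
	fmt.Println(isHashLike("HEAD~1"))                           // false

	c := &commit{parents: []string{"p0"}}
	p, ok := c.parent(0)
	fmt.Println(p, ok) // p0 true
}
```

Whether a fast path like this pays off depends on how often the commit spec is already a plain hash; the sketch only shows the shape of the idea.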

@coffeegoddd

@jycor DOLT

comparing_percentages
100.000000 to 100.000000
version result total
27607c6 ok 5937471
version total_tests
27607c6 5937471
correctness_percentage
100.0

@coffeegoddd

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000
version result total
93e9dca ok 5937471
version total_tests
93e9dca 5937471
correctness_percentage
100.0

@coffeegoddd

@jycor DOLT

comparing_percentages
100.000000 to 100.000000
version result total
14f5ca3 ok 5937471
version total_tests
14f5ca3 5937471
correctness_percentage
100.0

@coffeegoddd

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000
version result total
51cf667 ok 5937471
version total_tests
51cf667 5937471
correctness_percentage
100.0

@coffeegoddd

@jycor DOLT

comparing_percentages
100.000000 to 100.000000
version result total
d28de79 ok 5937471
version total_tests
d28de79 5937471
correctness_percentage
100.0

@coffeegoddd

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000
version result total
fa30ff7 ok 5937471
version total_tests
fa30ff7 5937471
correctness_percentage
100.0

@coffeegoddd

@jycor DOLT

comparing_percentages
100.000000 to 100.000000
version result total
36f31a9 ok 5937471
version total_tests
36f31a9 5937471
correctness_percentage
100.0

@coffeegoddd

@coffeegoddd DOLT

comparing_percentages
100.000000 to 100.000000
version result total
11b8b65 ok 5937471
version total_tests
11b8b65 5937471
correctness_percentage
100.0

@coffeegoddd

@jycor DOLT

comparing_percentages
100.000000 to 100.000000
version result total
e8e2f5c ok 5937471
version total_tests
e8e2f5c 5937471
correctness_percentage
100.0


@fulghum fulghum left a comment

Generally looks good. A few small things to clean up.

// Next returns the hash of the next commit, and a pointer to that commit. It handles making sure the list of commits
// returned are unique. When complete Next will return hash.Hash{}, nil, io.EOF
-func (cmItr *commitItr[C]) Next(ctx C) (hash.Hash, *OptionalCommit, error) {
+func (cmItr *commitItr[C]) Next(ctx C) (hash.Hash, *OptionalCommit, *datas.CommitMeta, uint64, error) {

It looks like this Next() implementation never actually returns a *CommitMeta instance, right? That makes the interface tricky and frustrating to use, because the caller doesn't know which implementations return which fields. At minimum, this is worth documenting better: in the main interface definition (noting that some implementations may not return CommitMeta, and that callers should always check for nil before using that value), and perhaps also in the implementations that don't return CommitMeta data, since they don't fully implement the interface's contract.
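
The contract suggested above could be sketched roughly as follows. This is a minimal illustration, not the interface from this PR: commitMeta, commitIter, hashOnlyIter, loadMeta, and collectNames are all hypothetical names, and the real iterator returns more values than shown here. The point is only that the nil case is documented on the interface and that callers check for it before use.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// commitMeta is a hypothetical stand-in for commit metadata.
type commitMeta struct {
	Name string
}

// commitIter is a hypothetical iterator interface.
//
// Contract: meta may be nil when an implementation cannot supply it cheaply.
// Callers must check for nil and load the metadata themselves in that case.
// Next returns io.EOF when the iteration is complete.
type commitIter interface {
	Next() (hash string, meta *commitMeta, err error)
}

// hashOnlyIter walks a fixed list of hashes. It never returns metadata,
// which is stated here so callers are not surprised.
type hashOnlyIter struct {
	hashes []string
	pos    int
}

func (it *hashOnlyIter) Next() (string, *commitMeta, error) {
	if it.pos >= len(it.hashes) {
		return "", nil, io.EOF
	}
	h := it.hashes[it.pos]
	it.pos++
	return h, nil, nil
}

// loadMeta is a hypothetical slow path that fetches metadata by hash.
func loadMeta(h string) *commitMeta {
	return &commitMeta{Name: "loaded:" + h}
}

// collectNames drains the iterator, falling back to loadMeta whenever
// the iterator did not supply metadata.
func collectNames(it commitIter) ([]string, error) {
	var names []string
	for {
		h, meta, err := it.Next()
		if errors.Is(err, io.EOF) {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		if meta == nil {
			meta = loadMeta(h)
		}
		names = append(names, meta.Name)
	}
}

func main() {
	names, err := collectNames(&hashOnlyIter{hashes: []string{"a", "b"}})
	fmt.Println(names, err) // [loaded:a loaded:b] <nil>
}
```

Documenting the nil case on the interface keeps the fallback in one place per caller, rather than requiring each caller to know which implementation it was handed.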

@coffeegoddd

@jycor DOLT

comparing_percentages
100.000000 to 100.000000
version result total
ed65e73 ok 5937471
version total_tests
ed65e73 5937471
correctness_percentage
100.0

@jycor jycor merged commit 5fa69b5 into main Aug 30, 2025
23 checks passed
@jycor jycor deleted the james/commit branch August 30, 2025 00:14
@coffeegoddd

@jycor DOLT

comparing_percentages
100.000000 to 100.000000
version result total
fbe43d4 ok 5937471
version total_tests
fbe43d4 5937471
correctness_percentage
100.0

@github-actions

@coffeegoddd DOLT

| test_name | detail | row_cnt | sorted | mysql_time | sql_mult | cli_mult |
|---|---|---|---|---|---|---|
| batching | LOAD DATA | 10000 | 1 | 0.07 | 1.29 | |
| batching | batch sql | 10000 | 1 | 0.09 | 1.33 | |
| batching | by line sql | 10000 | 1 | 0.08 | 1.5 | |
| blob | 1 blob | 200000 | 1 | 0.9 | 4.02 | 4.68 |
| blob | 2 blobs | 200000 | 1 | 0.86 | 4.6 | 5.01 |
| blob | no blob | 200000 | 1 | 0.96 | 2.5 | 2.86 |
| col type | datetime | 200000 | 1 | 0.84 | 2.54 | 2.9 |
| col type | varchar | 200000 | 1 | 0.79 | 3.22 | 3.57 |
| config width | 2 cols | 200000 | 1 | 0.89 | 2.42 | 2.78 |
| config width | 32 cols | 200000 | 1 | 1.94 | 2.55 | 2.81 |
| config width | 8 cols | 200000 | 1 | 0.99 | 2.71 | 3.1 |
| pk type | float | 200000 | 1 | 0.87 | 2.48 | 2.87 |
| pk type | int | 200000 | 1 | 2.27 | 0.93 | 1.08 |
| pk type | varchar | 200000 | 1 | 1.55 | 1.83 | 1.88 |
| row count | 1.6mm | 1600000 | 1 | 5.75 | 3.12 | 3.27 |
| row count | 400k | 400000 | 1 | 1.52 | 2.84 | 3.13 |
| row count | 800k | 800000 | 1 | 2.88 | 3.07 | 3.25 |
| secondary index | four index | 200000 | 1 | 3.5 | 1.55 | 1.49 |
| secondary index | no secondary | 200000 | 1 | 0.93 | 2.6 | 2.92 |
| secondary index | one index | 200000 | 1 | 1.12 | 2.69 | 2.94 |
| secondary index | two index | 200000 | 1 | 1.95 | 1.93 | 2.03 |
| sorting | shuffled 1mm | 1000000 | 0 | 5.46 | 2.83 | 2.76 |
| sorting | sorted 1mm | 1000000 | 1 | 5.24 | 2.93 | 2.89 |

@github-actions

@coffeegoddd DOLT

| name | detail | mean_mult |
|---|---|---|
| dolt_blame_basic | system table | 1.18 |
| dolt_blame_commit_filter | system table | 2.85 |
| dolt_commit_ancestors_commit_filter | system table | 0.61 |
| dolt_commits_commit_filter | system table | 1 |
| dolt_diff_log_join_from_commit | system table | 2.71 |
| dolt_diff_log_join_to_commit | system table | 2.66 |
| dolt_diff_table_from_commit_filter | system table | 1.19 |
| dolt_diff_table_to_commit_filter | system table | 1.19 |
| dolt_diffs_commit_filter | system table | 1.03 |
| dolt_history_commit_filter | system table | 1.5 |
| dolt_log_commit_filter | system table | 1.1 |

@github-actions

@coffeegoddd DOLT

| name | add_cnt | delete_cnt | update_cnt | latency |
|---|---|---|---|---|
| adds_only | 60000 | 0 | 0 | 1.12 |
| adds_updates_deletes | 60000 | 60000 | 60000 | 4.51 |
| deletes_only | 0 | 60000 | 0 | 2.36 |
| updates_only | 0 | 0 | 60000 | 3.01 |

@github-actions

@coffeegoddd DOLT

| test_name | detail | row_cnt | sorted | mysql_time | sql_mult | cli_mult |
|---|---|---|---|---|---|---|
| batching | LOAD DATA | 10000 | 1 | 0.05 | 1.8 | |
| batching | batch sql | 10000 | 1 | 0.09 | 1.33 | |
| batching | by line sql | 10000 | 1 | 0.09 | 1.44 | |
| blob | 1 blob | 200000 | 1 | 0.92 | 3.87 | 4.55 |
| blob | 2 blobs | 200000 | 1 | 0.9 | 4.44 | 4.83 |
| blob | no blob | 200000 | 1 | 0.93 | 2.6 | 2.96 |
| col type | datetime | 200000 | 1 | 0.8 | 2.71 | 3.1 |
| col type | varchar | 200000 | 1 | 0.79 | 3.16 | 3.48 |
| config width | 2 cols | 200000 | 1 | 0.84 | 2.55 | 2.96 |
| config width | 32 cols | 200000 | 1 | 1.89 | 2.57 | 2.83 |
| config width | 8 cols | 200000 | 1 | 0.95 | 2.78 | 3.24 |
| pk type | float | 200000 | 1 | 0.88 | 2.47 | 3.01 |
| pk type | int | 200000 | 1 | 0.93 | 2.27 | 2.69 |
| pk type | varchar | 200000 | 1 | 2.76 | 0.97 | 1.18 |
| row count | 1.6mm | 1600000 | 1 | 5.71 | 3.13 | 3.32 |
| row count | 400k | 400000 | 1 | 1.42 | 3.06 | 3.35 |
| row count | 800k | 800000 | 1 | 2.91 | 3.03 | 3.26 |
| secondary index | four index | 200000 | 1 | 3.58 | 1.53 | 1.49 |
| secondary index | no secondary | 200000 | 1 | 0.94 | 2.54 | 2.89 |
| secondary index | one index | 200000 | 1 | 1.08 | 2.83 | 3.07 |
| secondary index | two index | 200000 | 1 | 1.98 | 1.94 | 1.96 |
| sorting | shuffled 1mm | 1000000 | 0 | 5.18 | 2.97 | 2.92 |
| sorting | sorted 1mm | 1000000 | 1 | 4.99 | 3.05 | 3.08 |

@github-actions

@coffeegoddd DOLT

| name | detail | mean_mult |
|---|---|---|
| dolt_blame_basic | system table | 1.18 |
| dolt_blame_commit_filter | system table | 2.82 |
| dolt_commit_ancestors_commit_filter | system table | 0.61 |
| dolt_commits_commit_filter | system table | 1 |
| dolt_diff_log_join_from_commit | system table | 2.75 |
| dolt_diff_log_join_to_commit | system table | 2.74 |
| dolt_diff_table_from_commit_filter | system table | 1.24 |
| dolt_diff_table_to_commit_filter | system table | 1.21 |
| dolt_diffs_commit_filter | system table | 0.97 |
| dolt_history_commit_filter | system table | 1.48 |
| dolt_log_commit_filter | system table | 1.05 |

@github-actions

@coffeegoddd DOLT

| name | add_cnt | delete_cnt | update_cnt | latency |
|---|---|---|---|---|
| adds_only | 60000 | 0 | 0 | 1.11 |
| adds_updates_deletes | 60000 | 60000 | 60000 | 4.53 |
| deletes_only | 0 | 60000 | 0 | 2.35 |
| updates_only | 0 | 0 | 60000 | 3.03 |

@github-actions

github-actions bot commented Sep 1, 2025

@coffeegoddd DOLT

| test_name | detail | row_cnt | sorted | mysql_time | sql_mult | cli_mult |
|---|---|---|---|---|---|---|
| batching | LOAD DATA | 10000 | 1 | 0.06 | 1.5 | |
| batching | batch sql | 10000 | 1 | 0.09 | 1.56 | |
| batching | by line sql | 10000 | 1 | 0.1 | 1.2 | |
| blob | 1 blob | 200000 | 1 | 0.86 | 4.33 | 5.07 |
| blob | 2 blobs | 200000 | 1 | 0.88 | 4.63 | 5.19 |
| blob | no blob | 200000 | 1 | 0.94 | 2.59 | 2.95 |
| col type | datetime | 200000 | 1 | 0.79 | 2.71 | 3.13 |
| col type | varchar | 200000 | 1 | 0.7 | 3.66 | 3.86 |
| config width | 2 cols | 200000 | 1 | 0.78 | 2.74 | 3.19 |
| config width | 32 cols | 200000 | 1 | 1.86 | 2.61 | 2.89 |
| config width | 8 cols | 200000 | 1 | 0.98 | 2.69 | 3.1 |
| pk type | float | 200000 | 1 | 0.86 | 2.49 | 2.94 |
| pk type | int | 200000 | 1 | 0.82 | 2.73 | 3.01 |
| pk type | varchar | 200000 | 1 | 1.49 | 2.07 | 2.17 |
| row count | 1.6mm | 1600000 | 1 | 5.58 | 3.2 | 3.39 |
| row count | 400k | 400000 | 1 | 1.54 | 2.87 | 3.08 |
| row count | 800k | 800000 | 1 | 2.85 | 3.12 | 3.32 |
| secondary index | four index | 200000 | 1 | 3.67 | 1.48 | 1.46 |
| secondary index | no secondary | 200000 | 1 | 0.92 | 2.66 | 2.96 |
| secondary index | one index | 200000 | 1 | 1.15 | 2.64 | 2.86 |
| secondary index | two index | 200000 | 1 | 2.03 | 1.91 | 1.93 |
| sorting | shuffled 1mm | 1000000 | 0 | 5.5 | 2.89 | 2.9 |
| sorting | sorted 1mm | 1000000 | 1 | 5.36 | 2.96 | 3.04 |

@github-actions

github-actions bot commented Sep 1, 2025

@coffeegoddd DOLT

| name | detail | mean_mult |
|---|---|---|
| dolt_blame_basic | system table | 1.17 |
| dolt_blame_commit_filter | system table | 2.84 |
| dolt_commit_ancestors_commit_filter | system table | 0.61 |
| dolt_commits_commit_filter | system table | 1.05 |
| dolt_diff_log_join_from_commit | system table | 2.71 |
| dolt_diff_log_join_to_commit | system table | 2.75 |
| dolt_diff_table_from_commit_filter | system table | 1.21 |
| dolt_diff_table_to_commit_filter | system table | 1.25 |
| dolt_diffs_commit_filter | system table | 1 |
| dolt_history_commit_filter | system table | 1.48 |
| dolt_log_commit_filter | system table | 1.05 |

@github-actions

github-actions bot commented Sep 1, 2025

@coffeegoddd DOLT

| name | add_cnt | delete_cnt | update_cnt | latency |
|---|---|---|---|---|
| adds_only | 60000 | 0 | 0 | 1.1 |
| adds_updates_deletes | 60000 | 60000 | 60000 | 4.56 |
| deletes_only | 0 | 60000 | 0 | 2.34 |
| updates_only | 0 | 0 | 60000 | 3 |

4 participants