Conversation

@matt-o-how (Contributor) commented Sep 25, 2025:

This PR adds a costed and cached sha256tree operator. Cost is charged as if no caching were taking place, so the caching strategy can be improved in the future without breaking consensus.

Attached are the benchmarked performance graphs for the cost of a call, the cost per pair, and the cost per 32-byte chunk.

[Graphs: sha256tree-per-pair, sha256tree-per-byte, sha256tree-base]
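For context, the consensus-neutral caching works by storing, alongside each cached hash, the cost the uncached computation would have incurred, and re-charging that cost on every cache hit. A minimal sketch of the idea (the tree_hash_uncached helper is a hypothetical stand-in for the real cost-accounted traversal in src/treehash.rs, and the TreeCache type is quoted later in this review):

fn tree_hash(a: &Allocator, n: NodePtr, cache: &mut TreeCache, cost: &mut Cost) -> [u8; 32] {
    // cache hit: charge the same cost the uncached computation would have used
    if let Some((hash, memoized_cost)) = cache.get(n) {
        *cost += memoized_cost;
        return *hash;
    }
    let cost_before = *cost;
    // hypothetical helper standing in for the real, cost-accounted traversal
    let hash = tree_hash_uncached(a, n, cost);
    // memoize only subtrees that are referenced more than once
    if cache.should_memoize(n) {
        cache.insert(n, &hash, *cost - cost_before);
    }
    hash
}

This way the total cost is identical whether or not the cache fires, so the caching strategy can change later without a consensus change.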


coveralls-official bot commented Sep 25, 2025

Pull Request Test Coverage Report for Build 20073713856

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 86 of 335 (25.67%) changed or added relevant lines in 6 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-3.0%) to 87.278%

Changes Missing Coverage:

File                                   Covered Lines   Changed/Added Lines   %
src/chia_dialect.rs                    0               1                     0.0%
src/test_ops.rs                        0               1                     0.0%
src/treehash.rs                        81              109                   74.31%
tools/src/bin/sha256tree-benching.rs   0               219                   0.0%

Totals:
  Change from base Build 18973962912: -3.0%
  Covered Lines: 6380
  Relevant Lines: 7310

💛 - Coveralls

@matt-o-how marked this pull request as ready for review October 27, 2025 15:49

@arvidn (Contributor) left a comment:


There is a tool that establishes a reasonable cost for new operators, and it can be extended. We need some kind of benchmark to set the cost.


@arvidn (Contributor) left a comment:


do you feel confident that the cost benchmarks are good? specifically the cost per byte, cost per pair and cost per atom?


seed(1337)

SHA256TREE_BASE_COST = 0
Contributor:

If the correct value for this is 0, I think we should just remove it


@arvidn (Contributor) commented Nov 20, 2025:

New Raspberry Pi results:

opcode: sha256tree (65)
   time: per-32byte: 132.36ns
   cost: per-32byte: 851
   time: base: 69.57ns
   cost: base: 165
   time: per-node: 586.04ns
   cost: per-node: 3766
   intercept: 293355.28

MacBookPro M1:

opcode: sha256tree (65)
   time: per-32byte: 92.18ns
   cost: per-32byte: 592
   time: base: 31.28ns
   cost: base: 100
   time: per-node: 383.56ns
   cost: per-node: 2465
   intercept: 183875.97
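(For reference, the per-32-byte and per-node rows on both machines correspond to the same scale of roughly 6.4 cost units per nanosecond: 851 / 132.36 ≈ 592 / 92.18 ≈ 3766 / 586.04 ≈ 2465 / 383.56 ≈ 6.4. The base rows do not follow that ratio.)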

src/treehash.rs Outdated
Comment on lines 32 to 133
#[derive(Default)]
pub struct TreeCache {
    hashes: Vec<[u8; 32]>,
    // parallel vector holding the cost used to compute the corresponding hash
    costs: Vec<Cost>,
    // each entry is an index into hashes and costs, or one of 3 special values:
    // u16::MAX if the pair has not been visited
    // u16::MAX - 1 if the pair has been seen once
    // u16::MAX - 2 if the pair has been seen at least twice (this makes it a
    // candidate for memoization)
    pairs: Vec<u16>,
}

const NOT_VISITED: u16 = u16::MAX;
const SEEN_ONCE: u16 = u16::MAX - 1;
const SEEN_MULTIPLE: u16 = u16::MAX - 2;

impl TreeCache {
    /// Get cached hash and its associated cost (if present).
    pub fn get(&self, n: NodePtr) -> Option<(&[u8; 32], Cost)> {
        // We only cache pairs (for now)
        if !matches!(n.object_type(), ObjectType::Pair) {
            return None;
        }

        let idx = n.index() as usize;
        let slot = *self.pairs.get(idx)?;
        // values at or above SEEN_MULTIPLE are sentinels, not cache indices
        if slot >= SEEN_MULTIPLE {
            return None;
        }
        Some((&self.hashes[slot as usize], self.costs[slot as usize]))
    }

    /// Insert a cached hash with its associated cost. If the cache is full we
    /// ignore the insertion.
    pub fn insert(&mut self, n: NodePtr, hash: &[u8; 32], cost: Cost) {
        // If we've reached the max size, just ignore new cache items
        if self.hashes.len() == SEEN_MULTIPLE as usize {
            return;
        }

        if !matches!(n.object_type(), ObjectType::Pair) {
            return;
        }

        let idx = n.index() as usize;
        if idx >= self.pairs.len() {
            self.pairs.resize(idx + 1, NOT_VISITED);
        }

        let slot = self.hashes.len();
        self.hashes.push(*hash);
        self.costs.push(cost);
        self.pairs[idx] = slot as u16;
    }

    /// mark the node as being visited. Returns true if we need to
    /// traverse visitation down this node.
    fn visit(&mut self, n: NodePtr) -> bool {
        if !matches!(n.object_type(), ObjectType::Pair) {
            return false;
        }
        let idx = n.index() as usize;
        if idx >= self.pairs.len() {
            self.pairs.resize(idx + 1, NOT_VISITED);
        }
        // step NOT_VISITED -> SEEN_ONCE -> SEEN_MULTIPLE; SEEN_MULTIPLE and
        // cached slot indices (below SEEN_MULTIPLE) are left alone
        if self.pairs[idx] > SEEN_MULTIPLE {
            self.pairs[idx] -= 1;
        }
        self.pairs[idx] == SEEN_ONCE
    }

    pub fn should_memoize(&mut self, n: NodePtr) -> bool {
        if !matches!(n.object_type(), ObjectType::Pair) {
            return false;
        }
        let idx = n.index() as usize;
        if idx >= self.pairs.len() {
            false
        } else {
            self.pairs[idx] <= SEEN_MULTIPLE
        }
    }

    pub fn visit_tree(&mut self, a: &Allocator, node: NodePtr) {
        if !self.visit(node) {
            return;
        }
        let mut nodes = vec![node];
        while let Some(n) = nodes.pop() {
            let SExp::Pair(left, right) = a.sexp(n) else {
                continue;
            };
            if self.visit(left) {
                nodes.push(left);
            }
            if self.visit(right) {
                nodes.push(right);
            }
        }
    }
}
Contributor:

I think all code related to the cached version should be removed until we have a good solution. I suspect it will need to look very different from this

src/treehash.rs Outdated
enum TreeOp {
    SExp(NodePtr),
    Cons,
    ConsAddCacheCost(NodePtr, Cost),
Contributor:

including this one

src/treehash.rs Outdated
match op {
    TreeOp::SExp(node) => {
        // charge a call cost for processing this op
        cost += SHA256TREE_COST_PER_NODE;
Contributor:

here, "node" means both pair and atom, right? When you measure these costs in the benchmark, you just have base-cost, per-32-bytes-cost and per-pair cost, right? It's not clear to me how you establish the cost-per-node.

src/treehash.rs Outdated
            }
        }
        NodeVisitor::Pair(left, right) => {
            increment_bytes(65, &mut cost);
Contributor:

this also doesn't seem to match how you measure these costs in the benchmark.

src/treehash.rs Outdated
Comment on lines 13 to 15
const SHA256TREE_BASE_COST: Cost = 30;
const SHA256TREE_COST_PER_NODE: Cost = 3000;
const SHA256TREE_COST_PER_32_BYTES: Cost = 700;
Contributor:

It's not obvious to me how you establish these costs. They seem to represent different metrics than what you actually measure in the benchmark.

I think it would be good to add comments to these constants to describe what they represent.

}

// this adds 32 bytes at a time compared to per_byte which adds 5 at a time
fn time_per_byte_for_atom(a: &mut Allocator, op: &Operator, output: &mut dyn Write) -> (f64, f64) {
Contributor:

it makes sense that just looking at the slope here, you isolate the cost per 32 bytes. The constant factor is unknown as it includes base cost and cost per node.
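In other words, the model fitted here is time(n) ≈ c + n · time_per_32_bytes, so the regression slope isolates the per-32-byte term while the constant c lumps together the base cost and the per-node cost of the fixed structure around the atom.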

        samples.push(sample);
    }

    linear_regression_of(&samples).expect("linreg failed")
Contributor:

IIUC, the slope here represents 2 × cost-per-node + 3 × cost-per-32-bytes, since you charge for hashing 65 bytes (three padded 32-byte chunks) in the pair case of the tree hash function.

So, in order to isolate the cost-per-node, you need to subtract 3 × cost-per-32-bytes and then divide by two. Am I missing something?

Since this is a bit more complex than the other benchmarks, which isolate a single cost per test, it would be good to add some comments. It might also be good to reconsider how to apply cost. If you don't charge the cost for hashing 65 bytes for a pair, this slope becomes cost-per-pair + cost-per-atom. But then you'd need to isolate the cost per atom as well, separate from bytes.
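Written out, the claim above is:

    slope = 2 · cost_per_node + 3 · cost_per_32_bytes
    cost_per_node = (slope − 3 · cost_per_32_bytes) / 2

where the 3 comes from charging ceil(65 / 32) = 3 chunks for the 65-byte pair hash, and the 2 from each list step adding two nodes: one cons box and one atom.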

Comment on lines 86 to 89
    (f64, f64),
    (f64, f64), // time slopes
    (f64, f64),
    (f64, f64), // cost slopes
Contributor:

As far as I can tell, only the second value in each of these pairs is a slope; the first is the constant, where the line intersects x = 0.

println!("CLVM cost slope : {:.4}", atom_clvm_c.0);

println!("list results: ");
println!("Native time slope (ns): {:.4}", cons_nat_t.0);
Contributor:

I don't think this is a slope. But labelling these as "slope" isn't very helpful either; they are just cost-per-pair, or cost-per-32-bytes, or things like that, right?

And the constants don't mean anything at all, as far as I can tell.

Comment on lines +81 to +82
    bytes32_native_cost: f64,
    bytes32_clvm_cost: f64,
Contributor:

I think the cost should be specified as u64

Comment on lines +92 to +93
    f64,
    f64, // cost slopes
Contributor:

I think cost should be u64. Also, referring to these as "slope" isn't very helpful, I think. It kind of tells you that it's cost per something, but I think it would be much better to refer to these by what they actually represent. It's time per node and cost per node, right? But what are the two other values?

    let result_1 = node_to_bytes(a, red.1).expect("should work");
    let duration = start.elapsed().as_nanos() as f64;
    let duration = (duration - (3.0 * bytes32_native_time)) / 2.0;
    let cost = (cost as f64 - (3.0 * bytes32_native_cost)) / 2.0;
Contributor:

Are the 3.0 and 2.0 here constants to convert from time to cost? I think they should be named constants with a comment explaining how you arrived at them.
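For instance (the names are suggestions, not from the PR):

// each list step hashes one 65-byte pair, i.e. ceil(65 / 32) = 3 padded 32-byte chunks
const CHUNKS_PER_PAIR_HASH: f64 = 3.0;
// each list step adds two nodes: one cons box and one atom
const NODES_PER_LIST_STEP: f64 = 2.0;

let duration = (duration - CHUNKS_PER_PAIR_HASH * bytes32_native_time) / NODES_PER_LIST_STEP;
let cost = (cost as f64 - CHUNKS_PER_PAIR_HASH * bytes32_native_cost) / NODES_PER_LIST_STEP;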

Contributor:

I don't understand why you alter cost here. Shouldn't you report the actual cost reported from the CLVM interpreter?

Comment on lines +151 to +152
linear_regression_of(&samples_cost_native).unwrap().0,
linear_regression_of(&samples_cost_clvm).unwrap().0,
Contributor:

is there a good reason to collect both time and cost in parallel, and run regression analysis on both, independently?

I would expect all cost samples to be exactly proportional to the time samples, since they're just multiplied. You could just apply the multiplication at the end instead, and only run analysis on the timings.
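Since the fit is linear in the samples, scaling every sample by a constant scales both fitted parameters: if cost_i = k · time_i for all samples, then slope_cost = k · slope_time and intercept_cost = k · intercept_time, so one regression on the timings is enough.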

Contributor:

The primary objective of this program is to compare the cost of the native shatree operator against the cost of the CLVM implementation of shatree. I think it would be sufficient to run a few example trees (maybe simple ones) and print the costs for the two implementations.

I don't think you need to time anything, or run any regression analysis here. All that is already done in benchmark-clvm-cost.rs, right?


    let duration = start.elapsed().as_nanos() as f64;
    let duration = (duration - (3.0 * bytes32_clvm_time)) / 2.0;
    let cost = (cost as f64 - (3.0 * bytes32_clvm_cost)) / 2.0;
Contributor:

same here, I would expect this program to compare the cost of the native operator to the CLVM implementation. But here you alter the cost.

src/treehash.rs Outdated
  const SHA256TREE_BASE_COST: Cost = 30;
- const SHA256TREE_COST_PER_NODE: Cost = 3000;
+ // this is the cost per node, whether it is a cons box or an atom
+ const SHA256TREE_COST_PER_NODE: Cost = 2000;
Contributor:

this comment suggests that this cost is the base cost for both an atom and a pair. I believe the benchmark isn't measuring that cost, it's measuring the cost for a pair, only.

Author (@matt-o-how):

It was applying this cost for both pairs and atoms.

src/treehash.rs Outdated

  // the base cost is the cost of calling it to begin with
- const SHA256TREE_BASE_COST: Cost = 30;
+ const SHA256TREE_BASE_COST: Cost = 0;
Contributor:

I don't think this can be 0. (sha256tree ()) would have very low cost otherwise.

Author (@matt-o-how):

It is now set to the same base cost as sha256

    run_program(a, &dialect, call, a.nil(), 11000000000).unwrap();
    let duration = start.elapsed();
    let sample = (i as f64, duration.as_nanos() as f64);
    let duration_f64 = (duration.as_nanos() as f64 - (4.0 * time_per_byte32)) / 2.0;
Contributor:

I think this / 2.0 warrants a comment

// this function is used for calculating a theoretical cost per node
// we pass in the time it takes for a byte32 chunk and subtract 4*chunk_time
// we then divide by two to account for the fact that we are adding a nil atom each time the list grows too
// this is because atoms are nodes too
Contributor:

What's the point of this function? It doesn't look like it's used for anything. If you were to make it measure the cost of 1 pair and 1 NIL-atom, you could use the result to validate the model of only charging for the blocks of hashed bytes. But as it is right now, what do you do with the result?

Do you check to see that it's close to 0?

Author (@matt-o-how):

I suppose that for the purposes of this file it makes sense to remove this legacy costing function, since we no longer have a cost for the thing it was trying to measure. I have now removed it.

Contributor:

but now you've re-introduced a cost per node

/*
This file is for comparing the native sha256tree with the clvm implementation which previously existed.
The costs for the native implementation should be lower as it is not required to make allocations.
*/
Contributor:

It's not clear to me how the output of this program should be interpreted. We want it to tell whether the current cost for the native shatree operator is reasonable compared to the CLVM implementation.

How can we tell?

Why does the actual timing matter? It's just the cost that matters, isn't it?

Author (@matt-o-how):

I believe we want to output the timing as well, so we can check that the cost is close to the timing × cost_factor measurement. I have added a comment to this end.

Comment on lines +156 to +157
    let duration = (duration - (500.0 + i as f64) * (4.0 * bytes32_clvm_time)) / 2.0;
    let cost = (cost as f64 - (500.0 + i as f64) * (4.0 * bytes32_clvm_cost)) / 2.0;
Contributor:

Thinking some more about this: it doesn't seem safe to assume that the remaining time (after subtracting the 4 sha256 blocks) is evenly distributed between the node and the pair.

But I still don't understand why you're measuring this here. I expect this cost value to match whatever constant you've picked. So why "measure" it?

Contributor:

It makes me nervous to not have any measurements of trees with the sha256tree operator. The only thing you measure now is the time for hashing a single atom; you then assume that the cost of a pair is 3 blocks. It would be good to have a measurement confirming this assumption.
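A sketch of such a measurement: build complete binary trees of increasing depth and regress over the node count, which pins down the per-pair term independently of the list benchmark (this assumes the crate's Allocator API; the leaves deliberately share a single atom, which is fine for timing the uncached implementation):

fn build_tree(a: &mut Allocator, depth: u32) -> NodePtr {
    // all leaves share one 1-byte atom
    let leaf = a.new_atom(&[0x55]).expect("new_atom");
    let mut layer = vec![leaf; 1usize << depth];
    // repeatedly pair up adjacent nodes until a single root remains
    while layer.len() > 1 {
        let mut next = Vec::with_capacity(layer.len() / 2);
        for p in layer.chunks(2) {
            next.push(a.new_pair(p[0], p[1]).expect("new_pair"));
        }
        layer = next;
    }
    layer[0]
}

A tree of depth d has 2^d leaves and 2^d − 1 pairs, so timing (sha256tree tree) across several depths gives a per-node slope that can be checked against the model's predictions.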

    );

    // taken from benchmark-clvm-cost.rs
    let cost_scale = ((101094.0 / 39000.0) + (1343980.0 / 131000.0)) / 2.0;
Contributor:

This looked suspicious to me, partly because this benchmark is already measuring both the cost and the timing, so you already know what the scale factor is. Why do you need this constant?

When looking into it, I found the comment above it in benchmark-clvm-cost.rs. It says:

// this "magic" scaling depends on the computer you run the tests on.
// It's calibrated against the timing of point_add, which has a cost

I'm pretty sure that means you have to update it to match your computer in order for the final cost to make sense. It would be nice to automatically establish this scale, but that would perhaps be a bit of scope creep.
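For what it's worth, the constant evaluates to (101094 / 39000 + 1343980 / 131000) / 2 ≈ (2.59 + 10.26) / 2 ≈ 6.43, which matches the ≈6.4 cost-per-nanosecond ratio visible in the Raspberry Pi and M1 results above. That the two calibration terms differ by a factor of four underlines how machine-specific this scale is.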

Comment on lines +13 to +18
const SHA256TREE_BASE_COST: Cost = 87;
// this cost is applied for every node we traverse to
const SHA256TREE_NODE_COST: Cost = 500;
// this is the cost for every 32 bytes in a sha256 call
// it is set to the same as sha256
const SHA256TREE_COST_PER_32_BYTES: Cost = 64;
Contributor:

I think we have to be very careful and deliberate when picking these costs. We want to be absolutely certain we won't regret it. Right now I don't really understand how you arrive at these numbers. I'm hoping the PR description can explain how they were picked, along with the measurements used to decide them.

I especially think SHA256TREE_NODE_COST is on shaky ground right now, as there isn't a measurement for a list (or a tree) parameter.

Ideally we can demonstrate that the model we pick works for both lists and trees.

i.e.

(a . (b . (c . (d . (...)))))

as well as:

(((a . a) . (a . a)) . ((a . a) . (a . a)))

etc.
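A sketch of that cross-check, predicting the model's cost from node and chunk counts so it can be compared against measurements of both shapes (constants as quoted above; the node counting is spelled out in the comments):

// predicted cost under the proposed model
fn predicted_cost(nodes: u64, chunks: u64) -> u64 {
    const SHA256TREE_BASE_COST: u64 = 87;
    const SHA256TREE_NODE_COST: u64 = 500;
    const SHA256TREE_COST_PER_32_BYTES: u64 = 64;
    SHA256TREE_BASE_COST + nodes * SHA256TREE_NODE_COST + chunks * SHA256TREE_COST_PER_32_BYTES
}

// a proper list of n atoms, (a . (b . (... . ()))), has n pairs, n element
// atoms and one terminating nil: 2n + 1 nodes
// a complete binary tree over n leaf atoms has n - 1 pairs and n leaves:
// 2n - 1 nodes

If measured timings for both shapes sit on the line this function predicts, the model holds; if not, the per-node constant needs revisiting.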
