Commit dab8065

makenowjust and kou authored
Improve Node#each_recursive performance (#139)
Fix #134

## Summary

This PR does:

- Add `benchmark/each_recursive.yaml`
- Rewrite `Node#each_recursive` implementation for performance
- Add a test for `Node#each_recursive`

`Node#each_recursive` becomes 60~80x faster.

## Details

`each_recursive` is far too slow, as I described in #134. I improved its performance by rewriting its implementation in this PR.

Also, I added a benchmark in `benchmark/each_recursive.yaml`, and the following is the result on my laptop:

```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/makenowjust/Projects/github.com/makenowjust/simple-dotfiles/.asdf/installs/ruby/3.3.2/bin/ruby -v -S benchmark-driver /Users/makenowjust/Projects/github.com/ruby/rexml/benchmark/each_recursive.yaml
ruby 3.3.2 (2024-05-30 revision e5a195edf6) [arm64-darwin23]
Calculating -------------------------------------
                     rexml 3.2.6      master  3.2.6(YJIT)  master(YJIT)
      each_recursive      11.279     686.502       17.926        1.470k i/s -     100.000 times in 8.866303s 0.145666s 5.578360s 0.068018s

Comparison:
                   each_recursive
        master(YJIT):      1470.2 i/s
              master:       686.5 i/s - 2.14x slower
         3.2.6(YJIT):        17.9 i/s - 82.01x slower
         rexml 3.2.6:        11.3 i/s - 130.35x slower
```

We can see that `each_recursive` is now 60~80x faster.

Additionally, I added a new test for `Node#each_recursive`. It was missing, but we need it to confirm that the previous behavior is not broken.

Thank you.

---------

Co-authored-by: Sutou Kouhei <[email protected]>
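The "60~80x" figure can be recovered from the benchmark output by comparing each `master` context with its `3.2.6` counterpart. The snippet below is only a small arithmetic sketch; the hash and variable names are illustrative, and the numbers are copied from the output above.

```ruby
# Iterations per second, copied from the benchmark output in the commit message.
ips = {
  "rexml 3.2.6"  => 11.279,
  "master"       => 686.502,
  "3.2.6(YJIT)"  => 17.926,
  "master(YJIT)" => 1470.2,
}

# Speedup of master over 3.2.6, without and with YJIT.
speedup_plain = ips["master"] / ips["rexml 3.2.6"]
speedup_yjit  = ips["master(YJIT)"] / ips["3.2.6(YJIT)"]

puts format("without YJIT: %.1fx", speedup_plain) # prints "without YJIT: 60.9x"
puts format("with YJIT:    %.1fx", speedup_yjit)  # prints "with YJIT:    82.0x"
```

The "130.35x slower" line in the comparison is a different ratio: it compares `rexml 3.2.6` without YJIT against `master` with YJIT.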
1 parent da67561 commit dab8065

3 files changed: +84 -4 lines changed

benchmark/each_recursive.yaml

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+loop_count: 100
+contexts:
+  - gems:
+      rexml: 3.2.6
+    require: false
+    prelude: require 'rexml'
+  - name: master
+    prelude: |
+      $LOAD_PATH.unshift(File.expand_path("lib"))
+      require 'rexml'
+  - name: 3.2.6(YJIT)
+    gems:
+      rexml: 3.2.6
+    require: false
+    prelude: |
+      require 'rexml'
+      RubyVM::YJIT.enable
+  - name: master(YJIT)
+    prelude: |
+      $LOAD_PATH.unshift(File.expand_path("lib"))
+      require 'rexml'
+      RubyVM::YJIT.enable
+
+prelude: |
+  require 'rexml/document'
+
+  xml_source = +"<root>"
+  100.times do
+    x_node_source = ""
+    100.times do
+      x_node_source = "<x>#{x_node_source}</x>"
+    end
+    xml_source << x_node_source
+  end
+  xml_source << "</root>"
+
+  document = REXML::Document.new(xml_source)
+
+benchmark:
+  each_recursive: document.each_recursive { |_| }
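To see what document the benchmark prelude produces, the same construction can be run at a smaller scale. This is a sketch only: `build_source` and its two parameters are hypothetical names introduced here, not part of the benchmark file.

```ruby
# Hypothetical helper: the same string construction as the benchmark prelude,
# with the two loop counts turned into parameters.
def build_source(chains, depth)
  xml_source = +"<root>"
  chains.times do
    x_node_source = ""
    # Wrap the previous source in one more <x>...</x> on every pass,
    # which produces a single chain nested `depth` levels deep.
    depth.times do
      x_node_source = "<x>#{x_node_source}</x>"
    end
    xml_source << x_node_source
  end
  xml_source << "</root>"
end

small = build_source(2, 3)
puts small

# The benchmark itself uses 100 chains, each 100 levels deep.
full = build_source(100, 100)
puts full.scan("<x>").size
```

With the benchmark's own values the source contains 10,000 `<x>` elements, so each `document.each_recursive` call visits 10,001 elements including `<root>`.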

lib/rexml/node.rb

Lines changed: 8 additions & 4 deletions
@@ -52,10 +52,14 @@ def parent?
 
     # Visit all subnodes of +self+ recursively
     def each_recursive(&block) # :yields: node
-      self.elements.each {|node|
-        block.call(node)
-        node.each_recursive(&block)
-      }
+      stack = []
+      each { |child| stack.unshift child if child.node_type == :element }
+      until stack.empty?
+        child = stack.pop
+        yield child
+        n = stack.size
+        child.each { |grandchild| stack.insert n, grandchild if grandchild.node_type == :element }
+      end
     end
 
     # Find (and return) first subnode (recursively) for which the block
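The new implementation replaces recursion with an explicit stack while keeping document (pre-order) traversal. The sketch below shows the same idea on a plain tree, independent of REXML. `TinyNode` and `each_descendant` are hypothetical names, and the sketch pushes children in reverse with `reverse_each` rather than using `unshift` and `insert` as the commit does; both approaches leave the first child on top of the stack.

```ruby
# Hypothetical stand-in for an XML element: a name plus an array of children.
TinyNode = Struct.new(:name, :children)

# Iterative pre-order traversal of all descendants using an explicit stack.
def each_descendant(node)
  # Reverse the children so that the first child ends up on top of the stack.
  stack = node.children.reverse
  until stack.empty?
    current = stack.pop
    yield current
    # Push this node's children in reverse order; they are popped (visited)
    # before any of the remaining siblings already on the stack.
    current.children.reverse_each { |child| stack.push(child) }
  end
end

tree = TinyNode.new("root", [
  TinyNode.new("1_1", [TinyNode.new("1_2", [TinyNode.new("1_3", [])])]),
  TinyNode.new("2_1", [TinyNode.new("2_2", [TinyNode.new("2_3", [])])]),
])

visited = []
each_descendant(tree) { |node| visited << node.name }
p visited # prints ["1_1", "1_2", "1_3", "2_1", "2_2", "2_3"]
```

Because the traversal keeps its own stack, its depth is limited by available memory rather than by the Ruby call stack.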

test/test_document.rb

Lines changed: 36 additions & 0 deletions
@@ -209,6 +209,42 @@ def test_gt_linear_performance_attribute_value
       end
     end
 
+    def test_each_recursive
+      xml_source = <<~XML
+        <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+        <root name="root">
+          <x name="1_1">
+            <x name="1_2">
+              <x name="1_3" />
+            </x>
+          </x>
+          <x name="2_1">
+            <x name="2_2">
+              <x name="2_3" />
+            </x>
+          </x>
+          <!-- comment -->
+          <![CDATA[ cdata ]]>
+        </root>
+      XML
+
+      expected_names = %w[
+        root
+        1_1 1_2 1_3
+        2_1 2_2 2_3
+      ]
+
+      document = REXML::Document.new(xml_source)
+
+      # Node#each_recursive iterates elements only.
+      # This does not iterate XML declarations, comments, attributes, CDATA sections, etc.
+      actual_names = []
+      document.each_recursive do |element|
+        actual_names << element.attributes["name"]
+      end
+      assert_equal(expected_names, actual_names)
+    end
+
     class WriteTest < Test::Unit::TestCase
       def setup
         @document = REXML::Document.new(<<-EOX)
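The behavior pinned down by `test_each_recursive` can also be tried outside the test suite. This is a minimal sketch, assuming the `rexml` gem is installed and loadable; the XML here is a shortened variant of the test fixture, not the fixture itself.

```ruby
require "rexml/document"

# Shortened variant of the fixture: elements mixed with a declaration,
# a comment, and a CDATA section.
xml_source = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <root name="root">
    <x name="1_1"><x name="1_2" /></x>
    <!-- comment -->
    <![CDATA[ cdata ]]>
    <x name="2_1" />
  </root>
XML

document = REXML::Document.new(xml_source)

# Only elements are yielded, in document order.
names = []
document.each_recursive { |element| names << element.attributes["name"] }
p names
```

The declaration, comment, CDATA section, and whitespace text nodes never reach the block, so `names` holds only the four element names.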
