@notthelewis

Optimize internal memory layouts and add validation tests

Why this matters

This improves memory and cache efficiency for large OCR outputs (many thousands of boxes). It reduces heap usage and speeds up iteration, particularly in high-throughput or streaming OCR workloads.

Summary of changes

  • Fixed micro-leak in setVariablesToInitializedAPI function, as per commit: d625fe4
  • Reordered fields in Client to reduce internal padding and improve memory density.
  • Replaced the old BoundingBox (image.Rectangle + mixed types) with a slim, cache-friendly layout:
type BoundingBox struct {
    Word       string
    X0, Y0     int32
    X1, Y1     int32
    Confidence float32
    BlockNum   int32
    ParNum     int32
    LineNum    int32
    WordNum    int32
}
  • Added Rect() and FromRectangle() helpers for compatibility.
  • Updated GetBoundingBoxes and GetBoundingBoxesVerbose to map directly to the new layout.
  • Introduced a pure-Go mapping test (bbox_mapping_test.go) to verify that memory-to-struct conversions match the original C struct layout.
  • Added a comprehensive benchmark suite (bench_layout_test.go) comparing:
    • Original vs. optimized Client
    • Original vs. slim BoundingBox
    • Memory footprint, iteration throughput, and copy bandwidth

Safety and compatibility

  • All CGO boundaries remain unchanged; only Go-side representations were optimized.
  • Added mapping validation test to confirm correct field alignment, as per commit: 0f474e2
  • BoundingBox.Rect() preserves compatibility for code expecting image.Rectangle.
  • Optional JSON marshaling is added to maintain a backward-compatible wire format, as per commit: 76d3c95

$ go test -run TestPureGoBoundingBoxesMapping -v
=== RUN   TestPureGoBoundingBoxesMapping
--- PASS: TestPureGoBoundingBoxesMapping (0.00s)
PASS
ok      github.com/otiai10/gosseract/v2 0.011s

Performance results

=== RUN   TestHeapFootprint
    bench_layout_test.go:158: Alloc delta for []BoundingBoxOrig(2000000): ~176005120 bytes (88.00 bytes/elem)
    bench_layout_test.go:166: Alloc delta for []BoundingBoxSlim(2000000): ~111998328 bytes (56.00 bytes/elem)
    bench_layout_test.go:174: Alloc delta for []ClientOrig(2000000):      ~192001368 bytes (96.00 bytes/elem)
    bench_layout_test.go:182: Alloc delta for []ClientOpt(2000000):       ~176002456 bytes (88.00 bytes/elem)
--- PASS: TestHeapFootprint (0.11s)
PASS
ok      github.com/otiai10/gosseract/v2 0.126s

=== RUN   TestSizes
    bench_layout_test.go:142: sizeof(ClientOrig)  = 96
    bench_layout_test.go:143: sizeof(ClientOpt)   = 88
    bench_layout_test.go:144: sizeof(BBoxOrig)    = 88
    bench_layout_test.go:145: sizeof(BBoxSlim)    = 56
--- PASS: TestSizes (0.00s)
PASS
ok      github.com/otiai10/gosseract/v2 0.005s

$ go test -run=^$ -bench . -benchmem -count=1 bench_layout_test.go
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz
BenchmarkIterate_BBoxOrig-16                 198           5999137 ns/op        14668.78 MB/s          0 B/op          0 allocs/op
BenchmarkIterate_BBoxSlim-16                 262           4534872 ns/op        12348.75 MB/s          0 B/op          0 allocs/op
BenchmarkCopy_BBoxOrig-16                   1234            987249 ns/op        17827.31 MB/s          0 B/op          0 allocs/op
BenchmarkCopy_BBoxSlim-16                   2146            518134 ns/op        21616.03 MB/s          0 B/op          0 allocs/op
BenchmarkIterate_ClientOrig-16               327           3759872 ns/op        12766.39 MB/s          0 B/op          0 allocs/op
BenchmarkIterate_ClientOpt-16                328           3676250 ns/op        11968.72 MB/s          0 B/op          0 allocs/op
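
For readers who want to reproduce numbers of this kind, the following is a rough sketch of how an iteration benchmark with an MB/s column can be written. It is not the contents of bench_layout_test.go: bboxSlim, sumX1, benchIterate and the slice length are all illustrative assumptions. It uses testing.Benchmark so it runs as a plain program.

```go
package main

import (
	"fmt"
	"testing"
	"unsafe"
)

// bboxSlim copies the slim layout from this PR.
type bboxSlim struct {
	Word       string
	X0, Y0     int32
	X1, Y1     int32
	Confidence float32
	BlockNum   int32
	ParNum     int32
	LineNum    int32
	WordNum    int32
}

// sumX1 walks the slice and reads one field per element, so the cost is
// dominated by how many bytes must be pulled through the cache.
func sumX1(boxes []bboxSlim) int64 {
	var s int64
	for i := range boxes {
		s += int64(boxes[i].X1)
	}
	return s
}

// sink keeps the compiler from discarding the loop result.
var sink int64

// benchIterate is a HYPOTHETICAL benchmark body, not the PR's.
func benchIterate(b *testing.B) {
	boxes := make([]bboxSlim, 1<<16)
	for i := range boxes {
		boxes[i].X1 = int32(i)
	}
	// SetBytes is what makes the MB/s column appear: bytes touched per op.
	b.SetBytes(int64(len(boxes)) * int64(unsafe.Sizeof(bboxSlim{})))
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		sink = sumX1(boxes)
	}
}

func main() {
	res := testing.Benchmark(benchIterate)
	fmt.Println("Iterate_BBoxSlim", res.String(), res.MemString())
}
```

Because MB/s is bytes per op divided by time, a smaller struct can show lower MB/s while still finishing faster. The figures above are consistent with that: 4534872 ns/op at 12348.75 MB/s works out to about 56 MB per op, against about 88 MB per op for the original layout, so ns/op is the fairer column for comparing the two.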

Migration notes

For projects consuming gosseract directly:

  • Field access changes
    • Old: bb.Box → New: bb.Rect()
    • The image.Rectangle field was replaced by coordinate fields (X0, Y0, X1, Y1)
  • Numeric type changes
    • Counters (BlockNum, ParNum, LineNum, WordNum) are now int32 instead of int
    • Confidence is now float32 (previously float64). If your code does floating-point arithmetic on Confidence in float64, convert explicitly:
      conf64 := float64(bb.Confidence)
  • Helpers
    • New FromRectangle() constructor simplifies building boxes from image.Rectangle:
      bb := gosseract.FromRectangle("word", rect, conf, block, par, line, wordNum)

- Align structs to improve overall memory usage and cache-line
  efficiency.
- Prefer 32-bit fields where possible on `BoundingBox`: float32
  precision is 'good enough' for confidence, and the overall memory
  footprint decreases substantially.
- Added bench_layout_test to demonstrate the benefits; results from my
  machine are listed under "Performance results" above.

This _could_ break downstream code, so it may require a major release.
Will investigate in follow-up commits.

Prefer calling `free` immediately rather than deferring it: with a large
set of variables, deferred frees pile up until the function returns and
create a temporary memory spike.
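
The effect of deferring frees inside a loop can be illustrated without cgo. In the sketch below, tracker.alloc and tracker.free are hypothetical stand-ins for C.CString and C.free, and setDeferred/setImmediate are illustrative names rather than the library's functions. The tracker only counts how many allocations are live at once.

```go
package main

import "fmt"

// tracker counts live allocations and remembers the high-water mark.
type tracker struct{ live, peak int }

func (t *tracker) alloc() {
	t.live++
	if t.live > t.peak {
		t.peak = t.live
	}
}

func (t *tracker) free() { t.live-- }

// setDeferred defers every free, so nothing is released until the
// function returns: peak usage grows with the number of variables.
func setDeferred(t *tracker, vars map[string]string) {
	for range vars {
		t.alloc() // stands in for C.CString(key)
		defer t.free()
		t.alloc() // stands in for C.CString(value)
		defer t.free()
		// ... the C setter would be called here ...
	}
}

// setImmediate releases each pair before moving on: peak usage is constant.
func setImmediate(t *tracker, vars map[string]string) {
	for range vars {
		t.alloc() // key
		t.alloc() // value
		// ... the C setter would be called here ...
		t.free()
		t.free()
	}
}

func main() {
	vars := make(map[string]string)
	for i := 0; i < 1000; i++ {
		vars[fmt.Sprintf("var_%d", i)] = "1"
	}

	d, im := &tracker{}, &tracker{}
	setDeferred(d, vars)
	setImmediate(im, vars)
	fmt.Println("deferred peak:", d.peak, "immediate peak:", im.peak)
}
```

Both variants end with zero live allocations; only the peak differs (two per variable when deferred, two in total when freed immediately).
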
- Validate that the mapping from C structs to Go structs works as
  intended, using a pure-Go approach for ease of implementation.
- Note: this is somewhat limited, as it is not _actually_ using C,
  though it is about as close as we can get in pure Go.