Make writing footer checksums less expensive or optional

Description

Problem:
A JVM profiler indicates that checksums, as implemented, are a significant bottleneck.

Cause:
impl.RawOutput wraps the provided OutputStream in a CheckedOutputStream. Every time a rawInt is written, the CheckedOutputStream updates its checksum.
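
To illustrate the cause, here is a minimal sketch that counts checksum updates when an int is written one byte at a time through a CheckedOutputStream. The names CountingChecksum, writeRawInt32 and countUpdates are hypothetical and only stand in for what RawOutput is assumed to do; this is not the actual data.fressian code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.Adler32;
import java.util.zip.CheckedOutputStream;
import java.util.zip.Checksum;

public class ChecksumCallCount {
    /** Hypothetical helper: delegates to Adler32 and counts update calls. */
    static final class CountingChecksum implements Checksum {
        private final Adler32 delegate = new Adler32();
        int calls = 0;

        @Override public void update(int b) { calls++; delegate.update(b); }
        @Override public void update(byte[] b, int off, int len) { calls++; delegate.update(b, off, len); }
        @Override public long getValue() { return delegate.getValue(); }
        @Override public void reset() { delegate.reset(); }
    }

    /** Assumption: a raw int is written big-endian, one byte per write call. */
    static void writeRawInt32(OutputStream out, int i) throws IOException {
        out.write((i >>> 24) & 0xFF);
        out.write((i >>> 16) & 0xFF);
        out.write((i >>> 8) & 0xFF);
        out.write(i & 0xFF);
    }

    /** Writes the given number of ints and returns how often the checksum was updated. */
    static int countUpdates(int ints) {
        CountingChecksum sum = new CountingChecksum();
        OutputStream out = new CheckedOutputStream(new ByteArrayOutputStream(), sum);
        try {
            for (int n = 0; n < ints; n++) {
                writeRawInt32(out, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sum.calls;
    }

    public static void main(String[] args) {
        // Each single-byte write triggers one checksum update: 4 per int.
        System.out.println(countUpdates(1000)); // prints 4000
    }
}
```

Under these assumptions every int costs four separate checksum updates, which matches the pattern described above.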

Adler32's update method happens to be native, which may not be germane to the problem.
http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/zip/Adler32.java#91

The read side of data.fressian already exposes a knob for checksums to be ignored in RawInput. No such knob exists on the write side.

Checksums are used in the footer methods. They may be extremely useful for data at rest, but may be redundant when other out-of-band integrity mechanisms are already in place.

Possible solutions:
Buffering, so that the checksum is updated in bulk rather than on every small write.
Exposing a knob to control whether write checksums are enabled. This would potentially involve changes to the footer.
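
The buffering option could look roughly like the sketch below: a BufferedOutputStream placed in front of the CheckedOutputStream, so the checksum sees one bulk update per flushed buffer instead of one per byte. The class and method names and the 8192-byte buffer size are assumptions for illustration, not part of data.fressian.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.Adler32;
import java.util.zip.CheckedOutputStream;
import java.util.zip.Checksum;

public class BufferedChecksumSketch {
    /** Hypothetical helper: delegates to Adler32 and counts update calls. */
    static final class CountingAdler32 implements Checksum {
        private final Adler32 delegate = new Adler32();
        long calls = 0;

        @Override public void update(int b) { calls++; delegate.update(b); }
        @Override public void update(byte[] b, int off, int len) { calls++; delegate.update(b, off, len); }
        @Override public long getValue() { return delegate.getValue(); }
        @Override public void reset() { delegate.reset(); }
    }

    /**
     * Writes byteCount single bytes, optionally through a buffer.
     * Returns {checksum value, number of checksum update calls}.
     */
    static long[] run(boolean buffered, int byteCount) {
        CountingAdler32 sum = new CountingAdler32();
        OutputStream checked = new CheckedOutputStream(new ByteArrayOutputStream(), sum);
        // Assumed buffer size; the buffer sits in front of the checksum.
        OutputStream out = buffered ? new BufferedOutputStream(checked, 8192) : checked;
        try {
            for (int i = 0; i < byteCount; i++) {
                out.write(i & 0xFF);
            }
            // The checksum is only current after a flush, so anything that
            // reads it (such as a footer) would have to flush first.
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] { sum.getValue(), sum.calls };
    }

    public static void main(String[] args) {
        long[] direct = run(false, 4000);
        long[] viaBuffer = run(true, 4000);
        System.out.println("same checksum: " + (direct[0] == viaBuffer[0]));
        System.out.println("update calls: " + direct[1] + " vs " + viaBuffer[1]);
    }
}
```

The checksum value is unchanged because the same bytes reach the CheckedOutputStream in the same order. The trade-off is that the value is stale until the buffer is flushed. Whether this actually removes the bottleneck would still need to be measured.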

Environment

None

Activity

Stuart Halloway
December 18, 2013, 2:33 PM

It is definitely possible that the checksum calculation dings perf. (And if so, another possible solution is just removing checksums entirely from Fressian.)

That said, I don't want to trust a profiler. To move this forward, I would like to see a benchmark of a real-world use case without the profiler in play.

Assignee

Stuart Halloway

Reporter

Ghadi Shayban

Labels

None

Approval

Incomplete

Patch

None

Priority

Major