Language independent hashing mechanism for float and integers (Proposal) #52

@weigandf

Depending on which programming language is used to implement ObjectHash, the behavior and the resulting hash can differ considerably. One big issue I see is the distinction between a float and an integer in the case of integer-valued floats.

An example (taken from the test cases) is:

(1) ["foo", {"bar":["baz", null, 1, 1.5, 0.0001, 1000, 2, -23.1234, 2]}]
-and-
(2) ["foo", {"bar":["baz", null, 1.0, 1.5, 0.0001, 1000.0, 2.0, -23.1234, 2.0]}]

In Python the results are:
(1) 726e7ae9e3fadf8a2228bf33e505a63df8db1638fa4f21429673d387dbd1c52a
-and-
(2) 783a423b094307bcb28d005bc2f026ff44204442ef3513585e7e73b66e3c2213
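
To illustrate where the difference comes from, here is a minimal sketch using only Python's standard `json` module (it does not call ObjectHash itself; the helper name `number_types` is made up for this example). It shows that the two documents parse to values that compare equal, while the Python types of the numbers differ, which is what a type-dispatching hasher would see:

```python
import json

doc_ints = '["foo", {"bar": ["baz", null, 1, 1.5, 0.0001, 1000, 2, -23.1234, 2]}]'
doc_floats = '["foo", {"bar": ["baz", null, 1.0, 1.5, 0.0001, 1000.0, 2.0, -23.1234, 2.0]}]'


def number_types(doc):
    """Return the Python type name of each element in the inner "bar" list."""
    parsed = json.loads(doc)
    return [type(item).__name__ for item in parsed[1]["bar"]]


types_ints = number_types(doc_ints)
types_floats = number_types(doc_floats)

# The parsed values compare equal (1 == 1.0 in Python) ...
values_equal = json.loads(doc_ints) == json.loads(doc_floats)
# ... but the types seen by a type-dispatching hasher are not the same.
types_equal = types_ints == types_floats

print(types_ints)
print(types_floats)
print(values_equal, types_equal)
```

A language whose JSON parser maps every number to a single floating-point type would not see this difference at all, which is the interoperability problem described here.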

To address this issue, the Go implementation introduced a CommonJSON object that relies on Go's JSON marshalling function:

json.Marshal(o)

I would like to suggest a different, language-independent solution that follows the recommendation in section 5.5 of the JSON Schema core specification (draft 04):

http://json-schema.org/draft-04/json-schema-core.html#rfc.section.5.5

It is acknowledged by this specification that some programming languages, and their associated parsers, use different internal representations for floating point numbers and integers, while others do not.

As a consequence, for interoperability reasons, JSON values used in the context of JSON Schema, whether that JSON be a JSON Schema or an instance, SHOULD ensure that mathematical integers be represented as integers as defined by this specification.

In my opinion, this can be achieved simply by adding a case distinction:

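case Type.Float:
{
  // Integer-valued floats (1.0, 1000.0, ...) are hashed as integers.
  // Note: the (int) cast assumes val fits in the integer range.
  if ((float)val % 1.0 == 0.0)
  {
    HashInt((int)val);
  }
  else
  {
    HashFloat((float)val);
  }
  break;
}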
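
As a rough sketch of the same case distinction in Python (the function name `classify_number` and the tagged-tuple return value are assumptions made for this example, not part of any ObjectHash implementation), the decision could look like this. It only decides which branch a number would take and leaves the actual hashing out:

```python
def classify_number(value):
    """Return ("int", n) or ("float", x): the form a hasher would be fed.

    Hypothetical helper for illustration; real implementations would call
    their own integer or float hashing routine instead of returning a tag.
    """
    if isinstance(value, bool):
        # bool is a subclass of int in Python; keep it out of the number path.
        raise TypeError("booleans are not treated as numbers here")
    if isinstance(value, int):
        return ("int", value)
    if isinstance(value, float):
        # is_integer() is False for NaN and infinities, so they stay floats.
        if value.is_integer():
            return ("int", int(value))
        return ("float", value)
    raise TypeError("not a number: %r" % (value,))


numbers_1 = [1, 1.5, 0.0001, 1000, 2, -23.1234, 2]
numbers_2 = [1.0, 1.5, 0.0001, 1000.0, 2.0, -23.1234, 2.0]

normalized_1 = [classify_number(n) for n in numbers_1]
normalized_2 = [classify_number(n) for n in numbers_2]
print(normalized_1 == normalized_2)
```

With this rule, the numbers from examples (1) and (2) take the same branches, so an implementation following it would presumably produce one hash for both inputs. Python integers are unbounded, so the overflow concern of a fixed-width integer cast does not arise in this sketch; other languages would need a range check.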

Whether zero should be excluded from this case distinction is open to discussion. Excluding it would mean extending the condition to:
(float)val % 1.0 == 0.0 && (float)val != 0.0
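
The zero variant can be sketched in Python as follows. The helper `is_integer_valued` and its `exclude_zero` flag are hypothetical names for this example. The checks at the end show one consideration that may bear on the question, offered as an assumption and not as the stated rationale: converting a float zero to an integer drops the sign of negative zero.

```python
import math


def is_integer_valued(x, exclude_zero=False):
    """True if the float x should take the integer branch.

    Hypothetical helper: exclude_zero=True mirrors the extended condition
    that keeps 0.0 (and -0.0) on the float branch.
    """
    if exclude_zero and x == 0.0:
        return False
    return x.is_integer()


# 0.0 and -0.0 compare equal, and both become the integer 0 ...
zeros_equal = 0.0 == -0.0
as_int = int(-0.0)
# ... but the float form still carries the sign bit.
sign_of_negative_zero = math.copysign(1.0, -0.0)

print(is_integer_valued(0.0), is_integer_valued(0.0, exclude_zero=True))
print(zeros_equal, as_int, sign_of_negative_zero)
```

Whichever variant is chosen, all implementations would need to agree on it, since the two rules feed zero to different hashing branches.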

In my opinion, it would be really valuable for the ObjectHash project to reach a common understanding on this issue and for all implementations to follow the recommendation.
