Annoying basics that should work better

by Thomas Beale (modified: 2014 May 13)

Serialisation of some basic data types is broken in ways that require compensating classes, and the resulting dependencies make it hard to separate out and share library code.

I have in a utility class the following code, which compensates for missing / broken functionality right at the data types level of Eiffel:

    serialise_primitive_value (a_prim_val: ANY): STRING
            -- generate a correctly serialised string for any primitive value, making corrections for
            -- broken serialisations of DATE_TIME, DATE_TIME_DURATION and REAL
        do
            -- FIXME: duration.out does not exist in Eiffel, and in any case would not be ISO8601-compliant
            if attached {DATE_TIME_DURATION} a_prim_val as a_dur then
                Result := (create {ISO8601_DURATION}.make_date_time_duration(a_dur)).as_string
            elseif attached {DATE_TIME} a_prim_val as a_dt then
                Result := (create {ISO8601_DATE_TIME}.make_date_time(a_dt)).as_string
            else
                Result := a_prim_val.out
                -- FIXME: REAL.out is broken (still the case in Eiffel 6.6)
                if (attached {REAL_32} a_prim_val or attached {REAL_64} a_prim_val) and then Result.index_of ('.', 1) = 0 then
                    Result.append(".0")
                end
            end
        end

The first two branches compensate for the wrong behaviour of DATE_TIME_DURATION and DATE_TIME 'out' feature, which return respectively a non-ISO8601 duration string, and a (useless) US-style date/time string. In both cases, I think it's pretty much globally accepted these days that the output should be ISO8601-based. At the moment I have to suck in my ISO8601 library to do this.

The 3rd branch deals with the problem that when a REAL is serialised to string, if it happens to be integral in value, there is no '.0' included. I rely in the ODIN library on consistent syntax of basic types, both on input and output. So ODIN parses values like 3.14 and 3.0 as Reals, but it will parse 3 as an Integer. I can't remember what the Eiffel compiler does, but I think it's the same. In ODIN, any Real is output to string form with a decimal point and a digit to the right, which means the round-trip is symmetric. Eiffel's 'out' routine for Real doesn't, and that breaks ODIN.
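
The Real case can be isolated in a small helper. What follows is a minimal sketch rather than library code: `symmetric_real_out` is a hypothetical name, and it assumes that `out` on an integral REAL_64 yields nothing but an optional sign and digits, as described above. It appends ".0" only in that case, so a result that already carries a decimal point, an exponent or a non-numeric form (should `out` ever produce one) is left untouched instead of gaining a stray suffix.

```eiffel
symmetric_real_out (a_real: REAL_64): STRING
		-- String form of `a_real' that should read back as a Real.
		-- Hypothetical helper; assumes `out' drops the fraction of integral values.
	do
		Result := a_real.out
		if Result.is_integer_64 then
				-- Only an optional sign and digits: the fractional part was dropped.
			Result.append (".0")
		end
	ensure
		not_integer_syntax: not Result.is_integer_64
	end
```

Testing the whole string with `is_integer_64` is a stricter condition than searching for '.', at the cost of assuming the digit string fits in 64 bits.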

These hacks are needed quite deep in my libraries, not just in ODIN. For example, I have some Interval library classes, based on a parent class INTERVAL [G->PART_COMPARABLE], which are commonly used with Reals and date/times. INTERVAL has an out routine that would serialise instances of types like INTERVAL [REAL] and INTERVAL [DATE], if only the 'out' routines of those types were not broken.

Problems like this force me to have annoying extra classes and dependencies I don't want, thus gluing up libraries that could otherwise be separated out and shared freely. I can't be the only one in this situation.

Is there any prospect of things like this ever being fixed?

Comments
  • Colin Adams (10 years ago 11/5/2014)

    Motive for using ISO8601 library

    Serialization of date-times (as JSON) was the motivation I had for using (and making completely void-safe) the ISO8601 library.

  • Peter Gummer (10 years ago 13/5/2014)

    Why use out?

    It's not clear to me why you are using out for round-trip serialisation. That doesn't appear to be its stated purpose:

    out: STRING
            -- New string containing terse printable representation
            -- of current object

    I take this to mean that the string returned by out is for humans to read. Dropping ".0" from a REAL doesn't hurt the number's legibility (unless you want to communicate type information to the human reader), and it's definitely more "terse" this way.

    For DATE and DATE_TIME, it looks like formatted_out would be more appropriate.
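
    A rough sketch of that suggestion, assuming the EiffelTime format codes `yyyy`, `[0]mm`, `[0]dd`, `[0]hh`, `[0]mi` and `[0]ss` behave as they do in the library's default format string. `iso_like_out` is a hypothetical name. The date and time parts are formatted separately because it is not assumed here that a literal 'T' is accepted inside a format string; fractional seconds and time zones are ignored.

    ```eiffel
    iso_like_out (a_dt: DATE_TIME): STRING
    		-- ISO 8601-style form of `a_dt' (intended shape: yyyy-mm-ddThh:mm:ss).
    		-- Hypothetical helper; ignores fractional seconds and time zone.
    	do
    		create Result.make (19)
    		Result.append (a_dt.date.formatted_out ("yyyy-[0]mm-[0]dd"))
    		Result.append_character ('T')
    		Result.append (a_dt.time.formatted_out ("[0]hh:[0]mi:[0]ss"))
    	end
    ```

    This covers DATE_TIME only; it does nothing for DATE_TIME_DURATION.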

    • Manu (10 years ago 13/5/2014)

      I agree with Peter, out is more like a debuggable representation and you cannot trust it to be normative. So if you need to output something that needs to follow a certain specification, you have to output the values accordingly. In addition to what Peter suggested, you can also use FORMAT_DOUBLE and FORMAT_INTEGER.

    • Thomas Beale (10 years ago 13/5/2014)

      Everyone uses out!

      What you say is true I guess for Real, but there is no reason in my mind for out to be specifically unreliable for Reals, when it's used so ubiquitously in Eiffel. It could perform its purpose _and_ be symmetric. I'm not trying to do any 'formatting' here. So using formatted_out is compensating for something that is needlessly broken in my view, not solving any true 'formatting' problem, like needing ',' characters every 3 digits.

      But the main problem here is that the original code doesn't know it has a Real, it just has a primitive type object that could be any Eiffel primitive type, plus a number of other primitive business types like STRING, DATE etc. So I need a standard routine that is reliable for serialising. Other languages have this. Serialising an instance graph properly relies on a method like this. If in Eiffel it isn't 'out', then we need another reliable serialiser function in ANY.

      • Peter Gummer (10 years ago 14/5/2014)

        Examples from other languages

        I can't think of any examples from other languages that I'm familiar with that would have a reliable serialiser function on their ANY class-equivalent.

        There's no such thing in .NET, for example. The closest thing would be System.Object.ToString(), but as you can see at http://msdn.microsoft.com/en-us/library/system.object.tostring.aspx, it resembles ANY.out. Ditto for Java: http://www.tutorialspoint.com/java/lang/object_tostring.htm.

        What are you thinking of in other languages that you wish for in Eiffel?

    • Thomas Beale (10 years ago 1/6/2014)

      I just got back to this. I think the reason I have never used DATE or DATE_TIME.formatted_out is that there is no documentation for the format string that I can find, at least not in the class where the routine is defined. I know it looks something like this example:

      default_format_string: STRING = "yyyy-[0]mm-[0]dd [0]hh:[0]mi:[0]ss.ff3"
      		-- ISO 8601 standard

      but that's not documentation.

      Also there's no formatted output routine at all for DATE_TIME_DURATION that I can find.