My three Eiffelwishes, part 1: enums

by David Le Bansais (modified: 2010 Jun 13)

For people who, like me, routinely switch between programming languages, it's tempting to try to get the best of all of them and mix language features. The most common example (and likely the most researched) is adding assertions to C++ or C# classes.

While working on Eiffel code, I repeatedly found that my code could be simpler, and better, if I had access to some features of C-like languages. Today I'd like to talk about one of them: enumerations.

Previous work

There is an article on enums and how to implement them with the current design of Eiffel at this page. While the code could use some improvement with the recent addition of void safety and CAPs, it's a start. In particular, it introduces the necessary features that type-safety and invariants require. It doesn't, however, provide a good solution if the goal is to have simpler code. The ENUM class, for instance, expects a generic parameter, and some code is duplicated. Moreover, the author doesn't investigate improvements that would require a change in the language.

What's an enum?

An enumeration can be defined as a set of named entities, sometimes associated with a known numerical value. Typical examples are club, diamond, heart and spade, each a playing card suit, or jack, queen, king and ace, associated with values 1, 2, 3 and 4 respectively (some readers might have recognized common values used in contract bridge).

An enumeration is a type and therefore, ultimately, a class. Following the examples above, we could consider classes SUIT and HONOUR, which would allow us to write code such as:

    class PLAYING_CARD
    feature
        suit: SUIT
        honour: HONOUR
    end

The definition and behavior of SUIT and HONOUR is the purpose of this post.

Expectations

How should we use enums and how do they make code simpler? It should be possible to use them in place of numerical constants.

In conditional statements:

    if suit = {SUIT}.club then ... end

In inspect statements:

    inspect suit
    when {SUIT}.club then ...
    end

And, of course, in assignments:

    suit := {SUIT}.club

Note that, in the code above, no actual numerical value is expected. For all we know, the internal representation of {SUIT}.club is a reference and it varies from one execution of the program to another. Also, while assigning an enum value with an associated constant to a numerical type is possible, the other way around is prohibited.

    i: INTEGER
    i := honour
    honour := i -- Prohibited
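The one-way conversion can already be approximated in current Eiffel with a convert clause. Here is a minimal sketch, assuming a hypothetical class HONOUR_POINTS (the class name and the points attribute are made up for this illustration, they are not part of any library or of the proposal):

```eiffel
class
    HONOUR_POINTS
        -- Hypothetical stand-in for an enum value with an associated constant.

create
    make

convert
    to_integer: {INTEGER}
        -- Only the HONOUR_POINTS -> INTEGER direction is declared.

feature {NONE} -- Initialization

    make (a_points: INTEGER)
            -- Create a value worth `a_points'.
        require
            valid_points: a_points >= 1 and a_points <= 4
        do
            points := a_points
        ensure
            points_set: points = a_points
        end

feature -- Access

    points: INTEGER
            -- Associated constant (1 = jack ... 4 = ace)

feature -- Conversion

    to_integer: INTEGER
            -- Associated constant, used for implicit conversion
        do
            Result := points
        end

invariant
    valid_points: points >= 1 and points <= 4

end
```

With honour: HONOUR_POINTS and i: INTEGER, the assignment i := honour should go through to_integer, while honour := i is rejected because no conversion procedure from INTEGER is listed.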

It is debatable whether an enum type should have a default initialization value, like other base types do. For some enums there might be an obvious candidate, but for our SUIT example none of the 4 values stands above others.

There is however an important property of enums that can be used or ignored without interfering with the program: values can be ordered, typically in the order they appear in the definition of the enum class. In practice, this would mean that SUIT and HONOUR inherit from COMPARABLE, and that the following statements are valid:

    if honour < {HONOUR}.ace then ... end

    inspect honour
    when {HONOUR}.jack..{HONOUR}.king then ...
    end
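The ordering part can be emulated today by inheriting from COMPARABLE by hand. The sketch below assumes a hypothetical class ORDERED_SUIT that stores the declaration order in a rank attribute; both names are inventions for this example, and the boilerplate is exactly what the proposal would like to get rid of:

```eiffel
class
    ORDERED_SUIT
        -- Hypothetical emulation of an ordered enum.

inherit
    COMPARABLE

create
    make_club, make_diamond, make_heart, make_spade

feature {NONE} -- Initialization

    make_club
        do
            rank := 1
        end

    make_diamond
        do
            rank := 2
        end

    make_heart
        do
            rank := 3
        end

    make_spade
        do
            rank := 4
        end

feature -- Access

    rank: INTEGER
            -- Position in declaration order (1 = club ... 4 = spade)

feature -- Comparison

    is_less alias "<" (other: like Current): BOOLEAN
            -- Is current value declared before `other'?
        do
            Result := rank < other.rank
        end

invariant
    valid_rank: rank >= 1 and rank <= 4

end
```

Comparisons with < work on such objects, but as far as I can tell the inspect statement still does not accept them, since it expects constants of basic types. That is part of why a language change seems necessary.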

It should be possible to enumerate through values as well, for instance to display their names or associated constants. The code below shows how to achieve that:

    across suit.items as value loop
        print(value.item.out) -- Print the value name
    end

Last but not least, an attribute of an enum type can only be assigned one of the values in the enum declaration, and at no time can it contain anything other than these values. This will be the default class invariant.

The question remains whether enums should have implicit numeric constants associated with each value. Such constants are needed when:

  • The program is connected to external components, like C code where enum values are fixed, known constants
  • Features such as out and print are called on enum attributes, to obtain printable representations
  • Values are associated with specific constants for a purpose, in HONOUR for instance, where constants represent a number of points to use during a bid
  • Attributes of enum type are serialized

Perhaps the best approach is to leave the specification of the constants open, and just require that the internal representation of enum values be consistent with the cases listed above. At worst, it means that when interfacing with other languages, the compiler must require all values to have explicit associated constants. It would however be preferable to have a smarter compiler, capable of assigning the proper constants to enum values depending on the language it interfaces with.

In any case, if the value is associated to a numerical constant, this constant should be available to code through conversion.

Inheritance

Like any class in Eiffel, an enum can have descendants, and these should be able to redefine features, but also add, remove or rename values. The standard syntax can be used to rename or remove values, however the case of addition must be carefully considered, since it can break the order defined by the parent.

Descendants can also add attributes. What are the semantics, then, of an assignment like suit := {SUIT}.club, or of the honour < {HONOUR}.ace expression? The simple answer is that {SUIT}.club and {HONOUR}.ace are objects of type SUIT and HONOUR, and these classes must either have default values, or redefine default_create such that {SUIT}.club and {HONOUR}.ace can be created by the compiler.

In the case of the honour < {HONOUR}.ace expression, if is_less is not redefined, then HONOUR is partially ordered around its enum values, and other attributes are ignored during expression evaluation.

The new syntax

It is now time to describe my new suggested syntax, and see how it fits into Eiffel.

First of all, all user-defined enums can inherit from the system ENUM class, which itself inherits from COMPARABLE and is equipped with built-in features to ensure that values are kept ordered, to enforce the invariant, and to provide representations for names and associated constants.

The ENUM class would have the following contract:

    expanded class ENUM
    inherit
        COMPARABLE
            redefine is_less end
        HASHABLE
            undefine is_equal
            redefine hash_code
            end
    feature -- Implementation of deferred
        is_less alias "<" (other: like Current): BOOLEAN
            external "built_in" end
        hash_code: INTEGER
            external "built_in" end
    feature -- Set
        items: LIST[like Current]
            external "built_in" end
    invariant
        -- Consistency is enforced by the compiler
    end

The ENUM class is simply an enum with no values, and each user-defined enum simply adds new values thanks to an additional clause in the inheritance part:

    class SUIT
    inherit
        ENUM
            unique club, diamond, heart, spade end
    end

And voilà!

Value order is the order of declaration in the unique clause. Everything else is provided by the ENUM class and fairly simple new rules for handling enum values in various parts of the language: expressions, the inspect statement, conversions, assignment and so on.

Values can be given an associated numerical constant:

    class HONOUR
    inherit
        ENUM
            unique jack = 1, queen = 2, king = 3, ace = 4 end
    end

Descendants can add values to an existing set:

    class POWER_CARD
    inherit
        ENUM
            unique joker = 0 end
        HONOUR
    end

In this example, the value order in the POWER_CARD enum class is: joker, then all values from HONOUR. To make the new joker card the last value, change the order of inherited classes declaration:

    class POWER_CARD
    inherit
        HONOUR
        ENUM
            unique joker = 5 end
    end

Since HONOUR inherits from ENUM, the above declaration could be made even simpler:

    inherit
        HONOUR
            unique joker = 5 end

It is interesting to note that the compiler should make sure that the declaration order is compatible with the natural order of associated constants, and that numerical constants are preserved by inheritance. For instance, the following declaration should be prohibited:

    inherit
        HONOUR
            unique joker = 0 end

Other languages may allow enums to have unordered constants, but my personal feeling is that it's a recipe for disaster, and the symptom of a design error.

The unique clause can be used in conjunction with other clauses in the inheritance part of the class declaration:

    class JOKE_CARD
    inherit
        HONOUR
            rename
                jack as jack_the_mischevious,
                queen as queen_the_sad,
                king as king_the_deceived,
                ace as ace_for_nothing
            unique joker_smiling_face
            redefine is_less
            end
    feature
        is_less alias "<" (other: like Current): BOOLEAN
                -- Reverse the order and make the joker the least powerful card
            do
                Result := not Precursor(other)
            end
    end

Creation

With this syntax, no creation procedure can be written that can initialize the value. Therefore, either enumeration types are initialized to a standardized default value, like the first value, or they are not initialized at all until client classes assign them a value. The latter behavior is similar to references in a void-safe environment.

What if the enum class has other attributes? A default_create redefinition can handle this case, as the example below demonstrates:

    class SUIT
    inherit
        ENUM
            unique club, diamond, heart, spade
            redefine default_create
            end
    create
        default_create
    feature -- Initialization
        default_create
            do
                manufacturer := "none"
            end
    feature -- Status
        manufacturer: STRING
    end

Here, the assignment suit := {SUIT}.club still makes sense and will reset all additional attributes of 'suit' to their default value, i.e. manufacturer to "none".

Conclusion

I think enumeration types are a much-needed feature in Eiffel. They allow simpler, easier-to-read code. I suggested a syntax that makes both the declaration and use of enums a simple task. While I haven't tested an actual implementation of the new unique clause, I hope something similar will eventually make it into the language.

Comments
  • Manu (13 years ago 15/6/2010)

    Thanks for the suggestion. This is clearly something we have given some thought to. Currently we have a proposal that looks like the following (based on the notion of ephemeral classes discussed elsewhere):

        once Mars, Earth, Neptune, ....
        class PLANETS
        feature
            mars: PLANET once ... end
            ...
        end

        class PLANET
        create
            make
        feature
            mass: REAL_64
            ...
            Other features and invariants as usual
            ...
        end

    and if you just want the ephemeral class, you could simply write:

    once class EPHEMERAL_CLASS end

    Of course this is just syntax at the moment, but the idea is that you can take the queries mars, neptune, ... and use them in inspect statements or anywhere you need them using {PLANETS}.mars, .... The good thing about it is that, unlike in other languages, you are not limited to enumerating simple entities, but can enumerate more complex ones like planets, securities, ....

    Now the thing that is missing is bit flag enumeration as you can find in other languages (this is sometimes what enums are mostly used for).

  • Colin LeMahieu (13 years ago 15/6/2010)

    Does the existing language syntax already allow this?

    It seems like implementing the type-inspection multi-branch would solve all the issues stated above.

    It seems like having:

        class SUIT inherit ANY end

    and then having

        class CLUB inherit SUIT end
        class DIAMOND inherit SUIT end

    would allow:

        inspect obj.suit
        when {CLUB} then
        when {DIAMOND} then
        end

    This syntax and semantics would allow what's being requested. Now, if we're looking at the optimization level (will this compile to an assembly-level jump table?), it becomes a little more complex. The class hierarchy under {SUIT} would have to flatten into a one-dimensional line, similar to how single inheritance languages do their polymorphic lookup table.

    It seems like the jump-table case could be an optimization discovered by the Eiffel Compiler, specifically SUIT and its descendants don't add any attributes to the class definition and each descendant of SUIT is a leaf in the inheritance hierarchy.

    • David Le Bansais (13 years ago 15/6/2010)

      It doesn't provide the concept of ordered enums, though. And you don't get the list of values, but that's a minor issue.

      Conditionals aren't easy to write, either. I guess the way to do it would be:

          if attached {CLUB} obj.suit as club_card then ... end

      I've never tried:

          if attached {CLUB} obj.suit then ... end

      Now that would work.

      • Manu (13 years ago 16/6/2010)

        An object test can be written without an object test local, so your second form is valid.

  • Colin LeMahieu (13 years ago 19/9/2010)

    In light of some recent work on output parameters and ephemeral objects I've found many uses for expanded objects. What do you think of simply using expanded classes for this task?

        expanded class SUIT
        feature
            value: INTEGER_32
        feature -- Possible types
            clubs: INTEGER_32 = 1
            spades: INTEGER_32 = 2
            hearts: INTEGER_32 = 3
            diamonds: INTEGER_32 = 4
        invariant
            value = clubs or value = spades or value = hearts or value = diamonds
        end

    One could add conversions from/to INTEGER_32 and necessary inheritance. Would this satisfy the requirements?