A nice bug

by Bernd Schoeller (modified: 2007 Nov 30)

This is a very nice bug that I stumbled upon while running the great auto_test tool on some code I had written. It is a beautiful example of a bug, because it is very small yet nearly undetectable for the average developer. Also, I do not know of any formal verification tool that could reveal it. Finally, it is indeed a specification problem that cannot be expressed in Design by Contract.

I had the following deferred class SET with a couple of features defined (I have skipped other features, comments and contracts, because they are not relevant to the example):

deferred class SET

feature -- Relations

   is_superset_of (x: SET): BOOLEAN is
      deferred
      end

   is_subset_of (x: SET): BOOLEAN is
      deferred
      end

end

For this deferred class, I had to provide two effective implementations, called SET_A and SET_B. Being lazy, I implemented one of the features in each class and described the other feature in terms of the first.

The first implementation looked like this:

class SET_A

inherit SET

feature -- Relations

   is_superset_of (x: SET): BOOLEAN is
      do
         -- Some implementation
      end

   is_subset_of (x: SET): BOOLEAN is
      do
         Result := x.is_superset_of (Current)
      end

end

For the second implementation, is_subset_of was actually easier to implement, so I wrote the following code:

class SET_B

inherit SET

feature -- Relations

   is_superset_of (x: SET): BOOLEAN is
      do
         Result := x.is_subset_of (Current)
      end

   is_subset_of (x: SET): BOOLEAN is
      do
         -- Some implementation
      end

end

Now, the code is broken. If you do not yet see how, think about what happens when I execute the following code:

local
   sa: SET_A
   sb: SET_B
   r: BOOLEAN
do
   create sa
   create sb
   r := sa.is_subset_of (sb)
end
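For readers outside Eiffel, the failure mode is easier to see in a hypothetical Python transcription of the two classes (the element representation is invented for this sketch):

```python
# Hypothetical Python transcription of SET_A and SET_B above; the
# element representation is invented for the sketch.

class SetA:
    def __init__(self, elements=()):
        self.elements = set(elements)

    def is_superset_of(self, x):
        # "Some implementation": a direct element check.
        return all(e in self.elements for e in x.elements)

    def is_subset_of(self, x):
        # Described in terms of the first feature, via the argument.
        return x.is_superset_of(self)


class SetB:
    def __init__(self, elements=()):
        self.elements = set(elements)

    def is_superset_of(self, x):
        # Described in terms of the other feature, via the argument.
        return x.is_subset_of(self)

    def is_subset_of(self, x):
        # "Some implementation": a direct element check.
        return all(e in x.elements for e in self.elements)


sa, sb = SetA({1}), SetB({1, 2})

# Within one class, the delegation terminates:
assert sa.is_subset_of(SetA({1, 2}))
assert sb.is_superset_of(SetB({1}))

# Across the two classes, each delegates back to the other and neither
# ever computes: sa.is_subset_of(sb) -> sb.is_superset_of(sa) -> ...
try:
    sa.is_subset_of(sb)
except RecursionError:
    print("stack overflow from mutual recursion")
```

Note that each class in isolation is fine; the bug appears only when instances of the two classes meet, which is why per-class unit tests miss it.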

My question: Who is to blame for the bug? Where exactly does the bug manifest in the code?

  • Colin Adams (30/11/2007)

    Bernd is to blame!

    The easiest question to answer is who is to blame: you are! :-)

    Second - which bug? There is the bug you are showing, and then there is this one:

    local
       sa: SET_A
       sb: SET_B
       r: BOOLEAN
    do
       create sa
       r := sa.is_subset_of (sb)
    end

    OK. That's just me being myself. I know you are using ECMA notation with attached status as the default.

    For a more serious answer to your question, I don't believe this is impossible to solve by DbC.

    class SET_A

    feature -- Relations

       is_superset_of (x: !SET): BOOLEAN is
          do
             -- Some implementation
          ensure then
             inverse: Result implies {Current} x.is_subset_of (Current)
          end

       is_subset_of (x: !SET): BOOLEAN is
          do
             Result := x.is_superset_of (Current)
          ensure then
             inverse: Result implies {Current} x.is_superset_of (Current)
          end

    end

    I don't suppose this is legal ECMA-compliant notation, and perhaps there isn't such a notation available at the moment, but I don't think that that can be laid at the door of DbC - rather we just need some extensions.

    In case it isn't clear, the intention of my notation here is to say that the implementation of the class specified must be used when evaluating the postcondition. This will imply a conversion mechanism from the type of x to the type of Current for the purposes of the evaluation.

    Colin Adams

    • Bernd Schoeller (1/12/2007)

      I do not understand

      Hi Colin, I do not understand how your proposal solves the problem of a stack overflow created by infinite recursion that I have described. The added contracts do not help.

      • Colin Adams (1/12/2007)

        CATcall detection

        Because the postcondition suffers from a catcall for your implementation. So it will be detected at compile time by the catcall detection mechanism (such as GEC's dynamic type set algorithm). Colin Adams

  • Simon Hudon (1/12/2007)

    Eiffel is not yet a formal specification language, so I think we could "blame" that for the fact that such bugs can appear.

    In this case, I think a simple extension of DbC might do the trick. It's a generalization of the variant to recursive (or mutually recursive) routines. We might want to add a variant after the precondition so that, if control reenters a routine, the quantity must have decreased compared to the previous pass.

    It could look like this:

    gcd (i, j: INTEGER): INTEGER is
       require
          i > 0 and j >= 0
       variant
          j
       do
          if j = 0 then
             Result := i
          else
             Result := gcd (j, i \\ j)
          end
       end

    I would set the default routine variant to 0 so that routines may not be reentrant unless specified otherwise.

    This notation would not even have to appear in the contract view since it is related to the recursive nature of the implementation.

    Wouldn't that be an interesting feature?
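    The proposed check can be emulated at run time. Here is a hypothetical Python sketch in which a decorator evaluates the variant expression on entry and, on a reentrant call, asserts that it has strictly decreased since the previous pass (note that for gcd it is j, not i, that strictly decreases from one pass to the next):

```python
import functools

def variant(expr):
    """Hypothetical emulation of a routine variant: `expr` maps the
    call's arguments to a non-negative integer that must strictly
    decrease on every reentrant call."""
    def decorate(f):
        active = []  # variant values of the calls currently on the stack
        @functools.wraps(f)
        def wrapped(*args):
            v = expr(*args)
            assert v >= 0, "variant must be non-negative"
            if active:  # reentrant call: compare with the previous pass
                assert v < active[-1], "variant did not decrease"
            active.append(v)
            try:
                return f(*args)
            finally:
                active.pop()
        return wrapped
    return decorate

@variant(lambda i, j: j)  # j strictly decreases: the next call gets i % j < j
def gcd(i, j):
    return i if j == 0 else gcd(j, i % j)

print(gcd(12, 8))  # -> 4
```

    With a default variant of 0, any reentrant call fails the decrease check immediately, matching the suggestion that routines are non-reentrant unless specified otherwise.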


    • Bernd Schoeller (1/12/2007)


      Looks like a good approach, very similar to the use of measures for recursive methods in Spec#.

      Still, I disagree that this does not need to become part of the interface. Look again at the code: The static type of the target for the calls is SET, and I have to rely on the interface specification of SET to check total correctness. If SET does not make any commitment on the measure, then I am missing important information for verification.

      • Simon Hudon (2/12/2007)

        Total correctness

        Even if the variant were part of the interface, there would be something missing: if you don't know the transitive closure of the routine called (in terms of the call graph), you may not be aware that calling it would make it reentrant. If we make the variant part of the interface, we may also want to show part of its transitive closure. I say part of the transitive closure because nobody may be interested in knowing that {INTEGER_32_REF}.infix "+" is called ... except for those who inherit INTEGER_32_REF .

        This is why it seems good for the documentation tools to show only the intersection of the transitive closure of the documented feature with the routine set of some class of interest, one that is being developed.
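        As a sketch of what such a documentation tool could do, here is a hypothetical Python fragment: it computes the transitive closure of a routine over an assumed call graph and keeps only the routines of a class of interest (the graph and the naming scheme are invented for the sketch):

```python
# Hypothetical documentation-tool sketch: restrict a routine's
# transitive call closure to the routines of one class of interest.

def transitive_closure(routine, graph):
    """All routines reachable from `routine` in the call graph."""
    seen, todo = set(), [routine]
    while todo:
        for callee in graph.get(todo.pop(), ()):
            if callee not in seen:
                seen.add(callee)
                todo.append(callee)
    return seen

def relevant_closure(routine, graph, class_of_interest):
    """Only the reachable routines that belong to the class of interest."""
    return {r for r in transitive_closure(routine, graph)
            if r.startswith(class_of_interest + ".")}

# Invented toy call graph: routine -> directly called routines, with
# calls on SET targets resolved to every possible dynamic type.
graph = {
    "SET_B.is_superset_of": {"SET.is_subset_of"},
    "SET.is_subset_of": {"SET_A.is_subset_of", "SET_B.is_subset_of"},
    "SET_A.is_subset_of": {"SET.is_superset_of"},
    "SET.is_superset_of": {"SET_A.is_superset_of", "SET_B.is_superset_of"},
    "SET_B.is_subset_of": {"INTEGER_32_REF.plus"},
}

print(sorted(relevant_closure("SET_B.is_superset_of", graph, "SET_A")))
```

        For a developer working on SET_A, the tool would report only SET_A.is_subset_of and SET_A.is_superset_of, hiding the uninteresting INTEGER_32_REF call.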

        But the important point is that documenting the variant of a routine would still be insufficient.

        What do you think?


        • Bernd Schoeller (2/12/2007)

          Proving total correctness for object-oriented programs seems to be a fundamental specification problem, not only for Eiffel. I do not have a solution.

          I think there is something missing here, and I do not know what it is. Specifying something like a feature variant might be the right approach, but for me the implementation of a feature should be totally opaque, and the client should rely only on the specification of the feature (which - if I understand correctly - you would want to violate by intersecting the closure of all possible feature calls in the body of a routine). Otherwise we get into deep problems with polymorphism and substitutivity.

          • Simon Hudon (2/12/2007)

            It does violate the opacity of the implementation of a feature but, otherwise, knowing that a feature is not reentrant is not sufficient to implement another feature and know that it will not cause recursive calls, as in your example.

            But I do agree that we should avoid, or at least minimize, exposing part of the implementation. To minimize it, I think we could come up with a set of rules (like the one I posted) that would filter out some features. Other rules could be based, for example, on the target of the call, which may provably not be the same as the caller.

            The problem may also be approached in another way: the compiler may issue an error or a warning to tell that some non-reentrant features may be in a cycle in the call graph. Whether it is an error or a warning depends on the certainty that the compiler can have about a recursive call. The small problem with that solution is that one cannot foresee the error when developing a feature unless one has the ability to see (through documentation tools) whether the feature may eventually be called (in a suspicious way, one might add) by its supplier.
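            The compile-time check described here could be sketched as a cycle search over a statically approximated call graph. The following Python is a hypothetical illustration (depth-first search, with dynamic binding already resolved pessimistically into the graph):

```python
# Hypothetical sketch of the suggested static check: find a cycle in
# the (pessimistically resolved) call graph of non-reentrant routines.

def find_cycle(graph):
    """Return one cycle as a list of routines (first == last), or None."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited / on current path / done
    color = {r: WHITE for r in graph}
    path = []

    def visit(r):
        color[r] = GREY
        path.append(r)
        for callee in graph.get(r, ()):
            if color.get(callee, WHITE) == GREY:      # back edge: cycle
                return path[path.index(callee):] + [callee]
            if color.get(callee, WHITE) == WHITE:
                cycle = visit(callee)
                if cycle:
                    return cycle
        path.pop()
        color[r] = BLACK
        return None

    for r in graph:
        if color[r] == WHITE:
            cycle = visit(r)
            if cycle:
                return cycle
    return None

# The SET example, with x.is_superset_of / x.is_subset_of resolved
# pessimistically to every possible dynamic type of x:
graph = {
    "SET_A.is_subset_of": ["SET_A.is_superset_of", "SET_B.is_superset_of"],
    "SET_B.is_superset_of": ["SET_A.is_subset_of", "SET_B.is_subset_of"],
    "SET_A.is_superset_of": [],
    "SET_B.is_subset_of": [],
}

print(find_cycle(graph))
```

            On this graph the check reports the cycle between SET_A.is_subset_of and SET_B.is_superset_of, which is exactly the one behind the original stack overflow; whether to make it an error or only a warning then depends on how precise the static approximation is.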

            I'm not sure if I understand your point correctly: why do you say that compromising the opacity of a feature's implementation would result in deep problems with polymorphism and substitutivity?


            • Bernd Schoeller (2/12/2007)


              To answer your last question: non-opaque implementations are a problem because you actually never know which code is executed when a feature is called, as the implementation may be replaced by a new one in a subtype. If you do not know which code is actually executed, there is no point in revealing details of the implementation.

              • Simon Hudon (3/12/2007)

                Ah yes, of course! Maybe I didn't use the right word, but I was thinking of a generalized transitive closure. What I mean by that is that if, at some point in the closure, the feature foo is called, we include all the features it calls plus the features all its descendant versions call.

                It may seem a little futile, because a simple change to the hierarchy of the supplier invalidates the assumption on which the client feature has been written. But if the change in the hierarchy is in the same system, which seems very probable, writing a descendant version of foo will require the same precautions. So, when writing a feature that would create a cycle, it would always be possible to realize the problem created.

                I think that there should be some supplementary tool that would reveal the possible cycle unasked because it is all too frequent to write a routine without checking all the specifications of its client routines.


                • Bernd Schoeller (3/12/2007)

                  Non-modularity - never!

                  A verification on the basis of "all descendant versions" is inherently non-modular. It can break the verification of a correct full system just by adding a class.

                  You are not alone with your suggestion to solve certain problems with Eiffel in a non-modular way. Bertrand himself has suggested non-modular reasoning for a number of problems, most prominently avoiding CAT calls.

                  I disagree that non-modular reasoning is a valid option. We would sacrifice reuse and the ability to modify the code (Open/Closed principle).

                  • Simon Hudon (7/12/2007)

                    Not giving up!

                    Well, I have tried several answers to your comment but I have discarded them all. I think we can agree that this is not an easy problem. I'm still thinking of several solutions, but I will have more time to write a decent reply after my term has come to an end.

                    I think that the next step of the debate is agreeing on some properties of the acceptable solutions.


                  • Simon Hudon (29/12/2007)

                    Modularity vs Correctness

                    I'm not sure that modularity necessarily has to be sacrificed to avoid cyclic call graphs, but if it had to be, I'm not sure that "leave modularity alone" would be the evident winner.

                    For now, I'd be interested in sketching a decent solution. I think we both agree that the routine variant idea is a good one and may play an important role in the future solution. The point where our opinions diverge, however, is whether the variant is sufficient. I'm convinced that it is not. The criticism you raised against my solution may apply to a solution constituted only of routine variants. In truth, if, by default, every routine is non-reentrant, adding some classes (possibly a library) to a system that was correct can also break it. In this case, however, it will only be found through testing. Thanks to Eiffel assertion monitoring, the error may be found very quickly, but the conclusion may be something like: "Well! It seems that library ABC from supplier X and library DEF from supplier Y cannot be used together without severe constraints".

                    For that, adding a static constraint may be necessary. One such constraint may be to disallow qualified calls to reentrant routines. That way, any allowed cycle would be contained in a single class and its ancestry. All other cycles, those involving several classes, would be illegal. The cycle detection, however, would still have the ability to break a correct system to which a correct class has been added, but the rejection would prevent real bugs. To get around the inconvenient rejection when introducing a new library into a project, it may be necessary to add overrides to the project.

                    Of course, this does not help if the project is itself a library. For that case, having something like B's refinement could be useful. Everything referring to the refined class would use the refinement instead. Contrary to the overrides already supported by EiffelStudio, the refinement could inherit the features of the refined class and redefine only the features it needs to.

                    One of the important points I'm trying to make here is that allowing arbitrary calls decreases modularity, because it may create tight coupling between unrelated pieces of software. I think this is something to be avoided, but the best way to restrict calls as little as possible is not obvious.

                    I'm eager to see what you think of this.

                    Simon Hudon