Towards a CQS checker

by Colin Adams (modified: 2013 Feb 16)

The single hardest problem with observing CQS is knowing which "functions" [1] you intend to call are themselves observing CQS. Ideally, the compiler would flag an error when encountering them, so you don't have to. Since this isn't happening yet, I propose to write a (conservative) CQS checker, to warn the author of a function.

A sufficient but unnecessary condition for a query to observe CQS is [2]:

  1. Don't issue any commands
  2. Don't call any impure "functions"
  3. Don't assign to any attributes
  4. Don't call any externals
  5. Don't issue any create instructions

So to check a function for purity, we check the body and rescue clause [3] honour these restrictions, recursively checking any queries that we call as we go. Debug statements can be skipped [4].

The resulting checker [5] will be very conservative. That is, it will flag queries as impure when it need not do so. So my plan is to iteratively improve the checker by making it less conservative. I'm not aware of any published literature on this problem for Eiffel. please alert me to anything you know of.

The prohibitions above are listed in approximate order of how strictly you must observe them. Calling commands or impure "functions" are absolute prohibitions [6]. in some cases we can relax the restriction on assigning to attributes. If an attribute is exported to NONE in a frozen class, then we can be sure that it's value is not used in routines of any class other than the one we are checking. In which case, if we can check the rest of the class text to see if the attribute is only referenced in commands, we can let the assignment pass. This complicates the logic of the checker, so it will certainly not attempt this check in it's initial incarnation. I think the introduction of once ("OBJECT") syntax eliminates most valid usages anyway.

The prohibition on calls to externals is necessary because C does not pretend to honour CQS, despite the principle having been established before C was born [7]. However many C functions are indeed of the purest hue. Without wishing to extend the checker to check C code, the only approach available is for the Eiffel author to certify purity. A note clause named "pure" whose value is the reason the Eiffel author believes the routine to be a pure function, sounds best. The checker should print out the location and text of each reason, so the person running the checker can assess the trustworthiness of the certification.

The prohibition on create instructions is the one that is most conservative. And I think this is the area which will need the most effort in gradually making the checker less conservative. Clearly any creation instruction that accepts Current as an argument is in a position to change the state of Current. I doubt if we will ever be sophisticated enough to check such a call for harmlessness. At the other extreme, creating a local with ANY.default_create will be quite harmless (in the absence of attribute initializers). The first iteration of the checker will ban all creation instructions. As such, it is likely to be fairly useless.

[1] Eiffel queries, which are not attributes. Attributes with initializers should also be considered, but as a first approximation, I shall assume all attributes observe CQS.

[2] Please tell me if I've missed anything

[3] We should check the preconditions and postconditions too

[4] Optionally, we can check them too.

[5] I guess I will use the Gobo gelint library to implement the checker, as I'm at least slightly familiar with using it. An alternative would be to fork Eiffel Studio, but I fear that as they use svn rather than git, I would never get the results merged back in, and so would gradually fall behind.

[6] Calling a command or an impure function for e.g. logging or tracing purposes can be arguably legitimate, but it would be hard to prove safe, I think. My own view is that these should be in keyed debug statements. Then the default plan for the CQS checker - ignore debug statements, suffices.

[7] Strachey, Christopher (2000). "Fundamental Concepts in Programming Languages". Higher-Order and Symbolic Computation 13: 11–49. doi:10.1023/A:1010000313106. Available as http://www.itu.dk/courses/BPRD/E2009/fundamental-1967.pdf