A Tool to Obtain Eiffel Code Metrics Across Many Libraries

by Finnian Reilly (modified: 2023 Sep 17)

Introduction

In mainstream programming, the extent of a codebase is usually quantified in lines of code. But this makes little sense in Eiffel, since Eiffel lines can be very long or very short depending on the personal style of the coder.

I propose measuring the extent of an Eiffel codebase by counting the occurrences of identifiers and Eiffel keywords within the body of each routine, i.e. between the do (or once) keyword and the corresponding end (or ensure) at the end of the routine. Identifiers and keywords inside code blocks defined by the check or debug keywords are ignored, since this code is most likely not operational in the finalized executable. Naturally, comments and any quoted text or characters are also ignored. I did not consider that numeric constants add much to the overall picture, so these are not counted either.

This method takes into account that contract code does not contribute to the functionality of the codebase, and neither do local declarations, attribute declarations or inheritance qualifiers, so none of these are counted.
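
The counting rule described above can be illustrated with a minimal Python sketch. This is not the Eiffel implementation used by the tool, only an approximation under stated assumptions: the keyword list is partial and hypothetical, reserved words such as Result are treated as keywords, the delimiting do, once, ensure and final end are not counted, a check or debug block is skipped together with its own delimiters, and verbatim multi-line strings and once-string expressions are not handled.

```python
import re

# Hypothetical, partial keyword list (an assumption for this sketch only).
KEYWORDS = {
    "across", "agent", "all", "and", "as", "attached", "check", "create",
    "current", "debug", "do", "else", "elseif", "end", "ensure", "false",
    "from", "if", "implies", "inspect", "invariant", "like", "loop", "not",
    "old", "once", "or", "rescue", "result", "retry", "some", "then", "true",
    "until", "variant", "void", "when", "xor",
}
# Keywords assumed to open a block that is closed by a matching `end`.
OPENERS = {"if", "inspect", "from", "across", "check", "debug", "do"}

TOKEN = re.compile(r"""
      --[^\n]*                 # comment up to the end of the line
    | "(?:%.|[^"%\n])*"        # quoted string, percent is the escape character
    | '(?:%.|[^'%\n])+'        # quoted character
    | \d\w*                    # numeric constant (never counted)
    | (?P<word>[A-Za-z]\w*)    # keyword or identifier
""", re.VERBOSE)


def count_body_tokens(source):
    """Return (keyword_count, identifier_count) summed over routine bodies."""
    keywords = identifiers = 0
    depth = 0          # 0 means we are outside any routine body
    skip_above = None  # depth at which a check/debug block was entered
    for match in TOKEN.finditer(source):
        word = match.group("word")
        if word is None:
            continue  # comment, quoted text or numeric constant
        lower = word.lower()  # Eiffel keywords are case-insensitive
        if depth == 0:
            if lower in ("do", "once"):
                depth = 1  # the body starts after this keyword
            continue
        if lower == "ensure" and depth == 1:
            depth = 0  # the postcondition ends the body
            continue
        if lower == "end":
            depth -= 1
            if depth == 0:
                continue  # closing `end` of the routine itself
            if skip_above is not None:
                if depth == skip_above:
                    skip_above = None  # the check/debug block is closed
                continue
        elif lower in OPENERS:
            if skip_above is None and lower in ("check", "debug"):
                skip_above = depth
            depth += 1
        if skip_above is not None:
            continue  # inside a check or debug block
        if lower in KEYWORDS:
            keywords += 1
        else:
            identifiers += 1
    return keywords, identifiers


SAMPLE = '''
    sum_to (n: INTEGER): INTEGER
            -- Sum of 1 .. n
        require
            positive: n > 0
        local
            i: INTEGER
        do
            from i := 1 until i > n loop
                Result := Result + i -- accumulate
                i := i + 1
            end
            check non_negative: Result >= 0 end
            debug ("trace")
                io.put_string ("done")
            end
        ensure
            at_least_n: Result >= n
        end
'''

print(count_body_tokens(SAMPLE))  # (6, 6)
```

In the sample routine only the loop contributes: the keywords from, until, loop, Result (twice) and end, plus six identifier occurrences of i and n. The signature, precondition, local declaration, check block, debug block and postcondition add nothing to either count.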

And so I created a tool to measure this, with a few other metrics thrown in for good measure (pun intended). Some example outputs are shown below.

Motivation

Part of the motivation for this project was to test out an idea for a high-speed parsing technique that makes use of "jail-broken" immutable strings of type IMMUTABLE_STRING_8. Obtaining Eiffel metrics seemed a good application of my idea, and one that could be useful and of interest. Besides, I am not happy with the current metrics I am using for the Eiffel-Loop website. I am pleased with the results, as the parsing speeds exceeded my expectations.

Core Classes

Links to the source for the core classes for this codebase analyzer are shown in order of inheritance:

  1. EL_EIFFEL_IMMUTABLE_KEYWORDS
  2. EIFFEL_SOURCE_READER
  3. EIFFEL_SOURCE_ANALYZER
  4. CODEBASE_METRICS

The class EIFFEL_SOURCE_READER could also easily be applied to making a tool that restricts a text search to one of the following categories:

  1. Only within indexing notes
  2. Only within comments
  3. Only within quoted strings that are not indexing notes

Some Example Readings

The following are some examples of the console output of this tool, including performance benchmarking information. I edited out the progress bar indicator. Two interesting comparative benchmarks to pay attention to are the ratio of keywords to identifiers, expressed as percentages (labelled "Percentiles" in the output), and the average number of keywords + identifiers per routine.

Eiffel Software Libraries (Ver. 16.05)

  Reading manifest: SRC-ISE-libraries
    /opt/ES/Eiffel_16.05/library
  Class count: 4766
  Total mega bytes: 26.23
  Keyword occurrences: 171912
  Identifier occurrences: 473361
  Keyword + identifier count: 645273
  Routine count: 45414
  External routine count: 11531
  Average keywords + identifiers per routine: 14
  Percentiles keywords:identifiers: 27%:73%
  Execution time: 1 secs 405 ms
  Previous runs: 2
  Average Execution time: 1 secs 839 ms
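
The two derived figures in this reading can be reproduced from the raw counts. The following Python sketch assumes rounding to the nearest whole number and division by the full routine count (external routines included). Both assumptions are consistent with the reported output but are not confirmed from the tool's source, and the function name is hypothetical.

```python
def derived_metrics(keywords, identifiers, routines):
    """Return (total, average per routine, keyword %, identifier %)."""
    total = keywords + identifiers
    keyword_pct = round(100 * keywords / total)
    # The identifier share is taken as the remainder so the pair sums to 100.
    return total, round(total / routines), keyword_pct, 100 - keyword_pct


# Raw counts from the Eiffel Software libraries reading
print(derived_metrics(171912, 473361, 45414))  # (645273, 14, 27, 73)
```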

GOBO Libraries distributed with ES 16.05

  Reading manifest: SRC-GOBO-libraries
    /opt/ES/Eiffel_16.05/contrib/library/gobo/library
  Class count: 2555
  Total mega bytes: 25.18
  Keyword occurrences: 279599
  Identifier occurrences: 536832
  Keyword + identifier count: 816431
  Routine count: 29733
  External routine count: 800
  Average keywords + identifiers per routine: 27
  Percentiles keywords:identifiers: 34%:66%
  Execution time: 1 secs 126 ms
  Previous runs: 3
  Average Execution time: 1 secs 661 ms

Eiffel-Loop Libraries

  Reading manifest: SRC-Eiffel-Loop-library
    /home/finnian/dev/Eiffel/library/Eiffel-Loop/library/base
    /home/finnian/dev/Eiffel/library/Eiffel-Loop/library/graphic
    /home/finnian/dev/Eiffel/library/Eiffel-Loop/library/language_interface
    /home/finnian/dev/Eiffel/library/Eiffel-Loop/library/multimedia
    /home/finnian/dev/Eiffel/library/Eiffel-Loop/library/network
    /home/finnian/dev/Eiffel/library/Eiffel-Loop/library/persistency
    /home/finnian/dev/Eiffel/library/Eiffel-Loop/library/runtime
    /home/finnian/dev/Eiffel/library/Eiffel-Loop/library/testing
    /home/finnian/dev/Eiffel/library/Eiffel-Loop/library/text
    /home/finnian/dev/Eiffel/library/Eiffel-Loop/library/utility
  Class count: 2841
  Total mega bytes: 5.91
  Keyword occurrences: 52690
  Identifier occurrences: 119240
  Keyword + identifier count: 171930
  Routine count: 14883
  External routine count: 1042
  Average keywords + identifiers per routine: 12
  Percentiles keywords:identifiers: 31%:69%
  Execution time: 2 secs 78 ms
  Previous runs: 1
  Average Execution time: 2 secs 56 ms

Format: ext4 vs NTFS

I was wondering why the Eiffel-Loop libraries take longer to parse despite being only about 6 megabytes. The only explanation I can think of is that they are located on an NTFS partition accessed by Linux, whilst the other code is on an ext4 partition, although both are on the same SSD drive. I keep an NTFS partition because it makes cross-platform development easier.
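
To put a number on the difference, the readings above can be converted to throughput in megabytes per second. This short Python sketch uses the "Total mega bytes" and latest "Execution time" values from each reading. The dictionary labels and function name are illustrative only.

```python
# (total megabytes, latest execution time in seconds) from the three readings
readings = {
    "Eiffel Software": (26.23, 1.405),
    "GOBO": (25.18, 1.126),
    "Eiffel-Loop": (5.91, 2.078),
}


def throughput(megabytes, seconds):
    """Megabytes of source parsed per second."""
    return megabytes / seconds


for name, (megabytes, seconds) in readings.items():
    print(f"{name}: {throughput(megabytes, seconds):.1f} MB/s")
# Eiffel Software: 18.7 MB/s
# GOBO: 22.4 MB/s
# Eiffel-Loop: 2.8 MB/s
```

On these figures the Eiffel-Loop run is several times slower per megabyte than the other two, which fits the observation above but does not by itself identify the file system as the cause.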