Why aren't literal strings in Eiffel read-only? Indeed, why isn't class STRING read-only? I think (as do others) that Java gets it right in this respect, with its separation of String and StringBuffer.
All sorts of hard-to-debug problems can arise because there is no read-only variant of STRING. We had one of these occur in the Gobo XML library - the XML parser was corrupted because one of the event filters was editing one of the strings emitted by the parser.
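To make the failure mode concrete, here is a minimal sketch of the aliasing involved. The class and variable names are made up for illustration and have nothing to do with the actual Gobo XML code; the only point is that reference assignment shares one STRING object, so an edit by the consumer is visible to the producer.

```eiffel
class
	ALIASING_DEMO
		-- Hypothetical demo class, not part of any library.

create
	make

feature {NONE} -- Initialization

	make
			-- Show how a consumer can change a string owned by a producer.
		local
			emitted: STRING
			received: STRING
		do
			emitted := "element"
				-- Reference assignment: both names now denote the same object.
			received := emitted
				-- The consumer edits what it was handed...
			received.append ("_x")
				-- ...and the producer's string now reads "element_x" as well.
			print (emitted)
			print ("%N")
		end

end
```

With a read-only string type, the call to `append` on the received string would be rejected at compile time instead of silently changing the producer's state.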
Our work-around for this problem was the following class:
	description: "STRINGs with copy-on-write semantics"
	library: "Gobo Eiffel String Library"
	copyright: "Copyright (c) 2005, Colin Adams and others"
	license: "MIT License"
	date: "$Date: 2007-01-26 18:55:25 +0000 (Fri, 26 Jan 2007) $"
	revision: "$Revision: 5877 $"

class interface
	ST_COPY_ON_WRITE_STRING

feature -- Access

			-- Version of item that is safe for editing
		ensure
			same_as_item: Result /= Void and then Result = item

feature -- Element change

	append_character (c: CHARACTER_8)
			-- Append `c' at end.
		ensure
			new_count: item.count = old item.count + 1
			appended: item.item (item.count) = c

	append_string (s: STRING_8)
			-- Append a copy of `s' at end.
		require
			s_not_void: s /= Void

	fill_with (c: CHARACTER_8)
			-- Replace every character with `c'.
		ensure
			same_count: old item.count = item.count
			filled: item.occurrences (c) = item.count

	insert_character (c: CHARACTER_8; i: INTEGER_32)
			-- Insert `c' at index `i', shifting characters between
			-- ranks `i' and `count' rightwards.
		require
			valid_insertion_index: 1 <= i and i <= item.count + 1
		ensure
			one_more_character: item.count = old item.count + 1
			inserted: item.item (i) = c

	put (c: CHARACTER_8; i: INTEGER_32)
			-- Replace character at index `i' by `c'
		require
			valid_index: item.valid_index (i)
		ensure
			stable_count: item.count = old item.count
			replaced: item.item (i) = c

invariant
	item_not_void: item /= Void

end -- class ST_COPY_ON_WRITE_STRING
Hm. Now that I look at it, the contract for append_string could be strengthened with postconditions.
The _8 suffixes are only present because I used the EiffelStudio interface view.
This class can be used in a lot of situations to avoid problems. The basic idea is to avoid duplicating strings needlessly.
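For readers who have not met the technique, copy-on-write can be sketched roughly as follows. This is a simplified, hypothetical class written for this post, not the Gobo implementation: the names `COW_STRING_SKETCH`, `is_private` and `make_private` are my own, and only one editing feature is shown. The string handed to `make` is shared until the first edit, at which point a private copy is taken.

```eiffel
class
	COW_STRING_SKETCH
		-- Hypothetical, simplified illustration; not the Gobo class.

create
	make

feature {NONE} -- Initialization

	make (a_string: STRING)
			-- Share `a_string' without copying it.
		require
			a_string_not_void: a_string /= Void
		do
			item := a_string
			is_private := False
		ensure
			shared: item = a_string
		end

feature -- Access

	item: STRING
			-- Current content; clients are expected not to edit it directly.

feature -- Element change

	append_character (c: CHARACTER)
			-- Append `c' at end, duplicating the shared string first if needed.
		do
			make_private
			item.append_character (c)
		ensure
			new_count: item.count = old item.count + 1
			appended: item.item (item.count) = c
		end

feature {NONE} -- Implementation

	is_private: BOOLEAN
			-- Has `item' already been duplicated for this object?

	make_private
			-- Replace `item' by a private copy unless that was already done.
		do
			if not is_private then
				item := item.twin
				is_private := True
			end
		ensure
			now_private: is_private
		end

invariant
	item_not_void: item /= Void

end
```

The saving comes from the common case where nobody edits the string at all: no copy is ever made. The weakness is that it relies on discipline, since `item` itself is still an ordinary mutable STRING.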
But it would be much better if we could follow the Java line here.
I think it ought to be possible to do this without breaking (much) existing code, by making use of the convert keyword. Rather than having STRING_GENERAL inherit from READ_ONLY_STRING_GENERAL, we have parallel hierarchies, and say that the read-only versions can convert to the existing versions (the creation procedures involved would of course copy the characters of the string).
Then we could change string literals to be of type READ_ONLY_STRING (or rather, one of its aliases), and everything should work just fine.
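A rough sketch of how the conversion could look is given below. Everything in it is an assumption made for illustration: `READ_ONLY_STRING_SKETCH` is not an existing library class, and because I cannot add a conversion creation procedure to STRING in a sketch, the conversion is expressed from the read-only side with a conversion query. Either way, the conversion copies the characters, so the result can be edited freely.

```eiffel
class
	READ_ONLY_STRING_SKETCH
		-- Hypothetical read-only string; not an EiffelBase class.

create
	make_from_string

convert
	to_string: {STRING}

feature {NONE} -- Initialization

	make_from_string (a_string: STRING)
			-- Initialize from a private copy of `a_string'.
		require
			a_string_not_void: a_string /= Void
		do
			content := a_string.twin
		ensure
			same_count: count = a_string.count
		end

feature -- Access

	count: INTEGER
			-- Number of characters
		do
			Result := content.count
		end

	item (i: INTEGER): CHARACTER
			-- Character at index `i'
		require
			valid_index: 1 <= i and i <= count
		do
			Result := content.item (i)
		end

feature -- Conversion

	to_string: STRING
			-- Fresh mutable copy; editing it cannot affect Current.
		do
			Result := content.twin
		ensure
			result_not_void: Result /= Void
			new_object: Result /= content
		end

feature {NONE} -- Implementation

	content: STRING
			-- Characters of Current; not exported, so clients cannot edit them.

invariant
	content_not_void: content /= Void

end
```

Given `r: READ_ONLY_STRING_SKETCH` and `s: STRING`, the assignment `s := r` is then accepted and treated as `s := r.to_string`. Existing code that expects a STRING keeps compiling, at the price of one copy per conversion.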
Mutable strings are evil
I, too, often encounter hard-to-track-down bugs in Eiffel code caused by the fact that STRING is mutable. Every other OO language that I've worked with (Delphi, C#, Java) treats strings as immutable. This is a bit weird, because it means that strings are reference types with some expanded semantics; so the Eiffel approach is more consistent with the rest of the type system. But having a few years of Eiffel development under my belt now, I can safely say that it's not just a prejudice based on what I'm used to: mutable strings are evil!
Performance of string comparisons
The disadvantage of using convert is that the cost of string comparisons, already an expensive operation, is increased by the need to create a temporary object. Colin Adams
Immutable STRING variants are something that I've brought up a number of times. I fully agree that Eiffel needs immutable strings. Keep up the comments.
It might be worth running a poll on this. Options such as:
Inherit and convert
Can't we have both 2) and 3)? Mutable strings would conform to constant strings and constant strings would convert to mutable strings. Comparison would use constant strings as argument and hence conformance would be involved (no performance implications).
The problem with STRING_GENERAL inheriting from CONSTANT_STRING_GENERAL is this: declaring an argument as constant is a good way for a feature to state that it won't alter the strings passed to it, but it does not mean that the string will not be modified by another feature (or by the same feature after an assignment attempt) if its dynamic type is in fact that of a mutable string. So we think that the string passed as CONSTANT_STRING_GENERAL will not change, but it can.
In fact it's not surprising. Even if the inheritance is appealing, a mutable string is not a constant string. It's like having RECTANGLE inherit from SQUARE. Hmmm, so I guess that in order to be 100% safe we need conversion both ways.
For comparison and performance, we probably need a common ancestor to STRING_GENERAL and CONSTANT_STRING_GENERAL. READONLY_STRING_GENERAL? We can only read the content of the string through this interface, but it's not necessarily a constant string. Polymorphically it can be attached to a mutable string whose content can be modified.
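The point about a read-only interface not being a guarantee of constancy can be shown in a few lines. The sketch below assumes an EiffelBase version in which STRING_8 conforms to a read-only ancestor named READABLE_STRING_8; with an older library the same effect can be reproduced with any ancestor type that exposes only queries. The demo class name is made up.

```eiffel
class
	READABLE_VIEW_DEMO
		-- Hypothetical demo class, not part of any library.

create
	make

feature {NONE} -- Initialization

	make
			-- Show that a read-only view of a mutable string can still change.
		local
			buffer: STRING_8
			view: READABLE_STRING_8
		do
			buffer := "abc"
				-- Conformance, not conversion: no copy, same object.
			view := buffer
				-- Nothing can be edited through `view', but `buffer' is still mutable.
			buffer.append ("def")
				-- The view reflects the edit: its count is now 6.
			print (view.count)
			print ("%N")
		end

end
```

So a feature that accepts the read-only type only promises not to edit the string itself; it gets no promise that the string stays the same while it holds the reference. Only a conversion (that is, a copy) into a genuinely immutable type gives that.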
READABLE_STRING is better than READONLY_STRING_GENERAL (because the dynamic type may also be writable, and I think we will not need _8 and _32 descendants for this class).
For comparison, there are same_string and is_equal. same_string can accept a READABLE_STRING, and so avoid a conversion, but what about is_equal? It takes a like Current argument, and so a conversion will be involved.
Taking the discussion elsewhere.
I'm in the process of creating a Wiki page with rationale and implementation suggestions. I'll post a link when it is in at least a legible draft state.
Wiki document on Immutable Strings.
I think the signature of is_equal is subject to change anyway.