I've been with Ruby for about a year now and have a language question: are symbols necessary because Ruby strings are mutable and not interned?
No.
Symbol and String are simply two different data types. String is for text, Symbol is for labels.
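For example, here is a minimal sketch of the two use cases (the names are purely illustrative):

# String: text data that you inspect and manipulate
title = "hello world"
title.upcase                 #=> "HELLO WORLD"

# Symbol: a label that only ever stands for itself
person = { name: "Alice" }   # :name identifies a slot; it is not text
person[:name]                #=> "Alice"
person.respond_to?(:fetch)   #=> true -- method names are labels, too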
In, say, Java, strings are immutable and interned.
No, they are not. They are immutable, and sometimes interned, sometimes not. If all Strings were interned, why would there be a method java.lang.String.intern() that interns a String? Strings in Java are only interned if

- you call java.lang.String.intern(),
- the String is the result of a String literal expression, or
- the String is the result of a String-typed constant value expression.

Otherwise, they are not.
So "foo" is always equal to "foo" in value and reference and its value cannot change.
Again, this is not true:
class Test {
    public static void main(String... args) {
        System.out.println("foo".equals(args[0]));
        System.out.println("foo" == args[0]);
    }
}
Call it with java Test foo and it prints:

# true
# false

The two strings are equal in value but not in reference: args[0] is constructed at runtime from the command line, so it is never interned.
In Ruby, strings are mutable and not interned, so "a".object_id == "a".object_id will be false.
In modern Ruby, that is not necessarily true either:
# frozen_string_literal: true

"a".object_id == "a".object_id
#=> true
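Note that even without the pragma you can opt into deduplication explicitly; a small sketch (String#-@ requires Ruby 2.5+):

# Each bare literal still creates a fresh object:
"a".object_id == "a".object_id   #=> false

# But String#-@ returns a frozen, deduplicated copy:
(-"a").equal?(-"a")              #=> true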
If Ruby had implemented strings like Java, symbols wouldn't be necessary, right?
No. Like I said, they are different types for different use cases.
Take a look at Scala, for example, which implements "strings like Java" (in fact, on the JVM implementation of Scala there is no separate String type: Scala's String simply is java.lang.String). Yet it also has a Symbol class.
Likewise, Clojure has not one but two datatypes like Ruby's Symbol: keywords are exactly equivalent to Ruby's Symbols in that they evaluate to themselves and stand only for themselves, whereas Clojure's symbols may stand for something else.
Erlang has immutable strings and atoms, which are like Clojure/Lisp symbols.
ECMAScript has immutable strings and recently added a Symbol datatype. They are not 100% equivalent to Ruby Symbols, though, since they have an additional guarantee: not only do they evaluate only to themselves and stand only for themselves, but they are also unforgeable (meaning it is impossible to create a Symbol which is equal to another Symbol).
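For contrast, Ruby Symbols are forgeable: a Symbol with a given name can always be (re)constructed at runtime, and it will be identical to every other Symbol of that name. A one-liner:

# Forging a Symbol from a String at runtime yields the very same object:
"foo".to_sym.equal?(:foo)   #=> true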
Note that Ruby is moving away from mutable strings:
- Ruby 2.1 optimizes the pattern 'literal string'.freeze to return a frozen string from a global string pool (see the sketch after this list).
- Ruby 2.3 introduces the # frozen_string_literal: true pragma and the --enable=frozen-string-literal feature toggle to make all string literals frozen (and pooled) by default, on a per-script (pragma) or per-process (feature toggle) basis.
- Ruby 3 will switch the default for both of those to true, so that you have to explicitly say # frozen_string_literal: false or --disable=frozen-string-literal in order to get the current behavior.
- Some later version will remove support for mutable strings altogether.
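Here is a quick sketch of the 2.1 and 2.3 mechanisms mentioned above (the exact error class for mutation varies by version):

# Ruby 2.1+: 'literal'.freeze is served from the global string pool,
# so both sides are the very same object:
"foo".freeze.equal?("foo".freeze)   #=> true

# Ruby 2.3+: with `# frozen_string_literal: true` at the top of the file
# (or `ruby --enable=frozen-string-literal`), literals are frozen and pooled:
#   "foo".equal?("foo")   #=> true
#   "foo" << "bar"        # raises FrozenError (RuntimeError on older Rubies)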