The following code:
val sentence = "1 2  3   4".split(" ")
gives me:
Array(1, 2, "", 3, "", "", 4)
but I would rather have only the words:
Array(1, 2, 3, 4)
How can I split the sentence when the words are separated by multiple spaces?
Use a regular expression:
scala> "1 2  3".split(" +")
res1: Array[String] = Array(1, 2, 3)
The "+" means "one or more of the previous" (previous being a space).
Better yet, if you want to split on all whitespace:
scala> "1 2  3".split("\\s+")
res2: Array[String] = Array(1, 2, 3)
(Here "\\s" is the regex character class that matches any whitespace character, such as a space, tab or newline. See the java.util.regex.Pattern documentation for more examples.)
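One caveat with the regex approach: if the input starts with whitespace, split still returns an empty first element, so it may help to trim first. A minimal sketch follows; the object and method names (SplitOnWhitespace, naive, trimmed) are made up for illustration.

```scala
object SplitOnWhitespace {
  // Splits on runs of whitespace; leading whitespace still yields an empty first element.
  def naive(s: String): Array[String] = s.split("\\s+")

  // Trimming first removes leading and trailing whitespace before splitting.
  def trimmed(s: String): Array[String] = s.trim.split("\\s+")

  def main(args: Array[String]): Unit = {
    // Hypothetical input with leading, trailing and repeated whitespace.
    val padded = "  1 2  3\t4 "
    println(naive(padded).mkString("[", ",", "]"))   // [,1,2,3,4]
    println(trimmed(padded).mkString("[", ",", "]")) // [1,2,3,4]
  }
}
```

Trailing whitespace is not a problem by itself, since split drops trailing empty strings by default.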
Alternatively, you can filter the empty strings out of the split array:
scala> val sentence = "1 2  3   4".split(" ").filterNot(_ == "")
sentence: Array[java.lang.String] = Array(1, 2, 3, 4)
Splitting on the regular expression \\W+ (one or more non-word characters) yields the alphanumeric words, thus
val sentence = "1 2  3   4".split("\\W+")
sentence: Array[String] = Array(1, 2, 3, 4)
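Note that \\W matches every character that is not a letter, digit or underscore, so it also splits on punctuation. The sketch below compares it with \\s+ on a hypothetical input; the names NonWordSplit, byNonWord and byWhitespace are illustrative only.

```scala
object NonWordSplit {
  // Splits on runs of non-word characters (anything other than letters, digits, underscore).
  def byNonWord(s: String): Array[String] = s.split("\\W+")

  // Splits on runs of whitespace only, leaving punctuation inside the tokens.
  def byWhitespace(s: String): Array[String] = s.split("\\s+")

  def main(args: Array[String]): Unit = {
    // Hypothetical input containing an apostrophe and a decimal point.
    val text = "it's 4.5  apples"
    println(byNonWord(text).mkString("[", ",", "]"))    // [it,s,4,5,apples]
    println(byWhitespace(text).mkString("[", ",", "]")) // [it's,4.5,apples]
  }
}
```

Which of the two is preferable depends on whether punctuation should count as part of a word.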
For ease of use, in Scala 2.10.* and 2.11.* consider
implicit class RichString(val s: String) extends AnyVal {
def words = s.split("\\W+")
}
Thus,
"1 2  3   4".words
res: Array[String] = Array(1, 2, 3, 4)
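Outside the REPL, a value class has to be top-level or a member of a statically accessible object, so the extension is usually wrapped in an object and imported. A minimal compilable sketch, assuming the wrapper names StringSyntax and WordsDemo (both invented here):

```scala
object StringSyntax {
  // Value classes must be top-level or members of a statically accessible object.
  implicit class RichString(val s: String) extends AnyVal {
    // Splits the string on runs of non-word characters.
    def words: Array[String] = s.split("\\W+")
  }
}

object WordsDemo {
  import StringSyntax._ // brings the implicit class into scope

  def main(args: Array[String]): Unit = {
    println("1 2  3   4".words.mkString("[", ",", "]")) // [1,2,3,4]
  }
}
```

Note that words is called on the String itself, not on an array that has already been split.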