Jay Taylor's notes

Regex and Pattern Matching in Scala - Stack Overflow

Original source (stackoverflow.com)
Tags: scala regular-expressions case-matching syntax stackoverflow.com
Clipped on: 2012-09-01

I am not strong in regex, and pretty new to Scala. I would like to be able to find a match between the first letter of a word, and one of the letters in a group such as "ABC". In pseudocode, this might look something like:

case Process(word) =>
   word.firstLetter match {
      case([a-c][A-C]) =>
      case _ =>
   }
}

but I don't know how to grab the first letter in Scala instead of Java, how to express the regular expression properly, nor if it's possible to do this within a case class. Any suggestions? Thanks in advance.

Bruce

asked Jan 8 '11 at 22:50
Be warned: in Scala (and the *ML languages), "pattern matching" has another meaning that is very different from regexes. – delnan Jan 8 '11 at 22:51
So does that mean I should be trying to keep them separate? – Bruce Ferguson Jan 8 '11 at 22:56
You probably want [a-cA-C] for that regular expression. – pst Jan 8 '11 at 23:27
In Scala 2.8, strings are implicitly converted to Traversable (like List and Array). If you want the first 3 chars, try "my string".take(3); for the first char, "foo".head – shellholic Jan 9 '11 at 1:15

4 Answers

Accepted answer (34 votes)

You can do this because regular expressions define extractors, but you need to define the regex pattern first. I don't have access to a Scala REPL to test this, but something like this should work.

val Pattern = "([a-cA-C])".r
// take(1) yields the first letter as a String; a Regex extractor must match its whole input
word.take(1) match {
   case Pattern(c) => // c is bound to the capture group here
   case _ =>
}
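As a self-contained sketch of the extractor idea, the snippet below wraps the match in a helper. The names FirstLetterExtractor and classify are hypothetical, chosen only for illustration, and the returned strings are placeholders for whatever each branch should really do.

```scala
object FirstLetterExtractor {
  // A Regex used as an extractor has to match the whole input string,
  // so it is applied to a one-character string here.
  val Pattern = "([a-cA-C])".r

  // Hypothetical helper: describes the first letter of `word`.
  def classify(word: String): String = word.take(1) match {
    case Pattern(c) => s"starts with '$c'" // c is the captured letter
    case _          => "no match"          // other letter, or empty string
  }
}
```

With this sketch, FirstLetterExtractor.classify("Cat") returns "starts with 'C'", while "dog" and the empty string both fall through to "no match".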
answered Jan 8 '11 at 23:03
No more votes today :( Nice answer though. – pst Jan 8 '11 at 23:27

As delnan pointed out, the match keyword in Scala has nothing to do with regexes. To find out whether a string matches a regex, you can use the String.matches method. To find out whether a string starts with an a, b or c in lower or upper case, the regex would look like this:

word.matches("[a-cA-C].*")

You can read this regex as "one of the characters a, b, c, A, B or C followed by anything" (. means "any character" and * means "zero or more times", so ".*" is any string).
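The trailing ".*" matters because String.matches succeeds only when the regex matches the entire string. A minimal sketch, with the hypothetical names StartsWithAbc and startsWithAbc:

```scala
object StartsWithAbc {
  // String.matches requires the regex to cover the whole string,
  // so ".*" is needed to consume everything after the first letter.
  def startsWithAbc(word: String): Boolean = word.matches("[a-cA-C].*")

  // Without ".*" only one-letter strings can match.
  def isSingleAbcLetter(word: String): Boolean = word.matches("[a-cA-C]")
}
```

Here startsWithAbc("Cat") is true, while isSingleAbcLetter("Cat") is false because "at" is left unmatched. Note that "." does not match line terminators by default, so this sketch assumes single-line input.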

answered Jan 8 '11 at 22:57
Ok, cool. Thanks – Bruce Ferguson Jan 8 '11 at 23:00

String.matches is the way to do pattern matching in the regex sense.

But as a handy aside, word.firstLetter in real Scala code looks like:

word(0)

Scala treats Strings as sequences of Chars, so if for some reason you wanted to explicitly get the first character of the String and match it, you could use something like this:

"Cat"(0).toString.matches("[a-cA-C]")
res10: Boolean = true

I'm not proposing this as the general way to do regex pattern matching, but it's in line with your proposed approach to first find the first character of a String and then match it against a regex.
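Since the first character is a Char, it can also be matched with ordinary Scala pattern matching and no regex at all. This is only a sketch of that alternative, not something proposed in the thread. The names FirstCharMatch and startsWithAbc are hypothetical, and headOption is used because word(0) throws on an empty string.

```scala
object FirstCharMatch {
  // Hypothetical helper: true if `word` starts with a, b or c in either case.
  def startsWithAbc(word: String): Boolean = word.headOption match {
    // Pattern alternatives on Char literals, no regex involved.
    case Some('a' | 'b' | 'c' | 'A' | 'B' | 'C') => true
    // Any other first character, or an empty string (None).
    case _ => false
  }
}
```

For example, FirstCharMatch.startsWithAbc("Cat") is true and FirstCharMatch.startsWithAbc("") is false.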

EDIT: To be clear, the way I would do this is, as others have said:

"Cat".matches("^[a-cA-C].*")
res14: Boolean = true

Just wanted to show an example as close as possible to your initial pseudocode. Cheers!

answered Jan 8 '11 at 23:46
"Cat"(0).toString could be more clearly written as "Cat" take 1, imho. – David Winslow Jan 9 '11 at 17:08

To expand a little on Andrew's answer: The fact that regular expressions define extractors can be used to decompose the substrings matched by the regex very nicely using Scala's pattern matching, e.g.:

val Process = """([a-cA-C])([^\s]+)""".r // define first, rest is non-space
for (p <- Process findAllIn "aha bah Cah dah") p match {
  case Process("b", _) => println("first: 'b', some rest")
  case Process(_, rest) => println("some first, rest: " + rest)
  // etc.
}
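The same decomposition can be sketched as a function that returns the captured groups instead of printing them. The names SplitWords and firstAndRest are hypothetical and used only for illustration.

```scala
object SplitWords {
  // First group: one letter from a-c or A-C; second group: one or more non-space chars.
  val Process = """([a-cA-C])([^\s]+)""".r

  // Hypothetical helper: (first letter, rest) for every regex match found in `text`.
  def firstAndRest(text: String): List[(String, String)] =
    Process.findAllIn(text).toList.collect {
      // Each matched substring is decomposed again by the extractor.
      case Process(first, rest) => (first, rest)
    }
}
```

SplitWords.firstAndRest("aha bah Cah dah") returns List(("a","ha"), ("b","ah"), ("C","ah"), ("a","h")). The last pair comes from the "ah" inside "dah": findAllIn searches anywhere in the text and is not tied to word starts. If only whole words are wanted, prefixing the pattern with a word boundary (\b) is one option.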
answered Jan 9 '11 at 0:34