What is the problem with my regular expression, and how can I fix it?
You are using a character class
[http.?://(www.)?]
This means:
- either an h
- or a t
- or a t
- or a p
- or a .
- or a ?
- or a :
- or a /
- or a /
- or a (
- or a w
- or a w
- or a w
- or a .
- or a )
- or a ?
A character class matches exactly one of the listed characters. It does not include an s, so it can never match the s in https://.
It is not clear to me why you are using a character class here, nor why you are using duplicate characters in the class.
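To see the difference in practice, here is a small sketch that runs the bracketed pattern against a full URL and compares it with the same text written as a sequence. The second pattern is only my guess at what was intended (https? for an optional s, and an escaped dot after www), not something taken from the question:

```scala
import scala.util.matching.Regex

// The bracketed pattern: a set of single characters, not a sequence.
val charClass: Regex = "[http.?://(www.)?]".r

// Every match of a character class is exactly one character long.
val firstMatch: Option[String] = charClass.findFirstIn("https://www.test.net")
// firstMatch == Some("h")

// Assumed intent: the same text as a sequence, with the dot escaped
// so that it matches only a literal dot.
val prefix: Regex = """https?://(?:www\.)?""".r
val matchedPrefix: Option[String] = prefix.findPrefixOf("https://www.test.net")
// matchedPrefix == Some("https://www.")
```

Inside the brackets, characters such as ( ) ? and . lose their special meaning and are treated as literals, which is why the class behaves so differently from the sequence.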
Ideally, you shouldn't try to parse URIs yourself; someone else has already done the hard work. You could, for example, use the java.net.URI class:
import java.net.URI
val u1 = new URI("test.net")
u1.getHost
// res: String = null
val u2 = new URI("https://www.test.net")
u2.getHost
// res: String = www.test.net
val u3 = new URI("https://test.net")
u3.getHost
// res: String = test.net
val u4 = new URI("http://www.test.net")
u4.getHost
// res: String = www.test.net
val u5 = new URI("http://test.net")
u5.getHost
// res: String = test.net
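Unfortunately, as you can see, what you want to achieve does not actually comply with the official URI syntax: without a scheme, test.net is parsed as a relative path rather than as a host, so getHost returns null. Note also that getHost keeps the www. prefix.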
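One possible way to make scheme-less input comply is to prepend a default scheme before handing the string to java.net.URI. This is only a minimal sketch: the helper name hostOf is hypothetical, and defaulting to http and stripping a leading www. are my assumptions about what you want:

```scala
import java.net.URI

// Hypothetical helper: prepend a default scheme when none is present,
// so that java.net.URI sees an authority (host) component.
def hostOf(input: String): Option[String] = {
  val withScheme = if (input.contains("://")) input else s"http://$input"
  // getHost returns null when no host can be parsed; Option(...) turns that into None.
  Option(new URI(withScheme).getHost).map(_.stripPrefix("www."))
}

hostOf("test.net")              // Some("test.net")
hostOf("https://www.test.net")  // Some("test.net")
```

Keep in mind that the URI constructor throws a URISyntaxException for malformed input, so real code would probably want to wrap the call in scala.util.Try.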
If you can fix that, then you can use java.net.URI. Otherwise, you will need to go back to your old solution and parse the URI yourself:
// The dot after www is escaped so that it matches only a literal dot, not any character.
val re = """(?>https?://)?(?>www\.)?([^/?#]*)""".r
val re(domain1) = "test.net"
//=> domain1: String = test.net
val re(domain2) = "https://www.test.net"
//=> domain2: String = test.net
val re(domain3) = "https://test.net"
//=> domain3: String = test.net
val re(domain4) = "http://www.test.net"
//=> domain4: String = test.net
val re(domain5) = "http://test.net"
//=> domain5: String = test.net
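Be aware that a pattern-matching val such as val re(domain) = ... requires the whole string to match and throws a scala.MatchError otherwise, for example when the input has a path after the host. If that can happen with your data, a safer variant might look like the sketch below. The names domainRe and domainOf are hypothetical; the pattern is the same as above with the dot after www escaped:

```scala
import scala.util.matching.Regex

// Same pattern as above, with the dot after www escaped.
val domainRe: Regex = """(?>https?://)?(?>www\.)?([^/?#]*)""".r

// Hypothetical helper: match only a prefix of the input, so a trailing
// path, query, or fragment is ignored instead of causing a MatchError.
def domainOf(input: String): Option[String] =
  domainRe.findPrefixMatchOf(input)
    .map(_.group(1))     // the captured host part
    .filter(_.nonEmpty)  // treat an empty capture as "no domain found"

domainOf("https://www.test.net")      // Some("test.net")
domainOf("http://test.net/path?q=1")  // Some("test.net")
```

Returning an Option leaves it to the caller to decide what to do when no domain is found, rather than failing at the point of the match.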