I'm trying to get the domain of a given URL. For example, http://www.facebook.com/someuser/ should return facebook.com. The given URL can be in any of these formats:
- https://www.facebook.com/someuser (www. is optional, but should be ignored)
- www.facebook.com/someuser (http:// is not required)
- facebook.com/someuser
- http://someuser.tumblr.com -> this has to return tumblr.com only
I wrote this regex:
/(?:\.|\/{2})(?:www\.)?([^\/]*)/i
But it does not work as I expect.
I can do this in parts:
- Remove http:// and https://, if present in the string, with string.delete "/https?:\/\//i".
- Remove www. with string.delete "/www\./i".
- Get the domain with match and /(\w+\.\w+)+/i
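To make the three steps above concrete, here is a rough sketch of them in Ruby. The method name domain_in_parts is made up for illustration. I've swapped in String#sub for the two removal steps, since String#delete takes character sets rather than regular expressions, and I've anchored both patterns to the start of the string.

```ruby
# Rough sketch of the step-by-step approach; `domain_in_parts` is a
# hypothetical name used only for this example.
def domain_in_parts(url)
  # Step 1: drop a leading http:// or https://, if present.
  stripped = url.sub(/\Ahttps?:\/\//i, "")
  # Step 2: drop a leading "www.", if present.
  stripped = stripped.sub(/\Awww\./i, "")
  # Step 3: take the first "word.word" run.
  match = stripped.match(/(\w+\.\w+)+/i)
  match && match[0]
end

puts domain_in_parts("https://www.facebook.com/username") # facebook.com
puts domain_in_parts("http://sub.tumblr.com/")            # sub.tumblr
```

The second call shows the problem: the match stops at the first two labels, so a subdomain ends up in the result.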
But this won't work with subdomains. Strings for testing:
https://www.facebook.com/username
http://last.fm/user/username
www.google.com
facebook.com/username
http://sub.tumblr.com/
sub.tumblr.com
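To spell out the results I expect for these strings, here is a minimal baseline sketch. The method name naive_domain is hypothetical, and the expected values for last.fm and google.com are inferred from the facebook.com and tumblr.com examples above. It assumes the domain is always the last two dot-separated labels of the host, which would be wrong for suffixes such as co.uk, and it is not necessarily the cheapest way to do this.

```ruby
# Baseline sketch; `naive_domain` is a hypothetical name.
# Assumption: the domain is the last two labels of the host
# (this breaks for multi-part suffixes such as "co.uk").
def naive_domain(url)
  # Drop an optional http:// or https:// prefix, then keep everything
  # before the first "/" as the host.
  host = url.sub(/\Ahttps?:\/\//i, "").split("/").first.to_s
  # Keep only the last two dot-separated labels.
  host.split(".").last(2).join(".")
end

# Test string => result I expect.
expected = {
  "https://www.facebook.com/username" => "facebook.com",
  "http://last.fm/user/username"      => "last.fm",
  "www.google.com"                    => "google.com",
  "facebook.com/username"             => "facebook.com",
  "http://sub.tumblr.com/"            => "tumblr.com",
  "sub.tumblr.com"                    => "tumblr.com"
}

expected.each do |url, domain|
  result = naive_domain(url)
  puts "#{url} -> #{result} (#{result == domain ? 'ok' : 'MISMATCH'})"
end
```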
I need this to work with as little memory and processing cost as possible.
Any ideas?