The app I am writing deals with utility service addresses, and right now I am forcing the user to know enough to separate the parts of the address and put them in the appropriate fields before adding it to the database. It has to be done this way for sorting purposes, because a straight alphabetical sort on the full address string isn't always right when there is a pre-direction in the address. For example, right now if the user wanted to put in the service address 123 N Main St, they would enter it as:
- Street Number = 123
- Pre-direction = N
- Street Name = Main
- Street Type = St
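To illustrate why the fields need to be separate, here is a minimal sketch of the kind of sort I am after. The ServiceAddress class and the sample addresses are made up for this example and are not the actual types in my app; it only assumes the four fields listed above.

```vbnet
Imports System
Imports System.Linq

' Hypothetical container for the four fields listed above (not the real app type).
Public Class ServiceAddress
    Public Property StreetNumber As Integer
    Public Property PreDirection As String
    Public Property StreetName As String
    Public Property StreetType As String

    Public Overrides Function ToString() As String
        ' Join only the parts that are present, so a missing pre-direction leaves no gap.
        Dim parts As String() = {StreetNumber.ToString(), PreDirection, StreetName, StreetType}
        Return String.Join(" ", parts.Where(Function(p) Not String.IsNullOrEmpty(p)))
    End Function
End Class

Module SortSketch
    Sub Main()
        ' Made-up sample data.
        Dim addresses As ServiceAddress() = {
            New ServiceAddress With {.StreetNumber = 123, .PreDirection = "N", .StreetName = "Main", .StreetType = "St"},
            New ServiceAddress With {.StreetNumber = 45, .PreDirection = "", .StreetName = "Oak", .StreetType = "Ave"},
            New ServiceAddress With {.StreetNumber = 9, .PreDirection = "S", .StreetName = "Elm", .StreetType = "St"}
        }

        ' Sort by street name first, then numerically by house number.
        ' A plain string sort on "123 N Main St" would compare the digits and
        ' the pre-direction before it ever reached the street name.
        Dim sorted = addresses.OrderBy(Function(a) a.StreetName).ThenBy(Function(a) a.StreetNumber)

        For Each a In sorted
            Console.WriteLine(a.ToString())
        Next
    End Sub
End Module
```

With the parts stored separately, the pre-direction never gets in the way of ordering by street name, which is the whole reason I can't just store the address as one string.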
I've tried to separate this address into its parts by using the Split function and iterating through each part. What I have so far is below:
    Public Shared Function ParseServiceAddress(ByVal Address As String) As String()
        'this assumes a valid address - 101 N Main St South
        'upper bound 4 = five elements: 0=st_num, 1=predir, 2=st_name, 3=st_type, 4=postdir
        Dim strResult(4) As String
        Dim strParts() As String
        Dim strSep() As Char = {" "c}
        Dim i As Integer
        Dim j As Integer = 0 'only used as the out argument for TryParse
        Address = Address.Trim()
        strParts = Address.Split(strSep) 'split using spaces
        For i = 0 To strParts.GetUpperBound(0)
            If Integer.TryParse(strParts(i), j) Then
                'this is a number, is it the house number?
                If i = 0 Then
                    'we know this is the house number
                    strResult(0) = strParts(i)
                Else
                    'part of the street name
                    strResult(2) = strResult(2) & " " & strParts(i)
                End If
            Else
                Select Case strParts(i).ToUpper()
                    Case "TH", "ND"
                        'know this is part of the street name
                        strResult(2) = strResult(2) & strParts(i)
                    Case "NORTH", "SOUTH", "EAST", "WEST", "N", "S", "E", "W"
                        'is this a predirection?
                        If i = 1 Then
                            strResult(1) = strParts(i)
                        ElseIf i = strParts.GetUpperBound(0) Then
                            'this is the post direction
                            strResult(4) = strParts(i)
                        Else
                            'part of the name
                            strResult(2) = strResult(2) & strParts(i)
                        End If
                    Case Else
                        If i = strParts.GetUpperBound(0) Then
                            'street type
                            strResult(3) = strParts(i)
                        Else
                            'part of the street name (append the current part, not strResult(i))
                            strResult(2) = strResult(2) & " " & strParts(i)
                        End If
                End Select
            End If
        Next i
        Return strResult
    End Function
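I've found this method to be cumbersome, slow, and even totally wrong when given an unusual address (for example, one with a post-direction after the street type, where the last-token check picks up the direction and the street type ends up in the street name). I'm wondering if what I'm trying to do here would be a good application for a regular expression? Admittedly, I've never used regex in anything before and am a total newbie in that regard.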
Thank you in advance for any help. :)
Edit - Seems more and more like I'm going to need a parser and not just regex. Does anyone know of any good address parser libraries in .NET? Writing our own is just not in the cards right now, and would be sent to the back burner if it came to that.