I made a simple regular expression engine, which supports concatenation, alternation, closure, and char a .. z.
The way I represent nfa and dfa is to use record:
type state       = int with sexp, compare
type alphabet    = char with sexp, compare
type transaction = state * alphabet option * state with sexp, compare
type d_transaction = state * alphabet * state with sexp, compare
type state_set = State_set.t
type states_set = States_set.t
type nfa = {
  states       : State_set.t ;
  alphabets    : Alphabet_set.t ;
  transactions : Transaction_set.t; 
  start_state  : state;
  final_states : State_set.t;
}
type dfa = {
  d_states       : State_set.t ;
  d_alphabets    : Alphabet_set.t;
  d_transactions : D_Transaction_set.t ;
  d_start_state  : state ;
  d_final_states : State_set.t;
}
For example, a string "a*" will be parsed into Closure (Char 'a'), and then is transformed 
to nfa:
states: 0 1 2 3
 alphabets: a
 transactions: 0->e->1, 1->a>2, 2->e->3, 2->e->1, 0->e->3
 start_state: 0
 final_states: 3
then dfa:
states: 0 1
 alphabets: a
 transactions: 0->a->1, 1->a->1
 start_state: 0
 final_states: 0 1
However, I use a lot of recursion in my code. The way my program generates state number for each node in nfa and dfa is realy unpredictable. I have no idea how to verify if the generated dfa is correct without using a pen and a piece of paper to test it myself
I am trying to figure out a better way to test my code so I can add more features into my program in the future.
Can anyone please give me some suggestions?
 
     
     
     
    