One way to go about this is to update a piece of state as you traverse the list, similar to what you'd do in an imperative language. This requires working with the State monad, which might take some studying and playing around to get it if it's your first time, but trust me, this is well worth learning. Let's start with imports:
import Control.Monad.State
import Data.Set (Set)
import qualified Data.Set as Set
The state we're going to keep is the Set of elements seen up to that point in the list. So first, let's write a pair of simple State actions to manage the set of seen elements:
-- Add an element to the context Set
remember :: Ord a => a -> State (Set a) ()
remember a = modify (Set.insert a)
-- Test if the context set contains an element
haveSeen :: Ord a => a -> State (Set a) Bool
haveSeen a = do seen <- get
return (a `Set.member` seen)
Now we're going to combine these two into an action that checks for duplication:
isDuplicate :: Ord a => a -> State (Set a) Bool
isDuplicate a = do seen <- haveSeen a
remember a
return seen
You've mentioned the takeWhile function. We're going to build our solution along similar lines. This is takeWhile's definition:
-- different name to avoid collision
takeWhile' :: (a -> Bool) -> [a] -> [a]
takeWhile' _ [] = []
takeWhile' p (a:as)
| p a = a : takeWhile p as
| otherwise = []
We can modify this function to work with a predicate that has the Bool wrapped inside a monad:
takeWhileM :: Monad m => (a -> m Bool) -> [a] -> m [a]
takeWhileM _ [] = return []
takeWhileM p (a:as) =
do test <- p a
if test
then do rest <- takeWhileM p as
return (a:rest)
else return []
But the key difference here is that because the test in takeWhileM is monadic, we can use our stateful isDuplicate from above. So each time we test an element of the list with isDuplicate, we will also record that element in the Set that's being threaded through the computation. So now we can write takeUntilDuplicate like this:
takeUntilDuplicate :: Ord a => [a] -> [a]
takeUntilDuplicate as = evalState (takeUntilDuplicate' as) Set.empty
where takeUntilDuplicate' :: Ord a => [a] -> State (Set a) [a]
takeUntilDuplicate' = takeWhileM (fmap not . isDuplicate)
Example use (with an infinite list argument):
>>> takeUntilDuplicate (cycle [1..5])
[1,2,3,4,5]
And what's neat is that several of these pieces of code could be reused for similar problems.