Seems I'm about 14 years late to the party. But here goes.
TLDR TLDR
Friend classes are there so that you can extend encapsulation to the group of classes which comprise your data structure.
TLDR
Your data structure in general consists of multiple classes. Like a traditional class (one supported by your programming language), your data structure is a generalized class: it also has data and invariants on that data, except that they span objects of multiple classes. Encapsulation protects those invariants against accidental modification of the data from the outside, so that the data structure's operations ("member functions") work correctly. Friend classes extend encapsulation from classes to your generalized class.
The too long
A class is a datatype together with invariants which specify a subset of the values of the datatype, called the valid states. An object is a valid state of a class. A member function of a class moves a given object from a valid state to another.
It is essential that object data is not modified from outside of the class member functions, because this could break the class invariants (i.e. move the object to an invalid state). Encapsulation prohibits access to object data from outside of the class. This is an important safety feature of programming languages, because it makes it hard to inadvertently break class invariants.
A class is often a natural choice for implementing a data structure, because the properties (e.g. performance) of a data structure depend on invariants on its data (e.g. red-black tree invariants). However, sometimes a single class is not enough to describe a data structure.
A data structure is any set of data, invariants, and functions which move that data from a valid state to another. This is a generalization of a class. The subtle difference is that the data may be scattered over several datatypes rather than concentrated in a single datatype.
Data structure example
A prototypical example of a data structure is a graph which is stored using separate objects for vertices (class Vertex), edges (class Edge), and the graph (class Graph). These classes do not make sense independently. The Graph class creates Vertex and Edge objects through its member functions (e.g. graph.addVertex() and graph.addEdge(aVertex, bVertex)) and returns pointers (or similar) to them. Vertex and Edge objects are similarly destroyed by their owning Graph (e.g. graph.removeVertex(vertex) and graph.removeEdge(edge)). The collection of Vertex objects, Edge objects and the Graph object together encode a mathematical graph. In this example the intention is that Vertex/Edge objects are not shared between Graph objects (other design choices are also possible).
A Graph object could store a list of all its vertices and edges, while each Vertex could store a pointer to its owning Graph. Hence, the Graph object represents the whole mathematical graph, and you would pass that around whenever the mathematical graph is needed.
Invariant example
An invariant for the graph data structure then would be that a Vertex is listed in its owner Graph's list. This invariant spans both the Vertex object and the Graph object. Multiple objects of multiple types can take part in a given invariant.
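To make the invariant concrete, here is a minimal sketch of it as a checkable predicate. The names OpenVertex, OpenGraph and isListedInOwner are made up for illustration, and everything is deliberately public, so this is not how you would actually write the graph:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct OpenGraph;

// Hypothetical, fully public vertex: it only knows its owning graph.
struct OpenVertex {
    OpenGraph* owner;
};

// Hypothetical, fully public graph: it only lists its vertices.
struct OpenGraph {
    std::vector<OpenVertex*> vertices;
};

// The invariant from the text: a vertex is listed in its owner's vertex list.
// Checking it requires looking at two objects of two different types.
bool isListedInOwner(const OpenVertex& vertex) {
    if (vertex.owner == nullptr) return false;
    const auto& list = vertex.owner->vertices;
    return std::find(list.begin(), list.end(), &vertex) != list.end();
}
```

Since all members are public here, any outside code can call vertices.clear() or reassign owner, after which isListedInOwner() returns false for a vertex that still claims to belong to the graph.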
Encapsulation example
Similarly to a class, a data structure benefits from encapsulation which protects against accidental modification of its data. This is because the data structure needs to preserve its invariants to be able to function as promised, exactly like a class.
In the graph data structure example, you would declare Graph a friend of Vertex (in C++, by writing friend class Graph; inside Vertex), and make the constructors and data members of Vertex private, so that a Vertex can only be created and modified by Graph. In particular, Vertex would have a private constructor which accepts a pointer to its owning graph. This constructor is called in graph.addVertex(), which is possible because Graph is a friend of Vertex. (But note that Vertex is not a friend of Graph: there is no need for Vertex to be able to access Graph's vertex list, say.)
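A rough sketch of how this could look in C++ follows. Only Vertex, Graph, addVertex() and removeVertex() come from the description above; the owner(), contains() and vertexCount() helpers, the use of std::unique_ptr for storage, and the deleted copy operations are my own assumptions, and Edge is left out to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class Graph;

class Vertex {
public:
    // Copying would create a vertex that no graph lists, so it is disabled.
    Vertex(const Vertex&) = delete;
    Vertex& operator=(const Vertex&) = delete;

    // Read-only access does not endanger the invariant (assumed helper).
    const Graph* owner() const { return owner_; }

private:
    // Grants Graph, and only Graph, access to the private members below.
    friend class Graph;

    explicit Vertex(Graph* owner) : owner_(owner) {}

    Graph* owner_;
};

class Graph {
public:
    Graph() = default;
    // Vertices point back at their graph, so the graph must stay in place.
    // Deleting the copy operations also suppresses the implicit moves.
    Graph(const Graph&) = delete;
    Graph& operator=(const Graph&) = delete;

    // Creates a vertex owned by this graph; returns a non-owning pointer.
    Vertex* addVertex() {
        // std::make_unique is not a friend of Vertex, so plain new is used.
        vertices_.push_back(std::unique_ptr<Vertex>(new Vertex(this)));
        return vertices_.back().get();
    }

    // Destroys the vertex if this graph lists it; returns whether it did.
    bool removeVertex(Vertex* vertex) {
        for (auto it = vertices_.begin(); it != vertices_.end(); ++it) {
            if (it->get() == vertex) {
                vertices_.erase(it);
                return true;
            }
        }
        return false;
    }

    // Assumed helper: is this vertex listed in this graph?
    bool contains(const Vertex* vertex) const {
        for (const auto& v : vertices_) {
            if (v.get() == vertex) return true;
        }
        return false;
    }

    std::size_t vertexCount() const { return vertices_.size(); }

private:
    std::vector<std::unique_ptr<Vertex>> vertices_;
};
```

With this in place, user code such as Vertex v(&g); or a->owner_ = nullptr; fails to compile, while Graph g; Vertex* a = g.addVertex(); works. Vertex itself has no access to vertices_, since friendship is granted in one direction only.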
Terminology
The definition of a data structure itself acts like a class. I propose that we start using the term 'generalized class' for any set of data, invariants, and functions which move that data from a valid state to another. A C++ class is then a specific kind of generalized class. It is then self-evident that friend classes are the precise mechanism for extending encapsulation from C++ classes to generalized classes.
(In fact, I'd like the term 'class' to be replaced with the concept of 'generalized class', and use 'native class' for the special case of a class supported by the programming language. Then when teaching classes you would learn of both native classes and these generalized classes. But perhaps that would be confusing.)