I am working on a project whose specification requires a parent-child relationship within the Solr data collection, i.e. a user and the collection of languages they speak (each of which is made up of multiple data fields). My production system runs Solr 4.10, but I have a Solr 5.5 installation at my disposal as well. So far I have not gotten this to work on either one, and I have yet to find a complete documentation source on how to implement it.
The goal is to get a resulting document from Solr that looks like this:
{
    "id": 123,
    "firstName": "John",
    "lastName": "Doe",
    "languagesSpoken": [
        {
            "id": 243,
            "abbreviation": "en",
            "name": "English"
        },
        {
            "id": 442,
            "abbreviation": "fr",
            "name": "French"
        }
    ]
}
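For reference, Solr's XML update format expresses a parent with children as `<doc>` elements nested inside the parent `<doc>`. Below is a minimal sketch of the target document in that form. It assumes a hypothetical `docType` marker field (not in my schema) to tell parents from children, and it gives each child its own `id` because the schema marks `id` as required; those values have to be unique across parents and children. It is an illustration of the shape I am after, not a tested import.

```xml
<add>
  <!-- Parent document: the client -->
  <doc>
    <field name="id">123</field>
    <field name="firstName">John</field>
    <field name="lastName">Doe</field>
    <!-- Hypothetical marker field, set only on parent documents -->
    <field name="docType">client</field>

    <!-- Child documents: one nested <doc> per language spoken -->
    <doc>
      <field name="id">243</field>
      <field name="languagesSpoken_abbreviation">en</field>
      <field name="languagesSpoken_name">English</field>
    </doc>
    <doc>
      <field name="id">442</field>
      <field name="languagesSpoken_abbreviation">fr</field>
      <field name="languagesSpoken_name">French</field>
    </doc>
  </doc>
</add>
```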
In my schema.xml, I have flattened out all of the fields as follows:
<field name="id" type="int" indexed="true" stored="true" required="true" multiValued="false" />
<field name="firstName" type="text_general" indexed="true" stored="true" />
<field name="lastName" type="text_general" indexed="true" stored="true" />
<field name="languagesSpoken" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="languagesSpoken_id" type="int" indexed="true" stored="true" />
<field name="languagesSpoken_abbreviation " type="text_general" indexed="true" stored="true" />
<field name="languagesSpoken_name" type="text_general" indexed="true" stored="true" />
The latest rendition of my db-data-config.xml looks like this:
<dataConfig>
    <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:...." />
        <document name="clients">
            <entity name="client" query="SELECT * FROM clients" deltaImportQuery="SELECT * FROM clients WHERE id = ${dih.delta.id}" deltaQuery="SELECT id FROM clients WHERE updateDate > '${dih.last_index_time}'">
                <field column="id" name="id" />
                <field column="firstName" name="firstName" />
                <field column="lastName" name="lastName" />
                <entity name="languagesSpoken" child="true" query="SELECT id, abbreviation, name FROM languages WHERE clientId = ${client.id}">
                    <field name="languagesSpoken_id" column="id" />
                    <field name="languagesSpoken_abbreviation" column="abbreviation" />
                    <field name="languagesSpoken_name" column="name" />
                </entity>
            </entity>
        </document>
        ...
On the 4.10 server, when the data comes out of Solr, I get one flat document with the fields for a single language inline with firstName and lastName, like this:
{
    "id": 123,
    "firstName": "John",
    "lastName": "Doe",
    "languagesSpoken_id": 243,
    "languagesSpoken_abbreviation ": "en",
    "languagesSpoken_name": "English"
}
On the 5.5 server, when the data comes out, I get separate documents for the root client document and the child language documents, with no relationship between them, like this:
{
    "id": 123,
    "firstName": "John",
    "lastName": "Doe"
},
{
    "languagesSpoken_id": 243,
    "languagesSpoken_abbreviation": "en",
    "languagesSpoken_name": "English"
},
{
    "languagesSpoken_id": 442,
    "languagesSpoken_abbreviation": "fr",
    "languagesSpoken_name": "French"
}
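One thing that may matter for the 5.5 result: as far as I know, child documents are returned as ordinary flat documents unless the query explicitly asks for them to be attached to their parent, which is what the `[child]` document transformer is for. Below is a sketch of that as request-handler defaults in solrconfig.xml. The handler name `/clients` and the `docType` field (set to `client` only on parent documents) are hypothetical and not part of my current configuration, and this is untested against the setup above.

```xml
<!-- Hypothetical handler; the name and the docType field are assumptions -->
<requestHandler name="/clients" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Return all stored fields plus the children of each matching parent.
         parentFilter must match every parent document and no child document. -->
    <str name="fl">*,[child parentFilter=docType:client]</str>
  </lst>
</requestHandler>
```

If this applies, I would expect the children to come back under a `_childDocuments_` key on the parent rather than under `languagesSpoken`.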
I have spent several days trying to figure out what is going on here, to no avail. Can anybody point me to what I am missing?
Thanks, -- Jeff
 
    