
I use Boost to serialize a NeuralNetwork object with code like this:

template <class Archive>
void NeuralNetwork::serialize(Archive& ar, unsigned version)
{
    boost::serialization::void_cast_register<NeuralNetwork, StatisticAnalysis>();
    ar & boost::serialization::base_object<StatisticAnalysis>(*this);
    ar.template register_type<FullyConnected>(); // derived from Layer object
    ar.template register_type<Recurrence>();
    ar.template register_type<Convolution>();
    ar.template register_type<MaxPooling>();
    ar & layers; // vector<unique_ptr<Layer>>
}

My problem is that I have objects that were already serialized, and when I add a new class inherited from Layer, I get the following error: unknown file: error: C++ exception with description "unregistered class" thrown in the test body.

How can I add a new register_type<T> without breaking compatibility with already serialized and saved objects?

asked by Matthieu H (edited by sehe)

1 Answer

when I add a new class inherited from Layer, I have the following error: unknown file: error: C++ exception with description "unregistered class" thrown in the test body.

I'd argue that this error is due to something else.

Reference Point: "Automatic" type registration

The typical pattern is NOT to use register_type. Instead you'd use the automatic registration mechanism: https://www.boost.org/doc/libs/1_32_0/libs/serialization/doc/special.html#registration

#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/serialization/base_object.hpp>
#include <boost/serialization/export.hpp>
#include <boost/serialization/unique_ptr.hpp>
#include <boost/serialization/access.hpp>
#include <boost/serialization/vector.hpp>
#include <sstream>
#include <iostream>
#include <iomanip>
#include <boost/core/demangle.hpp>
using boost::serialization::base_object;
using boost::core::demangle;

struct StatisticAnalysis {
    virtual ~StatisticAnalysis() = default;
    virtual void report(std::ostream&) const = 0;
    std::vector<int> base_data {1,2,3};
    void serialize(auto& ar, unsigned) { ar & base_data; }

    friend std::ostream& operator<<(std::ostream& os, StatisticAnalysis const& sa) {
        sa.report(os);
        return os;
    }
};

BOOST_SERIALIZATION_ASSUME_ABSTRACT(StatisticAnalysis)
BOOST_CLASS_EXPORT(StatisticAnalysis)

struct Layer {
    virtual ~Layer() = default;
    void serialize(auto&, unsigned) { }
};

BOOST_SERIALIZATION_ASSUME_ABSTRACT(Layer)
BOOST_CLASS_EXPORT(Layer)

struct FullyConnected : Layer { void serialize(auto &ar, unsigned) { ar &base_object<Layer>(*this); } };
struct Recurrence     : Layer { void serialize(auto &ar, unsigned) { ar &base_object<Layer>(*this); } };
struct Convolution    : Layer { void serialize(auto &ar, unsigned) { ar &base_object<Layer>(*this); } };
struct MaxPooling     : Layer { void serialize(auto &ar, unsigned) { ar &base_object<Layer>(*this); } };

BOOST_CLASS_EXPORT(FullyConnected)
BOOST_CLASS_EXPORT(Recurrence)
BOOST_CLASS_EXPORT(Convolution)
BOOST_CLASS_EXPORT(MaxPooling)

#if defined(VERSION2)
struct NewLayer : Layer {
    void serialize(auto &ar, unsigned) { ar &base_object<Layer>(*this); }
};
BOOST_CLASS_EXPORT(NewLayer)
#endif

struct NeuralNetwork : StatisticAnalysis {
    virtual void report(std::ostream& os) const override {
        os << layers.size() << " layers: {";
        for (auto& layer : layers) {
            os << " " << demangle(typeid(*layer).name());
        }
        os << " }\n";
    }

    std::vector<std::unique_ptr<Layer> > layers;

    void serialize(auto& ar, unsigned) {
        ar &base_object<StatisticAnalysis>(*this);
        ar &layers;
    }
};

BOOST_CLASS_EXPORT(NeuralNetwork)

int main()
{
    std::unique_ptr<StatisticAnalysis> analysis;
    std::stringstream ss;
    {
        boost::archive::text_oarchive oa(ss);
        analysis = [] {
            auto nn = std::make_unique<NeuralNetwork>();
            nn->layers.emplace_back(std::make_unique<FullyConnected>());
            nn->layers.emplace_back( std::make_unique<Recurrence>());
            nn->layers.emplace_back(std::make_unique<Convolution>());
            nn->layers.emplace_back(std::make_unique<FullyConnected>());
            nn->layers.emplace_back(std::make_unique<FullyConnected>());
            nn->layers.emplace_back(std::make_unique<MaxPooling>());
            return nn;
        }();
        oa << analysis;
    }

    std::cout << "Data: " << std::quoted(ss.str()) << "\n";

    {
        boost::archive::text_iarchive ia(ss);

        analysis.reset();
        ia >> analysis;
        
        std::cerr << *analysis << "\n";
    }

}

Both versions have identical archives:

Data: "22 serialization::archive 17 0 0 1 13 NeuralNetwork 1 0
0 0 0 3 0 1 2 3 0 0 6 0 0 0 7 14 FullyConnected 1 0
1 1 0
2 8 10 Recurrence 1 0
3
4 9 11 Convolution 1 0
5
6 7
7
8 7
9
10 10 10 MaxPooling 1 0
11
12
"
6 layers: { FullyConnected Recurrence Convolution FullyConnected FullyConnected MaxPooling }

Compare With register_type

Just to make sure that register_type doesn't itself create a compatibility problem - as the docs could indeed be read to imply:

Note that if the serialization function is split between save and load, both functions must include the registration. This is required to keep the save and corresponding load in synchronization.

Note: after seeing that the output was identical for text archives, as expected, I also modified the program to write to a binary archive, just in case there are implementation differences at play there.

Live Demo (both versions v1 and v2 at once):

//#define VERSION2
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/serialization/base_object.hpp>
#include <boost/serialization/export.hpp>
#include <boost/serialization/unique_ptr.hpp>
#include <boost/serialization/access.hpp>
#include <boost/serialization/vector.hpp>
#include <fstream>
#include <iostream>
#include <iomanip>
#include <boost/core/demangle.hpp>
using boost::serialization::base_object;
using boost::core::demangle;

struct StatisticAnalysis {
    virtual ~StatisticAnalysis() = default;
    virtual void report(std::ostream&) const = 0;
    std::vector<int> base_data {1,2,3};
    void serialize(auto& ar, unsigned) { ar & base_data; }

    friend std::ostream& operator<<(std::ostream& os, StatisticAnalysis const& sa) {
        sa.report(os);
        return os;
    }
};

BOOST_SERIALIZATION_ASSUME_ABSTRACT(StatisticAnalysis)
BOOST_CLASS_EXPORT(StatisticAnalysis)

struct Layer {
    virtual ~Layer() = default;
    void serialize(auto&, unsigned) { }
};

BOOST_SERIALIZATION_ASSUME_ABSTRACT(Layer)
BOOST_CLASS_EXPORT(Layer)

struct FullyConnected : Layer { void serialize(auto &ar, unsigned) { ar &base_object<Layer>(*this); } };
struct Recurrence     : Layer { void serialize(auto &ar, unsigned) { ar &base_object<Layer>(*this); } };
struct Convolution    : Layer { void serialize(auto &ar, unsigned) { ar &base_object<Layer>(*this); } };
struct MaxPooling     : Layer { void serialize(auto &ar, unsigned) { ar &base_object<Layer>(*this); } };

//BOOST_CLASS_EXPORT(FullyConnected)
//BOOST_CLASS_EXPORT(Recurrence)
//BOOST_CLASS_EXPORT(Convolution)
//BOOST_CLASS_EXPORT(MaxPooling)

#if defined(VERSION2)
struct NewLayer : Layer {
    void serialize(auto &ar, unsigned) { ar &base_object<Layer>(*this); }
};
//BOOST_CLASS_EXPORT(NewLayer)
#endif

struct NeuralNetwork : StatisticAnalysis {
    virtual void report(std::ostream& os) const override {
        os << layers.size() << " layers: {";
        for (auto& layer : layers) {
            os << " " << demangle(typeid(*layer).name());
        }
        os << " }\n";
    }

    std::vector<std::unique_ptr<Layer> > layers;

    void serialize(auto& ar, unsigned) {
        ar &base_object<StatisticAnalysis>(*this);
        ar.template register_type<FullyConnected>(); // derived from Layer object
        ar.template register_type<Recurrence>();
        ar.template register_type<Convolution>();
        ar.template register_type<MaxPooling>();
#if defined(VERSION2)
        ar.template register_type<NewLayer>();
#endif

        ar &layers;
    }
};

BOOST_CLASS_EXPORT(NeuralNetwork)

int main(int, char **argv) {
    std::string program_name(*argv);

    std::unique_ptr<StatisticAnalysis> analysis;
    {
        std::ofstream ofs(program_name + ".bin", std::ios::binary);
        boost::archive::binary_oarchive oa(ofs);
        analysis = [] {
            auto nn = std::make_unique<NeuralNetwork>();
            nn->layers.emplace_back(std::make_unique<FullyConnected>());
            nn->layers.emplace_back( std::make_unique<Recurrence>());
            nn->layers.emplace_back(std::make_unique<Convolution>());
            nn->layers.emplace_back(std::make_unique<FullyConnected>());
            nn->layers.emplace_back(std::make_unique<FullyConnected>());
            nn->layers.emplace_back(std::make_unique<MaxPooling>());
            return nn;
        }();
        oa << analysis;
    }

    {
        std::ifstream ifs(program_name + ".bin", std::ios::binary);
        boost::archive::binary_iarchive ia(ifs);

        analysis.reset();
        ia >> analysis;
        
        std::cerr << *analysis << "\n";
    }
}

The test commands:

g++ -std=c++20 -Os -DVERSION1 -lboost_serialization main.cpp -o v1
g++ -std=c++20 -Os -DVERSION2 -lboost_serialization main.cpp -o v2
./v1 && ./v2 && md5sum v1.bin v2.bin

Both complete successfully, writing identical archives v1.bin and v2.bin, as evidenced by their md5sums:

5bba3ef7d8a25bd50d0768fed5dfed64  v1.bin
5bba3ef7d8a25bd50d0768fed5dfed64  v2.bin

Summary - Where To Go From Here

I think that adding subclasses should, in principle, not break archive compatibility. If it appears it did, something else is likely going on.

I'm here should you uncover more information. If the question becomes different enough, consider opening a new question.

answered by sehe
  • I felt like my code doesn't work without `register_type` but that was a while ago so I'll try to remove them and keep you posted. – Matthieu H Feb 16 '21 at 18:01
  • In the version with `register_type`, if the line `ar.template register_type ();` is placed before the others, the 2 checksums become different, so my problem must come from there. I will look into upgrading to the first version even though it doesn't work at the moment. – Matthieu H Feb 16 '21 at 19:51
  • 1
    Yeah, mixing up the order is not ok - the doc quote confirms that part. So, basically I think we can conclude that you appending to the end of the list is no problem? – sehe Feb 16 '21 at 19:54
  • I still have the problem even putting the new register_type at the end (because I have 2 lists of register_type). And the version without the register_type doesn't work because my `BOOST_CLASS_EXPORT` is in the *.cpp – Matthieu H Feb 16 '21 at 20:41
  • @MatthieuH that's a common problem and solvable by separating keys/implementations: https://stackoverflow.com/a/58000691/85371 – sehe Feb 16 '21 at 21:46
  • Also, if you have two lists of register_type, try consolidating them so you won't run into that ordering issue again. It's kinda sad that you didn't have the auto-registration in place from the start, because now you might be building the same "global type registry" partly over that's already built into the library. Do you have a link to a repo to share so I can poke at the real code? – sehe Feb 16 '21 at 21:55
  • 1
    (I'm particularly curious whether we can make a switch-over from the old-style to auto-registration while retaining legacy compatibility. It may not be possible though, or only using class versioning) – sehe Feb 16 '21 at 21:59
  • The auto-registration doesn't work in my case, maybe because I use template classes or my code is too complex, I don't know... You can find my project on [GitHub](https://github.com/MatthieuHernandez/StraightforwardNeuralNetwork) if you wanna try something or look at it in more detail. – Matthieu H Feb 17 '21 at 17:50
  • Auto registration is just macros, so you can see what it does and replicate for a template class (I think I've done that before or it simply exists in a corner of the docs). I found your repo last night, I'll give it a look later. – sehe Feb 17 '21 at 17:53