Discussion on author's idea #1
The most-promising idea I had was to (1) find a way to enumerate all of the .o files upon which foo-use depends, and then (2) iterate over each of those .o files, calling add_dependencies for each one.
This shouldn't work according to the documentation for add_dependencies, which states:
Makes a top-level <target> depend on other top-level targets to ensure that they build before <target> does.
I.e., you can't use it to make a target depend on files, only on other targets.
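To illustrate the limitation, here is a minimal sketch. The source file names, the prof.data path, and the custom target name prof-data are all hypothetical. As far as I understand it, wrapping the file in a custom target gives add_dependencies something it can name, but the result is only a build-order constraint between targets and not a per-object dependency.

```cmake
# Hypothetical sources and paths, for illustration only.
add_executable(foo-use a.cpp b.cpp)
set(PROF_DATA "${CMAKE_CURRENT_BINARY_DIR}/prof.data")

# Not what add_dependencies is documented to accept: prof.data is a file, not a target.
# add_dependencies(foo-use "${PROF_DATA}")

# A custom target can stand in for the file. This assumes some
# add_custom_command(OUTPUT "${PROF_DATA}" ...) elsewhere in this directory
# knows how to produce it.
add_custom_target(prof-data DEPENDS "${PROF_DATA}")
add_dependencies(foo-use prof-data)

# The above only orders the two targets. It does not, by itself, mark
# foo-use's .o files as out of date when prof.data changes.
```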
Discussion on author's idea #2
I also considered using set_source_files_properties to set the OBJECT_DEPENDS property on each of my .cpp files used by foo-use, adding prof.data to that property's list.
The problem with this (AFAICT) is that each of my .cpp files is used to create two different .o files: one for foo-gen and one for foo-use. I want the .o files that get linked into foo-use to have this compile-time dependency on prof.data; but the .o files that get linked into foo-gen must not have a compile-time dependency on prof.data.
And AFAIK, set_source_files_properties doesn't let me set the OBJECT_DEPENDS property to have one of two values, contingent on whether foo-gen or foo-use is the current target of interest.
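A minimal sketch of why this is a problem, assuming both executables are built from the same (hypothetical) source list in the same directory:

```cmake
# Hypothetical sources and paths, for illustration only.
set(FOO_SOURCES a.cpp b.cpp)
add_executable(foo-gen ${FOO_SOURCES})
add_executable(foo-use ${FOO_SOURCES})

# OBJECT_DEPENDS is a property of the source file, not of a
# (source file, target) pair, so it applies to the objects of every
# target in this directory that compiles these sources.
set_source_files_properties(${FOO_SOURCES} PROPERTIES
  OBJECT_DEPENDS "${CMAKE_CURRENT_BINARY_DIR}/prof.data")

# Result: foo-gen's objects now also depend on prof.data, which is the
# very file that running foo-gen is supposed to produce.
```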
In the comments section, you mentioned that you could solve this if OBJECT_DEPENDS supported generator expressions, but it doesn't. As a side note, there is an issue ticket tracking this on the CMake GitLab repo. You can go give it a thumbs up and describe your use-case for their reference.
In the comments section you also mentioned a possible solution to this:
Potential other solution a) double project system where main user invoked project forwards settings to second pgo project compiling same settings again.
You can actually put this into the CMake project via ExternalProject so that it becomes part of the generated buildsystem: Make the top-level project include itself as an external project. The external project can be passed a cache variable to configure it to be the -gen version, and the top-level can be the -use version.
Speaking from experience, this is a whole other rabbit hole of long CMake-documentation-reading and finicking sessions if you have never manually invoked or done anything with ExternalProject before, so that answer might belong with a new question dedicated to it.
This can solve the problem of not having generator expressions in OBJECT_DEPENDS, but if you want the top-level project to be multi-config, and you want some of its configurations not to use PGO, then you will be back to square one.
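A rough sketch of the self-including ExternalProject idea follows. FOO_PGO_MODE, foo_pgo_gen, and the pgo-gen directory are hypothetical names, and -fprofile-generate is the GCC/Clang-style instrumentation flag, so other toolchains would need something else. Treat it as a starting point under those assumptions, not a tested setup.

```cmake
include(ExternalProject)

# Hypothetical cache variable: "use" for the top-level build,
# "gen" for the nested instrumented build.
set(FOO_PGO_MODE "use" CACHE STRING "PGO phase: gen or use")

if(FOO_PGO_MODE STREQUAL "use")
  # The top-level project adds itself as an external project, configured
  # as the -gen variant. The guard keeps the nested build from recursing.
  ExternalProject_Add(foo_pgo_gen
    SOURCE_DIR "${CMAKE_SOURCE_DIR}"
    BINARY_DIR "${CMAKE_BINARY_DIR}/pgo-gen"
    CMAKE_CACHE_ARGS "-DFOO_PGO_MODE:STRING=gen"
    INSTALL_COMMAND ""   # nothing to install; only the build is needed
  )
elseif(FOO_PGO_MODE STREQUAL "gen")
  # Assumption: GCC/Clang-style flags for the instrumented build.
  add_compile_options(-fprofile-generate)
  add_link_options(-fprofile-generate)
endif()
```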
Proposed Solution
Here's what I've found works to make sources re-compile when profile data changes:
- To the custom command which runs the training executable and produces and re-formats the training data, add another COMMAND which produces a C++ header file containing a timestamp in a comment.
- Include that header in all sources which you want to re-compile if the training has been re-run.
If you want to support non-PGO builds, wrap the timestamp header in a header which checks that it exists with __has_include and only includes it if it exists.
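The steps above could look roughly like the following sketch. All names here (foo-gen, foo-use, pgo-train, pgo_timestamp.h, pgo_timestamp_optional.h, write_pgo_stamp.cmake) are hypothetical, and the actual training and profile-reformatting commands are project-specific, so they appear only as a placeholder comment.

```cmake
# Hypothetical targets; in a real project these already exist.
add_executable(foo-gen a.cpp b.cpp)
add_executable(foo-use a.cpp b.cpp)

set(PGO_STAMP_DIR "${CMAKE_CURRENT_BINARY_DIR}/pgo_stamp")
set(PGO_STAMP_HEADER "${PGO_STAMP_DIR}/pgo_timestamp.h")
set(PGO_STAMP_SCRIPT "${CMAKE_CURRENT_BINARY_DIR}/write_pgo_stamp.cmake")

# Small script run at build time. It writes a header whose only content
# is a comment holding the current time, so the file contents change on
# every training run.
file(WRITE "${PGO_STAMP_SCRIPT}" [[
string(TIMESTAMP now UTC)
file(WRITE "${OUT}" "// profile data regenerated: ${now}\n")
]])

add_custom_command(
  OUTPUT "${PGO_STAMP_HEADER}"
  COMMAND foo-gen   # run the training executable
  # ...project-specific COMMANDs that re-format the training data go here...
  COMMAND "${CMAKE_COMMAND}" "-DOUT=${PGO_STAMP_HEADER}" -P "${PGO_STAMP_SCRIPT}"
  DEPENDS foo-gen
  VERBATIM)
add_custom_target(pgo-train DEPENDS "${PGO_STAMP_HEADER}")

# Wrapper header that sources include. It pulls in the timestamp header
# only if it exists, so non-PGO builds still compile.
# file(GENERATE) rewrites the file only when its content changes.
file(GENERATE OUTPUT "${PGO_STAMP_DIR}/pgo_timestamp_optional.h" CONTENT [[
#pragma once
#if defined(__has_include)
#  if __has_include("pgo_timestamp.h")
#    include "pgo_timestamp.h"
#  endif
#endif
]])
target_include_directories(foo-use PRIVATE "${PGO_STAMP_DIR}")
```

Each source that should be re-compiled after training would then contain `#include "pgo_timestamp_optional.h"`.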
I'm pretty sure with this approach, CMake doesn't do the dependency checking of TUs on the profile data, and instead, it's the generated buildsystem's header-dependency tracking which does that work. The rationale for including a timestamp comment in the header file instead of just "touch"ing it to change the timestamp in the filesystem is that the generated buildsystem might detect changes by file contents instead of by the filesystem timestamp.
All the shortcomings of the proposed solution
The painfully obvious weakness of this approach is that you need to add a header include to all the .cpp files that you want to check for re-compilation. There are several problems that can spawn from this (from least to most egregious):
1. You might not like it from an aesthetics standpoint.
2. It certainly opens up a hole for human error in forgetting to include the header in new .cpp files. One way to reduce that risk: some compilers have a flag that force-includes a file in every source file, such as GCC's -include flag and MSVC's /FI flag. You can add this flag to a CMake target using target_compile_options(<target> PRIVATE "SHELL:-include <path>")
3. You might not be able to change some of the sources that you need to re-compile, such as sources from third-party static libraries that your library depends on. There may be workarounds if you're using ExternalProject by doing something with the patch step, but I don't know.
For my personal project, #1 and #2 are acceptable, and #3 happens to not be an issue. You can take a look at how I'm doing things there if you're interested.
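As a sketch of the forced-include workaround mentioned in the list above, assuming a target named foo-use and a hypothetical header path:

```cmake
# Hypothetical target and header path, for illustration only.
add_executable(foo-use a.cpp b.cpp)
set(PGO_FORCED_HEADER "${CMAKE_CURRENT_BINARY_DIR}/pgo_stamp/pgo_timestamp_optional.h")

if(MSVC)
  # MSVC-style forced include.
  target_compile_options(foo-use PRIVATE "/FI${PGO_FORCED_HEADER}")
else()
  # GCC/Clang-style forced include. The SHELL: prefix keeps "-include"
  # and its path together as one group that is not de-duplicated.
  # A path containing spaces would need extra quoting.
  target_compile_options(foo-use PRIVATE "SHELL:-include ${PGO_FORCED_HEADER}")
endif()
```

I have not checked how every generator tracks a force-included header as a dependency, so it is worth confirming that editing the header really triggers a re-compile in your setup.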
Toward a standard PGO CMake module
See https://gitlab.kitware.com/cmake/cmake/-/issues/19273