I'm writing a serializable class that takes several arguments, including a Function:
public class Cls implements Serializable {
    private final Collection<String> _coll;
    private final Function<String, ?> _func;

    public Cls(Collection<String> coll, Function<String, ?> func) {
        _coll = coll;
        _func = func;
    }
}
func is stored in a member variable, so it needs to be serializable. A Java lambda is serializable if its target type (the type it is being assigned to) is serializable. What's the best way to ensure that the Function passed to my constructor is serializable when it is created using a lambda?
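To illustrate the target-type rule, here is a minimal sketch that attempts to serialize the same lambda twice: once with a plain Function target type and once with an intersection cast that adds Serializable. The class and method names (LambdaSerializationDemo, canSerialize, and so on) are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Function;

class LambdaSerializationDemo {

    // Returns true if Java serialization accepts the object, false if it rejects it.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Target type is plain Function, so the lambda is not serializable.
    static Function<String, Integer> plainLambda() {
        return s -> s.length();
    }

    // Target type is an intersection that includes Serializable, so the lambda is serializable.
    static Function<String, Integer> serializableLambda() {
        return (Function<String, Integer> & Serializable) s -> s.length();
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(plainLambda()));        // prints false
        System.out.println(canSerialize(serializableLambda())); // prints true
    }
}
```

The lambda body is identical in both cases; only the target type differs, which is why the declared parameter type of the constructor matters so much here.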
1. Create a SerializableFunction type and use that:
public interface SerializableFunction<F, R> extends Function<F, R>, Serializable {}
....
public Cls(Collection<String> coll, SerializableFunction<String, ?> func) {...}
Issues:
- There's now a mismatch between the coll and func arguments: func is declared as serializable in the signature but coll is not, yet both must be serializable for the class to work.
- It doesn't allow other implementations of Function that are serializable.
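A rough sketch of how this first option might look end to end, assuming an in-memory round trip as a stand-in for sending the object to another JVM. The applyAll helper and the Option1Demo class are hypothetical additions for the demo only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

// A functional interface that is also Serializable, so lambdas targeting it are serializable.
interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

class Cls implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Collection<String> _coll;
    private final SerializableFunction<String, ?> _func;

    Cls(Collection<String> coll, SerializableFunction<String, ?> func) {
        _coll = coll;
        _func = func;
    }

    // Hypothetical helper for the demo: applies the function to every element.
    List<Object> applyAll() {
        List<Object> results = new ArrayList<>();
        for (String s : _coll) {
            results.add(_func.apply(s));
        }
        return results;
    }
}

class Option1Demo {
    // Serializes and deserializes in memory, standing in for a transfer to another JVM.
    static Cls roundTrip(Cls original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Cls) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static List<Object> run() {
        // No cast is needed: the lambda's target type is SerializableFunction.
        Cls original = new Cls(Arrays.asList("a", "bb", "ccc"), s -> s.length());
        return roundTrip(original).applyAll();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [1, 2, 3]
    }
}
```

The call site stays clean, at the cost of callers having to adapt any existing serializable Function implementation to the new interface.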
2. Use a type parameter on the constructor:
public <F extends Function<String, ?> & Serializable> Cls(Collection<String> coll, F func) {...}
Issues:
- More flexible than 1, but more confusing.
- There's still a mismatch between the two arguments: the func argument is required to implement Serializable in the compile-time type hierarchy, but coll is just required to be serializable somehow (although this requirement can be cast away if required).
EDIT: This code doesn't actually compile when called with a lambda or method reference.
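For comparison, here is a sketch of the generic-constructor signature being called with a named class rather than a lambda, which is the case the bound is designed for. LengthFunction, applyAll and Option2Demo are hypothetical names introduced only for this example.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

// Hypothetical named implementation that is Serializable through its class hierarchy.
class LengthFunction implements Function<String, Integer>, Serializable {
    private static final long serialVersionUID = 1L;

    @Override
    public Integer apply(String s) {
        return s.length();
    }
}

class Cls implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Collection<String> _coll;
    private final Function<String, ?> _func;

    // F must be both a Function<String, ?> and Serializable at compile time.
    <F extends Function<String, ?> & Serializable> Cls(Collection<String> coll, F func) {
        _coll = coll;
        _func = func;
    }

    // Hypothetical helper for the demo: applies the function to every element.
    List<Object> applyAll() {
        List<Object> results = new ArrayList<>();
        for (String s : _coll) {
            results.add(_func.apply(s));
        }
        return results;
    }
}

class Option2Demo {
    static List<Object> run() {
        // F is inferred as LengthFunction, which satisfies both bounds.
        Cls c = new Cls(Arrays.asList("a", "bb"), new LengthFunction());
        return c.applyAll();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [1, 2]
    }
}
```

This only demonstrates the named-class case; it does not address the lambda and method-reference calls that the EDIT above reports as failing to compile.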
3. Leave it up to the caller
This requires the caller to know (from the javadocs, or trial-and-error) that the argument needs to be serializable, and to cast as appropriate:
Cls c = new Cls(strList, (Function<String, ?> & Serializable) s -> ...);
or
Cls c = new Cls(strList, (Function<String, ?> & Serializable) Foo::processStr);
This is ugly IMO, and the initial naive implementation of using a lambda is guaranteed to break, rather than likely to work as with coll (as most collections are serializable somehow). This also pushes an implementation detail of the class onto the caller.
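The difference between the naive call and the cast calls can be sketched as follows. This is only an illustration: Foo.processStr is given a made-up body, Option3Demo and serializes are hypothetical names, and the casts use Function<String, Object> rather than the wildcard form as a conservative choice.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the Foo class mentioned above.
class Foo {
    static Integer processStr(String s) {
        return s.length();
    }
}

class Cls implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Collection<String> _coll;
    private final Function<String, ?> _func;

    // Plain Function parameter: nothing in the signature demands serializability.
    Cls(Collection<String> coll, Function<String, ?> func) {
        _coll = coll;
        _func = func;
    }
}

class Option3Demo {
    static final List<String> STR_LIST = Arrays.asList("a", "bb");

    // Returns true if the whole Cls instance, including its function, serializes.
    static boolean serializes(Cls c) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(c);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Naive call: compiles, but the lambda is not serializable.
    static Cls naive() {
        return new Cls(STR_LIST, s -> s.length());
    }

    // Caller adds the intersection cast to a lambda.
    static Cls castLambda() {
        return new Cls(STR_LIST, (Function<String, Object> & Serializable) s -> s.length());
    }

    // Caller adds the intersection cast to a method reference.
    static Cls castMethodRef() {
        return new Cls(STR_LIST, (Function<String, Object> & Serializable) Foo::processStr);
    }

    public static void main(String[] args) {
        System.out.println(serializes(naive()));         // prints false
        System.out.println(serializes(castLambda()));    // prints true
        System.out.println(serializes(castMethodRef())); // prints true
    }
}
```

Note that the naive call fails only when the object is actually serialized, not at compile time or at construction, which is what makes this option easy to get wrong.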
At the moment I'm leaning towards option 2, as the one that imposes the least burden on the caller, but I don't think there's an ideal solution here. Any other suggestions for how to do this properly?
EDIT: Perhaps some background is required. This is a class that runs inside Storm, in a bolt, which is serialized to transfer it to a remote cluster for execution. The function performs an operation on the processed tuples when run on the cluster. So it is very much part of the class's purpose that it is serializable and that the function argument is serializable. If it is not, then the class is not usable at all.