After reading many links about building Caffe layers in Python, I still have difficulty understanding a few concepts. Can someone please clarify them?
- The Python structure of blobs and weights for a network is explained here: Finding gradient of a Caffe conv-filter with regards to input.
- Network and Solver structure is explained here: Cheat sheet for caffe / pycaffe?.
- An example of defining a Python layer is here: pyloss.py on git.
- Layer tests are here: test layer on git.
- Development of new layers in C++ is described here: git wiki.
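For context, this is roughly how I declare the layer in my prototxt (a sketch; the `module`/`layer` values match the pyloss.py example above, and the blob names `pred`/`label` are my own):

```
layer {
  name: "loss"
  type: "Python"
  bottom: "pred"    # predictions
  bottom: "label"   # ground truth
  top: "loss"
  python_param {
    module: "pyloss"              # the .py file, found on PYTHONPATH
    layer: "EuclideanLossLayer"   # the class inside that module
  }
  loss_weight: 1
}
```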
What I am still missing is:
- `setup()` method: what should I do here? Why, in the example, should I compare the length of the `bottom` param with 2? Why should it be 2? It doesn't seem to be the batch size, because that is arbitrary? And `bottom`, as I understand it, is a blob, and then the first dimension is the batch size?
- `reshape()` method: as I understand it, the `bottom` input param is the blob of the layer below, and the `top` param is the blob of the layer above, and I need to reshape the top layer according to the output shape of my forward-pass calculations. But why do I need to do this on every forward pass if these shapes do not change from pass to pass, and only the weights change?
- The `reshape` and `forward` methods use index 0 on the `top` input param. Why would I need to use `top[0].data = ...` or `top[0].input = ...` instead of `top.data = ...` and `top.input = ...`? What is this index about? If we do not use the other parts of this `top` list, why is it exposed this way? I suspect it is a coincidence of the C++ backbone, but it would be good to know exactly.
- `reshape()` method, the line with `if bottom[0].count != bottom[1].count`: what do I do here? Why is its dimension 2 again? And what am I counting here? Why should both parts of the blob (0 and 1) be equal in the number of some members (`count`)?
- `forward()` method: what do I define with this line: `self.diff[...] = bottom[0].data - bottom[1].data`? Where is it used after the forward pass, if I define it? Can we just use `diff = bottom[0].data - bottom[1].data` instead, to compute the loss later in this method without assigning to `self`, or is it done on purpose?
- `backward()` method: what is `for i in range(2):` about? Why is the range 2 again?
- `backward()` method, the `propagate_down` parameter: why is it defined? I mean, if it is True, the gradient should be assigned to `bottom[X].diff` as I see, but why would someone call a method that does nothing with `propagate_down = False`, if it just does nothing and still cycles inside?
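To show what I think the `forward()`/`backward()` lines compute, here is my pure-NumPy sketch of the Euclidean-loss math from pyloss.py, written outside Caffe (the function names and `propagate_down` default are my own, not Caffe API):

```python
import numpy as np

def forward(pred, label):
    """Mimics: self.diff[...] = bottom[0].data - bottom[1].data
    and the loss top[0].data computation."""
    diff = pred - label
    # sum of squared differences, normalized by batch size, halved
    loss = np.sum(diff ** 2) / pred.shape[0] / 2.0
    return loss, diff  # diff is cached so backward() can reuse it

def backward(diff, batch_size, propagate_down=(True, True)):
    """Mimics the loop over the two bottom blobs in backward()."""
    grads = [None, None]
    for i in range(2):  # one iteration per bottom blob: pred, label
        if not propagate_down[i]:
            continue  # skip blobs that don't need a gradient
        # d/dx (x-y)^2 = 2(x-y), but d/dy (x-y)^2 = -2(x-y),
        # hence the sign flip for the second bottom
        sign = 1 if i == 0 else -1
        grads[i] = sign * diff / batch_size
    return grads
```

For example, with `pred = [[1,2],[3,4]]` and `label = [[1,1],[2,2]]`, `forward` gives a loss of `6 / 2 / 2 = 1.5`, and `backward` returns `diff / 2` for the first bottom and `-diff / 2` for the second.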
I'm sorry if these questions are too obvious; I just wasn't able to find a good guide to understand them, so I'm asking for help here.