When I have multiple outputs o1, o2, o3, each with its own loss l1, l2, l3, how are the losses backpropagated?

Are all the losses added together and backpropagated from the last layer? That is, total_loss = l1 + l2 + l3, and total_loss is backpropagated starting at the last layer.

Or is each loss applied only to the layer that produced its output and to the layers preceding it? For example, l1 is applied to o1 and the preceding layers, l2 is applied to o2 and the preceding layers (and thus also covers the layers behind o1), and so on for l3.

Can someone explain how this works? Also, please provide a code example in TensorFlow if possible.
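To make the two alternatives concrete, here is a minimal sketch of the setup, assuming three scalar "layers" chained one after another, each producing one output. The weights `w1`, `w2`, `w3`, the input value, the targets, and the squared-error losses are illustrative assumptions, not taken from any particular model. It uses `tf.GradientTape` to compare the gradient of the summed loss with the gradients of each loss taken separately.

```python
import tensorflow as tf

# Hypothetical toy network: three chained scalar "layers", one output each.
w1 = tf.Variable(2.0)
w2 = tf.Variable(3.0)
w3 = tf.Variable(4.0)
x = tf.constant(1.0)

# persistent=True lets us call tape.gradient more than once.
with tf.GradientTape(persistent=True) as tape:
    o1 = w1 * x      # output of layer 1
    o2 = w2 * o1     # output of layer 2 (depends on layer 1)
    o3 = w3 * o2     # output of layer 3 (depends on layers 1 and 2)
    l1 = tf.square(o1 - 1.0)
    l2 = tf.square(o2 - 1.0)
    l3 = tf.square(o3 - 1.0)
    total_loss = l1 + l2 + l3

variables = [w1, w2, w3]
grads_total = tape.gradient(total_loss, variables)
grads_l1 = tape.gradient(l1, variables)  # None for w2 and w3
grads_l2 = tape.gradient(l2, variables)  # None for w3
grads_l3 = tape.gradient(l3, variables)
del tape


def as_float(g):
    # tape.gradient returns None when the loss does not depend on the variable.
    return 0.0 if g is None else float(g.numpy())


total = [as_float(g) for g in grads_total]
summed = [
    sum(as_float(g[i]) for g in (grads_l1, grads_l2, grads_l3))
    for i in range(3)
]

print("gradient of total_loss:", total)
print("sum of per-loss grads: ", summed)
print("l1 reaches w2, w3?", grads_l1[1] is not None, grads_l1[2] is not None)
```

In this sketch the two descriptions turn out to be the same thing: differentiating `total_loss` gives, for each weight, the sum of the gradients of the individual losses, and a loss contributes nothing to weights that come after the output it is attached to (those gradients are `None`). So `l1` only updates `w1`, `l2` updates `w1` and `w2`, and `l3` updates all three.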