
How to find which code directly accesses saved tensors after they have already been freed


I'm training a three-layer model, and I get an error when calling loss.backward() to compute the gradient during training: **RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed).** However, I am not calling backward() twice in the same training iteration. My training process is as follows:

    for _ in tqdm(range(inner_iters)):
        idx_batch_eval = torch.randint(low=0, high=idx_range, size=(batch_size,),
                                       dtype=torch.int32)
        train_states = target_set[idx_batch_eval]
        train_level = c
        train_roa_labels = target_labels[idx_batch_eval]  # determine their labels
        class_weights, class_counts = balanced_class_weights(train_roa_labels.astype(bool),
                                                             scale_by_total=True)

        def loss(value, nn_train_value):
            class_labels = 2 * train_roa_labels - 1
            decision_distance = train_level - value
            class_labels_torch = torch.tensor(class_labels, dtype=torch.float64)
            class_weights_torch = torch.tensor(class_weights, dtype=torch.float64)
            # create a 300-row, 1-column all-zero matrix with dtype float64
            zero_matrix = torch.zeros((300, 1), dtype=torch.float64)
            classifier_loss = class_weights_torch * torch.maximum(-class_labels_torch *
                                                                  decision_distance, zero_matrix)
            tf_dv_nn_train = nn_train_value - value
            stop_gra = value.detach()
            train_roa_labels_torch = torch.tensor(train_roa_labels, dtype=torch.float64)
            decrease_loss = train_roa_labels_torch * torch.maximum(tf_dv_nn_train,
                                torch.zeros([300, 1])) / (stop_gra + OPTIONS.eps)
            # decrease_loss = train_roa_labels * tf.maximum(tf_dv_nn_train, 0)
            res = (classifier_loss + lagrange_multiplier * decrease_loss).mean()
            return res

        values = lyapunov_nn.lyapunov_function.forward(train_states)
        nn_train_value = lyapunov_nn.lyapunov_function.forward(dynamics.build_evaluation(train_states))
        objective = loss(values, nn_train_value)
        optimizer.zero_grad()
        objective.backward()
        optimizer.step()
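For reference, here is a minimal standalone sketch (the names w and state are illustrative, not from my code) of the pattern that, as far as I understand, usually triggers this first error: a tensor carried over between iterations still holds a grad_fn into the previous iteration's graph, so the next backward() re-enters a graph that has already been freed. I wonder whether train_level = c or dynamics.build_evaluation(train_states) keeps such a reference in my loop.

    import torch

    w = torch.randn(1, requires_grad=True)
    state = torch.randn(1)

    for step in range(2):
        state = w * state   # state keeps a grad_fn into the previous graph
        loss = state.sum()
        loss.backward()     # second iteration: "Trying to backward through the
                            # graph a second time", because backward() walks
                            # back into the first, already-freed graph

Detaching the carried tensor after each step (state = state.detach()) makes this sketch run, which is why I suspect a carried-over tensor somewhere in my own loop.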

The structure of the neural network is simple:

    kernel_0 = nn.Linear(4, 256)
    kernel_1 = nn.Linear(256, 256)
    kernel_2 = nn.Linear(256, 256)
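These layers live inside lyapunov_nn.lyapunov_function; the module below is only my rough reconstruction of it (the tanh activations and the way the layers are chained are assumptions on my part; only the layer sizes are certain):

    import torch
    import torch.nn as nn

    class LyapunovFunction(nn.Module):
        def __init__(self):
            super().__init__()
            self.kernel_0 = nn.Linear(4, 256)
            self.kernel_1 = nn.Linear(256, 256)
            self.kernel_2 = nn.Linear(256, 256)

        def forward(self, x):
            x = torch.tanh(self.kernel_0(x))  # activation choice is assumed
            x = torch.tanh(self.kernel_1(x))
            return self.kernel_2(x)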

I tried retaining the computation graph, even though this training loop normally should not need to. After switching to .backward(retain_graph=True), the error becomes: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.DoubleTensor [256, 256]], which is output 0 of AsStridedBackward0, is at version 3; expected version 2 instead.
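From what I have read, anomaly detection is the standard way to locate the operation behind such errors: torch.autograd.set_detect_anomaly(True) makes backward() report the forward-pass traceback of the operation that produced the failing gradient. Below is a self-contained sketch of how it behaves; torch.exp is only an illustrative stand-in for whatever the real offending operation is (the [torch.DoubleTensor [256, 256]] shape in my error matches the weight matrices of kernel_1/kernel_2):

    import torch

    torch.autograd.set_detect_anomaly(True)  # slow; enable only while debugging

    x = torch.randn(3, requires_grad=True)
    y = torch.exp(x)    # exp saves its output y for the backward pass
    y += 1              # in-place edit bumps y's version counter
    y.sum().backward()  # RuntimeError: "... modified by an inplace operation";
                        # anomaly mode adds a traceback pointing at torch.exp(x)

Is wrapping my training loop in this mode the right way to pin down the in-place modification?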

Please teach me how to find the tensor that is causing these errors.

