Possible shape error in _add_loss_op() in model.py #34

@thefirebanks

Hello!

I'm running into a reshaping error when using RL and intermediate rewards.

The output of intermediate_rewards() is a list of max_dec_step tensors, each of shape (batch_size, k) (see the comment on line 241).

This list is then stacked and stored in self.sampling_discounted_rewards, which has shape (batch_size, k).

But then in _add_loss_op(), you iterate k times and append:

for _ in range(self._hps.k):
    self._sampled_rewards.append(self.sampling_discounted_rewards[:, :, _]) # shape (max_enc_steps, batch_size)

But the index [:, :, _] takes three indices, so it would run into a dimension error, because the shape of self.sampling_discounted_rewards is (batch_size, k), which has only two dimensions.
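Here is a minimal sketch of the shapes I have in mind. It uses NumPy as a stand-in for the TensorFlow tensors, the sizes are made up, and `per_step`, `stacked`, and `flat` are names I chose for illustration. It is not the repository's code, and I'm only assuming the stacking happens along a new leading axis.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
max_dec_steps, batch_size, k = 4, 2, 3

# One (batch_size, k) array per decoder step, mimicking the list
# described for intermediate_rewards().
per_step = [np.zeros((batch_size, k)) for _ in range(max_dec_steps)]

# Stacking along a new leading axis yields a rank-3 array.
stacked = np.stack(per_step)   # shape (max_dec_steps, batch_size, k)
slice_ok = stacked[:, :, 0]    # shape (max_dec_steps, batch_size)

# A rank-2 (batch_size, k) array cannot be indexed with three indices.
flat = np.zeros((batch_size, k))
try:
    flat[:, :, 0]
    failed = False
except IndexError:
    failed = True
```

If the stacked tensor were rank 3 as in `stacked`, the slice in the loop would work; with a rank-2 tensor as in `flat`, it fails.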

Am I missing something here? What should be the correct shape/reshaping? Thank you for uploading this code!
