
updatePolicyParameters

Update a policy from a structure of policy parameters

Since R2025a

    Description

    newPolicy = updatePolicyParameters(oldPolicy,params) updates oldPolicy with the learnable and tunable parameter values contained in the structure params, and returns the updated policy as newPolicy.


    Examples


    Assume that you have an existing trained reinforcement learning agent. For this example, load the trained agent from the example Compare DDPG Agent to LQR Controller.

    load("DoubleIntegDDPG.mat","agent") 

    Obtain the exploration policy from the agent.

    policy = getExplorationPolicy(agent);

    Obtain the parameters structure from the policy.

    params = policyParameters(policy)
    params = struct with fields:
               Model_fc_Weights: [-15.4663 -7.2746]
                  Model_fc_Bias: 0
        Policy_EnableNoiseDecay: 0
          Policy_UseNoisyAction: 1
                             ID: 0
    
    

    Modify the parameter values. For this example, turn off the additive action noise to make the policy greedy.

    params.Policy_UseNoisyAction = false;

    Update the policy with the modified structure.

    policy = updatePolicyParameters(policy,params);

    Display the updated value of the policy parameter.

    policy.UseNoisyAction
    ans = logical
       0
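
    As a complementary check, the change can also be read back through the parameters structure rather than through the policy property. The following is a minimal sketch that repeats the steps above and uses only the functions shown on this page. It assumes that Reinforcement Learning Toolbox is installed and that DoubleIntegDDPG.mat is on the MATLAB path. The variable name updatedParams is introduced here for illustration.

```matlab
% Assumption: Reinforcement Learning Toolbox is installed and
% DoubleIntegDDPG.mat is on the MATLAB path.
load("DoubleIntegDDPG.mat","agent")

% Get the exploration policy and its parameters structure.
policy = getExplorationPolicy(agent);
params = policyParameters(policy);

% Turn off the additive action noise and update the policy.
params.Policy_UseNoisyAction = false;
policy = updatePolicyParameters(policy,params);

% Read the parameters back from the updated policy and display the field
% that was modified. updatedParams is an illustrative variable name.
updatedParams = policyParameters(policy);
disp(updatedParams.Policy_UseNoisyAction)
```

    Reading the structure back in this way is convenient when several fields are modified at once, because all of them can be inspected in one place.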
    
    

    Input Arguments


    Reinforcement learning policy to update, oldPolicy, specified as a policy object.

    Example: getExplorationPolicy(rlPPOAgent(rlNumericSpec([3 1]),rlNumericSpec([1 1])))

    Parameters of the policy, specified as a structure consistent with the bus object generated by a policy block. The data type and dimension of each field in params must be consistent with its corresponding policy parameter.

    To obtain a structure of parameters from a policy object, use the policyParameters function.
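
    Because the data type and dimension of each field must match the corresponding policy parameter, it can help to compare a modified structure against the original one before calling updatePolicyParameters. The following is a minimal sketch of such a check in plain MATLAB, not part of the toolbox. The structure refParams stands in for the output of policyParameters, with values copied from the example display above and data types assumed. The names refParams, newParams, and mismatched are hypothetical. The check is strict (identical class and size) and might be more restrictive than what updatePolicyParameters itself requires.

```matlab
% Reference structure, standing in for the output of policyParameters(policy).
% Values are copied from the example display; the data types are assumed.
refParams = struct( ...
    'Model_fc_Weights', [-15.4663 -7.2746], ...
    'Model_fc_Bias', 0);

% Candidate structure with one deliberate error: the weights have size 1-by-3
% instead of 1-by-2.
newParams = refParams;
newParams.Model_fc_Weights = [-15 -7 0];

% Collect the names of the fields that are missing from newParams or whose
% class or size differs from the reference.
names = fieldnames(refParams);
mismatched = strings(0,1);
for k = 1:numel(names)
    name = names{k};
    if ~isfield(newParams,name) ...
            || ~strcmp(class(newParams.(name)),class(refParams.(name))) ...
            || ~isequal(size(newParams.(name)),size(refParams.(name)))
        mismatched(end+1,1) = name; %#ok<SAGROW>
    end
end

% Display the fields that need to be corrected before updating the policy.
disp(mismatched)
```

    In this sketch, only Model_fc_Weights is flagged, because its size no longer matches the reference. An empty result suggests that the structure is safe to pass to updatePolicyParameters under the strict assumptions stated above.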

    Output Arguments


    New reinforcement learning policy, returned as a policy object of the same type as oldPolicy. Apart from the learnable and tunable parameter values, newPolicy is the same as oldPolicy.

    Version History

    Introduced in R2025a