# Introduction to Robot Intelligence HW 5: Coding Questions

This is the coding portion of Homework 5. For submission instructions, please see the theory portion of Homework 5.

This portion of the homework consists of a single problem.

## Problem 1: Linear Quadratic Regulator for Mysterious Manipulator


Consider a mysterious 2-DOF robot manipulator with dynamics governed by the equations below.

$$\begin{align} \ddot{x}_1 &= 0.1 \cdot \dot{x}_1 + u_{1}\\ 
\ddot{x}_2 &= 0.01 \cdot g \cdot x_2 + u_2\end{align}$$

We want to write a numerical LQR solver that finds the control torques $\mathbf{u} = [u_1 \ \ u_2]$ that keep the manipulator state $\mathbf{x} = [x_1 \ \ x_2 \ \ \dot{x}_1 \ \ \dot{x}_2]$ at the equilibrium $\mathbf{x} = \boldsymbol 0$. To that end, fill in the undefined functions and definitions below.
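
For reference, one common way to write the discrete-time LQR backward recursion is sketched below. It assumes linearized dynamics $\mathbf{x}_{t+1} = A\mathbf{x}_t + B\mathbf{u}_t$, the stage cost $\mathbf{x}^\top Q \mathbf{x} + \mathbf{u}^\top R \mathbf{u}$, and the sign convention $\mathbf{u} = K\mathbf{x}$ that the test cell in this notebook applies (so the minus sign is folded into $K$). The initialization $V_0 = Q$ and the iteration index $t$ (counting steps backwards from the horizon) are assumptions; Tutorial 5 may present an equivalent form with different notation.

$$\begin{aligned}
K_t &= -\left(R + B^\top V_t B\right)^{-1} B^\top V_t A,\\
V_{t+1} &= Q + K_t^\top R K_t + (A + B K_t)^\top V_t \,(A + B K_t).
\end{aligned}$$

Under these assumptions, $K_t$ has shape $(2, 4)$ and $V_t$ has shape $(4, 4)$ for this problem, and repeating the update for enough steps should make both matrices settle to their steady-state values.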


If you are stuck, you may find referring to [Tutorial 5](https://colab.research.google.com/drive/1zcJbQe7PUcqMU_hiNbGcmWF4YqH0FJ3H?usp=sharing) helpful.
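
LQR works with linear dynamics, so the matrices $A$ and $B$ are the Jacobians of the discrete-time step function with respect to the state and the control. The sketch below illustrates that idea on a hypothetical toy system (a 1-DOF double integrator advanced with an Euler step), not on the manipulator above. The names `toy_step` and `finite_difference_jacobian` are made up for illustration, the Euler discretization is an assumption, and central finite differences stand in for `autograd`'s `jacobian` so that the example needs only NumPy.

```python
import numpy as np

def toy_step(x, u, dt=0.1):
    """Euler step of a hypothetical double integrator.

    x = [position, velocity], u = [force]; returns the state at t + dt.
    """
    pos, vel = x
    acc = u[0]
    return np.array([pos + dt * vel, vel + dt * acc])

def finite_difference_jacobian(f, z, eps=1e-6):
    """Central-difference Jacobian of f evaluated at z."""
    z = np.asarray(z, dtype=float)
    out_dim = np.asarray(f(z)).size
    J = np.zeros((out_dim, z.size))
    for i in range(z.size):
        dz = np.zeros_like(z)
        dz[i] = eps
        # Perturb one input coordinate at a time to fill one column of J.
        J[:, i] = (f(z + dz) - f(z - dz)) / (2 * eps)
    return J

# Linearize around the equilibrium x = 0, u = 0.
x_eq = np.zeros(2)
u_eq = np.zeros(1)
A_toy = finite_difference_jacobian(lambda x: toy_step(x, u_eq), x_eq)
B_toy = finite_difference_jacobian(lambda u: toy_step(x_eq, u), u_eq)

print(np.round(A_toy, 6))
print(np.round(B_toy, 6))
```

For this toy system the step function is already linear, so `A_toy` and `B_toy` simply recover the coefficients of `toy_step`. With `autograd`, the same two Jacobians could be obtained by differentiating the step function with respect to its first and second arguments.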

In [None]:
import autograd.numpy as np
from autograd import grad, jacobian

## DO NOT MODIFY THESE GLOBAL DEFINITIONS
Q = np.eye(4)
R = 1*np.eye(2)
g = 9.8

In [None]:
def mysterious_manipulator_dynamics(vec_x, vec_u, delta_t=0.1):
	"""
	Inputs
		vec_x: (np.ndarray with shape (4,)) [x1, x2, dot_x1, dot_x2]
		vec_u: (np.ndarray with shape (2,)) [u1, u2]
		delta_t: constant

	Outputs
		A np.ndarray with shape (4,) representing vec_x at time (t + delta_t)
	"""
	raise NotImplementedError()

In [None]:
## Get the dynamics matrices A, B (separating the dynamics into terms in the
## state vs. terms in the control) for LQR, using your implementation of
## `mysterious_manipulator_dynamics`.
A = None
B = None

In [None]:
# Dynamic programming to compute K and V matrices backwards in time
def lqr_value_iteration(A, B, Q=Q, R=R, H=100):
  """
  Use the LQR steady-state update rule to compute the matrices K and V backwards
  in time. Return lists of V and K matrices (s.t. V[0], K[0] correspond to the
  first computed values of V, K respectively).

  Inputs
    A: (np.ndarray with shape (4,4)). Dynamics terms in state.
    B: (np.ndarray with shape (4,2)). Dynamics terms in control.
    Q: (np.ndarray with shape (4,4)). Positive semi-definite.
    R: (np.ndarray with shape (2,2)). Positive definite.
    H: number of iteration steps to do.
  Returns
    V_list: (Python list with length H of (4,4) np.ndarrays). Consists of all
     computed V matrices.
    K_list: (Python list with length H of (2,4) np.ndarrays). Consists of all
     computed K matrices.
  """
  raise NotImplementedError()

In [None]:
"""
TEST CASES

Use these test cases to check your code. All errors should be zero given the 
rounding defined here.
"""
import numpy as np
from tabulate import tabulate

def _forward_simulate(x_0, K, H):
  x_traj, u_traj = [], []
  x = x_0
  for t in range(H):
    u = K @ x
    x_traj.append(x)
    u_traj.append(u)
    x = mysterious_manipulator_dynamics(x, u)
  return x_traj, u_traj

assert A is not None and B is not None, "Define A and B before running the tests"

H = 1000
V_list, K_list = lqr_value_iteration(A, B, Q, R, H=H)

test_x0 = [[1, 0.01, 0.01, 1],
            [0.01, 0.01, 0.01, 0],
            [1, 3, 0.01, 2],
            [2, 0.01, 1, 0.01],
            [-1, 0.01, 1, -1],
            [-1, 6, 1, 1]]

expected_u0 = [[-0.9299836877065533, -1.6949506700118182], 
               [-0.026456190828435207, -0.01016999909840031], 
               [-0.9299836877065533, -6.420561071346928], 
               [-3.558273120094145, -0.02701780580753449], 
               [-0.8203110083422721, 1.6746106718150175], 
               [-0.8203110083422721, -7.786780129953603]]

expected_u1 = [[-0.7656675123060327, -1.5026399167094364], 
               [-0.022841212499060792, -0.008574954672458135], 
               [-0.7656675123060327, -5.560616325811397], 
               [-3.0344508456706976, -0.023515604292827924], 
               [-0.7834620583768432, 1.4854900073645203], 
               [-0.7834620583768432, -6.639037765511859]]

expected_u15 = [[0.164445609335082, -0.16411805690945114],
                [0.00036512682757059746, 0.0007651323047314232], 
                [0.164445609335082, -0.10022668700893789], 
                [0.20225054387575825, -0.0008836995874103998], 
                [-0.29496303948033714, 0.1656483215189139], 
                [-0.29496303948033714, 0.2941961936246722]]

results = []
for x0, exp_u0, exp_u1, exp_u15 in zip(test_x0, expected_u0, expected_u1, expected_u15):
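    # Convert the initial state to a float array so array arithmetic works in the dynamics.
    x_traj, u_traj = _forward_simulate(np.array(x0, dtype=float), K_list[-1], H)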
    final_control_err = np.linalg.norm(x_traj[-1])
    u0_err = np.linalg.norm(u_traj[0] - exp_u0)
    u1_err = np.linalg.norm(u_traj[1] - exp_u1)
    u15_err = np.linalg.norm(u_traj[15] - exp_u15)
    
    results.append([x0, final_control_err.round(2), u0_err.round(2), u1_err.round(2), u15_err.round(2)])

print(tabulate(results, headers=['x0', 'final state error', 'control at t=0 error', 'control at t=1 error', 'control at t=15 error']))
