| 69 | 69 |  "Given the availability of differential labels $Z,ドル <b> differential regression </b> minimizes a combination of value and derivative errors:\n", | 
| 70 | 70 |  " \n", | 
| 71 | 71 |  "$$\n", | 
| 72 |  | - " \\min \\left\\{ MSE + \\sum_{j=1}^n \\alpha_j E\\left\\{\\left[Z_j-\\beta \\cdot \\phi_j\\left(X\\right)\\right]^2\\right\\} \\right\\}\n", | 
|  | 72 | + " \\min \\left\\{ MSE + \\sum_{j=1}^n \\alpha_j E\\left\\{\\left[Z_j-\\beta \\cdot \\delta \\phi_j\\left(X\\right)\\right]^2\\right\\} \\right\\}\n", | 
| 73 | 73 |  "$$\n", | 
| 74 | 74 |  "\n", | 
| 75 |  | - "where $\\phi_j\\left(X\\right) = \\left[\\frac{\\partial \\phi_1\\left(X\\right)}{\\partial X_j}, ..., \\frac{\\partial \\phi_K\\left(X\\right)}{\\partial X_j}\\right] \\in \\mathbb{R}^K$ is the vector of partial derivatives of the basis functions wrt the j-th input $X_j,ドル and $Z_j$ is the j-th differential label.\n", | 
|  | 75 | + "where $\\delta \\phi_j\\left(X\\right) = \\left[\\frac{\\partial \\phi_1\\left(X\\right)}{\\partial X_j}, ..., \\frac{\\partial \\phi_K\\left(X\\right)}{\\partial X_j}\\right] \\in \\mathbb{R}^K$ is the vector of partial derivatives of the basis functions wrt the j-th input $X_j,ドル and $Z_j$ is the j-th differential label.\n", | 
| 76 | 76 |  "\n", | 
| 77 | 77 |  "Zeroing the gradient of the differential objective with respect to the weights $\\beta,ドル we obtain the differential normal equation:\n", | 
| 78 | 78 |  "\n", | 
| 79 | 79 |  "$$\n", | 
| 80 | 80 |  " \\beta = \\left( C_{\\phi\\phi} + \\sum_{j=1}^n \\alpha_j C_{jj}^\\phi \\right)^{-1} \\left( C_{\\phi y} + \\sum_{j=1}^n \\alpha_j C_{j}^{\\phi z} \\right)\n", | 
| 81 | 81 |  "$$\n", | 
| 82 | 82 |  "\n", | 
| 83 |  | - "where $C_{jj}^\\phi = E\\left[\\phi_j\\left(X\\right) \\phi_j\\left(X\\right)^T\\right] \\in \\mathbb{R}^{K \\times K}$ and $C_{j}^{\\phi z} = E\\left[\\phi_j\\left(X\\right) z_j \\right] \\in \\mathbb{R}^K$.\n", | 
|  | 83 | + "where $C_{jj}^\\phi = E\\left[\\delta \\phi_j\\left(X\\right) \\delta \\phi_j\\left(X\\right)^T\\right] \\in \\mathbb{R}^{K \\times K}$ and $C_{j}^{\\phi z} = E\\left[\\delta \\phi_j\\left(X\\right) z_j \\right] \\in \\mathbb{R}^K$.\n", | 
| 84 | 84 |  "\n", | 
| 85 | 85 |  "As in ridge regression, the hyperparameters $\\alpha_j$ control the relative weight of the derivative errors in the minimization objective. Unlike ridge regression, however, differential regularization does not introduce bias, so it carries little risk of underfitting. A reasonable default is given by:\n", | 
| 86 | 86 |  "\n", | 
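The revised objective above maps directly onto a few lines of NumPy. The following is a minimal sketch, assuming `phi` holds the basis values $\phi(X)$ as an $m \times K$ array, `dphi` holds the derivatives $\delta \phi_j(X)$ stacked as $m \times n \times K,ドル `y` and `z` hold the value and differential labels, and `alpha` holds the weights $\alpha_j$; all of these names are illustrative, not taken from the notebook.

```python
import numpy as np

def differential_loss(beta, phi, y, dphi, z, alpha):
 """Differential regression objective: value MSE plus the
 alpha_j-weighted derivative MSEs.

 phi: (m, K) basis values phi(X)
 y: (m,) value labels
 dphi: (m, n, K) basis derivatives, dphi[:, j, :] = delta phi_j(X)
 z: (m, n) differential labels Z
 alpha: (n,) weights alpha_j
 """
 loss = np.mean((phi @ beta - y) ** 2) # value MSE
 for j in range(z.shape[1]): # one term per input X_j
 loss += alpha[j] * np.mean((dphi[:, j, :] @ beta - z[:, j]) ** 2)
 return loss
```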
|  | 
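The differential normal equation then gives the minimizer in closed form. The sketch below (same illustrative array layout as above) estimates $C_{\phi\phi},ドル $C_{jj}^\phi,ドル $C_{\phi y}$ and $C_j^{\phi z}$ by sample means and solves the resulting $K \times K$ linear system.

```python
import numpy as np

def fit_differential(phi, y, dphi, z, alpha):
 """Closed-form beta from the differential normal equation:
 beta = (C_phiphi + sum_j alpha_j C_jj)^-1 (C_phiy + sum_j alpha_j C_j).
 """
 m = phi.shape[0]
 A = phi.T @ phi / m # C_{phi phi}
 b = phi.T @ y / m # C_{phi y}
 for j in range(z.shape[1]):
 dpj = dphi[:, j, :] # delta phi_j(X), shape (m, K)
 A += alpha[j] * (dpj.T @ dpj) / m # alpha_j * C_{jj}^phi
 b += alpha[j] * (dpj.T @ z[:, j]) / m # alpha_j * C_j^{phi z}
 return np.linalg.solve(A, b)
```

For an ill-conditioned basis, replacing `np.linalg.solve` with `np.linalg.lstsq` is the more robust choice; either way, the result minimizes the objective sketched above.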