Network-regularized Sparse Logistic Regression Models for Clinical Risk Prediction and Biomarker Discovery
Wenwen Min and Juan Liu and Shihua Zhang
arXiv e-Print archive - 2016 via Local arXiv
Keywords: q-bio.GN, cs.LG, stat.ML, J.3; H.2.8; G.1.6; I.5

Summary by Joseph Paul Cohen 7 years ago

In this paper they place a prior on the weights of a logistic regression model using known protein-protein interactions. They do so by regularizing the weights of the model with the Laplacian encoding of the interaction graph.

The regularization term takes the form:

$$\lambda ||w||_1 + \eta w^T L w,$$
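As a minimal sketch (not the authors' code), the penalty can be evaluated in a few lines of numpy; the names `network_penalty`, `lam`, and `eta` are placeholders:

```python
import numpy as np

def network_penalty(w, L, lam, eta):
    """Sparse network penalty: lam * ||w||_1 + eta * w^T L w."""
    return lam * np.abs(w).sum() + eta * w @ L @ w
```

The first term encourages sparsity; the second couples the weights of features that are connected in the graph.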

A small example:

Given a small graph of three nodes A, B, and C with one edge: {A-B} we have the following Laplacian:

$$ L = D - A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} - \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} $$

$$ L = \begin{bmatrix} 1 & -1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} $$
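This construction is easy to check numerically; the node ordering A, B, C is just an assumption for illustration:

```python
import numpy as np

# Adjacency matrix for the 3-node graph with the single edge {A-B} (node order A, B, C)
A = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]])
D = np.diag(A.sum(axis=1))  # degree matrix
L = D - A                   # graph Laplacian
print(L)
# [[ 1 -1  0]
#  [-1  1  0]
#  [ 0  0  0]]
```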

If we have a small linear regression of the form:

$$y = x_Aw_A + x_Bw_B + x_Cw_C$$

Then we can look at how $w^TLw$ will impact the weights to gain insight:

$$w^TLw $$

$$ = \begin{bmatrix} w_A & w_B & w_C \end{bmatrix} \begin{bmatrix} 1 & -1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} w_A \\ w_B \\ w_C \end{bmatrix} $$

$$ = \begin{bmatrix} w_A & w_B & w_C \end{bmatrix} \begin{bmatrix} w_A - w_B \\ -w_A + w_B \\ 0 \end{bmatrix} $$

$$ = (w_A^2 -w_Aw_B ) + (-w_Aw_B + w_B^2) $$

The squared terms act like an ordinary $L_2$ (ridge) penalty, so we can set them aside and look at the cross terms, which are where the graph structure actually enters the regularization:

$$ = (-w_Aw_B ) + (-w_Aw_B) $$

$$ = -2w_Aw_B$$

Since the penalty is minimized, the $-2w_Aw_B$ cross term encourages the connected weights $w_A$ and $w_B$ to share the same sign and move together. Combined with the squared terms the full penalty is just $(w_A - w_B)^2$, so the weights cannot grow without bound, and the $L_1$ penalty that is also used still pushes them toward sparsity.
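A quick numerical check of this expansion (the weight values below are arbitrary):

```python
import numpy as np

L = np.array([[ 1, -1, 0],
              [-1,  1, 0],
              [ 0,  0, 0]])
w = np.array([0.5, -1.2, 3.0])              # arbitrary values for w_A, w_B, w_C

quad   = w @ L @ w                          # w^T L w
smooth = (w[0] - w[1]) ** 2                 # (w_A - w_B)^2
cross  = quad - (w[0]**2 + w[1]**2)         # what remains after removing the squares
print(np.isclose(quad, smooth))             # True
print(np.isclose(cross, -2 * w[0] * w[1]))  # True
```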

A few more experiments:

If we perform the same computation for a graph with two edges {A-B, B-C}, we get cross terms coupling both pairs of connected weights:

$$ = -2w_Aw_B -2w_Bw_C$$

If we perform the same computation for a graph with two edges {A-B, A-C}, there are no surprises:

$$ = -2w_Aw_B -2w_Aw_C$$
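Both two-edge cases can be verified the same way; the helper `laplacian` and the weight values are just for illustration:

```python
import numpy as np

def laplacian(A):
    return np.diag(A.sum(axis=1)) - A

w = np.array([0.5, -1.2, 3.0])                          # w_A, w_B, w_C

A_chain = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])   # edges {A-B, B-C}
A_star  = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]])   # edges {A-B, A-C}

for A in (A_chain, A_star):
    L = laplacian(A)
    cross = w @ L @ w - np.diag(L) @ w**2   # subtract the degree-weighted squares
    print(cross)
# chain: -2*w_A*w_B - 2*w_B*w_C,  star: -2*w_A*w_B - 2*w_A*w_C
```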

Another thing to think about is what happens when a node has no edges. If isolated nodes are given degree 1 by default (a self-loop counted only in the degree matrix), then for a graph with no edges at all $L$ becomes the identity and the term reduces to a plain $L_2$ penalty. If no self-loops are defined, $L$ is the zero matrix and the term provides no regularization at all.
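The two conventions are easy to compare directly; which one an implementation uses for isolated nodes is an assumption here, not something the summary pins down:

```python
import numpy as np

w = np.array([0.5, -1.2, 3.0])
A = np.zeros((3, 3))                  # graph with no edges at all

L_plain = np.diag(A.sum(axis=1)) - A  # all degrees 0 -> zero matrix
L_self  = np.eye(3)                   # isolated nodes given degree 1 -> identity

print(w @ L_plain @ w)                # 0.0: no regularization
print(w @ L_self @ w, (w ** 2).sum()) # identical: plain L2 / ridge penalty
```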

Contribution:

A contribution of this paper is to apply the Laplacian penalty to the absolute values of the weights, so that connected features are encouraged to have similar magnitudes rather than similar signed values:

$$|w|^T L |w|$$
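A small sketch of the difference this makes; the weight values are arbitrary, with opposite signs on the connected pair A and B:

```python
import numpy as np

L = np.array([[ 1, -1, 0],
              [-1,  1, 0],
              [ 0,  0, 0]])
w = np.array([0.8, -0.8, 0.3])   # connected features A and B have opposite signs

print(w @ L @ w)                 # (w_A - w_B)^2 = 2.56: heavily penalized
a = np.abs(w)
print(a @ L @ a)                 # (|w_A| - |w_B|)^2 = 0.0: not penalized at all
```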

TODO: Add more about how this impacts learning.

Overview

Here a high-level figure shows the data and targets together with the graph prior. It looks nice so I wanted to include it.
