Logistic regression is a statistical model for binary classification that models the probability of a binary outcome $y$ as a function of input features $x$:

$$ p(y\mid x) = \sigma(z)^{y}(1-\sigma(z))^{1-y} $$

and thus:

$$ p(y=1\mid x)=\sigma(z) $$

where:

$x\in\R^d$ is an input vector
$y\in\set{0,1}$ is a binary label
$w\in\R^d$ is a weight vector
$b\in\R$ is a bias
$z=w^\top x + b$
$\sigma(z)=\frac{1}{1+e^{-z}}$ is the sigmoid function

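It is a generalized linear model with a Bernoulli response. Its link function is the logit, $\log\frac{p}{1-p}=z$ with $p=p(y=1\mid x)$, and the logistic (sigmoid) function is the inverse of that link.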
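As a rough illustration of the model above, the following Python sketch evaluates $z=w^\top x+b$, $\sigma(z)$, and $p(y\mid x)$ for a single input. The function names (`sigmoid`, `linear_score`, `bernoulli_prob`) and the numeric values of `w`, `b`, and `x` are made up for the example and are not part of the original text.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z)), computed without overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form exp(z) / (1 + exp(z)).
    ez = math.exp(z)
    return ez / (1.0 + ez)


def linear_score(w: list[float], x: list[float], b: float) -> float:
    """z = w^T x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def bernoulli_prob(y: int, w: list[float], x: list[float], b: float) -> float:
    """p(y | x) = sigma(z)^y * (1 - sigma(z))^(1 - y) for y in {0, 1}."""
    p = sigmoid(linear_score(w, x, b))
    return p if y == 1 else 1.0 - p


# Hypothetical parameters and input, chosen only for illustration.
w = [0.5, -1.0]
b = 0.25
x = [2.0, 1.0]

z = linear_score(w, x, b)          # 0.5*2.0 - 1.0*1.0 + 0.25
p1 = bernoulli_prob(1, w, x, b)    # equals sigmoid(z)
p0 = bernoulli_prob(0, w, x, b)    # equals 1 - sigmoid(z)
print(z, p1, p0)
```

The two probabilities sum to one, and applying the logit $\log\frac{p}{1-p}$ to `p1` recovers `z`, which is what makes the logit the link function of this model.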

Training

Given the likelihood of $n$ independent training examples $(x_i, y_i)$:

$$ L(w,b) = \prod^n_{i=1} p(y_i\mid x_i) $$

training maximizes the log-likelihood:

$$ \ell(w,b)=\sum^n_{i=1}\left[y_i \log\sigma(z_i) + (1-y_i)\log(1-\sigma(z_i))\right] $$

where: