Anatomy of a Neuron
The mathematical structure of an artificial neuron — inputs, weights, bias, the dot product, and the activation function — with implementation.
The Artificial Neuron
An artificial neuron is inspired by the biological neuron but simplified into a single mathematical operation:
Biological neuron:
Receives signals through its dendrites
Integrates them in the cell body
Fires if the sum exceeds a threshold
Sends signal through the axon
Artificial neuron (perceptron):
Receives inputs x₁, x₂, ..., xₙ
Computes weighted sum: z = Σ wᵢxᵢ + b
Applies activation function: output = activation(z)
Passes output to the next layer
The Formula
For a single neuron with n inputs:
z = w₁x₁ + w₂x₂ + ... + wₙxₙ + b
= w · x + b [dot product notation]
= xᵀw + b [matrix notation]
output = activation(z)
Where:
x = input vector [x₁, x₂, ..., xₙ]
w = weight vector [w₁, w₂, ..., wₙ]
b = bias (scalar)
z = pre-activation (also called "net input", or "logit" when a sigmoid or softmax follows)
activation = a non-linear function (ReLU, sigmoid, tanh, etc.)
Visual Example
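To make the formula concrete, here is a minimal NumPy sketch showing that the three notations for z give the same number, and how that single z maps through different activations. The input, weight, and bias values are arbitrary illustrative numbers, not taken from the text.

```python
import numpy as np

# Arbitrary illustrative values (assumptions, not from the article)
x = np.array([1.0, 2.0, 3.0])    # input vector
w = np.array([0.5, -1.0, 0.25])  # weight vector
b = 0.5                          # bias (scalar)

# Three equivalent ways to compute the pre-activation z
z_sum = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b  # w1*x1 + ... + wn*xn + b
z_dot = np.dot(w, x) + b                              # dot product notation
z_mat = x.T @ w + b                                   # matrix notation

# The same z passed through common activation functions
sigmoid_out = 1.0 / (1.0 + np.exp(-z_dot))
relu_out = max(0.0, z_dot)
tanh_out = np.tanh(z_dot)

print(z_sum, z_dot, z_mat)
print(sigmoid_out, relu_out, tanh_out)
```

For a 1-D NumPy array the transpose is a no-op, so `x.T @ w` and `np.dot(w, x)` are the same computation; the distinction only matters once inputs are stacked into a matrix.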
Clinical risk score neuron (predicting high INR risk):
Inputs (features):
x₁ = age (normalised) = 0.7 (age 70)
x₂ = warfarin dose (normalised) = 0.4 (5 mg)
x₃ = comorbidity count = 0.6 (3 conditions)
Learned weights:
w₁ = 0.8 (age is highly predictive)
w₂ = 0.5 (dose matters)
w₃ = 0.4 (comorbidities matter)
b = -1.5 (bias — shifts the threshold)
z = 0.8(0.7) + 0.5(0.4) + 0.4(0.6) + (-1.5)
= 0.56 + 0.20 + 0.24 - 1.5
= -0.50
output = sigmoid(-0.50) = 1 / (1 + e^0.50) ≈ 0.38
→ 38% predicted probability of high INR risk
Python Implementation
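Before the general class, the arithmetic of the worked example above can be checked in a few lines of plain Python. This is only a sanity-check sketch that reuses the feature values, weights, and bias from the example.

```python
import math

# Values copied from the worked clinical example
x = [0.7, 0.4, 0.6]   # age, warfarin dose, comorbidity count
w = [0.8, 0.5, 0.4]   # learned weights
b = -1.5              # bias

# Weighted sum plus bias, then sigmoid
z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
p = 1.0 / (1.0 + math.exp(-z))

print(f"z = {z:.2f}")        # z = -0.50
print(f"output = {p:.2f}")   # output = 0.38
```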
import numpy as np

class Neuron:
    def __init__(self, n_inputs: int, activation: str = "sigmoid"):
        # Xavier (Glorot) initialisation: std = sqrt(2 / (fan_in + fan_out)),
        # with fan_in = n_inputs and fan_out = 1 for a single neuron
        std = np.sqrt(2.0 / (n_inputs + 1))
        self.weights = np.random.normal(0, std, n_inputs)
        self.bias = 0.0
        self.activation_name = activation

    def activation(self, z: float) -> float:
        if self.activation_name == "sigmoid":
            return 1 / (1 + np.exp(-z))
        elif self.activation_name == "relu":
            return max(0.0, z)
        elif self.activation_name == "tanh":
            return np.tanh(z)
        elif self.activation_name == "linear":
            return z
        raise ValueError(f"Unknown activation: {self.activation_name}")

    def forward(self, x: np.ndarray) -> float:
        # Pre-activation z = w · x + b, then the activation function
        z = np.dot(self.weights, x) + self.bias
        return self.activation(z)

    def __repr__(self):
        return f"Neuron(inputs={len(self.weights)}, activation={self.activation_name})"

# Example (weights are randomly initialised, so the output varies between runs)
neuron = Neuron(n_inputs=3, activation="sigmoid")
x = np.array([0.7, 0.4, 0.6])  # clinical features
output = neuron.forward(x)
print(f"Output: {output:.4f}")

# PyTorch equivalent: nn.Linear followed by activation
import torch
import torch.nn as nn

# A single neuron (1 output unit)
linear = nn.Linear(in_features=3, out_features=1, bias=True)
activation = nn.Sigmoid()
x_tensor = torch.tensor([0.7, 0.4, 0.6], dtype=torch.float32)
z = linear(x_tensor)
output_tensor = activation(z)
print(f"PyTorch output: {output_tensor.item():.4f}")
The Role of the Bias
Without bias:
z = w · x
The decision boundary passes through the origin
Cannot represent "default output when all inputs are zero"
With bias:
z = w · x + b
b shifts the activation function left or right
Allows the neuron to fire even when inputs are zero (or not fire when they're high)
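The left/right shift can be sketched for a neuron with a single input. With a sigmoid activation, the output crosses 0.5 where z = 0, that is at x = -b / w. The weight value and the helper name `crossover_input` below are hypothetical, chosen only for illustration.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def crossover_input(w: float, b: float) -> float:
    """Input x at which z = w*x + b equals 0, so sigmoid(z) = 0.5."""
    return (0.0 - b) / w

w = 2.0  # hypothetical weight for a single input

for b in (0.0, -2.0, 2.0):
    x_half = crossover_input(w, b)   # where the output crosses 0.5
    at_zero = sigmoid(w * 0.0 + b)   # output when the input is zero
    print(f"b={b:+.1f}: crosses 0.5 at x={x_half:+.1f}, output at x=0 is {at_zero:.3f}")
```

A negative bias moves the crossover to a larger input (the curve shifts right), so the neuron needs stronger evidence before its output rises above 0.5. A positive bias does the opposite.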
Example:
Intercept in logistic regression = bias
Without bias: if all features are 0, P(y=1) is always 0.5 (σ(0) = 0.5)
With bias b=-5: if all features are 0, P(y=1) is very low (σ(-5) ≈ 0.007)
→ Correct for rare diseases with low baseline probability
What a Neuron Learns to Detect
In a trained neural network:
A neuron in layer 1: detects a simple feature (edge direction, word token)
A neuron in layer 2: detects a combination of layer 1 features
A neuron in the final layer: detects a high-level concept
The weights determine what feature the neuron responds to.
Training adjusts the weights so each neuron detects something useful
for the task.
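How weights define the feature a neuron responds to can be illustrated with a toy detector. In this sketch the weights are set by hand rather than learned, and the 3x3 patches are made-up inputs. A trained network would arrive at its own weights through optimisation.

```python
import numpy as np

# Hand-set weights (an assumption for illustration, not learned values):
# positive on the top row, negative on the bottom row, so the neuron responds
# when the top of a 3x3 patch is brighter than the bottom (a horizontal edge).
weights = np.array([[ 1.0,  1.0,  1.0],
                    [ 0.0,  0.0,  0.0],
                    [-1.0, -1.0, -1.0]]).ravel()
bias = 0.0

def relu_neuron(patch: np.ndarray) -> float:
    """Weighted sum of the flattened patch plus bias, passed through ReLU."""
    z = float(np.dot(weights, patch.ravel()) + bias)
    return max(0.0, z)

horizontal_edge = np.array([[1, 1, 1], [1, 1, 1], [0, 0, 0]], dtype=float)
vertical_edge = np.array([[1, 1, 0], [1, 1, 0], [1, 1, 0]], dtype=float)
flat_patch = np.ones((3, 3))

print(relu_neuron(horizontal_edge))  # 3.0
print(relu_neuron(vertical_edge))    # 0.0
print(relu_neuron(flat_patch))       # 0.0
```

The same neuron structure with different weights would respond to a different pattern, which is the sense in which the weights determine the detected feature.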
Specialisation example (vision):
Some neurons fire for horizontal edges
Some neurons fire for faces
Some neurons fire for specific objects
Clinical example:
Some neurons in an ECG model fire for elevated ST segments
Some neurons fire for irregular RR intervals (AF pattern)
The final layer combines these to predict the diagnosis
Interview Answer
"An artificial neuron computes a weighted sum of its inputs plus a bias, z = w·x + b, then passes z through a non-linear activation function. The weights determine which input features the neuron responds to (their relative importance), and the bias shifts the activation threshold. The activation function introduces non-linearity, which is essential for neural networks to approximate complex functions: without it, any stack of linear layers collapses to a single linear transformation. A single neuron with a sigmoid activation is equivalent to logistic regression. The power of neural networks comes from composing many such neurons in layers, where each successive layer detects progressively more abstract features."
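The claim that stacked linear layers collapse to a single linear transformation can be checked numerically. This is a minimal sketch with randomly generated weights and arbitrary layer sizes (3 → 4 → 2), all chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two stacked linear layers with random (illustrative) parameters
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)

# Without an activation: layer 2 applied directly to layer 1's output
stacked = W2 @ (W1 @ x + b1) + b2

# The same mapping written as one linear layer
W, b = W2 @ W1, W2 @ b1 + b2
single = W @ x + b

# With a ReLU in between, the single-layer form no longer matches in general
with_relu = W2 @ np.maximum(0.0, W1 @ x + b1) + b2

print(np.allclose(stacked, single))
print(np.allclose(with_relu, single))
```

Algebraically, W2(W1x + b1) + b2 = (W2W1)x + (W2b1 + b2), which is why the first comparison holds for any choice of weights.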