AI Systems · Beginner

Anatomy of a Neuron

The mathematical structure of an artificial neuron — inputs, weights, bias, the dot product, and the activation function — with implementation.

Asma Hafeez Khan · May 21, 2026 · 4 min read
Tags: Deep Learning, Neuron, Perceptron, Activation, Interview

The Artificial Neuron

Inspired by biological neurons but simplified into a mathematical operation:

Biological neuron:
  Receives signals from dendrites
  Integrates them in the cell body
  Fires if the sum exceeds a threshold
  Sends signal through the axon

Artificial neuron (the classic perceptron is the special case with a step activation):
  Receives inputs x₁, x₂, ..., xₙ
  Computes weighted sum: z = Σ wᵢxᵢ + b
  Applies activation function: output = activation(z)
  Passes output to the next layer
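The "fires if the sum exceeds a threshold" idea maps directly onto a step activation. The sketch below is a minimal illustration in plain Python, not the article's implementation: the function names, weights and threshold are hypothetical and chosen only to show that a threshold θ can be folded into a bias b = -θ.

```python
def step_neuron(x, w, threshold):
    """Fire (return 1) when the weighted sum of the inputs exceeds the threshold."""
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if weighted_sum > threshold else 0


def step_neuron_with_bias(x, w, b):
    """Same unit, with the threshold folded into a bias term b = -threshold."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else 0


# Hypothetical weights and threshold, chosen only for illustration.
w = [0.5, 0.5]
threshold = 0.75

for x in ([0, 0], [1, 0], [0, 1], [1, 1]):
    fired = step_neuron(x, w, threshold)
    # Both formulations agree on every input.
    assert fired == step_neuron_with_bias(x, w, -threshold)
    print(x, "->", fired)
```

With these particular values the unit fires only when both inputs are 1, so it behaves like a logical AND.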

The Formula

For a single neuron with n inputs:

z = w₁x₁ + w₂x₂ + ... + wₙxₙ + b
  = w · x + b           [dot product notation]
  = xᵀw + b             [matrix notation]

output = activation(z)

Where:
  x = input vector [x₁, x₂, ..., xₙ]
  w = weight vector [w₁, w₂, ..., wₙ]
  b = bias (scalar)
  z = pre-activation (also called "net input"; called the "logit" when a sigmoid or softmax follows)
  activation = a non-linear function (ReLU, sigmoid, tanh, etc.)

Visual Example

Clinical risk score neuron (predicting high INR risk):

Inputs (normalised features):
  x₁ = 0.7   age (70 years)
  x₂ = 0.4   warfarin dose (5 mg)
  x₃ = 0.6   comorbidity count (3 conditions)

Learned weights:
  w₁ = 0.8   (age is highly predictive)
  w₂ = 0.5   (dose matters)
  w₃ = 0.4   (comorbidities matter)
  b  = -1.5  (bias — shifts the threshold)

z = 0.8(0.7) + 0.5(0.4) + 0.4(0.6) + (-1.5)
  = 0.56 + 0.20 + 0.24 - 1.5
  = -0.50

output = sigmoid(-0.50) = 1 / (1 + e^0.50) ≈ 0.38

→ 38% predicted probability of high INR risk
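The arithmetic above can be checked in a few lines of plain Python. This is only a verification sketch using the inputs, weights and bias from the worked example; the helper name sigmoid is ours.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Values copied from the worked example.
x = [0.7, 0.4, 0.6]   # age, warfarin dose, comorbidity count (normalised)
w = [0.8, 0.5, 0.4]   # learned weights
b = -1.5              # bias

# Weighted sum plus bias, then the activation.
z = sum(wi * xi for wi, xi in zip(w, x)) + b
probability = sigmoid(z)

print(f"z = {z:.2f}")                  # z = -0.50
print(f"output = {probability:.4f}")   # output = 0.3775
```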

Python Implementation

Python
import numpy as np

class Neuron:
    def __init__(self, n_inputs: int, activation: str = "sigmoid"):
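        # Xavier (Glorot) normal initialisation: std = sqrt(2 / (fan_in + fan_out)),
        # with fan_in = n_inputs and fan_out = 1 for a single neuron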
        std = np.sqrt(2.0 / (n_inputs + 1))
        self.weights = np.random.normal(0, std, n_inputs)
        self.bias = 0.0
        self.activation_name = activation
    
    def activation(self, z: float) -> float:
        if self.activation_name == "sigmoid":
            return 1 / (1 + np.exp(-z))
        elif self.activation_name == "relu":
            return max(0.0, z)
        elif self.activation_name == "tanh":
            return np.tanh(z)
        elif self.activation_name == "linear":
            return z
        raise ValueError(f"Unknown activation: {self.activation_name}")
    
    def forward(self, x: np.ndarray) -> float:
        z = np.dot(self.weights, x) + self.bias
        return self.activation(z)
    
    def __repr__(self):
        return f"Neuron(inputs={len(self.weights)}, activation={self.activation_name})"


# Example (weights are randomly initialised, so the output varies between runs)
neuron = Neuron(n_inputs=3, activation="sigmoid")
x = np.array([0.7, 0.4, 0.6])   # clinical features
output = neuron.forward(x)
print(f"Output: {output:.4f}")


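# PyTorch equivalent: nn.Linear followed by an activation.
# nn.Linear draws its own random weights, so the printed value will not match the NumPy neuron.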
import torch
import torch.nn as nn

# A single neuron (1 output unit)
linear = nn.Linear(in_features=3, out_features=1, bias=True)
activation = nn.Sigmoid()

x_tensor = torch.tensor([0.7, 0.4, 0.6], dtype=torch.float32)
z = linear(x_tensor)
output_tensor = activation(z)
print(f"PyTorch output: {output_tensor.item():.4f}")
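One practical caveat about the sigmoid branch above: evaluating exp(-z) directly can overflow when z is a large negative number. A common workaround, sketched here in plain Python, is to branch on the sign of z so that only non-positive values are ever exponentiated. The names naive_sigmoid and stable_sigmoid are ours and are not part of the Neuron class.

```python
import math


def naive_sigmoid(z: float) -> float:
    # Direct translation of 1 / (1 + e^(-z)); math.exp overflows for large negative z.
    return 1.0 / (1.0 + math.exp(-z))


def stable_sigmoid(z: float) -> float:
    # Only ever exponentiate a non-positive number, so exp() cannot overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


for z in (-1000.0, -0.5, 0.0, 0.5, 1000.0):
    print(z, stable_sigmoid(z))

# The naive form fails on the extreme negative input.
try:
    naive_sigmoid(-1000.0)
except OverflowError as err:
    print("naive_sigmoid(-1000.0) failed:", err)
```

With NumPy the failure mode is milder (np.exp warns and returns inf instead of raising), but the same sign-based split avoids the warning.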

The Role of the Bias

Without bias:
  z = w · x
  The decision boundary passes through the origin
  Cannot represent "default output when all inputs are zero"

With bias:
  z = w · x + b
  b shifts the activation function left or right
  Allows the neuron to fire even when inputs are zero (or not fire when they're high)

Example:
  Intercept in logistic regression = bias
  Without bias: if all features are 0, P(y=1) is always 0.5 (σ(0) = 0.5)
  With bias b=-5: if all features are 0, P(y=1) is very low (σ(-5) ≈ 0.007)
  → Correct for rare diseases with low baseline probability
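The two baseline probabilities quoted above are easy to reproduce. The sketch below is plain Python; the helper predict is hypothetical, and the weights are reused from the earlier worked example only to show that they have no effect when every input is zero.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b=0.0):
    """Sigmoid neuron: weighted sum plus bias, then the logistic function."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


zeros = [0.0, 0.0, 0.0]
w = [0.8, 0.5, 0.4]   # any weights give the same result for all-zero inputs

no_bias = predict(zeros, w)             # sigmoid(0)
with_bias = predict(zeros, w, b=-5.0)   # sigmoid(-5)

print(f"without bias: {no_bias:.3f}")   # without bias: 0.500
print(f"with b = -5:  {with_bias:.3f}") # with b = -5:  0.007
```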

What a Neuron Learns to Detect

In a trained neural network:
  A neuron in layer 1: detects a simple feature (edge direction, word token)
  A neuron in layer 2: detects a combination of layer 1 features
  A neuron in the final layer: detects a high-level concept

The weights determine what feature the neuron responds to.
Training adjusts the weights so each neuron detects something useful
for the task.

Specialisation example (vision):
  Some neurons fire for horizontal edges
  Some neurons fire for faces
  Some neurons fire for specific objects

Clinical example:
  Some neurons in an ECG model fire for elevated ST segments
  Some neurons fire for irregular RR intervals (AF pattern)
  The final layer combines these to predict the diagnosis
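To make "the weights determine what feature the neuron responds to" concrete, here is a toy sketch with hand-set weights (not learned ones). It treats a 3×3 image patch as nine inputs and uses a Prewitt-style weight pattern, which is our assumption for illustration, so that a ReLU neuron responds to a bright-above-dark horizontal edge and stays silent otherwise.

```python
def relu_neuron(x, w, b=0.0):
    """Weighted sum plus bias, passed through ReLU."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, z)


# Hand-set weights (illustrative, not trained): positive on the top row,
# zero on the middle row, negative on the bottom row of a 3x3 patch.
horizontal_edge_weights = [
     1.0,  1.0,  1.0,
     0.0,  0.0,  0.0,
    -1.0, -1.0, -1.0,
]

# Three flattened 3x3 patches with pixel intensities in [0, 1].
horizontal_edge = [1.0, 1.0, 1.0,  0.5, 0.5, 0.5,  0.0, 0.0, 0.0]  # bright top, dark bottom
vertical_edge   = [1.0, 0.5, 0.0,  1.0, 0.5, 0.0,  1.0, 0.5, 0.0]  # bright left, dark right
flat_patch      = [1.0] * 9                                        # uniform brightness

for name, patch in [("horizontal", horizontal_edge),
                    ("vertical", vertical_edge),
                    ("flat", flat_patch)]:
    print(name, relu_neuron(patch, horizontal_edge_weights))
```

Only the horizontal-edge patch produces a non-zero output (3.0). In a trained network, gradient descent would arrive at weight patterns like this without anyone setting them by hand.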

Interview Answer

"An artificial neuron computes a weighted sum of its inputs plus a bias, z = w·x + b, then passes z through a non-linear activation function. The weights determine which input features the neuron responds to and how strongly, and the bias shifts the activation threshold. The activation function introduces non-linearity, which is essential for neural networks to approximate complex functions: without it, any stack of linear layers collapses to a single linear transformation. A single neuron with a sigmoid activation is equivalent to logistic regression. The power of neural networks comes from composing many such neurons in layers, where successive layers detect progressively more abstract features."

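The claim that stacked linear layers collapse to a single linear transformation can be checked numerically. The sketch below uses plain Python lists with small hypothetical weights: two layers without an activation match one layer with W = W2·W1 and b = W2·b1 + b2, while inserting a ReLU between them breaks the equivalence.

```python
def matvec(W, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]


def matmul(A, B):
    """Matrix-matrix product for list-of-rows matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def relu(v):
    return [max(0.0, a) for a in v]


# Hypothetical weights for a 2 -> 2 -> 1 network, chosen for easy arithmetic.
W1, b1 = [[1.0, -2.0], [0.5, 1.0]], [0.0, -1.0]
W2, b2 = [[2.0, 1.0]], [0.5]
x = [1.0, 2.0]

# Two linear layers with no activation in between.
stacked = add(matvec(W2, add(matvec(W1, x), b1)), b2)

# The same map as a single linear layer.
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)
collapsed = add(matvec(W, x), b)

# With a ReLU between the layers the result is no longer the same linear map.
with_relu = add(matvec(W2, relu(add(matvec(W1, x), b1))), b2)

print("stacked:  ", stacked)     # [-4.0]
print("collapsed:", collapsed)   # [-4.0]
print("with ReLU:", with_relu)   # [2.0]
```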