🎓

AI & ML Fundamentals

Learn core concepts, theory, and intuitive visualizations.

🚀

Implementation Pipeline

Build and train real ML models with clinical markers.

⚖️

Ethical AI and Visual Understanding of ML

Interactive demos on clinical AI ethics, regression, and model transparency.

🎓 AI & ML Fundamentals

Core concepts to understand before exploring the implementation pipeline.

📥

Level 1: The Inputs

Features (Variables)

The specific data points used by a model to find patterns and make predictions.

Clinical Example Patient Age, BMI, Blood Pressure, and Heart Rate.

📤

Level 1: The Goal

Outcomes (Targets)

The specific value or category the model is trying to learn and predict.

Clinical Example Predicting if a patient has a disease (Yes/No).

🤖

Level 2: The Core

Artificial Intelligence (AI)

The broad science of making machines mimic human intelligence and reasoning.

Broad Context Chess-playing computers or automated hospital schedules.

📈

Level 2: Statistical

Machine Learning (ML)

A subset of AI where machines "learn" patterns directly from data.

Key Difference Learns rules from data instead of being told what to do.

🏷️

Level 3: Learning Types

Supervised Learning

Training with a teacher: The model sees data and the correct answer (Label).

Utility Used for diagnosis when historical "answers" exist.

🔦

Level 3: Discovery

Unsupervised Learning

Finding paths in the dark: The model groups data without being told the answer.

Utility Discovering new patient clusters or distinct disease subtypes.

✂️

Level 4: Evaluation

Train-Test Split

Hiding part of the data to test if the model actually "knows" its stuff.

Why Ensures the model works on patients it has NEVER seen before.

📊

Level 4: Honing

Train-Val-Test Split

A three-way split to fine-tune the model "settings" before the final exam.

Benefit Prevents "cheating": settings are tuned on the validation data, never on the test data.
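A minimal sketch of a three-way split, using only the Python standard library. The function name, the 60/20/20 fractions, and the stand-in patient IDs are assumptions chosen for illustration, not values taken from this course.

```python
import random

def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows, then cut them into train, validation, and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]                 # held back for the "final exam"
    val = shuffled[n_test:n_test + n_val]    # used to tune model settings
    train = shuffled[n_test + n_val:]        # used to fit the model
    return train, val, test

patients = list(range(100))  # hypothetical patient IDs
train, val, test = train_val_test_split(patients)
print(len(train), len(val), len(test))
```

Settings are compared on `val`; `test` is touched once, at the very end.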

📉

Level 5: Model Error

Underfitting

When a model is too simple to capture the complexity of the data.

Metaphor Studying only Chapter 1 and failing the whole medical exam.

📈

Level 5: Model Error

Overfitting

When a model memorizes "noise" instead of learning the actual trend.

Metaphor Memorizing exact test answers but not understanding the topic.

🎯

Level 5: The Balance

Bias vs Variance

The trade-off between strict assumptions and sensitivity to data noise.

Goal Finding the "Goldilocks" model: not too loose, not too rigid.

🔄

Level 5: Robustness

Cross-Validation (K-Fold)

Rotating data so every piece is used for both training and testing.

Benefit Reduces the risk that a single "lucky" or "unlucky" data split distorts the result.
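The rotation can be sketched as an index generator. This is an illustrative version without shuffling; the function name and the 10-sample, 5-fold setup are assumptions.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(10, 5))
for train_idx, test_idx in folds:
    print("test on", test_idx, "train on", len(train_idx), "samples")
```

A model would be fitted once per fold and the K test scores averaged.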

🧩

Level 6: Modern Tech

Explainable AI (XAI)

Opening the "Black Box" to show which inputs drove an AI decision and why.

Healthcare Role Understanding why a model flagged a patient for high risk.

🛡️

Level 6: Responsibility

Ethics in AI

Ensuring algorithms are fair, unbiased, and respectful of human dignity.

Goal Preventing bias against specific patient demographics.

👥

Level 7: Architecture

Ensemble Learning

Combining many models together to improve overall accuracy and stability.

Example A "Random Forest" is an ensemble of many Decision Trees.
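A toy sketch of the voting idea. The three one-rule "stumps" and their cut-offs are hypothetical; a real Random Forest would learn full trees from resampled data rather than use hand-written rules.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the individual model predictions."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical one-rule models; the thresholds are illustrative only.
def stump_age(p):
    return "high" if p["age"] > 60 else "low"

def stump_bmi(p):
    return "high" if p["bmi"] > 30 else "low"

def stump_sbp(p):
    return "high" if p["sbp"] > 140 else "low"

def ensemble_predict(patient):
    votes = [model(patient) for model in (stump_age, stump_bmi, stump_sbp)]
    return majority_vote(votes)

patient = {"age": 67, "bmi": 27, "sbp": 150}
print(ensemble_predict(patient))  # two of three stumps vote "high"
```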

🧠

Level 7: Neural Nets

Deep Learning (DL)

Complex networks inspired by the human brain for processing high-dimensional data.

Strength Identifying diseases in medical images (X-rays, MRIs).

🌐

Level 7: Privacy

Federated Learning

Training AI across institutions without ever moving raw patient data.

Healthcare Role Safe collaboration between hospitals while keeping records local.
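One common scheme is to have each site train locally and share only model weights, which a coordinator averages in proportion to each site's sample count. The sketch below assumes that scheme; the weight vectors and patient counts are made up.

```python
def federated_average(site_updates):
    """Average weight vectors, weighting each site by its number of samples.

    site_updates: list of (weights, n_samples) tuples. Only these summaries
    leave each site; the raw patient records stay local.
    """
    total = sum(n for _, n in site_updates)
    dim = len(site_updates[0][0])
    return [sum(w[i] * n for w, n in site_updates) / total for i in range(dim)]

# Hypothetical updates from two hospitals.
hospital_a = ([1.0, 2.0], 100)
hospital_b = ([3.0, 4.0], 300)
global_weights = federated_average([hospital_a, hospital_b])
print(global_weights)
```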

⚡

Level 8: The Frontier

Transformers

The architecture behind most modern language models; it weighs which parts of the input are most relevant to each other.

Concept Uses "Self-Attention" to understand long-range context in data.
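A stripped-down sketch of scaled dot-product self-attention. As a simplifying assumption, each token vector serves as its own query, key, and value; a real Transformer applies learned projections first. The 2-D token vectors are invented.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Return (attention weights, output vectors) for a list of token vectors."""
    d = len(vectors[0])
    weights, outputs = [], []
    for q in vectors:
        # Similarity of this token to every token, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in vectors]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted mix of all token vectors.
        outputs.append([sum(w[j] * vectors[j][i] for j in range(len(vectors)))
                        for i in range(d)])
    return weights, outputs

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # first and third tokens are alike
weights, outputs = self_attention(tokens)
print([round(w, 3) for w in weights[0]])
```

The first token puts equal weight on itself and the matching third token, and less on the second.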

📚

Level 8: Scale

Large Language Models

Massive models trained on very large collections of text to process and generate language.

Capability Can summarize medical journals or bridge language barriers.

🗨️

Level 8: Conversation

GPT Models

A specific type of Transformer designed to generate coherent, human-like text.

Status Powers ChatGPT; similar Transformer-based models power Claude and specialized medical assistants.

🎨

Level 8: Creativity

Generative AI (GenAI)

The ability of AI to create entirely new content (text, images, synthetic data).

Innovation Generating new drug candidates or synthetic training data.

📊 Intuitive Concept Visualizations

Visual aids to grasp complex statistical trade-offs in Machine Learning.

Overfitting vs Underfitting

Watch how model complexity captures noise.

Control: Model Complexity (Degree), ranging from Simple (Bias) to Complex (Variance). Default: degree 1 (Underfit).

Accuracy vs Precision

Bias (Accuracy) vs Variance (Precision).

Controls: Target Accuracy (1/Bias), ranging from High Bias to Accurate; Shot Precision (1/Variance), ranging from Unpredictable to Precise. Default: both High (Perfect, Ideal).

Accuracy (inverse of Bias) ➔ Better Fit

Precision (inverse of Variance) ➔ Consistency

Ethical AI and Visual Understanding of ML

Hands-on demonstrations of where clinical AI fails, misleads, and discriminates.

Linear Regression

Predicting continuous health markers like SBP from BMI and Age.

Clinical Use Chronic Disease Progression Modeling.
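A minimal least-squares sketch with a single feature (BMI only, to keep the arithmetic visible; the demo described above also uses Age). The four data points are invented and lie exactly on a line.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) minimizing squared error for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope

bmi = [20.0, 25.0, 30.0, 35.0]      # hypothetical values
sbp = [110.0, 120.0, 130.0, 140.0]  # hypothetical values, in mmHg
intercept, slope = fit_simple_linear_regression(bmi, sbp)
predicted_sbp = intercept + slope * 28.0
print(intercept, slope, predicted_sbp)
```

The slope reads as "expected change in SBP per one-unit increase in BMI" for this toy data.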

Logistic Regression

Visualizing probability thresholds for binary risk (e.g., Diabetes).

Clinical Use Diagnostic Risk Stratification.
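A sketch of how a probability threshold turns a logistic model's output into a Yes/No call. The intercept, coefficient, and the choice of HbA1c as the single feature are hypothetical, not fitted values.

```python
import math

def predict_probability(x, intercept, coef):
    """Logistic model: squash the linear score into a 0-1 probability."""
    z = intercept + coef * x
    return 1.0 / (1.0 + math.exp(-z))

def classify(probability, threshold=0.5):
    return probability >= threshold

# Hypothetical coefficients for a one-feature model (x = HbA1c in %).
INTERCEPT, COEF = -6.0, 1.0

p = predict_probability(5.5, INTERCEPT, COEF)
print(round(p, 3), classify(p, 0.5), classify(p, 0.3))
```

Lowering the threshold flags more patients as at risk, which catches more true cases at the cost of more false alarms.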

Decision Trees

Interactive logic flows for cancer triage and diagnostic rules.

Clinical Use Emergency Room Triage Protocols.

Random Forest

The "Consensus Board": aggregating multiple diverse tree models.

Clinical Use Multidisciplinary Tumour Boards.

Mastering the trade-off between Sensitivity and Specificity.

Clinical Use Screening Test Accuracy Benchmarking.
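The trade-off can be seen by sweeping a decision threshold over model scores. The six scores and labels below are invented for illustration.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when scores >= threshold count as positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / (tp + fn), tn / (tn + fp)

scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]  # hypothetical model outputs
labels = [0, 0, 1, 0, 1, 1]              # 1 = disease present

results = {t: sensitivity_specificity(scores, labels, t) for t in (0.2, 0.5, 0.8)}
for t, (sens, spec) in results.items():
    print(f"threshold={t}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

As the threshold rises, sensitivity falls and specificity climbs.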

Precision & Recall

Understanding "Missing a Diagnosis" vs "False Alarms".

Clinical Use Rare Disease Screening Optimization.
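A short worked example of why precision matters for rare diseases. The screening counts are hypothetical: 1,000 patients, 10 with the disease, 50 flagged by the model, 8 of them correctly.

```python
def precision_recall(tp, fp, fn):
    """Precision: share of alarms that are real. Recall: share of cases caught."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

tp = 8    # diseased patients correctly flagged
fp = 42   # healthy patients flagged ("false alarms")
fn = 2    # diseased patients missed ("missed diagnoses")

precision, recall = precision_recall(tp, fp, fn)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

The model catches most cases, yet most of its alarms are false, a typical pattern when the condition is rare.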

Bias-Variance Trade-off

Visualizing the danger of Over-diagnosis vs Under-diagnosis.

Clinical Use Model Generalizability in Health Data.

Cross-Validation (K-Fold)

How rotating data ensures clinical reliability and robustness.

Clinical Use Ensuring Trial Results Apply to All Patients.

Neural Net Complexity

Why larger models sometimes perform better (Double Descent).

Clinical Use Precision Medicine & Genomic Transformers.

Train a multi-layer neural network epoch-by-epoch. Watch train vs validation loss curves and detect overfitting live.

Clinical Use Medical Imaging · Histopathology · ECG Analysis

Transformer Attention

Explore self-attention on a clinical note. Click any token to see which terms the model focuses on for diagnosis.

Clinical Use Clinical NLP · EHR Summarisation · Radiology Reports

Reinforcement Learning

Train a Q-learning agent to optimise sepsis vasopressor dosing. Watch cumulative reward grow as the agent learns.

Clinical Use ICU Treatment Protocols · Sepsis Management · Drug Dosing
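The core of tabular Q-learning is a one-line update rule. The sketch below applies it twice to a toy table; the state names, action names, reward, and learning settings are all hypothetical and carry no clinical meaning.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    best_next = max(q_table[next_state].values())  # value of the best next action
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]

states = ["low_bp", "normal_bp"]          # hypothetical patient states
actions = ["increase_dose", "hold_dose"]  # hypothetical actions
q_table = {s: {a: 0.0 for a in actions} for s in states}

first = q_update(q_table, "low_bp", "increase_dose", reward=1.0, next_state="normal_bp")
second = q_update(q_table, "low_bp", "increase_dose", reward=1.0, next_state="normal_bp")
print(first, second)
```

Repeated rewarded experience pulls the stored value toward the target, which is how cumulative reward grows over training.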

K-Means Clustering

Discover hidden patient subgroups without labels. Watch centroids migrate and find the optimal K using the Elbow method.

Clinical Use Patient Phenotyping · Disease Subtyping · Risk Stratification
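A one-dimensional sketch of the assign-then-update loop behind K-Means. The HbA1c-like values and the starting centroids are made up, and K is fixed at 2 rather than chosen with the Elbow method.

```python
def k_means_1d(values, centroids, n_iter=10):
    """Alternate between assigning points to the nearest centroid and re-centering."""
    clusters = [[] for _ in centroids]
    for _ in range(n_iter):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

values = [5.1, 5.3, 5.6, 8.9, 9.2, 9.4]  # hypothetical measurements
centroids, clusters = k_means_1d(values, centroids=[5.0, 6.0])
print(centroids, clusters)
```

The second centroid starts near the low group and migrates toward the high group within two iterations.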

Hierarchical Clustering

Build a taxonomy of patient data. Drag the cut threshold to define clusters and explore different linkage strategies.

Clinical Use Gene Expression Analysis · Biomarker Discovery · Taxonomy

PCA Dimensionality

Simplify high-dimensional clinical data. Project many variables onto Principal Components while preserving variance.

Clinical Use Biomarker Compression · Visualising High-Dim Data · EHR Noise Reduction
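For two variables, the first principal component has a closed form, which makes the idea easy to check by hand. This sketch assumes 2-D data only; the four points (two correlated markers) are invented.

```python
import math

def first_principal_component(points):
    """Return (unit direction, share of variance explained) for 2-D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    var_x = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    var_y = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    cov_xy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Angle of the major axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    direction = (math.cos(theta), math.sin(theta))
    # Largest eigenvalue = variance captured along that direction.
    top = (var_x + var_y) / 2 + math.sqrt(((var_x - var_y) / 2) ** 2 + cov_xy ** 2)
    return direction, top / (var_x + var_y)

points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]  # hypothetical markers
direction, explained = first_principal_component(points)
print(direction, round(explained, 3))
```

Because the two markers move together, a single component keeps nearly all of the variance.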

Machine Learning Pipeline

Fundamentals for Researchers: From Raw Data to Predictive Models.

1. Select Prototype
2. Dataset
3. Split & Train
4. Validation
5. Prediction

🤖 Step 1: Choose Your Prototype

Machine learning in medicine usually starts with choosing a model type based on the question being asked: predicting a continuous value (Regression) or the presence or absence of a condition (Classification).

Supervised Learning (Predicting Outcomes)

📈

Linear Regression

Continuous outcomes (e.g. SBP).

🎯

Logistic Regression

Binary categories (e.g. Disease Risk).

🌲

Random Forest

Robust ensemble classification.

🧠

Neural Networks (ANN)

Complex non-linear patterns.

🌳

Decision Trees

Logical flows for binary splits.

Unsupervised Learning (Finding Patterns)

🪐

K-Means Clustering

Grouping unlabelled data by distance.

🧬

PCA Analysis

Reducing dimensions to find core variance.

🌿

Hierarchical

Nested grouping by distance hierarchy.

🌊

DBSCAN Clustering

Density-based grouping with noise handling.

📊 Step 2: Prepare Dataset

ML models need "Features" (Inputs) and "Targets" (Labels). We've loaded a clinical prototype dataset for you.

Controls: Data Source, Target Variable (Y), Feature Variable (X), and Data Preview.

⚖️ Step 3: Train-Test Split & Optimization

To evaluate performance, we split data into Training (to teach patterns) and Testing (to assess accuracy). Choose your preferred split ratio below.

Data Split Ratio: 80% Train / 20% Test

Training Set (80%) ➡️ Testing Set (20%)

✅ Step 4: Performance & Model Export (.pkl)

Model Coefficients

Diagnostic Metrics

Export Logic (.pkl)

A .pkl (Pickle) file is a serialized copy of your trained model, including its coefficients (Beta values) and variables. Loading it lets you "deploy" the model and make predictions without re-training.

Note: This is a simulation showing how real-world ML files are generated.
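A minimal sketch of the save-and-reload round trip using Python's built-in `pickle` module. The model here is a plain dictionary with hypothetical coefficients, standing in for whatever object a real training library would produce.

```python
import pickle

# Hypothetical trained model: an intercept plus one Beta value per feature.
model = {"intercept": 70.0, "coefficients": {"bmi": 2.0}}

blob = pickle.dumps(model)     # serialize to bytes (pickle.dump writes to a file instead)
restored = pickle.loads(blob)  # later, or on another machine: load without re-training

def predict(model, features):
    """Linear prediction: intercept + sum of coefficient * feature value."""
    return model["intercept"] + sum(
        model["coefficients"][name] * value for name, value in features.items()
    )

prediction = predict(restored, {"bmi": 30.0})
print(prediction)
```

Pickle files can execute code when loaded, so they should only be opened when they come from a trusted source.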

🔮 Step 5: Predictive Deployment

Using the deployed model, we can now enter new, unseen feature values to predict outcomes.

Feature Input:

Predicted Outcome
