Understanding Basic Statistical Testing

Learn the foundations of biostatistics, upload your dataset, and perform automated hypothesis tests.

🚦 Statistics for Beginners: The Bare Minimum

1. The Baseline Assumption

(The "Null Hypothesis")

Assume nothing interesting is happening.

The drug does nothing.
The patient is perfectly healthy.
There is no difference between groups.

2. The P-Value

(The Surprise Score)

If the Baseline Assumption is true, how likely is it that chance alone would produce data at least as extreme as ours?

A P-value of 0.03 means that, if the Baseline Assumption were true, we would see a result this extreme only 3% of the time. We are very surprised!
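
To make the Surprise Score concrete, here is a minimal sketch using a made-up coin example (the coin and the counts are assumptions for illustration, not part of EpiSense). The Baseline Assumption is "the coin is fair", and the P-value adds up the probability of every outcome at least as unlikely as the one observed.

```python
from math import comb

def binomial_two_sided_p(n_flips: int, n_heads: int) -> float:
    """Exact two-sided p-value under the baseline 'the coin is fair'."""
    # Probability of each possible head count if the coin is fair.
    probs = [comb(n_flips, k) / 2 ** n_flips for k in range(n_flips + 1)]
    observed = probs[n_heads]
    # Add up every outcome that is at least as unlikely as the observed one.
    return sum(p for p in probs if p <= observed)

# Hypothetical observation: 9 heads in 10 flips.
p_value = binomial_two_sided_p(10, 9)
print(f"p = {p_value:.4f}")  # p = 0.0215
```

Seeing 9 heads (or something equally lopsided) would happen only about 2% of the time with a fair coin, so the result is surprising.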

3. The Verdict

(The 0.05 Rule)

If p ≤ 0.05:
We likely found something real! Reject the Baseline Assumption.
(Difference is Statistically Significant)

If p > 0.05:
It could be just noise. Keep the Baseline Assumption for now.
(Difference is Not Statistically Significant)
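
The 0.05 Rule can be written as a small helper. This is only a sketch: the function name `verdict` and its messages are hypothetical, and the threshold is left adjustable because 0.05 is a convention rather than a law.

```python
def verdict(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject the baseline when p <= alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value <= alpha:
        return "Reject the Baseline Assumption (statistically significant)"
    # Note the wording: we keep the baseline, we do not prove it.
    return "Keep the Baseline Assumption (not statistically significant)"

print(verdict(0.03))
print(verdict(0.20))
```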

Why do we NEVER say "We proved the Baseline Assumption is true" or "The Null Hypothesis is accepted"?

The Lesson

Imagine a patient arriving at the emergency room with severe abdominal pain. The doctor suspects appendicitis and orders a CT scan or ultrasound to look for evidence of inflammation.

Now, if the scan clearly shows an inflamed appendix, the diagnosis is confirmed. But if the scan does not show convincing signs of appendicitis, it does not prove that the patient is perfectly healthy; it simply means that, based on the current evidence, there is not enough proof to confirm the disease.

This is exactly how a p-value works in hypothesis testing. The baseline assumption (null hypothesis) assumes "no disease" or "no effect." A small p-value suggests that the observed findings would be very unlikely if there were truly no disease, so we reject the null hypothesis. A large p-value, however, does not prove that there is no disease or no effect; it only indicates that the evidence we observed is not strong enough to confidently rule out chance.

In essence, a p-value measures the strength of evidence against the null hypothesis—it does not prove that the null hypothesis is true.
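
The lesson can also be checked with a small simulation. The sketch below assumes a hypothetical study design (10 patients, a real effect of half a standard deviation, a z-test with known standard deviation); none of these numbers come from EpiSense. Even though the effect is real in every simulated study, most of them return p > 0.05.

```python
import random
from statistics import NormalDist, mean

random.seed(42)  # fixed seed so the run is repeatable

TRUE_EFFECT = 0.5   # hypothetical real effect, in standard-deviation units
N_PATIENTS = 10     # hypothetical small study
N_STUDIES = 2000
ALPHA = 0.05

std_normal = NormalDist()
missed = 0
for _ in range(N_STUDIES):
    sample = [random.gauss(TRUE_EFFECT, 1.0) for _ in range(N_PATIENTS)]
    # z-test with known standard deviation 1: z = mean / (1 / sqrt(n))
    z = mean(sample) * N_PATIENTS ** 0.5
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    if p_value > ALPHA:
        missed += 1

miss_rate = missed / N_STUDIES
print(f"Studies that failed to reject the null: {miss_rate:.0%}")
```

A large p-value in any one of these studies would clearly not mean "no effect"; the scan simply was not sensitive enough.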

[Interactive figure: bell curve under the Baseline Assumption (nothing happening), where most results land safely in the middle. Example setting: p = 0.03, Null Hypothesis Rejected (there is a statistically significant difference).]

One-Tailed vs Two-Tailed Test

Imagine you are conducting a clinical trial on a new antihypertensive drug. The standard drug lowers systolic blood pressure by 10 mmHg on average.

Two-Tailed Test: If your question is: “Does the new drug change blood pressure compared to the standard?” — you are open to either possibility: it might lower BP more, or it might unexpectedly increase it. This is like a hospital safety committee that investigates both unusually high and unusually low oxygen levels in ICU patients. You are guarding against deviations in both directions.

One-Tailed Test: Now imagine your drug has shown evidence that it can only lower BP. Your question becomes: “Does the new drug reduce blood pressure more than the standard?” This is like a vaccination campaign where the only meaningful question is whether the vaccine reduces infection rates. Your statistical attention is focused in one direction.

*One-tailed tests should only be chosen when there is strong theoretical or clinical justification before data analysis.
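
A rough sketch of how the tail choice changes the P-value, assuming a test statistic that follows a standard normal distribution (a z-score). The value z = -1.8 is hypothetical.

```python
from statistics import NormalDist

def tail_p_values(z: float) -> dict:
    """One-tailed and two-tailed p-values for a z statistic."""
    std_normal = NormalDist()
    lower = std_normal.cdf(z)     # one-tailed: "is the value lower?"
    upper = 1 - lower             # one-tailed: "is the value higher?"
    two_tailed = 2 * min(lower, upper)
    return {"lower": lower, "upper": upper, "two_tailed": two_tailed}

result = tail_p_values(-1.8)  # hypothetical z-score
for name, p in result.items():
    print(f"{name}: p = {p:.3f}")
```

With z = -1.8 the lower-tail P-value is about 0.036 while the two-tailed P-value is about 0.072, so the same data crosses the 0.05 line only under the one-tailed test. That is why the direction has to be justified before looking at the data.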

What are "Degrees of Freedom"?

The Anemia Patient Analogy: Suppose you are analyzing hemoglobin levels of 5 patients. Once you calculate the mean, it acts like a constraint. The first four patients can vary freely, but the fifth patient's level must balance the others to keep the average fixed. Although you have 5 data points, only 4 are truly "free to vary."

df = n - 1 = 5 - 1 = 4

The Hospital Budget Analogy: Imagine distributing a fixed budget of ₹10 lakhs across five departments. You can allocate funds freely to the first four, but the fifth automatically receives whatever remains to keep the total at ₹10 lakhs. That last allocation has no independent freedom.

The Core Concept:
Degrees of freedom represent the number of independent pieces of information available to estimate variability in a study.
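
The anemia analogy can be verified in a few lines. The hemoglobin values below are made up for illustration. Once the mean is known, the fifth value is forced, and the sample variance divides by df = n - 1 rather than n.

```python
from statistics import mean, variance

hemoglobin = [9.0, 10.0, 11.0, 12.0, 13.0]  # hypothetical values in g/dL
n = len(hemoglobin)
avg = mean(hemoglobin)

# Given the mean and the first four values, the fifth is forced.
fifth = n * avg - sum(hemoglobin[:4])

# Sample variance divides the squared deviations by df = n - 1, not n.
df = n - 1
manual_variance = sum((x - avg) ** 2 for x in hemoglobin) / df

print(f"fifth value must be {fifth}")
print(f"df = {df}, variance = {manual_variance}")
print(f"statistics.variance agrees: {variance(hemoglobin)}")
```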

Step 2: Upload Dataset (CSV)

💡 How to prepare your data:

Your CSV must have headers in the first row. Each column represents one variable, and each row represents one patient. Example format:
Group, Age, BloodPressure, Outcome
Treatment, 45, 120.5, Recovered
Control, 52, 135.2, Not Recovered
Treatment, 38, 118.0, Recovered
Control, 61, 142.8, Not Recovered
Treatment, 41, 122.1, Not Recovered
Control, 55, 138.5, Recovered
Treatment, 33, 115.4, Recovered
Control, 48, 129.0, Recovered
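
If you want to sanity-check a file before uploading, the example above can be read with Python's standard `csv` module. This is a sketch of one way to do it, not a description of how EpiSense parses files; the variable names are assumptions.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

raw = """Group, Age, BloodPressure, Outcome
Treatment, 45, 120.5, Recovered
Control, 52, 135.2, Not Recovered
Treatment, 38, 118.0, Recovered
Control, 61, 142.8, Not Recovered
Treatment, 41, 122.1, Not Recovered
Control, 55, 138.5, Recovered
Treatment, 33, 115.4, Recovered
Control, 48, 129.0, Recovered
"""

# skipinitialspace drops the blank that follows each comma.
reader = csv.DictReader(io.StringIO(raw), skipinitialspace=True)
rows = list(reader)

# One row per patient, one column per variable.
bp_by_group = defaultdict(list)
for row in rows:
    bp_by_group[row["Group"]].append(float(row["BloodPressure"]))

group_means = {group: mean(values) for group, values in bp_by_group.items()}
for group, avg in group_means.items():
    print(f"{group}: mean BloodPressure = {avg:.1f}")
```

For this example the Treatment group averages about 119.0 and the Control group about 136.4.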

Choose a CSV or Excel File:

Or Paste CSV Data Here:


📊 Available Tests in EpiSense

Comparison of Means

Independent T-test (2 groups)
Paired T-test (Before/After)
One-Way ANOVA (3+ groups)
Mann-Whitney / Kruskal-Wallis (Non-parametric)
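
To see what "comparing two independent groups" means in practice, here is a sketch of an exact permutation test on the blood pressure values from the example CSV. A permutation test is a different technique from the t-test and Mann-Whitney test listed above; it is used here only because it needs nothing beyond the standard library. The function name is hypothetical.

```python
from itertools import combinations

treatment = [120.5, 118.0, 122.1, 115.4]
control = [135.2, 142.8, 138.5, 129.0]

def permutation_p_value(a, b):
    """Two-sided exact permutation test on the difference in means."""
    pooled = a + b
    total = sum(pooled)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    extreme = 0
    n_splits = 0
    # Try every way of relabelling the patients into two groups.
    for idx in combinations(range(len(pooled)), len(a)):
        sum_a = sum(pooled[i] for i in idx)
        mean_a = sum_a / len(a)
        mean_b = (total - sum_a) / len(b)
        n_splits += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean_a - mean_b) >= observed - 1e-9:
            extreme += 1
    return extreme / n_splits

p_value = permutation_p_value(treatment, control)
print(f"p = {p_value:.3f}")
```

Every Treatment value is lower than every Control value, so only 2 of the 70 possible relabellings are as extreme as the real one, giving p of about 0.029.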

Relationships & Trends

Pearson's r (Linear Correlation)
Spearman's Rho (Rank Correlation)
Scatter Plot Visualizations
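
The difference between the two correlation measures can be sketched with hand-written helpers (the function names are hypothetical, and this simple ranking does not handle tied values). Pearson's r measures how close the points are to a straight line, while Spearman's rho applies the same formula to ranks, so it rewards any consistently increasing relationship.

```python
def pearson_r(x, y):
    """Pearson's linear correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(values):
    """Rank from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = float(rank)
    return result

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical curved but always-increasing relationship.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Here rho is exactly 1 because y always rises with x, while r is slightly below 1 (about 0.981) because the relationship is curved.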

Categorical Data

Pearson Chi-Square (Groups/Outcomes)
Fisher's Exact Test (Small Samples)
2x2 Contingency Tables
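
As an illustration of a 2x2 table with small counts, the sketch below runs a two-sided Fisher's exact test on Group versus Outcome from the example CSV (Treatment: 3 recovered, 1 not; Control: 2 recovered, 2 not). The helper is a hand-written assumption built on the hypergeometric distribution, not EpiSense's own routine.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1 = a + b
    col1 = a + c
    n = a + b + c + d

    def weight(k):
        # Number of tables with k in the top-left cell and the same margins.
        return comb(col1, k) * comb(n - col1, row1 - k)

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    observed = weight(a)
    # Count every table that is at least as unlikely as the observed one.
    extreme = sum(weight(k) for k in range(lowest, highest + 1)
                  if weight(k) <= observed)
    return extreme / comb(n, row1)

# Rows: Treatment, Control. Columns: Recovered, Not Recovered.
p_value = fisher_exact_two_sided(3, 1, 2, 2)
print(f"p = {p_value:.3f}")
```

With only eight patients the test returns p = 1.0: the observed table is among the most likely ones under the Baseline Assumption, so there is no evidence of a difference in recovery.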

Step 3: Choose Testing Method

Data Distribution & Normality Checker

Before choosing between a parametric and a non-parametric test, check whether your continuous data follow a Normal ("bell curve") distribution.

Select Variable to Test for Normality:

1. Histogram

2. Box Plot Approximation

3. Q-Q Plot
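
A Q-Q plot pairs each sorted data value with the value a Normal distribution would predict at the same position. The sketch below computes those pairs for the blood pressure column of the example CSV. The plotting positions (i - 0.5) / n are one common convention and an assumption here; EpiSense may use a different one.

```python
from statistics import NormalDist, mean, stdev

def qq_points(values):
    """Return (theoretical quantile, observed value) pairs for a Q-Q plot."""
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    # Plotting position (i - 0.5) / n keeps every probability inside (0, 1).
    return [(fitted.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(data, start=1)]

blood_pressure = [120.5, 135.2, 118.0, 142.8, 122.1, 138.5, 115.4, 129.0]
points = qq_points(blood_pressure)
for theoretical, observed in points:
    print(f"expected {theoretical:6.1f}   observed {observed:6.1f}")
```

If the data are roughly Normal, the two columns track each other closely and the plotted points fall near a straight diagonal line.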

Interactive Statistical Test Selector

Not sure which test to run? Answer a few quick questions to find a suitable statistical test for your data.

Available Tests in EpiSense:

Unpaired t-test: Parametric, 2 Independent Groups
Paired t-test: Parametric, 2 Paired Groups
One-Way ANOVA: Parametric, 3+ Groups
Mann-Whitney U: Non-Parametric, 2 Independent Groups
Wilcoxon Signed-Rank: Non-Parametric, 2 Paired Groups
Kruskal-Wallis: Non-Parametric, 3+ Groups
Chi-Square: Categorical vs Categorical
Pearson Correlation: Parametric, Continuous vs Continuous
Spearman Correlation: Non-Parametric, Continuous/Ordinal
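
The list above amounts to a small decision tree. The sketch below encodes it as a function; the function name, the argument names, and the goal labels ("compare", "relationship", "categorical") are hypothetical and only mirror the kind of questions a selection wizard asks.

```python
def recommend_test(goal, groups=2, paired=False, normal=True):
    """Map answers about the data to one of the tests listed above."""
    if goal == "categorical":
        return "Chi-Square"
    if goal == "relationship":
        return "Pearson Correlation" if normal else "Spearman Correlation"
    if goal != "compare":
        raise ValueError(f"unknown goal: {goal!r}")
    if groups >= 3:
        return "One-Way ANOVA" if normal else "Kruskal-Wallis"
    if groups == 2:
        if paired:
            return "Paired t-test" if normal else "Wilcoxon Signed-Rank"
        return "Unpaired t-test" if normal else "Mann-Whitney U"
    raise ValueError("comparing means needs at least 2 groups")

print(recommend_test("compare", groups=2, paired=False, normal=True))
print(recommend_test("compare", groups=3, normal=False))
print(recommend_test("relationship", normal=False))
```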

Test Selection Wizard

1. What is the primary goal of your analysis?

3. Statistical Hypothesis Testing

Select Statistical Test: Select Test Tail (Directionality):

Test Result Overview

Test Statistic
P-Value
Interpretation (α = 0.05)

Mathematical Normality Tests

1. What is the primary goal of your analysis?

2. How many groups are you comparing?

3. Are the samples independent or paired?

Final Question: Does your continuous data map to a Normal Distribution?

🎯 Recommended Statistical Test

Mathematical Context

Test Results
