SUPPORT2 Medical Survival Analysis

OVERVIEW

This project explores the SUPPORT2 medical dataset, which contains detailed clinical and demographic information from critically ill patients across five US hospitals. The aim was to uncover patterns related to survival outcomes using Initial Data Analysis (IDA) and Exploratory Data Analysis (EDA). My role centred on preparing and analysing the dataset through data cleaning, handling missing and inconsistent values, correcting data types, detecting outliers, and producing the early visualisations that informed the group’s deeper statistical findings.

YEAR

2024

ROLE

Data Analyst
EDA Specialisation

SERVICES

Preliminary / Exploratory Data Analysis
Data Cleaning

About the project

This assignment required performing a full IDA and EDA workflow on the SUPPORT2 clinical dataset, focusing on patient survival outcomes and underlying risk factors. My contribution began with inspecting the dataset structure, managing missing values, correcting inconsistent data types, rounding float values, and removing non-informative or corrupted entries.
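The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the rows, the column names, the set of tokens treated as missing, and the two-decimal rounding are all assumptions made for the example.

```python
# Hypothetical raw rows; the column names and values are illustrative only.
RAW_ROWS = [
    {"age": "62.84998", "charges": "9715.0", "income": "$11-$25k"},
    {"age": "60.33899", "charges": "", "income": ""},
    {"age": "n/a", "charges": "41094.0", "income": "under $11k"},
    {"age": "", "charges": "", "income": ""},  # non-informative: every field is empty
]

NUMERIC_COLUMNS = ("age", "charges")              # assumed to hold numbers
MISSING_TOKENS = {"", "na", "n/a", "nan", "null"}  # assumed spellings of "missing"


def to_float_or_none(text, ndigits=2):
    """Convert text to a rounded float, or None when missing or unparseable."""
    cleaned = text.strip().lower()
    if cleaned in MISSING_TOKENS:
        return None
    try:
        return round(float(cleaned), ndigits)
    except ValueError:
        return None


def clean_rows(rows):
    """Fix data types, round floats, and drop rows that carry no information."""
    result = []
    for row in rows:
        new_row = {}
        for column, text in row.items():
            if column in NUMERIC_COLUMNS:
                new_row[column] = to_float_or_none(text)
            else:
                stripped = text.strip()
                is_missing = stripped.lower() in MISSING_TOKENS
                new_row[column] = None if is_missing else stripped
        # Keep the row only if at least one field survived cleaning.
        if any(value is not None for value in new_row.values()):
            result.append(new_row)
    return result


def missing_counts(rows):
    """Count missing (None) values per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


cleaned = clean_rows(RAW_ROWS)
print(len(cleaned), "rows kept")
print(missing_counts(cleaned))
```

Counting missing values per column after cleaning gives a quick view of which variables may need imputation or removal before any further analysis.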


I analysed descriptive statistics such as mean, median, mode, and standard deviation to gain an initial understanding of the variables. I then generated the first suite of visualisations, including multi-variable boxplots, which revealed substantial outliers across income, charges, lab measurements, and physiological variables. These findings guided key data cleaning decisions and prepared the dataset for deeper analysis.
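As a rough illustration of the descriptive statistics and the outlier check that a boxplot encodes, the sketch below uses Python's standard `statistics` module. The sample values are hypothetical, and the 1.5 × IQR fence is the common boxplot whisker convention, assumed here rather than taken from the project itself.

```python
import statistics


def describe(values):
    """Return the summary statistics named in the text."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "std": statistics.stdev(values),  # sample standard deviation
    }


def iqr_outliers(values, k=1.5):
    """Return the values lying outside the k * IQR fences around the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical measurements with one extreme value.
sample = [10, 12, 12, 13, 14, 15, 16, 18, 95]

summary = describe(sample)
print(summary)
print("outliers:", iqr_outliers(sample))
```

In this sample the single extreme value pulls the mean well above the median, which is the kind of skew that makes a boxplot more informative than the mean alone.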

