
Skill Guide

Feature engineering from HRV, SpO₂, accelerometry, and skin temperature signals

The process of extracting, selecting, and transforming raw physiological time-series data from wearable sensors into informative, non-redundant variables (features) for use in predictive modeling.

It is the critical bridge between raw sensor noise and actionable clinical or user insights, directly determining the accuracy and reliability of models for health monitoring, disease detection, and performance optimization. High-quality feature engineering reduces computational load and model complexity, accelerating time-to-market for digital health products.
1 Careers
1 Categories
9.0 Avg Demand
20% Avg AI Risk

How to Learn Feature engineering from HRV, SpO₂, accelerometry, and skin temperature signals

1. Master signal fundamentals: Understand sampling rates, noise profiles, and common artifacts for HRV, SpO₂, accelerometry, and skin temperature.
2. Learn core feature domains: Time-domain (e.g., mean, SDNN for HRV), frequency-domain (e.g., LF/HF ratio for HRV), and basic motion metrics (e.g., vector magnitude for accelerometry).
3. Implement a basic pipeline in Python using libraries like NeuroKit2 or hrv-analysis to compute a feature set from a public dataset (e.g., WESAD).

1. Focus on artifact handling: Implement robust techniques like interpolation, adaptive filtering, or using accelerometer data to identify and exclude motion-corrupted segments in HRV/SpO₂.
2. Explore cross-signal features: Engineer features that combine signals, such as activity-corrected HRV metrics or temperature-adjusted SpO₂ readings.
3. Apply feature selection methods (e.g., mutual information, recursive feature elimination) to reduce dimensionality and avoid overfitting.

1. Architect scalable feature pipelines: Design systems for real-time feature extraction on edge devices, optimizing for memory and computational constraints.
2. Develop domain-specific feature sets: Create specialized indices for specific use cases (e.g., a stress score combining HRV, EDA proxies from temperature, and activity).
3. Mentor teams on best practices for feature validation, ensuring features are physiologically plausible and reproducible across different sensor hardware.
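As an illustration of the frequency-domain features listed above, the following is a minimal sketch of an LF/HF estimate using only NumPy. It assumes a 4 Hz resampling rate, a Hann-tapered periodogram rather than the more common Welch estimate, and the conventional 0.04-0.15 Hz (LF) and 0.15-0.40 Hz (HF) bands. The `simulate_nn` helper is hypothetical and only produces a test signal.

```python
import numpy as np

def lf_hf_ratio(nn_ms, fs=4.0):
    """Estimate LF/HF from NN intervals (ms) with a plain periodogram."""
    nn_ms = np.asarray(nn_ms, dtype=float)
    # Beat times in seconds: cumulative sum of the intervals.
    t = np.cumsum(nn_ms) / 1000.0
    # Resample the unevenly spaced tachogram onto a uniform grid (assumed 4 Hz).
    grid = np.arange(t[0], t[-1], 1.0 / fs)
    x = np.interp(grid, t, nn_ms)
    x = x - x.mean()                  # remove the DC component
    x = x * np.hanning(len(x))        # taper to limit spectral leakage
    psd = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    lf = psd[(freqs >= 0.04) & (freqs < 0.15)].sum()
    hf = psd[(freqs >= 0.15) & (freqs < 0.40)].sum()
    return lf / hf

def simulate_nn(mod_hz, duration_s=300.0):
    """Hypothetical test signal: 800 ms beats modulated by one sine wave."""
    t, nn = 0.0, []
    while t < duration_s:
        interval = 800.0 + 50.0 * np.sin(2 * np.pi * mod_hz * t)
        nn.append(interval)
        t += interval / 1000.0
    return np.array(nn)

slow = lf_hf_ratio(simulate_nn(0.10))   # modulation inside the LF band
fast = lf_hf_ratio(simulate_nn(0.25))   # modulation inside the HF band
print(slow > 1.0, fast < 1.0)
```

A library such as NeuroKit2 would normally handle resampling, detrending, and band integration; the sketch only makes the individual steps visible.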

Practice Projects

Beginner
Project

Basic HRV & Motion Feature Pipeline

Scenario

You have a 1-hour recording from a chest-worn device containing ECG-derived R-peaks and 3-axis accelerometry. The goal is to extract features that might indicate rest vs. light activity.

How to Execute
1. Load the data and compute R-R intervals from the R-peak timestamps.
2. Clean the R-R series into NN intervals by removing physiologically implausible values (e.g., >2000 ms, or implausibly short intervals).
3. Calculate time-domain HRV features (mean HR, SDNN, RMSSD) and basic accelerometry features (mean and variance of vector magnitude).
4. Segment the data into 5-minute windows and compare feature distributions between visually identified rest and activity segments.
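The cleaning and feature steps above could be sketched as follows, using only NumPy. The 2000 ms upper bound comes from the project description; the 300 ms lower bound and the function names are assumptions made for illustration.

```python
import numpy as np

def clean_nn(rr_ms, lo=300.0, hi=2000.0):
    """Drop intervals outside an assumed plausible range (lo is an assumption)."""
    rr = np.asarray(rr_ms, dtype=float)
    return rr[(rr >= lo) & (rr <= hi)]

def hrv_time_features(nn_ms):
    """Mean heart rate, SDNN, and RMSSD for one window of NN intervals."""
    nn = np.asarray(nn_ms, dtype=float)
    # Note: after removing intervals, some differences span a gap.
    diffs = np.diff(nn)
    return {
        "mean_hr_bpm": 60000.0 / nn.mean(),
        "sdnn_ms": nn.std(ddof=1),
        "rmssd_ms": np.sqrt(np.mean(diffs ** 2)),
    }

def accel_features(xyz):
    """Mean and variance of the vector magnitude of an (n, 3) accelerometer array."""
    vm = np.linalg.norm(np.asarray(xyz, dtype=float), axis=1)
    return {"vm_mean": vm.mean(), "vm_var": vm.var(ddof=1)}

nn = clean_nn([800, 810, 790, 2500, 805])    # the 2500 ms interval is removed
print(hrv_time_features(nn))
print(accel_features([[0, 0, 1], [3, 4, 0]]))
```

For the 5-minute comparison, the same functions would be applied to each window separately, so that every window yields one row of features.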
Intermediate
Project

Motion-Robust SpO₂ Desaturation Detection

Scenario

You are given wrist-worn PPG (for SpO₂) and accelerometry data collected during sleep. Motion artifacts severely corrupt the raw PPG signal. The task is to build a feature set that can reliably identify segments of true desaturation vs. motion-induced drops.

How to Execute
1. Compute the instantaneous SpO₂ estimate and the accelerometer vector magnitude in 30-second epochs.
2. Engineer a feature for 'motion intensity' (e.g., maximum accelerometer variance within the epoch).
3. Create a feature for 'signal quality' using the PPG amplitude or a dedicated SQI algorithm.
4. Train a simple classifier (e.g., logistic regression) using SpO₂, motion intensity, and signal quality as features to label epochs as 'valid' or 'artifact'.
5. Apply this classifier to flag and exclude unreliable desaturation events.
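A rough sketch of the epoch features and the artifact classifier follows. It assumes all three signals are already sampled at 1 Hz, uses the per-epoch variance of the vector magnitude as a simplified motion-intensity feature, uses mean PPG amplitude as a stand-in for a real SQI, and fits logistic regression by plain gradient descent instead of a library solver. The `synthetic_night` helper generates made-up data purely to exercise the code.

```python
import numpy as np

def epoch_features(spo2, accel_vm, ppg_amp, fs=1.0, epoch_s=30):
    """Per-epoch rows of [mean SpO2, motion intensity, signal-quality proxy]."""
    n = int(fs * epoch_s)
    rows = []
    for i in range(len(spo2) // n):
        s = slice(i * n, (i + 1) * n)
        rows.append([np.mean(spo2[s]), np.var(accel_vm[s]), np.mean(ppg_amp[s])])
    return np.array(rows)

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Logistic regression on z-scored features via gradient descent."""
    mu, sd = X.mean(axis=0), X.std(axis=0)
    Z = (X - mu) / sd
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        w -= lr * Z.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b, mu, sd

def predict_artifact(X, model):
    """True where the model labels an epoch as 'artifact'."""
    w, b, mu, sd = model
    return 1.0 / (1.0 + np.exp(-(((X - mu) / sd) @ w + b))) > 0.5

def synthetic_night(n_epochs=40, n=30, seed=0):
    """Made-up signals: odd epochs are motion-corrupted (label 1)."""
    rng = np.random.default_rng(seed)
    labels = np.arange(n_epochs) % 2
    spo2, vm, amp = [], [], []
    for lab in labels:
        spo2.append(rng.normal(90 if lab else 97, 0.5, n))
        vm.append(1.0 + rng.normal(0, 0.5 if lab else 0.02, n))
        amp.append(rng.normal(0.3 if lab else 1.0, 0.05, n))
    return np.concatenate(spo2), np.concatenate(vm), np.concatenate(amp), labels

spo2, vm, amp, labels = synthetic_night()
X = epoch_features(spo2, vm, amp)
model = fit_logistic(X, labels)
accuracy = np.mean(predict_artifact(X, model) == labels)
print(accuracy)
```

On real recordings the labels would come from manual annotation or a reference oximeter, and accuracy would be measured on held-out nights rather than on the training epochs.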
Advanced
Project

Multi-Modal Stress Index Development

Scenario

Design and validate a composite stress score for a wearable product, using HRV, skin temperature, and accelerometry data collected in a controlled lab study with psychological stress tasks (e.g., Trier Social Stress Test).

How to Execute
1. Extract a broad feature set from each modality for each stress epoch (e.g., HRV LF/HF, temperature slope, activity level).
2. Use a gold-standard stress measure (e.g., cortisol, self-report) as the target.
3. Employ an interpretable model (e.g., regularized regression) to identify the most predictive features and assign weights.
4. Validate the composite index on an out-of-sample cohort, assessing its ability to discriminate between stress, recovery, and baseline states.
5. Optimize the feature set and model for real-time inference on the target wearable platform.
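One way to picture the weighting and out-of-sample steps is a closed-form ridge regression on z-scored features. The sketch below uses synthetic data in which the three columns stand in for hypothetical features (LF/HF, temperature slope, activity level) and the target stands in for a reference stress measure; none of the numbers are real study results.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression on z-scored features."""
    mu, sd = X.mean(axis=0), X.std(axis=0)
    Z = (X - mu) / sd
    y0 = y.mean()
    w = np.linalg.solve(Z.T @ Z + alpha * np.eye(Z.shape[1]), Z.T @ (y - y0))
    return {"w": w, "mu": mu, "sd": sd, "intercept": y0}

def stress_index(X, model):
    """Composite index: weighted sum of standardized features plus intercept."""
    return ((X - model["mu"]) / model["sd"]) @ model["w"] + model["intercept"]

# Synthetic stand-in data: the third feature is deliberately uninformative.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(0, 0.1, 300)

model = fit_ridge(X[:200], y[:200])           # "training cohort"
held_out = stress_index(X[200:], model)       # "out-of-sample cohort"
print(model["w"])
print(np.corrcoef(held_out, y[200:])[0, 1])
```

Because the features are standardized before fitting, the weights are directly comparable, which is what makes the resulting index interpretable. In a real study the split should be by participant, not by epoch, so that no individual appears in both cohorts.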

Tools & Frameworks

Software & Libraries

Python (NumPy, SciPy, Pandas), NeuroKit2, hrv-analysis, tsfresh, PyWavelets

The core stack for signal processing, statistical feature computation, and automated feature extraction. NeuroKit2 is particularly strong for HRV and signal cleaning. tsfresh automates the extraction of hundreds of time-series features.

Methodologies & Frameworks

Public datasets (e.g., WESAD, PhysioNet collections), Time-Frequency Analysis (STFT, Wavelets), Artifact Rejection Protocols, Feature Selection Techniques (Filter, Wrapper, Embedded)

Use public datasets for benchmarking. Time-frequency methods are essential for HRV spectral analysis. Systematic artifact rejection and feature selection are non-negotiable for building robust models.

Hardware & Deployment

Edge TFLite/Micro, On-device signal processing (e.g., accelerometer-based activity recognition), Sensor fusion algorithms

For advanced roles, understanding the constraints of deploying feature extraction algorithms on low-power microcontrollers within wearables is critical. Sensor fusion improves the reliability of individual features.
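To make the memory constraint concrete, here is a small sketch of a streaming mean and variance (Welford's algorithm) that keeps three numbers in memory regardless of window length. It is written in Python for readability; on an actual microcontroller the same logic would typically be ported to C. The class name is illustrative only.

```python
class RunningStats:
    """Constant-memory running mean and sample variance (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0   # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for magnitude in [1.0, 2.0, 3.0, 4.0]:   # e.g. accelerometer vector magnitudes
    stats.update(magnitude)
print(stats.mean, stats.variance)
```

The design choice here is to avoid buffering raw samples: each new accelerometer reading updates the feature in place, so the cost per sample is a handful of floating-point operations.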

Interview Questions

Answer Strategy

The interviewer is testing knowledge of signal physiology, artifact handling, and feature discrimination. Start by acknowledging the challenge of motion. Sample Answer: 'The core challenge is distinguishing the irregular rhythm of true AFib from motion artifacts. I'd start with HRV features from PPG-derived pulse intervals, focusing on irregularity metrics like RMSSD, pNN50, and the Shannon entropy of the RR interval histogram. Crucially, I'd use the accelerometer to compute a motion intensity index. Segments with high motion would be flagged, and we'd either exclude them or use a model that explicitly accounts for motion as a covariate. We might also look at PPG waveform morphological features, but these are highly sensitive to motion, so signal quality indices would be mandatory.'
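The irregularity metrics named in the sample answer can be computed in a few lines. The sketch below is illustrative only: the 50 ms histogram bin width is an assumption, and the thresholds that would separate AFib from sinus rhythm are not specified here.

```python
import numpy as np

def irregularity_features(ibi_ms, bin_ms=50.0):
    """RMSSD, pNN50, and Shannon entropy of the interval histogram."""
    ibi = np.asarray(ibi_ms, dtype=float)
    d = np.diff(ibi)
    rmssd = np.sqrt(np.mean(d ** 2))
    pnn50 = 100.0 * np.mean(np.abs(d) > 50.0)
    # Fixed-width bins (assumed 50 ms) covering the observed range.
    edges = np.arange(ibi.min(), ibi.max() + 2 * bin_ms, bin_ms)
    counts, _ = np.histogram(ibi, bins=edges)
    p = counts[counts > 0] / counts.sum()
    entropy = -np.sum(p * np.log2(p))
    return {
        "rmssd_ms": float(rmssd),
        "pnn50_pct": float(pnn50),
        "shannon_entropy_bits": float(entropy),
    }

# A perfectly regular rhythm has zero entropy; a highly irregular one does not.
print(irregularity_features([800.0] * 50))
rng = np.random.default_rng(0)
print(irregularity_features(rng.uniform(400, 1200, 200)))
```

In practice these features would be computed only on segments that pass the motion and signal-quality checks, since motion-corrupted pulse intervals also look irregular.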

Answer Strategy

The interviewer is testing systematic problem-solving and understanding of the lab-to-real-world gap. Focus on data distribution shift, feature robustness, and model generalization. Sample Answer: 'My approach would be threefold. First, analyze the feature distributions: compare the lab vs. real-world data for each feature to identify which ones have shifted significantly, most likely those most sensitive to uncontrolled activity or sensor placement. Second, audit the pipeline for data leakage and data-quality mismatches: was information shared between training and test windows, and were the lab windows perfectly clean while real data has interruptions? Third, re-evaluate feature importance in the real-world context; a feature crucial in the lab may be meaningless in the wild. I'd then rebuild a simpler, more robust feature set focused on the most stable signals and retrain using a domain adaptation or transfer learning technique.'
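The first step of that answer, comparing lab and real-world feature distributions, could be sketched with a two-sample Kolmogorov-Smirnov statistic per feature. The feature names and the synthetic data below are placeholders; `scipy.stats.ks_2samp` offers the same statistic with a p-value if SciPy is available.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

def rank_by_shift(lab, field, names):
    """Sort features from most to least shifted between two (n, k) arrays."""
    scores = {name: ks_statistic(lab[:, j], field[:, j])
              for j, name in enumerate(names)}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Synthetic placeholder data: only the first feature shifts in the field.
rng = np.random.default_rng(1)
lab = rng.normal(size=(500, 2))
field = rng.normal(size=(500, 2))
field[:, 0] += 2.0
ranked = rank_by_shift(lab, field, ["rmssd", "vm_var"])
print(ranked)
```

Features at the top of the ranking are the first candidates to drop or re-normalize when rebuilding the feature set.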
