Objective: To explore methods of data management and statistical analysis for longitudinal big data collected from mobile health management applications (apps). Methods: A data management process and statistical analysis methods were proposed by summarizing the characteristics of data from mobile health management apps. The methods were illustrated with a practical case: an app for recording menstruation. Results: Data from health management apps are longitudinal big data, and the original app records must be reprocessed or computed before statistical analysis is conducted. A two-step data cleaning procedure, covering the original records and the reprocessed data, was suggested for data management, and longitudinal models such as mixed models were recommended for statistical analysis. Conclusions: After suspicious data are removed, data from health management apps can be used for medical research through specific data management and statistical analysis. Cloud computing could be a viable way to improve the efficiency of big data analysis for health management apps.
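The two-step cleaning idea in the abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names, the made-up dates, and the plausible range of 15 to 90 days for a cycle length are all hypothetical choices for the example.

```python
from datetime import date

def clean_original_records(start_dates):
    """Step 1: clean the raw entries (drop duplicate dates, sort chronologically)."""
    return sorted(set(start_dates))

def derive_cycle_lengths(start_dates):
    """Reprocess raw records into the analysis variable:
    days between consecutive recorded period starts."""
    return [(later - earlier).days
            for earlier, later in zip(start_dates, start_dates[1:])]

def clean_derived_values(lengths, low=15, high=90):
    """Step 2: keep only derived values inside an assumed plausible range.
    The bounds are illustrative assumptions, not values from the paper."""
    return [length for length in lengths if low <= length <= high]

# Made-up records for one hypothetical user, including a duplicate entry.
raw = [
    date(2023, 1, 1), date(2023, 1, 1), date(2023, 1, 29),
    date(2023, 2, 1), date(2023, 3, 1), date(2023, 7, 1),
]

starts = clean_original_records(raw)    # step 1 on the original records
lengths = derive_cycle_lengths(starts)  # reprocessing
kept = clean_derived_values(lengths)    # step 2 on the reprocessed data

print(lengths)  # [28, 3, 28, 122]
print(kept)     # [28, 28]
```

The 3-day and 122-day intervals are dropped as suspicious; a real pipeline would also need to decide how to treat the intervals adjacent to a removed value.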
Longitudinal data are intrinsically correlated across time points, and traditional meta-analysis techniques cannot account for this correlation. Regression coefficients based on multi-level models can fully account for the correlations among the time points of longitudinal data. This paper uses SAS software to perform a meta-analysis with a multi-level regression coefficient model and provides programming code that is simple and easy to use.
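To make the idea of pooling regression coefficients concrete, the sketch below is a simplified stand-in rather than the paper's SAS multi-level model. It assumes each study has already reported a time-slope coefficient and its standard error from its own longitudinal model, and it pools them with the standard DerSimonian-Laird random-effects estimator. The function name and the numbers are hypothetical.

```python
import math

def pool_random_effects(estimates):
    """Pool (coefficient, standard_error) pairs from at least two studies
    using inverse-variance weights and the DerSimonian-Laird estimate of
    between-study variance (tau^2)."""
    coefs = [b for b, _ in estimates]
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)

    # Fixed-effect pooled coefficient, needed for the heterogeneity statistic Q.
    fixed = sum(w * b for w, b in zip(weights, coefs)) / total
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, coefs))

    # DerSimonian-Laird moment estimator, truncated at zero.
    c = total - sum(w ** 2 for w in weights) / total
    tau2 = max(0.0, (q - (len(estimates) - 1)) / c)

    # Random-effects weights add tau^2 to each study's variance.
    re_weights = [1.0 / (se ** 2 + tau2) for _, se in estimates]
    re_total = sum(re_weights)
    pooled = sum(w * b for w, b in zip(re_weights, coefs)) / re_total
    return pooled, math.sqrt(1.0 / re_total), tau2

# Hypothetical per-study slopes (change in outcome per unit time) and SEs.
studies = [(-0.42, 0.10), (-0.30, 0.08), (-0.55, 0.15)]
pooled, se, tau2 = pool_random_effects(studies)
print(f"pooled slope = {pooled:.3f}, SE = {se:.3f}, tau^2 = {tau2:.4f}")
```

Pooling one slope per study sidesteps the within-study correlation only if each reported slope already comes from a model that handled it; the multi-level approach described above addresses that correlation directly.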
Objective: To explore the use of longitudinal data in constructing prediction models for non-time-varying outcomes and to compare the impact of different modeling approaches on prediction performance. Methods: Clinical predictors were selected using univariate analysis and Lasso regression. Prediction models for non-time-varying outcomes were developed based on latent class trajectory analysis, the two-stage model, and logistic regression. Internal validation was performed using bootstrap resampling, and model performance was evaluated using ROC curves, PR curves, sensitivity, specificity, and other relevant metrics. Results: A total of 49,629 pregnant women were included in the study, with a mean age of 31.42 ± 4.13 years and a mean pre-pregnancy BMI of 20.91 ± 2.62 kg/m². Fourteen predictors were incorporated into the final model. Prediction models utilizing longitudinal data demonstrated high accuracy, with AUROC values exceeding 0.90 and PR-AUC values greater than 0.47. The two-stage model based on late-pregnancy hemoglobin data showed the best performance, achieving an AUROC of 0.93 (95% CI 0.92 to 0.94) and a PR-AUC of 0.60 (95% CI 0.56 to 0.64). Internal validation confirmed robust model performance, and calibration curves indicated good agreement between predicted and observed outcomes. Conclusion: For longitudinal data, the two-stage model captures the dynamic trajectory of change well. The predictive value of repeated measurement data differs across clinical outcomes.
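A two-stage model of the kind named in the abstract above can be sketched as follows. This is a minimal sketch under stated assumptions, not the study's implementation. Stage 1 summarizes each subject's repeated measurements by a least-squares intercept and slope, and stage 2 feeds those summaries into a logistic regression fitted here by plain gradient descent. The data are made up, the function names are hypothetical, and the printed AUROC is an apparent (in-sample) value without bootstrap validation.

```python
import math
from statistics import mean, pstdev

def ols_line(times, values):
    """Stage 1: least-squares intercept and slope for one subject's trajectory."""
    t_bar, v_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    slope = sxy / sxx
    return v_bar - slope * t_bar, slope

def standardize(rows):
    """Scale each feature column to mean 0 and SD 1 so gradient descent is stable."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(weights, x):
    """Predicted probability; weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * xj for w, xj in zip(weights[1:], x)))

def fit_logistic(features, labels, lr=0.1, n_iter=2000):
    """Stage 2: logistic regression fitted by batch gradient descent."""
    n, k = len(features), len(features[0])
    weights = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for x, y in zip(features, labels):
            err = predict_proba(weights, x) - y
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * x[j]
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights

def auroc(scores, labels):
    """AUROC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Made-up subjects: weeks since first visit, hemoglobin (g/L), binary outcome.
subjects = [
    ([0, 4, 8], [125, 123, 122], 0),
    ([0, 4, 8], [118, 112, 104], 1),
    ([0, 4, 8], [130, 129, 127], 0),
    ([0, 4, 8], [115, 108, 101], 1),
    ([0, 4, 8], [121, 120, 118], 0),
    ([0, 4, 8], [119, 113, 109], 1),
]

labels = [y for _, _, y in subjects]
features = standardize([list(ols_line(t, v)) for t, v, _ in subjects])
weights = fit_logistic(features, labels)
scores = [predict_proba(weights, x) for x in features]
auc = auroc(scores, labels)
print(f"apparent AUROC = {auc:.2f}")
```

Separating the trajectory summary from the outcome model keeps each stage simple, at the cost of ignoring the uncertainty in the stage 1 estimates when stage 2 is fitted.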