Novel statistical learning methods for multi-modality heterogeneous data fusion in health care applications