Principal Investigator: Jiayu Zhou [email: firstname.lastname@example.org]
Recent advances in big data infrastructures and algorithmic foundations have unleashed a torrent of data that is collected and stored in distributed data centers all over the world. The ever-increasing availability of these massive datasets gives rise to many machine learning tasks that are inherently related. Transfer learning paradigms have therefore been developed over the past decade to transfer knowledge among such tasks and improve the efficiency and effectiveness of learning. This project will develop a suite of large-scale lifelong learning methods to address significant challenges of knowledge transfer on big data. The algorithms and tools developed in this project will directly impact biomedical informatics and intelligent transportation systems, as they will be used to build personalized predictive models from electronic medical records and traffic state prediction models from big traffic data. The results of this project will be used to develop a new curriculum that incorporates research into the classroom and provides students from under-represented groups with opportunities to participate in machine learning research.
The velocity, volume, variability, and variety that characterize big data pose significant challenges for traditional lifelong learning approaches. This project will advance lifelong learning by (1) developing a distributed lifelong learning framework that enables online knowledge transfer on large-scale distributed datasets; (2) designing effective methods to track temporal drift in task relationships and to leverage human knowledge via interactive transfer; and (3) investigating strategies that enable distributed lifelong learning to handle heterogeneity in both feature spaces and learning tasks. The results of this project will have an immediate and strong impact on the theoretical and algorithmic foundations of Big Data by making a large-scale lifelong learning framework readily available for many Big Data analytics applications.