Student-Powered Machine Learning | MIT News
From their early days at MIT, and even before, Emma Liu ’22, MNG ’22, Yo-whan “John” Kim ’22, MNG ’22, and Clemente Ocejo ’21, MNG ’22 knew they wanted to do computational research and explore artificial intelligence and machine learning. “Since high school, I’ve been into deep learning and was involved in projects,” says Kim, who attended a Research Science Institute (RSI) summer program at MIT and Harvard University and went on to work on action recognition in videos using Microsoft’s Kinect.
As students in the Department of Electrical Engineering and Computer Science who recently graduated from the Master of Engineering (MEng) thesis program, Liu, Kim, and Ocejo developed the skills needed to guide application-focused projects. In collaboration with the MIT-IBM Watson AI Lab, they improved text classification with limited labeled data and designed machine learning models for better long-term forecasts of product purchases. For Kim, “it was a very smooth transition and … a great opportunity for me to continue working in deep learning and computer vision at the MIT-IBM Watson AI Lab.”
Working with researchers from academia and industry, Kim designed, trained, and tested a deep learning model for recognizing actions across domains — in this case, video. His team specifically targeted the use of synthetic data from generated videos for training and performed prediction and inference tasks on real data, composed of different action classes. They wanted to see how pre-training models on synthetic videos, specifically simulations of human or humanoid actions produced by a game engine, would transfer to real data: publicly available videos scraped from the internet.
The reason for this research, Kim says, is that real videos can raise issues of representational bias, copyright, and ethical or personal sensitivity: videos of a car hitting people, for example, would be difficult to collect, as would footage that uses faces, real addresses, or license plates without consent. Kim is experimenting with 2D, 2.5D, and 3D video models, with the goal of creating a domain-specific synthetic video dataset, or even a large general one, that can be used for transfer to domains where real data is scarce. For construction industry applications, for example, this could mean running action recognition on a construction site. “I didn’t expect synthetically generated videos to perform on par with real videos,” he says. “I think it opens up a lot of different roles [for the work] in the future.”
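The synthetic-to-real transfer setup Kim describes can be sketched at toy scale: pre-train a model on plentiful synthetic examples, then fine-tune it on a small amount of real data with a lower learning rate. The perceptron and all the data below are illustrative placeholders, not the lab’s actual video models or datasets:

```python
def train(weights, data, epochs, lr):
    """Run simple perceptron updates over (features, label) pairs."""
    for _ in range(epochs):
        for x, y in data:
            pred = 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0
            err = y - pred  # 0 when correct; +/-1 when wrong
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
    return weights

# Plentiful "synthetic" data (standing in for game-engine clips)
# and scarce "real" data (standing in for videos from the internet).
synthetic = [([1.0, 0.1], 1), ([0.1, 1.0], 0)] * 50
real = [([0.9, 0.2], 1), ([0.2, 0.9], 0)]

w = train([0.0, 0.0], synthetic, epochs=5, lr=0.1)  # pre-train on synthetic
w = train(w, real, epochs=5, lr=0.01)               # fine-tune on real
```

The same two-stage pattern applies to the video models in the article; only the architecture and data change.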
Despite a rocky start to the project, collecting and generating data and running many models, Kim says he wouldn’t have done it any other way. “It was amazing to see how the members of the lab encouraged me: ‘It’s fine. You’ll have all the experience and the fun part ahead. Don’t stress too much.’” It was this structure that helped Kim take ownership of the work. “In the end, they gave me so much support and amazing ideas that helped me complete this project.”
Data scarcity was also a theme of Emma Liu’s work. “The overarching problem is that there’s all of this data out there, and for a lot of machine learning problems you need that data to be labeled,” Liu says, “but then you have all of this unlabeled data available that you’re not really benefiting from.”
Liu, under the direction of her MIT and IBM group, worked to put that data to good use, training semi-supervised text classification models (and combining aspects of them) to assign pseudo-labels to unlabeled data, based on predictions and probabilities about which categories each previously unlabeled piece of data fits into. “Then the problem is that there’s been prior work that’s shown that you can’t always trust the probabilities; specifically, neural networks have been shown to be often overconfident,” Liu points out.
Liu and her team addressed this by evaluating the accuracy and uncertainty of the models and recalibrating them to improve the self-training framework. The self-training and calibration steps allowed her to have better confidence in the predictions. This pseudo-labeled data, she says, could then be added to the pool of real data, expanding the dataset; the process could be repeated over a series of iterations.
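The loop Liu describes — calibrate the model’s probabilities, then accept only high-confidence pseudo-labels — can be sketched minimally. Temperature scaling is used here as one common recalibration technique; the model, logits, threshold, and data are all hypothetical, not Liu’s actual framework:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax with a temperature; T > 1 flattens overconfident outputs."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pseudo_label(unlabeled, get_logits, temperature=1.5, threshold=0.9):
    """Keep only examples whose calibrated top probability clears the bar."""
    accepted = []
    for x in unlabeled:
        probs = softmax_with_temperature(get_logits(x), temperature)
        best = max(range(len(probs)), key=lambda i: probs[i])
        if probs[best] >= threshold:
            accepted.append((x, best))  # (example, pseudo-label)
    return accepted

# Toy "model": a lookup table of logits for two unlabeled examples.
logits = {"doc_a": [5.0, 0.0, 0.0], "doc_b": [1.0, 0.9, 0.8]}
selected = pseudo_label(["doc_a", "doc_b"], lambda x: logits[x])
# Only the confident example survives; it would be added to the labeled
# pool and the whole cycle repeated for several iterations.
```

The threshold trades off pseudo-label quantity against quality; recalibrating first keeps an overconfident network from flooding the pool with wrong labels.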
For Liu, what mattered most was not the product, but the process. “I learned a lot about being an independent researcher,” she says. As an undergraduate, Liu worked with IBM to develop machine learning methods to repurpose drugs already on the market, and she honed her decision-making ability. After collaborating with academic and industry researchers to build skills in asking probing questions, seeking out experts, digesting and presenting scientific papers for their relevant content, and testing ideas, Liu and her cohort of MEng students working with the MIT-IBM Watson AI Lab felt they had the confidence in their knowledge, and the freedom and flexibility, to dictate the direction of their own research. In taking on this key role, Liu says, “I feel like I have ownership over my project.”
After his time at MIT and with the MIT-IBM Watson AI Lab, Clemente Ocejo also came away with a sense of mastery, having built a strong foundation in AI techniques and time-series methods beginning with his MIT Undergraduate Research Opportunities Program (UROP) project, where he met his MEng advisor. “You really have to be proactive in decision-making,” says Ocejo, “vocalizing it [your choices] as a researcher and letting people know that this is what you’re doing.”
Ocejo drew on his background in traditional time-series methods in his collaboration with the lab, applying deep learning to improve demand forecasting for medical products. Here, he designed, wrote, and trained a transformer, a specific machine learning model typically used in natural-language processing that has the ability to learn very long-term dependencies. Ocejo and his team compared target demand forecasts across months, learning dynamic connections and attention weights between product sales within a product family. They looked at identifier features concerning price and quantity, as well as account features indicating who buys the items or services.
“One product doesn’t necessarily impact the prediction made for another product at the moment of prediction. It just impacts the parameters during training that lead to that prediction,” says Ocejo. “Instead, we wanted to give it a little more direct impact, so we added this layer that makes that connection and learns attention between all of the products in our dataset.”
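The extra layer Ocejo describes lets each product attend directly to every other product in the family. A bare scaled-dot-product version of that attention computation looks like this; the product embeddings are made-up toy vectors, not the team’s learned representations:

```python
import math

def attention_weights(query, keys):
    """Scaled dot-product attention scores for one query over all keys."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)  # subtract max before exponentiating, for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]  # softmax over products

# Toy embeddings for three products in one family (hypothetical values).
products = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
weights = attention_weights(products[0], products)
```

In the full model, the resulting weights mix the products’ sales representations, so one product’s signal can directly influence another product’s forecast at prediction time rather than only through shared training parameters.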
In the long term, over a one-year forecast, the MIT-IBM Watson AI Lab group was able to outperform the current model; even more impressively, it did so in the short term as well (nearly a fiscal quarter). Ocejo attributes this to the dynamics of his interdisciplinary team. “A lot of the people in my group weren’t necessarily very experienced in the deep learning aspect of things, but they had a lot of experience in supply chain management, operations research, and optimization, which is something I don’t have that much experience in,” Ocejo says. “They were giving a lot of good high-level feedback on what to tackle next and … knew what the industry wanted to see or was looking to improve, so that was very helpful in streamlining my focus.”
For this work, it was not a deluge of data that made the difference for Ocejo and his team, but rather its structure and presentation. Large deep learning models often require millions and millions of data points in order to make meaningful inferences; however, the MIT-IBM Watson AI Lab group demonstrated that the desired results and technical improvements can be application-specific. “It just shows that these models can learn something useful, in the right framework, with the right architecture, without needing an excessive amount of data,” says Ocejo. “And then with an excessive amount of data, it will only get better.”