Embedding Explainability into the Components of Machine Learning Models

Explanatory methods that help users understand and trust machine learning models often describe how certain features used in the model contribute to its prediction. For example, if a model predicts a patient’s risk of developing heart disease, a physician may want to know how the patient’s heart rate data influences that prediction.

But if these features are so complex or convoluted that the user cannot understand them, is the explanation method useful?

MIT researchers are working to improve the interpretability of features so that decision makers are more comfortable with the results of machine learning models. Based on years of fieldwork, they’ve developed a taxonomy to help developers build features that will be easier for their target audience to understand.

“We found that in the real world, even though we used state-of-the-art methods to explain machine learning models, there is still a lot of confusion resulting from the features, not the model itself,” says Alexandra Zytek, PhD student in electrical engineering and computer science and lead author of an article presenting the taxonomy.

To build the taxonomy, the researchers defined properties that make features interpretable for five types of users, from artificial intelligence experts to people affected by a machine learning model’s prediction. They also offer guidance on how model developers can transform features into formats that will be easier for a layperson to understand.

They hope their work will inspire model builders to consider using interpretable features early in the development process, rather than trying to backtrack and focus on explainability after the fact.

MIT co-authors include Dongyu Liu, a postdoc; visiting professor Laure Berti-Équille, research director at the IRD; and senior author Kalyan Veeramachaneni, principal research scientist at the Laboratory for Information and Decision Systems (LIDS) and head of the Data to AI group. They are joined by Ignacio Arnaldo, principal data scientist at Corelight. The research is published in the June edition of the Association for Computing Machinery Special Interest Group on Knowledge Discovery and Data Mining’s Explorations Newsletter.

Real-world lessons

Features are input variables that feed machine learning models; they are usually taken from the columns of a dataset. Data scientists typically select and build features for the model, and they focus primarily on developing features to improve model accuracy, not on a decision maker’s ability to understand them, Veeramachaneni says.

For several years, he and his team worked with decision makers to identify machine learning usability challenges. These domain experts, most of whom lack knowledge of machine learning, often don’t trust models because they don’t understand the features that influence predictions.

For one project, they partnered with clinicians in a hospital intensive care unit who used machine learning to predict a patient’s risk of facing complications after heart surgery. Some features were presented as aggregated values, such as the trend of a patient’s heart rate over time. While features coded this way were “model ready” (the model could process the data), clinicians did not understand how they were computed. They would rather see how these aggregated features relate to the original values, so they could identify anomalies in a patient’s heart rate, Liu says.

In contrast, a group of learning scientists preferred aggregated features. Instead of having a feature like “number of posts a student has made on discussion boards,” they would rather have related features grouped together and labeled with terms they understood, such as “participation.”

“With interpretability, one size does not fit all. When you go from one domain to another, the needs are different. And interpretability itself has many levels,” says Veeramachaneni.

The idea that one size does not fit all is key to the researchers’ taxonomy. They define the properties that can make features more or less interpretable for different decision makers and describe which properties are likely most important to specific users.

For example, machine learning developers may focus on model-compatible and predictive features, which means they are expected to improve model performance.

On the other hand, decision makers without machine learning experience might be better served by features that are human-worded, meaning they are described in a way that is natural for users, and understandable, meaning they refer to real-world metrics that users can reason about.

“The taxonomy says, if you are creating interpretable features, at what level are they interpretable? You may not need all levels, depending on the type of domain experts you are working with,” Zytek explains.

Prioritizing interpretability

The researchers also describe feature engineering techniques that a developer can use to make features more interpretable for a specific audience.

Feature engineering is a process in which data scientists transform data into a format that machine learning models can process, using techniques such as data aggregation or value normalization. Most models also cannot process categorical data unless it is converted to a numeric code. These transformations are often nearly impossible for a layperson to unpack.
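As an illustration, the following is a minimal Python sketch (using pandas) of the kinds of transformations described above: aggregating a time series, normalizing values, and encoding a categorical column numerically. It is not code from the study, and the column names and values are hypothetical.

import pandas as pd

# Hypothetical raw records: several heart-rate readings per patient plus a categorical field.
records = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2],
    "heart_rate": [72, 95, 110, 64, 70],
    "admission_type": ["elective", "elective", "elective", "emergency", "emergency"],
})

# Aggregation: collapse each patient's heart-rate series into summary statistics.
agg = records.groupby("patient_id")["heart_rate"].agg(["mean", "max"])

# Normalization: rescale the aggregated values to zero mean and unit variance.
agg_norm = (agg - agg.mean()) / agg.std()

# Categorical encoding: convert the text category into numeric indicator columns.
admission = records.groupby("patient_id")["admission_type"].first()
encoded = pd.get_dummies(admission, prefix="admission")

# The resulting "model-ready" table is easy for a model to consume, but opaque to a clinician.
model_ready = agg_norm.join(encoded)
print(model_ready)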

Creating interpretable features may involve undoing some of this encoding, Zytek explains. For example, a common feature engineering technique organizes spans of data so that they all contain the same number of years. To make these features more interpretable, age ranges could instead be grouped using human terms, such as infant, toddler, child, and teenager. Or, rather than using a transformed feature like average pulse rate, an interpretable feature could simply be the actual pulse-rate data, Liu adds.
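A minimal sketch of that binning idea, again in Python with pandas; the ages and bin boundaries below are invented for illustration and are not taken from the paper.

import pandas as pd

# Hypothetical ages in years.
ages = pd.Series([0.5, 2, 7, 15, 1, 11], name="age_years")

# "Model-ready" version: equal-width numeric bins, hard for a layperson to read.
numeric_bins = pd.cut(ages, bins=[0, 4.5, 9, 13.5, 18])

# More interpretable version: the same values grouped under familiar human terms.
# (The cutoffs below are assumptions chosen only for this example.)
labeled_bins = pd.cut(
    ages,
    bins=[0, 1, 3, 12, 18],
    labels=["infant", "toddler", "child", "teenager"],
)

print(pd.DataFrame({"age": ages, "numeric_bin": numeric_bins, "age_group": labeled_bins}))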

“In many areas, the trade-off between interpretable features and model accuracy is actually very small. When working with child protection screening officers, for example, we retrained the model using only those features that met our interpretability definitions, and the performance degradation was almost negligible,” says Zytek.

Building on this work, the researchers are developing a system that lets a model developer handle complex feature transformations more efficiently and create human-centered explanations for machine learning models. This new system will also convert algorithms designed to explain model-ready datasets into formats that decision makers can understand.
