
Deciphering the black box of artificial intelligence - scientists reveal unexpected results

Researchers at the University of Bonn in Germany are investigating how machine learning applications in drug research work internally. Among other things, they discovered that the artificial intelligence systems are rather lazy.

The "black box" of artificial intelligence. Illustration:
The "black box" of artificial intelligence. Illustration:

Artificial intelligence technology has advanced rapidly, but its inner workings often remain unknown: many systems behave like a "black box" in which the process for reaching conclusions is not visible. A significant step forward has now been made by Prof. Dr. Jürgen Bajorath and his team, cheminformatics experts at the University of Bonn. They developed a technique that reveals how certain artificial intelligence systems used in pharmaceutical research operate.

Surprisingly, their findings indicate that these AI models rely primarily on recalling known data rather than learning specific chemical interactions when predicting drug potency. Their results were recently published in Nature Machine Intelligence.

Which drug molecule is the most effective? Researchers are feverishly searching for effective active ingredients to fight disease. These compounds often bind to proteins, usually enzymes or receptors, that trigger a specific chain of physiological actions.

In some cases, certain molecules are even designed to block unwanted reactions in the body - such as an excessive inflammatory reaction. Given the abundance of chemical compounds available, this search is at first glance similar to searching for a needle in a haystack. Drug discovery therefore tries to use scientific models to predict which molecules will best fit the target protein and bind to it strongly. These potential drug candidates are then tested in more detail in experimental studies.

With the rise of artificial intelligence, drug discovery research has increasingly turned to machine learning. Graph neural networks (GNNs) are one of several options for such applications. They are adapted to predict, for example, how strongly a particular molecule binds to a target protein. For this purpose, GNN models are trained on graphs that represent the complexes formed between proteins and chemical compounds (ligands).

Graphs generally consist of nodes that represent objects and edges that represent relationships between nodes. In graph representations of protein-ligand complexes, edges connect either only protein nodes or only ligand nodes, representing their respective structures, or they connect protein and ligand nodes, representing specific protein-ligand interactions.
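The graph layout described above can be made concrete with a small sketch. The class and field names below (`ComplexGraph`, `node_kind`, `edge_type`) are hypothetical and chosen only for illustration, and the four-node example is a toy. It is not the representation used in the study, only a minimal way to show the three kinds of edges (arcs).

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ComplexGraph:
    """Toy graph of a protein-ligand complex (illustrative names only)."""
    node_kind: dict = field(default_factory=dict)  # node id -> "protein" or "ligand"
    edges: list = field(default_factory=list)      # undirected (node_a, node_b) pairs

    def add_node(self, node_id, kind):
        if kind not in ("protein", "ligand"):
            raise ValueError(f"unknown node kind: {kind}")
        self.node_kind[node_id] = kind

    def add_edge(self, a, b):
        if a not in self.node_kind or b not in self.node_kind:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((a, b))

    def edge_type(self, a, b):
        """Classify an edge by the kinds of the two nodes it connects."""
        kinds = {self.node_kind[a], self.node_kind[b]}
        if kinds == {"protein"}:
            return "protein"      # edge within the protein structure
        if kinds == {"ligand"}:
            return "ligand"       # edge within the ligand structure
        return "interaction"      # edge between a protein node and a ligand node

    def edge_type_counts(self):
        return Counter(self.edge_type(a, b) for a, b in self.edges)


# Made-up example: two protein nodes, two ligand nodes.
graph = ComplexGraph()
graph.add_node(0, "protein")
graph.add_node(1, "protein")
graph.add_node(2, "ligand")
graph.add_node(3, "ligand")
graph.add_edge(0, 1)  # protein structure
graph.add_edge(2, 3)  # ligand structure
graph.add_edge(1, 2)  # protein-ligand interaction
counts = graph.edge_type_counts()
```

In this toy graph each of the three edge types occurs exactly once, so `counts` holds one "protein", one "ligand" and one "interaction" edge.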

"The GNN engines arrive at their predictions is like a black box that we don't have a look into," says Prof. Bayurat. The chemoinformatics researcher from the LIMES Institute at the University of Bonn, the Bonn-Aachen International Center for Information Technologies (B-IT), and the Lamar Institute for Computational Learning and Artificial Intelligence in Bonn, together with colleagues from the Sapienza University of Rome, analyzed in detail whether graph neural networks really learn protein-ligand interactions to Predict how well an active substance binds to a target protein.

The researchers analyzed a total of six different GNN architectures using EdgeSHAPer, a method they developed themselves, and a conceptually different methodology for comparison. These computer programs "screen" whether the GNNs learn the most important interactions between a compound and a protein and thereby predict the potency of the ligand, as researchers intend and expect - or whether the artificial intelligence arrives at its predictions in other ways.
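As its name suggests, EdgeSHAPer builds on Shapley values, which attribute a prediction to the individual edges of a graph. The following sketch is not the authors' code. It only illustrates, under simplifying assumptions, the general idea of estimating edge Shapley values by sampling random edge orderings. The function names, the edge labels and the additive `toy_predict` model with its weights are all made up for this example.

```python
import random


def estimate_edge_shapley(edges, predict, n_samples=500, seed=0):
    """Monte Carlo estimate of each edge's Shapley value.

    `predict` maps a frozenset of present edges to a number. For every
    sampled ordering, edges are added one by one and each edge is
    credited with the change in the prediction that it causes.
    """
    rng = random.Random(seed)
    totals = {edge: 0.0 for edge in edges}
    for _ in range(n_samples):
        order = list(edges)
        rng.shuffle(order)
        present = set()
        previous = predict(frozenset(present))
        for edge in order:
            present.add(edge)
            current = predict(frozenset(present))
            totals[edge] += current - previous
            previous = current
    return {edge: total / n_samples for edge, total in totals.items()}


# Hypothetical stand-in for a trained model: the output is a weighted sum
# over present edges, with ligand edges weighted far more than the single
# protein-ligand interaction edge. The weights are invented.
TOY_WEIGHTS = {
    ("L1", "L2"): 1.0,  # ligand edge
    ("L2", "L3"): 0.8,  # ligand edge
    ("P1", "L1"): 0.1,  # protein-ligand interaction edge
}


def toy_predict(present_edges):
    return sum(TOY_WEIGHTS[edge] for edge in present_edges)


edge_values = estimate_edge_shapley(list(TOY_WEIGHTS), toy_predict)
```

Because the toy model is additive, every edge's estimated value equals its weight, and the values add up to the difference between the full prediction and the empty-graph prediction. In this made-up case the ligand edges dominate the attribution, which is the kind of pattern such an analysis is meant to expose.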

"The GNNs are very dependent on the data they are trained with," says the first author of the study, Dr. Andrea Mastropietro from the Sapienza University of Rome, who conducted part of his doctoral thesis in the group of Prof. Bayurat Babon.

The scientists trained the six GNNs with graphs extracted from structures of protein-ligand complexes, for which the mode of action and binding strength of the compounds to their target proteins were already known from experiments. The trained GNNs were then tested on other complexes. The subsequent EdgeSHAPer analysis made it possible to understand how the GNNs generated apparently promising predictions.

"If the GNNs do what is expected of them, they need to study the interactions between the active substance and the target protein and the predictions need to be determined by prioritizing specific interactions," explains Prof. Bayurat. According to the team's analyses, however, the six GNNs largely failed to perform. Most GNNs have only studied single protein-drug interactions and focused mainly on ligands. According to Bayurat: "In order to predict the binding strength of a molecule to a target protein, the models relied mainly on 'remembering' chemically similar molecules encountered during training and their binding data, regardless of the target protein. The chemical similarity that was studied then largely dictated the predictions."

According to the scientists, this is largely reminiscent of the "Clever Hans effect". This effect refers to a horse that could apparently count. How often Hans tapped his hoof was supposed to indicate the result of a calculation. As it turned out later, however, the horse could not calculate at all, but deduced the expected results from nuances in the facial expressions and gestures of its companion.

What do these findings mean for drug discovery research? "It is generally not tenable that GNNs learn chemical interactions between active substances and proteins," says the researcher. Their predictions are largely overrated, because predictions of equivalent quality can be made using chemical knowledge and simpler methods. However, the research also offers opportunities for artificial intelligence. Two of the models tested showed a clear tendency to learn more interactions as the binding strength of the test compounds increased. "It's worth looking at this in more detail," says Bajorath. Perhaps these GNNs can be improved in the desired direction using modified representations and training techniques. However, the assumption that physical quantities can be learned on the basis of molecular graphs should be treated with caution. "Artificial intelligence is not black magic," says Bajorath.

He sees the earlier open-access publication of EdgeSHAPer and other specially developed analysis tools as promising approaches to shed light on the black box of AI models. His team's approach currently focuses on GNNs and new "chemical language models".

"Development of methods for explaining predictions of complex models is an important field in artificial intelligence research. There are also approaches to other network architectures such as language models that help to better understand how machine learning reaches its results," says Bayurat. He expects that fascinating things will happen soon also in the field of "explained artificial intelligence" at the Lamar Institute, where he is principal researcher and chairman of the field of artificial intelligence in the life sciences.

Source: “Learning characteristics of graph neural networks predicting protein–ligand affinities” by Andrea Mastropietro, Giuseppe Pasculli and Jürgen Bajorath, 13 Nov 2023, Nature Machine Intelligence.

DOI: 10.1038/s42256-023-00756-9

This press release presents important research on understanding how deep learning models work in the field of drug discovery. The researchers from the University of Bonn showed that graph neural networks, a common method for predicting protein-drug binding, rely mostly on memorized training data and much less on learning specific biochemical interactions. The findings highlight the need for additional tools for understanding the "black box" of artificial intelligence and for improving the models. This research may lead to better models for drug discovery in the future.


2 comments

  1. It is amazing that even at this stage, after years of working with neural networks,
    it is still not clear to humanity exactly how they work. Computer science research has reached a good understanding of advanced algorithms, but not yet of this topic.
    Perhaps we need the help of artificial intelligence to understand how such artificial intelligence works.
    Another thing: the field of artificial intelligence is exciting but also scary.
    Will the computer soon replace many professionals?
    I think so.
    Professions that I think are in imminent danger:
    teachers, accountants, secretaries, managers, project managers,
    drivers, consultants in all fields, caregivers for older people, doctors, and more.
    Professions that I think are in danger in the more distant future:
    renovators, builders.
    But they too are in danger; for example, you can see the progress of Tesla's robot.
    A great danger lies in the use of artificial intelligence in wars. Hiding will no longer help:
    a weapon with artificial intelligence will find you and know how to attack you at your weak point.
    On the positive side, artificial intelligence will do our work better and more efficiently,
    for example by finding better medicines.
    Combining artificial intelligence with quantum computing is even more exciting and scary.
    This combination will create artificial intelligence that improves dramatically every fraction of a second.
    By the way, I already see how artificial intelligence helps me personally to program
    in computer languages like Java, C++, Python, React, assembly.
    It's nice and helps to work faster, but on the other hand, there are already people today with only basic knowledge
    of programming who can ask artificial intelligence to write code for them according to their requirements.
    I think you can call it fifth-generation programming (for those who know the previous generations).

    Eli Isaac

    Private computer science teacher, lecturer up to a master's degree and senior software engineer
