“Generative AI has not surprised us.”

by Bernd Beckert

At Fraunhofer ISI, I’m currently running a project that deals with the technical and conceptual demarcation of research fields in Artificial Intelligence. Why? AI research consists of a great many subareas that are partly incompatible with one another, for example Computer Vision, Natural Language Processing or Decision Support Systems. Combining individual subareas can lead to an innovation boost. But which subareas are best suited? To support research planning effectively, it helps to have specific knowledge of the currently most important scientific issues in the AI subareas.

This is where we contribute our expertise: in our current study, we evaluate bibliometric analyses and interview experts in order to characterize the AI landscape and identify current trends. The study will be published as a working paper in April 2024.

Yet research planning also depends on knowledge about the future. Various foresight methods exist for this, and they keep getting better because they draw on large quantities of data and use AI tools. AI-based foresight is anything but trivial: Dr. Philine Warnke, a foresight expert at Fraunhofer ISI, shared with me the secret of technology foresight that allows her to recognize trends before others do. Results at the push of a button, however, do not exist.

Many underestimated generative Artificial Intelligence – until ChatGPT arrived. Even experts did not foresee the dynamic development this AI research field has undergone in recent years: as recently as 2021, one year before the launch of ChatGPT, Liu, Shapira and Yue scrapped the topics “generative AI” and “chatbots” from the list of relevant AI research fields in a large-scale bibliometric study because they did not find enough scientific publications on these topics. You, on the other hand, recognized as early as 2019, in your study “100 radical innovation breakthroughs for the future” for the European Commission, that voice recognition, chatbots and generative AI would become important AI topics. What did you do differently, and how did you go about it?

Yes, the boom in generative AI did not surprise us. Obviously, nobody could foresee the enormous economic dynamics and the societal debate about AI triggered by OpenAI and the release of ChatGPT. But we are a little proud that we already saw the trend in 2019. AI has many facets, and breakthroughs can come from any AI subarea. The current boom, however, is very clearly driven by generative AI.

What we did is basically not a secret: it is a variant of the “Weak Signals Scanning” method, which we use repeatedly in different foresight projects. For the EU project RIBRI, however, we made one decisive adjustment.

Basically, it works like this: when we use “Weak Signals Scanning”, we do not look at what is currently being published or what has been published, cited or patented in the past. Instead, we ask: what is new in the discourse? That can also be small things, topics that only a few people look at, scientific niches. We then put forward so-called “Seed of Change” hypotheses for areas in which something new is being developed. Here we work qualitatively and interview lead users and other experts whom we would not otherwise focus on. In other words, we deliberately go to the margins and interview people who operate outside the scientific mainstream.

The crucial methodological adjustment we made for RIBRI was to automate the identification of weak signals with the aid of an AI tool. This way, we were able to analyze a much larger data pool.

The tool developed by our research partners at Institutul de Prospectiva in Bucharest is a machine learning tool that they trained, using supervised learning, to automatically recognize weak signals in science news blogs on the Internet. The training data came from more than 200 international science and technology blogs and RSS feeds. In such sources of unstructured data, new topics are discussed much earlier than in the scientific publications analyzed for classical bibliometric studies.
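To make the supervised-learning idea a little more tangible, here is a minimal Python sketch. The example texts, the “weak_signal”/“mainstream” labels and the simple TF-IDF plus linear classifier are assumptions made purely for illustration; the actual tool from Bucharest was a neural network trained on a far larger corpus and is not reproduced here.

```python
# Illustrative sketch only: a supervised classifier that separates
# "weak signal" articles from "mainstream" ones. Labels, texts and
# model choice are assumptions; the real RIBRI tool was a neural net.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical hand-labelled training examples
train_texts = [
    "New chip architecture promises neuromorphic computing at scale",
    "Smartphone sales figures for the last quarter released",
]
train_labels = ["weak_signal", "mainstream"]

# TF-IDF features plus a linear classifier stand in for the neural net
classifier = make_pipeline(
    TfidfVectorizer(stop_words="english", ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
classifier.fit(train_texts, train_labels)

# Apply to new, unlabelled blog entries and keep the predicted weak signals
new_entries = ["Start-up demonstrates text generation with transformer models"]
predictions = classifier.predict(new_entries)
weak_signals = [t for t, p in zip(new_entries, predictions) if p == "weak_signal"]
```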

Following the training, the project team fed the artificial neural network hundreds of thousands of new entries from these sources. The tool classified ninety percent of the articles as mainstream, i.e. widely known trends, and discarded them. We did not look at the discarded trends any further. We were, however, very interested in the remaining ten percent. These we analyzed in more detail and merged into theme clusters. Another AI tool helped us compile the clusters, as we were still dealing with a large number of articles.
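The clustering step can be sketched in a similarly simplified way. The interview does not say which clustering tool the project actually used; k-means over TF-IDF vectors and the toy articles below are assumptions chosen only to illustrate how retained articles might be grouped into themes.

```python
# Illustrative sketch only: grouping retained "weak signal" articles
# into theme clusters. K-means over TF-IDF vectors is an assumption;
# the actual clustering tool used in RIBRI is not specified here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

retained_articles = [
    "Generative models produce photorealistic images from text prompts",
    "Chatbot start-ups experiment with large language models",
    "Quantum sensors reach new precision in gravity measurements",
    "Lab develops room-temperature quantum error correction scheme",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(retained_articles)

# Two clusters for this toy example; the real project produced many more
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Each cluster becomes a candidate "theme" that humans then review
for label, text in zip(kmeans.labels_, retained_articles):
    print(label, text)
```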


That sounds exciting. So in the end, you only had to press a button to arrive at the hundred future innovations?

The AI tools were a great help, but we could not automate everything. In the end, we had to process every single thematic “cloud” the tool produced by hand, as the AI tool did not really understand the topics. It just delivered a selected set of articles for each topic.

For the report, each topic needed a description and a classification, which we obviously wrote ourselves. To do this, we had real humans classify and verify the AI results in conversations with experts and in a Delphi survey. That is important, because automated procedures and human judgement each have their strengths and weaknesses and complement each other.


That sounds very time-consuming. Would it be possible, in principle, to use this method to identify topics that are emerging today, in the year 2024, and that will become more important in the next three to five years?

Yes, that would work. In the meantime, we have our own AI tools and use algorithms that can, for example, extract certain themes from news bulletins. This method is called “natural language processing-based topic modelling”; it works both with structured data from databases and with unstructured data, for example from blogs and digital newspaper articles. We would probably contact our former partner in Bucharest, whose neural network has surely become even better in the meantime at discovering the new in such sources.
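For readers who want a concrete picture of NLP-based topic modelling, the sketch below uses Latent Dirichlet Allocation from scikit-learn on a handful of invented news snippets. The corpus and parameters are toy assumptions; Fraunhofer ISI’s own tools are not public and are not reproduced here.

```python
# Illustrative sketch only: topic modelling with Latent Dirichlet
# Allocation. Documents and parameters are toy assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = [
    "Researchers combine language models with decision support systems",
    "New foundation models generate code and scientific text",
    "Battery start-up reports progress on solid-state electrolytes",
    "Solid-state batteries move closer to automotive use",
]

# Bag-of-words representation of the news items
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(documents)

# Fit a small LDA model; each topic is a distribution over words
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# Print the most characteristic words per extracted theme
terms = vectorizer.get_feature_names_out()
for idx, topic in enumerate(lda.components_):
    top_words = [terms[i] for i in topic.argsort()[-5:][::-1]]
    print(f"Topic {idx}: {', '.join(top_words)}")
```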

However, we must not forget that we spent almost two years researching, training and analyzing in this project for the EU. Even with all the AI tools, it will still not work at the push of a button in the future.