This research was supported by SES S.A., Iberdrola and ScottishPower Renewables, and Hyundai Motor Co.
“When it comes to complex tasks like anomaly detection in time series, LLMs really are a contender,” says Alnegheimish. “Maybe other complex tasks can be addressed with LLMs as well.”
Her co-authors include Linh Nguyen, an EECS graduate student; Laure Berti-Equille, a research director at the French National Research Institute for Sustainable Development; and senior author Kalyan Veeramachaneni, a principal research scientist in the Laboratory for Information and Decision Systems. The research will be presented at the IEEE Conference on Data Science and Advanced Analytics.
An off-the-shelf solution
Large language models are autoregressive, which means they can understand that the newest values in sequential data depend on previous values. For instance, models like GPT-4 can predict the next word in a sentence using the words that precede it.
In a new study, MIT researchers found that large language models (LLMs) have potential to be more efficient anomaly detectors for time-series data. Importantly, these pretrained models can be deployed right out of the box.
In the future, an LLM may also be able to provide plain language explanations with its predictions so an operator could be better able to understand why an LLM identified a certain data point as anomalous.
Identifying one faulty turbine in a wind farm, which can involve looking at hundreds of signals and millions of data points, is akin to finding a needle in a haystack.
“If you don’t handle these steps very carefully, you might end up chopping off some part of your data that does matter, losing that information,” Alnegheimish says.
However, state-of-the-art deep learning models outperformed LLMs by a wide margin, showing that there is still work to do before an LLM could be used for anomaly detection.
But before they could deploy it, they had to convert time-series data into text-based inputs the language model could handle. They accomplished this through a sequence of transformations that capture the most important parts of the time series while representing data with the fewest number of tokens. Tokens are the basic inputs for an LLM, and more tokens require more computation.
For the second approach, called “Detector,” they use the LLM as a forecaster to predict the next value from a time series. The researchers compare the predicted value to the actual value. A large discrepancy suggests that the real value is likely an anomaly.
For the first, which they call “Prompter,” they feed the prepared data into the model and prompt it to locate anomalous values.
While LLMs couldn’t beat state-of-the-art deep learning models at anomaly detection, they did perform as well as some other AI approaches. If researchers can improve the performance of LLMs, this framework could help technicians flag potential problems in equipment like heavy machinery or satellites before they occur, with no need to train an expensive deep-learning model.
Published Aug. 14, 2024, in MIT News.
منبع: https://www.qualitydigest.com/inside/innovation-article/using-large-language-models-flag-problems-complex-systems-090424.html
Their LLM approaches also take between 30 minutes and two hours to produce results, so increasing the speed is a key area of future work. The researchers also want to probe LLMs to understand how they perform anomaly detection, in hopes of finding a way to boost their performance.
But with hundreds of wind turbines recording dozens of signals each hour, training a deep-learning model to analyze time-series data is costly and cumbersome. This is compounded by the fact that the model may need to be retrained after deployment, and wind farm operators might lack the necessary machine-learning expertise.
“We had to iterate a number of times to figure out the right prompts for one specific time series. It’s not easy to understand how these LLMs ingest and process the data,” Alnegheimish says.
“Since this is just the first iteration, we didn’t expect to get there from the first go, but these results show that there’s an opportunity here to leverage LLMs for complex anomaly detection tasks,” says Sarah Alnegheimish, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on SigLLM.
With Detector, the LLM would be part of an anomaly detection pipeline, while Prompter would complete the task on its own. In practice, Detector performed better than Prompter, which generated many false positives.
Once they had figured out how to transform time-series data, the researchers developed two anomaly detection approaches.
Approaches for anomaly detection
Because time-series data are sequential, the researchers thought the autoregressive nature of LLMs might make them well-suited for detecting anomalies in this type of data.
Engineers often streamline this complex problem using deep-learning models that can detect anomalies in measurements taken repeatedly over time by each turbine, known as time-series data.
Paper: “Large language models can be zero-shot anomaly detectors for time series?”
The researchers developed a framework called “SigLLM” that includes a component that converts time-series data into text-based inputs an LLM can process. A user can feed these prepared data to the model and ask it to start identifying anomalies. The LLM can also be used to forecast future time-series data points as part of an anomaly-detection pipeline.
When they compared both approaches to current techniques, Detector outperformed transformer-based AI models on seven of the 11 datasets they evaluated, even though the LLM required no training or fine-tuning.
“I think, with the Prompter approach, we were asking the LLM to jump through too many hoops. We were giving it a harder problem to solve,” says Veeramachaneni.
“What will it take to get to the point where it is doing as well as these state-of-the-art models? That is the million-dollar question staring at us right now,” Veeramachaneni says. “An LLM-based anomaly detector needs to be a game-changer for us to justify this sort of effort.”
Moving forward, the researchers want to see if fine-tuning can improve performance, though that would require additional time, cost, and expertise for training.
However, they wanted to develop a technique that avoids fine-tuning, a process in which engineers retrain a general-purpose LLM on a small amount of task-specific data to make it an expert at one task. Instead, the researchers deploy an LLM off the shelf with no additional training steps.