Protecting People’s Lives Through “Smart Safeguard” AI Anti-Fraud System

Release Date: 2024-05-16 By Huang Xiaobing, Wang Wei

According to public information released by China’s National Anti-Fraud Center, telecom fraud has become the crime with the largest number of cases, the fastest growth rate, and the widest coverage in recent years. By the end of 2022, China’s public security departments had cracked 1.156 million telecom fraud cases, arrested 1.553 million suspects, and intercepted over 916.5 billion yuan of funds related to fraud cases. The increasing prevalence of telecom fraud poses a significant threat to personal and property safety.

Difficulties and Challenges in SMS Fraud Monitoring

SMS fraud is one of the most common types of telecom fraud. Fraudsters constantly alter and adapt SMS content to bypass the SMS monitoring systems of telecom operators. Common tactics include:

  • Circumventing keyword rules by using combined character variants, escape characters, homophones, and look-alike characters.
  • Using a mix of Chinese characters, symbols, and digits to write out standard URLs and numbers, evading the regular expression monitoring policies deployed in the existing network.
  • Evading traffic and keyword thresholds by sending from a vast pool of numbers.
  • Probing the monitoring system with methods such as dialing tests to find content that gets through, and then sending messages in bulk.
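
The second tactic above can be illustrated with a minimal Python sketch. It assumes a simple rule-based filter and shows why a regular expression misses an obfuscated URL until the text is folded into a canonical form. The character mappings, the noise pattern, the rule, and the sample domain are all hypothetical and far from exhaustive; they are not taken from any operator's actual policy.

```python
import re
import unicodedata

# Hypothetical mapping: Chinese numerals and the ideographic full stop
# are folded to their ASCII equivalents (illustrative, not exhaustive).
FOLD = str.maketrans("零一二三四五六七八九。", "0123456789.")

# Hypothetical separators inserted between characters to break up keywords.
NOISE = re.compile(r"[\s*|~·•]+")

# Hypothetical URL rule of the kind a regex-based policy might contain.
URL_RULE = re.compile(r"www\.[a-z0-9]+\.(com|cn|net)")


def normalize(text: str) -> str:
    """Fold an obfuscated message into a canonical form for rule matching."""
    text = unicodedata.normalize("NFKC", text)  # full-width letters -> ASCII
    text = text.translate(FOLD)                 # Chinese numerals -> digits
    text = NOISE.sub("", text)                  # drop inserted separators
    return text.lower()


# Made-up obfuscated message fragment: full-width letters, Chinese numerals,
# an ideographic full stop, and inserted separators.
raw = "ｗｗｗ．ｅｘａｍｐｌｅ九九 。ｃ*ｏ*ｍ"

print(URL_RULE.search(raw) is None)           # the rule misses the raw text
print(normalize(raw))                         # www.example99.com
print(URL_RULE.search(normalize(raw)) is not None)
```

Each new obfuscation trick needs another hand-written mapping in a sketch like this, which is the maintenance burden that motivates a model-based approach.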

 

Traditional fraud management solutions, with their long upgrade cycles, struggle to keep up with these tactics. Overly lax policies lead to low interception efficiency, while overly strict policies disrupt normal communication.

AI Model Enables Technological Revolution  

On November 30, 2022, OpenAI launched ChatGPT, which reached 100 million users within two months. ChatGPT is a large language model (LLM) built on the transformer neural network architecture, which has driven major breakthroughs across multiple deep learning fields, including large-scale natural language processing, sequence data analysis, and object detection. Trained on extensive corpora, LLMs acquire generalized knowledge and a deep understanding of language and dialogue. Moreover, targeted training allows LLMs to solve problems in specific fields and adapt rapidly to new tasks and scenarios.

Accurately identifying fraudulent SMS messages requires a deep understanding of natural language. It also requires classifying sensitive information and identifying the real intent behind the content. Finally, because fraudulent SMS messages keep evolving, the system must learn from new samples and update its knowledge and models dynamically. These are the areas where transformer-based LLMs excel, so it is worthwhile to develop new SMS anti-fraud technologies and products based on AI models through prototype testing and exploration.

Rapid Technical Breakthrough Helps Tackle Difficulties

In the early stage of the project, we faced several challenges in adopting LLMs:

  • Uncertainty in model selection: It was difficult to determine the most appropriate model while ensuring legal compliance.
  • Uncertainty in corpus and training solutions: The required quality, quantity, format, and prompts of the corpus were unknown, and we had to build the training and inference solutions from scratch.
  • High GPU and server costs: In the middle phase of the project, inference performance was low, so the number and cost of GPUs required to handle large service traffic were too high.

 

To achieve a rapid technical breakthrough, we tried different approaches, accepted mistakes, and adjusted our solutions promptly.

In terms of model selection, during the initial exploration phase we tried models ranging from fewer than 100 million parameters to 340 million, 7 billion, and 13 billion parameters. This process covered four parameter scales and six different models from both domestic and international sources, including self-developed ones. In total, we evaluated more than 20 combinations.

In terms of corpus and fine-tuning, we obtained a first-hand, high-quality corpus compliant with regulations, tried various fine-tuning solutions, and finally settled on the most effective approach: "special prompt words + sample fine-tuning", which greatly improved recognition accuracy and recall.
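
To make the idea of "special prompt words + sample fine-tuning" more concrete, here is a minimal Python sketch of how labeled messages could be wrapped in a fixed task prompt and written out as fine-tuning records. The prompt wording, the label set, the record fields (`prompt`, `completion`), and the sample messages are all assumptions made for illustration; the article does not disclose the actual prompt or data format.

```python
import json

# Hypothetical task prompt; the actual prompt wording is not given in the article.
PROMPT = (
    "You are an SMS security reviewer. Classify the message as one of: "
    "normal, junk, fraud. Answer with the label only.\n\n"
    "Message: {sms}\nLabel:"
)

# Hypothetical label set.
LABELS = {"normal", "junk", "fraud"}


def build_sample(sms: str, label: str) -> dict:
    """Wrap one labeled message in the task prompt as a fine-tuning record."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label}")
    return {"prompt": PROMPT.format(sms=sms), "completion": " " + label}


def to_jsonl(pairs) -> str:
    """Serialize (message, label) pairs as one JSON record per line."""
    return "\n".join(
        json.dumps(build_sample(sms, label), ensure_ascii=False)
        for sms, label in pairs
    )


# Made-up messages, for illustration only.
corpus = [
    ("Your parcel is ready for pickup at locker 12.", "normal"),
    ("You have won a prize, reply with your bank card number.", "fraud"),
]
print(to_jsonl(corpus))
```

Keeping the prompt identical at training and inference time is the point of such a template: the model learns to map that fixed instruction plus a message to a single label.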

To reduce the number and cost of GPUs required, we designed a multi-layer architecture with cache acceleration at the front, followed by a combination of small and large models. We also applied inference acceleration to improve performance further.
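
The layered design can be sketched as a cascade: a cache answers repeated messages, a small model handles confident cases, and only uncertain messages reach the large model. The following Python sketch is one possible reading of that architecture, not the product's implementation. The `Cascade` class, the confidence threshold, and the two stub models are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

Verdict = Tuple[str, float]  # (label, confidence)


@dataclass
class Cascade:
    """Hypothetical three-layer classifier: cache -> small model -> large model."""
    small_model: Callable[[str], Verdict]
    large_model: Callable[[str], Verdict]
    threshold: float = 0.9                      # assumed escalation threshold
    cache: Dict[str, str] = field(default_factory=dict)
    large_calls: int = 0                        # counts expensive inferences

    def classify(self, sms: str) -> str:
        key = sms.strip().lower()
        if key in self.cache:                   # layer 1: cache hit, no inference
            return self.cache[key]
        label, conf = self.small_model(key)     # layer 2: cheap model
        if conf < self.threshold:               # layer 3: escalate if unsure
            self.large_calls += 1
            label, _ = self.large_model(key)
        self.cache[key] = label
        return label


# Stub models standing in for real inference services (illustrative only).
def small(text: str) -> Verdict:
    return ("fraud", 0.95) if "prize" in text else ("normal", 0.6)


def large(text: str) -> Verdict:
    return ("fraud", 0.99) if "transfer" in text else ("normal", 0.99)


cascade = Cascade(small, large)
print(cascade.classify("You won a prize"))        # small model is confident
print(cascade.classify("Please transfer money"))  # escalated to the large model
print(cascade.classify("Please transfer money"))  # served from the cache
print(cascade.large_calls)                        # 1
```

Because bulk fraud campaigns repeat the same or similar text many times, the cache and the small model can absorb much of the traffic in a design like this, so GPU-heavy inference is needed only for the remainder.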

After evaluating the effectiveness and cost indicators of the models, we selected the optimal solution, which passed the legal compliance review.

Perfect Combination of Communications and AI

Through continuous innovation, ZTE has released the industry’s first anti-fraud large model system, called “Smart Safeguard” (Fig. 1). Working out of the box, the system automatically identifies illegal SMS messages without policy configuration. This greatly reduces the complexity and workload of on-site policy O&M, while improving the accuracy and recall of illegal SMS message identification. It enables integrated management of identifying, preventing, and controlling junk and fraudulent SMS messages.

The system has been deployed in the industry’s first LLM-based SMS anti-fraud management pilots at offices of operators A and B in China, and it quickly transitioned into commercial use.

  • Operator A’s achievements: Since the system was deployed in the provincial company, the interception rate of fraudulent messages has increased significantly. The daily average number of overseas junk SMS messages dropped from between 500,000 and 600,000 to between 20,000 and 30,000. The prediction success rate and interception accuracy can reach 99%. In addition, the number of fraud-related cases has decreased significantly: in August 2023, overseas fraud-related cases decreased by 64% month on month. After the office went into service, the system received high acclaim from the operator and the anti-fraud center of the province.
  • Operator B’s achievements: Domestic terminals initiate a total of 4 million mobile-originated (MO) messages per day, all of which are monitored by the Smart Safeguard system. The average daily interception success rate for junk and fraudulent messages has increased from 57.25% to 93.60%, and the false interception ratio has fallen from 42.75% to 6.40%.
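
As a rough check on the figures above, the short Python calculation below derives the relative changes they imply. It uses only the numbers quoted in the two bullets. Reading Operator B's two percentages as complementary shares of the same set of intercepted messages is an assumption, suggested by the fact that each pair sums to 100%.

```python
def reduction(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, as a fraction."""
    return (before - after) / before


# Operator A: daily overseas junk SMS volume, lower and upper bounds.
low = reduction(500_000, 30_000)    # smallest implied reduction
high = reduction(600_000, 20_000)   # largest implied reduction
print(f"Operator A junk volume reduction: {low:.1%} to {high:.1%}")
# Operator A junk volume reduction: 94.0% to 96.7%

# Operator B: success rate and false interception ratio before and after.
before_success, before_false = 57.25, 42.75
after_success, after_false = 93.60, 6.40

# Assumption check: each pair sums to 100%, so the two metrics look complementary.
print(round(before_success + before_false, 2), round(after_success + after_false, 2))
# 100.0 100.0

false_drop = reduction(before_false, after_false)
print(f"Operator B false interceptions fell by {false_drop:.1%} in relative terms")
# Operator B false interceptions fell by 85.0% in relative terms
```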

 

In addition, ZTE’s AI anti-fraud technologies have been chosen by China’s Ministry of Industry and Information Technology as an innovative technology application for preventing and controlling telecom fraud, and they have been promoted nationwide.

Future Evolution and Prospect

The introduction of the anti-fraud model marks the beginning of AI model application in the communication sector. The Smart Safeguard series of models will continue to be developed and applied along several dimensions, including service scope, media capabilities, and industrial applications.

  • Field expansion and capability openness: ZTE will replicate these capabilities and open them up, further exploring the application of anti-fraud governance in 5G communication, IT, and content publishing.
  • Computer vision model: Beyond SMS text, multimedia content is a fast-growing vehicle for telecom fraud. To ensure that media content is trustworthy, secure, and reliable in the new 5G communications era, the Smart Safeguard model must be able to efficiently identify and combat fraudulent multimedia content in the future.
  • Industrial model: 5G industry customers have diverse requirements, including intelligent dialogues, industrial knowledge services, and enterprise applications. By supporting L0/L1/L2 AI models and integrating them into platforms on the new communications network side, such as the 5G messaging platform, ZTE’s “Smart Safeguard” model can rapidly meet the AI capability requirements of 5G industrial communication, effectively serving government and enterprise customers.