AI Model Empowers Intelligent Operation and Maintenance for Efficiency Enhancement

Release Date: 2024-05-16 By He Wei

With digital transformation accelerating across the industry, intelligent operation and maintenance (O&M) requirements in the telecom field are becoming increasingly complex, and intelligent O&M has emerged as a key factor in staying competitive in the digital era. However, because services develop rapidly and technologies are updated continuously, traditional O&M methods can no longer meet the evolving O&M requirements of communication devices. The advent of AI models has brought breakthroughs in intelligent O&M. AI models offer a more user-friendly man-machine interaction mode, process massive structured data, deliver high-precision analysis and prediction, and empower advanced O&M capabilities.

Applications of AI Model in Intelligent O&M

AI model technology has been widely used in intelligent O&M in the telecom field, in scenarios that include O&M knowledge Q&A, fault and exception detection, root cause location, and fault prediction and prevention.

  • O&M knowledge Q&A: When the AI model is used for knowledge Q&A in the telecom field, its capabilities in storage, memory, understanding, and application become especially crucial. By analyzing large volumes of communication data and technical documents, the AI model gains a deep understanding of communication devices, protocols, and network topologies. This understanding allows it to integrate contextual information efficiently when addressing complex communication issues and to generate answers quickly and accurately. The AI model can also continuously update and refine its knowledge base from real-time communication data and the latest industry trends, so that it keeps pace with new knowledge and provides timely, reliable support and guidance for O&M personnel.
  • Fault and exception detection: Using AI-model-based algorithms, the system processes and analyzes collected data to identify data or behaviors that deviate from the normal status. This typically involves feature extraction, data modeling and classification, and the formulation of anomaly judgment criteria. In practice, improving detection accuracy and robustness often requires taking into account the temporal characteristics, spatial correlations, and trends of the data. Tailored algorithms and models may also be required to accommodate the diverse data characteristics and business needs of different domains and application scenarios. Finally, the algorithms and models must be continuously updated and optimized to keep the system reliable and stable and to adapt to new fault types and changing scenarios.
  • Root cause location: Building on fault detection, ZTE further analyzes the abnormal data to infer the cause of the fault and identify its specific type and location. This draws on various diagnosis technologies and methods, such as fault tree analysis and expert systems. Through root cause analysis, the system can pinpoint the origin of an issue more accurately and take effective measures to resolve the fault, thereby enhancing system reliability and stability.
  • Fault prediction and prevention: The AI model learns from vast amounts of historical O&M data to identify patterns and trends in fault occurrence and to build a fault occurrence model. Using this model, it monitors and analyzes real-time data to predict potential fault risks and send early warnings, giving O&M personnel sufficient time to take preventive measures and reduce the fault rate. This predictive maintenance approach not only reduces the impact of sudden failures but also maximizes system stability and availability, enhancing operational efficiency and resource utilization.
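
The fault and exception detection item above mentions extracting features, taking temporal characteristics into account, and formulating judgment criteria. The Python sketch below illustrates one very simple instance of that idea, a rolling z-score over a KPI time series. It is not the method used in ZTE products; the function name, the window size, the threshold, and the sample CPU load values are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def detect_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate strongly from the preceding window.

    Hypothetical helper: each sample is compared with the mean and standard
    deviation of the `window` samples before it (a rolling z-score).
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change as a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up CPU load readings (percent) with one obvious spike.
cpu_load = [41, 42, 40, 43, 41, 42, 95, 42, 41, 40]
print(detect_anomalies(cpu_load))
```

A fixed threshold like this is only a starting point. As the text notes, real deployments usually need models tailored to the data and scenario, and need them updated as new fault types appear.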

 

Compared with traditional AI for IT operations (AIOps), the AI model provides enhanced intelligent O&M capabilities, such as simpler interaction, broader knowledge coverage, fault self-learning, and a more flexible model architecture. It lowers the maintenance threshold for O&M personnel and continuously generalizes O&M capabilities.

Architecture and Key Technologies of ZTE CCN AI O&M Model

The ZTE CCN AI O&M model is based on Nebula, the telecom-field model developed by ZTE. A high-quality corpus is used to fine-tune the base models and generate an AI O&M model oriented to the core network and network cloud (Fig. 1). There are three types of AI O&M model applications: intelligent interaction (CoPilot-I), intelligent analysis (CoPilot-A), and intelligent generation (CoPilot-G).

 

  • CoPilot-I: Provides functions such as professional knowledge Q&A, network health queries, and key indicator queries.
  • CoPilot-A: Provides fault analysis assistance, network optimization assistance, and inspection reports.
  • CoPilot-G: Generates inspection reports, operation solutions, and network reports.

 

To deliver these capabilities, all AI O&M model products for the ZTE core network and network cloud incorporate popular key technologies such as retrieval-augmented generation (RAG) and multi-agent collaboration.

  • Retrieval-Augmented Generation

To accomplish more complex and knowledge-intensive tasks, a more accurate and reliable system is needed, and the hallucination issue of AI models must be alleviated. RAG stands out as a key technology here: it helps large language models generate answers by first retrieving relevant information from external data sources. RAG can greatly enhance the accuracy and relevance of content, mitigating hallucination, accelerating knowledge updates, and improving the traceability of generated content. It has become the most popular system architecture for AI models to obtain new external knowledge.
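
The retrieve-then-generate flow can be sketched in a few lines of Python. The example below is a minimal illustration, not ZTE's implementation: it ranks a tiny, made-up knowledge base by bag-of-words cosine similarity and then assembles a prompt that asks the model to answer only from the retrieved passages. The knowledge base entries, function names, and prompt wording are all assumptions, and the call to an actual language model is deliberately left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, top_k=2):
    """Return up to top_k documents most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, passages):
    """Number the retrieved passages so the answer can cite its sources."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the numbered context and cite it.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical O&M knowledge snippets, invented for this example.
KNOWLEDGE_BASE = [
    "Link down alarm on the signaling interface: check the physical port and the peer status.",
    "High CPU load alarm on a virtual machine: check for traffic bursts and scale out.",
    "Inspection reports summarize the health of network functions on a schedule.",
]

query = "How do I handle a link down alarm on the signaling interface?"
passages = retrieve(query, KNOWLEDGE_BASE, top_k=1)
prompt = build_prompt(query, passages)
print(prompt)
```

Numbering the passages in the prompt is what supports traceability: the generated answer can point back to the source it relied on. Production systems typically replace the word-count similarity with dense embeddings and a vector index.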

  • Multi-Agent Collaboration Architecture

Multi-agent collaboration refers to the process in which multiple agents communicate and collaborate with each other in a shared environment to achieve common goals. Each agent possesses a degree of autonomy and intelligence, enabling it to perceive, decide, and act based on information from the environment. Through interaction and cooperation, the system draws on the strengths of each agent to make decisions and take actions more efficiently and intelligently. Building on the multi-agent collaboration architecture, independent agents such as knowledge experts, fault experts, on-duty experts, and solution experts can be created and made to work together, forming an intelligent network O&M system architecture.
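
As a rough illustration of how such expert agents might cooperate, the Python sketch below passes a shared context through a fault expert, a knowledge expert, and a solution expert in turn. It is a deliberately simplified assumption: the class names, the sequential coordinator, and the rule-based handlers are invented for this example, whereas agents in a real system would be model-backed and would decide more autonomously when and how to act.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """A named expert that reads the shared context and returns new facts."""
    name: str
    handle: Callable[[Dict[str, str]], Dict[str, str]]


@dataclass
class Coordinator:
    """Hypothetical coordinator that runs agents in order over a shared context."""
    agents: List[Agent] = field(default_factory=list)

    def run(self, task: Dict[str, str]) -> Dict[str, str]:
        context = dict(task)  # copy so the caller's task is not mutated
        for agent in self.agents:
            context.update(agent.handle(context))
        return context


def fault_expert(ctx):
    # Toy rule: classify the alarm text into a fault type.
    fault = "link_down" if "link down" in ctx["alarm"].lower() else "unknown"
    return {"fault_type": fault}


def knowledge_expert(ctx):
    # Toy lookup table standing in for a knowledge base.
    notes = {"link_down": "Check the physical port and the peer status."}
    return {"guidance": notes.get(ctx["fault_type"], "No guidance found.")}


def solution_expert(ctx):
    # Combine what the other agents contributed into an action plan.
    return {"plan": f"Fault: {ctx['fault_type']}. Next step: {ctx['guidance']}"}


coordinator = Coordinator([
    Agent("fault expert", fault_expert),
    Agent("knowledge expert", knowledge_expert),
    Agent("solution expert", solution_expert),
])
result = coordinator.run({"alarm": "Link down on signaling interface"})
print(result["plan"])
```

The shared context dictionary plays the role of the shared environment: each agent only needs to know which keys it reads and writes, so experts can be added or replaced independently.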

Challenges and Future Development of AI Models in Intelligent O&M

Although AI models have promising application prospects and advantages in the intelligent O&M field, challenges remain, including improving model adaptability, reducing complexity, and addressing data privacy and security issues.

In the future, as technologies and application scenarios continue to develop, AI models will be applied more widely and deeply in the intelligent O&M field. For instance, with the proliferation and advancement of edge computing, AI models will gradually migrate to the edge to achieve more efficient, real-time intelligent O&M. At the same time, AI models will be closely integrated with machine learning, deep learning, and other technologies to further enhance the efficiency and precision of intelligent O&M. Handling ever larger volumes of data will bring AI models both challenges and opportunities. We therefore need to keep exploring, innovating, and applying practices tailored to specific scenarios and requirements, and to strengthen research and development in related technologies to advance intelligent O&M.