Agricultural Bank of China: Building the ABC ONE+ financial network and exploring new end-to-end operation and maintenance models


With the development of financial technology and the deepening of digital transformation, online and mobile banking services are growing rapidly, driving fast growth in data center network scale and traffic. Network operation and security protection are becoming more complex and more difficult by the day. At the same time, financial services must run without interruption 24 hours a day, 7 days a week, financial accounting and transactions must be highly traceable, and data center operations must become more intelligent and efficient.

Under the guidance of the new three-year plan for the ABC ONE+ network, Agricultural Bank of China (ABC) has fully launched research into integrated intelligent operation and maintenance of business and network. It has taken the lead in innovating and optimizing its traffic backtracking and analysis system, achieving breakthroughs in end-to-end network traffic collection, business path restoration, and real-time intelligent analysis, and thereby safeguarding the development of financial technology and digital transformation.

The operational trends and challenges faced by financial digital transformation

The development of the Internet, cloud computing, and big data is driving tremendous changes in the infrastructure and management objects of data centers.

On the one hand, infrastructure is changing: cloud adoption and server resource pooling have driven the development of network virtualization. Multi-cloud, multi-site, and multi-data-center deployments have become the choice of many large banks and joint-stock banks. Fast-growing data center traffic has shifted from being dominated by north-south traffic to being dominated by east-west traffic, and the scale and complexity of data center operation and maintenance are increasing day by day.

On the other hand, the objects being managed have changed. Data centers have gradually shifted from traditional centralized minicomputers and mainframes to distributed architectures, and the objects of operation and management have shifted from "traditional physical hardware such as hosts and devices" to "software resources and data such as applications and services". The scope and requirements of data center operation and management have grown accordingly.

In this context, IT operation and maintenance tools keep emerging and diversifying. From the "agricultural era" of manual operation and maintenance, to the "industrial era" of automated operation and maintenance, and on to the "intelligent era" of intelligent operation and maintenance, the technology has advanced by leaps in recent years. In the day-to-day management and operation of the financial industry, however, operation and maintenance systems have lacked unified planning. Faced with fluctuating business experience quality, complex application migration and launch strategies, and massive volumes of logs and alarms, data center operation and maintenance has gradually exposed problems such as:

The mapping relationship between business and network is unclear: Traditional network traffic collection is mostly done by bypass traffic mirroring on physical devices, which cannot reach down past the virtual network boundary and therefore leaves blind spots in network monitoring. Meanwhile, network operation and maintenance tools focus on the status of the network itself and cannot look up to see the overall performance of the business. Even when the network detects a fault, it cannot determine the scope of business impact. Therefore, once inter-service access relationships and the Overlay-to-Underlay network relationship have been mapped, mapping business state to network state becomes the next challenge for business and network visualization.

Slow fault demarcation and localization: A data center may have more than ten different business and network management systems, each managed in isolation with strictly drawn boundaries, which leads to duplicate traffic collection and ineffective information sharing between systems. Only when a fault alarm requires joint localization do the teams work together manually to determine the location and cause of the problem. This often takes several days, so fault demarcation and localization is slow and inefficient.

Difficulty in reproducing problems such as poor network quality: As data centers move to distributed architectures, many-to-one ("incast") traffic patterns cause numerous quality problems on the live network, such as microbursts and packet loss. At the business level, this type of problem shows up only as lag or performance degradation. At the network level, the lack of systematic data analysis and evaluation makes such problems hard to detect proactively or to reproduce, and leaves no evidence for troubleshooting after the fact. The only option is to manually inspect table entries, alarms, and other information, which is time-consuming and demands a high level of technical skill. As a result, the network department can only work with the business department to locate and analyze the problem repeatedly, which raises the bar for systematic investigation and early identification of hidden network risks.

Therefore, a common challenge for the financial industry is how to break through the responsibility boundaries and management scopes of different management systems without affecting the operation and maintenance systems already running on the live network. With this in mind, Agricultural Bank of China has launched a new exploration of integrated intelligent operation and maintenance of business and network, and has clearly identified end-to-end intelligent operation and maintenance of the entire network as the roadmap and direction for its data center operation and maintenance.

Leapfrog evolution: Agricultural Bank of China breaks the boundary between business and network operation and maintenance for the first time

In 2022, in order to break the boundary between business and network, Agricultural Bank of China launched an exploration of integrated intelligent operation and maintenance of business and network. On the one hand, it sorted out and identified the operational pain points and problems across the entire bank. On the other hand, it communicated actively with vendors such as Huawei, explored the latest technologies and operation and maintenance directions in the industry, and drew on the strengths of each. The resulting traffic backtracking analysis system consists of two parts, a business performance management system and a network intelligent operation and maintenance system, and the following innovative practices have been carried out on this logical architecture.

Exploration 1: Service-oriented network operation and maintenance capabilities, proactive status reporting

In order to provide network data quickly to the business performance management system, the network is made service-oriented: more than 100 APIs fully open up its network data services. Through drag-and-drop orchestration, scenario-based APIs can be quickly published and integrated with the upper-layer business performance management system. This breaks away from the traditional hard-coded development mode and greatly shortens the integration cycle between systems.
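To make the contrast with hard-coded integration concrete, the following is a minimal sketch of composing a scenario-based API from a declarative list of atomic data services. Everything in it is hypothetical: the service names (`link_status`, `packet_loss`), the `ATOMIC_APIS` registry, and `build_scenario_api` are illustrative stand-ins and do not describe the bank's or any vendor's actual APIs.

```python
from typing import Callable, Dict, List

# Hypothetical atomic services. Real ones would query the network
# operation and maintenance system; these return fixed sample data.
def link_status(device: str) -> dict:
    return {"link_up": True}

def packet_loss(device: str) -> dict:
    return {"loss_pct": 0.0}

# Registry of atomic services that a scenario can be assembled from.
ATOMIC_APIS: Dict[str, Callable[[str], dict]] = {
    "link_status": link_status,
    "packet_loss": packet_loss,
}

def build_scenario_api(steps: List[str]) -> Callable[[str], dict]:
    """Compose a scenario API from a list of atomic service names."""
    missing = [name for name in steps if name not in ATOMIC_APIS]
    if missing:
        raise KeyError(f"unknown atomic APIs: {missing}")

    def scenario(device: str) -> dict:
        # Call each atomic service in order and merge the results.
        result = {"device": device}
        for name in steps:
            result.update(ATOMIC_APIS[name](device))
        return result

    return scenario

# A "device health" scenario defined by configuration, not by new code.
device_health = build_scenario_api(["link_status", "packet_loss"])
print(device_health("leaf1"))
```

In this sketch, adding a new scenario means changing only the list of step names, which is one plausible way to read what the drag-and-drop orchestration produces behind the scenes.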

Exploration 2: Combined traffic mirroring, end-to-end path restoration

In order to provide comprehensive quality assurance for business and achieve comprehensive traffic mirroring, Agricultural Bank of China mirrors all traffic at key boundary nodes such as DC exits, Fabric exits, and VAS device interconnection ports, and the business performance management system performs session and network performance analysis on it. Inside the Fabric, TCP feature packets are mirrored via ERSPAN and sent to the network intelligent operation and maintenance system, which restores the forwarding path within the Fabric. By combining these two types of traffic mirroring, end-to-end mirroring and path restoration are achieved. Mirrored traffic can also be deduplicated, decrypted, and desensitized, which reduces the load on the analysis side.
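The path restoration idea can be illustrated with a minimal sketch: if the same TCP packet is mirrored at several Fabric devices, ordering those sightings by time yields the forwarding path. The record layout (`MirrorRecord`), the function name `restore_paths`, and the device names are assumptions made for illustration, and the sketch assumes device clocks are synchronized closely enough for timestamps to be comparable. It is not the actual algorithm used by the system described above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (src_ip, src_port, dst_ip, dst_port) identifies a TCP flow.
FlowKey = Tuple[str, int, str, int]

@dataclass(frozen=True)
class MirrorRecord:
    """One mirrored sighting of a TCP packet (hypothetical layout)."""
    flow: FlowKey
    tcp_seq: int      # sequence number, used to match the same packet
    device: str       # Fabric device that mirrored the packet
    timestamp: float  # capture time; assumes synchronized clocks

def restore_paths(
    records: List[MirrorRecord],
) -> Dict[Tuple[FlowKey, int], List[str]]:
    # Deduplicate: keep the earliest sighting per (packet, device).
    earliest: Dict[Tuple[FlowKey, int, str], MirrorRecord] = {}
    for rec in records:
        key = (rec.flow, rec.tcp_seq, rec.device)
        if key not in earliest or rec.timestamp < earliest[key].timestamp:
            earliest[key] = rec

    # Group sightings of the same packet across devices.
    per_packet: Dict[Tuple[FlowKey, int], List[MirrorRecord]] = defaultdict(list)
    for rec in earliest.values():
        per_packet[(rec.flow, rec.tcp_seq)].append(rec)

    # Order each packet's sightings by time to recover the hop sequence.
    paths: Dict[Tuple[FlowKey, int], List[str]] = {}
    for packet, hops in per_packet.items():
        hops.sort(key=lambda r: r.timestamp)
        paths[packet] = [h.device for h in hops]
    return paths

flow = ("10.0.0.1", 40000, "10.0.1.1", 443)
records = [
    MirrorRecord(flow, 1000, "spine1", 1.1),
    MirrorRecord(flow, 1000, "leaf1", 1.0),
    MirrorRecord(flow, 1000, "leaf2", 1.2),
    MirrorRecord(flow, 1000, "leaf1", 1.05),  # duplicate mirror copy
]
paths = restore_paths(records)
print(paths[(flow, 1000)])
```

Matching on a small set of packets per flow rather than on every packet keeps the mirrored volume low, which is consistent with mirroring only TCP feature packets inside the Fabric.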

Created on: December 23, 2023, 10:17