Topological AI-Analysis vs. Classical Cross-Correlation: Overcoming Legacy Defectoscope Vulnerabilities
Aleksandr Ivanaiskii, PhD
Industrial AI Founder & Systems Architect
Evgeny Ivanaiskii, PhD
Domain Expert
Sergei Shipilov
AI Architecture Lead, Rivixi LLC
Abstract
This paper investigates a critical, undocumented vulnerability in the classical cross-correlation algorithms widely deployed in traditional acoustic leak detectors. Legacy hardware relies on a spatially constrained time-delay window ($\tau_{\max}$), which causes diagnostic accuracy to degrade sharply as the distance between acoustic sensors increases.
Through extensive experimental auditing on a super-heterogeneous dataset of 371 field recordings, we demonstrate that expanding the physical monitoring distance from 50 to 150 meters collapses system specificity from 95% to 0%, resulting in a 100% False Positive Rate under normal operating conditions. This analysis shows that a direct, "blind" software reengineering of legacy digital signal processing (DSP) pipelines simply duplicates these inherent physical limitations. To overcome this barrier, we present a distance-invariant hybrid topological approach using a combination of a deep 1D Convolutional Neural Network (1D-CNN / Acoustic1DNet) and a 2D Convolutional Neural Network (2D-CNN / Acoustic2DNet) operating on Mel-Spectrograms, which maintains a stable specificity of 97.7% regardless of sensor separation.
1. Introduction & The Challenge
Acoustic defectoscopes and correlators from previous generations (legacy hardware) remain the industry standard for non-destructive testing (NDT) in municipal water and district heating utilities. Manufacturers of these instruments routinely advertise their systems as universal and highly accurate, while keeping silent about their mathematical and physical limitations.
During the development of cloud-based monitoring systems, engineering teams frequently attempt to reengineer these proprietary algorithms to ingest legacy data streams. However, testing reveals that classical cross-correlation engines suffer from a severe scalability bottleneck. As the distance between sensor nodes increases, the probability of false alarms rises sharply (in our audit, the false positive rate grows from 4.7% at 50 m to 100% at 150 m; see Table 1). Direct software duplication of legacy DSP code is therefore unviable for automated, wide-area municipal pipeline networks, and a complete architectural shift is required.
2. The Mathematics of Window Expansion: The max_tau Vulnerability
Conventional acoustic leak localization is based on calculating the cross-correlation function between signals recorded by two synchronized sensors positioned at opposite ends of a pipe segment. The algorithm scans for a peak indicating the relative arrival delay of the leak's acoustic wave.
The search range for this delay, denoted as $\tau_{\max}$, is bounded by the physical distance between the sensors ($L$) and the speed of acoustic propagation in the pipe medium ($v$):
$$|\tau| \le \tau_{\max} = \frac{L}{v}$$
For short segments (e.g., $L$ of a few tens of meters), the search window remains small, and the Peak-to-Noise Ratio (PNR) can identify leaks reliably. However, on transmission mains or long municipal runs (distances of 150 m, 500 m, or more), the search window expands in proportion to $L$. At this point, the mathematical foundation of cross-correlation experiences a probabilistic breakdown.
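The windowed delay search can be sketched as follows. This is a minimal illustration, not the code of any legacy instrument: the propagation speed (1200 m/s), the sampling rate (4800 Hz), the synthetic signals, and all function names are assumptions chosen only to make the numbers easy to follow.

```python
import math
import random


def max_lag_samples(distance_m, speed_m_s, sample_rate_hz):
    """Half-width of the delay search window in samples (tau_max = L / v)."""
    return math.ceil(distance_m * sample_rate_hz / speed_m_s)


def correlation_at_lag(a, b, lag):
    """Sum of a[i] * b[i + lag] over all indices where both samples exist."""
    start = max(0, -lag)
    stop = min(len(a), len(b) - lag)
    return sum(a[i] * b[i + lag] for i in range(start, stop))


def estimate_delay(a, b, max_lag):
    """Lag within [-max_lag, max_lag] with the highest correlation."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: correlation_at_lag(a, b, lag))


# Hypothetical parameters, chosen only for illustration.
SPEED_M_S = 1200.0       # assumed propagation speed in the pipe
SAMPLE_RATE_HZ = 4800.0  # assumed sampling rate

window_50m = max_lag_samples(50, SPEED_M_S, SAMPLE_RATE_HZ)
window_150m = max_lag_samples(150, SPEED_M_S, SAMPLE_RATE_HZ)

# Synthetic check: sensor B hears the same noise 7 samples after sensor A.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(400)]
true_delay = 7
sensor_a = source
sensor_b = [0.0] * true_delay + source[:-true_delay]
estimated = estimate_delay(sensor_a, sensor_b, max_lag=20)
```

Under these assumed values the search window grows from 200 to 600 samples on each side when the span goes from 50 m to 150 m, so the peak search has three times as many candidate lags to examine.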
3. Experimental Auditing & Performance Collapse
To evaluate this vulnerability, we reconstructed the classical cross-correlation DSP pipeline and audited its performance on an empirical dataset of 371 audio recordings gathered from operational utility networks. The dataset is structured as follows:
- 215 "Normal" records (clean pipe segments, background noise without leaks);
- 156 "Leak" records (pipe segments with physically verified leaks).
We evaluated the performance of the classical DSP cross-correlation method against the RIVIXI AI topological classifier over varying target pipe lengths (from 10 to 250 meters) using the same set of audio recordings. The results of this distance sweep are presented in Table 1.
Table 1. Comparative metrics of classical DSP and RIVIXI AI over varying sensor distances
| Distance L (m) | DSP Sensitivity (%) | DSP Specificity (%) | DSP FPR (%) | RIVIXI AI Sensitivity (%) | RIVIXI AI Specificity (%) | RIVIXI AI FPR (%) |
|---|---|---|---|---|---|---|
| 10 | 0.0% | 100.0% | 0.0% | 97.4% | 97.7% | 2.3% |
| 50 | 0.0% | 95.3% | 4.7% | 97.4% | 97.7% | 2.3% |
| 100 | 35.3% | 42.8% | 57.2% | 97.4% | 97.7% | 2.3% |
| 150 | 60.3% | 0.0% | 100.0% | 97.4% | 97.7% | 2.3% |
| 200 | 72.4% | 0.0% | 100.0% | 97.4% | 97.7% | 2.3% |
| 250 | 80.8% | 0.0% | 100.0% | 97.4% | 97.7% | 2.3% |
Analysis of distance sweep results:
- At short distances (L = 10–50 m): the classical DSP method maintains a high specificity (95.3%–100.0%), but its sensitivity remains at 0.0% because the arrival delay of distant leak waves falls outside the narrow search window.
- At longer distances (L = 100–250 m): DSP sensitivity improves from 35.3% at 100 m to 80.8% at 250 m as the window expands. However, specificity falls to 42.8% at 100 m and collapses to 0.0% (FPR = 100%) from 150 m onward. The instrument begins to trigger false alarms on every clean pipe.
- The RIVIXI AI topological approach shows complete distance invariance, maintaining a stable sensitivity of 97.4% and a specificity of 97.7% across all physical spans from 10 to 250 meters.
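For readers who want to reproduce the percentages in Table 1, the metrics follow directly from confusion-matrix counts. The sketch below uses the RIVIXI AI counts reported later in this paper (152 of 156 leaks and 210 of 215 normal recordings classified correctly). The DSP true-positive count at 150 m is not stated in the paper; the value 94 is back-computed from 60.3% of 156 and should be treated as an assumption. The function name is illustrative.

```python
def rates(tp, fn, tn, fp):
    """Sensitivity, specificity and false positive rate from raw counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fpr": 1.0 - specificity,  # FPR is the complement of specificity
    }


# RIVIXI AI: counts as reported in the paper.
ai = rates(tp=152, fn=4, tn=210, fp=5)

# Classical DSP at 150 m: 0% specificity means all 215 normal records alarm.
# tp=94 is an assumption back-computed from the 60.3% sensitivity in Table 1.
dsp_150 = rates(tp=94, fn=62, tn=0, fp=215)

for name, result in (("RIVIXI AI", ai), ("DSP @ 150 m", dsp_150)):
    print(name, {key: f"{100 * value:.1f}%" for key, value in result.items()})
```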
The Mathematical Cause of Specificity Degradation
Tripling the search space (for example, when the sensor distance grows from 50 m to 150 m) increases the probability that random, uncorrelated environmental noises (e.g., urban traffic, water pumps, pressure fluctuations) will momentarily align across the two sensors. The static DSP algorithm interprets these random alignments as genuine correlation peaks: they exceed the PNR threshold of 8.5 and, at 150 m and beyond, generate a 100% False Positive Rate.
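We plotted the probabilistic collapse of a conventional correlator (Figure 1) and the relationship between sensor distance and specificity for both methods (Figure 2).
Figure 1. False correlation peaks produced by random noise in the widened search window of a conventional correlator.
Figure 2. Specificity versus sensor distance for conventional cross-correlation (dashed red line) and RIVIXI AI (green line).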
Chart Explanation: When the sensor distance is increased to 150 meters, the search window expands 3-fold relative to the 50-meter baseline.
- Probability Theory in Noise: The wider the search window, the higher the mathematical probability that random, uncorrelated phase fluctuations of background noise on Sensor A and Sensor B will temporarily align.
- False Peak Generation: Over wide time intervals, random noise inevitably sums into false, high-amplitude correlation peaks (as illustrated in Figure 1). The classical DSP algorithm lacks physical domain knowledge; it simply identifies any mathematical peak exceeding the threshold and triggers a False Positive alarm.
- Summary: At distances exceeding 120–150 meters in a clean, leak-free pipe, background noise generates a false correlation peak in 100% of cases due to the excessively wide search window. Specificity collapses to 0%, causing the hardware to report non-existent leaks.
The chart in Figure 2 visually demonstrates that conventional cross-correlation (dashed red line) suffers a catastrophic drop in specificity, reaching 0% once sensor distance exceeds 120–150 meters.
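The window-width effect described above can be illustrated with a small Monte Carlo sketch. It correlates two independent Gaussian noise sequences (a clean pipe with no common source) and records the largest normalized correlation found inside a narrow and a wide lag window. The signal length, window sizes, trial count, and function names are assumptions for illustration; the sketch does not reproduce the PNR statistic or the 8.5 threshold of the audited pipeline.

```python
import math
import random


def peak_abs_correlation(a, b, max_lag):
    """Largest |normalized cross-correlation| over lags in [-max_lag, max_lag]."""
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        start, stop = max(0, -lag), min(len(a), len(b) - lag)
        r = sum(a[i] * b[i + lag] for i in range(start, stop)) / norm
        best = max(best, abs(r))
    return best


def mean_false_peak(max_lag, trials=20, n=256, seed=0):
    """Average spurious peak height for independent noise on both sensors."""
    rng = random.Random(seed)  # same seed -> same noise for every window size
    total = 0.0
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += peak_abs_correlation(a, b, max_lag)
    return total / trials


narrow = mean_false_peak(max_lag=20)  # short span, small window
wide = mean_false_peak(max_lag=60)    # 3x span, 3x window
print(f"mean spurious peak: narrow={narrow:.3f}, wide={wide:.3f}")
```

Because the wide window contains every lag of the narrow one, the spurious peak can only stay the same or grow as the window widens, which is why a fixed detection threshold is crossed more often on long spans.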
4. The Dead-End of Blind Reengineering
Rebuilding legacy DSP algorithms for cloud architectures without modification replicates their core architectural flaws. Because the mathematical framework cannot distinguish between random noise alignment and genuine leak signatures in wide windows, the system requires constant human supervision to manually reject false peaks. This completely invalidates the economic benefit of automated, large-scale cloud diagnostics.
5. The RIVIXI AI Hybrid Topological Solution
To bypass the physical limits of classical DSP, the RIVIXI AI platform implements a hybrid topological approach that analyzes the spectral and temporal signatures of the sound rather than absolute time delays. The system integrates two complementary neural network architectures:

5.1 One-Dimensional Acoustic Wave Analysis (1D-CNN)
Our deep 1D Convolutional Neural Network (Acoustic1DNet) acts directly on the raw acoustic waveform. It extracts temporal feature maps and detects the continuous acoustic signature of pressurized fluid escaping a pipe, separating it from transient or impulsive mechanical noises.
5.2 Two-Dimensional Computer Vision (2D-CNN)
To analyze the signal's spectral-temporal structure, the raw audio is transformed into a Mel-Spectrogram: the Short-Time Fourier Transform (STFT) is computed frame by frame, and the resulting power spectrum is mapped onto Mel-scale frequency bands. The x-axis represents time, the y-axis represents Mel-frequency bins, and the color intensity represents acoustic energy. A 2D Convolutional Neural Network (based on the ResNet / Acoustic2DNet architecture) processes this spectrogram as an image. The network learns to visually recognize the distinct, steady-state frequency band characteristic of a micro-leak, while ignoring the chaotic and variable background noise patterns of factory floors or urban traffic.
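The feature extraction step can be sketched in plain Python as follows. This is a minimal illustration of a Mel-Spectrogram computation, not the platform's actual preprocessing: the frame size, hop size, number of Mel bands, sampling rate, and the common 2595 * log10(1 + f / 700) Mel formula are all assumptions, and a naive DFT is used for clarity rather than speed.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power of the one-sided DFT of a Hann-windowed frame (naive O(N^2))."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))) ** 2
            for k in range(n // 2 + 1)]


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters with centers equally spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(i * top / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        bank.append([max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
                     for f in bin_hz])
    return bank


def mel_spectrogram(signal, sample_rate, n_fft=64, hop=32, n_mels=8):
    """Return a list of frames, each a list of n_mels band energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    spec = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        spec.append([sum(w * p for w, p in zip(row, power)) for row in bank])
    return spec


# Synthetic steady tone (500 Hz at an assumed 2000 Hz sampling rate).
SAMPLE_RATE = 2000
tone = [math.sin(2 * math.pi * 500 * n / SAMPLE_RATE) for n in range(320)]
spec = mel_spectrogram(tone, SAMPLE_RATE)
dominant_bands = [frame.index(max(frame)) for frame in spec]
```

For this synthetic tone every frame has the same dominant Mel band, which is the kind of steady horizontal stripe a 2D-CNN can pick up in the spectrogram image.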
[Figure: input features for both AI approaches (raw acoustic waveform for the 1D-CNN, Mel-Spectrogram for the 2D-CNN)]
Key Advantages of the Hybrid AI Approach:
- Distance Invariance: The neural networks operate independently of the $\tau_{\max}$ parameter. The classifier achieves a mean specificity of 97.7% (210 of 215 normal recordings correctly classified, 95% Wilson score confidence interval: [94.7%, 99.0%]) and a sensitivity of 97.4% (152 of 156 leak recordings correctly classified, 95% Wilson score confidence interval: [93.6%, 99.0%]). These metrics remain constant across all tested sensor spans from 10 to 250 meters (green line in Figure 2).
- Statistical Separation (Welch's t-test): To confirm the statistical separation of the RIVIXI AI model, we performed Welch's t-test comparing the neural network's leak probability outputs between the Normal ($n = 215$) and Leak ($n = 156$) cohorts across the entire dataset:
- Normal cohort: , ;
- Leak cohort: , ;
- T-test result: , . The extremely low $p$-value confirms a highly significant separation of classes.
- Comparison to Classical DSP at 150m: Performing Welch's t-test on the peak Z-scores of the conventional cross-correlator at 150m () shows a statistical difference between the Normal cohort (, ) and the Leak cohort (, ). However, because the mean Z-score of the clean pipes () lies far above the detection threshold of 8.5, the classical detector cannot separate the cohorts, leading to a collapse of specificity.
- Multi-Dimensional Noise Rejection: By evaluating the acoustic wave in both the time and frequency domains, the system rejects random noise alignments that fool static cross-correlation math.
- Hardware-Agnostic MLOps: When new sensor types or low sampling rates (such as legacy 1638 Hz inputs) cause domain shift, the continuous training pipeline (MLOps) updates the network weights to maintain diagnostic integrity without code modifications.
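The two statistical tools named in the list above can be sketched with the Python standard library. The Wilson score interval is computed from the counts reported in this paper (210 of 215 and 152 of 156). The Welch's t-test part runs on a tiny synthetic sample, because the per-recording model outputs are not available here; those sample values and all function names are assumptions for illustration only.

```python
import math
from statistics import mean, variance


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z=1.96 -> 95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Counts reported in the paper.
spec_lo, spec_hi = wilson_interval(210, 215)  # specificity
sens_lo, sens_hi = wilson_interval(152, 156)  # sensitivity
print(f"specificity 95% CI: [{100 * spec_lo:.1f}%, {100 * spec_hi:.1f}%]")
print(f"sensitivity 95% CI: [{100 * sens_lo:.1f}%, {100 * sens_hi:.1f}%]")

# Synthetic leak-probability outputs (NOT the paper's data).
normal_scores = [0.02, 0.05, 0.03, 0.08, 0.04]
leak_scores = [0.95, 0.91, 0.99, 0.97]
t_stat, dof = welch_t(normal_scores, leak_scores)
print(f"Welch t = {t_stat:.2f}, df = {dof:.2f}")
```

The sketch stops at the t statistic and degrees of freedom; turning them into a $p$-value requires the Student t distribution, which in practice is usually delegated to a statistics library (for example `scipy.stats.ttest_ind` with `equal_var=False`).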
6. Conclusion & Business Impact
Software-based reengineering of legacy NDT tools reveals that classical correlation mathematics has reached its physical limits. The specificity collapse of cross-correlation on long segments cannot be resolved within static DSP frameworks. Transitioning to topological, hybrid deep-learning classifiers (like those in RIVIXI AI) removes the spatial constraint, allowing utility operators to monitor wide-area municipal networks automatically with high reliability and a low false alarm rate (2.3% FPR on the audited dataset).
Limitations
- Compute Demands: High-throughput 1D and 2D CNN inference requires edge NPU or cloud GPU resources, which increase operational costs compared to simple, low-power DSP microcontrollers.
- Domain Shift Sensitivity: CNN accuracy depends on representative training data; encountering entirely new sensor hardware or sampling rates requires retraining via an active MLOps loop.
Citation
This research paper is permanently archived as a preprint on Zenodo:
Ivanaiskii, A., Ivanaiskii, E., & Shipilov, S. (2026). Topological AI-Analysis vs. Classical Cross-Correlation: Overcoming Legacy Defectoscope Vulnerabilities [Preprint]. Zenodo. https://doi.org/10.5281/zenodo.20673744