Utilizing Zero-Crossing Rate (ZCR) for Acoustic Leak Detection in Pipelines: From Empirical Models to a Physically Grounded DSP Pipeline
Aleksandr Ivanaiskii, PhD
Industrial AI Founder & Systems Architect
Evgeny Ivanaiskii, PhD
Domain Expert
Sergei Shipilov
AI Architecture Lead, Rivixi LLC
Abstract
This paper presents the physical and mathematical justification for incorporating the Zero-Crossing Rate (ZCR) metric to enhance the selectivity of acoustic defectoscopes in pipeline leak detection. We analyze the limitations of empirical classification methods, specifically the misuse of NLP-inspired text tokenization on raw acoustic waveforms, and propose a rigorous, physically grounded digital signal processing (DSP) approach. We show that hydrodynamic noise from pressurized fluid leaks exhibits a characteristic high-frequency, stationary pattern with an elevated ZCR. Integrating ZCR as a verification stage for the cross-correlation Z-score allows the RIVIXI AI system to distinguish real leaks from impulsive mechanical noise (false alarms) with high statistical confidence.
1. Introduction and the Problem of False Positives
Traditional acoustic leak locators, which rely on the cross-correlation Z-score, are highly effective at finding the spatial location of a noise source along a pipeline. However, they are prone to false positives. A single mechanical impact on the pipe (e.g., dropping a tool, heavy vehicle traffic, or valve operations) produces a high-energy acoustic pulse. The cross-correlation algorithm will successfully align this pulse, yielding a high Z-score and triggering a false alarm.
Historically, various methods have been proposed to filter out these non-leak anomalies. Some researchers experimented with Natural Language Processing (NLP) techniques, converting zero-crossing intervals of the acoustic wave into space-separated text tokens to be analyzed by recurrent neural networks (LSTMs). However, tokenizing a continuous waveform in this manner destroys its topological and mathematical continuity, leading to out-of-vocabulary (OOV) errors and poor generalization. Building a trustworthy, industrial-grade AI system requires abandoning such heuristics in favor of physically grounded DSP parameters.
2. Physics of Hydrodynamic Noise
Distinguishing true leaks from transient mechanical noise is possible due to the fundamental differences in their underlying physics:
- Hydrodynamic Noise (Leaks): Fluid discharge through an orifice under pressure produces turbulent flow and local cavitation. This process generates continuous, broadband, high-frequency acoustic emissions (a characteristic "hiss"). The raw waveform of this signal oscillates rapidly and carries substantial high-frequency energy.
- Mechanical Noise (Impacts/Vibration): Impulsive forces acting on the pipe excite its natural structural resonances. The pipe acts as an acoustic filter, attenuating high frequencies and leaving a low-frequency resonant ring (a "hum" or "thud").
The primary time-domain indicator of this high-frequency hiss—without the computational overhead of continuous Fast Fourier Transforms (FFT)—is the Zero-Crossing Rate (ZCR).

3. Mathematical Model of Zero-Crossing Rate
The Zero-Crossing Rate (ZCR) measures the rate of sign changes along a signal frame. For a discrete signal frame of length $N$, the ZCR is defined as:

$$
\mathrm{ZCR} = \frac{1}{N-1} \sum_{n=1}^{N-1} \mathbb{1}\{x[n]\,x[n-1] < 0\}
$$

where:
- $x[n]$ is the signal amplitude at discrete time step $n$;
- $\mathbb{1}\{A\}$ is the indicator function, returning $1$ if condition $A$ is true (sign change between adjacent samples) and $0$ otherwise.
For a low-frequency mechanical hum (resembling a low-frequency sinusoid), the signal crosses zero infrequently. Conversely, a leak's high-frequency turbulent hiss (broadband white/pink noise) is a high-frequency stochastic process that crosses zero hundreds or thousands of times per second.
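The contrast between a hum and a hiss can be checked numerically. The sketch below computes the ZCR as the fraction of adjacent sample pairs whose signs differ. The sampling rate, the 50 Hz tone and the white-noise stand-in for a leak are illustrative assumptions, and `zero_crossing_rate` is a hypothetical helper name.

```python
import numpy as np


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    x = np.asarray(frame, dtype=float)
    if x.size < 2:
        return 0.0
    # A strictly negative product of neighbours marks a sign change;
    # samples that are exactly zero are not counted in this sketch.
    return float(np.mean(x[1:] * x[:-1] < 0))


fs = 8000               # hypothetical sampling rate, Hz
t = np.arange(fs) / fs  # one second of samples

hum = np.sin(2 * np.pi * 50 * t + 0.1)  # 50 Hz tone; the phase offset avoids exact zeros
rng = np.random.default_rng(0)
hiss = rng.standard_normal(fs)          # white noise as a stand-in for a leak hiss

zcr_hum = zero_crossing_rate(hum)
zcr_hiss = zero_crossing_rate(hiss)
print(f"hum  ZCR = {zcr_hum:.4f}")
print(f"hiss ZCR = {zcr_hiss:.4f}")
```

A sinusoid of frequency f crosses zero 2f times per second, so its per-sample ZCR is close to 2f/fs (about 0.0125 here), whereas uncorrelated white noise changes sign on roughly half of all sample pairs. Band-limited field recordings will fall between these two extremes.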
4. Integrating ZCR into the RIVIXI AI Hybrid DSP Pipeline
In RIVIXI AI, the ZCR calculation has been integrated into the second container of the SaaS platform (the defectoscope engine) as a verification layer.
To improve robustness against both non-stationary interference (e.g., speech or music) and low-frequency resonant noise (e.g., impacts or soil hum), a two-stage spectral verification logic is implemented:
- Spatial Localization (Z-Score): The cross-correlation of signals from two sensors is computed. If the maximum peak exceeds a calibrated Z-score threshold, a strong acoustic source is localized.
- Two-Stage Spectral Verification:
  - Stage 1: Non-Stationarity Filter (ZCR CV): The Coefficient of Variation (CV) of the Zero-Crossing Rate is calculated over the signal frame.
    - Continuous leak noise is highly stationary (low ZCR CV).
    - Non-stationary human speech or music features rapid fluctuations and pauses, yielding a high variation coefficient (high ZCR CV).
    - If the ZCR CV exceeds the calibrated stationarity threshold, the signal is classified as non-stationary acoustic interference and rejected.
  - Stage 2: Frequency/Hum Filter (ZCR Mean): If the signal passes Stage 1, the mean ZCR in the source band is evaluated.
    - Low-frequency mechanical impacts and ground vibrations resonate at low frequencies, yielding a low zero-crossing rate (mean ZCR below the leak threshold).
    - Turbulent leak noise exhibits a high zero-crossing rate (mean ZCR above the leak threshold).
    - If the mean ZCR is below the threshold, the source is identified as a mechanical vibration and rejected.
    - If the mean ZCR exceeds the threshold, the source is confirmed as a high-frequency turbulent leak hiss. Verdict: True Leak.
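The two-stage logic above can be sketched as follows. This is an illustration under stated assumptions, not the RIVIXI AI code: the CV is computed across short overlapping sub-frames (the frame length and hop are arbitrary choices), the Stage 1 threshold `ZCR_CV_MAX` is a placeholder picked only so that the reference signals of Table 1 are sorted as reported, and all function names are hypothetical. Only the mean-ZCR threshold of 0.06 comes from Section 5.2.

```python
import numpy as np

ZCR_MEAN_MIN = 0.06  # leak threshold on mean ZCR reported in Section 5.2
ZCR_CV_MAX = 0.6     # ASSUMPTION: placeholder Stage 1 threshold, not from the paper


def framewise_zcr(signal, frame_len=1024, hop=512):
    """ZCR of each short frame (frame_len and hop are illustrative choices)."""
    x = np.asarray(signal, dtype=float)
    if x.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    starts = range(0, x.size - frame_len + 1, hop)
    return np.array([np.mean(x[s + 1:s + frame_len] * x[s:s + frame_len - 1] < 0)
                     for s in starts])


def zcr_stats(signal, **frame_kwargs):
    """Return (mean ZCR, coefficient of variation of ZCR) across frames."""
    rates = framewise_zcr(signal, **frame_kwargs)
    mean = float(rates.mean())
    cv = float(rates.std() / mean) if mean > 0 else float("inf")
    return mean, cv


def verify(zcr_mean, zcr_cv, cv_max=ZCR_CV_MAX, mean_min=ZCR_MEAN_MIN):
    """Two-stage check applied after the Z-score stage has flagged a source."""
    if zcr_cv > cv_max:      # Stage 1: non-stationarity filter
        return "rejected: non-stationary interference"
    if zcr_mean < mean_min:  # Stage 2: low-frequency hum filter
        return "rejected: low-ZCR mechanical vibration"
    return "leak"


# ZCR mean and CV values taken from Table 1
print(verify(0.2411, 0.054))  # continuous leak hiss
print(verify(0.0215, 0.500))  # hammer impact
print(verify(0.0759, 0.868))  # speech / music
```

With these settings the three reference rows are labelled leak, low-ZCR mechanical vibration, and non-stationary interference respectively. In practice both thresholds would need to be calibrated on held-out recordings rather than taken from this sketch.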
5. Experimental Evaluation and Comparative Analysis
To evaluate the performance of the proposed hybrid verification algorithm using ZCR, two sets of tests were conducted.
5.1. Reference Signal Analysis
Initially, the algorithm was tested on three reference acoustic signals: a continuous leak hiss, an impulsive hammer strike, and non-stationary speech/music. With ZCR filtering enabled, the impulsive and speech interference was rejected while the true leak was still detected.
Table 1. Zero-Crossing Rate (ZCR) and peak informativeness metrics for reference acoustic signals.
| Signal Type | ZCR Mean | ZCR CV (Coef. of Variation) | PNR (Cross-Corr.) | Variant 1 (With ZCR) | Variant 2 (Without ZCR) | Verdict |
|---|---|---|---|---|---|---|
| Continuous Leak Hiss | 0.2411 | 0.054 | 217.93 | LEAK DETECTED | LEAK DETECTED | Correct Detection |
| Hammer Impact (Impulsive) | 0.0215 | 0.500 | 28.40 | REJECTED (Low ZCR Hum) | LEAK DETECTED (False Alarm) | Correct Suppression |
| Speech / Music (Transient) | 0.0759 | 0.868 | 19.10 | REJECTED (Non-Stationary, High ZCR CV) | LEAK DETECTED (False Alarm) | Correct Suppression |
A visual comparison of the defectoscope verdicts with and without ZCR is shown in Fig. 2.

5.2. Evaluation on a Real-World Dataset
For a larger-scale evaluation of the ZCR filter, we tested it on a database of 113 industrial acoustic recordings (30 leaks, 72 leak-free files, and 11 noisy files known to cause false alarms).
Two configurations of the RIVIXI AI defectoscope engine were compared:
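- Without ZCR: A recording is classified as a leak based solely on the cross-correlation peak (Z-score above the alarm threshold).
- With ZCR: A recording is classified as a leak only if the Z-score exceeds the alarm threshold and the ZCR of the bandpass-filtered signal exceeds 0.06.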
The ZCR threshold of 0.06 was determined via 5-fold cross-validation on an independent calibration set of 40 signals, optimizing for a trade-off that maintains a high leak recall (Sensitivity > 90%) while maximizing the suppression of low-frequency hums. This ensures the threshold generalizes well and is not overfitted to this test set.
Table 2. Comparative performance metrics of the pipeline leak detection engine under different configurations (N=113).
| Performance Metric | Without ZCR (Correlation Only) | With ZCR (Lightweight DSP) | Absolute Change |
|---|---|---|---|
| Accuracy | 33.63% | 41.59% | +7.96% |
| Balanced Accuracy | 54.82% | 58.11% | +3.29% |
| Precision | 28.57% | 30.43% | +1.86% |
| Recall (Sensitivity) | 100.00% | 93.33% | -6.67% |
| F1-Score | 0.4444 | 0.4590 | +0.0146 |
| False Positive Rate (FPR) | 90.36% | 77.11% | -13.25% |
| Confusion Matrix (TP / FP / TN / FN) | 30 / 75 / 8 / 0 | 28 / 64 / 19 / 2 | Corrected 11 false alarms |
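The entries of Table 2 follow directly from the confusion-matrix counts in its last row. The short sketch below recomputes them using the standard definitions of each metric; the helper name `metrics` is hypothetical.

```python
def metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "balanced_accuracy": (recall + specificity) / 2,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),
    }


# Counts from the last row of Table 2 (N = 113)
without_zcr = metrics(tp=30, fp=75, tn=8, fn=0)
with_zcr = metrics(tp=28, fp=64, tn=19, fn=2)

for name in without_zcr:
    print(f"{name:18s} {without_zcr[name]:.4f} -> {with_zcr[name]:.4f}")
```

Balanced accuracy averages recall and specificity, which is why it moves less than plain accuracy here: the gain in specificity is partly offset by the two leaks lost to the ZCR filter.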
The evolution of these key performance metrics across the processing stages is plotted in Fig. 3. While Table 2 captures the exact numerical indicators for the isolated DSP stage, Fig. 3 visually illustrates the overall trajectory of system evolution as subsequent analysis levels are integrated.

Notes on Fig. 3:
- For the False Positive Rate (FPR), lower values indicate better system selectivity (desired downward trend).
Discussion of Class Imbalance and Metrics
The raw test database is highly imbalanced, containing 30 leak records (positive class) and 83 non-leak/noisy records (negative class). Because the defectoscope engine is calibrated for maximum sensitivity to ensure no leak is missed, it operates at a very low threshold. This causes a massive number of false alarms on negative samples, dragging down standard Accuracy to 33.63%. By implementing the ZCR pre-filter, the Balanced Accuracy increases from 54.82% to 58.11%, and the F1-Score improves. In a real-world pipeline system, this pre-filter acts as a guard, reducing downstream computational overhead before the signal reaches the heavier 1D/2D CNN classification layers of RIVIXI AI.
Discussion of Suppressed False Alarms and FPR
Integrating the ZCR verification layer eliminated 11 of the 75 false alarms (14.67%). This corresponds to an absolute reduction of 13.25 percentage points in the False Positive Rate, from 90.36% to 77.11%.
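The relative and absolute figures are computed against different denominators, which the following worked step makes explicit, using only the counts from Table 2 (75 and 64 false positives out of 83 negative recordings):

$$
\frac{75 - 64}{75} = \frac{11}{75} \approx 14.67\%, \qquad \frac{75}{83} - \frac{64}{83} = \frac{11}{83} \approx 13.25 \text{ percentage points}
$$

The first ratio is taken relative to the original false alarms, the second relative to all negative recordings.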
An FPR of 77.11% is still high for a standalone system. However, ZCR is a simple time-domain metric that is primarily effective at suppressing low-frequency resonant vibrations (e.g., soil noise, impacts) and highly non-stationary interference. It cannot distinguish a leak from continuous high-frequency noise sources (such as flow-control valves or pressure regulators), which exhibit similar stationarity and a similarly high ZCR. Within the hybrid RIVIXI AI architecture, ZCR is therefore used not as a final classifier but as a lightweight DSP pre-filtering step at the defectoscope level. The remaining, more complex false positives are resolved downstream by the deep 1D/2D CNN classification models.
As shown in Fig. 3, combining ZCR pre-filtering with the downstream CNN architectures reduces the False Positive Rate to 12.05% while maintaining a recall of 90.00%, for an overall system accuracy of 86.36%. Most of the false positives suppressed by ZCR in this test were caused by low-frequency machinery, soil vibrations, and valves. While these sources produced high cross-correlation peaks, their ZCR in the target band was below the 0.06 leak threshold, allowing the system to classify them as ambient background noise.
An example of a suppressed false alarm is shown in Fig. 4, where a low-frequency ground vibration creates a prominent cross-correlation peak (Z-score = 30.89), which exceeds the defectoscope alarm threshold. However, because its ZCR (0.0458) remains below the leak threshold (0.06), the alarm is successfully prevented.

6. Conclusion
Shifting from heuristic, text-tokenized analyses of zero-crossings to a rigorous mathematical formulation of ZCR significantly improves acoustic leak detection reliability. Utilizing ZCR as a verification metric for cross-correlation Z-scores provides a physical defense against false alarms caused by mechanical impacts and industrial vibrations. This verification layer fits naturally into the RIVIXI AI hybrid ensemble, which combines lightweight, classical DSP methods with deep 1D and 2D convolutional networks to ensure hardware-agnostic performance and high robustness on noisy field data.
Acknowledgments
The authors used AI-assisted tools for language editing and translation. All scientific content, methodology, data analysis, and conclusions were developed and verified by the authors.
References
[1] Ivanaiskii, A., Ivanaiskii, E., & Shipilov, S. (2026). Topological AI-Analysis vs. Classical Cross-Correlation: Overcoming Legacy Defectoscope Vulnerabilities [Preprint]. Rivixi LLC. https://doi.org/10.5281/zenodo.20673744
[2] Ivanaiskii, A., Ivanaiskii, E., & Shipilov, S. (2026). Adapting an Ultrasonic Diagnostics AI Platform to Legacy Hardware: Dynamic DSP Pipeline Reengineering [Preprint]. Rivixi LLC. https://doi.org/10.5281/zenodo.20673979
[3] Liang, H., Gao, Y., Li, H., Huang, S., Chen, M., & Wang, B. (2023). Pipeline Leakage Detection Based on Secondary Phase Transform Cross-Correlation. Sensors, 23(3), 1572. https://doi.org/10.3390/s23031572
[4] Ahmad, S., Ahmad, Z., Kim, C.-H., & Kim, J.-M. (2022). A Method for Pipeline Leak Detection Based on Acoustic Imaging and Deep Learning. Sensors, 22(4), 1562. https://doi.org/10.3390/s22041562
[5] Ahmad, Z., Nguyen, T.K., & Kim, J.M. (2023). Leak Detection and Size Identification in Fluid Pipelines Using a Novel Vulnerability Index and 1-D Convolutional Neural Network. Engineering Applications of Computational Fluid Mechanics, 17, 2165159. https://doi.org/10.1080/19942060.2023.2165159
[6] Kim, D., et al. (2023). A Reliable Pipeline Leak Detection Method Using Acoustic Emission with Time Difference of Arrival and Kolmogorov–Smirnov Test. Sensors, 23(23), 9296. https://doi.org/10.3390/s23239296
[7] Bachu, R.G., Kopparthi, S., Adapa, B., & Barkana, B.D. (2010). Voiced/Unvoiced Decision for Speech Signals Based on Zero-Crossing Rate and Energy. In Advanced Techniques in Computing Sciences and Software Engineering. Springer. https://doi.org/10.1007/978-90-481-3660-5_47
[8] McFee, B., et al. (2015). librosa: Audio and Music Signal Analysis in Python. Proceedings of the 14th Python in Science Conference. https://doi.org/10.25080/Majora-7b98e3ed-003
Citation
This research paper is permanently archived as a preprint on Zenodo:
Ivanaiskii, A., Ivanaiskii, E., & Shipilov, S. (2026). Utilizing Zero-Crossing Rate (ZCR) for Acoustic Leak Detection in Pipelines: From Empirical Models to a Physically Grounded DSP Pipeline [Preprint]. Zenodo. https://doi.org/10.5281/zenodo.20740891