Introduction
In recent years, the rapid advancement of big data technology has significantly transformed production and daily life, particularly revolutionizing key sectors, such as e-commerce, healthcare, public services, and finance. This technological progress owes much to the capabilities of open-source big data systems in processing massive datasets.
Common open source big data system software components include Hadoop, HDFS, Hive, Pig, Mahout, MapReduce, YARN, Zookeeper, and HBase. These components work together to perform complex distributed processing tasks for big data. The reliability of these components directly affects the performance and quality of big data processing. Therefore, their reliability has attracted research attention.
Some scholars have proposed using software reliability models (SRMs) to assess the reliability of open source big data system software. For instance, Tamura et al.1 studied the behavior of big data system failures in cloud computing environments, proposed an SRM based on jump diffusion, and conducted parameter estimation, model fitting, and performance evaluation experiments using a failure dataset from open source big data software.
Taking into account the features of cloud computing and big data, Tamura and Yamada2 proposed a three-dimensional Wiener process-based SRM and conducted corresponding experiments on a big data fault dataset. Tamura et al.3 proposed a clustering-based software reliability analysis method for three-dimensional fault data in big data and cloud computing environments.
By examining the correlation between actual big data processing and cloud computing, Tamura et al.4 proposed a three-dimensional stochastic differential equation-based SRM through analyzing communication traffic changes. Tamura and Yamada5 developed an SRM incorporating irregular fault detection rate variations through their cloud computing and big data research.
Given that big data processing requires a cloud computing environment, Tamura and Yamada6 proposed an SRM for fault data aggregation with dynamic risk rates. To account for variations in components, cloud computing, and user numbers, Tamura and Yamada7 proposed an SRM combining stochastic differential equations with jump diffusion processes. Tamura et al.8 introduced a neural network-based reliability evaluation method for high-capacity cloud environments, leveraging fault data clustering while considering the network-dependent relationship between cloud computing and big data systems.
The aforementioned SRMs were established in cloud computing contexts, incorporating the relationship between cloud storage and big data communication. However, given the complexity of reliability modeling for open source big data systems, it is more practical to develop an SRM that specifically addresses fault detection characteristics in these systems. This paper proposes a direct modeling approach for open source big data systems’ development and testing environments.
Our analysis of open source big data system fault datasets revealed a significant increase in detected faults during later development and testing phases across multiple software versions. Thus, we propose a Weibull–Weibull distribution-based SRM for open source big data systems, with experimental validation of its accuracy and effectiveness.
The contributions of this paper are as follows,
- 1.
We propose a novel Weibull–Weibull distribution-based SRM for open source big data systems.
- 2.
Our study empirically identifies and analyzes late-stage fault surge phenomena in open source big data systems, a phenomenon not thoroughly documented in prior work. We further investigate the underlying causes and implications of this trend, providing actionable insights for practitioners.
- 3.
We propose that fault detection in these systems follows a Weibull–Weibull distribution, providing a theoretical foundation for reliability assessment.
The remainder of this paper is organized into seven sections.
In "Reason for modeling" section, we explain why the Weibull–Weibull distribution is used to model the process of fault detection of open source big data system software. The third section describes the development process of the proposed model (PM). The following section presents the estimation method for the SRM parameters. The fault datasets, model comparison criteria, and comparison models used in the experiments are introduced in "Numerical examples" section, which also discusses and analyzes the experimental results. Sensitivity analysis is given in "Sensitivity analysis" section. "Threats to validity" section analyzes the threats to validity. Conclusions are presented in the last section.
Reason for modeling
In this section, we will discuss two questions. Firstly, why is there a significant increase in detected faults in each version of open source big data system software during the later stages of software development and testing? Secondly, why do we need a Weibull–Weibull distribution to simulate the process of fault detection of open source big data system software?
Reason for increasing number of faults detected during the later stages of open source big data system software development and testing
During the open source big data system software testing phase, if more and more faults are detected in the later stage, there may be the following reasons:
- 1.
Expansion of testing scope: In the early stages of testing, the main focus may be on the core functions and critical paths of the system. However, as the testing progresses, the scope of testing will gradually expand to cover more functions, boundary conditions, and abnormal situations. This may lead to the discovery of more faults in the later stages of testing.
- 2.
Richness of test data: As testing progresses, the testing team will prepare richer and more complex test data to simulate real-world data situations. These more complex data may expose previously undiscovered faults.
- 3.
Concurrency and load testing: In the later stages of testing, concurrency and load testing is usually conducted to simulate multiple users accessing the system simultaneously. This testing method can expose performance bottlenecks and failures under high load and concurrency conditions.
- 4.
Integration testing with other systems: Big data systems often require integration with other systems, such as databases, message queues, caching, etc. In the later stages of testing, integration testing with other systems may expose compatibility and interaction issues that were not previously discovered.
- 5.
Improvement of testing strategies and methods: As testing progresses, the testing team may continuously improve testing strategies and methods, such as adding automated testing, introducing new testing tools, etc. These improvements may make testing more comprehensive and in-depth, thereby discovering more faults.
In addition, we analyzed the changes over time of detected faults in open source big data system components (i.e., versions 0.23.5, 0.23.6 and 0.23.7 of YARN and versions 3.1.1, 3.1.2 and 3.1.3 of HDFS). In Fig. 1, the horizontal axis represents time, and the vertical axis represents the cumulative number of detected faults. From Fig. 1, we can clearly see a significant increase in detected faults for open source big data system software during the later stages of project component testing.
The cumulative number of faults detected from open source big data system software over time.
Characteristics of Weibull–Weibull distribution
The Weibull–Weibull distribution was proposed by Bourguignon et al.9 in 2014. The cumulative distribution function can be denoted as follows,
$$F\left( x \right) = 1 - \exp \left\{ - b\left[ \exp \left( cx^{d} \right) - 1 \right]^{e} \right\},\quad x,b,c,d,e > 0$$
(1)
where b, d and e represent shape parameters, and c denotes a scale parameter.
The probability density function of the Weibull–Weibull distribution can be written as follows,
$$f\left( x \right) = bcdex^{d - 1} \left[ \exp \left( cx^{d} \right) - 1 \right]^{e - 1} \exp \left\{ - b\left[ \exp \left( cx^{d} \right) - 1 \right]^{e} + cx^{d} \right\}$$
(2)
From Fig. 2, we can see that the Weibull–Weibull probability density function takes on various curve shapes as its parameters change. For example, in Fig. 2, it may decrease monotonically over time, or first increase and then decrease at different time points. The flexibility of these curve shapes plays an important role in modeling the complicated process of fault detection of open source big data system software.
Weibull–Weibull probability density function.
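As an illustrative sketch (not code from the paper; the helper names are our own), Eqs. (1) and (2) can be evaluated directly:

```python
import math

def weibull_weibull_cdf(x, b, c, d, e):
    """Cumulative distribution function of Eq. (1)."""
    return 1.0 - math.exp(-b * (math.exp(c * x ** d) - 1.0) ** e)

def weibull_weibull_pdf(x, b, c, d, e):
    """Probability density function of Eq. (2)."""
    g = math.exp(c * x ** d) - 1.0  # exp(cx^d) - 1
    return (b * c * d * e * x ** (d - 1) * g ** (e - 1)
            * math.exp(-b * g ** e + c * x ** d))
```

A quick consistency check is that the PDF matches the numerical derivative of the CDF at any x > 0.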
On the other hand, the hazard rate function can be given by
$$b\left( x \right) = \frac{f(x)}{1 - F(x)} = bcdex^{d - 1} \left[ \exp \left( cx^{d} \right) - 1 \right]^{e - 1} \exp \left( cx^{d} \right)$$
(3)
The hazard rate function of the Weibull–Weibull distribution can take multiple shapes over time, such as a decreasing, increasing, or bathtub-shaped fault detection rate. These varied shapes lay a solid foundation for modeling the process of fault detection of open source big data system software. This paper also uses multiple fault datasets to verify that the process of fault detection of big data system software follows the Weibull–Weibull distribution.
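Equation (3) can likewise be written as a small helper (a sketch with our own naming); since 1 − F(x) = exp{−b[exp(cx^d) − 1]^e}, the exponential survival factor cancels out of f(x)/(1 − F(x)):

```python
import math

def weibull_weibull_hazard(x, b, c, d, e):
    """Hazard rate of Eq. (3): f(x) / (1 - F(x)) after cancellation."""
    g = math.exp(c * x ** d) - 1.0  # exp(cx^d) - 1
    return b * c * d * e * x ** (d - 1) * g ** (e - 1) * math.exp(c * x ** d)
```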
Model development
The fault detection behavior of open source big data system software is considered a counting process, represented by N(t). We assume that it follows a non-homogeneous Poisson process (NHPP), which can be expressed as follows,
$$\Pr \{ N(t) = n\} = \frac{{[m(t)]^{n} \exp ( - m(t))}}{n!}$$
(4)
where m(t) represents mean value function (MVF), that is, the expected cumulative number of detected faults by time t. n denotes the actual number of detected faults.
Software reliability10 is defined as follows,
$$R(\mu /t) = \exp [ - (m(t + \mu ) - m(t))]$$
(5)
In Eq.(5), \(R(\mu /t)\) denotes the software reliability function. It is the conditional probability that the software will not fail within a specified time interval \((t,\;t + \mu ]\).
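To make Eqs. (4) and (5) concrete, here is a minimal sketch (the function names are ours, and the linear MVF used in the check below is purely illustrative):

```python
import math

def nhpp_prob(n, m_t):
    """Pr{N(t) = n} for an NHPP whose mean value function gives m(t) = m_t (Eq. 4)."""
    return m_t ** n * math.exp(-m_t) / math.factorial(n)

def reliability(m, t, mu):
    """Conditional reliability over (t, t + mu] given MVF m (Eq. 5)."""
    return math.exp(-(m(t + mu) - m(t)))
```

For instance, with the illustrative MVF m(t) = 2t, the expected number of faults in (1, 1.5] is 1, so R(0.5/1) = exp(−1).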
We assume that,
- (1)
The process of fault detection of open source big data system software follows NHPP.
- (2)
There is no introduction of new faults, and the identified faults are instantly removed.
- (3)
The number of faults detected per unit time is proportional to the number of remaining faults in the open source big data system software.
- (4)
The fault detection rate of open source big data system software follows a Weibull–Weibull distribution.
Based on assumption (3), we can derive the following equation,
$$\frac{d(m(t))}{{dt}} = b(t)(a - m(t))$$
(6)
where a and b(t) represent the expected total number of initial faults and the fault detection rate function, respectively.
Based on assumption (4), and substituting Eq.(3) into Eq.(6), we can derive the following equation,
$$m(t) = a(1 - \exp ( - b(\exp (ct^{d} ) - 1)^{e} ))$$
(7)
Equation (7) is the mathematical expression of the PM.
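Equation (7) can be sketched as follows (the function name is our own); the MVF starts at zero and saturates at the total fault content a:

```python
import math

def mvf(t, a, b, c, d, e):
    """Mean value function of the proposed model (Eq. 7)."""
    return a * (1.0 - math.exp(-b * (math.exp(c * t ** d) - 1.0) ** e))
```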
Model parameter estimation method
In this paper, we use the least squares method to estimate the model parameter values. The least squares estimation (LSE) objective can be expressed as follows,
$$\eta = \sum\limits_{j = 0}^{n} {[m(t_{j} ) - O(t_{j} )]^{2} }$$
(8)
where \(O(t_{j} )\) represents the actual number of detected faults by time tj, j = 0,1,2,3,…,n.
Take partial derivatives on both sides of the Eq.(8),
$$\left\{ \begin{aligned} \frac{\partial \eta }{\partial a} &= 2\sum\limits_{j = 0}^{n} {\left( {m(t_{j} ) - O(t_{j} )} \right)\frac{{\partial m(t_{j} )}}{\partial a}} = 0 \\ \frac{\partial \eta }{\partial b} &= 2\sum\limits_{j = 0}^{n} {\left( {m(t_{j} ) - O(t_{j} )} \right)\frac{{\partial m(t_{j} )}}{\partial b}} = 0 \\ \frac{\partial \eta }{\partial c} &= 2\sum\limits_{j = 0}^{n} {\left( {m(t_{j} ) - O(t_{j} )} \right)\frac{{\partial m(t_{j} )}}{\partial c}} = 0 \\ \frac{\partial \eta }{\partial d} &= 2\sum\limits_{j = 0}^{n} {\left( {m(t_{j} ) - O(t_{j} )} \right)\frac{{\partial m(t_{j} )}}{\partial d}} = 0 \\ \frac{\partial \eta }{\partial e} &= 2\sum\limits_{j = 0}^{n} {\left( {m(t_{j} ) - O(t_{j} )} \right)\frac{{\partial m(t_{j} )}}{\partial e}} = 0 \end{aligned} \right.$$
The estimated model parameter values can be obtained by solving the system of equations above.
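A hedged sketch of the LSE objective of Eq. (8) on synthetic data follows (the parameter values and helper names are illustrative, not the estimates reported in this paper). In practice the system of equations above is solved numerically, e.g. by passing this objective to a nonlinear least-squares routine such as scipy.optimize.least_squares:

```python
import math

def mvf(t, a, b, c, d, e):
    """Mean value function of the proposed model (Eq. 7)."""
    return a * (1.0 - math.exp(-b * (math.exp(c * t ** d) - 1.0) ** e))

def lse_objective(params, times, observed):
    """Sum of squared errors eta of Eq. (8)."""
    return sum((mvf(t, *params) - o) ** 2 for t, o in zip(times, observed))
```

Minimizing this objective over (a, b, c, d, e) yields the parameter estimates.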
Numerical examples
In this section, we introduce the collected fault datasets, the models used for comparison, and the model comparison criteria. We also discuss model performance using two fault datasets from open source big data system components; each dataset includes three sub-datasets collected from successively released versions of the components.
Fault datasets
In this work, we leverage two fault datasets gathered from two distinct open source big data system components—YARN and HDFS—in the issue tracking system available at https://issues.apache.org/. Three sub-datasets were collected from YARN, namely YARN 0.23.5, YARN 0.23.6 and YARN 0.23.7, and three from HDFS, namely HDFS 3.1.1, HDFS 3.1.2 and HDFS 3.1.3.
The first fault dataset (DS1-1), from YARN 0.23.5, includes 28 faults detected over 29 weeks from May 3, 2012 to November 16, 2012. In the second fault dataset (DS1-2), from YARN 0.23.6, 15 faults were detected over 48 weeks from February 22, 2012 to January 23, 2013. In the third fault dataset (DS1-3), from YARN 0.23.7, 25 faults were detected over 18 months from October 7, 2011 to March 12, 2013. In the fourth fault dataset (DS2-1), from HDFS 3.1.1, 84 faults were detected over 47 months from September 19, 2014 to July 25, 2018. In the fifth fault dataset (DS2-2), from HDFS 3.1.2, 65 faults were detected over 35 months from March 31, 2016 to January 24, 2019. In the sixth fault dataset (DS2-3), from HDFS 3.1.3, 69 faults were detected over 54 months from April 17, 2015 to October 2, 2019.
Note that the types of faults in the open source big data system components we collected include Bug, Task and Sub-task, but not Improvement, New Feature, Test and Wish.
The reasons are as follows,
- 1.
Improvement refers to improvements or enhancements to existing functions or tasks, rather than reports of system failures or issues.
- 2.
New Feature refers to functions or features planned for future versions of the software, without reporting existing system failures.
- 3.
Test: Typically used to track software testing related tasks, test cases, and test plans. This may include writing, executing, and managing test cases, rather than reporting system failures.
- 4.
Wish: Usually used to indicate a desire or suggestion for system improvements or new features, not to report faults or issues with existing systems.
Models for comparison
To conduct a thorough comparison of model performance, we selected multiple SRMs. We use seven SRMs to compare fitting and predictive performance on six fault datasets of open source big data system software. From Table 1, we can see that the seven SRMs are the Goel-Okumoto (G-O) model, the Delayed S-shaped (DSS) model, the Inflection S-shaped (ISS) model, the Generalized G-O (GGO) model, the Zhang-Teng-Pham model, the Li model and the PM. Among these SRMs, the closed source SRMs are the G-O, DSS, ISS, GGO and Zhang-Teng-Pham models, and the open source SRMs are the Li model and the PM. The perfect debugging SRMs are the G-O, DSS, ISS, GGO and Li models and the PM; the Zhang-Teng-Pham model is an imperfect debugging SRM.
Model comparison criteria
We assess model performance using five comparison criteria: MSE, R2, TS, MEOP and Variance. They can be denoted as follows,
$$MSE = \frac{{\sum\nolimits_{j = 1}^{n} {(m(t_{j} ) - O_{{t_{j} }} )^{2} } }}{n - \theta }$$
(9)
where \(m(t_{j} )\) represents the expected cumulative number of detected faults by time tj. \(O_{{t_{j} }}\) denotes the actual number of observed faults by time tj. n and \(\theta\) represent a size of samples and the model parameter number, respectively.
$$R^{2} = 1 - \frac{{\sum\nolimits_{j = 1}^{n} {(O_{{t_{j} }} - m(t_{j} ))^{2} } }}{{\sum\nolimits_{j = 1}^{n} {(O_{{t_{j} }} - \sum\nolimits_{i = 1}^{n} {\frac{{O_{{t_{i} }} }}{n}} )^{2} } }}$$
(10)
where in Eq.(10), \(m(t_{j} )\),\(O_{{t_{j} }}\) and n are identical to those in Eq.(9).
$$TS = \sqrt {\frac{{\sum\nolimits_{j = 1}^{n} {(m(t_{j} ) - O_{{t_{j} }} )^{2} } }}{{\sum\nolimits_{j = 1}^{n} {O_{{t_{j} }}^{2} } }}} \times 100\%$$
(11)
where in Eq.(11), \(m(t_{j} )\),\(O_{{t_{j} }}\) and n are identical to those in Eq.(9).
$$MEOP = \frac{\sum\nolimits_{j = 1}^{n} \left| m(t_{j} ) - O_{t_{j}} \right|}{n - \theta + 1}$$
(12)
where in Eq.(12), \(m(t_{j} )\),\(O_{{t_{j} }}\), n and \(\theta\) are the same as those in Eq.(9).
$$\begin{aligned} Variance & = \sqrt {\frac{{\sum\nolimits_{j = 1}^{n} {(O_{{t_{j} }} - m(t_{j} ) - Bias)^{2} } }}{n - 1}} ,\; \\ Bias & = \frac{{\sum\nolimits_{j = 1}^{n} {(m(t_{j} ) - O_{{t_{j} }} )} }}{n} \\ \end{aligned}$$
(13)
where in Eq.(13), \(m(t_{j} )\),\(O_{{t_{j} }}\) and n are the same as those in Eq.(9).
Note that the smaller the MSE, TS, MEOP and Variance values, the better the model's performance, whereas the larger the R2 value, the better the model's fitting performance.
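The five criteria of Eqs. (9)–(13) can be computed together; the following is a minimal sketch with our own function name:

```python
import math

def criteria(m_vals, o_vals, n_params):
    """Compute the criteria of Eqs. (9)-(13) for fitted values m_vals
    against observed values o_vals, with n_params model parameters (theta)."""
    n = len(o_vals)
    resid = [m - o for m, o in zip(m_vals, o_vals)]  # m(t_j) - O_{t_j}
    mse = sum(r * r for r in resid) / (n - n_params)
    o_bar = sum(o_vals) / n
    r2 = 1.0 - sum(r * r for r in resid) / sum((o - o_bar) ** 2 for o in o_vals)
    ts = math.sqrt(sum(r * r for r in resid) / sum(o * o for o in o_vals)) * 100.0
    meop = sum(abs(r) for r in resid) / (n - n_params + 1)
    bias = sum(resid) / n
    variance = math.sqrt(sum((-r - bias) ** 2 for r in resid) / (n - 1))
    return {"MSE": mse, "R2": r2, "TS": ts, "MEOP": meop, "Variance": variance}
```

A perfect fit gives MSE = TS = MEOP = Variance = 0 and R2 = 1, consistent with the note above.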
To comprehensively evaluate and compare the performance of the SRMs across criteria, we use the rank (\(NCD_{k}\)) method from the literature17. \(NCD_{k}\) can be denoted as follows,
$$NCD_{k} = \sqrt {\left( {\sum\limits_{j = 1}^{n1} {\left( {\frac{{\chi_{kj} }}{{\sum\nolimits_{i = 1}^{n2} {\chi_{ij} } }}} \right)^{2} \psi_{j} } } \right)} ,\;\;\;\;\;k = 1,2,...,n2.$$
where n1 and n2 represent the number of model comparison criteria and the number of comparison SRMs, respectively. \(\chi_{kj}\) denotes the standard value of the kth SRM on the jth criterion, and \(\psi_{j}\) represents the weight of criterion j. The smaller the \(NCD_{k}\) value, the better the model performance.
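The \(NCD_{k}\) computation can be sketched as below (our own naming; note that a better-is-larger criterion such as R2 would presumably need transforming before being combined in this smaller-is-better rank):

```python
import math

def ncd(values, weights):
    """NCD_k for each model k, where values[k][j] is criterion j of model k
    and weights[j] is psi_j."""
    # Column sums normalize each criterion across the compared models.
    col_sums = [sum(row[j] for row in values) for j in range(len(weights))]
    return [math.sqrt(sum((row[j] / col_sums[j]) ** 2 * w
                          for j, w in enumerate(weights)))
            for row in values]
```

A model that is smaller on every criterion receives the smaller (better) \(NCD_{k}\) value.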
Results and discussion of model performance comparisons
Comparison of model fitting performance
To verify the accuracy and effectiveness of the PM, we compared the fitting performance of seven SRMs on six fault datasets. Table 2 lists the parameter estimates of the PM obtained with the LSE method for 100% and 90% of DS1 and DS2, respectively. Table 3 indicates that the PM has excellent fitting performance compared with the other models. For example, in Table 3, using 100% of DS2-1, the MSE, TS and MEOP of the ISS model are about 5.25, 2.35 and 1.95 times larger than those of the PM, respectively. Using 100% of DS2-2, the MSE, TS and MEOP of the ISS model are nearly 5.43, 2.41 and 1.89 times larger than those of the PM, respectively. From Table 5, we can clearly see the SRM ranking results: the PM ranks first, the ISS model second, and the G-O model last; the ranking order of the other SRMs is not fixed. These results indicate that the PM has better fitting performance than the other SRMs, as can also be seen from Fig. 3, which shows the fitting performance comparisons of the SRMs on the six fault datasets.
Fitting performance comparisons for SRMs using 100% of DS1 and DS2, respectively. (a–f) represents that fitting performance comparisons for SRMs using 100% of DS1-1, DS1-2, DS1-3, DS2-1, DS2-2 and DS2-3, respectively.
We computed 95% confidence intervals for the PM using 100% of the fault datasets. From Fig. 5, we can see that the estimated points fall well within the 95% confidence intervals, except in Fig. 5d. This indicates that the parameter estimates of the PM are effective and stable. In Fig. 5d, although one point falls outside the interval, the estimated parameter values of the PM are acceptable overall. This reflects the complexity of establishing an SRM for open source big data system software; nevertheless, the PM is well suited to this situation.
Predictive performance comparisons for SRMs using 90% of DS1 and DS2, respectively. (a–f) represents that predictive performance comparisons for SRMs using 90% of DS1-1, DS1-2, DS1-3, DS2-1, DS2-2 and DS2-3, respectively.
Comparison of model predictive performance
To verify the predictive performance of the PM, we conducted corresponding experiments. Table 4 shows that the PM's predictive performance outperforms the other SRMs using 90% of the fault datasets, except on 90% of DS2-2. For example, in Table 4, using 90% of DS1-3, the MSE, TS, Variance and MEOP of the ISS model are about 3.58, 1.89, 1.9 and 1.4 times as large as those of the PM. Using 90% of DS2-1, the MSE, TS, Variance and MEOP of the GGO model are approximately 7.19, 2.68, 2.52 and 3.18 times as large as those of the PM. From Table 6, we can see that the PM, the ISS model and the G-O model rank first, second and last, respectively. In Table 4, using 90% of DS1-1, the Variance value (0.4818) of the ISS model is less than that (0.5248) of the PM, but Table 6 makes it evident that the PM's predictive performance outperforms the ISS model's in the comprehensive comparison. In addition, in Table 4, using 90% of DS2-2, the Variance and MEOP values of the ISS model (2.1273 and 0.6264) are less than those of the PM (2.3336 and 0.6564). Table 6 shows that, using 90% of DS2-2, the ISS model ranks first and the PM second; however, the ranking value of the PM is close to that of the ISS model.
Overall, the predictive performance of the PM is superior to that of the other models. As can be seen from Table 6, except for the PM, the predictive performance of the other models is not stable. For example, the ISS model ranks third using 90% of DS2-1, first using 90% of DS2-2, and second in the remaining cases; the other SRMs show similar fluctuations. Note that Tables 5 and 6 show that the G-O model is the worst whether 100% or 90% of the fault datasets is used, indicating that the G-O model is not suitable for reliability evaluation of open source big data system software. From Fig. 4, we can see that the PM achieves the best predictive performance among all SRMs using 90% of the fault datasets.
95% confidence interval for the proposed model using 100% of DS1 and DS2, respectively. (a–f) represents that 95% confidence interval for the PM using 100% of DS1-1, DS1-2, DS1-3, DS2-1, DS2-2 and DS2-3, respectively.
In summary, the PM has better fitting and predictive performance for the process of fault detection of open source big data system software. The PM predicts the number of remaining faults in the software most accurately and has the most stable performance. The experiments verified that fault detection in open source big data system software follows a Weibull–Weibull distribution, and that the PM is effective for fault prediction during the testing of such software.
Sensitivity analysis
To illustrate which parameters of the PM are important, we conducted parameter sensitivity analysis experiments by varying one parameter while holding the other parameters of the PM fixed. The experimental results indicate that all parameters of the PM are important. From Fig. 6, it can be seen that when the parameters a, b, c, d, and e change, the curve of the PM also changes significantly. The explanations are as follows:
- 1.
Parameter a of the PM denotes the total number of faults that were initially anticipated to be detected. Accurately estimating the total number of faults in open source big data system software is crucial in the process of fault detection. In the PM, parameter a is an important parameter. This indicates the effectiveness of the PM establishment.
- 2.
The parameters b, c, d, and e of the PM are important parameters. In the fault detection process of open source big data system software, the change of the fault detection rate determined by parameters b, c, d, and e is crucial, as it plays an important role in accurately and effectively estimating the number of faults detected in the software.
Parameter sensitivity analysis for the PM using 100% of DS2-3. (a) represents that parameter a changes under fixed other model parameters. (b) represents that parameter b changes under fixed other model parameters. (c) represents that parameter c changes under fixed other model parameters. (d) represents that parameter d changes under fixed other model parameters. (e) represents that parameter e changes under fixed other model parameters.
Threats to validity
The model put forward in this study faces two primary threats to its validity: one comes from internal aspects of the PM, and the other from evaluating the performance of the PM using the fault datasets.
External factors
This paper used two components of an open source big data system, comprising six fault datasets, to conduct model fitting and predictive performance verification experiments. The experimental results demonstrate the effectiveness and accuracy of the PM in fault fitting and prediction for open source big data system software. Using more fault datasets might validate the performance of the PM further. In addition, our fault datasets come from the same institution, so using fault datasets from different sources could be more convincing. Nevertheless, the fault datasets used in the experiments meet the basic requirements and are sufficient and reliable.
Internal factors
We mainly consider the number of PM parameters. In this paper, the PM has a total of five parameters. Generally speaking, the more parameters a model has, the better its fitting performance. However, the Zhang-Teng-Pham model has six parameters and only moderate fitting performance, while the ISS model, ranked second, has three parameters and fits well. This indicates that the number of model parameters is not the decisive factor in fitting performance; it is more important to establish a reliability model suited to the software development and testing environment.
Related work
Although there are many applications of and much research on big data, there is very little research on the reliability of big data system software18. Cao and Gao18 proposed using fault trees to model and evaluate the reliability of open source big data systems, and analyzed and summarized fault occurrences in such systems. Kumar et al.19 used traditional closed source software (TCSS) reliability models to evaluate the reliability of open source big data system software; their experiments show that the SRM with the best fit does not necessarily have the best predictive performance. However, they did not propose a new reliability model for big data system software, so the performance of models tailored to big data system software could not be compared with that of TCSS reliability models. Govindasamy and Dillibabu20 proposed a hybrid method integrating TCSS reliability models, such as the G-O, DSS and ISS models, to establish three SRMs that consider software failures, hardware-related failures, and user-induced failures, respectively, and used open source big data fault datasets to evaluate the hybrid models' performance. Sharma et al.21 reviewed big data reliability techniques, focusing on model construction and parameter estimation methods for big data system software, and emphasized the importance of developing new reliability models for open source big data system software.
Conclusions
This paper proposes an SRM for open source big data systems based on the Weibull–Weibull distribution. We conducted fitting and predictive performance experiments using six fault datasets and seven SRMs. The experimental results indicate that the PM has better fitting and predictive performance than the other SRMs and can accurately and effectively predict the number of remaining faults during the testing of open source big data system software. We also computed 95% confidence intervals and performed a parameter sensitivity analysis. The results show that each parameter of the PM is a crucial variable and that the PM has stable performance; the parameters of the PM play an important role in the fault detection process of open source big data system software. A limitation of this study is that it does not account for fault introduction phenomena, which may occur due to the complexity of debugging in open source big data systems.
Considering the complexity of the development, testing and debugging processes of open source big data system software, our future work will focus on developing an imperfect-debugging SRM tailored for open source big data systems in complex environments.
Data availability
The data of our study are available on reasonable request. The data can be provided by the corresponding author.
References
Tamura, Y. & Yamada, S. Reliability analysis based on a jump diffusion model with two wiener processes for cloud computing with big data. Entropy 17(12), 4533–4546 (2015).
Tamura, Y., Yamada, S. 3D software tool for reliability assessment based on three dimensional wiener process model considering big data on cloud computing. In Proceedings of the 3rd International Conference on Reliability, Infocom Technologies and Optimization (ICRITO) (Trends and Future Directions), 1–6 (IEEE, 2014)
Tamura, Y., Nobukawa, Y. & Yamada, S. A method of reliability assessment based on hazard rate by clustering approach for cloud computing with big data. In Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management (IEEM) 732–736 (2015).
Tamura, Y., Miyaoka, K. & Yamada, S. Reliability analysis based on three-dimensional stochastic differential equation for big data on cloud computing. In Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management 863–867 (2015).
Tamura, Y. & Yamada, S. Software reliability analysis considering the fault detection trends for big data on cloud computing. Ind. Eng. Manag. Sci. Appl. 349, 1021–1030 (2015).
Tamura, Y. & Yamada, S. Software reliability assessment tool based on fault data clustering and hazard rate model considering Cloud computing with Big data. In Proceedings of the 4th International Conference on Reliability, Infocom Technologies and Optimization (ICRITO)(Trends and Future Directions), Noida, 1–6 (2015).
Tamura, Y. & Yamada, S. Dependability analysis tool based on multi-dimensional stochastic noisy model for cloud computing with big data. Int. J. Math. Eng. Manag. Sci. 2(4), 273–287 (2017).
Tamura, Y., Nobukawa, Y. & Yamada, S. A method of reliability assessment based on Neural Network and fault data clustering for Cloud with Big data. In Proceedings of the 2nd International Conference on Information Science and Security (ICISS), Seoul, 1–4 (2015).
Bourguignon, M., Silva, R. B. & Cordeiro, G. M. The Weibull-G family of probability distributions. J. Data Sci. 12(1), 53–68 (2014).
Musa, J. D., Iannino, A. & Okumoto, K. Software Reliability: Measurement, Prediction, Application 32–241 (McGraw-Hill, 1989).
Goel, A. L. & Okumoto, K. Time dependent error detection rate model for software reliability and other performance measures. IEEE Trans. Reliab. 28(3), 206–211 (1979).
Yamada, S., Ohba, M. & Osaki, S. S-shaped reliability growth modeling for software error detection. IEEE Trans. Reliab. 32, 475–484 (1983).
Ohba, M. Stochastic Models in Reliability Theory 144–162 (Springer, 1984).
Goel, A. L. Software reliability models: assumptions, limitations and applicability. IEEE Trans. Softw. Eng. 11(12), 1411–1423 (1985).
Zhang, X., Teng, X. & Pham, H. Considering fault removal efficiency in software reliability assessment. J. IEEE Trans. Syst. Man Cybern. Part Syst. Hum. 33(1), 2241–2252 (2003).
Li, X., Li, Y. F., Xie, M. & Ng, S. H. Reliability analysis and optimal version-updating for open source software. Inf. Softw. Technol. 53(9), 929–936 (2011).
Pham, H. A new software reliability model with Vtub-shaped fault-detection rate and the uncertainty of operating environments. Optim. J. Math. Program. Op. Res. 63(10), 1481–1490 (2014).
Cao, R. & Gao, J. Research on reliability evaluation of big data system. In 2018 IEEE 3rd International Conference on Cloud Computing and Big Data Analysis (ICCCBDA), 261–265 (IEEE, 2018).
Kumar, R., Kumar, S. & Tiwari, S. K. A study of software reliability on big data open source software. Int. J. Syst. Assur. Eng. Manag. 10, 242–250 (2019).
Govindasamy, P. & Dillibabu, R. Development of software reliability models using a hybrid approach and validation of the proposed models using big data. J. Supercomput. 76(4), 2252–2265 (2020).
Sharma, S., Kumar, N. & Kaswan, K. S. Big data reliability: A critical review. J. Intell. Fuzzy Syst. 40(3), 5501–5516 (2021).
Acknowledgements
This work was supported by the Fundamental Research Program of Shanxi Province of China (No.202303021221061) and National Natural Science Foundation of China (No. 62472267).
Author information
Authors and Affiliations
School of Automation and Software Engineering, Shanxi University, Taiyuan, People’s Republic of China
Jinyong Wang,Haijun Geng&Pengda Li
Contributions
J. W. wrote the main manuscript text, H. G. reviewed the manuscript and P. L. collected fault datasets.
Corresponding author
Correspondence to Jinyong Wang.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wang, J., Geng, H. & Li, P. A software reliability model for open source big data systems based on Weibull–Weibull distribution. Sci Rep 15, 14670 (2025). https://doi.org/10.1038/s41598-025-98942-9
Keywords
- Software reliability model
- Open source big data system
- Weibull–Weibull distribution