24 August 2017 | Issue 11/2017 | Open Access
Weibull Modulus Estimated by the Nonlinear Least Squares Method: A Solution to Deviation Occurring in Traditional Weibull Estimation
Journal: Metallurgical and Materials Transactions A, Issue 11/2017
Important Notes
Manuscript submitted December 22, 2016.
1 Introduction
The Weibull distribution has been widely used for over 30 years to analyze the variability of the fracture properties of brittle materials. Fitting a Weibull distribution later also became a popular way to assess the quality and reproducibility of castings.[1–3] The cumulative distribution function (CDF) of the Weibull distribution is given by[4]
$$ P = 1 - \exp \left[ { - \left( {\frac{{x - x_{u} }}{{x_{0} }}} \right)^{m} } \right], $$
(1)
where P is the probability of failure at a value of x, x_{u} is the minimum possible value of x, x_{0} is the scale parameter, i.e., the value of x − x_{u} at which about 63.2 pct (1 − 1/e) of the population of specimens have failed, and m is the shape parameter describing the variability in the measured properties, which is widely known as the Weibull modulus.
In practical applications, x can be replaced by σ to denote a material property (e.g., ultimate tensile strength (UTS)), and the lowest possible value of the property can be assumed to be 0, so that x_{u} = 0 and Eq. [1] can be rewritten as the 2-parameter (2p) Weibull function:
$$ P = 1 - \exp \left[ { - \left( {\frac{\sigma }{{\sigma_{0} }}} \right)^{m} } \right]. $$
(2)
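As a quick numerical check of Eq. [2], the following Python sketch evaluates the 2p Weibull CDF. The function name `weibull_cdf` and the example values (m = 11, σ_{0} = 60) are illustrative choices, not part of the original analysis. At σ = σ_{0} the failure probability equals 1 − 1/e, about 63.2 pct, whatever the value of m.

```python
import math

def weibull_cdf(sigma, m, sigma0):
    """Failure probability from the 2-parameter Weibull CDF, Eq. [2]."""
    return 1.0 - math.exp(-((sigma / sigma0) ** m))

# Illustrative values (assumed): modulus m = 11, scale sigma0 = 60.
# At sigma = sigma0 the result is 1 - 1/e for any modulus m.
p_at_scale = weibull_cdf(60.0, 11.0, 60.0)
print(f"P(sigma0) = {p_at_scale:.4f}")
```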
There are several approaches to estimating the Weibull modulus in Eq. [2], the most common being the Linear Least Squares (LLS) method and the Maximum Likelihood (ML) method.
Many studies have focused on the bias of the Weibull modulus obtained by these estimation methods. Khalili and Kromp[5] recommended the ML and LLS methods after comparing the ML method, the LLS method, and the method of moments. Butikofer et al.[6] found that the LLS method was less biased than the ML method for small sample sizes. Tiryakioglu and Hudak[7] and Wu et al.[8] studied the best estimators for the LLS method.
However, the LLS method still has a shortcoming. In practice, some data points of the measured properties deviate seriously from linear behavior in the traditional LLS Weibull estimation, resulting in a poor fit of the linear regression model. A good example is Griffiths and Lai’s[2] measurement of the UTS of a commercial purity “top-filled” Mg casting, shown in Figure 1. The data points were clearly not randomly scattered about the fitted straight line, and the corresponding R^{2} value was only 79.1 pct, both of which indicate a poor linear fit. Such outliers exert a strong influence on the regression line, making the estimated Weibull modulus deviate from its true value. This type of behavior (i.e., data deviation in the lower tail) in plots of the linearized Weibull function (Figure 1) has been widely observed and has resulted in estimation bias to various degrees; examples can be found in References 2 and 9 through 14. Keles et al.[14] summarized this deviation as it occurs in measurements of brittle materials.
When this deviation occurs, a traditional solution is to eliminate a few data points before the Weibull modulus analysis,[1] because the data points in the lower tail are considered to be caused by gross pores. However, a Weibull modulus obtained after such elimination neglects the effect of porosity on the quality of the castings and cannot reflect the reproducibility of the castings as a whole.
Currently, a popular explanation for this deviation, based on a plot of the linearized Weibull CDF (Figure 1), is that the dataset may follow a 3-parameter (3p) or mixed Weibull distribution.[15–17] The goodness-of-fit of the linear regression line (i.e., R^{2}) has accordingly been used to determine the Weibull behavior of datasets.[18,19] Tiryakioglu[19] developed the following equation for the critical R^{2} value to determine the Weibull behavior of a dataset:
$$ R_{0.05}^{2} = 1.0637 - \frac{0.4174}{{N^{0.3} }}, $$
(3)
where \( R_{0.05}^{2} \) is the critical R^{2} and N is the sample size. If the R^{2} of a linear regression is smaller than this critical value, the corresponding dataset is thought to follow a 3p/mixed Weibull distribution.
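The criterion in Eq. [3] is simple to evaluate. The Python sketch below is only an illustration; the function names `critical_r2` and `passes_r2_test` are hypothetical. For N = 25 the critical value is about 0.905, so a linear fit with R^{2} = 79.1 pct, as reported for Griffiths and Lai’s data, would be rejected by this test.

```python
def critical_r2(n):
    """Critical R^2 at the 0.05 level for sample size n, Eq. [3]."""
    return 1.0637 - 0.4174 / n ** 0.3

def passes_r2_test(r2, n):
    """True if the linear fit does not reject 2p Weibull behavior under Eq. [3]."""
    return r2 >= critical_r2(n)

# R^2 = 0.791 and N = 25 are the values quoted in the text for Figure 1.
print(critical_r2(25))
print(passes_r2_test(0.791, 25))
```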
This paper aims to investigate the reason for this widely reported deviation and to find an appropriate method for estimating the Weibull modulus when such deviations occur. Preliminary work demonstrated that the deviation can also be caused by the linear transformation of the Weibull function, and that a nonlinear least squares (NonLS) method may be more appropriate for evaluating the Weibull modulus. Comprehensive Monte Carlo simulations and a real casting experiment were subsequently carried out to explore the reliability of parameter estimation by the NonLS, LLS, and ML methods. The results show that the NonLS method, which avoids the linear transformation, outperformed the other methods.
2 Background
2.1 Linear Least Squares (LLS) Methods
The Linear Least Squares method is also known as the linear regression method. Taking the natural logarithm of Eq. [2] twice gives the linearized form of the 2p Weibull CDF:
$$ {\text{Ln}}\left[ { - {\text{Ln}}\left( {1 - P} \right)} \right] = m\,{\text{Ln}}\left( \sigma \right) - m\,{\text{Ln}}\left( {\sigma_{0} } \right). $$
(4)
The Weibull modulus can then be determined from the slope of a simple linear regression (i.e., ordinary least squares) of Ln[−Ln(1 − P)] against Ln(σ), where the P value is assigned by a probability estimator. The probability estimators reported in the literature are generally written in the form
$$ P = \frac{i - a}{N + b}, $$
(5)
where i is the rank of the datum when the data are sorted in ascending order, N is the total sample size, and a and b are constants whose values depend on the estimator used. The common estimators were summarized by Tiryakioglu and Hudak[7] and are shown in Table I.
Table I
Probability Estimators Summarized by Ref. [6]
| a | b | Equation |
| --- | --- | --- |
| 0.5 | 0 | Eq. [6] |
| 0 | 1 | Eq. [7] |
| 0.3 | 0.4 | Eq. [8] |
| 0.375 | 0.250 | Eq. [9] |
| 0.44 | 0.12 | Eq. [10] |
| 0.25 | 0.50 | Eq. [11] |
| 0.4 | 0.2 | Eq. [12] |
| 0.333 | 0.333 | Eq. [13] |
| 0.50 | 0.25 | Eq. [14] |
| 0.31 | 0.38 | Eq. [15] |
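The LLS procedure built on Eqs. [4] and [5] can be sketched in a few lines of Python. This is a minimal illustration, not the authors’ implementation: the names `ESTIMATORS`, `failure_probabilities`, and `weibull_lls` are hypothetical, and the check at the end uses synthetic data placed exactly on a Weibull curve with m = 11 and σ_{0} = 60, so the fit should recover those values with R^{2} close to 1.

```python
import math

# (a, b) constants of P = (i - a) / (N + b), keyed by equation number in Table I.
ESTIMATORS = {
    6: (0.5, 0.0), 7: (0.0, 1.0), 8: (0.3, 0.4), 9: (0.375, 0.25),
    10: (0.44, 0.12), 11: (0.25, 0.5), 12: (0.4, 0.2),
    13: (0.333, 0.333), 14: (0.5, 0.25), 15: (0.31, 0.38),
}

def failure_probabilities(n, a, b):
    """Estimated failure probability for each rank i = 1..n, Eq. [5]."""
    return [(i - a) / (n + b) for i in range(1, n + 1)]

def weibull_lls(data, eq=6):
    """Estimate (m, sigma0, R^2) by ordinary least squares on Eq. [4]."""
    xs = sorted(data)
    n = len(xs)
    a, b = ESTIMATORS[eq]
    ps = failure_probabilities(n, a, b)
    u = [math.log(x) for x in xs]                   # Ln(sigma)
    y = [math.log(-math.log(1.0 - p)) for p in ps]  # Ln[-Ln(1 - P)]
    u_bar, y_bar = sum(u) / n, sum(y) / n
    s_uu = sum((ui - u_bar) ** 2 for ui in u)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_uy = sum((ui - u_bar) * (yi - y_bar) for ui, yi in zip(u, y))
    m = s_uy / s_uu                    # slope of the line = Weibull modulus
    intercept = y_bar - m * u_bar      # equals -m * Ln(sigma0)
    sigma0 = math.exp(-intercept / m)
    r2 = s_uy ** 2 / (s_uu * s_yy)
    return m, sigma0, r2

# Synthetic check (assumed values): points exactly on a curve with m = 11, sigma0 = 60.
probs = failure_probabilities(25, *ESTIMATORS[6])
sample = [60.0 * (-math.log(1.0 - p)) ** (1.0 / 11.0) for p in probs]
print(weibull_lls(sample))
```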

2.2 Maximum Likelihood (ML) Method
In statistics, the likelihood is a function of the parameters of a statistical model, given an observed dataset. “Likelihood” is related to, but not equivalent to, “probability”: the former is used after the outcome data are available, to describe how likely a given explanation of the observed outcome is, while the latter describes possible future outcomes before the data are available.
The basic principles can be described as follows.[20–22] Suppose there is a dataset of N independent and identically distributed observations, x_{1}, x_{2}, …, x_{N}, drawn from an underlying probability density function f(x ∣ θ). The true value of θ is unknown, and it is desirable to find an estimator \( \hat{\theta } \) that is as close to θ_{true} as possible. First, the joint density function of all observations can be written as
$$ f\left( {x_{1} ,x_{2} , \ldots ,x_{N} \mid \theta } \right) = f\left( {x_{1} \mid \theta } \right)f\left( {x_{2} \mid \theta } \right) \cdots f\left( {x_{N} \mid \theta } \right) = \prod\limits_{i = 1}^{N} {f\left( {x_{i} \mid \theta } \right)}. $$
(16)
From a different perspective, Eq. [16] can be regarded as having the observed data x_{1}, x_{2}, …, x_{N} as fixed parameters and θ as the function’s variable. This is called the likelihood function:
$$ L\left( {\theta \mid x_{1} ,x_{2} , \ldots ,x_{N} } \right) = f\left( {x_{1} ,x_{2} , \ldots ,x_{N} \mid \theta } \right) = \prod\limits_{i = 1}^{N} {f\left( {x_{i} \mid \theta } \right)}. $$
(17)
The maximum likelihood estimate (MLE) of θ is obtained by maximizing the likelihood function given the observed data:
$$ \hat{\theta }_{\text{MLE}} = \arg \mathop {\hbox{max} }\limits_{\theta } L\left( {\theta \mid x_{1} ,x_{2} , \ldots ,x_{N} } \right). $$
(18)
For the Weibull estimation of castings, the likelihood function of the observed dataset x_{1}, x_{2}, …, x_{N} can be written as
$$ L\left( {m,\sigma \mid x_{1} ,x_{2} , \ldots ,x_{N} } \right) = \prod\limits_{i = 1}^{N} {f(x_{i} \mid m,\sigma )} = \prod\limits_{i = 1}^{N} {\left( {\frac{m}{\sigma }\left( {\frac{{x_{i} }}{\sigma }} \right)^{m - 1} \exp \left( { - \left( {\frac{{x_{i} }}{\sigma }} \right)^{m} } \right)} \right)}. $$
(19)
Here f(x_{i} ∣ m, σ) is the probability density function of the Weibull distribution, with σ denoting the scale parameter. The MLEs of the Weibull parameters can then be obtained by maximizing Eq. [19] using the Nelder-Mead method.
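For readers who want to reproduce the ML estimate without a general-purpose optimizer, the sketch below uses a different but equivalent route: instead of the Nelder-Mead search described above, it sets the derivatives of the logarithm of Eq. [19] to zero, which leaves a single equation in m that can be solved by bisection. The function name `weibull_mle` is hypothetical, the data are assumed to be positive and not all identical, and the example dataset is randomly generated for illustration only.

```python
import math
import random

def weibull_mle(data, tol=1e-10):
    """ML estimates (m, sigma) of a 2p Weibull distribution.

    Solves sum(x^m ln x) / sum(x^m) - 1/m - mean(ln x) = 0 for m by bisection
    (an assumed substitute for the Nelder-Mead search named in the text).
    """
    n = len(data)
    x_max = max(data)
    logs = [math.log(x) for x in data]
    mean_log = sum(logs) / n

    def score(m):
        # Dividing by x_max keeps every power in (0, 1] and avoids overflow.
        w = [(x / x_max) ** m for x in data]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / m - mean_log

    lo, hi = 1e-6, 1.0
    while score(hi) < 0.0:          # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:       # score(m) increases with m
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    m = 0.5 * (lo + hi)
    sigma = x_max * (sum((x / x_max) ** m for x in data) / n) ** (1.0 / m)
    return m, sigma

# Illustrative data: random.weibullvariate(alpha, beta) takes scale, then shape.
rng = random.Random(1)
sample = [rng.weibullvariate(60.0, 11.0) for _ in range(2000)]
print(weibull_mle(sample))
```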
The Weibull modulus estimated by the Maximum Likelihood method is also biased relative to m_{true}. Khalili and Kromp[5] reported that the bias of the ML method was higher than that of the linear least squares method using Eq. [6]. This finding was also supported by the subsequent studies of References 6 and 8.
2.3 Nonlinear Least Squares (NonLS) Method
The NonLS method has many similarities to the LLS method. The observed data are likewise sorted in ascending order and then paired with the failure probabilities obtained from the estimators shown in Table I. It differs from the LLS method in that a nonlinear regression, using a Gauss-Newton algorithm, is carried out directly to obtain the best-fit curve of the Weibull function. This method has been used to estimate Weibull parameters in some other fields,[23,24] but has not been applied to the Weibull estimation of castings and brittle materials.
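A minimal Python sketch of such a NonLS fit is given below. It is an illustration under stated assumptions rather than the implementation used for the results in this paper: the starting values are taken from the linearized fit, the estimator is Eq. [6], and a step-halving safeguard (not mentioned in the text) is added to the Gauss-Newton update so that the residual sum of squares never increases. All function names are hypothetical.

```python
import math

def weibull_cdf(x, m, s):
    """2p Weibull CDF; the exponent is capped to avoid floating-point overflow."""
    return 1.0 - math.exp(-math.exp(min(m * math.log(x / s), 700.0)))

def ssr(xs, ps, m, s):
    """Residual sum of squares in the original (nonlinear) CDF plot."""
    return sum((p - weibull_cdf(x, m, s)) ** 2 for x, p in zip(xs, ps))

def lls_start(xs, ps):
    """Starting values (m, sigma0) from the linearized fit of Eq. [4]."""
    u = [math.log(x) for x in xs]
    y = [math.log(-math.log(1.0 - p)) for p in ps]
    n = len(xs)
    ub, yb = sum(u) / n, sum(y) / n
    m = sum((a - ub) * (b - yb) for a, b in zip(u, y)) / sum((a - ub) ** 2 for a in u)
    return m, math.exp(ub - yb / m)

def weibull_nonls(data, max_iter=100, tol=1e-14):
    """Return (m, sigma0, SSR) from a Gauss-Newton fit with step halving."""
    xs = sorted(data)
    n = len(xs)
    ps = [(i - 0.5) / n for i in range(1, n + 1)]   # estimator of Eq. [6]
    m, s = lls_start(xs, ps)
    best = ssr(xs, ps, m, s)
    for _ in range(max_iter):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, p in zip(xs, ps):
            z = math.exp(min(m * math.log(x / s), 700.0))
            w = z * math.exp(-z)
            jm = w * math.log(x / s)      # dP/dm
            js = -w * m / s               # dP/dsigma0
            r = p - (1.0 - math.exp(-z))  # residual
            a11 += jm * jm
            a12 += jm * js
            a22 += js * js
            g1 += jm * r
            g2 += js * r
        det = a11 * a22 - a12 * a12
        if det <= 0.0:
            break
        dm = (a22 * g1 - a12 * g2) / det  # solve the 2x2 normal equations
        ds = (a11 * g2 - a12 * g1) / det
        step, accepted = 1.0, False
        while step > 1e-8:                # halve the step until the SSR decreases
            m_new, s_new = m + step * dm, s + step * ds
            if m_new > 0.0 and s_new > 0.0:
                trial = ssr(xs, ps, m_new, s_new)
                if trial < best:
                    accepted = True
                    break
            step *= 0.5
        if not accepted:
            break
        gain = best - trial
        m, s, best = m_new, s_new, trial
        if gain < tol:
            break
    return m, s, best

# Synthetic check (assumed values): points exactly on a curve with m = 11, sigma0 = 60.
exact = [60.0 * (-math.log(1.0 - (i - 0.5) / 25)) ** (1.0 / 11.0) for i in range(1, 26)]
print(weibull_nonls(exact))
```

Because each accepted step must lower the SSR, the returned fit is never worse, in the nonlinear plot, than the linearized fit it starts from.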
3 Methods
3.1 Reanalysis of Griffiths and Lai’s Data
As shown in Figure 2, Griffiths and Lai’s data from Figure 1 (i.e., the UTS of a commercial purity Mg casting produced using a top-filled running system) were reanalyzed using the NonLS method. To compare the fitting performance, the Weibull function with the parameters obtained by the LLS method (i.e., the method originally used in Griffiths and Lai’s paper[2]) is also plotted in Figure 2. The Residual Sum of Squares (SSR) was used instead of R^{2} to evaluate the goodness-of-fit of this nonlinear model (the adjusted R^{2} values are also given).
Figure 2(a) shows that the data points fit the nonlinear regression curve well (SSR = 0.0238), much better than they fit the curve plotted from the LLS estimation results (the Weibull parameters shown in Figure 1, SSR = 0.4096). The Weibull moduli estimated by the two methods differed significantly (11.147 for the NonLS method and 4.427 for the LLS method). Therefore, although Tiryakioglu’s equation (i.e., Eq. [3], (R_{0.05})^{2} = 0.9047) rejected the Weibull behavior of this dataset, it is still not clear whether the data points follow a 2p Weibull distribution.
In Figure 2(b), when the NonLS estimation result was plotted in the linearized Weibull plot (i.e., the solid line in Figure 2(b)), the data points fit the line very poorly (R^{2} < 0), much worse than they fit the LLS estimation result (R^{2} = 79.1 pct). The contradictory conclusions from Figures 2(a) and (b) raise the following question: “Is it appropriate to determine the Weibull behavior of datasets according to the traditional linearized Weibull plot (Figure 2(b)), or the nonlinear Weibull plot (Figure 2(a))?”
3.2 A Shortcoming of the Linearized Form of the Weibull Function
It should be noted that, according to the estimator defined in Eq. [5], the cumulative probability in Weibull estimation by the least squares method is set to a specific value (denoted by P_{est,i} for the ith datum point), with the same weight for each datum point. In practice, however, the true cumulative probability, referred to as P_{true,i}, is not necessarily equal to the estimated cumulative probability (P_{est,i}). Bergman[25] also pointed out that it is erroneous to assume the same weight for each datum point in Eq. [5]. Thus, there is usually a difference between P_{true,i} and P_{est,i}, which biases the estimated Weibull modulus.
Let DY_{nonlinear,i} denote the difference between the true and estimated values on the Y axis for the ith datum point in the plot of the original Weibull CDF:
$$ {\text{DY}}_{{{\text{nonlinear}},i}} = \left| {P_{{{\text{true}},i}} - P_{{{\text{est}},i}} } \right|. $$
(20)
Similarly, let DY_{linear,i} denote this difference on the Y axis in the plot of the linearized Weibull function (Eq. [4]), which is given by
$$ {\text{DY}}_{{{\text{linear}},i}} = \left| {{\text{Ln}}[ - {\text{Ln}}(1 - P_{{{\text{true}},i}} )] - {\text{Ln}}[ - {\text{Ln}}(1 - P_{{{\text{est}},i}} )]} \right|. $$
(21)
As shown in Figure 3, whatever the values of P_{true,i} and P_{est,i}, the linear transformation always numerically enlarges the difference between the true and estimated values on the Y axis. In other words, DY_{linear,i} is always larger than DY_{nonlinear,i}, especially when P_{est,i} deviates significantly from P_{true,i}. This enlargement of the deviation on the Y axis places the estimated and true positions of the data points further apart in the linearized Weibull plot than in the original Weibull CDF plot. Furthermore, the enlargement due to the linear transformation also exists in the linearized form of the 3p Weibull function.
The level of this enlargement can be described by the following enlargement factor (EF):
$$ {\text{Enlargement factor:}} \;{\text{EF}} = \frac{{{\text{DY}}_{{{\text{nonlinear}},i}} }}{{{\text{DY}}_{{{\text{linear}},i}} }}. $$
(22)
A 3D plot of this equation is shown in Figure 4. The EF value becomes very small, even close to 0, when P_{true,i} approaches 0 or 1, which means that DY_{nonlinear,i} is dramatically enlarged at these positions. This nonuniform enlargement is the underlying reason why deviations in the lower and upper tails of a dataset are so commonly reported in traditional linearized Weibull plots (Figure 1).
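Eqs. [20] through [22] can be checked numerically with the short Python sketch below; the function names are hypothetical. A brief calculus argument also supports the claim that the linearized difference is always the larger one: the slope of Ln[−Ln(1 − P)] with respect to P is 1/((1 − P)(−Ln(1 − P))), whose minimum over 0 < P < 1 is e (at P = 1 − 1/e), so EF cannot exceed 1/e, about 0.37. The example probabilities are those of the first datum in Table II.

```python
import math

def linearize(p):
    """Y coordinate in the linearized Weibull plot, Ln[-Ln(1 - P)]."""
    return math.log(-math.log(1.0 - p))

def dy_nonlinear(p_true, p_est):
    """Eq. [20]: difference on the Y axis of the original CDF plot."""
    return abs(p_true - p_est)

def dy_linear(p_true, p_est):
    """Eq. [21]: difference on the Y axis of the linearized plot."""
    return abs(linearize(p_true) - linearize(p_est))

def enlargement_factor(p_true, p_est):
    """Eq. [22]: EF = DY_nonlinear / DY_linear (requires p_true != p_est)."""
    return dy_nonlinear(p_true, p_est) / dy_linear(p_true, p_est)

# First datum of Table II: P_true = 0.001938, P_est = 0.02.
print(dy_nonlinear(0.001938, 0.02), dy_linear(0.001938, 0.02))
print(enlargement_factor(0.001938, 0.02))
```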
Since least squares regression algorithms (whether linear or nonlinear) produce their result from the residuals (i.e., by minimizing the sum of squared residuals), which depend only on the Y coordinate, the nonuniform enlargement of DY_{linear,i} may accordingly introduce more bias into the regression results (such as the estimated Weibull modulus). Therefore, the poor fit of the NonLS estimation result shown in Figure 2(b) may be due to the enlargement of the difference between the true and estimated probabilities. In this paper, the least squares method has accordingly been applied to the plot of the nonlinear Weibull CDF rather than to its linearized form. This approach is the nonlinear least squares (NonLS) method.
3.3 Examples of the Negative Effect of the Enlargement of DY_{nonlinear,i} on Weibull Estimation
To further illustrate the negative effect of the enlargement of DY_{nonlinear,i}, an example is given in Table II. This dataset was generated from a 2p Weibull distribution with shape = 11 and scale = 60, which is close to the NonLS estimation result for Griffiths and Lai’s data (Figure 1). The raw data were sorted in ascending order and are shown in the second column of Table II. The true cumulative probabilities (P_{true,i}) were calculated directly from the Weibull function and are listed in the third column. The estimated cumulative probabilities were computed as P_{est,i} = (i − 0.5)/N (Eq. [6]) and are shown in the fourth column. DY_{nonlinear,i} and DY_{linear,i} are listed in the fifth and sixth columns, respectively.
Table II
Data (Referred to as x) Generated from a Weibull Function with Shape = 11, Scale = 60
| i | x | P_{true,i} | P_{est,i} = (i − 0.5)/N | DY_{nonlinear,i} | DY_{linear,i} |
| --- | --- | --- | --- | --- | --- |
| 1 | 34.0085 | 0.001938 | 0.02 | 0.018062 | 2.34314 |
| 2 | 34.6850 | 0.002406 | 0.06 | 0.057594 | 3.24576 |
| 3 | 37.1551 | 0.005122 | 0.10 | 0.094878 | 3.02130 |
| 4 | 42.4875 | 0.022199 | 0.14 | 0.117801 | 1.90484 |
| 5 | 51.1540 | 0.158850 | 0.18 | 0.021150 | 0.13734 |
| 6 | 51.8391 | 0.181473 | 0.22 | 0.038527 | 0.21573 |
| 7 | 54.0502 | 0.271693 | 0.26 | 0.011693 | 0.05155 |
| 8 | 54.5408 | 0.295427 | 0.30 | 0.004573 | 0.01843 |
| 9 | 55.1336 | 0.325901 | 0.34 | 0.014099 | 0.05221 |
| 10 | 55.2120 | 0.330076 | 0.38 | 0.049924 | 0.17674 |
| 11 | 55.3406 | 0.336997 | 0.42 | 0.083003 | 0.28175 |
| 12 | 55.7049 | 0.357082 | 0.46 | 0.102918 | 0.33283 |
| 13 | 56.6743 | 0.413777 | 0.50 | 0.086223 | 0.26074 |
| 14 | 57.0168 | 0.434843 | 0.54 | 0.105157 | 0.30805 |
| 15 | 57.8910 | 0.490647 | 0.58 | 0.089353 | 0.25148 |
| 16 | 58.0565 | 0.501496 | 0.62 | 0.118504 | 0.32925 |
| 17 | 58.2651 | 0.515268 | 0.66 | 0.144732 | 0.39860 |
| 18 | 58.6562 | 0.541344 | 0.70 | 0.158656 | 0.43479 |
| 19 | 59.7578 | 0.615759 | 0.74 | 0.124241 | 0.34242 |
| 20 | 60.5805 | 0.671011 | 0.78 | 0.108989 | 0.30892 |
| 21 | 60.8124 | 0.686338 | 0.82 | 0.133662 | 0.39136 |
| 22 | 61.0008 | 0.698679 | 0.86 | 0.161321 | 0.49409 |
| 23 | 62.7504 | 0.805486 | 0.90 | 0.094514 | 0.34101 |
| 24 | 63.0698 | 0.822946 | 0.94 | 0.117054 | 0.48552 |
| 25 | 64.0881 | 0.873164 | 0.98 | 0.106836 | 0.63899 |

Figures 5(a) and (b) show the corresponding Weibull estimation results using the NonLS and LLS methods. The solid square points indicate P_{true,i}, while the hollow triangle points denote P_{est,i}. The deviation between the estimated and true values on the Y axis was clearly larger in the plot of the linearized Weibull function (DY_{linear,i} in Figure 5(b)) than in the plot of the original nonlinear Weibull CDF (DY_{nonlinear,i} in Figure 5(a)), especially when P_{true,i} is small. Consequently, a deviation similar to that shown in Figure 1 (i.e., Griffiths and Lai’s data) occurred in the lower tail, as shown in Figure 5(b).
In addition, the Weibull behavior of this dataset was also rejected by Tiryakioglu’s equation (Eq. [3]). The line plotted from the NonLS estimation (i.e., the solid black line in Figure 5(b)) fit the triangle points extremely poorly (i.e., R^{2} < 0), similar to the behavior shown in Figure 2(b). However, this line was closer to the true function than the linear regression line.
Figure 5(c) shows the change in the enlargement factor (EF) with P_{true,i}, revealing that the enlargement of DY_{nonlinear,i} was more dramatic when P_{true,i} is close to 0 or 1, which is consistent with Figure 4. Accordingly, Weibull modulus estimation performs more poorly with the LLS method than with the NonLS method, which explains the different levels of goodness-of-fit in the different Weibull plots shown in Figure 2.
4 Simulation Procedures
To further illustrate the discussion in Section 3, Monte Carlo simulations were performed in R Version 3.3.0 (https://www.r-project.org). As shown in Figure 6, different procedures were used to investigate the bias of the estimated Weibull modulus.
For a direct comparison of the different estimation methods (Figure 6(a)), random data points of sample size N were first generated from a 2p Weibull function (Eq. [2]) with shape parameter = 11 (referred to as m_{true}) and scale parameter = 60 (referred to as σ_{0,true}). The different approaches, listed in Table III, were used to evaluate the Weibull modulus (written as m_{est}) of the generated data.
Table III
Approaches Using the Estimators Shown in Table I Together with LLS, NonLS, and ML Methods
| Estimator | LLS | NonLS | ML |
| --- | --- | --- | --- |
| Eq. [6] | Approach 1 | Approach 11 | Approach 21 |
| Eq. [7] | Approach 2 | Approach 12 | |
| Eq. [8] | Approach 3 | Approach 13 | |
| Eq. [9] | Approach 4 | Approach 14 | |
| Eq. [10] | Approach 5 | Approach 15 | |
| Eq. [11] | Approach 6 | Approach 16 | |
| Eq. [12] | Approach 7 | Approach 17 | |
| Eq. [13] | Approach 8 | Approach 18 | |
| Eq. [14] | Approach 9 | Approach 19 | |
| Eq. [15] | Approach 10 | Approach 20 | |

The bias of the estimated Weibull modulus (m_{est}) was defined by the following equation:[5,7,26]
$$ M = m_{\text{est}} /m_{\text{true}}. $$
(23)
M = 1 means that the approach used is unbiased. In addition, since the estimated parameters are normalized by the true parameters, the particular values chosen for the scale and shape parameters are inconsequential.[5,19] This process was repeated 20,000 times to obtain 20,000 M values. The bias of each approach was evaluated by the mean of the 20,000 M values, written as M_{mean}.
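The procedure of Figure 6(a) can be imitated with a short script. The sketch below is written in Python rather than R, covers only the LLS method with the estimator of Eq. [6] (Approach 1), and uses 2,000 repetitions instead of 20,000 to keep the run time short; the function names and the random seed are arbitrary choices, so the printed values will only approximate those of the original simulation.

```python
import math
import random

M_TRUE, SIGMA0_TRUE = 11.0, 60.0   # shape and scale used in the simulations

def lls_modulus(data, a=0.5, b=0.0):
    """Weibull modulus from the slope of the linearized plot, Eqs. [4] and [5]."""
    xs = sorted(data)
    n = len(xs)
    u = [math.log(x) for x in xs]
    y = [math.log(-math.log(1.0 - (i - a) / (n + b))) for i in range(1, n + 1)]
    ub, yb = sum(u) / n, sum(y) / n
    return sum((p - ub) * (q - yb) for p, q in zip(u, y)) / sum((p - ub) ** 2 for p in u)

def mean_bias(n, reps, rng, a=0.5, b=0.0):
    """M_mean: average of M = m_est / m_true (Eq. [23]) over reps random datasets."""
    total = 0.0
    for _ in range(reps):
        # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
        sample = [rng.weibullvariate(SIGMA0_TRUE, M_TRUE) for _ in range(n)]
        total += lls_modulus(sample, a, b) / M_TRUE
    return total / reps

rng = random.Random(2017)          # assumed seed, for repeatability only
for n in (15, 25, 40):
    print(n, round(mean_bias(n, 2000, rng), 3))
```

Other rows of Table I can be tried by passing different `a` and `b` values to `mean_bias`.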
To study the effect of a dramatic enlargement of DY_{nonlinear,i} on Weibull modulus estimation (Figure 6(b)), the program checked whether the smallest datum of each randomly generated dataset was <30, so that every dataset used for the simulation contained at least one datum point smaller than 30. This setting ensured a small true probability for the first datum point (P_{true,1}), and thus the corresponding difference between the true and estimated values on the Y axis (DY_{nonlinear,1}) would be dramatically enlarged in the linearized Weibull plot, according to Figure 4 (when P_{true,1} is close to 0).
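One way to realize this check is rejection sampling: datasets are drawn repeatedly and kept only if their smallest value is below 30. The Python sketch below is an assumed reading of Figure 6(b), not the original R program; it uses the LLS method with the estimator of Eq. [6], N = 25, and only 200 accepted datasets, and the function names are hypothetical.

```python
import math
import random

def lls_modulus(data, a=0.5, b=0.0):
    """Weibull modulus from the slope of the linearized plot, Eqs. [4] and [5]."""
    xs = sorted(data)
    n = len(xs)
    u = [math.log(x) for x in xs]
    y = [math.log(-math.log(1.0 - (i - a) / (n + b))) for i in range(1, n + 1)]
    ub, yb = sum(u) / n, sum(y) / n
    return sum((p - ub) * (q - yb) for p, q in zip(u, y)) / sum((p - ub) ** 2 for p in u)

def draw_with_low_tail(n, rng, threshold=30.0):
    """Redraw a dataset (shape 11, scale 60) until its smallest datum is < threshold."""
    while True:
        sample = [rng.weibullvariate(60.0, 11.0) for _ in range(n)]
        if min(sample) < threshold:
            return sample

rng = random.Random(0)             # assumed seed, for repeatability only
n, reps = 25, 200
m_mean = sum(lls_modulus(draw_with_low_tail(n, rng)) / 11.0 for _ in range(reps)) / reps
print(round(m_mean, 3))
```

With shape 11 and scale 60, only about 1 pct of datasets of size 25 contain a value below 30, so most draws are discarded; this is why the number of accepted datasets is kept small here.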
Based on the linearized Weibull plot, Tiryakioglu et al.[19] developed an equation for the critical R^{2} (see Eq. [3]) to determine the Weibull behavior of datasets. Similarly, based on the nonlinear Weibull plot, a critical SSR (referred to as SSRC) can be calculated using a Monte Carlo simulation and the procedure shown in Figure 6(c). The SSRC is chosen so that it is larger than the SSR values of 95 pct of the simulated datasets (i.e., 19,000 out of 20,000).
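The SSRC calculation can be sketched as an empirical 95th percentile of the SSR values from NonLS fits to simulated Weibull datasets. The Python code below is an illustration under several assumptions: the NonLS fit is a Gauss-Newton iteration with step halving started from the linearized estimate, the estimator is Eq. [6], only 300 datasets are simulated rather than 20,000, and the function names are hypothetical. Its output is therefore only a rough estimate of the SSRC.

```python
import math
import random

def cdf(x, m, s):
    """2p Weibull CDF; the exponent is capped to avoid floating-point overflow."""
    return 1.0 - math.exp(-math.exp(min(m * math.log(x / s), 700.0)))

def nonls_ssr(data, max_iter=100, tol=1e-14):
    """Minimized SSR of a NonLS fit (Gauss-Newton with step halving, Eq. [6])."""
    xs = sorted(data)
    n = len(xs)
    ps = [(i - 0.5) / n for i in range(1, n + 1)]
    u = [math.log(x) for x in xs]
    y = [math.log(-math.log(1.0 - p)) for p in ps]
    ub, yb = sum(u) / n, sum(y) / n
    # Start from the linearized (LLS) estimate.
    m = sum((a - ub) * (b - yb) for a, b in zip(u, y)) / sum((a - ub) ** 2 for a in u)
    s = math.exp(ub - yb / m)

    def ssr(mm, ss):
        return sum((p - cdf(x, mm, ss)) ** 2 for x, p in zip(xs, ps))

    best = ssr(m, s)
    for _ in range(max_iter):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for x, p in zip(xs, ps):
            z = math.exp(min(m * math.log(x / s), 700.0))
            w = z * math.exp(-z)
            jm, js = w * math.log(x / s), -w * m / s   # dP/dm, dP/dsigma0
            r = p - (1.0 - math.exp(-z))
            a11 += jm * jm
            a12 += jm * js
            a22 += js * js
            g1 += jm * r
            g2 += js * r
        det = a11 * a22 - a12 * a12
        if det <= 0.0:
            break
        dm, ds = (a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det
        step, accepted = 1.0, False
        while step > 1e-8:                 # halve the step until the SSR decreases
            m_new, s_new = m + step * dm, s + step * ds
            if m_new > 0.0 and s_new > 0.0:
                trial = ssr(m_new, s_new)
                if trial < best:
                    accepted = True
                    break
            step *= 0.5
        if not accepted:
            break
        gain = best - trial
        m, s, best = m_new, s_new, trial
        if gain < tol:
            break
    return best

def critical_ssr(n, reps, rng, level=0.95):
    """Empirical SSR value that is not exceeded by `level` of the simulated datasets."""
    values = sorted(
        nonls_ssr([rng.weibullvariate(60.0, 11.0) for _ in range(n)]) for _ in range(reps)
    )
    return values[math.ceil(level * reps) - 1]

rng = random.Random(5)                     # assumed seed, for repeatability only
print(critical_ssr(25, 300, rng))
```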
5 Results
5.1 Direct Comparison of the Estimation Approaches
Figure 7 illustrates the results of the simulations shown in Figure 6(a). In general, the estimated Weibull modulus became closer to m_{true} as the sample size N increased. Figure 7(a) summarizes the M_{mean} obtained by the LLS and ML methods (i.e., Approaches 1 to 10 and 21 in Table III). Approaches 1 and 9 were relatively less biased for N ≥ 25, and Approach 5 was the least biased when the sample size N was <25. This observation is consistent with the results of References 5 and 7. Figure 7(b) summarizes the M_{mean} achieved with the NonLS method (i.e., Approaches 11 to 20 in Table III). Approach 12, whose estimator was the worst for the LLS method (i.e., Approach 2), was clearly less biased than the other estimators used with the NonLS method, especially when the sample size was smaller than 30.
For a further comparison, Approaches 1, 5, 9, and 12 are plotted together in Figure 7(c). For 15 ≤ N < 35 and N ≥ 90, Approach 12 clearly resulted in the least bias. For 35 ≤ N < 90, Approaches 1 and 12 were better than the other approaches. Figure 7(d) shows the Standard Error (SE) of M, revealing a negligible difference between the SE values of the different approaches.
5.2 Effect of a Dramatic Enlargement of DY_{nonlinear,i}
Figure
8 shows the
M
_{mean} of the datasets containing at least one datum point <30. As can be seen from Figure
8(a), the LLS method (
i.e., Approaches 1 to 10 in Table
III) was seriously biased when dealing with this type of data. For 15 ≤
N ≤ 40, which was the common sample size for obtaining the Weibull modulus of castings in previous publications,[
2,
27–
29] the
M
_{mean} values were no more than 0.7, presenting a significant bias of the estimated Weibull modulus. In addition, even with a large sample size, such as
N = 115, the
M
_{mean} values of Approaches 1 to 10 still did not exceed 0.85. Thus, it can be suggested that the LLS method is not suitable for estimating the Weibull modulus, when
P
_{est.i } dramatically deviates from
P
_{true,i } in the lower tail. This may explain why the data points shown in Figure
1 deviated from the linear fit.
By contrast, according to Figure 8(b), the NonLS method (i.e., Approaches 11 to 20) was significantly less biased. Even the worst approach of the NonLS method (Approach 12) produced a smaller bias (M_{mean} > 0.85 at N = 15) than any approach using the LLS method (M_{mean} < 0.8 at N = 115). In addition, Approach 11 gave the least biased estimates among all the NonLS approaches (Approaches 11 to 20), especially when the sample size was smaller than 30.
Moreover, Approach 12, which was the least biased NonLS approach in Figure 7(b), became the most seriously biased of the NonLS approaches, indicating that the bias of an approach can differ depending on the level of enlargement of DY_{nonlinear,i}. The M_{mean} obtained by the ML method (Approach 21 in Table III) is also shown in Figure 8(b); Approach 21 was clearly more biased than the NonLS method for all the sample sizes examined.
The 95 pct confidence intervals (CI) of Approaches 1, 11, and 21 were computed under the assumption that the M values follow a normal distribution, as shown in Figure 8(c).
Therefore, the NonLS method is relatively reliable when the estimated probability (P_{est,i}) deviates dramatically from the true probability (P_{true,i}) in the lower tail, and Approach 11 is recommended as the default for estimating the Weibull modulus for this type of data.
5.3 Critical Sum of Residual Squares (SSRC)
Table IV shows the SSRC values for different sample sizes. The estimator used was P = (i − 0.5)/N (i.e., Eq. [6] in Table I). As previously mentioned, 95 pct of the Weibull datasets (19,000 out of 20,000) in the simulation had an SSR smaller than the SSRC. Applying this criterion to the NonLS estimation result shown in Figure 2 (N = 25, SSR = 0.0238, which is below the SSRC of 0.0797666), it can be determined that Griffiths and Lai’s data (Figure 1) follow a 2p Weibull distribution.
Table IV
SSRC Values; the Estimator Used is P = (i − 0.5)/N
| N | SSRC | SSRC_{mean} |
| --- | --- | --- |
| 15 | 0.0756771 | 5.0451E−03 |
| 20 | 0.0784904 | 3.9245E−03 |
| 25 | 0.0797666 | 3.1907E−03 |
| 30 | 0.0802094 | 2.6736E−03 |
| 35 | 0.0818855 | 2.3396E−03 |
| 40 | 0.0816979 | 2.0424E−03 |
| 45 | 0.0832347 | 1.8497E−03 |
| 50 | 0.0826510 | 1.6530E−03 |
| 55 | 0.0837245 | 1.5223E−03 |
| 60 | 0.0829798 | 1.3830E−03 |
| 65 | 0.0834890 | 1.2844E−03 |
| 70 | 0.0831899 | 1.1884E−03 |
| 75 | 0.0844495 | 1.1260E−03 |
| 80 | 0.0846093 | 1.0576E−03 |
| 85 | 0.0842047 | 9.9064E−04 |
| 90 | 0.0849502 | 9.4389E−04 |
| 95 | 0.0840484 | 8.8472E−04 |
| 100 | 0.0846551 | 8.4655E−04 |
| 105 | 0.0833319 | 7.9364E−04 |
| 110 | 0.0842549 | 7.6595E−04 |
| 115 | 0.0845723 | 7.3541E−04 |
| 120 | 0.0838167 | 6.9847E−04 |

However, this conclusion differs from that obtained with Tiryakioglu’s equation (i.e., Eq. [3]), which suggested that Griffiths and Lai’s data followed a 3p/mixed Weibull distribution. In conjunction with the discussion in Section 3.2, this indicates that Tiryakioglu’s equation may falsely reject the Weibull behavior of Griffiths and Lai’s data, owing to the shortcomings of the linearized Weibull function.
In addition, since the SSR value is affected by the sample size N, the SSRC values of samples of different sizes cannot be compared directly with each other. Therefore, the mean critical sum of squared residuals (SSRC_{mean}) can be used to evaluate the goodness-of-fit of the nonlinear regression results, as shown in the third column of Table IV and defined by
$$ {\text{SSRC}}_{\text{mean}} = {\text{SSRC}}/N. $$
(24)
The best-fit curve of SSRC_{mean}, shown in Figure 9, follows the formula
$$ {\text{SSRC}}_{\text{mean}} = 0.06463\,N^{ - 0.93778}. $$
(25)
Therefore, the authors recommend using this equation to determine the Weibull behavior of datasets.
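A possible way to apply the fitted curve is sketched below in Python. The decision rule, comparing SSR/N of a NonLS fit with the fitted SSRC_{mean}, is an assumed reading of the recommendation, and the function names are hypothetical. The example values are the SSR figures quoted earlier for Griffiths and Lai’s data (N = 25): 0.0238 for the NonLS curve and 0.4096 for the curve drawn from the LLS parameters.

```python
def ssrc_mean(n):
    """Fitted critical mean SSR for sample size n (best-fit curve of Figure 9)."""
    return 0.06463 * n ** -0.93778

def is_weibull_by_ssr(ssr, n):
    """True if the mean SSR of a fit lies below the fitted critical value."""
    return ssr / n < ssrc_mean(n)

print(ssrc_mean(25))                   # compare with 3.1907E-03 in Table IV
print(is_weibull_by_ssr(0.0238, 25))   # NonLS curve
print(is_weibull_by_ssr(0.4096, 25))   # curve drawn from the LLS parameters
```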
5.4 Practical Data from Mg-Alloy Castings
Figure 10 shows an example of a commercial purity Mg-alloy casting. Like the casting shown in Figure 1, this Mg casting was produced using a resin-bonded sand mold with a top-filled running system, and the casting procedures were the same as in Griffiths and Lai’s work.[2] The material came from the same batch as the Mg alloy used by Griffiths and Lai,[2] so the results can be readily compared. Thus, the reproducibility of this casting was expected to be close to that of the casting shown in Figure 1. After solidification, the casting was machined into 40 test bars, which were tensile tested. The UTS data were used for the Weibull analysis.
Figure 10 shows the Weibull parameters evaluated using Approaches 1 and 11. The data points fit the linear regression line well (Figure 10(a)). In addition, in Figure 10(b), the data points fit both the curve obtained using the NonLS method (Approach 11, SSR = 0.0222) and that obtained using the LLS method (Approach 1, SSR = 0.0265). In conjunction with Figure 7, this suggests that the enlargement of DY_{nonlinear,i} in this dataset was not dramatic. The estimated Weibull moduli (11.7 and 11.4) were close to the NonLS estimation result for Griffiths and Lai’s data shown in Figure 2, rather than to the LLS estimation result.
As previously mentioned, this casting process (i.e., that of the casting shown in Figure 10) was the same as Griffiths and Lai’s casting process (i.e., that of the casting shown in Figure 2), and thus the estimated Weibull modulus shown in Figure 10 should be close to the true Weibull modulus of Griffiths and Lai’s casting. Therefore, the Weibull modulus shown in Figure 10 can be used as a reference value to determine which Weibull modulus in Figure 2 (i.e., the LLS or the NonLS estimation result) is closer to the true value. Based on the comparison between Figures 2 and 10, the NonLS estimation result shown in Figure 2 (i.e., m = 11.14) is closer to the true Weibull modulus than the LLS estimation result (i.e., m = 4.4). The NonLS method is accordingly more appropriate for estimating the Weibull modulus of Griffiths and Lai’s data.
In addition, this comparison (Figures 10 and 2) further revealed that the SSRC method (Figure 9) may be more appropriate for interpreting Griffiths and Lai’s data, whereas Tiryakioglu’s equation (Eq. [3]) falsely interprets this dataset as following a 3p Weibull distribution.
Figure 11 shows an example of results from an AZ91 casting, produced in the same way as the cast test bars whose results are shown in Figure 10. As shown in Figure 11(a), the data points deviated from the linear regression line, and the corresponding R^{2} was smaller than the critical R^{2} given by Eq. [3] ((R_{0.05})^{2} = 0.9256), rejecting the Weibull behavior of this dataset. However, according to the nonlinear Weibull plot (Figure 11(b)), there was a clear difference between the curves obtained by Approach 1 (the LLS method) and Approach 11 (the NonLS method). The curve obtained by Approach 11 had an SSR value smaller than the critical SSR (SSRC = 0.0816979, Table IV), suggesting that the data followed a 2p Weibull distribution. This difference in the judgment of Weibull behavior is similar to that found for Griffiths and Lai’s data (Figure 2), and the dataset may be falsely interpreted in the linearized Weibull plot.
6 Discussion
6.1 Determination of Weibull Behavior of Datasets
Figure 3 indicates that the difference between the estimated and true cumulative probabilities of data points (DY_{nonlinear,i}) is significantly enlarged by the linear transformation of the Weibull function. Figure 4 further reveals that this enlargement is not uniform: it is more dramatic in the lower and upper tails (i.e., when P_{true,i} is close to 0 or 1).
According to the Weibull analysis of the example data (Figures 2 and 5), the nonuniform enlargement of DY_{nonlinear,i} can affect the judgment of the Weibull behavior of datasets. The reanalysis of Griffiths and Lai’s data (Figure 2) and the corresponding SSRC value (Table IV) indicated that it is not necessarily correct to reject the Weibull behavior of a dataset on the basis of the goodness-of-fit of the linear regression line (Eq. [3]). It should be noted that if a significant enlargement of DY_{nonlinear,i} occurs in the lower tail (i.e., in the first few data points), even a dataset generated from a Weibull distribution will probably fit the linear regression line poorly, as shown in Figure 5(b).
The experimental result (Figure 10) further showed that the dataset of a top-filled commercial purity Mg casting followed a 2p Weibull distribution according to both R^{2} and SSR. In addition, both the LLS and NonLS results in Figure 10 were close to the NonLS estimation result for Griffiths and Lai’s data (Figure 2), which further supports the reliability of the NonLS estimation. Figure 11 gives a further example of a dataset that may be falsely interpreted.
Therefore, the nonuniform enlargement of DY_{nonlinear,i} is an underlying reason for the deviation of the data points widely found in previous publications.[2,9–13] Previous researchers suggested that this deviation could be due to the nature of the physical flaws (i.e., defects such as porosity, low-melting-point intermetallic compounds, and segregation) in the material,[14,30] and the corresponding data points were interpreted as following an underlying 3p or mixed Weibull distribution.[15–17] However, further analysis (Eq. [25]) is still required to identify the actual reason for the deviation. The simulation results (Figure 5) and experimental results (Figures 2 and 10) indicated that a deviation caused by the nonuniform enlargement of DY_{nonlinear,i} could be falsely attributed to physical flaws (i.e., to a 3p/mixed Weibull distribution). This misinterpretation may be present in previous studies.
6.2 Effect on Weibull Modulus Estimation
The results of the Monte Carlo simulations demonstrated that the nonuniform enlargement of DY_{nonlinear,i} resulted in a greater bias in the Weibull modulus estimation. When the difference between DY_{linear,i} and DY_{nonlinear,i} was not particularly large (Figure 7), the NonLS method was slightly less biased than the LLS method. However, when a high enlargement of DY_{nonlinear,i} occurs in the lower tail (Figure 5), the NonLS method has a considerable advantage over the LLS method.
It is therefore recommended that the plot of the original nonlinear Weibull CDF and the NonLS method, which avoids the linear transformation, should be used for the Weibull analysis of material properties.
7 Conclusion
1. It has been demonstrated that the difference between the estimated and true cumulative probabilities of data points can be dramatically enlarged in the lower and upper tails, due to the linear transformation in the traditional Weibull modulus estimation using the LLS method.
2. Such an enlargement is an underlying reason for the deviation from the linear regression line, which was previously widely reported and interpreted as being due to physical flaws contained in brittle and metallic materials.
3. It is therefore not necessarily correct to reject the Weibull behavior of a dataset according to the goodness-of-fit of the linear regression line, such as R^{2}.
4. The NonLS method, which is demonstrated to be less biased than both the LLS and ML methods, is recommended for Weibull modulus estimation.
Acknowledgments
The authors acknowledge funding from the EPSRC LiME Grant EP/H026177/1, and thank Professor Murat Tiryakioglu for his comments on an earlier draft of this paper.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.