MONITORING HIGH-YIELD PROCESSES
Cesar Acosta-Mejia
June 2011

EDUCATION
– B.S. Catholic University of Peru
– M.A. Monterrey Tech, Mexico
– Ph.D. Texas A&M University
RESEARCH
– Quality Engineering: SPC, process monitoring
– Applied Probability and Statistics: sequential analysis
– Probability modeling: change point detection, process surveillance

MOTIVATION
– High-yield processes
– Monitor the fraction of nonconforming units p
– Very small p (ppm)
– To detect increases or decreases in p
– A very sensitive procedure

ASSUMPTIONS
– Process is observed continuously
– Process can be characterized by Bernoulli trials
– Fraction of nonconforming units p is constant, but may change at an unknown point in time

Hypothesis Testing
For (level $\alpha$) two-sided tests, the rejection region R is made up of two subregions R1 and R2 with limits L and U such that
$P[X \le L] = \alpha/2$
$P[X \ge U] = \alpha/2$

Hypothesis Testing
Consider testing the proportion p

Hypothesis Testing
The test may be based on different random variables:
– Binomial (n, p)
– Geometric (p)
– Negative Binomial (r, p)
– Binomial of order k (n, p)
– Geometric of order k (p)
– Negative Binomial of order k (r, p)

Binomial tests when p is very small

Test 1
Proportion $p_0 = 0.025$ (25000 ppm)
Test $H_0: p = 0.025$ against $H_1: p \ne 0.025$
X: number of nonconforming units in 500 items

Test 1
Let $X \sim \text{Binomial}(500, p)$
To test the hypothesis $H_0: p = 0.025$ against $H_1: p \ne 0.025$, the rejection region is
$R = \{X \le 2\} \cup \{X \ge 25\}$
since $P[X \le 2] < \alpha/2$ and $P[X \ge 25] < \alpha/2$

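A minimal sketch of how the limits of Test 1 can be checked numerically. It assumes $\alpha = 0.0027$, the level implied by the value E[T] = 370.4 quoted later in the talk; the function names are illustrative and not part of the talk.

```python
from math import comb


def binom_pmf(k, n, p):
    """P[X = k] for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_limits(n, p0, alpha):
    """Return (L, U): the largest L with P[X <= L] <= alpha/2 and the
    smallest U with P[X >= U] <= alpha/2. A limit is None when no value
    satisfies its condition."""
    half = alpha / 2
    lower, cum = None, 0.0
    for k in range(n + 1):            # accumulate the lower tail
        cum += binom_pmf(k, n, p0)
        if cum > half:
            break
        lower = k
    upper, tail = None, 0.0
    for k in range(n, -1, -1):        # accumulate the upper tail
        tail += binom_pmf(k, n, p0)
        if tail > half:
            break
        upper = k
    return lower, upper


ALPHA = 0.0027  # assumed level (inferred, not stated on this slide)
print(two_sided_limits(500, 0.025, ALPHA))  # limits for p0 = 0.025
```

Under that assumed level the search returns L = 2 and U = 25, in agreement with the rejection region of Test 1.
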
Test 1
Plot of P[rejecting $H_0$] vs. p

Hypothesis Testing
Now consider testing $p_0 = 0.0001$ (100 ppm)

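Test 1
Let $X \sim \text{Binomial}(n = 500, p)$
To test the hypothesis $H_0: p = 0.0001$ against $H_1: p \ne 0.0001$, the rejection region is
$R = \{X \ge 2\}$
since $P[X \ge 2] < \alpha/2$, while $P[X = 0] \approx 0.95$ is far above $\alpha/2$, so no lower rejection region exists
For n = 500 there is no two-sided test for p = 0.0001
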
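A short numerical check of why the binomial test becomes one-sided at 100 ppm. This is a sketch that assumes $\alpha = 0.0027$ (inferred from E[T] = 370.4 later in the talk); the variable names are illustrative.

```python
from math import ceil, log

n, p0, alpha = 500, 0.0001, 0.0027   # alpha is an assumed level

p_zero = (1 - p0) ** n                      # P[X = 0]
p_one = n * p0 * (1 - p0) ** (n - 1)        # P[X = 1]
p_two_or_more = 1 - p_zero - p_one          # P[X >= 2]

print(f"P[X = 0]  = {p_zero:.4f}")
print(f"P[X >= 1] = {1 - p_zero:.4f}")
print(f"P[X >= 2] = {p_two_or_more:.6f}")

# Smallest sample size for which {X = 0} alone could serve as a lower
# rejection region, i.e. (1 - p0)**n <= alpha/2.
min_n = ceil(log(alpha / 2) / log(1 - p0))
print("smallest n with P[X = 0] <= alpha/2:", min_n)
```

Even the smallest possible outcome, X = 0, is far too likely under $H_0$ to belong to a rejection region when n = 500, so decreases in p cannot be detected with this sample size.
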
Test 1
Plots of the distributions Binomial (n = 500, p = 0.025) and Binomial (n = 500, p = 0.0001)
Test 1
For this test, plot of P[rejecting $H_0$] vs. p

Consider a geometric test for p when p is very small

Test 2
Let $X \sim \text{Geo}(p)$
To test the hypothesis ($\alpha = 0.0027$) $H_0: p = 0.0001$ against $H_1: p \ne 0.0001$, the rejection region is
$R = \{X \le 13\} \cup \{X \ge 66075\}$
since $P[X \le 13] \le \alpha/2$ and $P[X \ge 66075] \le \alpha/2$
An observation in $\{X \le 13\}$ leads to the conclusion that $p > 0.0001$

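The limits of Test 2 follow in closed form from $P[X \le x] = 1 - (1 - p)^x$. The sketch below assumes $\alpha = 0.0027$ (inferred from E[T] = 370.4 later in the talk) and X counted as the number of items up to and including the first nonconforming one; `geometric_limits` is an illustrative name.

```python
from math import ceil, floor, log


def geometric_limits(p0, alpha):
    """Limits (L, U) of the two-sided geometric test:
    largest L with P[X <= L] <= alpha/2,
    smallest U with P[X >= U] <= alpha/2."""
    q, half = 1 - p0, alpha / 2
    # P[X <= x] = 1 - q**x <= alpha/2   <=>   x <= log(1 - alpha/2) / log(q)
    lower = floor(log(1 - half) / log(q))
    # P[X >= x] = q**(x - 1) <= alpha/2 <=>   x - 1 >= log(alpha/2) / log(q)
    upper = ceil(log(half) / log(q)) + 1
    return lower, upper


print(geometric_limits(0.0001, 0.0027))  # (13, 66075)
```
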
Test 2
For this test, plot of P[rejecting $H_0$] vs. p

Another performance measure of a sequential testing procedure

Hypothesis Testing
Let $X_1, X_2, \ldots \sim \text{Geo}(p)$ iid
Let T be the number of observations until $H_0$ is rejected
Consider the random variables, for j = 1, 2, …,
$A_j = 1$ if $X_j \in R$, with $P[A_j = 1] = P_R$
$A_j = 0$ otherwise
Then the probability function of T is
$P[T = t] = P[A_1 = 0]\,P[A_2 = 0] \cdots P[A_{t-1} = 0]\,P[A_t = 1] = P_R\,[1 - P_R]^{t-1}$

Hypothesis Testing
Therefore $T \sim \text{Geo}(P_R)$
Let us consider $E[T] = 1/P_R$, the mean number of tests until $H_0$ is rejected, as a performance measure
When $p = p_0$, $E[T] = 1/\alpha$

Test 2
Let $X \sim \text{Geo}(p)$, $q = 1 - p$
$P[X \le x] = 1 - q^x$
Let the rejection region be $R = \{X < L\} \cup \{X > U\}$; then
$P_A = P[\text{not rejecting } H_0] = P[L \le X \le U] = 1 - q^U - (1 - q^{L-1}) = q^{L-1} - q^U$
$P_R = 1 - (1 - p)^{L-1} + (1 - p)^U$

Test 2
Let $X \sim \text{Geo}(p)$
To test the hypothesis ($\alpha = 0.0027$) $H_0: p = 0.0001$ against $H_1: p \ne 0.0001$, the rejection region is
$R = \{X < 14\} \cup \{X > 66074\}$
Then P[rejecting $H_0$] is
$P_R = 1 - (1 - p)^{13} + (1 - p)^{66074}$
$E[T] = 1/P_R$
When $p = p_0$, $E[T] = 1/\alpha = 1/0.0027 = 370.4$

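The formula for $P_R$ can be evaluated directly to see how E[T] behaves as p moves away from $p_0$. A minimal sketch, using the limits of Test 2 as defaults; the function names and the grid of p values are illustrative choices, not taken from the talk.

```python
def rejection_probability(p, lower=14, upper=66074):
    """P_R = 1 - (1 - p)**(L - 1) + (1 - p)**U for R = {X < L} or {X > U}."""
    q = 1.0 - p
    return 1.0 - q ** (lower - 1) + q ** upper


def mean_tests_to_reject(p, lower=14, upper=66074):
    """E[T] = 1 / P_R, the mean number of tests until H0 is rejected."""
    return 1.0 / rejection_probability(p, lower, upper)


# Illustrative grid of p values around p0 = 0.0001
for p in (0.00005, 0.0001, 0.0002, 0.001):
    print(f"p = {p:.5f}   E[T] = {mean_tests_to_reject(p):8.1f}")
```

At $p = p_0$ the exact value comes out slightly above the nominal $1/\alpha = 370.4$, because the geometric tails cannot hit $\alpha/2$ exactly. With these limits the value at p = 0.0002 is also somewhat larger than at $p_0$, which is one way to see that the test is slow to react to a moderate increase in p.
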
Test 2
How can we improve upon this test?
We want E[T]

Run sum procedure
Geometric chart
A sequence of tests of hypotheses

THE RUN SUM
– The interval between limits is divided into regions
– A score is assigned to each region
– A sum is accumulated according to the region in which the statistic falls
– The sum is reset when the last mean falls on the other side of the center line
– Reject $H_0$ when the cumulative score equals or exceeds a limit value

THE RUN SUM
– The interval between limits is divided into eight regions
– A score is assigned to each region (0,1,2,3)
– A sum is accumulated according to the region in which the statistic falls
– The sum is reset when the last mean falls on the other side of the center line
– Reject $H_0$ when the cumulative score equals or exceeds a limit value L = 5

THE RUN SUM – for the mean
THE GEOMETRIC RUN SUM

THE GEOMETRIC RUN SUM - DEFINITION
Let us denote the following cumulative sums
$SU_t = SU_{t-1} + q_t$ if $X_t$ falls above the center line, $SU_t = 0$ otherwise
$SL_t = SL_{t-1} - q_t$ if $X_t$ falls below the center line, $SL_t = 0$ otherwise
where $q_t$ is the score assigned to the region in which $X_t$ falls

THE GEOMETRIC RUN SUM - DEFINITION
The run sum statistic is defined, for t = 1, 2, …, by
$S_t = \max\{SU_t, -SL_t\}$
with $SU_0 = 0$, $SL_0 = 0$ and limit sum L

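The recursion for $SU_t$, $SL_t$ and $S_t$ can be sketched as follows. The four lower limits (13, 220, 1701, 6932) are the ones given in the example of the talk; the three upper limits in `LIMITS` are hypothetical placeholders, since their values are not available here. Region boundaries are taken as $l_{j-1} < X \le l_j$, which is also an assumption.

```python
from bisect import bisect_left

# l1..l4 come from the talk's example; l5..l7 are HYPOTHETICAL placeholders.
LIMITS = [13, 220, 1701, 6932, 20000, 40000, 66074]


def run_sum_path(xs, limits=LIMITS, scores=(0, 1, 2, 3), limit_sum=5):
    """Return the successive values of S_t, stopping at the first signal."""
    su = sl = 0
    path = []
    for x in xs:
        region = bisect_left(limits, x)      # 0..7, lowest to highest region
        if region >= 4:                      # above the center line l4
            su, sl = su + scores[region - 4], 0
        else:                                # at or below the center line
            su, sl = 0, sl - scores[3 - region]
        s = max(su, -sl)
        path.append(s)
        if s >= limit_sum:                   # reject H0
            break
    return path


print(run_sum_path([100, 150, 5000, 10]))    # [2, 4, 4, 7]
print(run_sum_path([100, 7000, 100]))        # [2, 0, 2]
```

In the first sequence three short runs on the same side of the center line accumulate to a signal; in the second, one observation above the center line resets the lower sum.
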
THE GEOMETRIC RUN SUM - DESIGN
Need to define
– region limits ($l_1, l_2, l_3$ and $l_5, l_6, l_7$)
– region scores ($q_1, q_2, q_3$ and $q_4$)
– limit sum L

THE GEOMETRIC RUN SUM - DESIGN
Region limits above and below the center line are not symmetric around the center line.
To define the region limits we use the cumulative probabilities of the distribution of $X \sim \text{Geo}(p_0)$.
Such probabilities were chosen to be the same as those of a run sum for the mean with the same scores.

THE GEOMETRIC RUN SUM - EXAMPLE
If $X \sim \text{Geo}(p_0 = 0.0001)$, the region limits are given by the cumulative probabilities
$P[X \le l_1]$, $P[X \le l_2]$, $P[X \le l_3]$, $P[X \le l_4]$, $P[X \le l_5]$, $P[X \le l_6]$, $P[X \le l_7]$
THE GEOMETRIC RUN SUM - EXAMPLE
If $X \sim \text{Geo}(p_0 = 0.0001)$, the region limits are given by
$P[X \le 13]$
$P[X \le 220]$
$P[X \le 1701]$
$P[X \le 6932]$
$P[X \le l_5]$
$P[X \le l_6]$
$P[X \le l_7]$

THE GEOMETRIC RUN SUM - EXAMPLE
Conclude $H_1: p \ne p_0$ when $S_t \ge L$
Let T be the number of samples until $H_0$ is rejected
What is the distribution of T? What are its mean and standard deviation?

RUN SUM (0,1,2,3) L = 5 - MODELING
Markov chain
– States defined by the values that $S_t$ can assume
– State space = {-4, -3, -2, -1, 0, 1, 2, 3, 4, C}, where $C = \{n \in \mathbb{Z} \mid n = \ldots, -6, -5, 5, 6, \ldots\}$ is an absorbing state
– Transition probabilities

RUN SUM (0,1,2,3) L = 5 - MODELING
Let
$p_1 = P[X \le l_1]$
$p_2 = P[l_1 < X \le l_2]$
$p_3 = P[l_2 < X \le l_3]$
$p_4 = P[l_3 < X \le l_4]$
$p_5 = P[l_4 < X \le l_5]$
$p_6 = P[l_5 < X \le l_6]$
$p_7 = P[l_6 < X \le l_7]$
$p_8 = P[X > l_7]$
where $X \sim \text{Geo}(p_0)$

RUN SUM (0,1,2,3) L = 5 - MODELING
Transitions from $S_t = 0$
Transitions from $S_t = 1$
Transitions from $S_t = 2$

RUN SUM (0,1,2,3) L = 5 - MODELING
Let T be the first passage time to state C, that is, the number of observations until the run sum rejects $H_0$
Let Q be the sub-matrix of transition probabilities among the transient states; then
$P[T \le t] = e\,(I - Q^t)\,J$
$G(s) = s\,e\,(I - sQ)^{-1}(I - Q)\,J$
$E[T] = e\,(I - Q)^{-1} J$
where e is a row vector defining the initial state $S_0$ and J is a column vector of ones

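A minimal pure-Python sketch of the Markov chain computation for the run sum with scores (0,1,2,3) and L = 5. The transition rule is read off the definition of $SU_t$ and $SL_t$; the region probabilities in `IN_CONTROL` are assumed values (normal probabilities at 1, 2 and 3 standard deviations, as in a run sum for the mean) and are not taken from the talk. All function names are illustrative.

```python
def transient_matrix(region_probs, scores=(0, 1, 2, 3), limit_sum=5):
    """Q over transient states -(L-1)..(L-1). region_probs = (p1, ..., p8),
    lowest region first: scores 3,2,1,0 below the center line, 0,1,2,3 above."""
    states = list(range(-(limit_sum - 1), limit_sum))
    index = {s: i for i, s in enumerate(states)}
    q = [[0.0] * len(states) for _ in states]
    below = list(zip(region_probs[:4], reversed(scores)))
    above = list(zip(region_probs[4:], scores))
    for s in states:
        for prob, score in below:
            new = min(s, 0) - score        # upper sum resets, lower sum grows
            if abs(new) < limit_sum:       # otherwise absorbed in C
                q[index[s]][index[new]] += prob
        for prob, score in above:
            new = max(s, 0) + score        # lower sum resets, upper sum grows
            if abs(new) < limit_sum:
                q[index[s]][index[new]] += prob
    return states, q


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def mean_time_to_signal(region_probs, start=0):
    """E[T] = e (I - Q)^{-1} J, with e selecting the starting state."""
    states, q = transient_matrix(region_probs)
    n = len(states)
    i_minus_q = [[(1.0 if i == j else 0.0) - q[i][j] for j in range(n)]
                 for i in range(n)]
    return solve(i_minus_q, [1.0] * n)[states.index(start)]


def cdf_T(t, region_probs, start=0):
    """P[T <= t] = e (I - Q^t) J, computed by propagating the row vector e."""
    states, q = transient_matrix(region_probs)
    v = [1.0 if s == start else 0.0 for s in states]
    for _ in range(t):
        v = [sum(v[i] * q[i][j] for i in range(len(v))) for j in range(len(v))]
    return 1.0 - sum(v)


# ASSUMED in-control region probabilities (normal, 1-2-3 sigma regions)
IN_CONTROL = (0.00135, 0.02140, 0.13591, 0.34134,
              0.34134, 0.13591, 0.02140, 0.00135)
print("E[T] in control:", mean_time_to_signal(IN_CONTROL))
print("P[T <= 100]    :", cdf_T(100, IN_CONTROL))
```

Replacing `IN_CONTROL` with the probabilities $p_1, \ldots, p_8$ of Geo(p) for a range of p values would give a curve of E[T] against p for the geometric run sum.
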
Geometric Run sum
For this chart, plot of E[T] vs. p
Geometric Run sum
A comparison with Test 2

RUN SUM – FURTHER IMPROVEMENT
Consider a geometric run sum
– No regions
– Center line equal to $l_4$
– Scores are equal to X
– Design: limit sum L

NEW GEOMETRIC RUN SUM - DEFINITION
Let us denote the following cumulative sums
$SU_t = SU_{t-1} + X_t$ if $X_t$ falls above the center line, $SU_t = 0$ otherwise
$SL_t = SL_{t-1} - X_t$ if $X_t$ falls below the center line, $SL_t = 0$ otherwise

NEW GEOMETRIC RUN SUM - DEFINITION
The run sum statistic is defined, for t = 1, 2, …, by
$S_t = \max\{SU_t, -SL_t\}$
with $SU_0 = 0$, $SL_0 = 0$ and limit sum L

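A sketch of the new run sum, in which each observation is its own score. The center line 6932 is the value of $l_4$ from the earlier example; the limit sums used in the calls are hypothetical, since the talk leaves L as a design parameter. Treating an observation exactly equal to the center line as "below" is also an assumption.

```python
def new_run_sum_path(xs, center, limit_sum):
    """Return the successive values of S_t = max(SU_t, -SL_t),
    stopping at the first t with S_t >= limit_sum."""
    su = sl = 0
    path = []
    for x in xs:
        if x > center:               # above the center line
            su, sl = su + x, 0
        else:                        # at or below the center line (assumed)
            su, sl = 0, sl - x
        s = max(su, -sl)
        path.append(s)
        if s >= limit_sum:           # reject H0
            break
    return path


CENTER = 6932                        # l4 from the talk's example
print(new_run_sum_path([7000, 8000, 100, 200], CENTER, limit_sum=20000))
print(new_run_sum_path([7000, 8000, 100, 200], CENTER, limit_sum=15000))
```

The first call never reaches the hypothetical limit and returns the whole path; the second stops at the second observation, where the upper sum reaches 15000.
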
NEW GEOMETRIC RUN SUM - MODELING
– Markov chain: not possible, huge number of states
– Need to derive the distribution of T
– Can show that

CONCLUSIONS
– The run sum is an effective procedure for two-sided monitoring
– For monitoring very small p, it is more effective than a sequence of geometric tests
– With a limited number of regions, it can be modeled by a Markov chain

TOPICS OF INTEREST
– Estimate the time at which p changes (the change point)
– Bayesian tests
– Lack of independence (chain-dependent Bernoulli trials)
– The run sum can be applied to other instances: monitoring an arrival process

REFERENCES
– Acosta-Mejia, C. A., Pignatiello, J. J., Jr. (2010). The run sum R chart with fast initial response. Communications in Statistics – Simulation and Computation, 39:
– Balakrishnan, N., Koutras, M. V. (2003). Runs and Scans with Applications. J. Wiley, New York, N.Y.
– Bourke, P. D. (1991). Detecting a shift in fraction nonconforming using run-length control charts with 100% inspection. Journal of Quality Technology, 23(3),
– Calvin, T. W. (1983). Quality Control Techniques for Zero-Defects. IEEE Transactions on Components, Hybrids and Manufacturing Technology, 6:
– Champ, C. W., Rigdon, S. E. (1997). An Analysis of the Run Sum Control Chart. Journal of Quality Technology, 29:
– Reynolds, J. H. (1971). The Run Sum Control Chart Procedure. Journal of Quality Technology, 3:23-27