mm-1061 == By the way, you can make a good living doing financial6 === Subject: Re: stat measures > I know that this is a very basic question, but neither one of you > actually answered it. This problem has come up before. Our answer is, in a sense, > that we think you're asking the wrong questions. Yes. Specifically, I > believe Reef Fish's answer to #2 is that it is all random (unless > I misinterpret his monkey throwing darts model). I thought the money and darts idea is so well known that it's been > mentioned in business news dozens of times for the non-statistically > trained. It basically encompasses the theories behind the Efficient > Market and Random Walk hypotheses, which had been throughly tested > by the Chicago group of researchers at the Center for Research in > Security Prices. http://gsbwww.uchicago.edu/research/crsp/ The market behavior can be summarized in two simple sentences: 1. The prices at the past cannot predict prices in the future. 2. The only present information that can predict future prices > are insider information that are illegal to use. the illegal use of such info that put the Martha Stewart > in jail. > Perhaps my previous post was a bit cryptic for those unfamiliar > with the two simple truths above. I didn't want to claim I knew exactly what you were talking about, although I thought I did. > That was why I said the idea > of trying to predict future prices of stock is like the re- > invention of wheels that had long been known to be broken ones. Perhaps this follow-up post makes it clearer. There is a VAST > amount of finance and economics literature that dealt with this > subject. I haven't learned anything new about it since I was rubbing > elbows with those researchers (many of whom have since become > Nobel winners) over three decades ago. Well, I did learn a > tiny little bit new on it since. :-) It's all deja vu. That's why I was surprised that folks are STILL trying to > re-invent the same useless wheels. -- Bob. 
People are still trying to come up with a perpetual motion machine, too. The lure of easy riches is strong, and there no doubt some people become rich in the stock market (by whatever methods). Enough people claim their methods work to short circuit the effort to thoroughly investigate the field before jumping in. Russell === Subject: Re: stat measures > I know that this is a very basic question, but neither one of you > actually answered it. This problem has come up before. Our answer is, in a sense, > that we think you're asking the wrong questions. Yes. Specifically, I > believe Reef Fish's answer to #2 is that it is all random (unless > I misinterpret his monkey throwing darts model). I thought the money and darts idea is so well known that it's been > mentioned in business news dozens of times for the non-statistically > trained. It basically encompasses the theories behind the Efficient > Market and Random Walk hypotheses, which had been throughly tested > by the Chicago group of researchers at the Center for Research in > Security Prices. http://gsbwww.uchicago.edu/research/crsp/ The market behavior can be summarized in two simple sentences: 1. The prices at the past cannot predict prices in the future. 2. The only present information that can predict future prices > are insider information that are illegal to use. the illegal use of such info that put the Martha Stewart > in jail. > Perhaps my previous post was a bit cryptic for those unfamiliar > with the two simple truths above. I didn't want to claim I knew exactly what you were talking > about, although I thought I did. That was why I said the idea > of trying to predict future prices of stock is like the re- > invention of wheels that had long been known to be broken ones. Perhaps this follow-up post makes it clearer. There is a VAST > amount of finance and economics literature that dealt with this > subject. 
> I haven't learned anything new about it since I was rubbing
> elbows with those researchers (many of whom have since become
> Nobel winners) over three decades ago. Well, I did learn a
> tiny little bit new on it since. :-) It's all deja vu.
> That's why I was surprised that folks are STILL trying to
> re-invent the same useless wheels. -- Bob.

> People are still trying to come up with a perpetual motion
> machine, too.

And actually came VERY close to having a model that almost SEEMS perpetual, only to be defeated by the immutable laws of physics. :-)

Those trying to re-invent the broken wheel of predicting stocks are not anywhere in the CLASS of perpetual-motion-machine fanciers. They are more in the class of planting trees to produce green Franklin bills, or hoarding geese to look for the golden egg.

> The lure of easy riches is strong, and there
> no doubt some people become rich in the stock market (by
> whatever methods). Enough people claim their methods work
> to short circuit the effort to thoroughly investigate the
> field before jumping in. Russell

A slightly harder myth to debunk is that fund managers on Wall Street know more about the market than a dart-throwing monkey, because SOME funds always outperform the AVERAGE, be it the DOW or the SP500 or whatever, and sometimes by a substantial margin. But the FACT is that, collectively, they consistently UNDER-PERFORM compared to any market average, because of TRANSACTION COSTS and the little games fund managers play to dress up the window EVERY quarter.

Here's a programming exercise anyone can do. Simulate 1,000 dart-throwing monkeys picking stocks for a portfolio, let each portfolio sit for ONE QUARTER without any trade, and look at the gain or loss at the end of the quarter.

Here's MY prediction: compared to the DOW or SP500. the stock funds managed by Wall Street financial analysts.

If anyone is programming this, let us know how well my predictions do. :-) -- Bob.

160 presents a path Diagram of Model C.
When I run the data I get the same results as those shown on pp. 159 and 160, so I feel the path diagram is properly specified. However, when I estimated the output specified at the bottom of p. 169 (All Implied Moments), p. 170 (Factor Scores) and p. 171 (Direct Effects), I did not get the results shown on pp. 170 and 171.

=== Subject: Re: Number clustering (ranges)

Better write a program that
- finds the distance matrix for all the numbers against each other,
- groups those that are closest,
- recalculates the distance matrix,
- iterates the above.

You can prune at the right distance to get the most appropriate number of exclusive clusters, and print each cluster's range.

Hope this helps

=== Subject: Plackett-Burman questions

I'm in the process of designing my first real PB experiment, and I have some questions that I hope someone can help me with:

1) In what sense does internal folded replication (Genstat terminology) represent replication? With N variables there are 2^N possible unique high/low combinations of the factors. An unreplicated design uses just N+1 of these, and an internally folded replicate design uses 2*(N+1). As far as I can see, none of these is literally a replicate of another. Not that I'd expect them to be; I just wondered what expanded sense of the word was implied here.

2) Under what circumstances would it be acceptable to forgo such replication? It seems that if we do so, we cannot make any estimate of the residual unexplained variance. Surely this is an unconditionally bad thing? Indeed, how can we assess the significance of main effects if we can't estimate the residual variance?

3) Suppose two of the factors are only significant via their product. To illustrate: suppose we knew nothing about wheels and wanted to investigate how far a wheeled vehicle would travel in a given time. We could control the diameter of the wheel and its angular velocity, but only their product would be significant: literally, the second order interaction, without either main effect.
Is there theoretical guidance (or educated guesswork :-) to say what the outcome of the experiment would be for two such variables? Would both, or neither, come out as significant main effects? Andy === Subject: Re: Plackett-Burman questions > I'm in the process of designing my first real PB experiment, and I > have some questions that I hope someone can help me with: 1) In what sense does internal folded replication (Genstat > terminology) represent replication? With N variables there are 2^N > possible unique high/low combinations of the factors. An unreplicated > design uses just N+1 of these, and an internally folded replicate > design uses 2*(N+1). as far as I can see, none of these is literally a > replicate of another - not that I'd expect them to be, I just wondered > what expanded sense of the word was implied here. > The N effects in a folded design are unconfounded with interactions since the dot product of any of the N columns and the pairwise product of any other two columns is zero. > 2) Under what circumstances would it be acceptable to forgo such > replication? It seems that if we do so, we can not make any estimate > of the residual unexplained variance. Surely this is an unconditional > bad thing? Indeed, how can we assess significance of main effects if > we can't estimate the residual variance? One way is to estimate the variance from a half normal plot. 3) Suppose two of the factors are only significant via their product. > To illustrate, iIf we knew nothing about wheels, and wanted to > investigate how far a wheeled vehicle would travel in a given time. We > could control the diameter of the wheel and its angular velocity, but > only their product would be significant - literally, the second order > interaction, without either main effect. Is there theoretical guidance > (or educated guesswork :-) to say what the outcome of the experiment > would be for two such variables? Would both, or neither, come out as > significant main effects? 
> Andy

If interactions are of interest then PB designs are not appropriate. If a PB design is used, and if interactions are present, then the coefficients from the analysis will be biased, being in reality estimates of some combination of first and second order effects.

--
Bob Wheeler --- http://www.bobwheeler.com/
ECHIP, Inc. --- Randomness comes in bunches.

=== Subject: MCMC and confidence intervals

I have several data sets (x,y) and I would like to get a confidence interval for y so that I could estimate the accuracy of the y value when the x value is known. The model between y and x is y = f(x;theta), which might be non-linear, with parameters theta = (theta1, theta2, ..., thetan).

1. Is MCMC a suitable method to get a confidence level for y?
2. If there are several models y = f(x;theta) to choose from, can I use rjMCMC (reversible jump MCMC) to evaluate which model would best fit the data?

Thanks to you all!
Pekka

=== Subject: Re: MCMC and confidence intervals

> I have several data sets (x,y) and I would like to get a confidence
> interval for y so that I could estimate the accuracy of the y value
> when the x value is known. The model between y and x is y = f(x;theta),
> which might be non-linear, with parameters theta = (theta1, theta2,
> ..., thetan).
>
> 1. Is MCMC a suitable method to get a confidence level for y?

Yes, although it might not be the best way of doing this: non-linear models can be tricky. Well, non-linear models can be tricky however you're fitting them.

> 2. If there are several models y = f(x;theta) to choose from, can I use
> rjMCMC (reversible jump MCMC) to evaluate which model would best fit
> the data?

Yes you can, but if you don't need a formal measure of the probability that each model is right, then it might be easier to use DIC (http://www.soe.ucsc.edu/~draper/DIC.pdf).

Bob

--
Bob O'Hara
Dept. of Mathematics and Statistics
P.O.
Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland
Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax: +358-9-191 51400
WWW: http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: http://www.jnr-eeb.org

=== Subject: Re: Median Polish Residuals

Glen, [...] be normal was that I thought that if the median polish were successful in producing a good model, then the residuals should reflect only random variation, and therefore have a normal distribution. I gather that I'm wrong about that? If so, what meaning do the residuals have? How do I tell whether the median polish results are reliable? (Or can they always be trusted?)

What I'm trying to do is to determine whether the means for the different factor levels are significantly different. The software I have (Dataplot from NIST) produces F CDF percentages, but I'm not sure if I can just blindly run the analysis and trust the numbers, or if I should be examining things like residuals.

--Paul

=== Subject: Sampling Questions Received on Email

I don't have a background in stats, but have been given a statistical problem to solve. I'm looking for some help, as I've made some progress but am now stuck.

I have been given a population of emails of size approx. 100,000. Each email has an 'intent', i.e. a specific question which can be allocated to a category (in the region of 200 categories). The task is to sample the emails to produce a summary of the types of questions asked the most within the population.

What sample size is required for, say for argument's sake, 95% confidence [...] this is being challenged and I am struggling to back up my case with any real reasoning. Can anyone help explain the maths behind deriving the sample size in this case?

Any help much appreciated.

=== Subject: Re: Sampling Questions Received on Email

> I don't have a background in stats, but have been given a statistical
> problem to solve. I'm looking for some help, as I've made some progress
> but am now stuck.
> I have been given a population of emails of size approx. 100,000.
> Each email has an 'intent', i.e. a specific question which can be
> allocated to a category (in the region of 200 categories).
> The task is to sample the emails to produce a summary of the types of
> questions asked the most within the population.
> What sample size is required for, say for argument's sake, 95% confidence [...]
> this is being challenged and I am struggling to back up my case with
> any real reasoning.

I'm not sure what CI, +/- 4%, you are trying to place. With a sample of 500 and 200 categories, the *average* count per category is 2.5, which is 0.5% of the sample. You could end up with a moderately popular fraction, in the sample, being 6% with a CI of (2%, 10%). That does not look very satisfactory. For one thing, you probably want CIs that are not symmetrical when it comes to small fractions.

Maybe you ought to try 500 and place your CIs, and see how happy you are -- and everyone else is -- with the results. Then you can cut the ranges in half by multiplying the sample size by 4.

> Can anyone help explain the maths behind deriving the sample size in
> this case?

200 categories? I would try using the Poisson assumption for small proportions. Using that --

1) Take the square root of an (expected) count. The root is close to normal, with SD = 0.5.
2) Add one / subtract one to get the CI (approximately 95%, since this is using t or z = 2.0) of the transformed values.
3) Square those to get the CI of the original counts.
4) Transform those into percentages of the total for the assumed sample.

For the most popular category, suppose a sample of 500*K produces 100 responses. The square root is 10; +/- 1 gives (9, 11) for the CI. Squared, the CI for counts is (81, 121). [If there is more than 20% in a category, you should probably use a binomial instead of Poisson, to get tighter limits.]

Now, represented as percentages, the count is 100/(500*K), or 20%/K.

N=500. Where K=1, the percent is 20%, and the CI is (16.2, 24.2).
This is approximately +/- 4%, as requested, in terms of the sample size, though the half-range is one-fifth of the observed count.

N=2000. If the sample is larger, say K=4, then the percent is 5% and the CI is (4.05, 6.05). That is +/- 1% in terms of the sample size, but it is still one-fifth of the count.

- It seems to me that seeing 100 in a category is fairly adequate for that category. That is something for you to judge, though. With 200 categories, you will need a rather *large* sample if you intend to have very many categories that achieve a count of 100.

I would recommend that doing 500 is a useful start for piloting your presentation of results. If 5 categories contain 50% of the responses, you might be happy with an N of 500. But I will not be surprised if *someone* (if you are satisfying a committee) is interested in the tiny categories, so some party urges that you get 10,000 or more.

--
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html

=== Subject: A problem for complete beginners III

An urn contains balls of 5 different colors. Describe the sample space resulting from the extraction of two of them.

Resolution (program 1000)

The sample space is formed by N=15 *points*:

   00002   00011   00020   00101
   00110   00200   01001   01010
   01100   02000   10001   10010
   10100   11000   20000
                                      Fig.I

(Note: the result 01001 means no balls of color 1, one of color 2, none of colors 3 and 4, and one ball of color 5.)

Extension (program not shown)

*K* is the number of balls extracted, *N* the number of points (it grows immensely!)
   K      N
   2     15
   3     50
   4    120
   5    246
   6    456
   7    786
   8   1281
   9   1996
                 Fig.II

   REM "1000"
   CLS
   REM each loop variable counts the balls drawn of one color
   FOR i=0 TO 2 : i$=STR$(i)
   FOR j=0 TO 2 : j$=STR$(j)
   FOR k=0 TO 2 : k$=STR$(k)
   FOR l=0 TO 2 : l$=STR$(l)
   FOR m=0 TO 2 : m$=STR$(m)
   REM keep only the combinations with two balls in total
   IF i+j+k+l+m <>2 THEN GOTO 10
   w$=i$+j$+k$+l$+m$
   REM the original PRINT USING format string was lost; plain PRINT is used
   PRINT w$; "   ";
   10 NEXT m : NEXT l : NEXT k : NEXT j : NEXT i
   END

licas

=== Subject: Re: Population median absolute deviation.

> I have computed the MAD_i (median absolute deviation) of some i=1:N set
> of samples (each of, say, 1000 data points). How do I best calculate the
> overall population equivalent out of it? Do I:
>
> 1. Take the mean of the MADs, so 1/N sum(MAD_i)
> 2. Take the median of the MADs, so median(MAD_i)
> 3. Pool all the samples, and then calculate the MAD over all data?
>
> Doing each gives me slightly different estimates of the population MAD.
>
> 1. 0.045
> 2. 0.016
> 3. 0.035
>
> You see, the median of the MADs is the most strict (nr. 2) and the mean MAD
> (nr. 1) the weakest. Hope you can help. I can't decide which one would
> be best. Ivo

If you assume that each of the N populations has the same population median and the same population MAD, then you should go with (3). If you assume that each of the N populations has the same population median, but possibly different population MADs, then I would probably go with (1).

Jack

=== Subject: Help: How to get mean first passage time for BM with a drift?

I have the following questions:

Question1: If we only know the mean first passage time (FPT) for Brownian motion (BM) leaving a circle, can we get the mean FPT for BM with a drift leaving a circle? How can the mean FPT be obtained?

Question2: If we know the density function of the FPT for BM leaving a circle, then using the Girsanov theorem we can obtain the mean FPT for BM with a drift leaving a circle. Can you point me to a reference book which gives the density function of the FPT?
Question3: How can the mean FPT be obtained for BM with a drift leaving a pie slice (or fanlight) defined by a radius R and an angle S?

Qinglin Zhao

=== Subject: Re: A problem for complete beginners

Ross-c said
> I have the equation E[h(X,a)] = b;
> E stands for expected value
> X is a random variable
> a and b are constants
> h(x,a) is a function of x and ``a'' only; it is continuous wrt x but
> not differentiable; it is a complicated function involving ``min'' and
> ``max''; the only way to evaluate the expectation is through simulation.
> I would like to solve this equation for ``a'' when ``b'' is known.
> I have an approach, but it tends to be slow. Which area of math/stat would
> deal with this type of problem? I did search for polynomials with random
> coefficients, but that is totally different.

This might be what you're doing already. Generate a large number (say N=10,000) of X's. Choose a_0 sufficiently small and a_1 sufficiently large. Using the bisection method on a, calculate the averages of the N h(X_i, a_0)'s and of the N h(X_i, a_1)'s. If they bracket b, then choose the next a_2 as the midpoint of the previous a's. For each iteration, use the same set of X's and move halfway to the right or left depending on how the current average h compares with b.
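The bisection recipe just described might be sketched as follows in Python. This is only an illustration under stated assumptions: the function h below is a hypothetical stand-in (the real h was never posted), X is assumed standard normal, and the names mean_h and solve_for_a are invented here. Reusing one fixed sample of X's makes the sample average a deterministic function of a, which is what lets plain bisection work.

```python
import random

def h(x, a):
    # Hypothetical stand-in for the poster's h: continuous in x,
    # not differentiable, built from min and max.
    return min(max(x, 0.0), a)

def mean_h(a, xs):
    # Sample average of h(X_i, a) over a FIXED set of draws.
    return sum(h(x, a) for x in xs) / len(xs)

def solve_for_a(b, a_lo, a_hi, xs, tol=1e-8, max_iter=200):
    # Bisection on a; assumes mean_h(a, xs) - b changes sign on [a_lo, a_hi].
    f_lo = mean_h(a_lo, xs) - b
    f_hi = mean_h(a_hi, xs) - b
    if f_lo * f_hi > 0:
        raise ValueError("a_lo and a_hi do not bracket b")
    for _ in range(max_iter):
        a_mid = 0.5 * (a_lo + a_hi)
        f_mid = mean_h(a_mid, xs) - b
        if abs(f_mid) < tol or (a_hi - a_lo) < tol:
            return a_mid
        if f_lo * f_mid <= 0:
            a_hi = a_mid                 # root lies in the left half
        else:
            a_lo, f_lo = a_mid, f_mid    # root lies in the right half
    return 0.5 * (a_lo + a_hi)

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(10_000)]  # same X's for every a
b = 0.3
a_hat = solve_for_a(b, 0.0, 5.0, xs)
print(a_hat, mean_h(a_hat, xs))
```

The answer is exact only for this particular sample; repeating the whole procedure with fresh draws gives a feel for the Monte Carlo error in a.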
Jack

=== Subject: Re: a fair die problem...

Hi! (Too hot to work, 40-42 C here, uff!)

Sometimes the Negative Binomial is completely inadequate because the number of trials is fixed. This same die problem could be a simple example. Suppose that we want to find the probability of having ne even and no odd scores. The probability of an even score is pe and of an odd score is po. Therefore n (the total number of trials) must be n = ne + no, because the two outcomes are exhaustive. Two mutually exclusive cases can occur:

   1. The last trial leads to an even score (2, 4, 6):
      p1 = (n-1)choose(ne-1) * pe^(ne-1) * po^no * pe
   2. The last trial leads to an odd score (1, 3, 5):
      p2 = (n-1)choose(no-1) * po^(no-1) * pe^ne * po

(In this particular case, a fair die, pe = po = 1/2.) Therefore the probability of ne even scores and no odd scores is p = p1 + p2.

licas

=== Subject: Re: a fair die problem...

Hi all! Referring to my August 5th post, I (apologizing) correct the figure:

x+r   0   1   2   3   4   5   6   7   8   9  10  11  12
      |   |   |   |   |   |   |   |   |   |   |   |   |
x                                 0   1   2   3   4   5

So sorry

licas

=== Subject: Re: a fair die problem...

What a gang of complicated and confused minds!!!!

The Negative Binomial Distribution deals with successive Bernoulli trials (independent, each one having the probability p of success). Let x (x = 0, 1, 2, 3, ... to infinity) be the number of trials I have to perform, besides r, in order to have exactly r successes. Note that x=0 is the case in which all of the first r trials resulted in successes (the die gave an even score: 2, 4 or 6, no matter which).

The mathematical equation solves the problem (without ANY doubt):

   p(x) = (x+r-1)Choose(r-1) * p^r * (1-p)^x

Why? Simply because there are x+r trials leading to r successes and x failures (no matter the order) and the last trial is surely a success, so the first x+r-1 trials contain the other r-1 successes.

Why complicate?
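As a rough numerical check of that reasoning (r-1 successes somewhere among the first x+r-1 trials, then a final success), here is a small Python sketch. It is not part of the original post; the function names and the choice r=3, p=1/2 are assumptions made for illustration. It compares the closed form with a direct simulation of repeated die throws.

```python
import random
from math import comb

def neg_binom_pmf(x, r, p):
    # P(exactly x failures occur before the r-th success)
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

def simulate_failures(r, p, rng):
    # Run Bernoulli(p) trials until the r-th success; return the failures seen.
    successes = failures = 0
    while successes < r:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures

rng = random.Random(1)
r, p, runs = 3, 0.5, 100_000   # e.g. waiting for 3 even scores on a fair die
counts = {}
for _ in range(runs):
    x = simulate_failures(r, p, rng)
    counts[x] = counts.get(x, 0) + 1

# Columns: x, formula, simulated relative frequency
for x in range(6):
    print(x, round(neg_binom_pmf(x, r, p), 4), round(counts.get(x, 0) / runs, 4))
```

The two columns should agree to within simulation noise, and the formula values should sum to 1 over all x.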
licas

=== Subject: A problem for complete beginners II

Probability depends upon the *information* we get.

Let's think about a simple problem. It is known that a bag contains 2 red balls and 2 black ones. Arthur draws one ball (it is black) and does not show it to Barnard. He does not return it to the bag. The 2nd draw gave red, and the 3rd black (both could see these). They were not returned to the bag either.

********

The probabilities of the remaining ball's colour are evaluated differently. They are:

   Arthur:  p(red) = 1 (he may be sure), p(black) = 0
   Barnard: p(red) = 1/2, p(black) = 1/2

Comments, please

licas

=== Subject: Re: A problem for complete beginners II

Reasoning (no replies, so here is my justification...):

Barnard thinks that if the 1st draw was red (which has p=2/4) the 4th MUST be black, but if it was black (p=2/4 also) the 4th must be red.

licas

=== Subject: A problem for complete beginners IV

Considering all the letters (26) of the alphabet, it is an enormous task. But you can understand the process with only 3 of them.

Let i, j, k be the number of times we get the letters A, B and C when we draw these letters at random. Hence i+j+k=3 and

   p(i,j,k) = (3! / (i! j! k!)) * (1/3)^i * (1/3)^j * (1/3)^k
            = (3! / (i! j! k!)) * (1/3)^3

Let the letters be A, B, C.

All of them occur (ABC, ACB, BAC, BCA, CAB, CBA):
   p = (3! / (1!*1!*1!)) * (1/3)^3 = 6/27

Only two occur (i.e. one of them twice):
   p(AAB) = (3! / (2!*1!*0!)) * (1/3)^3 = 3/27
   p(ABB) = (3! / (1!*2!*0!)) * (1/3)^3 = 3/27
   p(BBC) = (3! / (0!*2!*1!)) * (1/3)^3 = 3/27
   p(BCC) = (3! / (0!*1!*2!)) * (1/3)^3 = 3/27
   p(CCA) = (3! / (1!*0!*2!)) * (1/3)^3 = 3/27
   p(AAC) = (3! / (2!*0!*1!)) * (1/3)^3 = 3/27

Only one letter occurs:
   p(AAA) = (3! / (3!*0!*0!)) * (1/3)^3 = 1/27
   p(BBB) = (3! / (0!*3!*0!)) * (1/3)^3 = 1/27
   p(CCC) = (3! / (0!*0!*3!)) * (1/3)^3 = 1/27

Pay attention, please: this law (multinomial) deals (exclusively) with the NUMBER of times that each letter occurs, the order being IMMATERIAL.
Therefore the outcome with one A and two B's (for instance first A, then B, then B, or the same letters in any other order) has probability 3/27.

****

When we use the full alphabet and 6 draws we have i+j+k+l+m+n=6 and

   p(i,j,k,l,m,n) = (6! / (i!*j!*k!*l!*m!*n!)) * (1/26)^6
                  = (6! / (i!*j!*k!*l!*m!*n!)) * 0.000000003237128297...

licas

=== Subject: Coins, dice, cards and random numbers

Starting from the beginning in simulation: let x be the outcome.

Warning: the examples below are not exclusive. In fact there are many more ways to achieve the different outcomes, by using other x variables.

***** A. Flip (or throw) coins

   x = 1 - INT(rnd * 2)

   rnd      rnd*2     x
   0.45     0.90      1     head
   0.55     1.10      0     tail

   (if rnd < 0.5 then x=1; if rnd >= 0.5 then x=0)

***** B. Throw dice: 1, 2, 3, 4, 5, 6.

   x = INT(rnd * 6) + 1

***** C. Cards

   suit  (1, 2, 3, 4 = diamonds, spades, hearts, clubs)
   value (1 to 13 = A, K, Q, J, 10, 9, 8, 7, 6, 5, 4, 3, 2)

   suit  = INT(rnd * 4) + 1
   value = INT(rnd * 13) + 1

licas_@hotmail.com

=== Subject: Re: Coins, dice, cards and random numbers

How can I simulate a *hand* of k cards (randomly) from a pack?

The regular pack has 4 suits of 13 cards each. The *hand* is obtained by drawing the cards one by one WITHOUT REPLACEMENT.

The simplest way is to use a two-dimensional array W(x,y), where x is the value and y the suit.

   x = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
   respectively if the card is
       A, K, Q, J, 10, 9, 8, 7, 6, 5, 4, 3, 2
   y = 1, 2, 3, 4
   for Diamonds, Spades, Hearts, Clubs

We start with all W(x,y) = 1 (all cards ready to be chosen). Every time a card is drawn we change the respective *card's presence function* to W(x', y') = 0 (absent). After obtaining a new pair x'', y'' we take care to check whether W(x'', y'') = 0. If so, we reject it. On the contrary, if W(x'', y'') = 1 we include this card in the hand and set W(x'', y'') = 0.
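The presence-function idea could be sketched like this in Python. This is an illustrative translation, not the poster's program: the function name deal_hand and the seed are assumptions, and the list called present plays the role of W(x, y).

```python
import random

def deal_hand(k, rng):
    # Draw k distinct cards by rejection, using a 13 x 4 presence table.
    if not 0 <= k <= 52:
        raise ValueError("k must be between 0 and 52")
    present = [[1] * 4 for _ in range(13)]   # W(x, y) = 1: card still in the pack
    hand = []
    while len(hand) < k:
        x = int(rng.random() * 13) + 1       # value, 1..13
        y = int(rng.random() * 4) + 1        # suit, 1..4
        if present[x - 1][y - 1] == 0:
            continue                         # already drawn: reject and retry
        present[x - 1][y - 1] = 0            # mark the card as absent
        hand.append((x, y))
    return hand

hand = deal_hand(5, random.Random(42))
print(hand)
```

Rejection gets slower as the pack empties, since most proposals are then already-drawn cards. For large hands, shuffling a list of all 52 (value, suit) pairs, or using random.sample on it, avoids the retries.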
licas

=== Subject: addition to degrees of freedom post

PS It is important to note, if I wasn't clear, that in my experiment I averaged the squares by dividing by n (sample size), not N or n-1.

=== Subject: wrong assumption about %RSD?

I'm employed in the pharmaceutical business, where the %RSD (as you know: standard deviation / mean * 100) is in common use. As I am an analyst, I also look at forums which cover my kind of laboratory work. Now there is a topic which caught my interest. It is about %RSD.

The question: is it possible to calculate the %RSD on 2 data points? And is it useful?

My point of view (which was also my answer): As the standard deviation (s) gives an idea of the spread, you can also say that 68% of the measured values are between the mean ± s. If the standard deviation is calculated on only 2 points, the above 'definition' is not of use. So I don't think it's useful to calculate %RSD on only 2 data points, but it is possible to calculate.

There are some other people who discuss this and found it useful. What do you statistics/mathematics people think about this subject?

Bart

=== Subject: Re: wrong assumption about %RSD?

> I'm employed in the pharmaceutical business, where the %RSD (as you know:
> standard deviation / mean * 100) is in common use.
> As I am an analyst, I also look at forums which cover my kind of
> laboratory work. Now there is a topic which caught my interest.
> It is about %RSD.
> The question: is it possible to calculate the %RSD on 2 data points?
> And is it useful?
> My point of view (which was also my answer):
> As the standard deviation (s) gives an idea of the spread, you can also
> say that 68% of the measured values are between the mean ± s.
> If the standard deviation is calculated on only 2 points, the above
> 'definition' is not of use. So I don't think it's useful to calculate
> %RSD on only 2 data points, but it is possible to calculate.
> There are some other people who discuss this and found it useful.
> What do you statistics/mathematics people think about this subject?
> Bart

In other industries, %RSD is also known as the %CV (coefficient of variation). The CV is defined when N is at least 2. When you plug in the sample standard deviation and the sample mean, what you end up with is a sample CV, which is a point estimate of the population CV. These estimates tend to have more sampling variability when the sample sizes are small.

Let N be the sample size, xbar be the sample mean, CV_hat be the sample CV, and CVo be the population CV. Assuming normality, if the probability of xbar being negative is very, very small (i.e., CVo is small and N is sufficiently large), then CV_hat is approximately distributed as inversely proportional to a noncentral t:

   CV_hat ~ sqrt(N) / t'(N-1, delta)

where the noncentrality parameter delta is

   delta = sqrt(N) / CVo.

There are analytic approximations to the noncentral t, which will give you an idea of the sampling variability of a CV estimate for different sample sizes and population CVs. You can then multiply everything by 100 to get it in terms of %RSD.

Jack

=== Subject: Re: Q: One-sided Fisher's exact (terminology)

[snip, comment on OP]

> The sum is taken to one side of the *observed*, or to its
> other side. Typically, it is taken to the more-extreme side,
> because that is what is computationally convenient, and that
> side is usually what was interesting.

EAR
> If both the upper tail and the lower tail are of interest, a two-sided
> test should be used, not a one-sided test. The above essentially
> corresponds to running two tests and presenting only the lower P-value
> of the two, which is not a valid approach.

Absolutely. I should have mentioned that.

The next point, about what should be presented to whom, in discussing one-tailed results, gets a little tricky. Different data, in different circumstances, call for different treatments.
I get concerned when it seems as if something is being hidden that *any* of the audience would find interesting, whether the Investigators do or not. And I don't think that two-tailed testing is *always* the solution. For the FET, there is the special problem where the audience might have unreasonable expectations.

RU
> You don't want to confuse your audience. It *is* a one-sided
> test, but you need to describe or mention the *other* one-side
> of the test, somewhere in the discussion, to be sure that the
> audience stays on-board and knows what you are doing.

EAR
> Since a one-sided test should only be used when the question posed is
> one-sided only, this choice should normally be made clear in the
> methods section, and that this choice was made a priori. In principle,
> the choice means that one has not even tested the other side. If the
> other side needs to be discussed, rather than just discarded a priori,
> it probably means that a one-sided test was not an appropriate choice.

Discard, a priori, no matter how large? I think of the opposite outcome to a one-tailed test as -- potentially -- being an important, exploratory result, despite whatever was set up as the initial hypothesis. Doing a one-tailed test does not give you permission to hide information.

For questions like this, I have always liked the *idea* of dividing an alpha-error into unequal sides. That is: reject if (say) p is less than 0.045 for A > B, and reject if p is less than 0.005 for B > A. There's nothing wrong with it in theory. It seems to have problems in application, or acceptability, the first problem being that I've never found any clinician who liked it. But this seems to fit better with decision theory, or with a highly Bayesian approach (from the little I know).

*If* it is going to matter to someone, you can't ignore the unexpected. It is not a result of your one-tailed main test of hypothesis, perhaps, but you report it as fact.
In *practice*, clinical trials actually do get halted because of unexpected outcomes. And here is a counter-example from a couple of years ago -- researchers with a human gene-modification treatment *failed* to mention to their grant-funding agency that 2 of their (small number of) patients had unexpectedly died. Well, dying was not a listed outcome from the treatment, and not one of the expected side-effects; and they didn't *know* the treatment was responsible.... IIRC, their university was threatened with total withdrawal of federal research funds, if better oversight was not enforced locally.

RU
> I suppose that this potential for confusion is one reason that
> some people do not like one-sided tests.

EAR
> The main reason is that in most cases the question posed is not truly
> one-sided. In fact, as far as I can recall, many medical journals are
> very explicit about this: that one-sided tests should only be used
> when only effects in one direction are of interest.

- I've said approximately the same thing before. But today, maybe I would make that ... when only 'reasonably foreseen' effects in one direction are of interest.

EAR
> When only effects in one direction are of interest, there is fairly
> little room for confusion about which side is being tested.

What do you recommend when your feelings are highly asymmetrical, as with my 0.045 + 0.005 example, above?

Some journals *used* to be very explicit about not allowing any one-tailed tests at all, unless there was the most extreme of justifications. If you make it a black-and-white choice, it can come to that position, so I understand that position. I just don't think it is a particularly great position.

A few days ago, I read a comment by a German statistician, which was posted to the Exact-stats mailing list. He echoed another poster to Exact-stats, about how the *interest* justifies the one-sided tests, among other advantages.
And he said, "One-sided tests with alpha greater or equal 0.025 seem to be generally acceptable to regulatory authorities in Europe and are thus becoming the de facto standard."
- I had not heard of that, and I will be interested in hearing if that is the common perception.

-- Rich Ulrich, wpilib@pitt.edu http://www.pitt.edu/~wpilib/index.html

=== Subject: finite difference method for space of large dimensions

I would like to use the finite difference method in a space of large dimension (say d >= 3). I have some questions:
1/ Is ADI the only way to work in large dimensions?
2/ Which are the best references for ADI?

XS

=== Subject: finite difference in complex geometry

Hi all.. I'm studying an evolution problem (some coupled nonlinear heat equations). For now I solve them in a simple geometry (a square). I was wondering how to extend a finite difference formulation to a triangle-based mesh in order to use more complex geometries. How can I build up the x and y derivatives on a triangle-based mesh?

Thank you for any suggestions, ap

-- Law of City Rents: Those who cannot afford to pay the rent are renters. Those who can afford to pay the rent are owners.

=== Subject: Re: finite difference in complex geometry

> I'm studying an evolution problem (some coupled nonlinear heat equations).
> For now I solve them in a simple geometry (a square). I was wondering how to
> extend a finite difference formulation to a triangle-based mesh in order to
> use more complex geometries. How can I build up the x and y derivatives on
> a triangle-based mesh?

Gilbert Strang and George Fix, An Analysis of the Finite Element Method. Author searches at amazon.com and www.barnesandnoble.com work just fine.

-- pa at panix dot com

=== Subject: Re: finite difference in complex geometry

>> I'm studying an evolution problem (some coupled nonlinear heat
>> equations).
>> For now I solve them in a simple geometry (a square).
>> I was wondering how to
>> extend a finite difference formulation to a triangle-based mesh in order to
>> use more complex geometries. How can I build up the x and y derivatives
>> on a triangle-based mesh?
> Gilbert Strang and George Fix, An Analysis of the Finite Element Method.
> Author searches at amazon.com and www.barnesandnoble.com work just fine.

Thank you! But the book was published in 1973; aren't there some newer references for my problem? Thank you

-- Andrea: Unhappy the land that has no heroes. Galileo: No, unhappy the land that needs heroes. -- Bertolt Brecht, Life of Galileo

=== Subject: Re: finite difference in complex geometry

>>> I'm studying an evolution problem (some coupled nonlinear heat
>>> equations).
>>> For now I solve them in a simple geometry (a square). I was wondering how to
>>> extend a finite difference formulation to a triangle-based mesh in order to
>>> use more complex geometries. How can I build up the x and y derivatives
>>> on a triangle-based mesh?
>> Gilbert Strang and George Fix, An Analysis of the Finite Element Method.
>> Author searches at amazon.com and www.barnesandnoble.com work just fine.
> Thank you! But the book was published in 1973; aren't there some newer
> references for my problem?

Note that Pierre is suggesting finite ELEMENT methods - not finite DIFFERENCE methods. This is a different method for solving your model. This is good advice - for complex geometries the finite element method is much more flexible, and far easier to implement, but you will need to learn a bit of new maths to use it. Using finite DIFFERENCES on a triangular mesh is a more advanced and tricky topic, harder to program, and far less common.

Pierre has recommended one of the classic texts on finite element methods - it's a good starting place even if it's old, and I used it to learn the method. There's a reason why it was reprinted recently - it's a good book!
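On the original question of how to get x and y derivatives on a triangle: in the simplest finite element setting (piecewise-linear, often called P1, elements) the unknown is interpolated linearly over each triangle, so its gradient is constant there and follows from the three vertex values. A minimal sketch in plain Python, with a hypothetical function name and made-up vertex data, and not a full discretisation of the heat equations:

```python
def p1_gradient(verts, values):
    """Constant gradient (du/dx, du/dy) of the linear interpolant on one triangle.

    verts:  three (x, y) vertex coordinates.
    values: the value of u at those three vertices, in the same order.
    """
    (x1, y1), (x2, y2), (x3, y3) = verts
    u1, u2, u3 = values
    # Determinant of the edge vectors = twice the signed area of the triangle.
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if det == 0:
        raise ValueError("degenerate triangle (zero area)")
    # Solve the 2x2 system for the slopes of u = a + b*x + c*y (Cramer's rule).
    dudx = ((u2 - u1) * (y3 - y1) - (u3 - u1) * (y2 - y1)) / det
    dudy = ((x2 - x1) * (u3 - u1) - (x3 - x1) * (u2 - u1)) / det
    return dudx, dudy

# Check on a made-up linear field u = 2 + 3x - 5y sampled at the vertices.
verts = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
values = [2 + 3 * x - 5 * y for x, y in verts]
print(p1_gradient(verts, values))
```

For a field that is exactly linear, as in the check above, the recovered slopes are the exact ones; for a general field they are only an approximation within each triangle.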
The discretisation you'll need is certainly old, although it is fair to say the methods for solving the resulting sparse algebraic equations have evolved a good deal since the 1970s, so a more modern book may help here. However, you would be best off using a library code to solve the equations rather than developing code for this part of the problem yourself. Do a Google search for sparse linear algebra and look at www.netlib.org and www.netlib.org/utk/people/JackDongarra/la-sw.html for information on good free codes.

Any undergraduate text on finite element methods will provide the information you need. I suggest you pick up several books from a library and browse through them until you find one you like.

Hope that helps, Andy

=== Subject: 2nd CFP Conf. Reliable Software Technologies, Ada-Europe 2006
Summary: Now is the time to prepare your submissions!
Keywords: Conference,tutorials,reliable software,Ada,LNCS,Portugal
Cc: dirk

-----------------------------------------------------------------------

2nd CALL FOR PAPERS

11th International Conference on Reliable Software Technologies - Ada-Europe 2006

5 - 9 June 2006, Porto, Portugal

http://www.ada-europe.org/conference2006.html

Organised, on behalf of Ada-Europe, by Instituto Superior de Engenharia do Porto, in cooperation with ACM SIGAda (approval pending)

*** CfP in HTML/PDF on web site ***

-----------------------------------------------------------------------

Ada-Europe has organized annual international conferences since the early 80's. This is the 11th event in the Reliable Software Technologies series, previous ones being held at Montreux, Switzerland ('96), London, UK ('97), Uppsala, Sweden ('98), Santander, Spain ('99), Potsdam, Germany ('00), Leuven, Belgium ('01), Vienna, Austria ('02), Toulouse, France ('03), Palma de Mallorca, Spain ('04), York, UK ('05).

General Information
-------------------

The 11th International Conference on Reliable Software Technologies (Ada-Europe 2006) will take place in Porto, Portugal. Following the usual style, the conference will span a full week, including a three-day technical program and vendor exhibitions from Tuesday to Thursday, along with parallel workshops and tutorials on Monday and Friday.

Schedule
--------

30 October 2005:   Submission of papers, workshop/tutorial proposals
20 January 2006:   Notification to authors
20 February 2006:  Camera-ready papers required
5-9 June 2006:     Conference

Topics
------

In the last decade the conference has established itself as an international forum for providers and practitioners of, and researchers into, reliable software technologies. The conference presentations will illustrate current work in the theory and practice of the design, development and maintenance of long-lived, high-quality software systems for a variety of application domains. The program will allow ample time for keynotes, Q&A sessions, panel discussions and social events. Participants will include practitioners and researchers from industry, academia and government organizations interested in furthering the development of reliable software technologies. To mark the completion of the technical work for the Ada language standard revision process, contributions that present and discuss the potential of the revised language are particularly sought after.

For papers, tutorials, and workshop proposals, the topics of interest include, but are not limited to:

- Methods and Techniques for Software Development and Maintenance: Requirements Engineering, Object-Oriented Technologies, Formal Methods, Re-engineering and Reverse Engineering, Reuse, Software Management Issues
- Software Architectures: Patterns for Software Design and Composition, Frameworks, Architecture-Centered Development, Component and Class Libraries, Component-Based Design
- Enabling Technology: CASE Tools, Software Development Environments and Project Browsers, Compilers, Debuggers and Run-time Systems
- Software Quality: Quality Management and Assurance, Risk Analysis, Program Analysis, Verification, Validation, Testing of Software Systems
- Critical Systems: Real-Time, Distribution, Fault Tolerance, Information Technology, Safety, Security
- Mainstream and Emerging Applications: Multimedia and Communications, Manufacturing, Robotics, Avionics, Space, Health Care, Transportation
- Ada Language and Technology: Programming Techniques, Object-Oriented Programming, Concurrent Programming, Distributed Programming, Bindings and Libraries, Evaluation & Comparative Assessments, Critical Review of Language Enhancements, Novel Support Technology, HW/SW platforms
- Experience Reports: Experience Reports, Case Studies and Comparative Assessments, Management Approaches, Qualitative and Quantitative Metrics, Experience Reports on Education and Training Activities with bearing on any of the conference topics

Submissions
-----------

Authors are invited to submit original contributions. Paper submissions shall be in English, should be complete and should not exceed 20 double-spaced pages in length. Authors should submit their work via the Web submission system accessible from the conference Home page. The preferred format for submission is PDF.
PostScript can also be accepted, as long as it was generated with the "optimize for portability" option selected in the printer driver. Submissions by other means and formats will *not* be accepted. If you do not have easy access to the Internet, or you do not have an appropriate Web browser, please contact the Program Co-Chair Luís Miguel Pinho, whose address details are on this call as well as on the conference Home page.

Proceedings
-----------

The authors of accepted papers shall prepare their camera-ready submissions in full conformance with the LNCS style, not exceeding 12 pages and strictly by *February 20, 2006*. Authors should refer to: http://www.springer.de/comp/lncs/authors.html for format and style guidelines. Failure to comply will prevent the paper from appearing in the conference proceedings. The conference proceedings including all accepted papers will be published in the Lecture Notes in Computer Science (LNCS) series by Springer Verlag, which will be available at the start of the conference.

Awards
------

Ada-Europe will offer honorary awards for the best paper and the best presentation, which will be presented during the banquet and at the close of the conference respectively.

Call for Tutorials
------------------

Tutorials should address subjects that fall within the thrust of the conference and may be proposed as either half- or full-day events. Proposals should include a title, an abstract, a description of the topic, a detailed outline of the presentation, a description of the presenter's lecturing expertise in general and with the proposed topic in particular, the proposed duration (half day or full day), the intended level of the tutorial (introductory, intermediate, or advanced), the recommended audience experience and background, and a statement of the reasons for attending. Proposals should be submitted by e-mail to the Tutorial Chair Jorge Real.
The providers of full-day tutorials will receive a complimentary conference registration as well as a fee for every paying participant in excess of 5; for half-day tutorials, these benefits will accordingly be halved. The Ada User Journal will offer space for the publication of summaries of the accepted tutorials in issues preceding and/or following the conference.

Call for Workshops
------------------

Workshops on themes within the conference scope may be arranged to discuss matters of immediate technical interest as well as to foster action on longer-term technical objectives. Proposals may be submitted for half- or full-day workshops, to be scheduled at either end of the main conference. Workshop proposals should be submitted by e-mail to the Conference Chair Luís Miguel Pinho. The workshop organizer shall also commit to preparing proceedings for timely publication in the Ada User Journal.

Exhibition
----------

Commercial exhibitions will span the three days of the main conference. Vendors and providers of software products and services should contact the Exhibition Chair José Ruiz as soon as possible for further information and for allowing suitable planning of the exhibition space and time.

Reduced Fees for Students
-------------------------

A small number of grants are available for students who will (co-)author and present papers at the conference. A reduction of 25% will be made to the conference fee. Contact the Conference Chair Luís Miguel Pinho for details.

Organizing Committee
--------------------

Conference Chair
  Luís Miguel Pinho, Polytechnic Institute of Porto, Portugal
  lpinho@dei.isep.ipp.pt
Program Co-Chairs
  Luís Miguel Pinho, Polytechnic Institute of Porto, Portugal
  lpinho@dei.isep.ipp.pt
  Michael González Harbour, Universidad de Cantabria, Spain
  mgh@unican.es
Tutorial Chair
  Jorge Real, U. P. Valencia, Spain
  jorge@disca.upv.es
Exhibition Chair
  José Ruiz, AdaCore, France
  ruiz@adacore.com
Publicity Chair
  Dirk Craeynest, Aubay Belgium & K.U.Leuven, Belgium
  Dirk.Craeynest@cs.kuleuven.be
Local Chair
  Sandra Almeida, Polytechnic Institute of Porto, Portugal
  salmeida@dei.isep.ipp.pt
Ada-Europe Conference Liaison
  Laurent Pautet, Telecom Paris, France
  pautet@enst.fr

Program Committee (preliminary)
-------------------------------

Alonso Alejandro, Universidad Politécnica de Madrid, Spain
Asplund Lars, Mälardalens Högskola, Sweden
Barnes Janet, Praxis High Integrity Systems, UK
Bernat Guillem, University of York, UK
Blieberger Johann, Technische Universität Wien, Austria
Brosgol Ben, AdaCore, USA
Burgstaller Bernd, University of Sydney, Australia
Burns Alan, University of York, UK
Craeynest Dirk, Aubay Belgium & K.U.Leuven, Belgium
Crespo Alfons, Universidad Politécnica de Valencia, Spain
Devillers Raymond, Université Libre de Bruxelles, Belgium
González Harbour Michael, Universidad de Cantabria, Spain
Gutiérrez José Javier, Universidad de Cantabria, Spain
Hately Andrew, Eurocontrol CRDS, Hungary
Hommel Günter, Technische Universität Berlin, Germany
Keller Hubert, Institut für Angewandte Informatik, Germany
Kermarrec Yvon, ENST Bretagne, France
Kienzle Jörg, McGill University, Canada
Kordon Fabrice, Université Pierre & Marie Curie, France
LLamosi Albert, Universitat de les Illes Balears, Spain
Mazzanti Franco, ISTI-CNR Pisa, Italy
McCormick John, University of Northern Iowa, USA
Michell Stephen, Maurya Software, Canada
Miranda Javier, Universidad Las Palmas de Gran Canaria, Spain
Pautet Laurent, Telecom Paris, France
Pinho Luís Miguel, Polytechnic Institute of Porto, Portugal
Plödereder Erhard, Universität Stuttgart, Germany
de la Puente Juan A., Universidad Politécnica de Madrid, Spain
Real Jorge, Universidad Politécnica de Valencia, Spain
Romanovsky Alexander, University of Newcastle upon Tyne, UK
Rosen Jean-Pierre, Adalog, France
Ruiz José, AdaCore, France
Schonberg Edmond, New York University & AdaCore, USA
Tokar Joyce, Pyrrhus Software, USA
Vardanega Tullio, Università di Padova, Italy
Wellings Andy, University of York, UK
Winkler Jürgen, Friedrich-Schiller-Universität, Germany

-----------------------------------------------------------------------

Our apologies if you receive multiple copies of this announcement. Please circulate widely.

Dirk.Craeynest@cs.kuleuven.be, Ada-Europe'2006 Publicity Chair
*** 11th Intl.Conf.on Reliable Software Technologies - Ada-Europe'2006 ***
June 5-9, 2006 ** Porto, Portugal ** http://www.ada-europe.org *** (V2.7)