vpFREE2 Forums

DiaD cost/Stats Class revisited

In a message dated 8/30/05 3:47:38 AM US Mountain Standard Time,
vpFREE@yahoogroups.com writes:

···

From: "cdfsrule" <groups.yahoo@verizon.net>
Subject: Re: Is Diamond worth it?

See comments below.

--- In vpFREE@yahoogroups.com, BOBBJ@a... wrote:
> There is a difference between mean loss and median loss in a game
> that doesn't have a normal distribution.

Right on: For a normal distribution, the mean (simple arithmetic
average) = median (50% probability point) = mode (most likely
outcome).

> The most likely loss is the median value of about $750.

Not right: by definition, the most likely outcome is the MODE. For
VP, unless you play forever (really forever), the median and the
mode are both less than the mean (or EV), even if the game is
positive.

That means we are, a priori, always more likely to win less (or lose
more) than the EV. Oh well.
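
The ordering can be checked numerically. What follows is only a minimal Python sketch: it assumes rounded, commonly quoted hand frequencies for 9/6 JoB at optimal play (an assumption, not figures from this thread), measures results in units of one bet, and drops outcomes whose probability falls below a small threshold to keep the run short. The names `PAYS`, `session_pmf`, and `summarize` are made up for illustration.

```python
# Approximate 9/6 Jacks or Better hit frequencies (illustrative assumption).
# Keys are payouts per unit bet; the "nothing" probability is the remainder.
PAYS = {800: 0.00002476, 50: 0.00010931, 25: 0.00236255, 9: 0.01151221,
        6: 0.01101451, 4: 0.01122937, 3: 0.07444870, 2: 0.12927890,
        1: 0.21458503}
PAYS[0] = 1.0 - sum(PAYS.values())


def session_pmf(n_hands, prune=1e-12):
    """Distribution of the net result (in bets) after n_hands independent hands."""
    pmf = {0: 1.0}
    for _ in range(n_hands):
        nxt = {}
        for total, p in pmf.items():
            for pay, q in PAYS.items():
                key = total + pay - 1          # collect `pay`, minus the 1-unit bet
                nxt[key] = nxt.get(key, 0.0) + p * q
        # Approximation: discard negligible outcomes so the table stays small.
        pmf = {k: v for k, v in nxt.items() if v >= prune}
    return pmf


def summarize(pmf):
    """Return (mean, median, mode) of a pruned distribution, renormalized."""
    mass = sum(pmf.values())
    mean = sum(k * v for k, v in pmf.items()) / mass
    mode = max(pmf, key=pmf.get)
    running = 0.0
    median = None
    for k in sorted(pmf):
        running += pmf[k]
        if running >= 0.5 * mass:
            median = k
            break
    return mean, median, mode


mean, median, mode = summarize(session_pmf(100))
print(f"100 hands: mean={mean:.2f}  median={median}  mode={mode}")
```

Changing the argument of `session_pmf` shows how the gap between the three behaves as the number of hands grows; the pruning threshold trades a little accuracy in the far tail for speed.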

I guess I should hurry up and compute the PDFs for JoB. I imagine
the pictures will tell a nice story.

Any preferences for the # of hands? (I am only going to do SL now)

*****

The original post about median did not quite sound right. It got me thinking
again about mean, median and mode. Thanks for confirming what I thought:
mode is the answer.

You have to revisit set theory too, to grasp the definition of the three
"M"s. A set is just defined as all the possible outcomes of an event.

Mean- is the common math average.
Median- is the middle member of the set of possible values.
Mode- is the member that occurs the most...the highest column on a bar graph.

A Normal Distribution-all "M"s have the same value.

A Mean does not have to belong to the set of values, but Median and Mode do,
right?

So if you play a small finite number of trials (hands) the left side of the
curve will be fairly smooth and even a lot of the right side too. But there
will be spikes on the far right side, where the royals hang out. The graph of
results is lumpy.

Playing 800 independent hands (bet units) on $5 Jacks 9/6 for $20k gets a
real lumpy curve.

Playing 4,000 independent hands (bet units) on $1 Jacks 9/6 for $20k coin-in
has been a possible play, and starts the smoothing.

But 80,000 independent hands (bet units) on nickel Jacks 9/6 for $20k
coin-in...really smoothing the curve, but starts running into other problems.

The set of possible outcomes increases with more hands played, which smooths
out the curve, and the three "M"s start to converge.

***
But Harrah's tosses in another constraint: 24 hrs. More important is the
player's constraint of time allotted/fatigue.

This is why a multi-play game helps. The results of each hand are now
dependent upon the initial deal, but you get a lot more different possible
results...a trade-off. Not as good as independent ones, but it helps get the
three "M"s closer together.

This has been a qualitative study, our math majors will now (or have) make it
quantitative.

BS

A little bit more than my $0.02:
There's been a lot of talk about comparing games, variance, RoR,
Kelly bankroll, cost of DiaD, and the like. Since we all have our
own risk-tolerances and each of us values money differently, there
isn't a one-size-fits-all answer to these "subjective" questions.
That said, there are indeed mathematically correct (rigorous,
accurate) methods and mathematically incorrect methods to go about
addressing the problems. Likewise, using approximations can be a
valid and helpful technique, so long as one understands the
assumptions made in using the approximations.

The answers to many of the questions I've read recently can be found
using the probability density function or PDF. So long as one
actually knows the PDF, that is! Luckily, it is a relatively
straightforward process to compute the PDFs for single-line VP (though it
can be a slow process, and to speed things up a tiny approximation or
two is made), so we need not speculate. [There are also some methods
to quickly compute an approximate PDF for multiline VP if the
covariances are known already.]

My rant aside, there are plenty of times when we might want to
compare two variances (or other metrics) and that is fine. Go ahead
and do so. The challenge with statistics is not computing them per
se (especially given today's computers), but rather understanding
what they (the statistics) mean. But coming up with meaningful
statistics or metrics is no mean feat (pun intended).

So, I'll leave to the group as a whole to come up with the important
metrics, and I'll try to help compute them when I get a chance.

Other comments below:

> A Mean does not have to belong to the set of values, but Median and
> Mode do, right?

Using "sets" needlessly complicates the issue. You are 2/3 right
(the mean doesn't have to belong to the set either!) but the problem
isn't really with the statistics, but with the PDF, which doesn't
lend itself to rigorous definition or "meaning" if we are dealing
with actual data (so-called student populations). If we are dealing
with the so-called parent population, things are easier, but there is
still no need to worry much about sets. Just use the "set" of all real
numbers if needed. If one must deal with student data, the CDF is a
much better choice, but most people don't learn about them in school.

> So if you play a small finite number of trials (hands) the left
> side of the curve will be fairly smooth and even a lot of the right
> side too. But there will be spikes on the far right side, where the
> royals hang out. The graph of results is lumpy.

(1) I already submitted a plot of PDFs for Pickem. I will do the same
for JoB. Critically speaking, the PDFs are never smooth, since some
values (like 0.5 coin) are impossible. But that technicality aside,
you are correct. For 1 hand the PDF has non-zero values at
precisely 1 place for each unique entry in the pay table. Hence it
must be "spiky" in nature (or stick-like).

(2) The PDFs are always "lumpy" regardless of the number of hands
that are played. Really. To understand this, consider the first
hand. There is a tiny lump for the first royal. Now consider the
next hand. There is an even tinier lump for the possibility of
getting two RFs in a row. It could happen. Really. Hence, for
every hand, the PDF gets another lump, located at (for JoB) 4000*(the
hand number). So no matter what, the PDF always has a very, very
long tail to the right (positive side). The part to the left is
always shorter; it stops at (-coins bet)*(number of hands). In other
words, VP PDFs have very long positive tails. But don't take my word
for it, compute some for yourself. (BTW, I'd prefer a word
other than lumpy. Any ideas?)
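
For anyone who does want to compute some, the first few hands can be enumerated exactly by brute force. This is a small Python sketch, assuming a 5-coin bet, a 4000-coin royal, and rounded, commonly quoted 9/6 JoB hit frequencies (the frequencies are an assumption, not numbers from this thread); `COINS` and `exact_pmf` are hypothetical names. It prints, for each hand count, the worst and best possible net results and the probability of the far-right spike.

```python
from itertools import product
from math import prod

BET = 5  # coins wagered per hand

# Coins returned -> probability (approximate 9/6 JoB; illustrative assumption).
COINS = {4000: 0.00002476, 250: 0.00010931, 125: 0.00236255, 45: 0.01151221,
         30: 0.01101451, 20: 0.01122937, 15: 0.07444870, 10: 0.12927890,
         5: 0.21458503}
COINS[0] = 1.0 - sum(COINS.values())


def exact_pmf(n_hands):
    """Exact net-result distribution for a few hands, by listing every sequence."""
    pmf = {}
    for combo in product(COINS.items(), repeat=n_hands):
        net = sum(coins for coins, _ in combo) - BET * n_hands
        pmf[net] = pmf.get(net, 0.0) + prod(p for _, p in combo)
    return pmf


for n in (1, 2, 3):
    pmf = exact_pmf(n)
    best = max(pmf)
    print(f"{n} hand(s): worst={min(pmf)}  best={best}  P(best)={pmf[best]:.3e}")
```

Enumeration like this is only practical for a handful of hands, since the number of sequences grows as 10^n; it is meant to show where the spikes sit, not to replace a proper PDF calculation.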

> Playing 800 independent hands (bet units) on $5 Jacks 9/6 for $20k
> gets a real lumpy curve. Playing 4,000 independent hands (bet units)
> on $1 Jacks 9/6 for $20k coin-in has been a possible play, and starts
> the smoothing.

You are right: smoothing is happening all the time, with every hand,
BUT at the same rate for every hand. Really. That said, smoothing
appears to be occurring more in the area we care about at a faster and
faster rate. This is our eyes playing tricks on us when we look at a
graph... the rate isn't changing.

> But 80,000 independent hands (bet units) on nickel Jacks 9/6 for
> $20k coin-in...really smoothing the curve, but starts running into
> other problems.

What problems, exactly?

> The set of possible outcomes increases with more hands played,
> which smooths out the curve, the three "M"s start to converge.

Yes. Yes. Yes. But they never really do converge for VP. This is
due to the long tail issue. Hence the mean will always be to the
right (more positive) of the mode or median. Always. (But admittedly
less and less so as we approach infinity)

Some people think this behavior is an example of the central limit
theorem. Well, I'm not sure what the CLT is actually, but what is
happening isn't the CLT. Trust me. Nonetheless, I don't object to
calling it the CLT, since the CLT conjures up a mental image that many
people can understand. [Look at the Pickem PDFs. Sure, they are
getting smoother and nicer looking, but there really isn't a central
limit developing.]

> But Harrah's tosses in another constraint, 24 hrs. More importantly
> is the player's constraint of time allotted/fatigue.

BTW, in message #480017 I gave some statistics for JoB for up to 4000
hands. I assumed that the player has enough bankroll to play that
number of hands without EVER going bust.

> This is why a multi-play game helps. The results of each hand are
> now dependent upon the initial deal, but you get a lot more different
> possible results...a trade-off. Not as good as independent ones,
> but helps get the three "M"s closer together.

Yes. But put another way, multiplay reduces the variance, though
never by as much as the equivalent number of single-line hands would (for
the same total bet per play). There is a proven mathematical theory
behind this, don't worry. Now, just to confuse people, the variance
itself of our sessions (for single line play) would produce a
normal-like distribution given enough sessions. This is an example of the
CLT. [If my memory serves me correctly, the variance of an
uncorrelated random process follows a chi-square distribution of
order n, where n is the number of hands here. As n gets large, the
chi-squared distribution becomes "normal". Someone might want to
check me on this. Perhaps it is only true for normal processes?]

> This has been a qualitative study, our math majors will now (or
> have) make it quantitative.

Sorry, not a math major exactly, so I will stick to rigorous theory
and poor spelling at the moment and leave numbers for someone else.

snip

> and do so. The challenge with statistics is not computing them per
> se (especially given today's computers), but rather understanding
> what they (the statistics) mean. But coming up with meaningful
> statistics or metrics is no mean feat (pun intended).
>
> So, I'll leave to the group as a whole to come up with the important
> metrics, and I'll try to help compute them when I get a chance.

An accumulation of the PDF from neg. to positive (similar to a Z
table) for all of the popular games for various size sessions,
probably up into the 4 or 5 thousand range (See Jazbo's)

snip

> If one must deal with student data, the CDF is a much better choice,
> but most people don't learn about them in school.

What is CDF please?

snip

> So no matter what, the PDF always has a very, very
> long tail to the right (positive side). The part to the left is
> always shorter; it stops at (-coins bet)*(number of hands). In other
> words, VP PDFs have very long positive tails.

So does it ever even approximate a normal curve? I think not, but
am still learning. If so can you quantify when it approaches normal?

snip

> > The set of possible outcomes increases with more hands played,
> > which smooths out the curve, the three "M"s start to converge.
>
> Yes. Yes. Yes. But they never really do converge for VP. This is
> due to the long tail issue. Hence the mean will always be to the
> right (more positive) of the mode or median. Always. (But admittedly
> less and less so as we approach infinity)

See question above

> Some people think this behavior is an example of the central limit
> theorem. Well, I'm not sure what the CLT is actually, but what is
> happening isn't the CLT. Trust me. Nonetheless, I don't object to
> calling it the CLT, since the CLT conjures up a mental image that many
> people can understand. [Look at the Pickem PDFs. Sure, they are
> getting smoother and nicer looking, but there really isn't a central
> limit developing.]

I agree, but cannot support my idea. I have asked this question
many times but many insist that the CLT is taking place. Extensive
online research has turned up some authors saying that a "special
kind of CLT is happening" but I am not convinced.

So my conclusion of your posting, your PDFs and those posted by
Jazbo is that a distribution of outcomes is never normal. The
larger the number of hands in each session then the smoother will be
the distribution and the less positive skew it will have, but it
will always be skewed.

Begetting a question: Possibly, if enough hands per session are
played, the skew will be so small that the normal distribution can be
used as an approximation?

DWK

···

--- In vpFREE@yahoogroups.com, "cdfsrule" <groups.yahoo@v...> wrote:

> Yes. Yes. Yes. But they never really do converge
> for VP. This is due to the long tail issue. Hence
> the mean will always be to the right (more positive)
> of the mode or median. Always. (But admittedly
> less and less so as we approach infinity)
>
> Some people think this behavior is an example of the
> central limit theorem. Well, I'm not sure what the
> CLT is actually, but what is happening isn't the CLT.
> Trust me. Nonetheless, I don't object to calling it
> the CLT, since the CLT conjures up a mental image that
> many people can understand. [Look at the Pickem PDFs.
> Sure, they are getting smoother and nicer looking, but
> there really isn't a central limit developing.]

Here's an intuitive explanation of the Central Limit
Theorem.

Say you have a probability distribution of outcomes.
Call that the "population distribution." This is
analogous to the payoff likelihoods for a given game
of VP and a strategy for that game.

You can randomly select an outcome from the
probability distribution. For example, you can play a
hand of VP.

You could also select an arbitrary number of outcomes,
and then take the mean of that. We'll call that a
"sample." This is like playing N hands of video poker,
and adding up the results and dividing by N. Now for
each value of N you choose, there is a slightly
different distribution of outcomes. We can call this
distribution the "sampling" distribution.

This is what you are doing with your "PDF" technique.
You are choosing a number of hands _N_ and figuring
out the distribution of outcomes that occur from
playing N hands of VP.

What the Central Limit Theorem says is that as N goes
to infinity, the sampling distribution converges on
normal. Therefore, there IS some number of VP hands
such that if you plot the "PDF" for that number N,
you'll get a nice bell-shaped curve with mean = X and
variance = V.

Jerrod

···

--- cdfsrule <groups.yahoo@verizon.net> wrote:

Parent Population PDFs for VP are never ever normal. Below is a proof
of sorts.

Normal distributions by definition are symmetrical about their maximum
value. For the normal distribution, this maximum sits at the point that
is at once the mean, mode, and median. Thus, the mean, mode, and median
occur exactly in the center of the normal PDF.
[To be really precise, normal distributions also extend to +/-
infinity, but we don't need to care about that now.] Saying it
another way, if a PDF is not symmetrical about its mean, or its mean
is not in the center, it can't be a normal distribution. BTW,
neglecting the +/- infinity issue, only 2 parameters are needed to
describe a normal distribution: the mean and variance. One reason
only 2 parameters are needed is because the PDF is symmetrical. If you
need more than two parameters (neglecting the +/- infinity issue
again), then the distribution is NOT normal.

For VP, the largest possible loss after n hands is -n*bet.
On the other hand, the largest possible win after n hands is
n*(RF-bet), where RF = payout for the RF. So the PDF extends from
-n*bet all the way up to n*(RF-bet), including all integer values in
between.

Likewise the mean for VP is n*(EV-1)*bet, where EV is 1-hand
expectation value. For example, if EV=1, the mean = 0. If EV = 0,
the mean is -n*bet.

So, if VP had a normal distribution, it would be symmetrical, and
the midpoint of the PDF (the average of the worst loss and the best
possible win) would be equal to the mean. Let's check:

First let's compute the midpoint of the PDF after n hands:
1/2 * [-n*bet + n*(RF-bet)] = n*RF/2 - n*bet {please check my math}

The mean is: n*(EV-1)*bet = n*EV*bet - n*bet

So unless EV*bet = RF/2 the PDF can NOT be normal. Moreover, this is
true for all n, even as n approaches infinity.
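
The arithmetic above can be checked with a few lines of Python. This is just a sketch that plugs numbers into the two formulas as written; the 5-coin bet and 4000-coin royal follow the JoB figures used earlier, EV = 0.9954 is an assumed round value for 9/6 JoB, and the function names are made up. It only compares the midpoint of the possible range with the mean.

```python
def support_midpoint(n, bet, rf):
    """Halfway point between the worst loss (-n*bet) and the best win n*(rf-bet)."""
    return 0.5 * (-n * bet + n * (rf - bet))


def session_mean(n, bet, ev):
    """Expected net result after n hands, with ev the 1-hand return (1.0 = break-even)."""
    return n * (ev - 1) * bet


n, bet, rf, ev = 1000, 5, 4000, 0.9954   # ev is an assumed, rounded value
print("midpoint of range:", support_midpoint(n, bet, rf))
print("mean             :", session_mean(n, bet, ev))
print("midpoint == mean only if ev*bet == rf/2:", ev * bet, "vs", rf / 2)
```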

The central limit theorem (whatever it really is) just doesn't play
with n-hand theoretical PDF for VP.

> What the Central Limit Theorem says is that as N goes
> to infinity, the sampling distribution converges on
> normal.

Well, sort of, but not exactly: The theory of random processes or
random variables, not the CLT, says that as n goes to infinity, the
sampling distribution converges to the parent distribution so long as
the random process is stationary and ergodic (or at least what is
called "wide-sense" ergodic). In application to VP, the combination
of "stationary" and "ergodic" is saying that the game has the same
odds in theory regardless of machine, who plays it, or when it is
played, etc. So what this says is that if we take the results of
enough players (all of whom play in exactly the same manner) and plot
the PDF, it will look more and more like the PDFs I
compute as the number of players increases, the number of machines
increases, the number of sessions increases, etc.

> Therefore, there IS some number of VP hands
> such that if you plot the "PDF" for that number N,
> you'll get a nice bell-shaped curve with mean = X and
> variance = V.

Well, I hope I've already convinced you that this can't be the case.
On the other hand, if we limit ourselves to the central portion of
the PDF, it does look quite normal. (Yes, finally!)

But the question of the variance still remains: what is the variance
that would best describe this central portion of the PDF? If we found
the normal distribution that best matched the actual VP distribution,
what variance would that normal distribution have? How much
different from the actual variance are these estimates? And most
importantly, how much of an error would we be making if we assumed
the PDF was normal when we make statistical inferences, and should we
even care?

Just some food for thought.

The argument about the symmetry of the distribution isn't a valid counterproof.

You realize at one point that your reasoning is somewhat flawed (the
issue of the infinite support of a normal distribution when all finite
PDFs have a finite support), but you fail to realize that the point
about symmetry is flawed for the same reasons. I can build an explicit
counter-example if you'd like (a series of functions that uniformly
converges toward a normal distribution even though each function has a
finite support toward negative infinity and an infinite support toward
positive infinity).

Yes, I'll give you the fact that we're looking at the convergence of
distributions here, which is somewhat more complex than the
convergence of functions, but since it's possible for each of the PDFs
to create a sister function which for all practical purposes has the
same properties for a VP player that is actually not a problem.

JBQ

···

On 8/31/05, cdfsrule <groups.yahoo@verizon.net> wrote:

Parent Population PDFs for VP are never ever normal. Below is a proof
of sorts.

They are not _exactly_ normal. They _converge_ on
normal. The function f(x) = 1/x converges on zero as x
goes to infinity, but it never takes the value zero.

Suppose we define an error value, which is the
aggregate difference between the directly calculated
PDF and the normal distribution (ie, the integral of
the absolute value of the difference between the two
functions from negative to positive infinity). The CLT
states that for any value E, there is a number of
trials N such that the error value is less than E.
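
A discrete version of this error value is easy to compute for a small example. The Python sketch below uses a hypothetical three-outcome game (not a real pay table, chosen only because it is skewed and runs quickly), builds the exact N-trial distribution, and sums the absolute differences against a normal with the same mean and variance, binned to whole units. The names `GAME`, `n_fold`, and `l1_error` are made up; a real VP pay table with a rare royal would need far more trials before the error gets small.

```python
from math import erf, sqrt

# Hypothetical skewed game: net result per trial -> probability (an assumption).
GAME = {-1: 0.6, 0: 0.2, 4: 0.2}


def n_fold(game, n):
    """Exact distribution of the sum of n independent trials of `game`."""
    pmf = {0: 1.0}
    for _ in range(n):
        nxt = {}
        for total, p in pmf.items():
            for x, q in game.items():
                nxt[total + x] = nxt.get(total + x, 0.0) + p * q
        pmf = nxt
    return pmf


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def l1_error(game, n):
    """Sum of |exact probability - normal bin mass| over all integer outcomes."""
    mu1 = sum(x * p for x, p in game.items())
    var1 = sum((x - mu1) ** 2 * p for x, p in game.items())
    mu, sigma = n * mu1, sqrt(n * var1)
    pmf = n_fold(game, n)
    lo, hi = min(pmf), max(pmf)
    err = 0.0
    for x in range(lo, hi + 1):
        bin_mass = normal_cdf(x + 0.5, mu, sigma) - normal_cdf(x - 0.5, mu, sigma)
        err += abs(pmf.get(x, 0.0) - bin_mass)
    # Normal mass lying outside the reachable range also counts as error.
    err += normal_cdf(lo - 0.5, mu, sigma) + (1.0 - normal_cdf(hi + 0.5, mu, sigma))
    return err


for n in (1, 4, 16, 64, 256):
    print(f"N={n:4d}  error={l1_error(GAME, n):.4f}")
```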

<snip discussion of why the distribution can never be
exactly normal>

> So unless EV*bet = RF/2 the PDF can NOT be normal.
> Moreover, this is true for all n, even as n approaches
> infinity.
>
> The central limit theorem (whatever it really is)
> just doesn't play with n-hand theoretical PDF for VP.

The CLT is proven to apply to this case, just as it
does for any case of random variates with finite
variance and specified probability distributions.

> > What the Central Limit Theorem says is that as N goes
> > to infinity, the sampling distribution converges on
> > normal.
>
> Well, sort of, but not exactly: The theory of random
> processes or random variables, not the CLT, says that as
> n goes to infinity, the sampling distribution converges
> to the parent distribution so long as the random process
> is stationary and ergodic (or at least what is called
> "wide-sense" ergodic). In application to VP, the
> combination of "stationary" and "ergodic" is saying that
> the game has the same odds in theory regardless of
> machine, who plays it, or when it is played, etc. So what
> this says is that if we take the results of enough
> players (all of whom play in exactly the same manner) and
> plot the PDF, it will look more and more like the PDFs I
> compute as the number of players increases, the number of
> machines increases, the number of sessions increases, etc.

Hopefully you realize that many players playing many
sessions and many machines is mathematically
equivalent to one player playing more hands.

> > Therefore, there IS some number of VP hands
> > such that if you plot the "PDF" for that number N,
> > you'll get a nice bell-shaped curve with mean = X and
> > variance = V.
>
> Well, I hope I've already convinced you that this
> can't be the case. On the other hand, if we limit
> ourselves to the central portion of the PDF, it does
> look quite normal. (Yes, finally!)
>
> But the question of the variance still remains: what
> is the variance that would best describe this central
> portion of the PDF? If we found the normal distribution
> that best matched the actual VP distribution, what
> variance would that normal distribution have?

The variance of the game multiplied by the number of
trials, of course.

> How much different from the actual variance are these
> estimates?

The variance of sessions of N hands is equal to N*(the
variance of the game).
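
The N*(variance) rule, and how slowly the skew dies away, can both be checked from the per-hand moments. This Python sketch assumes rounded, commonly quoted 9/6 JoB hit frequencies (an assumption, not data from this thread) with payouts per unit bet; `PAYS`, `moments`, and `two_hands` are illustrative names. It verifies by brute force that two hands have twice the variance of one, and that the skewness of a sum of independent hands falls as 1/sqrt(N).

```python
from math import sqrt

# Payout per unit bet -> probability (approximate 9/6 JoB; illustrative assumption).
PAYS = {800: 0.00002476, 50: 0.00010931, 25: 0.00236255, 9: 0.01151221,
        6: 0.01101451, 4: 0.01122937, 3: 0.07444870, 2: 0.12927890,
        1: 0.21458503}
PAYS[0] = 1.0 - sum(PAYS.values())


def moments(pmf):
    """Return (mean, variance, skewness) of a discrete distribution."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    third = sum((x - mean) ** 3 * p for x, p in pmf.items())
    return mean, var, third / var ** 1.5


def two_hands(pmf):
    """Exact distribution of the total of two independent hands."""
    out = {}
    for a, p in pmf.items():
        for b, q in pmf.items():
            out[a + b] = out.get(a + b, 0.0) + p * q
    return out


m1, v1, s1 = moments(PAYS)
m2, v2, s2 = moments(two_hands(PAYS))
print(f"1 hand : mean={m1:.4f} variance={v1:.2f} skewness={s1:.1f}")
print(f"2 hands: variance ratio={v2 / v1:.6f} skewness ratio={s2 / s1:.6f}")

# Skewness of an N-hand session under independence: s1 / sqrt(N).
for n_hands in (1_000, 10_000, 100_000, 1_000_000):
    print(f"N={n_hands:>9,d}  session skewness ~ {s1 / sqrt(n_hands):.3f}")
```

The last loop gives one rough way to judge how many hands it takes before a normal approximation is tolerable; how small the skewness needs to be is a judgment call.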

> And most importantly, how much of an error would we be making
> if we assumed the PDF was normal when we make statistical
> inferences, and should we even care?

A very large one, in many cases. Just because the PDF
is normal when N is very, very large doesn't mean that
we can just pretend that it is normal when N is in the
normal range of outcomes. This is dramatically not the
case, and what you're saying about PDFs does have a
lot of value in assessing short-term risk and the
likely outcomes from sessions of fixed-length. In
fact, most players could benefit from looking at the
method you're using as a guide to their likely
outcomes.

The only reason I chime in here is because you are
making claims that are mathematically unsound in the
midst of providing something useful. Plus, you have a
pretty mathy vocabulary, and it's not reasonable to
expect non-mathematical readers to be able to discern
the difference between your claims which are true:

--Variance isn't a particularly good guide to session
risk.
--PDFs are a good method of analyzing risk for
sessions.

from the claims which are false:

--The Central Limit Theorem doesn't apply to video
poker.

Jerrod

···

--- cdfsrule <groups.yahoo@verizon.net> wrote:

From a practical point of view, if someone would compute and tabulate
the cumulative PDF for the more popular games in the more likely session
sizes (say 2K, 5K, 10K and 25K), it would be of use to many of us and
would eliminate the normal vs non-normal concern from an applied and
useful standpoint.

As JBQ pointed out to me in a private e-mail, expanding the tail
regions in the tabulation would be most useful, as graphical
representations of same are not able to be read that accurately in the
tail regions.

This would allow one to look up the probability of exceeding or
staying below a win/loss of x amount after playing a session of
predetermined size much like one would determine the probability of
exceeding/or being below a given value in a Z table.
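
As a rough idea of what such a lookup table could look like, here is a Python sketch that estimates it by simulation rather than by exact calculation. It assumes rounded, commonly quoted 9/6 JoB hit frequencies (an assumption, not figures from this thread), works in units of one bet, and uses hypothetical names (`simulate_sessions`, `prob_at_or_below`). With only a couple of thousand simulated sessions the tail entries are coarse; an exact PDF would be needed for accurate tails.

```python
import random
from bisect import bisect_right

# Payouts per unit bet and approximate 9/6 JoB frequencies (illustrative assumption).
PAYOUTS = [800, 50, 25, 9, 6, 4, 3, 2, 1, 0]
HIT_PROBS = [0.00002476, 0.00010931, 0.00236255, 0.01151221, 0.01101451,
             0.01122937, 0.07444870, 0.12927890, 0.21458503]
PROBS = HIT_PROBS + [1.0 - sum(HIT_PROBS)]


def simulate_sessions(n_hands, n_sessions, seed=1):
    """Return the sorted net results (in bets) of n_sessions simulated sessions."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_sessions):
        won = sum(rng.choices(PAYOUTS, weights=PROBS, k=n_hands))
        results.append(won - n_hands)       # subtract one bet per hand
    return sorted(results)


def prob_at_or_below(sorted_results, x):
    """Empirical cumulative probability P(net result <= x)."""
    return bisect_right(sorted_results, x) / len(sorted_results)


results = simulate_sessions(n_hands=1000, n_sessions=2000)
for x in (-200, -100, -50, 0, 50, 100, 200, 800):
    print(f"P(net <= {x:5d} bets) ~ {prob_at_or_below(results, x):.3f}")
```

Reading the table works like a Z table: the probability of finishing above x is one minus the listed value.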

I challenge you, who are able to do this, to make this contribution to
vp resources for us that are more interested in application than in
theory.

If this makes sense, then this is what I have been striving for in my
posting of late where I was trying to clarify all of the information
floating around.

It has been reported that a program has been written to do this, but I
have yet to find it although I keep looking.

DWK

wrote:

It has been reported that a program has been written to do this, but I
have yet to find it although I keep looking.

http://www.lotspiech.com/GamblersRuin.html

···

--- In vpFREE@yahoogroups.com, "deuceswild1000" <deuceswild1000@y...>

http://mathworld.wolfram.com/CentralLimitTheorem.html

···

--- In vpFREE@yahoogroups.com, "cdfsrule" <groups.yahoo@v...> wrote:

The central limit theorem (whatever it really is)

Yes, this is a good definition of the CLT. And for those who don't
speak math-ese, let me give you a practical example. When you are
done contemplating the example... consider if it applies to VP PDFs
and how:

Imagine you have a bucket. (Actually, a lot of buckets.) Each bucket
is filled with a whole lot of balls. Each ball has a number. The
balls in each bucket can be described by a mean value and a variance.

Now, you go ahead and take out some balls from each bucket. Assume
now you take out n balls from each bucket. You can have as many
buckets as you want. Now you take the n balls you pull out of one
bucket, add up their numbers, and divide by n: this produces a new
number equal to the average value of the balls you pulled out of a
bucket. For the hell of it, you put the balls back (you don't have
to, but doing so is called "sample with replacement").

Now, you do this procedure as many times as you want, computing the
average value of the n balls you draw. You can draw the balls from
any bucket or from the same bucket (it doesn't matter). Now you plot
the PDF of the average values you just computed. This PDF will
approach a normal distribution as n, the number of balls you pulled
out, increases. This is the CLT. It does not depend on the number
of times you pull the n balls, only on the number of balls n you pull
each time.

You can use Microsoft Excel to check this. Create a random
distribution of numbers that is NOT at all normal. (How? Use the
RAND function, which returns an EVENLY distributed set of numbers.)
Now draw n numbers from this set, and compute the average of the
numbers you draw. Repeat this many times. Then plot your results.
Now increase n, repeat, and plot your results. When n=1, the PDF
will look evenly distributed like the original data. It has to, or you
have done something wrong. When n=2 (or 3 or so), you will get a
triangle-like PDF. As n increases more, the PDF starts to get
somewhat normal-like. This is an example of the CLT in action.
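
The same experiment can be run without a spreadsheet. Below is a small Python sketch of it, with a seeded generator standing in for the RAND function; the function names, the number of repeats, and the text histogram are choices made for illustration only. For each n it draws n evenly distributed numbers, averages them, repeats many times, and prints a crude histogram of the averages.

```python
import random
from statistics import pstdev


def averages_of_n(n, repeats=20000, seed=0):
    """Average n uniform(0, 1) draws, `repeats` times over."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) / n for _ in range(repeats)]


def text_histogram(values, bins=10, width=50):
    """Return one bar of '#' characters per bin across the interval [0, 1)."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    peak = max(counts)
    return ["#" * round(width * c / peak) for c in counts]


for n in (1, 2, 12):
    avgs = averages_of_n(n)
    print(f"n={n}: standard deviation of the averages = {pstdev(avgs):.4f}")
    print("\n".join(text_histogram(avgs)))
    print()
```

The flat, triangular, and bell-like shapes should show up in turn, and the spread of the averages should shrink roughly as 1/sqrt(n).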

--- In vpFREE@yahoogroups.com, "nightoftheiguana2000"
<nightoftheiguana2000@y...> wrote:

···

--- In vpFREE@yahoogroups.com, "cdfsrule" <groups.yahoo@v...> wrote:
> The central limit theorem (whatever it really is)

http://mathworld.wolfram.com/CentralLimitTheorem.html

JBQ,

I appreciate your comments. I think there is some confusion so let's
make sure we are discussing the same "things" rather than arguing.
IMHO, there are 2 different "things" being discussed here, one in
which the PDF becomes normal, just as the CLT says it should, and one
in which the PDF isn't normal, and in which the CLT doesn't apply. I
will discuss both cases, using relevant examples for VP.

Case 1: The PDF becomes normal (the CLT applies)
------------------------------------------------
So, let's assume that we have M different (identical) VP machines.
Each machine is played one time. So there are M results. I'll
assume that M is a very, very large number. In that case the PDF of
these M results looks exactly like the 1-hand PDF derived from the
pay table and available in WinPoker, etc.

Now instead of looking at the PDF of these M values, we compute new
values by averaging N of the M values. I'll call these new values
X(i,N). The formula is X(i,N) = 1/N * sum(N values of M). This is
formula 2 given in the reference
http://mathworld.wolfram.com/CentralLimitTheorem.html.

Now, what does the PDF of the X(i,N) look like? Well, as N gets
large, the PDF becomes a normal distribution, just like the CLT says.

So let's review: the average of N draws from a set of M values which
have the 1-hand VP PDF becomes normal as N increases. Excellent (we
all agree I hope)

We can also extend this a little, and consider the set of M values
which occurs after H hands of video poker are played. In this case
also, as N increases (not H), the PDF becomes normal.

But, what is the distribution of the M values after H hands? What
does it look like? Is it normal?

Well, it doesn't matter at all here (so long as the mean & variance
are finite) what it looks like; the CLT still applies. That is the
power of the CLT. That said, in the next section, I will explain that
the distribution of M after H hands is NOT NORMAL... since the CLT
doesn't apply. Those distributions can be computed and can be used
to verify what I am saying. In fact, I've already done that.

Case 2: The PDF doesn't become normal (the CLT doesn't apply)
-------------------------------------------------------------

So, let's assume again that we have M VP machines. Each machine is
played H times (H hands). So there are M results after H hands.
I'll assume that M is a very, very large number. So, what does the
PDF of these M results look like as a function of H? Does it become
normal as H increases? Don't jump to answer yet: Remember this is a
different situation than case 1. (Go re-read case 1 if the
differences aren't clear. Hint: last time there was another variable,
N, which doesn't show up here at all.)

First, for small H, we know what it looks like. I've computed it and
there are examples on the web for a reasonable number of hands. But
what happens when H increases? What happens when H= infinity. Does
the CLT apply?

Well, first, consider the mean of the distribution (I showed this in
an earlier email):
mean = H*(EV-1)*bet

The mean is linearly dependent on H. That means (pun intended) that
as H goes to infinity, unless EV=1, the mean becomes either + or -
infinity. So, unless EV=1, the mean goes to +/- infinity as H
increases. Hence, the PDF isn't normal. It can't be with a mean that
does not converge. So, for EV not 1, the CLT surely can't apply. Oh
well.

But for EV=1 there is still hope, because the mean is always 0. But
don't get too excited. We still need to consider the variance.

Computing the variance for the PDF after H hands is a bit trickier
than computing the mean. In fact, I don't know of an easy way to do
it (appropriate for this email). But as a proxy for the variance,
how about we consider the width of the central lobe in the PDF?
After all, if the PDF was actually normal, this width would be
related to the variance. Take a look at the PDFs I submitted a week
ago. Notice that the central portion is growing in width (full width
at half max, or FWHM) and NOT shrinking. In fact, the width
continues to increase with increasing H without limit. Hence, the
PDF does not converge to anything, normal or otherwise. If the PDF
was normal, the variance would be shrinking with H. It isn't. The
PDF isn't normal. The CLT couldn't have applied in this case. BTW,
if you don't like to use the FWHM, you can use the difference in
extrema as a proxy for the variance (like I did last time), and you
can show the midpoint between the extrema grows linearly in H, from
which you can infer the variance is always increasing. Again, the
PDF is not normal. The CLT doesn't apply.

But why doesn't the CLT apply? Because we aren't taking the average
of N draws of the M values (for H hands). Without taking the average
of the N values, the CLT doesn't apply. In fact, taking the average
of the N values is required for the CLT to apply. This isn't word
play or sleight of hand: you have to take the N values and let N get
large or the CLT doesn't apply. The fact that H is increasing in
this example doesn't matter.

Aside: My argument last time regarding the symmetry of the normal
distribution is sound, but a rigorous proof of it is beyond the scope
of this email (sorry). The issue of a truncated normal distribution
has no bearing either. In practice, one just lumps all the
missing "area" (from the finite extent of the PDF to its theoretical
infinite limits) into a parameter that can be exactly computed using
the error function (or ERF). This # is then added or subtracted as
needed when computations involving moments of the experimental PDF
are undertaken, or ignored. After all, all real "normal" PDFs have
finite extent, so there must be some trick for dealing with it.

Your comment about what the distribution converges to IS INSIGHTFUL.
It is complicated. Unfortunately, I don't think the PDF for H hands
of VP converges to anything. But, there are some re-normalization
tricks that might be applicable.... any ideas? BTW, the only thing
the casino cares about is case 1. They are banking on CLT for large
N, regardless of H, making the actual return converge to the
theoretical return.

--- In vpFREE@yahoogroups.com, Jean-Baptiste Queru <jbqueru@g...>
wrote:

The argument about the symmetry of the distribution isn't a valid
counterproof.

You realize at one point that your reasoning is somewhat flawed

I'm afraid that there's indeed quite some confusion.

I will quote you exactly here, and I think that we should stop the
discussion at this point:

> Now, what does the PDF of the X(i,N) look like? Well, as N gets
> large, the PDF becomes a normal distribution, just like the CLT says.

Yes, this is exactly what the CLT says, and nothing else. Trying to go
deeper than that is not going to achieve anything.

If you really want to discuss more, I'll say that you seem to be
mixing the Monte Carlo theorem and the CLT. It's extremely dangerous,
because both of those theorems say that "something tends toward
something else when a certain number tends toward infinity", but the
number that tends toward infinity isn't the same. Before you can
properly reason on the CLT, you certainly need to see past the Monte
Carlo theorem and be able to reason on "pure" PDFs (as opposed to PDFs
seen as limits of finite samples).

On top of that, you're trying to reason on convergence of functions
(which are in a linear space of infinite dimension) as if they were
numbers (which are in a linear space of finite dimension). You're
trying to explicitly express the limit of the "reference" normal
distribution, and get visibly puzzled that it appears to have an
infinite mean and an infinite variance. Yes, the mean and the variance
of both the PDFs and the reference normal function all tend toward
infinity, but that's not actually a problem.

Your fundamental problem here is that to prove that two series of
functions converge toward one another, you try to prove that they both
converge toward the same thing. But that's not the case. Let me take a
simple example in dimension 1 to show you how two series can converge
toward one another without converging toward anything:

Consider the series s(n), defined for strictly positive integers,
where for all values of n s(n) is cos(n). Consider the series s1(n),
defined for strictly positive integers, where for all values of n
s1(n) is cos(n)+1/n. Neither of those series converges toward anything
(it's reasonably easy to prove), yet they converge toward one another.
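
A quick numerical look at the two series above, offered only as a sketch (the function names s and s1 simply mirror the text):

```python
import math


def s(n):
    """s(n) = cos(n): keeps oscillating, so it has no limit."""
    return math.cos(n)


def s1(n):
    """s1(n) = cos(n) + 1/n: oscillates just as much as s(n)."""
    return math.cos(n) + 1.0 / n


for n in (1, 10, 100, 1000, 10000):
    # The last column is the gap s1(n) - s(n), which equals 1/n.
    print(n, round(s(n), 4), round(s1(n), 4), s1(n) - s(n))
```

The first two columns keep jumping around between -1 and 1, while the gap in the last column shrinks like 1/n.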

> Aside: My argument last time regarding the symmetry of the normal
> distribution is sound, but a rigorous proof of it is beyond the scope
> of this email (sorry).

Well, I'll have to assume that a reasonably rigorous counter-example is
within the scope of this email, so that I won't have to ask for your
rigorous proof (wink, wink).

Take the following functions fn, with n a strictly positive integer:

for -1/2 < x < 1/2, fn(x) is 1-1/n.
for n < x < n+1, fn(x) is 1/n.
for all other values of x, fn(x) is 0.

Take the following function f:

for -1/2 < x < 1/2, f(x) is 1.
for all other values of x, f(x) is 0.

Notice that all the functions involved have the good taste of actually
being PDFs (positive everywhere, sum is 1).

When n tends toward infinity, fn tends toward f, both as uniform
convergence (max on x of |fn(x)-f(x)| tends toward 0) and as integral
convergence (sum on x of |fn(x)-f(x)| dx tends toward 0). Yet, fn is
never symmetrical, in fact it becomes less and less symmetrical as n
gets larger.

JBQ
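
The counter-example above can be checked numerically. This is a rough sketch, not part of the original post: the helper compare and its grid step are assumptions, and the integrals are approximated with a midpoint sum on [-1, n + 2].

```python
def f_n(x, n):
    """The functions fn from the post: a tall block near 0 plus a small block at [n, n+1]."""
    if -0.5 < x < 0.5:
        return 1.0 - 1.0 / n
    if n < x < n + 1:
        return 1.0 / n
    return 0.0


def f(x):
    """The limit function f: a unit block on (-1/2, 1/2)."""
    return 1.0 if -0.5 < x < 0.5 else 0.0


def compare(n, step=0.01):
    """Return (max gap, integral of the absolute gap, mean of fn) on a midpoint grid."""
    cells = round((n + 3) / step)
    sup = l1 = mean = 0.0
    for k in range(cells):
        x = -1.0 + (k + 0.5) * step      # midpoint of the k-th cell
        gap = abs(f_n(x, n) - f(x))
        sup = max(sup, gap)
        l1 += gap * step
        mean += x * f_n(x, n) * step
    return sup, l1, mean


for n in (1, 10, 100):
    print(n, compare(n))
```

The max gap behaves like 1/n and the integral of the gap like 2/n, so both go to 0. The mean of fn, on the other hand, stays close to 1 rather than 0, which is one simple symptom of the asymmetry that never goes away.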

···

On 9/1/05, cdfsrule <groups.yahoo@verizon.net> wrote:

I appreciate your comments. I think there is some confusion so let's
make sure we are discussing the same "things" rather than arguing.

Another example of CLT in action is that for a sufficiently large
number of hands, the pdf for video poker converges on a normal
distribution of mean = er-1 and variance = per hand variance. An
example of this for pick'em or the left side of jacks at 10,000 hands
is found here:
http://www.jazbo.com/videopoker/curves.html

···

--- In vpFREE@yahoogroups.com, "cdfsrule" <groups.yahoo@v...> wrote:

This is an example of CLT in action.

Let me quote from the Jazbo page you mention:

"After 5000 hands, the left hand Jacks peak is almost perfectly
normal --- but it matches a normal curve with a variance (and
expectation) corresponding to subtracting out the Royal. Because
cases where a Royal does occur (represented by the right hand peak)
are well-separated from the first case, here we can get a good
approximation of the distribution by combining two normal curves."

Well here we agree: The left portion of the curve IS "almost
perfectly normal". But what about the right? what does it look like?

Jazbo DOESN'T show what happens past 1000. Past 1000, there are more
nice peaks, one for each possible Royal that could be hit. Indeed
there are 5000 peaks out there. The last peak, representing the very
unlikely situation that you hit 5000 RF in a row, is just 1 value.

So to be a little more precise, after 5000 hands, the PDF looks like
about 5000 peaks. The first one (the left) looks normal. The other
ones, to the right, become progressively less normal as you move to
the larger values.

So, while the portion of the curve displayed might be approximated by
2 well separated normal curves (for 5000 hands), the entire PDF
needs many, many more curves, and the number of these grows linearly,
in proportion to the number of hands, forever.

Ok, I know what many of you out there are thinking. This guy is wrong, and
just won't let it rest. Well, don't take my word for it then. Do the
math yourself.

Here's my challenge: go compute for yourself the exact PDF for even
2 hands of JoB VP. Just 2 hands. Shouldn't be that hard. You can do
it in Excel. Now, when you are done, check the largest possible
return. I bet that you will find that it is for 2 RF's hit in
succession. Now go ahead and compute the PDF for 3 hands. Hmm...
you will find that the largest possible return is for 3 RF's.

Well, you can go on computing the EXACT PDF for n hands if you want,
or you can make some inferences. Namely that for each hand played,
you add a new, well-separated peak. Also, that as the number of
hands gets large, the number of peaks gets equally large, and the peaks
near the left become NORMAL. But what happens to the entire PDF? IS
it becoming normal?

You all know what my answer is already. It isn't becoming normal.
And it is not supposed to, because it represents the PDF after
playing N hands, not the PDF of the AVERAGE result after playing N
hands. This is a critical difference.

But alas, something out there is becoming normal: if I were to get
together with every possible person who plays VP, ask them all to
play 5000 hands, and then take averages of their results, THAT
distribution would be normal (or approximate normality) as the number
of individual results that I average increases, not the number of hands
played.

ok, one more thing (I guess this is for JBQ, wink, wink).

A corollary of the CLT is the variance decreases as the number
of "hands" increases. Is this true for Video Poker? Well, why not
compute it. Indeed, as N increases without limit, the variance
should become 0. Does it?

So go ahead and compute the variance for 2 hands of JoB VP.

Jazbo gives the value for 1 hand, it's about 19 (in units of coins
bet).

If you are right, and the CLT is working here, the variance should
drop by a factor of sqrt(2) or 1.4 for 2 hands.

Does it? No.

(BTW, how do you explain that the variance isn't getting smaller?)

> This is an example of CLT in action.

Another example of CLT in action is that for a sufficiently large
number of hands, the pdf for video poker converges on a normal
distribution of mean = er-1 and variance = per hand variance. An
example of this for pick'em or the left side of jacks at 10,000 hands
is found here:
http://www.jazbo.com/videopoker/curves.html

The last peak, representing the very
unlikely situation that you hit 5000 RF in a row, is just 1 value.

The probability of hitting 5000 RF in a row is 0.000025^5000. Run that
on your calculator and see what you get. In my book, zero is not a peak.
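
For those without a calculator handy, here is a small Python sketch of that computation. The 0.000025 per-hand figure is the one quoted above; the variable names are just for illustration. Working in logs shows the size of the number even though the direct power underflows.

```python
import math

p_royal = 0.000025   # per-hand royal flush probability quoted in the post
hands = 5000

# Direct power: far too small for double precision, so it underflows.
direct = p_royal ** hands

# Same quantity in log space: log10(p^n) = n * log10(p).
log10_p = hands * math.log10(p_royal)

print(direct)
print(log10_p)
```

The direct power prints as 0.0, and the log form says the true value is roughly 10 to the power -23010: not exactly zero, but far below anything a calculator or a plot will show.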

Ok, I know what many of you out there are thinking. This guy is wrong, and
just won't let it rest. Well, don't take my word for it then. Do the
math yourself.

Yes, the math speaks for itself.

But what happens to the entire PDF? IS
it becoming normal?

Yes, according to CLT, for a sufficiently large number of hands, the
PDF approaches a normal distribution with mean=er-1 and variance=per
hand variance.

You all know what my answer is already.

Yes, and your answer doesn't correspond to the mathematical answer.

And it is not supposed to, because it represents the PDF after
playing N hands, not the PDF of the AVERAGE result after playing N
hands. This is a critical difference.

Nonsense.

A corollary of the CLT is the variance decreases as the number
of "hands" increases.

False.

If you are right, and the CLT is working here, the variance should
drop by a factor of sqrt(2) or 1.4 for 2 hands.

You don't understand CLT.

(BTW, how do you explain that the variance isn't getting smaller?)

Simple: Math. The variance doesn't get smaller.

···

--- In vpFREE@yahoogroups.com, "cdfsrule" <groups.yahoo@v...> wrote:

> > This is an example of CLT in action.
>
> Another example of CLT in action is that for a sufficiently large
> number of hands, the pdf for video poker converges on a normal
> distribution of mean = er-1 and variance = per hand variance. An
> example of this for pick'em or the left side of jacks at 10,000 hands
> is found here:
> http://www.jazbo.com/videopoker/curves.html

The variance of the sum of unrelated (independent) events is the sum of
the variances, and the mean is the sum of the means.

The standard deviation is the square root of the variance.

Since the mean grows linearly but the standard deviation grows as the
square root, the standard deviation is negligible compared to the mean
when the number of samples tends to infinity.

After you normalize (by dividing by the mean), the normalized standard
deviation tends to zero, and the variance (which is the square of the
standard deviation) does the same.

That's the "law of large numbers": the deviation becomes negligible
compared to the mean when the number of samples grows toward infinity
(and when the mean isn't zero, obviously).

JBQ

···

On 9/2/05, cdfsrule <groups.yahoo@verizon.net> wrote:

(BTW, how do you explain that the variance isn't getting smaller?)

Let me say that another way.

The variance of the sum gets larger and larger.

The variance of the average gets smaller and smaller.

Why does it?

The variance of the sum of n identical (independent) samples is n times the
variance of a sample.

The average of n samples is the sum of each sample, individually divided by n.

The variance of a sample divided by n is the variance of the sample
divided by n*n.

The sum of the n variances that were each individually divided by n*n
equals the variance of one sample, divided by n.

JBQ
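
The sum-versus-average point above can be checked exactly for a few hands. What follows is a sketch under a loud assumption: the per-hand frequencies are approximate, commonly quoted figures for 9/6 Jacks or Better with optimal play, not numbers taken from this thread, and the helper names are made up for the example.

```python
from collections import defaultdict

# ASSUMPTION: approximate per-hand frequencies for 9/6 Jacks or Better,
# keyed by payoff in bet units. Not taken from this thread.
_RAW = {
    800: 0.00002476, 50: 0.00010931, 25: 0.00236255, 9: 0.01151221,
    6: 0.01101451, 4: 0.01122937, 3: 0.07444870, 2: 0.12927890,
    1: 0.21458503, 0: 0.54543467,
}
_TOTAL = sum(_RAW.values())
JOB_96 = {payoff: p / _TOTAL for payoff, p in _RAW.items()}  # normalize rounding


def convolve(dist_a, dist_b):
    """Distribution of the sum of two independent payoffs."""
    out = defaultdict(float)
    for a, pa in dist_a.items():
        for b, pb in dist_b.items():
            out[a + b] += pa * pb
    return dict(out)


def mean_var(dist):
    """Mean and variance of a {value: probability} distribution."""
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, var


def total_return_dist(n, hand=JOB_96):
    """Exact distribution of the total payoff over n independent hands."""
    dist = {0: 1.0}
    for _ in range(n):
        dist = convolve(dist, hand)
    return dist


_, var1 = mean_var(JOB_96)
for n in (1, 2, 3):
    dist = total_return_dist(n)
    _, var_sum = mean_var(dist)
    var_avg = var_sum / n ** 2   # Var(S / n) = Var(S) / n^2
    print(n, max(dist), round(var_sum, 2), round(var_avg, 2))
```

With these inputs the one-hand variance comes out near 19.5, in line with the "about 19" figure mentioned earlier in the thread. The variance of the total is n times that, the variance of the average is that divided by n, and the largest possible total is n royals.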

···

On 9/2/05, Jean-Baptiste Queru <jbqueru@gmail.com> wrote:

On 9/2/05, cdfsrule <groups.yahoo@verizon.net> wrote:
> (BTW, how do you explain that the variance isn't getting smaller?)

I think you're both ignoring the most vital question in this entire
debate. To wit: how many angels can dance on the head of a pin? :)

--- In vpFREE@yahoogroups.com, Jean-Baptiste Queru <jbqueru@g...>
wrote:

Let me say that another way.

The variance of the sum gets larger and larger.

The variance of the average gets smaller and smaller.

Why does it?

The variance of the sum of n identical samples is n times the variance
of a sample.

The average of n samples is the sum of each sample, individually
divided by n.

The variance of a sample divided by n is the variance of the sample
divided by n*n.

The sum of the n variances that were each individually divided by n*n
equals the variance of one sample, divided by n.

JBQ

> > (BTW, how do you explain that the variance isn't getting smaller?)

···

On 9/2/05, Jean-Baptiste Queru <jbqueru@g...> wrote:
> On 9/2/05, cdfsrule <groups.yahoo@v...> wrote:

The ratio of the standard deviation to the mean gets smaller and smaller.

The ratio is 1 at N0:
http://www.bjmath.com/bjmath/refer/N0.htm

N0=variance/(er-1)^2 hands (assuming CLT)

The significance is that at N0 hands of a positive expectation game,
the chances of net winning are 84%, at 4 times N0 hands the chances
are 98%.

Example: FPDW +0.25%cashback:
N0=26/(1.0075-1+.0025)^2= 260,000 hands
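
The N0 arithmetic and the 84% / 98% figures above can be reproduced with a short sketch. It assumes, as the post does, that the CLT normal approximation holds; the function names n0 and chance_ahead are made up for the example.

```python
import math


def n0(variance, edge):
    """Hands at which the expected win equals one standard deviation."""
    return variance / edge ** 2


def chance_ahead(hands, variance, edge):
    """Normal-approximation probability of a net win after `hands` hands."""
    # Mean of the total is hands * edge, standard deviation is
    # sqrt(hands * variance), so their ratio is:
    z = edge * math.sqrt(hands / variance)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# FPDW example from the post: variance 26, edge 0.75% plus 0.25% cashback.
variance, edge = 26.0, 0.0075 + 0.0025
hands = n0(variance, edge)
print(round(hands))
print(round(chance_ahead(hands, variance, edge), 4))
print(round(chance_ahead(4 * hands, variance, edge), 4))
```

At N0 the mean is one standard deviation above zero, and at 4 times N0 it is two, which is where the 84% and 98% come from.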

Let me say that another way.

The variance of the sum gets larger and larger.

The variance of the average gets smaller and smaller.

Why does it?

The variance of the sum of n identical samples is n times the variance
of a sample.

The average of n samples is the sum of each sample, individually
divided by n.

···

--- In vpFREE@yahoogroups.com, Jean-Baptiste Queru <jbqueru@g...> wrote:

The variance of a sample divided by n is the variance of the sample
divided by n*n.

The sum of the n variances that were each individually divided by n*n
equals the variance of one sample, divided by n.

JBQ

On 9/2/05, Jean-Baptiste Queru <jbqueru@g...> wrote:
> On 9/2/05, cdfsrule <groups.yahoo@v...> wrote:
> > (BTW, how do you explain that the variance isn't getting smaller?)