


Single Comment

Wikipedia link re: Margin of error (may be relevant to piece value studies)
Kevin Pacey wrote on Sat, Sep 17, 2016 02:50 AM UTC:

While looking up the latest US election poll results online (on Wikipedia), I noticed a reference to margin of error, and also that it is naturally larger for smaller sample sizes. See the following link:

https://en.wikipedia.org/wiki/Margin_of_error

There it can be seen that a sample size of just 96 gives a margin of error of 10%, while a sample size of 384 gives a margin of error of only 5%. It struck me that a computer study of piece values might need a considerably large sample of games between identical engines before one could be confident of the conclusions drawn about the values of different pieces; perhaps 384 games should be the minimum. When reporting such a study, it would be worth noting how the margin of error affects the estimate of a piece's value, if the effect is at all significant (e.g. stating "plus or minus 0.125 pawns" [however that might be calculated] alongside a value derived from win/loss percentages for that piece).
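As a minimal sketch of where those figures come from: the standard margin-of-error formula for a proportion is z·sqrt(p(1−p)/n), and at 95% confidence (z ≈ 1.96) with the worst-case p = 0.5 it reproduces the 96-game and 384-game numbers cited above. The function name here is my own, not from any study.

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Margin of error for a proportion estimated from n independent
    games, at the confidence level implied by z (1.96 ~ 95%).
    p = 0.5 is the worst case, giving the widest margin."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (96, 384, 1000):
    print(f"n = {n:4d}: +/- {margin_of_error(n):.1%}")
# n =   96: +/- 10.0%
# n =  384: +/- 5.0%
# n = 1000: +/- 3.1%
```

Note the square-root relationship: halving the margin of error requires four times as many games, which is why 384 games (4 × 96) cuts the 10% margin to 5%.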

In the study finding that in chess a knight is worth exactly a bishop, I recall GM Larry Kaufman used a huge number of games (1,000,000+?) between skilled humans to reach his conclusion with a high degree of statistical confidence. The study may nonetheless have been flawed: judging from chess books, most human chess authorities agree that a knight is on average slightly worse than a bishop. My own guess is that human vs. human games would not necessarily produce the same statistical result as a study of an engine playing an identical copy of itself over a similarly huge number of games, all starting from an opening-stage setup pitting a single bishop against a single knight. For one thing, different people value bishops and knights (and the circumstances under which they can be exchanged equitably) slightly differently, which affects their decisions, and in turn the results of the individual games counted in a study, in a more chaotic way than with engines. That is not to mention all-too-human blunders and lesser mistakes, although these might even out more readily than the discrepancies caused by players valuing the minor pieces differently. I should note, though, that I own a 1998 middlegame book that is quite content to quote human vs. human database statistics showing results in favour of 2 bishops over 2 knights (or knight + bishop) a large majority of the time, under varying conditions of otherwise even material, much as Kaufman found.

P.S.: In digging back through old comments, I see that H.G. (if no one else) has in effect already taken into account much (if not all) of what I posted above, having made computer studies with a minimum of 1,000 games in at least some cases, e.g. Amazon vs. Q + N (I don't know the sample size in the B vs. N case), when calculating piece values via piece vs. piece(s) battles. Assuming the engine and methodology used are strong, I still cannot square some of the results of computer studies with my intuition, to my bewilderment. A possibly amusing personal anecdote: at one point, while musing about margin of error in piece value estimates, I briefly thought that if a knight (as a piece of lower or equal value to a bishop) were set at 3.0 and the margin were 10%, then the margin of error for a study (of 96 games) comparing it to a bishop might be 3.0 x 0.1 = 0.3 pawns. In similar fashion, I thought an archbishop set at 8.0 with a margin of 10% would give 8.0 x 0.1 = 0.8 pawns in a 96-game comparison against a queen. I soon saw no justification for tying the margin of error to a piece's assigned numerical value, and realized the math must be incorrect. :)

Another way to convert the margin of error from a raw percentage into a fraction of a pawn might start from the numerical value of a minimum decisive advantage (i.e. the advantage with which an engine should win 100% of the games in a study). In chess that is about 1.333 pawns according to the old book Point Count Chess; accepting that value for the sake of argument, a margin of error of 10% (i.e. for a 96-game study) could be converted to 1.333 x 0.1 = plus or minus 0.133 pawns. This may just be more incorrect math, but oddly enough I don't see how to easily refute it at the moment, at least with my feeble/rusty math skills.
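The proposed conversion above can be sketched in a few lines. To be clear, the linear mapping from score percentage to pawns, and the 1.333-pawn figure from Point Count Chess, are the speculative assumptions of this comment, not an established method; the function name is mine.

```python
import math

# Assumption from Point Count Chess: an advantage of about 1.333 pawns
# should be enough to win 100% of games between equal engines.
DECISIVE_ADVANTAGE = 1.333  # in pawns

def moe_in_pawns(n, z=1.96, p=0.5):
    """Speculative conversion: scale the raw margin of error for an
    n-game study linearly up to the decisive advantage, yielding an
    uncertainty expressed in pawn units."""
    raw = z * math.sqrt(p * (1 - p) / n)  # standard margin of error
    return DECISIVE_ADVANTAGE * raw

print(f" 96 games: +/- {moe_in_pawns(96):.3f} pawns")
print(f"384 games: +/- {moe_in_pawns(384):.3f} pawns")
# 96 games: +/- 0.133 pawns
# 384 games: +/- 0.067 pawns
```

Under these assumptions, a 96-game study would carry roughly a plus-or-minus 0.133-pawn uncertainty, matching the 1.333 x 0.1 arithmetic in the paragraph above.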