Comments/Ratings for a Single Item
The way I determine piece values is to have the computer play itself from a start position with a material imbalance. E.g., to determine the value of the Archbishop (BN), I replace the Queen of one player by an Archbishop, and have the engine play a couple of hundred games against itself (where in half the games white has the Queen, and in the other half black). Say the side with the Queen scores 63% in that match; I then know that A < Q. To make it more even, I then delete the f-Pawn of the side with the Queen, so that I play with an imbalance of Q vs A+P (all other pieces present), again a couple of hundred games. Say the A+P side then scores 55%. I then know that the Pawn mattered 18%, while having Q instead of A matters 13%, so about 0.75 times as much as a Pawn. Apparently the Queen is worth 0.75 Pawn more than the Archbishop, say 9.50 vs 8.75.
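The arithmetic above can be sketched in a few lines (my own illustration of the described calculation, not the author's actual tooling; the score numbers are the ones from the example):

```python
def excess(score):
    """Score advantage over an even match, e.g. 0.63 -> 0.13."""
    return score - 0.50

# Match 1: Queen vs Archbishop; the Queen side scores 63%.
q_vs_a = excess(0.63)                 # +0.13: Q is stronger than A

# Match 2: Queen minus f-Pawn vs Archbishop; the A+P side scores 55%,
# so the Queen side scores 45%.
q_vs_a_plus_p = excess(0.45)          # -0.05

# Deleting the Pawn swung the Queen's score by 18%, so 18% ~ one Pawn.
pawn_swing = q_vs_a - q_vs_a_plus_p   # 0.18

# Q - A advantage in Pawn units: 13% / 18% ~ 0.72 Pawn.
q_minus_a = q_vs_a / pawn_swing
print(round(q_minus_a, 2))            # 0.72
```

This is what justifies the rounded "0.75 Pawn" figure, e.g. 9.50 vs 8.75.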
If 8.75 is not the value I had given the computer before playing these matches, the result is unreliable, because the side with the Archbishop could have traded it unfavorably, thinking it had a good deal. (E.g., if I had told it that A was worth 6, it would be happy to trade it for B + N, while in real life it should avoid that.) I then have to redo everything with a better-instructed computer. Fortunately, the result you get out doesn't depend very much on the value you originally put in, because even if the computer is misinformed into thinking it should try to trade A for B+N, the opponent with the Queen will share that misconception, and try to avoid those trades. So as long as both sides attach the same value to the piece, the outcome of the match will not be very sensitive to that value. Even if your initial guess was completely wrong, using the value derived from the result of that match as the programmed value for a second match usually confirms that value. And once you get a consistent result, you can trust the value.
To make the measurement more reliable, you should not just compare the value against one material combination, but against several ones, e.g. not just A vs Q, but also A vs R+R, A vs B+B+N, A vs N+N+B, etc.
If the rules are different, you can still use this method, provided you have a computer program that plays by these rules, and is sufficiently strong. I am not sure which type of rule differences you have in mind. Variants with drops are problematic, because there the pieces tend to change owner so fast that an initial material imbalance very quickly evaporates. But in, say, 3-check, material is conserved just as well as in normal chess games, and you can use the same method, and even attach a value to the number of checks by, say, giving a side that only needs to deliver 2 checks one less Bishop, to see whether the first check is worth more or less than a Bishop.
One thing that boggles my mind about the value computer studies find for an archbishop (e.g. on an 8x8 board) is that, as I recall, it is very close to the value found for a chancellor (as both are to a queen's value)... yet add a rook's movement ability to an archbishop and you will have an amazon piece type, while adding a mere bishop's movement ability to a chancellor also produces an amazon piece type. Somehow this all seems strange to me.
Well, it shows that piece values are not simply the sum of values of individual moves, but that some moves cooperate better than others. B and N moves seem to cooperate especially well. So adding N to Q gives you a bonus because of the Queen's B component. Adding R to A doesn't give you any new bonus the A didn't already have.
It is of course a good question why the B and N moves work together so well. Especially since the most obvious possible causes were all ruled out by experiment. The N component breaks the color binding of the B, but if I add a color-changing move like bmW to the Bishop, it hardly gains value, and the Knight gains a similar value by adding such a move. So apparently breaking color binding isn't worth much, except that the B-pair bonus becomes included in the base value. That individual B and N suffer from lack of mating potential also turns out not to be a major value-depressing factor; if I add a bcW move to the Bishop to endow it with mate potential, such an enhanced Bishop is also not worth spectacularly more than an ordinary one.
My current idea is that orthogonally adjacent move targets give a bonus. This would also explain why R moves are worth more than B moves, even when there is no color binding (e.g. the difference between RF and BW is about 1.75 Pawn). Adding N to B produces 16 new orthogonal contacts, adding N to R (or B to R) only 8. This theory would predict that K and WD should be worth more than other 8-target short-range leapers (like N), which isn't borne out by experiment, however. Perhaps it is masked by other, value-suppressing collective properties, e.g. that the K is especially slow, and that the WD has poor manoeuvrability (fewer targets that can be reached within two moves than for a Knight) and poor 'forwardness' (forward moves are typically worth twice as much as sideway or backward moves).
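The contact counts quoted above can be checked mechanically. A minimal sketch (my own, not the author's code), taking each piece's "footprint" as the target squares along its moves near a central square, and counting how many orthogonally adjacent (new target, old target) pairs a compound gains:

```python
# Target squares near the piece at the origin (riders truncated at range 3,
# which is enough for adjacency between nearest targets).
N = {(x, y) for x in (-2, -1, 1, 2) for y in (-2, -1, 1, 2)
     if abs(x) != abs(y)}                                   # 8 Knight targets
B = {(i * s, i * t) for i in (1, 2, 3) for s in (-1, 1) for t in (-1, 1)}
R = {(i * s, 0) for i in (1, 2, 3) for s in (-1, 1)} | \
    {(0, i * s) for i in (1, 2, 3) for s in (-1, 1)}

def new_contacts(new, old):
    """Count orthogonally adjacent pairs between added and existing targets."""
    return sum(1 for (x, y) in new
                 for (dx, dy) in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if (x + dx, y + dy) in old)

print(new_contacts(N, B),   # 16: N added to B (Archbishop)
      new_contacts(N, R),   # 8:  N added to R (Chancellor)
      new_contacts(B, R))   # 8:  B added to R
```

This reproduces the 16-vs-8 asymmetry claimed for B+N versus R+N and R+B.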
Of course it is a good question why orthogonally adjacent targets would be better than other 'footprints'. I have not addressed this at all. It could be connected to the move of the King; orthogonally adjacent targets are a requirement for mating potential, against an orthodox King. But I already established that mating potential doesn't seem worth a great deal, as in most end-games there are plenty of Pawns to provide that through promotion. Yet it would be interesting to measure piece values in Knightmate, where a Rook cannot force mate on a Royal Knight. Would the Archbishop also be so strong in Knightmate? I really have no idea.
It could also have to do with the fact that Pawns move orthogonally, so that there is value in being able to attack both the square where a Pawn is and the squares it can go to. I did notice from watching games that Archbishops are especially efficient at annihilating Pawn chains. But this would suggest that only vertically adjacent targets are beneficial. (You also get more of those from B+N (8) than from R+N or R+B (4).) This can easily be tested by comparing the values of 'semi-Chancellors' RvN and RsN, which have the same number of forward moves. It could also be that there is additional value in being able to attack a Pawn, the square it moves to, and a square it protects all at once; the BN footprint is also pretty good for that. It would be interesting to have piece values for Berolina Chess. Piece values are for a large part determined by how efficiently pieces interact with the ubiquitous Pawns, supporting their own and hindering the opponent's; most end-games hinge on that.
Very interesting and logical approach; it makes sense. When it comes to removing Pawns, do you sometimes remove the Pawn from the other side too, to check that the Pawn discrepancy sways the percentage by the same amount on both sides? That is to say, some Pawns' absence might actually lead to an advantageous opening, etc., which would have nothing to do with a "loss of a Pawn" in terms of material counting, right?
I guess I would have to note here that the computer studies don't in fact give a bonus (of, e.g., some fraction of a Pawn) for adding a N to a Q (even though the latter has a B component), since the value given by such studies for the amazon is 12 Pawns as I recall, i.e. no bonus in value whatsoever gained from whatever synergy there is between the N and Q move capabilities.
One thing that impresses me about an archbishop (or a chancellor) is that, on a fairly central square, within a 2-square radius the piece controls 16 squares, the same as a queen, and with the archbishop these come in 4 nice chunks of 4 squares (though on a corner square this piece really suffers, due to the limited scope of both minor piece components). On the other hand, an amazon controls all of these squares, i.e. the full 24, which allows it to deliver mate unaided on an empty board, though it's harder to compare abstractly with the effects of 2 separate pieces (a N and a Q).
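The square counts in that comparison are easy to verify on an empty board. A small sketch (my own illustration), counting attacked squares within a 2-square radius of a centrally placed piece:

```python
# All squares within Chebyshev distance 2 of the piece at the origin.
within2 = {(x, y) for x in range(-2, 3) for y in range(-2, 3)} - {(0, 0)}

knight = {(x, y) for (x, y) in within2 if {abs(x), abs(y)} == {1, 2}}
bishop = {(x, y) for (x, y) in within2 if abs(x) == abs(y)}
rook   = {(x, y) for (x, y) in within2 if x == 0 or y == 0}

archbishop = bishop | knight
chancellor = rook | knight
queen      = bishop | rook
amazon     = queen | knight

print(len(archbishop), len(chancellor), len(queen), len(amazon))
# 16 16 16 24 -- the amazon covers the whole 5x5 neighbourhood
```

So within this radius the archbishop, chancellor, and queen all control 16 squares, while the amazon controls all 24.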
Aside from these particular examples, I have some lingering doubts about the methodology of computer studies (beyond the usual caveats, e.g. what other pieces are on the board, and board size): namely, margin of error, and whether top programs (as opposed to merely average ones) should be used to play each other in the studies (analogous to thinking that better human players produce games with better use of the pieces being tested), though I do recall you had a way of dismissing the latter doubt of mine.
Regarding my doubts about margin of error in these studies: one is whether it should be doubled (there seems to be an assumption that the 'superior' side's piece(s) will always win a larger percentage of the games in a study; however, in theory I think it might be possible for some study results to show the 'weaker' side's piece winning a higher percentage of the games in that particular study). The other doubt is whether the margin of error should somehow be bigger when testing pieces of bigger value, i.e. the margin of error for an archbishop's value perhaps ought to be bigger than that for a bishop's value, when comparing these two pieces to other ones. Thus the margin of error associated with an amazon in a study might be relatively huge, perhaps. Nevertheless, my concerns over margin of error are smaller than my concern with not using top playing programs for such studies.
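The statistical margin of error at issue here can be sketched with the standard binomial approximation (my own illustration, not taken from the studies; it ignores draws, which in practice reduce the variance somewhat). The standard error is symmetric, so it already covers the possibility of the 'weaker' side scoring above 50% by chance:

```python
from math import sqrt

def score_stderr(p, n):
    """Standard error of an observed score fraction p over n games
    (binomial approximation, draws ignored)."""
    return sqrt(p * (1 - p) / n)

n = 200                    # "a couple of hundred games"
p = 0.63                   # observed score of the stronger side
se = score_stderr(p, n)    # about 0.034, i.e. roughly +/- 3.4%

# Using the "18% per Pawn" scale from the Q-vs-A example, one standard
# error corresponds to roughly 0.19 Pawn.
pawn_units = se / 0.18
print(round(se, 3), round(pawn_units, 2))
```

On this rough scale, a 200-game match pins a single piece value down to a couple of tenths of a Pawn per standard error, which is why comparing against several material combinations helps.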
A question for experienced chess software programmers: how do you go about assigning value for a new or unexplored piece? Is there a way to have the engine play itself, assuming different values, to see which values lead to fair games, or do you have to enter your own best guess for non-standard pieces?
Another question: how do you determine the value of the standard chess pieces if the variant has different rules?