H. G. Muller wrote on Mon, Mar 11 09:59 PM UTC in reply to Kevin Pacey from 07:11 PM:

yet you're going right ahead yourself and saying 2700+ play is different

No, I said that if it was different the 2700+ result would not be of interest, while if they are the same it would be stupid to measure it at 2700+ while it is orders of magnitude easier around 2000. Whether less accurate play would give different results has to be tested. Below some level the games will no longer have any reality value, e.g. you could not expect correct values from a random mover.

So what you do is investigate how results for a few test cases depend on Time Control, starting at a TC where the engine plays at the level you are aiming for, and then reducing the time (and thus the level of play) to see where that starts to matter. With Fairy-Max as engine there turned out to be no change in results until the TC dropped below 40 moves/min. Examining the games also showed the cause of that: many games that could be easily won ended in draws because it was no longer searching deep enough to see that its passers could promote. So I conducted the tests at 40 moves/2min, where the play did not appear to suffer from any unnatural behavior.

You make it sound like it is my fault that you make so many false statements that need correcting...

Betza was actually right: the hand that wields the piece can have an effect on the empirical value. This is why I preferred to do the tests with Fairy-Max, which is basically a knowledge-less engine, which would treat all pieces on an equal basis. If you would use an engine that has advanced knowledge for, say, how to best position a piece w.r.t. the Pawn chain for some pieces and not for others, it would become an unfair comparison. And you can definitely make a piece worth less by encouraging very bad handling. E.g. if I would give a large positional bonus for having Knights in the corners, Knights would become almost useless at low search depth. It would never use them. If you tell it a Queen is worth less than a Pawn, the side that starts with a Queen instead of a Rook would lose badly, as it would quickly trade Q for P and be R vs P behind.

The point is that the detrimental behavior that is encouraged here can never be stopped by the opponent. Small misconceptions tend to cancel out. E.g. if you would have told the engine that a Bishop pair is worth less than a pair of Knights, the player with the Knights would avoid trading the Knights for Bishops, which is not much more difficult than avoiding the reverse trades, as the values are close. So it won't affect how often the imbalance will be traded away, and while it lasts, the Bishops will do more damage than the Knights, because the Bishop pair in truth is stronger. But there is no way you can prevent the opponent sacrificing his Queen for a Pawn, even if you have the misconception that the Pawn was worth more.

Note that large search depth tends to correct strategic misconceptions, because it brings the tactical consequences of strategic mistakes within the horizon. Wrecking your Pawn structure will eventually lead to forced loss of a Pawn, so the engine would avoid a wrecked Pawn structure even if it has no clue how to evaluate Pawn structures. Just because it doesn't want to lose a Pawn.

Statistical margins of error are high-school stuff. For N independent games the typical deviation of the result from the true probability will be the typical deviation of a single game result from the average, divided by the square root of N. (The single-game deviation is about 0.4 rather than 0.5, because not all games end in a 0 or 1 score.) So the typical deviation of the score percentage in a test of N games is 40%/sqrt(N). Having to calculate a square root isn't really advanced mathematics.
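
As a rough illustration of the 40%/sqrt(N) rule, here is a minimal Python sketch. The function names are made up for this example, and the 32% win / 36% draw / 32% loss split is only an assumption, picked because it gives a single-game deviation of exactly 0.4; real draw rates depend on the engine and time control.

```python
import math

def single_game_sd(p_win, p_draw):
    """Standard deviation of one game's score (win=1, draw=0.5, loss=0)."""
    p_loss = 1.0 - p_win - p_draw
    mean = p_win + 0.5 * p_draw
    variance = (p_win * (1.0 - mean) ** 2
                + p_draw * (0.5 - mean) ** 2
                + p_loss * (0.0 - mean) ** 2)
    return math.sqrt(variance)

def score_std_error(n_games, sd_single=0.4):
    """Typical deviation of the score fraction over n_games independent games."""
    return sd_single / math.sqrt(n_games)

# Assumed outcome probabilities (illustrative only, not measured data).
sd = single_game_sd(p_win=0.32, p_draw=0.36)

for n in (100, 400, 1600):
    print(f"N={n:5d}  typical deviation = {100 * score_std_error(n, sd):.1f}%")
```

Because the error only shrinks with the square root of N, halving it takes four times as many games.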

