Joined: Jun 2020
enthusiast
Originally Posted by Niara
Originally Posted by Alodar
You seem to be suffering from a mild case of pareidolia.

Have you looked at the charts? Please do.

I've taken a sample that included consecutive rolls taken from four different games, each of a sample size of 200; there is, within this sample, a clear and distinct pattern that repeats, for one, and not for any of the others.

That is the evidence presented.

If you wish to disprove what I'm saying, then please, by all means, present your consecutive sample of 10000 rolls and illustrate that the pattern disappears. I do not believe that it will, but currently the onus of proof is on YOU, not me. Do the work, if you want to contribute.

200 is a sufficient sample size for a video game RNG (for an isolated game, at least; something like an MMO would demand more). 1000 would be better, I don't deny, but 200 is sufficient to draw initial conclusions. 10000 is actually excessive in this situation... but since that's the number you pulled out, I eagerly await your documentation and findings.

Yes I've looked at the charts.
And yes you are absolutely seeing patterns that don't exist.


I don't need to disprove what you are saying.
You are making a claim.
The "evidence" you have presented to support your claim is insufficient and therefore your claim is unconvincing and can be dismissed.
If you wish to provide evidence for your claim please, by all means, present your consecutive sample of 10000 rolls and illustrate the pattern.
The burden of proof is on the person making the claim, YOU.

Or you can just make up conspiracy theories based on insufficient data and imagined patterns.

Tey #773786 20/05/21 02:29 PM
Joined: Sep 2020
veteran
10,000 rolls would be way more than needed. All of the google first page results for searching "representative sample size" say that ~400 data points is perfectly fine for 95% confidence of a 5% margin of error.

200 data points (@Niara's sample) is adequate for 7% margin of error or only 85% confidence.
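
For anyone who wants to check those figures, here is a minimal sketch using the usual normal-approximation margin of error for a proportion, with the worst case p = 0.5. The function names are my own, and this only covers whole-sample estimates, not patterns in the order of rolls.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, confidence=0.95, p=0.5):
    """Half-width of the confidence interval for a proportion (normal approximation)."""
    # Two-sided z-score for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p * (1 - p) / n)


def confidence_for_margin(n, margin, p=0.5):
    """Confidence level that a sample of size n supports for a given margin of error."""
    z = margin / sqrt(p * (1 - p) / n)
    return 2 * NormalDist().cdf(z) - 1


if __name__ == "__main__":
    print(f"n=400, 95% confidence: margin = {margin_of_error(400):.3f}")
    print(f"n=200, 95% confidence: margin = {margin_of_error(200):.3f}")
    print(f"n=200, 5% margin: confidence = {confidence_for_margin(200, 0.05):.3f}")
```

With these formulas, 400 points give roughly a 5% margin at 95% confidence, and 200 points give roughly 7% at 95%, or about 84% confidence at a 5% margin.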

Tey #773802 20/05/21 04:40 PM
Joined: Jan 2021
addict
There was also the whole thing about 17 being surprisingly frequent with Larian's default RNG setting. We had two datasets where 17 spiked from different players. The average was ~10 but 17 was keeping the average afloat.

I wish I had kept my default RNG dataset, because the damage dice had similar pattern issues as the d20.

Joined: Jun 2020
enthusiast
Originally Posted by mrfuji3
10,000 rolls would be way more than needed. All of the google first page results for searching "representative sample size" say that ~400 data points is perfectly fine for 95% confidence of a 5% margin of error.

200 data points (@Niara's sample) is adequate for 7% margin of error or only 85% confidence.

Niara is claiming an internal pattern within the data, that somehow each number rolled is dependent on the rolls before it.
It would require many more data points to demonstrate that claim as you are trying to attach significance on mini-runs within the larger sample of data.

Using representative sample size as you are referring to is only meaningful when examining the full data such as with the average of the data points.
According to Niara's own spreadsheet, BG3's dice rolling demonstrates the same randomness as the other games, as the averages on the spreadsheet show (with 85% confidence).

To demonstrate that BG3 sequence has unique internal patterns the person making the claim would need to do a large number, say 10,000 rolls for each game and then show that the patterns that show up in BG3 data don't show up in any other data.
Alternatively they could just make up conspiracy theories based on insufficient data and imagined patterns.

Tey #773815 20/05/21 06:04 PM
Joined: Sep 2020
veteran
Originally Posted by Alodar
Niara is claiming an internal pattern within the data, that somehow each number rolled is dependent on the rolls before it.
It would require many more data points to demonstrate that claim as you are trying to attach significance on mini-runs within the larger sample of data.

Using representative sample size as you are referring to is only meaningful when examining the full data such as with the average of the data points.
According to Niara's own spreadsheet, BG3's dice rolling demonstrates the same randomness as the other games, as the averages on the spreadsheet show (with 85% confidence).

To demonstrate that BG3 sequence has unique internal patterns the person making the claim would need to do a large number, say 10,000 rolls for each game and then show that the patterns that show up in BG3 data don't show up in any other data.
Alternatively they could just make up conspiracy theories based on insufficient data and imagined patterns.
Provide some evidence that 10,000 rolls as opposed to a few hundred would be necessary to identify such a pattern. Either find multiple sources that support your claim or fit the data given by Niara and show that a horizontal line (pure random) is as good a fit as a sine wave. ~500 rolls is enough to statistically determine that a d20 rng is producing non-uniform results (it can detect even a single number appearing too frequently or not frequently enough), so it should also be good enough to detect a pattern in the rolls. In fact, you should need fewer data points to detect a pattern because you're adding extra information: the order of rolls.

Otherwise, I'm going to accept all the sources I've found that claim that a few hundred rolls is sufficient for a good sample size. The clear pattern in Niara's data, combined with my analysis of 508 unweighted rolls and 750 weighted rolls, show that Larian's unweighted rng is not uniform and produces an overabundance of '17's and underabundances of '1's and '6's.
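
A rough sketch of the kind of uniformity check meant here is a chi-square goodness-of-fit test on the face counts of a d20. The function names are hypothetical, the sample data below is constructed for illustration (it is NOT anyone's real roll log), and 30.14 is the standard 5% critical value for 19 degrees of freedom.

```python
from collections import Counter

# Chi-square critical value for 19 degrees of freedom (20 faces minus 1)
# at the 5% significance level.
CHI2_CRIT_DF19_P05 = 30.14


def chi_square_d20(rolls):
    """Chi-square statistic of observed d20 face counts against a uniform die."""
    expected = len(rolls) / 20
    counts = Counter(rolls)
    return sum((counts[face] - expected) ** 2 / expected for face in range(1, 21))


def looks_non_uniform(rolls):
    """True if the sample rejects uniformity at the 5% level."""
    return chi_square_d20(rolls) > CHI2_CRIT_DF19_P05


# Constructed illustration data: 500 perfectly even rolls, and the same set
# with every 1 replaced by a 17 (so 17 shows up 50 times, 1 never).
even_rolls = list(range(1, 21)) * 25
skewed_rolls = [r for r in even_rolls if r != 1] + [17] * 25

if __name__ == "__main__":
    print(chi_square_d20(even_rolls), looks_non_uniform(even_rolls))
    print(chi_square_d20(skewed_rolls), looks_non_uniform(skewed_rolls))
```

Note that this only looks at how often each face appears. It says nothing about the order of the rolls, which needs a separate test.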

Joined: Jun 2020
enthusiast
Originally Posted by mrfuji3
Provide some evidence that 10,000 rolls as opposed to a few hundred would be necessary to identify such a pattern. Either find multiple sources that support your claim or fit the data given by Niara and show that a horizontal line (pure random) is as good a fit as a sine wave. ~500 rolls is enough to statistically determine that a d20 rng is producing non-uniform results (it can detect even a single number appearing too frequently or not frequently enough), so it should also be good enough to detect a pattern in the rolls. In fact, you should need fewer data points to detect a pattern because you're adding extra information: the order of rolls.

You require evidence that large amounts of data allow folks to better assess the significance of the data and avoid errors from testing a small number of possibly atypical samples?
I can point you to any statistics course ever.

I also recommend you look up apophenia.
Apophenia is the tendency to perceive meaningful connections between unrelated things.

Quote
Otherwise, I'm going to accept all the sources I've found that claim that a few hundred rolls is sufficient for a good sample size.



Again, a few hundred rolls is sufficient for determining the average value (85% confidence), not for finding significant patterns within data. (Of course the data does show that BG3 averages are random (85% confidence), but please don't let facts get in your way.)


Quote
The clear pattern in Niara's data, combined with my analysis of 508 unweighted rolls and 750 weighted rolls, show that Larian's unweighted rng is not uniform and produces an overabundance of '17's and underabundances of '1's and '6's.

As I've stated, you are free to make up conspiracy theories based on insufficient data and imagined patterns.
Seems to be a popular activity in the world these days.

Tey #773822 20/05/21 06:41 PM
Joined: Nov 2020
old hand
While 10,000 would be a perfect sample size, I think a couple hundred is considered a decent sample size for an individual doing a test, especially since said sample reveals a pattern. You could easily extrapolate that said pattern will continue, and that is a safe assumption until someone comes along with evidence to disprove it. While not exactly professional, you do that kind of thing in high school and university level statistics courses.
As confrontational as it was, I do believe Niara was correct in saying that if you wanted to prove her data was lacking or unsatisfactory, it is on you to provide the evidence showing that her data is in fact lacking. Saying her sample size is too small gives you some backing, but to actually prove her results wrong you need to present your own results with the size you stated. If you want us to believe there is no pattern in the rolls I genuinely ask you to show us by doing 10,000 rolls that indicate a lack of pattern.
The more data and tests done the better.

Tey #773823 20/05/21 06:59 PM
Joined: Jan 2021
addict
I mean, you could gather your own data set Alodar, rather than reach for hyperbole with apophenia. Everyone who shared a dataset did address it may just be a small sample size.

If you want to argue as staunchly as you are, please provide data.

Tey #773824 20/05/21 07:03 PM
Joined: Mar 2021
member
We can make this simpler.

I too find that some checks are designed to be failed. I've played the game a few times and found that I fail some checks consistently, over and over again, regardless of modifiers, regardless of how many times I try, my rolls are low.

Fine. That's not enough data on its own to determine a pattern in the data. The larger the player base, the more likely it is that odd things will come up for individuals. But that doesn't disprove that there's a pattern either. We're still stuck with a single piece of anecdotal evidence. But you know what we can do from there?

"Hey, everyone. I'm experiencing this weird phenomenon. Is anyone else experiencing this, or am I just unlucky?"

Tey #773831 20/05/21 08:07 PM
Joined: Nov 2020
journeyman
I think people are referring to two different things.
Some folks want to detect a possible anomaly in the whole implementation (be it rng quality, coding error in the implementation, wrong logic etc...). Well fine, why not, good luck though.
Another person tells you that what you see as patterns is not relevant. I think (s)he is right, because this is not the first time that, with an affine/linear congruential generator (likely used here, not certain though), you'll see "patterns" at low numbers of tests and no such things at higher numbers of tests. I don't think (s)he wants to discourage you from showing that the distribution is not uniform and thus suspecting a problem in the game.

TL;DR: elementary statistics won't help you to prove the quality of the RNG itself - how Larian uses this generator is another matter.
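
To illustrate the "patterns at low numbers of tests" point, here is a small sketch that rolls a simulated fair d20 and reports how far the most lopsided face strays from its expected count at two sample sizes. It uses Python's own `random` module (a Mersenne Twister), which is an assumption for illustration only and is not whatever generator the game uses. The helper names are made up.

```python
import random
from collections import Counter


def max_relative_deviation(rolls, sides=20):
    """Largest relative gap between any face's count and the expected count."""
    expected = len(rolls) / sides
    counts = Counter(rolls)
    return max(abs(counts[face] - expected) / expected for face in range(1, sides + 1))


def simulate(n, seed):
    """n rolls of a fair d20 from a seeded generator (illustration only)."""
    rng = random.Random(seed)
    return [rng.randint(1, 20) for _ in range(n)]


small = max_relative_deviation(simulate(200, seed=1))
large = max_relative_deviation(simulate(20000, seed=1))

if __name__ == "__main__":
    print(f"200 rolls:   worst face is off by {small:.0%} of its expected count")
    print(f"20000 rolls: worst face is off by {large:.0%} of its expected count")
```

Even with a die that is fair by construction, the small sample will typically show some face looking badly over- or under-represented, while the large sample smooths out.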

Joined: Jan 2021
addict
Originally Posted by Blacas
I think people are referring to two different things.
Some folks want to detect a possible anomaly in the whole implementation (be it rng quality, coding error in the implementation, wrong logic etc...)
That's not really the case, when I was first sampling RNG from the game I expected it to be normal. We had a thread where a few players were venting about dice rolls, saying that the dice might not truly be random. I wanted to demonstrate that randomness is random and I thought the dice were fine at the time.

The sample I had showed that while the dice met the metrics to be random, they were streaky. Another player's sample showed similar streaks at 17, and Niara's sample showed that low rolls trended with low rolls and vice versa. It was rather unusual that three different players all sampled the base RNG and found oddities.
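
One simple way to put a number on "low rolls trend with low rolls" is the lag-1 autocorrelation of the roll sequence. This is only a sketch with made-up function names and constructed sequences, not anyone's actual data. For independent rolls the value should sit near zero, and as a rule of thumb roughly within ±1.96/sqrt(n) about 95% of the time.

```python
from math import sqrt


def lag1_autocorrelation(rolls):
    """Correlation between each roll and the roll that follows it."""
    n = len(rolls)
    mean = sum(rolls) / n
    num = sum((rolls[i] - mean) * (rolls[i + 1] - mean) for i in range(n - 1))
    den = sum((r - mean) ** 2 for r in rolls)
    return num / den


def independence_band(n, z=1.96):
    """Approximate 95% band for the lag-1 autocorrelation of independent rolls."""
    return z / sqrt(n)


# Constructed extremes, for illustration only.
streaky = [3] * 100 + [18] * 100   # lows follow lows, highs follow highs
alternating = [1, 20] * 100        # every low is followed by a high

if __name__ == "__main__":
    print(f"streaky:     {lag1_autocorrelation(streaky):+.3f}")
    print(f"alternating: {lag1_autocorrelation(alternating):+.3f}")
    print(f"band for n=200: +/-{independence_band(200):.3f}")
```

A real 200-roll sample with a value clearly outside the band would be worth a closer look; a value inside it is what plain randomness tends to give.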

Larian has also changed the RNG twice. The first time it was great: the dice were random and not overly streaky. Now the dice are biased towards high rolls. So something is odd about the base game dice. The initial patch 4 "weighted dice" were really well done though. I'd like to have them back in the game at some point.

Tey #774018 22/05/21 07:05 PM
Joined: Feb 2021
Location: Alaska
member
I don't know math or statistics, but I have noticed a change in the sleight of hand success/failure rate from patch 3 (seemed appropriate) to patch 4 (both variants, with "loaded" and unloaded dice), which now seems really messed up. My success rate on picking locks with characters with +5 (from Dex and proficiency) has dropped dramatically. I really wish I could find a way to log and parse the data to see what is really going on. But playing a thief, it has become painful to do any sleight of hand operations, as I often need to save scum: I am getting up to 9 failures in a row, burning 18 lock-pick kits (each attempt using 2 lock-pick kits, so that's another bug). That's the longest attempt string I have tried thus far. Patch 3 did not have these issues, so the "fix" actually broke something in regards to lock picking.


EDIT: Did testing on 5-23-2021 and the recent patch seemed to help starting out a new thief play-through. I shall see how it goes further on.

Last edited by Aazo; 24/05/21 03:14 AM.
Joined: Oct 2020
journeyman
Originally Posted by Droata
If you ask 1024 people to flip a coin, and all those who get heads stick around to flip again, you would expect to have 512 people flipping a second time. And 256 flipping a third, and 128 flipping a 4th, and 64 flipping a 5th and 32 flipping a 6th, and 16 flipping a 7th, and 8 flipping an 8th and 4 flipping a 9th and 2 flipping a 10th, and 1 person out of 1024 is expected to flip 10 heads in a row.

To reiterate, if 1 person out of 1024 people flips heads 10 times in a row, that doesn't indicate that they have a weighted coin. That is an entirely expected outcome consistent with a 50% probability of getting heads or tails on any given flip.

Baldur's Gate 3 has sold over 1 million copies. Do you think out of 1 million people it would be expected or unexpected that one of them would roll low 9 times in a row?
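
The arithmetic behind that argument can be sketched in a few lines. The assumptions are mine and deliberately crude: "rolling low" is treated as a 50% event per roll, and each player is counted as making just one run of rolls. The function names are hypothetical.

```python
def expected_streakers(people, p_event, streak_length):
    """Expected number of people whose run is entirely made of the given event."""
    return people * p_event ** streak_length


def prob_at_least_one(people, p_event, streak_length):
    """Chance that at least one person out of the group gets the full streak."""
    return 1 - (1 - p_event ** streak_length) ** people


if __name__ == "__main__":
    # Coin example: 1024 people, ten heads in a row.
    print(expected_streakers(1024, 0.5, 10))
    print(round(prob_at_least_one(1024, 0.5, 10), 3))
    # Dice example: 1,000,000 players, nine "low" rolls in a row (assumed p = 0.5).
    print(expected_streakers(1_000_000, 0.5, 9))
    print(prob_at_least_one(1_000_000, 0.5, 9))
```

Under those assumptions you would expect about 1 person in the coin group and nearly two thousand players in the dice group to hit the streak, so a single report of nine lows in a row is not evidence of a rigged die by itself.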

this is a good example

I would like to see the function used and generate tens of thousands of results to analyse.

But I can't imagine rolling dice and reloading the game over and over. What is a good method of gathering dice rolls?

Last edited by Tav3245234325325; 23/05/21 05:35 AM.
#774075 23/05/21 06:19 AM
Joined: Jan 2021
addict
Originally Posted by <Redacted>
<Redacted>
Well we do have that for the "weighted dice" from the start of patch 4.

The good news is that the patch 4 "weighted dice" had a nice random distribution. (they were weighted to reduce streaks, the base RNG had a propensity to trend/streak).

The bad news is that a hot-fix replaced that set of "weighted dice" with dice that are allowed to streak on high rolls.

So now we get to choose between streaky dice and rigged dice.

Here is a post from a thread on the base-game RNG links are great.

Last edited by Raze; 16/03/22 09:23 AM. Reason: deleted forum account
Joined: Oct 2020
Location: Liberec
veteran
Originally Posted by Tav3245234325325
what is a good method of gathering dice rolls?
Datamine random number generator, and then run it separately.


If my comments bother you, there is nothing easier than telling me to stop.
I mean ... I won't ... but it's easy to say. wink