MAXEM: A New Pseudo-Random Number Generating Algorithm with Implementation of Modulo Arithmetic, XORShift, and Entropy Modulation

Ramit Sadhukhan; Soubhik Chakraborty

doi:10.12688/f1000research.165637.1

Home Browse MAXEM: A New Pseudo-Random Number Generating Algorithm with Implementation...

ALL Metrics

-

Views

-

Downloads

Get PDF

Get XML

Export

▬

✚

Research Article

MAXEM: A New Pseudo-Random Number Generating Algorithm with Implementation of Modulo Arithmetic, XORShift, and Entropy Modulation

[version 1; peer review: 1 not approved]

Ramit Sadhukhan¹, Soubhik Chakraborty ²

PUBLISHED 15 Jul 2025

Author details Author details

¹ Mathematics, Birla Institute of Technology, Ranchi, Jharkhand, 835215, India
² Mathematics, Birla Institute of Technology, Ranchi, Jharkhand, 835215, India

Ramit Sadhukhan
Roles: Conceptualization, Data Curation, Formal Analysis, Writing – Original Draft Preparation

Soubhik Chakraborty
Roles: Conceptualization, Supervision, Validation, Writing – Review & Editing

OPEN PEER REVIEW

REVIEWER STATUS

This article is included in the Computational Modelling and Numerical Aspects in Engineering collection.

Abstract

This paper introduces a novel pseudo-random number generation algorithm incorporating modulo arithmetic, XORShift, and entropy modulation. The method is designed to enhance randomness and unpredictability by utilizing bitwise manipulations and dynamic entropy modulation. Experimental results demonstrate the algorithm’s performance across various test sets. The algorithm seeks to resolve two of the most common problems of common PRNGs, which are the quality of randomness with selection of an arbitrary seed and the presence of periodic cycles. The findings suggest potential applications in cryptography and statistical simulations.

Keywords

Keywords: pseudo-random numbers, PRNG, MAXEM, random number generator, run test for randomness Mathematics subject classification: 62P99

Corresponding author: Soubhik Chakraborty

Competing interests: No competing interests were disclosed.

Grant information: The author(s) declared that no grants were involved in supporting this work.

Copyright: © 2025 Sadhukhan R and Chakraborty S. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Sadhukhan R and Chakraborty S. MAXEM: A New Pseudo-Random Number Generating Algorithm with Implementation of Modulo Arithmetic, XORShift, and Entropy Modulation [version 1; peer review: 1 not approved]. F1000Research 2025, 14:691 (https://doi.org/10.12688/f1000research.165637.1) First published: 15 Jul 2025, 14:691 (https://doi.org/10.12688/f1000research.165637.1) Latest published: 15 Jul 2025, 14:691 (https://doi.org/10.12688/f1000research.165637.1)

Introduction

By the law of statistical regularity, a sample can be thought of as a good representation of the population only when it is selected randomly.¹ The significance of randomness in sampling lies at the heart of various applications in statistics, computer science, and cryptography. Pseudo-random number generators (PRNGs) serve as a crucial tool in generating sequences of numbers that approximate the properties of random samples, enabling robust simulations and analyses across diverse fields.

Donald Knuth, in his work, The Art of Computer Programming,² emphasizes the significance of pseudo-random number generators (PRNGs) in computational processes. PRNGs are essential for simulating randomness in algorithms, particularly in areas such as cryptography, statistical sampling, and simulations. Their efficiency and unpredictability are crucial for ensuring that algorithms perform effectively under varied conditions and for testing hypotheses under controlled “random” scenarios.

A lot of research has been carried out to come up with efficient and reliable PRNGs has led to numerous advancements over the years. Traditional algorithms, such as the Linear Congruential Generator (LCG), are straightforward to implement and exhibit constant-time performance. However, they often fall short in terms of statistical quality and unpredictability, leading to correlations that can skew results.² On the other hand, the Mersenne Twister, introduced by Matsumoto and Nishimura,³ boasts a remarkable period and strong equidistribution properties, making it a popular choice for various applications despite its relative complexity. Recent advancements in the cryptographic applications of PRNGs emphasize the role of secure random number generators in preventing vulnerabilities like side-channel attacks.⁴

In a more recent context, the second author has significantly advanced the understanding of randomness through his research. In a prior paper “On Why and What of Randomness,”⁵ he explores the fundamental nature and necessity of randomness in various computational processes. His work emphasizes how randomness is not only vital for ensuring unpredictability in algorithms but also for enhancing the overall reliability of systems that depend on random number generation. By addressing the characteristics and sources of randomness, his research highlights its importance in mitigating risks associated with predictability and biases in computational settings.

This paper proposes a novel PRNG algorithm that utilizes modular arithmetic, bitwise operations, and an innovative approach to entropy modulation. By incorporating prime-based entropy, this algorithm aims to enhance unpredictability and reduce periodic cycle formation. The algorithm incorporates techniques of Modulo Arithmetic, XORShift and Entropy Modulation, and is thus called MAXEM.

Methodology

In the domain of random number generators by the means of algorithms, also called Pseudo-Random Number Generators (PRNGs),² wide gaps are left within the algorithms in terms of their distributions or in the method of their generation. The MAXEM algorithm proposed hereafter works on fulfilling some of these gaps, such as negating the possibility of any periodiic cycle formation in a sequence, or having the ability to choose any arbitrary number as a seed without affecting the quality of randomness of the generated sequence.

Selection of a seed

As observed by Berezowski,⁶ the distribution of prime numbers exhibits chaotic behavior, making their prediction and arrangement highly non-trivial. This was the first thought for selection of a proper seed to generate a sequence of pseudo-random numbers. A seed can be selected as an arbitrary prime number whose magnitude depends on the magnitude of the random numbers that need to be generated. Therefore, we define the seed to be:

p_{0} = select_closest_next_prime (random_integer (100, max (1000, 2^{⌊ \frac{m}{4} ⌋})))

Here, $m$ is the number of bits used to represent the integers in the sequence. If $m = 32$ , then the numbers in the sequence will be bounded by $2^{m}$ , meaning they can take values between $0$ and $2^{32} - 1$ (for 32-bit integers). Thus, this function takes an arbitrary integer value between 100 and the maximum value between $2^{⌊ \frac{m}{4} ⌋}$ and 1000, and chooses the closest prime number greater than that integer, which gives a wide range of prime numbers for the algorithm to choose from. We use $2^{⌊ \frac{m}{4} ⌋}$ as an upper limit to make the initial seed comparatively lower in magnitude to the largest magnitude of random number that the algorithm can generate, as shall be discussed later.

Initial ideas

The initial idea for developing this algorithm was to reduce, or, if possible, remove the repeating periodic cycles that appeared in the use of LCG (Linear Congruential Generators),⁷ one of the most primitive algorithms to generate pseudo-random numbers. This led to the thought of using a non-constant entropy term which would oscillate between increasing and decreasing the current number that is being generated.

b = {(- 1)}^{i} \cdot i \cdot p_{i - 1}

Here, one would notice the presence of the term $p_{i - 1}$ , which speaks a lot about our algorithm. We do not simply discard the initial seed after generating the first term of the sequence. We keep our notion of the distribution of prime numbers being chaotic in nature,⁶ and hence use a term $p_{i}$ which is the next prime number after $p_{i - 1}$ . Thus, $p_{1}$ would be the next prime number after $p_{0}$ , $p_{2}$ would be the next prime number after $p_{1}$ and so on.

The multiplier

This, again, is a very simple idea, inspired from LCG⁷ where the first component of the algorithm is of the form $x_{i} = f (x_{i - 1})$ , where the previous term is simply multiplied by a constant. We can modify it so that the term has a bit more randomness in the following way: $x_{i} = f (x_{i - 1}, p_{i - 1})$ , where the function describes the multiplication of the two terms from its previous iteration.

XOR operation and Bit-Shift

It is important to mention that every term that the algorithm generates is always limited to $m$ bits. This is achieved by taking a $mod 2^{m}$ operation on every term that is generated. This gives us the advantage of limiting the number of bits to the necessary amount, i.e., $m$ , as per the use case. In order to add an extra layer of complexity, with the intention to further reduce the chance of periodic cycle formation in the sequence, a bit-wise shifting operation is introduced. The value of this shift itself is not predetermined and depends on a modulus of the iteration counter such that,

shift_amount = imod ⌊ \frac{m}{2} ⌋

This ensures that the number of bits shifted always falls in the range $[0, \frac{m}{2} - 1]$ . Thus, the ${(i - 1)}^{th}$ term is right-shifted according to the above idea and can be represented as follows:

x_{i - 1} ≫ (imod ⌊ \frac{m}{2} ⌋) \equiv (\frac{x_{i - 1}}{2^{(imod ⌊ \frac{m}{2} ⌋)}})

Since $x_{i - 1}$ is of m bits, the term produced above is of m-bits. Again, on applying a $modulus$ on the multiplier term, we can also guarantee it to have the same number of bits. With this knowledge, we can easily apply an ‘ExclusiveOR’ or ‘XOR’ operation on these two terms to get a fairly convoluted string of bits. Marsaglia stated that in the context of random number generation, the use of XOR operations has been shown to enhance the efficiency and unpredictability of pseudo-random number generators. As demonstrated by him in 2003,⁸ XORShift random number generators utilize bit-wise XOR operations to produce sequences of numbers that exhibit desirable statistical properties, making them suitable for a variety of applications. Therefore, these operations can be combined to form the whole m-bit term as follows:

x_{i} = ((f (x_{i - 1}, p_{i - 1}) mod 2^{m}) \oplus (\frac{x_{i - 1}}{2^{(imod ⌊ \frac{m}{2} ⌋)}}))

The XOR operation is mathematically represented by the $\oplus$ symbol.

The MAXEM algorithm

The final output sequence at each step $i$ , denoted by $x_{i}$ , is generated using a combination of Modular Arithmetic, bit-wise manipulation (XORshift), and Entropy Modulation as is a culmination of the ideas discussed above and can be represented as follows:

x_{i} = (((x_{i - 1} \cdot p_{i - 1}) mod 2^{m}) \oplus (\frac{x_{i - 1}}{2^{(imod ⌊ \frac{m}{2} ⌋)}}) + {(- 1)}^{i} \cdot i \cdot p_{i - 1}) mod 2^{m}

Where:

• $i$ denotes the iteration counter,
• $x_{i - 1}$ is the previous number in the sequence, i.e., the term generated at iteration $i - 1$ ,
• $p_{i - 1}$ is the prime seed at iteration $i - 1$ ,
• $\oplus$ denotes the bitwise XOR operation,
• $mod 2^{m}$ ensures the number stays within an $m$ -bit range,
• $\frac{x_{i - 1}}{2^{(imod (m - 1))}}$ is a bitwise shift operation,
• ${(- 1)}^{i}$ introduces alternating addition and subtraction to the entropy term,
• $i \cdot p_{i - 1}$ adds entropy based on the iteration and prime seed.

The algorithm for this can be given as follows:

seed $\leftarrow$ next prime from random number in range $[100, max (1000, 2^{\frac{m}{4}})]$ Initialize sequence as an empty list current $\leftarrow$ seed

next_num $\leftarrow$ (current $\times$ seed) $mod 2^{m}$ next_num $\leftarrow$ next_num $\oplus$ (current $≫ (imod \frac{m}{2}))$ next_num $\leftarrow$ (next_num + ${(- 1)}^{i} \times i \times seed$ ) $mod 2^{m}$ Append next_num to sequence seed $\leftarrow$ next prime after seed current $\leftarrow$ next_num

return sequence

Running the algorithm for an example test case

Python language⁹ was used to code this algorithm due to the simplicity of the language and the wide range of libraries suited for scientific computing. Additionally, Google Colab, which has proven to be a valuable platform for running Python code in a cloud environment in the form of notebooks, enabling easy collaboration and access to powerful computational resources without the need for local hardware,¹⁰ was used as the primary development environment. The allocated runtime came with a system RAM of 12.7 GB, a disk memory of 107.7 GB and no dedicated memory for graphics processing. In this section, we discuss a test run of the above algorithm with the following parameters:

n = 1000000; m = 128

Figure 1 shows the distribution of random numbers generated as a scatter plot with the index values of the random number sequence on the x-axis and the value of the random numbers on the y-axis. It is quite evident that this scatter plot shows a discrete uniform distribution of random numbers in the range $0 to 2^{m}$ , given, $2^{m} \approx 3.402 \times 10^{38}$ .

Figure 1. Plot showing uniformly distributed random numbers of the MAXEM Algorithm.

The mean, variance, standard deviation, median, minimum, maximum, and range of the generated numbers are summarized in Table 1.

Table 1. Descriptive statistics of generated random numbers.

Statistic	Value
Mean	$1.700836445315122 \times 10^{38}$
Error in Mean	$- 0.03381834297068176 %$
Variance	$9.662277875992937 \times 10^{75}$
Standard Deviation	$9.829688640029723 \times 10^{37}$
Median	$1.7003589590683976 \times 10^{38}$
Minimum	$6850768168405461780$
Maximum	$3.402823437297518 \times 10^{38}$
Range	$3.402823437297518 \times 10^{38}$
25th Percentile	$8.495850037229687 \times 10^{37}$
75th Percentile	$2.5531358993801777 \times 10^{38}$
Interquartile Range (IQR)	$1.703550895657209 \times 10^{38}$

The population mean of a discrete uniform distribution can be calculated using the formula:

\bar{x} = \frac{n + 1}{2}

For the value of $n$ used in our simulations, this results in an approximate mean of $1.7014118346046923 \times 10^{38}$ . The error in mean calculated for our chosen sample size is:

Error in Mean = - 0.03381834297068176 %

On a broader scale, the results from a large number of simulations conducted using this algorithm indicate that the error in mean consistently remains below $\pm 0.21 %$ . This demonstrates the reliability and accuracy of the algorithm in generating random numbers that closely adhere to the expected mean of the underlying distribution.

Testing and Results

The test for the quality of the algorithm and the randomness of the sequence thus generated can be conducted under four different parts, each checking a different aspect of the sequence. From the example above ( section 3), here is the sequence of random numbers that were obtained:

\begin{matrix} 6850768168405461780 \\ 17931181564508769381107265929 \\ 46933024901820254982869564459716412279 \\ 321994471083130446124683672239386547930 \\ 123048090386001761310298247717400910747 \\ 190488107609783521857697426807763758782 \\ 16617893273582357041392761262326671242 \\ ⋮ \end{matrix}

Choice of arbitrary seed

The algorithm itself is defined in a way to function well with an arbitrary seed. The choice of seed as mentioned in equation ([eq:seed]) shows that this algorithm trivially has the notion of an arbitrary seed selection system, thus, freeing it from the problem of selecting seeds that do not yield good results. It is important to note that the algorithm always selects a prime number as its seed, by definition, even if a non-prime integer is given as an input. In case a non-prime integer is given as an input, the algorithm selects the closest prime number greater than that integer as the seed $p_{0}$ . The range of $[100, max (1000, 2^{⌊ \frac{m}{4} ⌋})]$ has only been introduced as a convention to reduce the computational overhead of just selecting a seed. The only constraint that we’ve faced is when the value of the seed $p_{0}$ is very small compared to $2^{m}$ . In this case, the first few numbers of the generated sequence are also smaller values and thus shift away from the notion of randomness in the whole distribution. In such a case, the lower bound of the range can be chosen to be a higher number masted on the magnitude of $m$ .

Occurrence of periodic cycles

A common problem of PRNG’s are the occurrence of periodic cycles in the sequence. Since the algorithms are deterministic, once a random number is generated that is equal to some previous number in the sequence, the whole algorithm starts repeating itself, or atleast has a chance to do so. The is the most common problem for using LCG’s,⁷ where one must be really careful in the selection of seeds and parameters for maximal cycle formation. However, in MAXEM, the possibility of a periodic cycle formation has simply been eliminated. The same number may be generated more than once, but the scenario will not lead to the repetition of the whole sequence. We can see that, from equation ([eq:entropy]), the term $x_{i}$ depends directly on the product of the terms $i$ and $p_{i - 1}$ . It is going to be highly unlikely for $bmod 2^{m}$ [equation([eq:entropy])] to generate the same term as it had before, because both $i$ and $p_{i}$ are changing. Thus, the chances that the whole sequence repeats is almost equal to null. This has also been verified by simulation after running the algorithm many times, with not a single case of formation of periodic cycles.

Test for randomness with run test

As discussed by Ross in Introductory Statistics (Third Edition),¹¹ non-parametric hypothesis tests, such as the Wald Wolfowitz run test for randomness, are useful when no specific distribution is assumed. The run test evaluates the randomness of a sequence by analyzing the occurrence of ‘runs’—sequences of consecutive identical elements. A significant deviation in the number of runs from what is expected under randomness suggests the sequence may not be random. The Run test has been carried out on the whole sequence with the following hypothesis:

$H_{0} : The given sequence is random, H_{1} : The given sequence is not random$

We have failed to reject the Null Hypothesis ( $H_{0}$ ) for every single time the algorithm has been run, thus concluding that the sequence thus generated by MAXEM can be considered random.

From the example above ( section 3), here is the sequence of letters that were generated and considered for the run test:

[l, l, l, m, l, m, l, m, m, l, m, l, l, l, m, l, m, m, l, m, l, m, m, m, m, l, l, l, m, m, \dots]

Here, $l : x_{i} < Median ({x_{n}}) and m : x_{i} \geq Median ({x_{n}})$ . We can see that the sequence has neither way too many runs, nor way too less of them, suggesting that there is very little chance of patterns forming in the sequence. If $u$ be the total number of runs in a sequence and $n_{l} and n_{m}$ be the number of $l$ ’s and $m$ ’s in the sequence respectively, we can calculate:

μ (u) = \frac{n + 2 n_{l} \cdot n_{m}}{n}

σ {(u)}^{2} = \frac{[μ (u) - 1] [μ (u) - 2]}{n - 1}

Z = \frac{u - μ (u)}{σ (u)}

where

μ (u)

is the expected number of runs and

σ (u)

is the standard error. If the Z-score(

Z

) is such that

| Z | \leq 1.96

(taking a 5% significance level of the standard normal variate), then we accept the Null Hypothesis (

H_{0}

), else reject it.

Test for local randomness

Martin-Löf¹² introduced a formal way to define random sequences using algorithmic complexity. While a sequence might be considered random as a whole, this doesn’t mean that every subsequence will also be random. The randomness of a full sequence doesn’t carry over automatically to its parts, as a subsequence might still follow some patterns depending on how it’s selected.

With this in mind, we tested the sequence generated by MAXEM following a simple algorithm as given below.

Choose an initial value for len Choose an arbitrary number $y \in [0, n - len]$ Select subsequence of length len starting at position $y$ Conduct a run test on each subsequence. Let $z$ = number of times the run test is satisfied Compute $success_percentag e_{len} = \frac{z}{1000} \times 100$ Increase len gradually, starting with small intervals and increasing the gap for larger values.

Figure 2 shows the results of local randomness test for $len$ values like 30, 40, 50, 100, 150, 200, 400,…, upto 20000. The high success results prove how the sequence generated by MAXEM is not only random globally, but also locally.

Figure 2. Local randomness testing with subsequences of various lengths.

These analyses indicate that MAXEM provides a high level of randomness with good scalability. Benchmarks like TestU01, designed by L’Ecuyer and Simard,¹³ provide a comprehensive framework to evaluate PRNG performance, and can be used to conduct further analyses on the robustness of the MAXEM algorithm.

Conclusion

The proposed PRNG algorithm suggests a new way of creating pseudo-random numbers through the use of a combination of modular arithmetic, bitwise operations, and entropy modulation. This section summarizes the main features of the algorithm, its capabilities, and its pros and cons.

Performance and complexity

The worst-case time complexity of the algorithm is primarily dominated by the modular multiplication and can be given as $O (n \cdot mlogm)$ if faster and more optimized multiplication such as Karatsuba’s algorithm¹⁴ is used. This algorithmic complexity does not take into account the complexity of the separate function that finds the nearest prime number for the seed, as it can be (and should be) done separately or by parallel methods. We can see that this is computationally more expensive that other common PRNGs succh as the Mersenne Twister algorithm.³ However, the trade-off is justified by increased entropy and unpredictability.

The space complexity of MAXEM remains to be $O (m)$ , as is the case with most other common PRNGs.

Advantages

The major goals of the MAXEM algorithm are to negate the formation of periodic cycles in a sequence of random numbers and also preserve the quality (randomness) of the sequence with the selection of an arbitrary seed. This makes MAXEM stand apart from the other traditional PRNGs like the Linear Congruential Generator.²

This algorithm, given a high value of $m$ , also has great security. Due to the intrinsic features of this algorithm, reverse engineering this without a good guess on the final seed would be difficult as one would need to guess both the iteration number and the seed, given that the seed is a prime number of x bits, where x is quite large. This task is not impossible but quite arduous. Hence, this algorithm can also be great for use in cryptography¹⁵ and other security measures.

The ability to adjust the bit range $2^{m}$ offers significant flexibility. By selecting appropriate values for $m$ , the generator can be tuned for different use cases, whether it’s cryptographic applications or scientific simulations.

Limitations

Despite the advantages of the MAXEM algorithm, the inherent high computational complexity can be a limiting factor and make the generation of sequences for large values of $m and n$ very slow to the extent that it becomes unsuitable for real-time applications. However, taking advantage of these modern techniques such as parallel computing can significantly help in the process of improving the efficiency of the system without a sacrifice of quality.

Future directions

The MAXEM algorithm has its own perks and limitations. Here listed are a few directions in which research can be conducted to contribute to MAXEM, making it even more powerful and efficient:

• MAXEM can be modified by combining it with other PRNGs with better computational complexity to balance its speed with its quality. A hybrid approach might help to create an even stronger algorithm that combines the strength of the other PRNGs.
• Analysing the algorithm for its resistance to common cryptanalytic attacks such as brute-force or statistical attacks, with formal proofs, can help to establish the viability of the algorithm for cryptographic applications.
• Given the computational expense of modular arithmetic, parallelizing the algorithm could improve its performance in generating large sequences. Research into parallel or distributed implementations of the algorithm could make it more scalable.

Declaration of data availability

The authors further declare that the data used in the work was not obtained from any external sources. All data supporting the results in this study were generated by the authors using the MAXEM algorithm, using their own code. The full dataset (random sequences, statistical measures, and plots) and the implementation code are publicly available on Zenodo at: https://doi.org/10.5281/zenodo.15631843.¹⁶

Data are available under the terms of the Creative Commons Attribution 4.0 International license (CC-BY 4.0).

Ethical declarations

The authors also declare that this study does not involve any human or animal participants, data, or tissue.

References

1. Neyman J: On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. J. Am. Stat. Assoc. 1934; 29(1): 538–539.
2. Knuth DE: The art of computer programming, volume 1: Fundamental algorithms. Massachusetts: Addison-Wesley; 3rd ed. 1997.
3. Matsumoto M, Nishimura T: Mersenne twister: A 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation. 1998; 8(1): 3–30. Publisher Full Text
4. Ferguson N, Schneier B: Practical cryptography. Wiley Publishing; 2004.
5. Chakraborty S: On why and what of randomness. ArXiv. 2009; abs/0902.1232. Reference Source
6. Berezowski M: Chaotic distribution of prime numbers and digits of π. SSRN Electron. J. 2019. Publisher Full Text
7. Tezuka S: Linear congruential generators. Uniform random numbers. Boston, MA: Springer; 1995; vol. 315. .
8. Marsaglia G: Xorshift RNGs. J. Stat. Softw. 2003; 8(1): 14. Publisher Full Text
9. Python Software Foundation: Python language reference.2024. Reference Source
10. Google: Google colaboratory.2024. Reference Source
11. Ross SM: Nonparametric hypotheses tests. Introductory statistics. Boston: Academic Press; Third 2010; pp. 647–697.
12. Martin-Löf P: The definition of random sequences. Inf. Control. 1966; 9(6): 602–619. Publisher Full Text
13. L’Ecuyer P, Simard R: TestU01: A c library for empirical testing of random number generators. ACM Transactions on Mathematical Software (TOMS). 2007; 33(4): 1–40. Publisher Full Text
14. Karatsuba A, Ofman Y: Multiplication of many-digital numbers by automatic computers. Dokl. Akad. Nauk SSSR. 1962; 145(2): 293–294. Reference Source
15. Goldreich O: On the foundations of modern cryptography. Advances in cryptology — CRYPTO’97. 1997; pp. 46–74. Publisher Full Text
16. Sadhukhan R, Chakrabory S: MAXEM: A New Pseudo-Random Number Generating Algorithm with Implementation of Modulo Arithmetic, XORShift, and Entropy Modulation.2025. Publisher Full Text

Comments on this article Comments (0)

Version 1

VERSION 1 PUBLISHED 15 Jul 2025

Author details Author details

¹ Mathematics, Birla Institute of Technology, Ranchi, Jharkhand, 835215, India
² Mathematics, Birla Institute of Technology, Ranchi, Jharkhand, 835215, India

Ramit Sadhukhan
Roles: Conceptualization, Data Curation, Formal Analysis, Writing – Original Draft Preparation

Soubhik Chakraborty
Roles: Conceptualization, Supervision, Validation, Writing – Review & Editing

Competing interests

No competing interests were disclosed.

Grant information

The author(s) declared that no grants were involved in supporting this work.

Article Versions (1)

version 1

Published: 15 Jul 2025, 14:691

https://doi.org/10.12688/f1000research.165637.1

Copyright

© 2025 Sadhukhan R and Chakraborty S. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Download

Export To

metrics

	Views	Downloads
F1000Research	-	-
PubMed Central Data from PMC are received and updated monthly.	-	-

Citations

0

SEE MORE DETAILS

CITE

how to cite this article

Sadhukhan R and Chakraborty S. MAXEM: A New Pseudo-Random Number Generating Algorithm with Implementation of Modulo Arithmetic, XORShift, and Entropy Modulation [version 1; peer review: 1 not approved]. F1000Research 2025, 14:691 (https://doi.org/10.12688/f1000research.165637.1)

NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.

track

receive updates on this article

Track an article to receive email alerts on any updates to this article.

Open Peer Review

Version 1

VERSION 1

PUBLISHED 15 Jul 2025

Views

5

Reviewer Report 01 Sep 2025

Pierre L'Ecuyer, Université de Montréal, Édouard-Montpetit, Montreal, Canada

Not Approved

https://doi.org/10.5256/f1000research.182325.r406145

General evaluation

The authors propose a new nonperiodic RNG whose state at step i can be seen as the triple (i, x_i, p_i) where x_i is a m-bit integer and p_i is the first prime number larger ... Continue reading

General evaluation

The authors propose a new nonperiodic RNG whose state at step i can be seen as the triple (i, x_i, p_i) where x_i is a m-bit integer and p_i is the first prime number larger than p_i−1. The first prime number p_0 is selected “at random” and x_i is defined via a slightly complicated function of (i − 1, x_i−1, p_i−1). Note that the size of the state increases with i and has no upper bound, because i and p_i increase without bound. The construction is purely heuristic, in the sense that there is no mathematical proof of uniformity or independence of the output values. For empirical statistical testing, the authors only report a simple visual plot for pairs of numbers and a run test with a small sample size. At a minimum, they should apply more stringent batteries of test like Crush and BigCrush of TestU01. Finally, it seems to me that this RNG will be extremely slow, since it requires finding a new large prime number at each step. The fast RNGs used for simulation nowadays can generate nearly one billion U(0, 1) random numbers per second on a single CPU. The authors should provide comparisons of their RNG with that. For recent tutorials/surveys on RNGs for simulation, see for example [102, 103] and other references given there. These papers argue that for good RNG constructions, one should have mathematical proofs of multidimensional uniformity of the output over the full period of the RNG. If the period is long enough, the periodicity is not an issue.

Specific comments and details

1. The first sentence of the introduction is incorrect. The “law of statistical regularity” does not say that and “only when it is selected randomly” is also not true. For example if you have a population of one million balls of which 20% are red, 30% are blue, and 50% are green, and you want a representative sample of 100 balls, then the most representative sample contains 20 red, 30 blue, and 50 green. A random sample cannot be more representative than this deterministic selection.

2. The Mersenne twister as important defects that were uncovered in [5]. Additional drawbacks of this RNG are discussed in [3], Section 2.

3. Page 4: Note that after the 2003 paper of Marsaglia, Xorshift RNGs have been shown to have major defects. See [1, 4].

4. In reference [2], note that Knuth discusses RNGs in his volume 2, not volume 1.

5. Reference [7] is incorrect: there is no “Linear congruential generators” in the book title.

6. What is the difference between Ref. [16] and the present paper?

Is the work clearly and accurately presented and does it cite the current literature?

Partly
Is the study design appropriate and is the work technically sound?

Partly
Are sufficient details of methods and analysis provided to allow replication by others?

Yes
If applicable, is the statistical analysis and its interpretation appropriate?

Partly
Are all the source data underlying the results available to ensure full reproducibility?

No
Are the conclusions drawn adequately supported by the results?

No

References

1. R. P. Brent. Note on Marsaglia’s xorshift random number generators. Journal of Statistical Software, 11(5):1–4, 2004.
2. P. L’Ecuyer. Random number generation. In J. E. Gentle, W. Haerdle, and Y. Mori, editors, Handbook of Computational Statistics, pages 35–71. Springer-Verlag, Berlin, second edition, 2012.
3. P. L’Ecuyer, O. Nadeau-Chamard, Y.-F. Chen, and J. Lebar. Multiple streams with recurrence-based, counter-based, and splittable random number generators. In Proceedings of the 2021 Winter Simulation Conference, pages 1–16. IEEE Press, 2021.
4. F. Panneton and P. L’Ecuyer. On the xorshift random number generators. ACM Transactions on Modeling and Computer Simulation, 15(4):346–361, 2005.
5. F. Panneton, P. L’Ecuyer, and M. Matsumoto. Improved long-period generators based on linear recurrences modulo 2. ACM Transactions on Mathematical Software, 32(1):1–16, 2006.

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Simulation, random number generation

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.

CITE

Report a concern

Respond or Comment

Comments on this article Comments (0)

Version 1

VERSION 1 PUBLISHED 15 Jul 2025

Open Peer Review

Reviewer Status

Reviewer Reports

	Invited Reviewers
	1
Version 1 15 Jul 25	read

Pierre L'Ecuyer, Université de Montréal, Édouard-Montpetit, Canada

Comments on this article

All Comments(0)

Add a comment

Sign up for content alerts

Browse by related subjects

Back to all reports

Reviewer Report

5 Views

01 Sep 2025 | for Version 1

Pierre L'Ecuyer, Université de Montréal, Édouard-Montpetit, Montreal, Canada

5 Views Cite this report Responses(0)

Not Approved

General evaluation

The authors propose a new nonperiodic RNG whose state at step i can be seen as the triple (i, x_i, p_i) where x_i is a m-bit integer and p_i is the first prime number larger than p_i−1. The first prime number p_0 is selected “at random” and x_i is defined via a slightly complicated function of (i − 1, x_i−1, p_i−1). Note that the size of the state increases with i and has no upper bound, because i and p_i increase without bound. The construction is purely heuristic, in the sense that there is no mathematical proof of uniformity or independence of the output values. For empirical statistical testing, the authors only report a simple visual plot for pairs of numbers and a run test with a small sample size. At a minimum, they should apply more stringent batteries of test like Crush and BigCrush of TestU01. Finally, it seems to me that this RNG will be extremely slow, since it requires finding a new large prime number at each step. The fast RNGs used for simulation nowadays can generate nearly one billion U(0, 1) random numbers per second on a single CPU. The authors should provide comparisons of their RNG with that. For recent tutorials/surveys on RNGs for simulation, see for example [102, 103] and other references given there. These papers argue that for good RNG constructions, one should have mathematical proofs of multidimensional uniformity of the output over the full period of the RNG. If the period is long enough, the periodicity is not an issue.

Specific comments and details

1. The first sentence of the introduction is incorrect. The “law of statistical regularity” does not say that and “only when it is selected randomly” is also not true. For example if you have a population of one million balls of which 20% are red, 30% are blue, and 50% are green, and you want a representative sample of 100 balls, then the most representative sample contains 20 red, 30 blue, and 50 green. A random sample cannot be more representative than this deterministic selection.

2. The Mersenne twister as important defects that were uncovered in [5]. Additional drawbacks of this RNG are discussed in [3], Section 2.

3. Page 4: Note that after the 2003 paper of Marsaglia, Xorshift RNGs have been shown to have major defects. See [1, 4].

4. In reference [2], note that Knuth discusses RNGs in his volume 2, not volume 1.

5. Reference [7] is incorrect: there is no “Linear congruential generators” in the book title.

6. What is the difference between Ref. [16] and the present paper?

Is the work clearly and accurately presented and does it cite the current literature?

Partly
Is the study design appropriate and is the work technically sound?

Partly
Are sufficient details of methods and analysis provided to allow replication by others?

Yes
If applicable, is the statistical analysis and its interpretation appropriate?

Partly
Are all the source data underlying the results available to ensure full reproducibility?

No
Are the conclusions drawn adequately supported by the results?

No

References

1. R. P. Brent. Note on Marsaglia’s xorshift random number generators. Journal of Statistical Software, 11(5):1–4, 2004.
2. P. L’Ecuyer. Random number generation. In J. E. Gentle, W. Haerdle, and Y. Mori, editors, Handbook of Computational Statistics, pages 35–71. Springer-Verlag, Berlin, second edition, 2012.
3. P. L’Ecuyer, O. Nadeau-Chamard, Y.-F. Chen, and J. Lebar. Multiple streams with recurrence-based, counter-based, and splittable random number generators. In Proceedings of the 2021 Winter Simulation Conference, pages 1–16. IEEE Press, 2021.
4. F. Panneton and P. L’Ecuyer. On the xorshift random number generators. ACM Transactions on Modeling and Computer Simulation, 15(4):346–361, 2005.
5. F. Panneton, P. L’Ecuyer, and M. Matsumoto. Improved long-period generators based on linear recurrences modulo 2. ACM Transactions on Mathematical Software, 32(1):1–16, 2006.

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Simulation, random number generation

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.

Respond to this report

Responses (0)

[1] 1. Neyman J: On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. J. Am. Stat. Assoc. 1934; 29(1): 538–539.

[2] 2. Knuth DE: The art of computer programming, volume 1: Fundamental algorithms. Massachusetts: Addison-Wesley; 3rd ed. 1997.

[3] 3. Matsumoto M, Nishimura T: Mersenne twister: A 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation. 1998; 8(1): 3–30. Publisher Full Text

[4] 4. Ferguson N, Schneier B: Practical cryptography. Wiley Publishing; 2004.

[5] 5. Chakraborty S: On why and what of randomness. ArXiv. 2009; abs/0902.1232. Reference Source

[6] 6. Berezowski M: Chaotic distribution of prime numbers and digits of π. SSRN Electron. J. 2019. Publisher Full Text

[7] 7. Tezuka S: Linear congruential generators. Uniform random numbers. Boston, MA: Springer; 1995; vol. 315. .

[8] 8. Marsaglia G: Xorshift RNGs. J. Stat. Softw. 2003; 8(1): 14. Publisher Full Text

[9] 9. Python Software Foundation: Python language reference.2024. Reference Source

[10] 10. Google: Google colaboratory.2024. Reference Source

[11] 11. Ross SM: Nonparametric hypotheses tests. Introductory statistics. Boston: Academic Press; Third 2010; pp. 647–697.

[12] 12. Martin-Löf P: The definition of random sequences. Inf. Control. 1966; 9(6): 602–619. Publisher Full Text

[13] 13. L’Ecuyer P, Simard R: TestU01: A c library for empirical testing of random number generators. ACM Transactions on Mathematical Software (TOMS). 2007; 33(4): 1–40. Publisher Full Text

[14] 14. Karatsuba A, Ofman Y: Multiplication of many-digital numbers by automatic computers. Dokl. Akad. Nauk SSSR. 1962; 145(2): 293–294. Reference Source

[15] 15. Goldreich O: On the foundations of modern cryptography. Advances in cryptology — CRYPTO’97. 1997; pp. 46–74. Publisher Full Text

[16] 16. Sadhukhan R, Chakrabory S: MAXEM: A New Pseudo-Random Number Generating Algorithm with Implementation of Modulo Arithmetic, XORShift, and Entropy Modulation.2025. Publisher Full Text

MAXEM: A New Pseudo-Random Number Generating Algorithm with Implementation of Modulo Arithmetic, XORShift, and Entropy Modulation

Abstract

Keywords

Introduction

Methodology

Selection of a seed

Initial ideas

The multiplier

XOR operation and Bit-Shift

The MAXEM algorithm

Running the algorithm for an example test case

Figure 1. Plot showing uniformly distributed random numbers of the MAXEM Algorithm.

Table 1. Descriptive statistics of generated random numbers.

Testing and Results

Choice of arbitrary seed

Occurrence of periodic cycles

Test for randomness with run test

Test for local randomness

Figure 2. Local randomness testing with subsequences of various lengths.

Conclusion

Performance and complexity

Advantages

Limitations

Future directions

Declaration of data availability

Ethical declarations

References

Comments on this article Comments (0)

Open Peer Review

Comments on this article Comments (0)

Open Peer Review

Reviewer Status

Reviewer Reports

Comments on this article

Browse by related subjects

Competing Interests Policy

Stay Updated