Go to the main page of herbst.ist.org

Pseudo Random String Generator - main page

The Pseudo Random String Generator (or PseudoRandomStringGenerator) is a software to generate pseudorandom sequences of characters. The application provides a simple graphical user interface, where you can define the alphabet and the length of the string to be generated. You can choose the alphabet for the string either explicitly (character by character, all Unicode characters are valid), or use preconfigured alphabets like all Latin lowercase letters, all Latin uppercase letters, all numbers, the most common special characters and combinations of these.

Why building a new random number generator, aren't there already enough of them out there? While this is true in terms of quantity, I found it necessary to develop my own special tailored application. Most existing generators were developed for experts (e.g. scientific software, modules to integrate into other software), or they are online applications. Also, there are many suspicious closed source generators, which I wouldn't trust. My goal is to deliver a simple tool with a straightforward user interface, so that anybody should be able to use it. Therefore, it is also free and open source. Additionally, my software is working offline, to be independent from the internet and to avoid vulnerabilities caused by sending the output across the internet.

The current version produces cryptographically secure pseudorandom sequences of characters. You can expect local randomness, even distribution and a cryptographically strong generation process. Roughly this means, that it is computationally hard to predict next characters of a sequence, even if an attacker could read the past output sequence and knows parts of the inner state of the computer, but it may be possible.

The software uses the Java Class SecureRandom to be able to deliver cryptographically secure pseudorandom sequences. Therefore, the output of this class complies with the statistical random number generator tests as specified in the FIPS 140-2, Security Requirements for Cryptographic Modules, section 4.9.1. and additionally with RFC 1750: Randomness Recommendations for Security ("must produce non-deterministic output"). I can not guarantee, that my software still meets the above requirements, but from a software engineering point of view, I don't see any reason, why the output of my implementation could be any weaker than the output of the original Java class. As the software is free and open source, you are always welcome to verify, comment and contribute.

If you are interested in more information about some of the maybe a bit too technical phrases above, keep reading. Otherwise you may want to go directly to the Download section, at the bottom part of this page.

Randomness

Commonly, randomness is understood as "pertaining to chance" (Parzen, Emanuel: Stochastic Processes. 1962, p. 7). Another simple definition would be, that a random event is one you don't know the result of. Randomness plays a fundamental role in many disciplines like gaming, lottery, quantum physics, thermodynamics, statistics, probability theory, information theory, biological evolution, cryptography and even philosophy. Most of these domains use their own definitions of randomness, because a useful universal definition is still missing. Mainly, these definitions deal with properties like:

As scientists still try to solve basic question like "does randomness exist at all?" (determinism), today's definitions may be subject to quite fundamental changes.

Computers and randomness

Computers are deterministic machines, so for one given input they always produce the same output, or at least they follow an algorithm (a receipt) to produce the output. At the end, the output is always "signed" by this process. Examining the output reveals information about by the structure of the internal code and hardware. Or in the words of John von Neumann: "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin."

Bugs and randomness

Computers are designed to do exactly what they are instructed to, otherwise we call it a bug. There are two kind of bugs: software bugs and hardware failures. Unfortunately, you can't use a software bug to invoke "some" randomness, because the bug would not produce random results. Even if it looks like the bug produces random output, there must be a structure in it, because of the program code. In contrast, if a computer has a hardware failure like a tottering contact on the main board, then this could be a true random source. But in practice, the computer most likely wouldn't be able to process anything else correctly as well.

True randomness

So, to generate true random numbers, you need to have some kind of hardware involved, that is a true random source. For example, you could make use of a tottering contact in a custom built random output machine. Or you could use a radio, tune it to a frequency, where you can only hear white noise and then feed this into the sound card of a computer and generate bits based on the input. The output of this is something a computer can not calculate on his own. On the other side, hardware random number generators are often sensitive to environmental influences like temperature and humidity and may produce biased output under bad conditions, so they tend to be more error-prone than software based pseudorandom number generators.

Pseudorandomness

A Pseudorandom sequence looks random, but it is not. The elements fo the sequence were not obtained from random events, but from algorithmic calculations (a deterministic process).

The better the quality of a pseudorandom number generator, the more statistical random number tests it passes. There is a big range of tests, like tests about the distribution of values, the arithmetic mean value and more specialised ones like the Chi-Square Test and the Autocorrelation Test. Local randomness is also an important property of a pseudorandom string, because in many cases it is not acceptable, if the generator pops out a "random" sequence of 1 Million '0's. Sure it is a valid result from a theoretical viewpoint, but I am also sure, that the gaming software industry is not interested in theoretical perfect generators. In many applications you need a "more reliable" randomness, when examined in local detail. In contrast, if you consider a true random sequence of sufficient length (consider millions or billions of numbers), there is a high probability to have local areas, that do not look random and exhibit a pattern.

The predictability of pseudorandom number generators can be especially useful for automated tests of software that has to handle random input, because software tests are usually easier to interpret, if they are repeatable.

Cryptographically secure pseudorandom number generators

As pseudorandom sequences are reproducible exactly and infinitely by definition, computer scientists wanted to utilise the advantages of pseudorandom number generators (working fast, being independent from external sources of randomness and therefore also less error-prone, because the inappropriate use of these sources can cause severe non-randomness) and add "some" unpredictability, to generate "cryptographically strong" output sequences. You can find links to the definitions of these concepts in the introducing part on top of this page. In short, the definition of a cryptographically secure number generator requires that it is hard to compute the output, even when parts of the state of the generator are known.

The influence of alphabet size and string length on possible variations

If you generate a word that consists of 8 digits, there are 100.000.000 possibilities to build an unique word. The formula obviously reads simply as "<alphabet length> to the power of <string length>". If you just use a larger alphabet, like all Latin characters, digits and some special characters - lets say you have 84 different characters (like the default alphabet in my software) and stick to the 8 characters length - there are now already 2.478.758.911.082.496 possible combinations. Taking 4 more characters, so that the length of the string is 12 characters, there are now even 123.410.307.017.276.100.000.000 possible words. You see, the numbers are growing quickly when increasing any of the two parameters, although the string length has more influence.

Proving randomness

How can you prove, if something behaves random? In fact, you can't, as any possible outcome is a valid random result, as we already mentioned before. See this nice illustration for a better explanation.

However, here is an example: Before proving, if a sequence is random, you may want to prove, if a single item of the sequence is random, e.g.: Is 7 a random number? You see, this question does not make sense at all. But if you go further and ask the question for a specific sequence, e.g.: Is 723384 a random sequence? Well, while this question does make sense now, it is not a trivial one and you will get many different answers. Ian Craw, a mathematician, would answer something like: "Randomness is not a property of individual numbers. More properly it is a property of infinite sequences of numbers."

Mathematicians have an interesting approach to avoid questions about the randomness of finite strings. They define, that a finite sequence itself can not be called random, a random sequence must be infinite, and this infinite sequence must contain every possible subsequence for an infinite number of times. With this in our minds we can now enjoy the following definition: "An infinite random sequence is infinitely-distributed, in which any possible finite sub-sequence of numbers is uniformly distributed." In practice, this means, that you can not test, if a single sequence was derived from a random process, but if you take more and more sequences into account, the test results should confirm your assumption more and more. But test results are to be treated with caution, because they can only serve as indicators, but not as absolute measures for randomness. Note, that you can not quantify randomness itself. A process is either random or not, there is nothing in between. Thus, a sequence derived from such a process therefore is also either random or not. But you can test, as laid out before, if sequences appear more or less random in specific aspects.

Statistical randomness

Statistics offers a plethora of random number tests, to verify statistical randomness.  As always, when using statistical tools, you have to be very careful about how and what questions (hypotheses) you ask, about where you get the data to test and about the conclusions you make or try to make. I said, that it is impossible to judge, if a sequence is random or not. Having said that, I also have to admit, that statistical tests surely can help a lot, when used properly. E.g. the mean value of one specific sequence can be any, it does not tell you anything about its randomness at this point. But, as you add more and more sequences to the tested sequence, the mean value should slowly come nearer and nearer to the expected mean value (the mean value of all possible numbers in that test). If this process does not manifest, you maybe found a hint about a certain non-randomness. You could then do some further testing and eventually decide whether the found hint indicates a correct assumption, or not.

With statistical methods, you almost never get answers like yes or no, but answers like "with a probability of 95 % this is true". So, a test may lead to the correct answer, but you can never be sure. Another problem is, that it is not possible to compare values of different tests. Thus, statistics offers helpful instruments to measure aspects of randomness, but is not an all in one solution. After testing a lot, you may also state as Ian Craw did: "If you want to be thoroughly perverse you could argue that the fact that it passes all such tests is itself evidence of a certain non-randomness!"

But there are more interesting approaches to test randomness:

Entropy and randomness

The concept of entropy, as it was developed in statistical thermodynamics (entropy is associated to the amount of disorder in a system, e.g. gas at room temperature is of high entropy, as its molecules do not show any describable order), seems somewhat connected to the concept of entropy in information theory, where entropy is a measure for the amount of information, that a message contains (note that the amount of information does not say anything about its usefulness in this context). According to the definition of information theory, a random sequence has maximum possible entropy. In other words, this sequence contains so much information, that you could not remove a single bit from it without loosing information.

Compressibility and randomness

The concept of entropy soon led to other measures like the Kolmogorov-Chaitin complexity in the algorithmic information theory. Also known as algorithmic entropy, descriptive complexity, Kolmogorov complexity, program-size complexity or stochastic complexity, the complexity of an object is defined as the amount of information needed to specify this object exactly. Consequently, a random object must be of high complexity, as there should be no way to describe the object any shorter than in the way the described object does it. This directly implies, that random objects must be incompressible. But you also have to be careful when testing on compressibility. Consider a substring of 1 Million '0's, taken from an infinite true random string. This is a perfectly valid example of a substring, as it must appear in an infinite random string. But this sequence - taken on its own - can be compressed very effectively, even if it was produced by a true random process.

Examples of usage for the software

First, a few words of caution: never use any software, that you do not trust, especially if security matters (e.g. to generate passwords).

I use the Pseudo Random String Generator to create passwords for online services, because those are most vulnerable to weak passwords, especially when the password length is limited to 8 or less characters. But again, I do not encourage you to generate your personal passwords with my software, unless you or somebody you trust, reads and understands the source code of the application.

Generate random text to fill in "personal" informations at suspicious websites/accounts is also helpful, as you even reveal information about yourself, if you try to produce a random string manually (you can't escape your own "circuits" and your environment).

Create random looking patterns/images of characters as works of art, backgrounds, or just for the fun of it.

Play with Fortuna (with her cryptographically secure, algorithmic and deterministic side, though :-)

Downloads and system requirements

This is free and open source software, released under the MIT license, so you can use the software however you like, if you don't forget to cite the author and the license.

The current stable version is 1.1.0.9, released 2013-01-26.

See a screenshot including a brief introduction to the operation of the software.

Download the PseudoRandomStringGenerator (197 KB) for Microsoft Windows XP / Vista / 7 / 8. This version needs the Java 6 Runtime (or newer) installed. Alternatively you can download a standalone and portable version of the PseudoRandomStringGenerator (17 MB) with an included Java 6 Runtime.

Download the PseudoRandomStringGenerator (406 KB) for Apple Mac OS X 10.5 (64-bit Intel only) / 10.6 / 10.7 / 10.8. Mac OS X 10.5 and 10.6 usually has the Java 6 Runtime installed. If not, it is available under System Preferences > Software Update. Mac OS X 10.7 and 10.8 does not include the Java 6 Runtime, but if it is needed, a dialog will pop up and ask if Java should be downloaded and installed.

Download the PseudoRandomStringGenerator (147 KB) for Linux / Unix. The software should run on all operating systems that have the Java 6 Runtime (or newer) installed).

Download the PseudoRandomStringGenerator source files (29,3 MB) (zipped Eclipse Java project).

Contact and support

If you have any questions or need support, send a mail to the address mentioned in the copyright notice at the bottom of this page.

Random links

Pseudo Random String Generator: Copyright (c) 2011 René Herbst (herbst[at]ist.org)