I have code like this:

$alphabet = array('0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z');
$alphabetSize = count($alphabet);
$alphabetBitSize = ceil(log($alphabetSize, 2));
$bitSize = $length * $alphabetBitSize;
$bytes = random_bytes(ceil($bitSize/8));

What I need is to read $bitSize bits in a loop to generate a latin1 string using the alphabet. I am totally lost about how to do this with the $bytes I have. Most of the answers I found use string functions for this, but I guess I need something binary. Another option might be to go through hex. Any hints?

asked Dec 4, 2023 at 15:01
  • An example of what you want as output may help. Commented Dec 4, 2023 at 15:10
  • @NigelRen Just a random latin1 string using the alphabet. Though iterating through the random bits would help to do it, so no need to go beyond that. Commented Dec 4, 2023 at 15:41
  • Is there something about "How to create a random string using PHP?" that doesn't satisfy your requirements? Why should that page not be used as a duplicate to close this page? The same goes for "Generate random 5 characters string" and "PHP random string generator". If you have nuanced requirements on "how to generate a random string of n length from a whitelist array", you can add them to the canonicals and explain your minor deviation. Commented Dec 5, 2023 at 1:47
  • @mickmackusa Sure. I wanted to use a cryptographically secure algorithm like random_bytes. Though I see that random_int would be better for this scenario. Commented Dec 5, 2023 at 10:08
  • "I guess I need something binary" do you, or don't you? Commented Dec 5, 2023 at 16:17
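
For reference, the random_int() route mentioned in the comments above could look roughly like the sketch below. The function name randomStringFromAlphabet and its default alphabet are made up for illustration and do not come from the thread; it assumes a non-empty, single-byte alphabet.

```php
<?php
// Hypothetical helper (name and signature are assumptions, not from the thread).
function randomStringFromAlphabet(
    int $length,
    string $alphabet = '0123456789abcdefghijklmnopqrstuvwxyz'
): string {
    $maxIndex = strlen($alphabet) - 1;
    $result = '';
    for ($i = 0; $i < $length; $i++) {
        // random_int() returns a cryptographically secure integer
        // chosen uniformly from the inclusive range [0, $maxIndex].
        $result .= $alphabet[random_int(0, $maxIndex)];
    }
    return $result;
}

echo randomStringFromAlphabet(5), PHP_EOL;
```

This avoids manual bit handling entirely, since random_int() already takes care of picking an unbiased index in the requested range.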

3 Answers

I generally use PHP's built-in unpack() function to convert the bytes into integers, and then sprintf() to turn each of them into an 8-character binary string. I hope this helps.

$alphabet = array('0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z');
$alphabetSize = count($alphabet);
$alphabetBitSize = (int) ceil(log($alphabetSize, 2)); // 6 bits for 36 characters
$length = 5; // Set your desired length
$bitSize = $length * $alphabetBitSize;
$bytes = random_bytes((int) ceil($bitSize / 8));
// Build a string of '0'/'1' characters, 8 per random byte.
$binaryStr = '';
foreach (unpack('C*', $bytes) as $byte) {
    $binaryStr .= sprintf("%08b", $byte);
}
$resultString = '';
// Stop at $bitSize: the byte count is rounded up to a whole byte, so looping
// over strlen($binaryStr) would append extra characters beyond $length.
for ($i = 0; $i < $bitSize; $i += $alphabetBitSize) {
    $bitSegment = substr($binaryStr, $i, $alphabetBitSize);
    $index = bindec($bitSegment) % $alphabetSize;
    $resultString .= $alphabet[$index];
}
echo $resultString;
answered Dec 4, 2023 at 15:08

1 Comment

It gives somewhat longer texts, but fine, thanks! :-)

I wrote my own code using bin2hex():

function generateHexString($length){
    $bitSize = $length*4;
    $bytes = random_bytes(ceil($bitSize/8));
    $hex = bin2hex($bytes);
    return substr($hex, 0, $length);
}
function generateLatinString($length){
    $alphabet = array(
        '0','1','2','3','4','5','6','7','8','9',
        'a','b','c','d','e','f','g','h','i','j',
        'k','l','m','n','o','p','q','r','s','t',
        'u','v','w','x','y','z'
    );
    $alphabetSize = count($alphabet);
    $hexChunkSize = (int)(ceil(log($alphabetSize, 16)));
    $hexString = generateHexString($length*$hexChunkSize);
    $string = '';
    $hexChars = str_split($hexString);
    $characterIndex = 0;
    $chunkIndex = 0;
    foreach ($hexChars as $hexChar){
        $characterIndex += hexdec($hexChar) * pow(16, $chunkIndex);
        ++$chunkIndex;
        if ($chunkIndex === $hexChunkSize){
            while ($characterIndex >= $alphabetSize)
                $characterIndex -= $alphabetSize;
            $character = $alphabet[$characterIndex];
            $string .= $character;
            $chunkIndex = 0;
        }
    }
    return $string;
}
answered Dec 4, 2023 at 16:01

To generate a string from the random bytes in a loop, you can use bitwise operations to extract the necessary bits from the byte sequence. You can try the code below:

$alphabet = array('0','1','2','3','4','5','6','7','8','9','a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z');
$alphabetSize = count($alphabet);
$alphabetBitSize = (int)ceil(log($alphabetSize, 2)); // 6 bits for 36 characters
$length = 5; // Set your desired length
$bitSize = $length * $alphabetBitSize;
$bytes = random_bytes((int)ceil($bitSize / 8));
$result = '';
$byteIndex = 0;          // current byte within $bytes
$bitIndex = 0;           // current bit within that byte (0 = most significant)
$cumulativeBitIndex = 0; // bits collected for the current character
$charIndex = 0;          // alphabet index being assembled
for ($i = 0; $i < $bitSize; $i++) {
    // Read one bit, most significant bit first.
    $byte = ord($bytes[$byteIndex]);
    $bit = ($byte >> (7 - $bitIndex)) & 1;
    $bitIndex++;
    $cumulativeBitIndex++;
    if ($bitIndex == 8) {
        $bitIndex = 0;
        $byteIndex++;
    }
    // Append the bit to the chunk.
    $charIndex = ($charIndex << 1) | $bit;
    if ($cumulativeBitIndex === $alphabetBitSize) {
        // A full chunk is ready; wrap values of 36..63 back into 0..27.
        if ($charIndex >= $alphabetSize) {
            $charIndex -= $alphabetSize;
        }
        $result .= $alphabet[$charIndex];
        $cumulativeBitIndex = 0;
        $charIndex = 0;
    }
}
var_dump($result);

In this example, I used the ord function to get the integer value of each random byte, then performed bitwise shifting and masking to extract one bit at a time. The bits are accumulated into chunks of $alphabetBitSize bits, and each chunk is used as an index into the alphabet array to build the final string.

Let me know if this works for you.

answered Dec 4, 2023 at 15:16

5 Comments

Thanks! It gives a binary string atm. To achieve my goal I would need chunks longer than 1 bit. This latin alphabet is 36 chars with digits, so I would need 6-bit chunks to describe a character and, for example, do n-36 if the chunk gives a number greater than or equal to 36. In my case the other answer was more useful, because it is almost complete, but I could accept this one as well, because only the chunking is needed to finish it. Maybe others find it a lot more useful and upvote it.
Added some code to generate the string. I'd rather accept this, because it can be used to iterate through each bit.
Another challenge would be keeping this cryptographically secure, because now characters at the beginning of the alphabet are more frequent than characters at the end. Maybe skipping numbers that are bigger than the alphabet size would do the trick, but that would require a different loop. Not sure how this can be done properly. Never dealt with binary problems before.
I changed your code a little. It's not a big difference, but it's better practice. And I don't think you need to worry about security for this code. As long as the source of the random_bytes is secure, it's cryptographically secure.
Sort of. The if ($charIndex >= $alphabetSize) $charIndex -= $alphabetSize; part makes it somewhat less secure, but still ok I guess.
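
To make the bias discussed in these comments concrete: a 6-bit chunk takes values 0..63, and subtracting 36 maps 36..63 onto 0..27, so the first 28 characters of the alphabet can each be produced by two chunk values while the last 8 can be produced by only one. One common way around this is rejection sampling, as the earlier comment suggests: discard any chunk that falls outside the alphabet and read the next one. The sketch below is only an illustration of that idea; the function name randomStringRejection and its defaults are assumptions, and it assumes a non-empty, single-byte alphabet.

```php
<?php
// Hypothetical helper illustrating rejection sampling over fixed-size bit chunks.
function randomStringRejection(
    int $length,
    string $alphabet = '0123456789abcdefghijklmnopqrstuvwxyz'
): string {
    $alphabetSize = strlen($alphabet);
    // Bits needed to cover every index: 6 for a 36-character alphabet.
    $bits = max(1, (int) ceil(log($alphabetSize, 2)));
    $mask = (1 << $bits) - 1;

    $result = '';
    $buffer = 0;    // unread random bits
    $buffered = 0;  // how many bits in $buffer are still unread

    while (strlen($result) < $length) {
        if ($buffered < $bits) {
            // Refill one byte at a time (simple, not the most efficient).
            $buffer = ($buffer << 8) | ord(random_bytes(1));
            $buffered += 8;
            continue;
        }
        // Take the oldest $bits unread bits as a candidate index.
        $buffered -= $bits;
        $index = ($buffer >> $buffered) & $mask;
        $buffer &= (1 << $buffered) - 1;

        // Reject out-of-range candidates instead of wrapping them around.
        if ($index < $alphabetSize) {
            $result .= $alphabet[$index];
        }
    }
    return $result;
}

echo randomStringRejection(5), PHP_EOL;
```

Because rejected chunks are simply dropped, the number of random bytes consumed varies from call to call, which is why this version pulls bytes on demand instead of computing a fixed byte count up front.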
