I have a huge PHP file that contains a number of arrays:
<?php
$quadrature_weights = array();
$quadrature_weights[2] = array(
1.0000000000000000,
1.0000000000000000);
$quadrature_weights[3] = array(
0.8888888888888888,
0.5555555555555555,
0.5555555555555555);
$quadrature_weights[4] = array(
0.6521451548625461,
0.6521451548625461,
0.3478548451374538,
0.3478548451374538);
?>
The real file contains 64 quadrature_weights arrays, and the numbers inside them have on the order of 200 decimals (I have reduced the number here for readability).
I would like to load this file into Python and choose how many decimals to keep. Let's say I decide to keep 4 decimals; the output should then be a dictionary (or some other container) like this:
quadrature_weights = {
2: [1.0000,
1.0000],
3: [0.8888,
0.5555,
0.5555],
4: [0.6521,
0.6521,
0.3478,
0.3478]
}
I am not familiar with PHP and, quite frankly, I have no idea how to do this. I suppose it would be possible to read every single line and then do some sort of "decoding" manually, but I was really hoping to avoid that.
1 Answer
If the numbers in the PHP file have around 200 decimals, PHP will truncate them as soon as it parses the file: its floats are typically double precision, which holds only about 15 to 17 significant digits. So if you let PHP parse those arrays and output JSON to then use in Python, you are guaranteed not to get the same numbers out.
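The same limit applies to Python's own float type, so it matters how the digits are read on the Python side too. A minimal sketch, assuming the standard-library decimal module is an acceptable container (the answer does not use it) and using a made-up 30-decimal value in place of a real weight:

```python
from decimal import Decimal

# Made-up digits for illustration only; the real file has ~200 decimals per value.
text = "0.123456789012345678901234567890"

as_float = float(text)      # binary double: only about 15-17 significant digits survive
as_decimal = Decimal(text)  # built from a string: every digit is kept

print(repr(as_float))
print(as_decimal)
```

If only a handful of decimals are needed in the end, a plain float is enough; Decimal is only worth it when the extra digits must survive.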
Personally, I'd read the file line-by-line and parse it in Python using regular expressions. Something along the lines of:
import re

quadrature_weights = {}
max_decimals = 6

with open("/path/to/file.php") as phpfile:
    key = None
    for line in phpfile:
        # A line such as `$quadrature_weights[3] = array(` starts a new list.
        match_key = re.search(r"\[(\d+)\]", line)
        if match_key:
            key = match_key.group(1)
            quadrature_weights[key] = []
            continue
        # Any other line holding a number adds one value to the current list.
        match_value = re.search(r"([\d.]+)", line)
        if match_value:
            quadrature_weights[key].append(
                round(float(match_value.group(1)), max_decimals)
            )

print(quadrature_weights)
With an input file like yours, the output of this will be (indented for readability):
{
'2': [1.0, 1.0],
'3': [0.888889, 0.555556, 0.555556],
'4': [0.652145, 0.652145, 0.347855, 0.347855]
}
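Note that round() rounds to the nearest value, whereas the example output in the question truncates (0.8888, not 0.8889). A minimal sketch of the difference, assuming the standard-library decimal module (which the answer itself does not use) and the question's four decimals:

```python
from decimal import Decimal, ROUND_DOWN

value = Decimal("0.8888888888888888")
step = Decimal("0.0001")  # four decimals, as in the question's example

rounded = value.quantize(step)                         # default context rounds half to even
truncated = value.quantize(step, rounding=ROUND_DOWN)  # simply drops the extra digits

print(rounded, truncated)  # 0.8889 0.8888
```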
If you want to always keep the correct number of decimals, even if the number is "1.0000000000000", then you should treat the numbers as strings:
match_value = re.search(r"([\d.]+)", line)
if match_value:
    value = match_value.group(1)
    # Keep everything up to the period, plus max_decimals digits after it.
    period = value.index('.')
    max_length = period + 1 + max_decimals
    quadrature_weights[key].append(value[0:max_length])
With this change, the dict will look like (indented for readability):
{
'2': ['1.000000', '1.000000'],
'3': ['0.888888', '0.555555', '0.555555'],
'4': ['0.652145', '0.652145', '0.347854', '0.347854']
}
Then you can convert the values to float when you actually need to use the numerical values for calculations.
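As a rough sketch of that last step, the string dict shown above can be turned into the int-keyed dict of floats the question asks for. The name as_floats is only illustrative:

```python
# The truncated strings produced by the string-based variant above.
quadrature_weights = {
    '2': ['1.000000', '1.000000'],
    '3': ['0.888888', '0.555555', '0.555555'],
    '4': ['0.652145', '0.652145', '0.347854', '0.347854'],
}

# Convert keys to int and values to float only at the point of use.
as_floats = {
    int(key): [float(value) for value in values]
    for key, values in quadrature_weights.items()
}

print(as_floats[3])  # [0.888888, 0.555555, 0.555555]
```

Keeping the strings as the stored form and converting late means the truncated digits stay exactly as written until a calculation actually needs numbers.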
2 Comments
re.match() should have been re.search(), and I've added the float conversion and rounding to a specific number of decimals (as specified in max_decimals).
echo json_encode($quadrature_weights);), then read that into Python. ☺️ 1.0 is exactly the same as 1.000000000, and floats don't store the useless information of how many zeros you typed.