Challenge
Given the formula of a chemical, output the Mr of the compound.
Equation
Each element in the compound is followed by a number that denotes the number of said atom in the compound. If there isn't a number, there is only one of that atom in the compound.
Some examples are:
- Ethanol (C2H6O) would be C2H6O, where there are two carbon atoms, six hydrogen atoms and one oxygen atom.
- Magnesium hydroxide (MgO2H2) would be MgO2H2, where there is one magnesium atom, two oxygen atoms and two hydrogen atoms.
Note that you will never have to handle brackets and each element is included only once in the formula.
Whilst most people will probably stick to the order they feel most comfortable with, there is no strict ordering system. For example, water may be given as either H2O or OH2.
Mr
Note: Here, assume formula mass is the same as molecular mass
The Mr of a compound, the molecular mass, is the sum of the atomic weights of the atoms in the molecule.
The only elements you have to support (hydrogen to calcium, not including noble gases), with their atomic weights to 1 decimal place, are as follows:
H - 1.0 Li - 6.9 Be - 9.0
B - 10.8 C - 12.0 N - 14.0
O - 16.0 F - 19.0 Na - 23.0
Mg - 24.3 Al - 27.0 Si - 28.1
P - 31.0 S - 32.1 Cl - 35.5
K - 39.1 Ca - 40.1
You should always give the output to one decimal place.
For example, ethanol (C2H6O) has an Mr of 46.0 as it is the sum of the atomic weights of the elements in it:
12.0 + 12.0 + 1.0 + 1.0 + 1.0 + 1.0 + 1.0 + 1.0 + 16.0
(2*C + 6*H + 1*O)
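As a point of reference, the calculation above can be sketched in ungolfed Python. This is only an illustrative baseline, not one of the submitted answers; the names `WEIGHTS` and `molecular_mass` are made up for this sketch, and the regex assumes the input follows the format described above (no brackets, valid symbols only).

```python
import re

# Atomic weights copied from the table in the challenge text.
WEIGHTS = {
    "H": 1.0, "Li": 6.9, "Be": 9.0, "B": 10.8, "C": 12.0, "N": 14.0,
    "O": 16.0, "F": 19.0, "Na": 23.0, "Mg": 24.3, "Al": 27.0, "Si": 28.1,
    "P": 31.0, "S": 32.1, "Cl": 35.5, "K": 39.1, "Ca": 40.1,
}

def molecular_mass(formula: str) -> str:
    """Return the Mr of `formula`, formatted to one decimal place."""
    total = 0.0
    # Each token is an element symbol followed by an optional count.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += WEIGHTS[symbol] * int(count or 1)  # a missing count means 1
    return f"{total:.1f}"

print(molecular_mass("C2H6O"))  # ethanol -> 46.0
```

Because the sum is order-independent, `molecular_mass("H2O")` and `molecular_mass("OH2")` give the same result.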
Input
A single string in the above format. You are guaranteed that the elements included in the formula will be actual elemental symbols.
The given compound isn't guaranteed to exist in reality.
Output
The total Mr of the compound, to 1 decimal place.
Rules
Builtins which access element or chemical data are disallowed (sorry Mathematica)
Examples
Input > Output
CaCO3 > 100.1
H2SO4 > 98.1
SF6 > 146.1
C100H202O53 > 2250.0
Winning
Shortest code in bytes wins.
This post was adopted with permission from caird coinheringaahing. (Post now deleted)
Jelly, 63 bytes
ḟØDOP%(¡ṛị"ÇṚÆ’BH+"Ḳ"ɦṀ76<s¡_-¦y=Ḟ¡¡FPɓ‘¤÷5
fØDVȯ1×Ç
Œs>œṗ8ḊÇ€S
A monadic link accepting a list of characters and returning a number.
How?
ḟØDOP%(¡ṛị"ÇṚÆ’BH+"Ḳ"ɦṀ76<s¡_-¦y=Ḟ¡¡FPɓ‘¤÷5 - Link 1, Atomic weight: list of characters
- e.g. "Cl23"
ØD - digit yield = "0123456789"
ḟ - filter discard "Cl"
O - cast to ordinals [67,108]
P - product 7236
(¡ṛ - base 250 literal = 1223
% - modulo 1121
¤ - nilad followed by link(s) as a nilad:
"ÇṚÆ’ - base 250 literal = 983264
B - convert to binary = [ 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
H - halve = [ 0.5, 0.5, 0.5, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0.5, 0.5, 0, 0, 0, 0, 0]
"Ḳ"ɦṀ76<s¡_-¦y=Ḟ¡¡FPɓ‘ - code-page indexes = [177 , 34 , 160 , 200 , 135, 54, 60, 115, 0, 95, 45, 5, 121 , 140 , 195 , 0, 0, 70, 80, 155]
+ - addition = [177.5, 34.5, 160.5, 200.5, 135, 54, 60, 115, 0, 95, 45, 5, 121.5, 140.5, 195.5, 0, 0, 70, 80, 155]
ị - index into (1-indexed and modular)
- ...20 items so e.g. 1121%20=1 so 177.5
÷5 - divide by 5
fØDVȯ1×Ç - Link 2: Total weight of multiple of atoms: list of characters e.g. "Cl23"
ØD - digit yield = "0123456789"
f - filter keep "23"
V - evaluate as Jelly code 23
ȯ1 - logical or with one (no digits yields an empty string which evaluates to zero)
Ç - call last link (1) as a monad (get the atomic weight) 35.5
× - multiply 816.5
Œs>œṗ8ḊÇ€S - Main link: list of characters e.g. "C24HCl23"
Œs - swap case "c24hcL23"
> - greater than? (vectorises) 10011000
8 - chain's left argument "C24HCl23"
œṗ - partition at truthy indexes ["","C24","H","Cl23"]
Ḋ - dequeue ["C24","H","Cl23"]
Ç€ - call last link (2) as a monad for €ach [ 288, 1, 816.5]
S - sum 1105.5
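For readers who do not know Jelly, Link 1 can be mirrored in Python roughly as follows. This is a sketch only: `TABLE` is copied from the "addition" row of the explanation above, and `atomic_weight` is a name invented here. The only subtle part is that Jelly's index-into is 1-based and wraps around, which the `(key - 1) % 20` expression imitates.

```python
from math import prod

# The 20 values from the "addition" row above; each is 5x an atomic weight
# (zeros are unused slots).
TABLE = [177.5, 34.5, 160.5, 200.5, 135, 54, 60, 115, 0, 95,
         45, 5, 121.5, 140.5, 195.5, 0, 0, 70, 80, 155]

def atomic_weight(symbol: str) -> float:
    """Look up a weight the way Link 1 does (symbol must contain no digits)."""
    key = prod(ord(ch) for ch in symbol) % 1223
    # Jelly indexes from 1 and modularly, so key k selects item (k - 1) mod 20
    # of a 0-based Python list.
    return TABLE[(key - 1) % len(TABLE)] / 5

print(atomic_weight("Cl"))  # 67 * 108 = 7236, 7236 % 1223 = 1121 -> 35.5
```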
- This is one of the longest Jelly answers I have ever seen, but it still is less than half the length of the program currently in second, so good job! – Gryphon, Jun 11, 2017 at 22:45
Python 3, 168 bytes (previously 189, 182)
-14 bytes by using the hash from Justin Mariner's JavaScript (ES6) answer.
import re
lambda s:sum([[9,35.5,39.1,24.3,28.1,14,16,31,40.1,23,32.1,10.8,12,27,6.9,19,0,1][int(a,29)%633%35%18]*int(n or 1)for a,n in re.findall("(\D[a-z]?)(\d*)",s)])
Below is the 182 byte version; I'll leave the explanation for this one. The above just changes the order of the weights, uses int to convert the element name from base 29, and uses different divisors to compress the range of integers down (see Justin Mariner's answer).
import re
lambda s:sum([[16,31,40.1,32.1,0,24.3,12,39.1,28.1,19,0,9,10.8,23,27,35.5,6.9,14,1][ord(a[0])*ord(a[-1])%1135%98%19]*int(n or 1)for a,n in re.findall("(\D[a-z]?)(\d*)",s)])
An unnamed function accepting a string, s, and returning a number.
How?
Uses a regex to split the input, s, into the elements and their counts using:
re.findall("(\D[a-z]?)(\d*)",s)
\D matches exactly one non-digit and [a-z]? matches 0 or 1 lowercase letter, together matching elements. \d* matches 0 or more digits. The parentheses make these into two groups, and as such findall("...",s) returns a list of tuples of strings, [(element, number),...].
The number is simple to extract; the only thing to handle is that an empty string means 1. This is achieved with a logical or, since empty strings are falsey in Python: int(n or 1).
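To make the tokenizing step concrete, here is a small sketch of what the regex and the `or 1` fallback produce. The helper name `tokens` is hypothetical and not part of the golfed answer.

```python
import re

def tokens(formula):
    """Split a formula into (element, count) pairs; a missing count becomes 1."""
    return [(element, int(number or 1))
            for element, number in re.findall(r"(\D[a-z]?)(\d*)", formula)]

print(tokens("CaCO3"))  # [('Ca', 1), ('C', 1), ('O', 3)]
```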
The element string is given a unique number by taking the product of its first and last character's ordinals (usually these are the same e.g. S or C, but we need to differentiate between Cl, C, Ca, and Na so we cannot just use one character).
These numbers are then hashed to cover a much smaller range of [0,18], found by a search of the modulo space resulting in %1135%98%19. For example "Cl" has ordinals 67 and 108, which multiply to give 7236, which modulo 1135 is 426, which modulo 98 is 34, which modulo 19 is 15; this number is used to index into a list of weights - the 15th (0-indexed) value in the list:
[16,31,40.1,32.1,0,24.3,12,39.1,28.1,19,0,9,10.8,23,27,35.5,6.9,14,1]
is 35.5, the atomic weight of Cl, which is then multiplied by the number of such elements (as found above).
These products are then added together using sum(...).
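The hash can be checked with a few lines of ungolfed Python. This sketch simply recomputes the slot of every supported element and confirms that no two collide; `WEIGHTS_BY_SLOT` is the list from the answer, while `ELEMENTS` and `slot` are names introduced here for illustration.

```python
# The lookup list from the 182 byte version (zeros are unused slots).
WEIGHTS_BY_SLOT = [16, 31, 40.1, 32.1, 0, 24.3, 12, 39.1, 28.1, 19,
                   0, 9, 10.8, 23, 27, 35.5, 6.9, 14, 1]
ELEMENTS = "H Li Be B C N O F Na Mg Al Si P S Cl K Ca".split()

def slot(symbol):
    """Product of first and last ordinals, squeezed by the chained moduli."""
    return ord(symbol[0]) * ord(symbol[-1]) % 1135 % 98 % 19

slots = {element: slot(element) for element in ELEMENTS}

# Every element must land in its own slot for the lookup to be valid.
assert len(set(slots.values())) == len(ELEMENTS)

print(slots["Cl"], WEIGHTS_BY_SLOT[slots["Cl"]])  # 15 35.5
```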
- You are a genius... Outgolfed me by over 350 bytes – Mr. Xcoder, Jun 11, 2017 at 16:10
PHP, 235 bytes
preg_match_all("#([A-Z][a-z]?)(\d*)#",$argn,$m);foreach($m[1]as$v)$s+=array_combine([H,Li,Be,B,C,N,O,F,Na,Mg,Al,Si,P,S,Cl,K,Ca],[1,6.9,9,10.8,12,14,16,19,23,24.3,27,28.1,31,32.1,35.5,39.1,40.1])[$v]*($m[2][+$k++]?:1);printf("%.1f",$s);
Instead of array_combine([H,Li,Be,B,C,N,O,F,Na,Mg,Al,Si,P,S,Cl,K,Ca],[1,6.9,9,10.8,12,14,16,19,23,24.3,27,28.1,31,32.1,35.5,39.1,40.1]) you can use [H=>1,Li=>6.9,Be=>9,B=>10.8,C=>12,N=>14,O=>16,F=>19,Na=>23,Mg=>24.3,Al=>27,Si=>28.1,P=>31,S=>32.1,Cl=>35.5,K=>39.1,Ca=>40.1] with the same Byte count
JavaScript (ES6), 150 bytes
c=>c.replace(/(\D[a-z]?)(\d+)?/g,(_,e,n=1)=>s+=[9,35.5,39.1,24.3,28.1,14,16,31,40.1,23,32.1,10.8,12,27,6.9,19,0,1][parseInt(e,29)%633%35%18]*n,s=0)&&s
Inspired by Jonathan Allan's Python answer, where he explained giving each element a unique number and hashing those numbers to be in a smaller range.
The elements were made into unique numbers by interpreting them as base-29 (0-9 and A-S). I then found that %633%35%18 narrows the values down to the range of [0, 17] while maintaining uniqueness.
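The answer does not say how the moduli were found, so the following Python sketch is only one plausible way to look for them: a brute-force scan over pairs of moduli with the last one fixed. The names `slots`, `collision_free` and `search`, and the search bounds, are assumptions made for this illustration.

```python
ELEMENTS = "H Li Be B C N O F Na Mg Al Si P S Cl K Ca".split()

def slots(mods, base=29):
    """Map each element to its slot: parse as base-29, then apply each modulus."""
    out = {}
    for element in ELEMENTS:
        value = int(element, base)
        for m in mods:
            value %= m
        out[element] = value
    return out

def collision_free(mods):
    """True if all 17 elements end up in different slots."""
    return len(set(slots(mods).values())) == len(ELEMENTS)

def search(last=18, limit=700):
    """Yield (a, b, last) triples that keep every element distinct."""
    for a in range(last + 1, limit):
        for b in range(last, a):
            if collision_free((a, b, last)):
                yield (a, b, last)

print(collision_free((633, 35, 18)))  # True
```

Calling `next(search())` returns the first collision-free triple within these bounds, which need not be the one used in the answer.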
Test Snippet
f=
c=>c.replace(/(\D[a-z]?)(\d+)?/g,(_,e,n=1)=>s+=[9,35.5,39.1,24.3,28.1,14,16,31,40.1,23,32.1,10.8,12,27,6.9,19,0,1][parseInt(e,29)%633%35%18]*n,s=0)&&s
Input: <input oninput="O.value=f(this.value)"><br>
Result: <input id="O" disabled>
- Oh, I think your way would save me a few bytes too! – Jonathan Allan, Jun 15, 2017 at 5:13
Clojure, 195 bytes (previously 198)
Update: for is shorter than reduce.
#(apply +(for[[_ e n](re-seq #"([A-Z][a-z]?)([0-9]*)"%)](*(if(=""n)1(Integer. n))({"H"1"B"10.8"O"16"Mg"24.3"P"31"K"39.1"Li"6.9"C"12"F"19"Al"27"S"32.1"Ca"40.1"Be"9"N"14"Na"23"Si"28.1"Cl"35.5}e))))
Original:
#(reduce(fn[r[_ e n]](+(*(if(=""n)1(Integer. n))({"H"1"B"10.8"O"16"Mg"24.3"P"31"K"39.1"Li"6.9"C"12"F"19"Al"27"S"32.1"Ca"40.1"Be"9"N"14"Na"23"Si"28.1"Cl"35.5}e))r))0(re-seq #"([A-Z][a-z]?)([0-9]*)"%))
I'm wondering if there is a more compact way to encode the look-up table.
Python 3, 253 bytes
def m(f,j=0):
 b=j+1
 while'`'<f[b:]<'{':b+=1
 c=b
 while'.'<f[c:]<':':c+=1
 return[6.9,9,23,40.1,24.3,27,28.1,35.5,31,32.1,39.1,1,10.8,12,14,16,19]['Li Be Na Ca Mg Al Si Cl P S K H B C N O F'.split().index(f[j:b])]*int(f[b:c]or 1)+(f[c:]>' 'and m(f,c))
Mathematica, 331 bytes (previously 390, 338)
Saved 9 bytes due to actually being awake now and actually using the shortening I intended.
Version 2.1:
S=StringSplit;Total[Flatten@{ToExpression@S[#,LetterCharacter],S[#,DigitCharacter]}&/@S[StringInsert[#,".",First/@StringPosition[#,x_/;UpperCaseQ[x]]],"."]/.{"H"->1,"Li"->6.9,"Be"->9,"B"->10.8,"C"->12,"N"->14,"O"->16,"F"->19,"Na"->23,"Mg"->24.3,"Al"->27,"Si"->28.1,"P"->31,"S"->32.1,"Cl"->35.5,"K"->39.1,"Ca"->40.1}/.{a_,b_}->a*b]&
Explanation: Find the position of all the uppercase characters. Put a dot before each. Split the string at each dot. For this list of substrings, do the following: split it based on letters and split it based on digits. For the ones split by letters, convert the strings to numbers. For the ones split by digits, replace each element with its atomic weight. For any with an atomic weight and an atom count, replace it with the product of them. Then find the total.
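The splitting strategy described here translates to Python roughly as follows. This is a loose sketch of the idea rather than a port of the Mathematica code, and `split_formula` is a name invented for the example.

```python
def split_formula(formula):
    """Split a formula into (symbol, count) pairs by marking uppercase letters."""
    # Put a dot before every uppercase letter, then split at the dots.
    dotted = "".join("." + ch if ch.isupper() else ch for ch in formula)
    chunks = [chunk for chunk in dotted.split(".") if chunk]
    pairs = []
    for chunk in chunks:
        symbol = chunk.rstrip("0123456789")  # the letters at the front
        count = chunk[len(symbol):]           # the digits at the back, if any
        pairs.append((symbol, int(count) if count else 1))
    return pairs

print(split_formula("MgO2H2"))  # [('Mg', 1), ('O', 2), ('H', 2)]
```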
Version 1:
I'm sure this can be golfed lots (or just completely rewritten). I just wanted to figure out how to do it. (Will reflect on it in the morning.)
F=Flatten;d=DigitCharacter;S=StringSplit;Total@Apply[Times,#,2]&@(Transpose[{F@S[#,d],ToExpression@F@S[#,LetterCharacter]}]&@(#<>If[StringEndsQ[#,d],"","1"]&/@Fold[If[UpperCaseQ[#2],Append[#,#2],F@{Drop[#,-1],Last[#]<>#2}]&,{},StringPartition[#,1]]))/.{"H"->1,"Li"->6.9,"Be"->9,"B"->10.8,"C"->12,"N"->14,"O"->16,"F"->19,"Na"->23,"Mg"->24.3,"Al"->27,"Si"->28.1,"P"->31,"S"->32.1,"Cl"->35.5,"K"->39.1,"Ca"->40.1}&
Explanation: First split the string up into characters. Then fold over the array joining lowercase characters and numbers back to their capital. Next append a 1 to any chemical without a number on the end. Then do two splits of the terms in the array - one splitting at all numbers and one splitting at all letters. For the first replace the letters with their molar masses then find the dot product of these two lists.
Python 3 - 408 bytes
This is mainly @ovs' solution, since he golfed it down by over 120 bytes... See the initial solution below.
e='Li Be Na Ca Mg Al Si Cl P S K H B C N O F'.split()
f,g=input(),[]
x=r=0
for i in e:
 if i in f:g+=[(r,eval('6.9 9 23 40.1 24.3 27 28.1 35.5 31 32.1 39.1 1 10.8 12 14 16 19'.split()[e.index(i)]))];f=f.replace(i,' %d- '%r);r+=1
h=f.split()
for c,d in zip(h,h[1:]):
 s=c.find('-')
 if-1<s:
  if'-'in d:
   for y in g:x+=y[1]*(str(y[0])==c[:s])
  else:
   for y in g:x+=y[1]*int(d)*(str(y[0])==c[:s])
print(x)
Python 3 - 535 bytes (previously 550, 548; lost the count with indentation)
Saved 10 bytes thanks to @cairdcoinheringaahing and 3 saved thanks to ovs
I had a personal goal to not use any regex, and do it the fun, old-school way... It turned out to be 350 bytes longer than the regex solution, but it only uses Python's standard library...
a='Li6.9 Be9. Na23. Ca40.1 Mg24.3 Al27. Si28.1 Cl35.5 P-31. S-32.1 K-39.1 H-1. B-10.8 C-12. N-14. O-16. F-19.'.split()
e,m,f,g,r=[x[:1+(x[1]>'-')]for x in a],[x[2:]for x in a],input(),[],0
for i in e:
 if i in f:g.append((r,float(m[e.index(i)])));f=f.replace(i,' '+str(r)+'- ');r+=1;
h,x=f.split(),0
for i in range(len(h)):
 if '-'in h[i]:
  if '-'in h[i+1]:
   for y in g:x+=y[1]*(str(y[0])==h[i][:h[i].index('-')])
  else:
   for y in g:
    if str(y[0])==h[i][:h[i].index('-')]:x+=(y[1])*int(h[i+1])
 else:1
print(x)
If anyone is willing to golf it down (with indentation fixes and other tricks...), it will be 100% well received, feeling like there's a better way to do this...
- You can replace
  for y in g: if str(y[0])==h[i][:h[i].index('-')]:x+=y[1]
  with
  for y in g:x+=y[1]*(str(y[0])==h[i][:h[i].index('-')])
  – Jun 11, 2017 at 14:19
- @cairdcoinheringaahing ah, great... updating when I have access to a computer – Mr. Xcoder, Jun 11, 2017 at 15:45
- @ovs Thanks a lot! Credited you in the answer – Mr. Xcoder, Jun 12, 2017 at 19:20
- In Python, you can use a semicolon in place of a newline, which allows you to save bytes on indentation. – Pavel, Jun 12, 2017 at 19:48
- @Phoenix not if there is an if/for/while on the next line. As this is the case on every indented line, you can't save bytes by this. – ovs, Jun 12, 2017 at 20:35