15
\$\begingroup\$

Given a list of floating point numbers, standardize it.

Details

  • A list \$x_1,x_2,\ldots,x_n\$ is standardized if the mean of all values is 0, and the standard deviation is 1. One way to compute this is by first computing the mean \$\mu\$ and the standard deviation \$\sigma\$ as $$ \mu = \frac1n\sum_{i=1}^n x_i \qquad \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^n (x_i -\mu)^2} ,$$ and then computing the standardization by replacing every \$x_i\$ with \$\frac{x_i-\mu}{\sigma}\$.
  • You can assume that the input contains at least two distinct entries (which implies \$\sigma \neq 0\$).
  • Note that some implementations use the sample standard deviation, which is not equal to the population standard deviation \$\sigma\$ we are using here.
  • There is a CW answer for all trivial solutions.
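For reference, the definition above can be written out as a short, ungolfed Python sketch. The function name `standardize` is just an illustrative choice, and the code assumes the input has at least two distinct entries, as the challenge guarantees.

```python
import math


def standardize(xs):
    """Return the z-scores of xs using the population standard deviation."""
    n = len(xs)
    mu = sum(xs) / n
    # Population variance: divide by n, not by n - 1.
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    return [(x - mu) / sigma for x in xs]
```

Calling `standardize([1, 2, 3])` should agree with the first example below up to floating-point rounding.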

Examples

[1,2,3] -> [-1.224744871391589,0.0,1.224744871391589]
[1,2] -> [-1,1]
[-3,1,4,1,5] -> [-1.6428571428571428,-0.21428571428571433,0.8571428571428572,-0.21428571428571433,1.2142857142857144]

(These examples have been generated with this script.)

asked Dec 14, 2018 at 18:37
\$\endgroup\$
0

28 Answers

8
\$\begingroup\$

R, 37 bytes (previously 51, 45, 38)

Thanks to Giuseppe and J.Doe!

function(x)scale(x)/(1-1/sum(x|1))^.5

Try it online!
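A note on why the division works: R's scale divides by the sample standard deviation \$s\$ (denominator \$n-1\$), and \$\sigma = s\sqrt{1-1/n}\$, so dividing the scaled values by \$\sqrt{1-1/n}\$ gives the population z-scores. The same correction can be sketched in Python; the function name is hypothetical and `statistics.stdev` stands in for what scale does internally.

```python
import math
import statistics


def standardize_via_sample_sd(xs):
    """Standardize with the sample sd, then correct to the population sd."""
    n = len(xs)
    mu = sum(xs) / n
    s = statistics.stdev(xs)  # sample standard deviation, divides by n - 1
    sample_z = [(x - mu) / s for x in xs]
    # sigma = s * sqrt(1 - 1/n), so divide the sample z-scores by sqrt(1 - 1/n).
    correction = math.sqrt(1 - 1 / n)
    return [z / correction for z in sample_z]
```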

answered Dec 14, 2018 at 19:34
\$\endgroup\$
1
  • \$\begingroup\$ Beat me by 2 bytes and 1 minute \$\endgroup\$ Commented Dec 14, 2018 at 19:38
5
\$\begingroup\$

CW for all trivial entries


Python 3 + scipy, 31 bytes

from scipy.stats import*
zscore

Try it online!

Octave / MATLAB, 15 bytes

@(x)zscore(x,1)

Try it online!

\$\endgroup\$
0
5
\$\begingroup\$

APL (Dyalog Classic), 19 bytes (previously 21, 20)

(×ばつ≢)×ばつ

Try it online!

⊢÷⌹ is sum of squares

×ばつ is sum of squares divided by length

answered Dec 16, 2018 at 10:46
\$\endgroup\$
1
  • \$\begingroup\$ Wow. I shouldn't be surprised anymore, but I am every time \$\endgroup\$ Commented Dec 16, 2018 at 16:19
4
\$\begingroup\$

MATL, 10 bytes

tYm-t&1Zs/

Try it online!

Explanation

t       % Implicit input
        % Duplicate
Ym      % Mean
-       % Subtract, element-wise
t       % Duplicate
&1Zs    % Standard deviation using normalization by n
/       % Divide, element-wise
        % Implicit display
answered Dec 14, 2018 at 19:33
\$\endgroup\$
4
\$\begingroup\$

APL+WIN, 30 bytes (previously 41, 32)

9 bytes saved thanks to Erik + 2 more thanks to ngn

x←v-(+/v)÷⍴v←⎕⋄x÷(+/x×x÷⍴v)*.5

Prompts for a vector of numbers and calculates the mean, the standard deviation, and the standardised elements of the input vector.

answered Dec 14, 2018 at 19:03
\$\endgroup\$
4
  • \$\begingroup\$ Can't you assign x←v-(+/v)÷⍴v←⎕ and then do x÷((+/x*2)÷⍴v)*.5? \$\endgroup\$ Commented Dec 14, 2018 at 19:11
  • \$\begingroup\$ I can indeed. Thanks. \$\endgroup\$ Commented Dec 14, 2018 at 19:24
  • \$\begingroup\$ does apl+win do singleton extension (1 2 3+,4 ←→ 1 2 3+4)? if yes, you could rewrite (+/x*2)÷⍴v as +/x×x÷⍴v \$\endgroup\$ Commented Dec 16, 2018 at 10:24
  • \$\begingroup\$ @ngn That works for another 2 bytes. Thanks. \$\endgroup\$ Commented Dec 16, 2018 at 12:47
3
\$\begingroup\$

R + pryr, 52 bytes (previously 53)

-1 byte using sum(x|1) instead of length(x) as seen in @Robert S.'s solution

pryr::f((x-(y<-mean(x)))/(sum((x-y)^2)/sum(x|1))^.5)

For a language built for statisticians, I'm amazed that this doesn't have a built-in function. At least not one that I could find. Even the function mosaic::zscore doesn't yield the expected results. This is likely because it uses the sample standard deviation, whereas this challenge asks for the population standard deviation.

Try it online!

answered Dec 14, 2018 at 19:36
\$\endgroup\$
8
  • 2
    \$\begingroup\$ You can change the <- into a = to save 1 byte. \$\endgroup\$ Commented Dec 14, 2018 at 19:52
  • \$\begingroup\$ @J.Doe nope, I used the method I commented on Robert S.'s solution. scale is neat! \$\endgroup\$ Commented Dec 14, 2018 at 19:57
  • 2
    \$\begingroup\$ @J.Doe since you only use n once you can use it directly for 38 bytes \$\endgroup\$ Commented Dec 14, 2018 at 20:00
  • 2
    \$\begingroup\$ @RobertS. here on PPCG we tend to encourage allowing flexible input and output, including outputting more than is required, with the exception of challenges where the precise layout of the output is the whole point of the challenge. \$\endgroup\$ Commented Dec 14, 2018 at 21:09
  • 6
    \$\begingroup\$ Of course R built-ins wouldn't use "population variance". Only confused engineers would use such a thing (hence the Python and Matlab answers ;)) \$\endgroup\$ Commented Dec 14, 2018 at 21:12
3
\$\begingroup\$

Tcl, 115 bytes

proc S L {lmap c $L {expr ($c-[set m ([join $L +])/[set n [llength $L]].])/sqrt((([join $L -$m)**2+(]-$m)**2)/$n)}}

Try it online!

answered Dec 14, 2018 at 20:06
\$\endgroup\$
0
2
\$\begingroup\$

Jelly, 10 bytes

_ÆmµL½÷ÆḊ×

Try it online!

It's not any shorter, but Jelly's determinant function ÆḊ also calculates vector norm.

_Æm          x - mean(x)
   µ         then:
    L½       Square root of the Length
      ÷ÆḊ    divided by the norm
         ×   Multiply by that value
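The norm trick works because, for the centered vector \$d\$, \$\sigma = \lVert d \rVert / \sqrt{n}\$, so multiplying \$d\$ by \$\sqrt{n}/\lVert d \rVert\$ standardizes it. A rough Python sketch of the same idea (names are illustrative, not taken from the answer):

```python
import math


def standardize_via_norm(xs):
    """Standardize by scaling the centered vector with sqrt(n) / norm."""
    n = len(xs)
    mu = sum(xs) / n
    centered = [x - mu for x in xs]
    # Euclidean norm of the centered vector.
    norm = math.sqrt(sum(c * c for c in centered))
    # sigma = norm / sqrt(n), so dividing by sigma is multiplying by sqrt(n) / norm.
    factor = math.sqrt(n) / norm
    return [c * factor for c in centered]
```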
answered Dec 14, 2018 at 20:31
\$\endgroup\$
1
  • \$\begingroup\$ Hey, nice alternative! Unfortunately, I can't see a way to shorten it. \$\endgroup\$ Commented Dec 14, 2018 at 22:34
2
\$\begingroup\$

Mathematica, 25 bytes

Mean[(a=#-Mean@#)a]^-.5a&

Pure function. Takes a list of numbers as input and returns a list of machine-precision numbers as output. Note that the built-in Standardize function uses the sample variance by default.

answered Dec 15, 2018 at 0:26
\$\endgroup\$
2
\$\begingroup\$

J, 22 bytes

-1 byte thanks to Cows quack!

(-%[:%:1#.-*-%#@[)+/%#

Try it online!

J, 23 bytes (previously 31)

(-%[:%:#@[%~1#.-*-)+/%#

Try it online!

                   +/%# - mean (sum (+/) divided (%) by the number of samples (#))
(                 )     - the list is a left argument here (we have a hook)
                 -      - the difference between each sample and the mean
                *       - multiplied by
               -        - the difference between each sample and the mean
            1#.         - sum by base-1 conversion
          %~            - divided by
       #@[              - the length of the samples list
     %:                 - square root
   [:                   - convert to a fork (function composition)
 -                      - subtract the mean from each sample
  %                     - and divide it by sigma
answered Dec 15, 2018 at 9:49
\$\endgroup\$
2
  • 1
    \$\begingroup\$ Rearranging it gives 22 [:(%[:%:1#.*:%#)]-+/%# tio.run/##y/qfVmyrp2CgYKVg8D/…, I think one of those caps could be removed, but haven't had any luck so far, EDIT: a more direct byteshaving is (-%[:%:1#.-*-%#@[)+/%# also at 22 \$\endgroup\$ Commented Dec 15, 2018 at 13:12
  • \$\begingroup\$ @Cows quack Thanks! \$\endgroup\$ Commented Dec 15, 2018 at 14:30
2
\$\begingroup\$

APL (Dyalog Unicode), 29 bytes (previously 33)

{d÷.5*⍨l÷⍨+/×⍨d←⍵-(+/⍵)÷l←≢⍵}

-4 bytes thanks to @ngn

Try it online!

answered Dec 14, 2018 at 19:09
\$\endgroup\$
2
  • \$\begingroup\$ you could assign ⍵-m to a variable and remove m← like this: {d÷.5*⍨l÷⍨+/×⍨d←⍵-(+/⍵)÷l←≢⍵} \$\endgroup\$ Commented Dec 16, 2018 at 10:39
  • \$\begingroup\$ @ngn Ah, nice, thanks, I didn't see that duplication somehow \$\endgroup\$ Commented Dec 16, 2018 at 16:15
2
\$\begingroup\$

Haskell, 68 bytes (previously 80, 75)

t x=k(/sqrt(f$sum$k(^2)))where k g=g.(-f(sum x)+)<$>x;f=(/sum(1<$x))

Thanks to @flawr for the suggestions to use sum(1<$x) instead of sum[1|_<-x] and to inline the mean, @xnor for inlining the standard deviation and other reductions.

Expanded:

-- Standardize a list of values of any floating-point type.
standardize :: Floating a => [a] -> [a]
standardize input = eachLessMean (/ sqrt (overLength (sum (eachLessMean (^2)))))
 where
 -- Map a function over each element of the input, less the mean.
 eachLessMean f = map (f . subtract (overLength (sum input))) input
 -- Divide a value by the length of the input.
 overLength n = n / sum (map (const 1) input)
answered Dec 16, 2018 at 6:28
\$\endgroup\$
5
  • 1
    \$\begingroup\$ You can replace [1|_<-x] with (1<$x) to save a few bytes. That is a great trick for avoiding the fromIntegral, that I haven't seen so far! \$\endgroup\$ Commented Dec 16, 2018 at 10:54
  • \$\begingroup\$ By the way: I like using tryitonline, you can run your code there and then copy the preformatted answer for posting here! \$\endgroup\$ Commented Dec 16, 2018 at 10:57
  • \$\begingroup\$ And you do not have to define m. \$\endgroup\$ Commented Dec 16, 2018 at 11:02
  • \$\begingroup\$ You can write (-x+) for (+(-x)) to avoid parens. Also it looks like f can be pointfree: f=(/sum(1<$x)), and s can be replaced with its definition. \$\endgroup\$ Commented Dec 16, 2018 at 20:00
  • \$\begingroup\$ @xnor Ooh, (-x+) is handy, I’m sure I’ll be using that in the future \$\endgroup\$ Commented Dec 16, 2018 at 21:15
2
\$\begingroup\$

MathGolf, 7 bytes

▓-_²▓√/

Try it online!

Explanation

This is literally a byte-for-byte recreation of Kevin Cruijssen's 05AB1E answer, but I save some bytes from MathGolf having 1-byters for everything needed for this challenge. Also the answer looks quite good in my opinion!

▓         get average of list
 -        pop a, b : push(a-b)
  _       duplicate TOS
   ²      pop a : push(a*a)
    ▓     get average of list
     √    pop a : push(sqrt(a)), split string to list
      /   pop a, b : push(a/b), split strings
answered Dec 17, 2018 at 14:55
\$\endgroup\$
1
\$\begingroup\$

Jelly, 10 bytes

_Æm÷²Æm½Ɗ$

Try it online!

answered Dec 14, 2018 at 18:57
\$\endgroup\$
1
\$\begingroup\$

JavaScript (ES7), 79 bytes (previously 80)

a=>a.map(x=>(x-g(a))/g(a.map(x=>(x-m)**2))**.5,g=a=>m=eval(a.join`+`)/a.length)

Try it online!

Commented

a =>                        // given the input array a[]
  a.map(x =>                // for each value x in a[]:
    (x - g(a)) /            //   compute (x - mean(a)) divided by
    g(                      //   the standard deviation:
      a.map(x =>            //     for each value x in a[]:
        (x - m) ** 2        //       compute (x - mean(a))²
      )                     //     compute the mean of this array
    ) ** .5,                //   and take the square root
    g = a =>                //   g = helper function taking an array a[],
      m = eval(a.join`+`)   //     computing the mean
      / a.length            //     and storing the result in m
  )                         // end of outer map()
answered Dec 14, 2018 at 18:59
\$\endgroup\$
1
\$\begingroup\$

Python 3 + numpy, 46 bytes

lambda a:(a-mean(a))/std(a)
from numpy import*

Try it online!

answered Dec 15, 2018 at 12:29
\$\endgroup\$
1
\$\begingroup\$

Haskell, 59 bytes

(%)i=sum.map(^i)
f l=[(0%l*y-1%l)/sqrt(2%l*0%l-1%l^2)|y<-l]

Try it online!

Doesn't use libraries.

The helper function % computes the sum of ith powers of a list, which lets us get three useful values.

  • 0%l is the length of l (call this n)
  • 1%l is the sum of l (call this s)
  • 2%l is the sum of squares of l (call this v)

We can express the z-score of an element y as

(n*y-s)/sqrt(n*v-s^2)

(This is the expression (y-s/n)/sqrt(v/n-(s/n)^2) simplified a bit by multiplying the top and bottom by n.)

We can insert the expressions 0%l, 1%l, 2%l without parens because the % we define has higher precedence than the arithmetic operators.

(%)i=sum.map(^i) is the same length as i%l=sum.map(^i)l. Making it more point-free doesn't help. Defining it like g i=... loses bytes when we call it. Although % works for any list and we only ever call it on the input list, there's no byte loss in passing l every time, because a two-argument call i%l is no longer than a one-argument call g i.
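For readers who don't speak Haskell, here is a rough Python sketch of the same power-sum formula, with n, s and v the sums of the 0th, 1st and 2nd powers of the list. The function names are made up for illustration, and note that n*v - s^2 can lose precision when the values are large compared with their spread.

```python
import math


def power_sum(i, xs):
    """Sum of the i-th powers of xs; i = 0 gives the length."""
    return sum(x ** i for x in xs)


def standardize_power_sums(xs):
    n = power_sum(0, xs)  # length
    s = power_sum(1, xs)  # sum
    v = power_sum(2, xs)  # sum of squares
    # (y - s/n) / sqrt(v/n - (s/n)^2), with top and bottom multiplied by n.
    denom = math.sqrt(n * v - s * s)
    return [(n * y - s) / denom for y in xs]
```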

answered Dec 16, 2018 at 7:24
\$\endgroup\$
2
  • \$\begingroup\$ We do have \$\LaTeX\$ here:) \$\endgroup\$ Commented Dec 16, 2018 at 9:59
  • \$\begingroup\$ I really like the % idea! It looks just like the discrete version of the statistical moments. \$\endgroup\$ Commented Dec 16, 2018 at 10:02
1
\$\begingroup\$

K (oK), 23 bytes (previously 33)

-10 bytes thanks to ngn!

{t%%(+/t*t:x-/x%#x)%#x}

Try it online!

First attempt at coding (I don't dare to name it "golfing") in K. I'm pretty sure it can be done much better (too many variable names here...)

answered Dec 15, 2018 at 10:33
\$\endgroup\$
4
  • 1
    \$\begingroup\$ nice! you can replace the initial (x-m) with t (tio) \$\endgroup\$ Commented Dec 16, 2018 at 9:53
  • 1
    \$\begingroup\$ the inner { } is unnecessary - its implicit parameter name is x and it has been passed an x as argument (tio) \$\endgroup\$ Commented Dec 16, 2018 at 9:56
  • 1
    \$\begingroup\$ another -1 byte by replacing x-+/x with x-/x. the left argument to -/ serves as initial value for the reduction (tio) \$\endgroup\$ Commented Dec 16, 2018 at 10:08
  • \$\begingroup\$ @ngn Thank you! Now I see that the first 2 golfs are obvious; the last one is beyond my current level :) \$\endgroup\$ Commented Dec 16, 2018 at 10:14
1
\$\begingroup\$

MATLAB, 26 bytes

Trivial-ish, std(,1) for using population standard deviation

f=@(x)(x-mean(x))/std(x,1)
answered Dec 16, 2018 at 15:37
\$\endgroup\$
1
\$\begingroup\$

TI-Basic (83 series), 11 bytes (previously 14)

Ans-mean(Ans
Ans/√(mean(Ans²

Takes input in Ans. For example, if you type the above into prgmSTANDARD, then {1,2,3}:prgmSTANDARD will return {-1.224744871,0.0,1.224744871}.

Previously, I tried using the 1-Var Stats command, which stores the population standard deviation in σx, but it's less trouble to compute it manually.

answered Dec 16, 2018 at 21:25
\$\endgroup\$
1
\$\begingroup\$

05AB1E, 9 bytes

ÅA-DnÅAt/

Port of @Arnauld's JavaScript answer, so make sure to upvote him!

Try it online or verify all test cases.

Explanation:

ÅA          # Calculate the mean of the (implicit) input
            #  i.e. [-3,1,4,1,5] → 1.6
  -         # Subtract it from each value in the (implicit) input
            #  i.e. [-3,1,4,1,5] and 1.6 → [-4.6,-0.6,2.4,-0.6,3.4]
   D        # Duplicate that list
    n       # Take the square of each
            #  i.e. [-4.6,-0.6,2.4,-0.6,3.4] → [21.16,0.36,5.76,0.36,11.56]
     ÅA     # Pop and calculate the mean of that list
            #  i.e. [21.16,0.36,5.76,0.36,11.56] → 7.84
       t    # Take the square-root of that
            #  i.e. 7.84 → 2.8
        /   # And divide each value in the duplicated list with it (and output implicitly)
            #  i.e. [-4.6,-0.6,2.4,-0.6,3.4] and 2.8 → [-1.6428571428571428,
            #   -0.21428571428571433,0.8571428571428572,-0.21428571428571433,1.2142857142857144]
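The walkthrough above can be replayed step by step in Python to check the intermediate values. This is only an illustrative sketch; the function name and the returned tuple are not part of the 05AB1E answer.

```python
import math


def standardize_steps(xs):
    """Return (mean, deviations, mean of squares, sigma, result)."""
    mean = sum(xs) / len(xs)
    deviations = [x - mean for x in xs]
    # Mean of the squared deviations is the population variance.
    mean_sq = sum(d * d for d in deviations) / len(deviations)
    sigma = math.sqrt(mean_sq)
    result = [d / sigma for d in deviations]
    return mean, deviations, mean_sq, sigma, result


# For the third test case, the intermediate values should be
# close to 1.6, 7.84 and 2.8 respectively.
mean, deviations, mean_sq, sigma, result = standardize_steps([-3, 1, 4, 1, 5])
```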
answered Dec 17, 2018 at 8:51
\$\endgroup\$
1
\$\begingroup\$

Pyth, 19 bytes (previously 21)

mc-dJ.OQ@.Om^-Jk2Q2

Try it online here.

mc-dJ.OQ@.Om^-Jk2Q2Q   Implicit: Q=eval(input())
                       Trailing Q inferred
    J.OQ               Take the average of Q, store the result in J
           m     Q     Map the elements of Q, as k, using:
             -Jk         Difference between J and k
            ^   2        Square it
         .O            Find the average of the result of the map
        @         2    Square root it
                       - this is the standard deviation of Q
m                  Q   Map elements of Q, as d, using:
  -dJ                    d - J
 c                       Float division by the standard deviation
                       Implicit print result of map

Edit: after seeing Kevin's answer, changed to use the average builtin for the inner results. Previous answer: mc-dJ.OQ@csm^-Jk2QlQ2

answered Dec 16, 2018 at 11:23
\$\endgroup\$
1
\$\begingroup\$

SNOBOL4 (CSNOBOL4), 229 bytes

	DEFINE('Z(A)')
Z	X =X + 1
	M =M + A<X>	:S(Z)
	N =X - 1.
	M =M / N
D	X =GT(X) X - 1	:F(S)
	A<X> =A<X> - M	:(D)
S	X =LT(X,N) X + 1	:F(Y)
	S =S + A<X> ^ 2 / N	:(S)
Y	S =S ^ 0.5
N	A<X> =A<X> / S
	X =GT(X) X - 1	:S(N)
	Z =A	:(RETURN)

Try it online!

Link is to a functional version of the code which constructs an array from STDIN given its length and then its elements, then runs the function Z on that, and finally prints out the values.

Defines a function Z which returns an array.

The 1. on line 4 is necessary to do the floating point arithmetic properly.

answered Dec 17, 2018 at 15:29
\$\endgroup\$
1
\$\begingroup\$

Julia 0.7, 37 bytes

a->(a-mean(a))/std(a,corrected=false)

Try it online!

answered Dec 19, 2018 at 10:01
\$\endgroup\$
1
\$\begingroup\$

Charcoal, 19 bytes (previously 25)

≧−∕ΣθLθθI∕θ₂∕ΣXθ²Lθ

Try it online! Link is to verbose version of code. Explanation:

       θ    Input array
≧           Update each element
 −          Subtract
   Σ        Sum of
    θ       Input array
  ∕         Divided by
     L      Length of
      θ     Input array

Calculate \$\mu\$ and vectorised subtract it from each \$x_i\$.

  θ           Updated array
 ∕            Vectorised divided by
   ₂          Square root of
     Σ        Sum of
       θ      Updated array
      X       Vectorised to power
        ²     Literal 2
    ∕         Divided by
         L    Length of
          θ   Array
I             Cast to string
              Implicitly print each element on its own line.

Calculate \$\sigma\$, vectorised divide each \$x_i\$ by it, and output the result.

Edit: Saved 6 bytes thanks to @ASCII-only for a) using SquareRoot() instead of Power(0.5) b) fixing vectorised Divide() (it was doing IntDivide() instead) c) making Power() vectorise.

answered Dec 15, 2018 at 0:02
\$\endgroup\$
2
  • \$\begingroup\$ crossed out 25 = no bytes? :P (Also, you haven't updated the TIO link yet) \$\endgroup\$ Commented Dec 25, 2018 at 10:59
  • \$\begingroup\$ @ASCII-only Oops, thanks! \$\endgroup\$ Commented Dec 25, 2018 at 14:32
1
\$\begingroup\$

Factor, 34 bytes

[ dup demean swap 0 std-ddof v/n ]

Try it online!

Sadly, while Factor has the z-score word, it uses the sample standard deviation instead of the population standard deviation.

answered Apr 13, 2022 at 8:35
\$\endgroup\$
0
\$\begingroup\$

CASIO BASIC (CASIO fx-9750GIII), 20 bytes

?→List1
1-Variable List1
(List1-x̄)÷σx

builtins

answered Apr 25 at 18:43
\$\endgroup\$
0
\$\begingroup\$

APL(NARS), 26 chars

{m←÷≢⍵⋄d×ばつ⍨d×ばつ+/⍵} 

test:

 f←{m←÷≢⍵⋄d×ばつ⍨d×ばつ+/⍵}
 f 1 2
¯1 1 
 f 1 2 3
¯1.224744871 0 1.224744871 
 f ¯3 1 4 1 5
¯1.642857143 ¯0.2142857143 0.8571428571 ¯0.2142857143 1.214285714 
answered May 1 at 10:22
\$\endgroup\$
