Implement an API for probability distributions

Question 1

Introduction

In this challenge, your task is to implement a collection of simple functions that together form a usable mini-library for simple probability distributions. To accommodate some of the more esoteric languages people like to use here, the following implementations are acceptable:

A code snippet defining a collection of named functions (or closest equivalents).
A collection of expressions that evaluate to named or anonymous functions (or closest equivalents).
A single expression that evaluates to several named or anonymous functions (or closest equivalents).
A collection of independent programs that take inputs from command line, STDIN or closest equivalent, and output to STDOUT or closest equivalent.

The functions

You shall implement the following functions, using shorter names if desired.

uniform takes as input two floating point numbers a and b, and returns the uniform distribution on [a,b]. You can assume that a < b; the case a ≥ b is undefined.
blend takes as inputs three probability distributions P, Q and R. It returns a probability distribution S, which draws values x, y and z from P, Q and R, respectively, and yields y if x ≥ 0, and z if x < 0.
over takes as input a floating point number f and a probability distribution P, and returns the probability that x ≥ f holds for a random number x drawn from P.

For reference, over can be defined as follows (in pseudocode):

over(f, uniform(a, b)):
 if f <= a: return 1.0
 else if f >= b: return 0.0
 else: return (b - f)/(b - a)
over(f, blend(P, Q, R)):
 p = over(0.0, P)
 return p*over(f, Q) + (1-p)*over(f, R)

You can assume that all probability distributions given to over are constructed using uniform and blend, and that the only thing a user is going to do with a probability distribution is to feed it to blend or over. You can use any convenient datatype to represent the distributions: lists of numbers, strings, custom objects, etc. The only important thing is that the API works correctly. Also, your implementation must be deterministic, in the sense of always returning the same output for the same inputs.

Test cases

Your output values should be correct to at least two digits after the decimal point on these test cases.

over(4.356, uniform(-4.873, 2.441)) -> 0.0
over(2.226, uniform(-1.922, 2.664)) -> 0.09550806803314438
over(-4.353, uniform(-7.929, -0.823)) -> 0.49676329862088375
over(-2.491, uniform(-0.340, 6.453)) -> 1.0
over(0.738, blend(uniform(-5.233, 3.384), uniform(2.767, 8.329), uniform(-2.769, 6.497))) -> 0.7701533851999125
over(-3.577, blend(uniform(-3.159, 0.070), blend(blend(uniform(-4.996, 4.851), uniform(-7.516, 1.455), uniform(-0.931, 7.292)), blend(uniform(-5.437, -0.738), uniform(-8.272, -2.316), uniform(-3.225, 1.201)), uniform(3.097, 6.792)), uniform(-8.215, 0.817))) -> 0.4976245638164541
over(3.243, blend(blend(uniform(-4.909, 2.003), uniform(-4.158, 4.622), blend(uniform(0.572, 5.874), uniform(-0.573, 4.716), blend(uniform(-5.279, 3.702), uniform(-6.564, 1.373), uniform(-6.585, 2.802)))), uniform(-3.148, 2.015), blend(uniform(-6.235, -5.629), uniform(-4.647, -1.056), uniform(-0.384, 2.050)))) -> 0.0
over(-3.020, blend(blend(uniform(-0.080, 6.148), blend(uniform(1.691, 6.439), uniform(-7.086, 2.158), uniform(3.423, 6.773)), uniform(-1.780, 2.381)), blend(uniform(-1.754, 1.943), uniform(-0.046, 6.327), blend(uniform(-6.667, 2.543), uniform(0.656, 7.903), blend(uniform(-8.673, 3.639), uniform(-7.606, 1.435), uniform(-5.138, -2.409)))), uniform(-8.008, -0.317))) -> 0.4487803553043079

Question 2

Can we use built-in functions to make them?

Question 3

@AndréMuta I forgot that Mathematica probably has built-ins for all of this... but I'm going to allow them, as long as they follow the rules.

Question 4

What is your suggestion on how to represent floating point data in BrainFuck?

Question 5

@flawr For languages that don't have native floating point numbers, you can use any convenient encoding for floats between -10.0 and 10.0 (exclusive) that has at most 0.001 difference between consecutive values. The output should be accurate to within 0.01 difference for the test cases.

Question 6

Ruby, 103

u=b=->*a{a}
o=->f,d{d[2]?(p=o[0,d[0]])*o[f,d[1]]+(1-p)*o[f,d[2]]:(f<a=d[0])?1:(f>b=d[1])?0:(b-f)/(b-a)}

Defines three lambdas, u, b, and o. u and b just create two-element and three-element arrays respectively. o assumes a two-element array is a uniform distribution and a three-element one is a blend of three distributions. In the latter case it calls itself recursively.

Question 7

MATLAB, 73

Time for a little "functional programming" in MATLAB. These are 3 anonymous functions. Uniform and blend are called the same way as the examples, but for over the arguments should be swapped. I don't really need an over since the first two return functions, but as a formality feval is a function that can call a function.

%uniform
@(a,b)@(x)(x<b)*min(1,(b-x)/(b-a))
%blend
@(P,Q,R)@(x)P(0)*(Q(x)-R(x))+R(x)
%over
@feval

Now MATLAB's parsing and evaluation system is a little wonky to say the least. It doesn't allow you to directly call a function that was returned from a function. Instead, one must first save the result to a variable. The 4th example could be done as follows:

x=uniform(-5.233,3.384);y=uniform(2.767,8.329);z=uniform(-2.769,6.497);over(blend(x,y,z),0.738)

However, it is possible to get around this by using feval to call all the functions. If the following definitions are used, then the examples can be evaluated exactly as they are written.

uniform=@(a,b)@(x)(x<b)*min(1,(b-x)/(b-a))
blend=@(P,Q,R)@(x)feval(P,0)*(feval(Q,x)-feval(R,x))+feval(R,x)
over=@(x,f)feval(f,x)

Question 8

Functions making functions... how perverse!

Question 9

CJam, 58 bytes

{[\]}:U;
{[@]}:B;
{_,2={~1$-@@-\/0e>1e<}{6Yb@f*\.{O})[_1@-].*:+}?}:O;

These are postfix operators that work on the stack: 2.0 1.0 3.0 U O is over(2, uniform(1, 3)).

Score count

{[\]} is the function itself, :U; assigns it to the name U and pops it. Essentially this isn't part of the function, so by score counting rule 2, I'd only have to count {[\]}. B is defined similarly.

However, O is recursive, and if I don't specify a name, there's no way to recurse. So here, I'd be inclined to count the :O; part. Then my score is 5+5+48=58 bytes in total.

Explanation

U pops two arguments and makes a pair in reverse order: a b => [b a].

B pops three arguments and makes a triple in rotated order: a b c => [b c a].

O's structure is as follows:

{ }:O; Define O as this function:
 _,2= ? If the argument list's length is 2:
 {~Γ} Append the list to the stack and execute subprogram Γ.
 {~Δ} Else, do the same, but execute subprogram Δ.

Subprogram Γ handles uniform distributions:

Executed ops Explanation Stack contents
============ =========== ==============
 Initial f; b; a
1$ Copy b f; b; a; b
 - Difference f; b; (a-b)
 @@ Rotate x2 (a-b); f, b
 - Difference (a-b); (f-b)
 \/ Flip divide (f-b)/(a-b)
 0e> Clamp low max(0, (f-b)/(a-b))
 1e< Clamp high min(1, max(0, (f-b)/(a-b)))

Subprogram Δ handles blended distributions:

Executed ops Explanation Stack contents
============ =========== ==============
 Initial f; [Q R P]
6Yb Push [1,1,0] f; [Q R P]; [1 1 0]
 @ Rotate [Q R P]; [1 1 0]; f
 f* Multiply each [Q R P]; [f f 0]
 \ Swap [f f 0]; [Q R P]
 .{O} Pairwise O [q r p]
 ) Uncons [q r] p
 [_1@-] [p, 1-p] [q r] [p 1-p]
 .*:+ Dot product q*p+r*(1-p)

Question 10

Mathematica, (削除) 129 (削除ここまで) 116 bytes

u=UniformDistribution@{##}&;b=If[x<0,z,y]~TransformedDistribution~{x\uF3D2#,y\uF3D2#2,z\uF3D2#3}&;o=Probability[x>=#,x\uF3D2#2]&

u, b and o are uniform, blend, and over respectively.Wrapper over the standard functions. Replace the \uF3D2s with the 3-byte character. Just returns 0 and 1 for cases 1, 4, and 7.

Question 11

Python, 146 bytes

u=lambda*a:a
b=u
x=lambda f,a,b:[int(f<=a),(b-f)/(b-a)][a<f<b]
y=lambda f,p,q,r:o(0,p)*o(f,q)+(1-o(0,p))*o(f,r)
o=lambda f,p:[x,y][len(p)-2](f,*p)

Same strategy as histocrat's Ruby answer, but in Python. To do recursion without a Z-combinator (which would be costly), x and y are defined as helper functions that evaluate over for 2- and 3-length argument tuples (uniform and blend arguments, respectively).

Test cases on ideone

Question 12

Matlab, 104 bytes

I hope this is still valid, as this only works for distributions with support in [-10,10] which is the requirement for languages that have no floating point support. The support vector and the accuracy can be easily adjusted by just altering corresponding numbers. u,o,b is for uniform,blend,over. The pdf is just represented as a discrete vector. I think this approach can easily be transferred to other languages.

D=1e-4;X=-10:D:10;
u=@(a,b)(1/(b-a))*(a<X&X<b);
o=@(x,d)sum(d.*(X>x))*D;
b=@(p,q,r)o(0,p).*q+(1-o(0,p)).*r;

You can test them if you define those functions first and then just paste this code:

[o(4.356, u(-4.873, 2.441)) , 0.0;
o(2.226, u(-1.922, 2.664)) , 0.09550806803314438;
o(-4.353, u(-7.929, -0.823)) , 0.49676329862088375;
o(-2.491, u(-0.340, 6.453)) , 1.0;
o(0.738, b(u(-5.233, 3.384), u(2.767, 8.329), u(-2.769, 6.497))) , 0.7701533851999125;
o(-3.577, b(u(-3.159, 0.070), b(b(u(-4.996, 4.851), u(-7.516, 1.455), u(-0.931, 7.292)), b(u(-5.437, -0.738), u(-8.272, -2.316), u(-3.225, 1.201)), u(3.097, 6.792)), u(-8.215, 0.817))) , 0.4976245638164541;
o(3.243, b(b(u(-4.909, 2.003), u(-4.158, 4.622), b(u(0.572, 5.874), u(-0.573, 4.716), b(u(-5.279, 3.702), u(-6.564, 1.373), u(-6.585, 2.802)))), u(-3.148, 2.015), b(u(-6.235, -5.629), u(-4.647, -1.056), u(-0.384, 2.050)))) , 0.0;
o(-3.020, b(b(u(-0.080, 6.148), b(u(1.691, 6.439), u(-7.086, 2.158), u(3.423, 6.773)), u(-1.780, 2.381)), b(u(-1.754, 1.943), u(-0.046, 6.327), b(u(-6.667, 2.543), u(0.656, 7.903), b(u(-8.673, 3.639), u(-7.606, 1.435), u(-5.138, -2.409)))), u(-8.008, -0.317))) , 0.4487803553043079]

Question 13

Matlab has FP support, so I think that this would be invalid.

Question 14

I'm hesitant to allow this, since Matlab supports floating point numbers natively. If you can replace X and D with MIN_FLOAT and MAX_FLOAT (or whatever Matlab calls them), then this is a valid approach.

Question 15

Yes, you could ouse realmax/realmin, you could even make a vector that goes throu all floating point number if you have enough memory.

histocrat histocrat 22.3k4 gold badges38 silver badges81 bronze badges · Answer 1 · 2015-10-22 23:13:28Z

Ruby, 103

u=b=->*a{a}
o=->f,d{d[2]?(p=o[0,d[0]])*o[f,d[1]]+(1-p)*o[f,d[2]]:(f<a=d[0])?1:(f>b=d[1])?0:(b-f)/(b-a)}

Defines three lambdas, u, b, and o. u and b just create two-element and three-element arrays respectively. o assumes a two-element array is a uniform distribution and a three-element one is a blend of three distributions. In the latter case it calls itself recursively.

feersum feersum 31.8k10 gold badges66 silver badges125 bronze badges · Answer 2 · 2015-10-23 00:00:48Z

MATLAB, 73

Time for a little "functional programming" in MATLAB. These are 3 anonymous functions. Uniform and blend are called the same way as the examples, but for over the arguments should be swapped. I don't really need an over since the first two return functions, but as a formality feval is a function that can call a function.

%uniform
@(a,b)@(x)(x<b)*min(1,(b-x)/(b-a))
%blend
@(P,Q,R)@(x)P(0)*(Q(x)-R(x))+R(x)
%over
@feval

Now MATLAB's parsing and evaluation system is a little wonky to say the least. It doesn't allow you to directly call a function that was returned from a function. Instead, one must first save the result to a variable. The 4th example could be done as follows:

x=uniform(-5.233,3.384);y=uniform(2.767,8.329);z=uniform(-2.769,6.497);over(blend(x,y,z),0.738)

However, it is possible to get around this by using feval to call all the functions. If the following definitions are used, then the examples can be evaluated exactly as they are written.

uniform=@(a,b)@(x)(x<b)*min(1,(b-x)/(b-a))
blend=@(P,Q,R)@(x)feval(P,0)*(feval(Q,x)-feval(R,x))+feval(R,x)
over=@(x,f)feval(f,x)

\$\begingroup\$ Functions making functions... how perverse! \$\endgroup\$

Luis Mendo
– Luis Mendo

2015年10月23日 11:56:12 +00:00
Commented Oct 23, 2015 at 11:56

lynn lynn 69.7k11 gold badges137 silver badges285 bronze badges · Answer 3 · 2015-10-22 22:48:52Z

CJam, 58 bytes

{[\]}:U;
{[@]}:B;
{_,2={~1$-@@-\/0e>1e<}{6Yb@f*\.{O})[_1@-].*:+}?}:O;

These are postfix operators that work on the stack: 2.0 1.0 3.0 U O is over(2, uniform(1, 3)).

Score count

{[\]} is the function itself, :U; assigns it to the name U and pops it. Essentially this isn't part of the function, so by score counting rule 2, I'd only have to count {[\]}. B is defined similarly.

However, O is recursive, and if I don't specify a name, there's no way to recurse. So here, I'd be inclined to count the :O; part. Then my score is 5+5+48=58 bytes in total.

Explanation

U pops two arguments and makes a pair in reverse order: a b => [b a].

B pops three arguments and makes a triple in rotated order: a b c => [b c a].

O's structure is as follows:

{ }:O; Define O as this function:
 _,2= ? If the argument list's length is 2:
 {~Γ} Append the list to the stack and execute subprogram Γ.
 {~Δ} Else, do the same, but execute subprogram Δ.

Subprogram Γ handles uniform distributions:

Executed ops Explanation Stack contents
============ =========== ==============
 Initial f; b; a
1$ Copy b f; b; a; b
 - Difference f; b; (a-b)
 @@ Rotate x2 (a-b); f, b
 - Difference (a-b); (f-b)
 \/ Flip divide (f-b)/(a-b)
 0e> Clamp low max(0, (f-b)/(a-b))
 1e< Clamp high min(1, max(0, (f-b)/(a-b)))

Subprogram Δ handles blended distributions:

Executed ops Explanation Stack contents
============ =========== ==============
 Initial f; [Q R P]
6Yb Push [1,1,0] f; [Q R P]; [1 1 0]
 @ Rotate [Q R P]; [1 1 0]; f
 f* Multiply each [Q R P]; [f f 0]
 \ Swap [f f 0]; [Q R P]
 .{O} Pairwise O [q r p]
 ) Uncons [q r] p
 [_1@-] [p, 1-p] [q r] [p 1-p]
 .*:+ Dot product q*p+r*(1-p)

score 1 · Answer 4 · 2015-10-22 20:02:14Z

Mathematica, (削除) 129 (削除ここまで) 116 bytes

u=UniformDistribution@{##}&;b=If[x<0,z,y]~TransformedDistribution~{x\uF3D2#,y\uF3D2#2,z\uF3D2#3}&;o=Probability[x>=#,x\uF3D2#2]&

u, b and o are uniform, blend, and over respectively.Wrapper over the standard functions. Replace the \uF3D2s with the 3-byte character. Just returns 0 and 1 for cases 1, 4, and 7.

user45941user45941 · Answer 5 · 2015-10-23 11:28:01Z

Python, 146 bytes

u=lambda*a:a
b=u
x=lambda f,a,b:[int(f<=a),(b-f)/(b-a)][a<f<b]
y=lambda f,p,q,r:o(0,p)*o(f,q)+(1-o(0,p))*o(f,r)
o=lambda f,p:[x,y][len(p)-2](f,*p)

Same strategy as histocrat's Ruby answer, but in Python. To do recursion without a Z-combinator (which would be costly), x and y are defined as helper functions that evaluate over for 2- and 3-length argument tuples (uniform and blend arguments, respectively).

Test cases on ideone

flawr flawr 44.1k7 gold badges108 silver badges253 bronze badges · Answer 6 · 2015-10-22 22:02:52Z

Matlab, 104 bytes

I hope this is still valid, as this only works for distributions with support in [-10,10] which is the requirement for languages that have no floating point support. The support vector and the accuracy can be easily adjusted by just altering corresponding numbers. u,o,b is for uniform,blend,over. The pdf is just represented as a discrete vector. I think this approach can easily be transferred to other languages.

D=1e-4;X=-10:D:10;
u=@(a,b)(1/(b-a))*(a<X&X<b);
o=@(x,d)sum(d.*(X>x))*D;
b=@(p,q,r)o(0,p).*q+(1-o(0,p)).*r;

You can test them if you define those functions first and then just paste this code:

[o(4.356, u(-4.873, 2.441)) , 0.0;
o(2.226, u(-1.922, 2.664)) , 0.09550806803314438;
o(-4.353, u(-7.929, -0.823)) , 0.49676329862088375;
o(-2.491, u(-0.340, 6.453)) , 1.0;
o(0.738, b(u(-5.233, 3.384), u(2.767, 8.329), u(-2.769, 6.497))) , 0.7701533851999125;
o(-3.577, b(u(-3.159, 0.070), b(b(u(-4.996, 4.851), u(-7.516, 1.455), u(-0.931, 7.292)), b(u(-5.437, -0.738), u(-8.272, -2.316), u(-3.225, 1.201)), u(3.097, 6.792)), u(-8.215, 0.817))) , 0.4976245638164541;
o(3.243, b(b(u(-4.909, 2.003), u(-4.158, 4.622), b(u(0.572, 5.874), u(-0.573, 4.716), b(u(-5.279, 3.702), u(-6.564, 1.373), u(-6.585, 2.802)))), u(-3.148, 2.015), b(u(-6.235, -5.629), u(-4.647, -1.056), u(-0.384, 2.050)))) , 0.0;
o(-3.020, b(b(u(-0.080, 6.148), b(u(1.691, 6.439), u(-7.086, 2.158), u(3.423, 6.773)), u(-1.780, 2.381)), b(u(-1.754, 1.943), u(-0.046, 6.327), b(u(-6.667, 2.543), u(0.656, 7.903), b(u(-8.673, 3.639), u(-7.606, 1.435), u(-5.138, -2.409)))), u(-8.008, -0.317))) , 0.4487803553043079]

Matlab has FP support, so I think that this would be invalid.
I'm hesitant to allow this, since Matlab supports floating point numbers natively. If you can replace X and D with MIN_FLOAT and MAX_FLOAT (or whatever Matlab calls them), then this is a valid approach.
Yes, you could ouse realmax/realmin, you could even make a vector that goes throu all floating point number if you have enough memory.

Stack Exchange Network

Implement an API for probability distributions

Introduction

The functions

Test cases

6 Answers 6

Ruby, 103

MATLAB, 73

CJam, 58 bytes

Score count

Explanation

Mathematica, (削除) 129 (削除ここまで) 116 bytes

Python, 146 bytes

Matlab, 104 bytes

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Implement an API for probability distributions

Introduction

The functions

Test cases

6 Answers 6

Ruby, 103

MATLAB, 73

CJam, 58 bytes

Score count

Explanation

Mathematica, (削除) 129 (削除ここまで) 116 bytes

Python, 146 bytes

Matlab, 104 bytes

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions