How do I create a user-defined aggregate function?

Question 1

I need an aggregate function that MySQL doesn't provide.

I would like it to be in MySQL's flavor of SQL (that is, not in C).

How do I do this? What I'm stuck on is creating an aggregate function -- the docs don't seem to mention how this is done.

Examples of desired usage of a product function:

mysql> select product(col) as a from `table`;
+------+
| a |
+------+
| 144 |
+------+
1 row in set (0.00 sec)
mysql> select col, product(col) as a from `table` group by col;
+-----+------+
| col | a |
+-----+------+
| 6 | 36 |
| 4 | 4 |
+-----+------+
2 rows in set (0.01 sec)

Question 2

According to the documentation http://dev.mysql.com/doc/refman/5.5/en/adding-udf.html it's only possible to write aggregate functions in C. Sorry!

Question 3

Either C or C++. Not SQL, anyway.

Question 4

I presume any language which can generate binary libraries in the platform-supported binary format with C calling conventions.

Question 5

I don't know. It was documented as "C or C++ (or another language that can use C calling conventions)" in version 5.0. The docs dropped "or another language that can use C calling conventions" in version 5.1. That's a strange phrase to drop.

Question 6

is it available in recent mysql versions now (after few years) ?

Question 7

I don't know if there is way to define a new aggregate function, not without messing with MySQL source code.

But if your numbers are all positive, you may well derive from the arithmetic identity:

log( product( Ai ) ) = sum( log( Ai ) )

that you can use EXP(SUM(LOG(x))) to calculate PRODUCT(x). Test in SQL-Fiddle:

SELECT EXP(SUM(LOG(a))) AS product
FROM t ;
SELECT col, EXP(SUM(LOG(a))) AS product
FROM t 
GROUP BY col ;

When the data can have 0s, it gets a bit more complicated:

SELECT (NOT EXISTS (SELECT 1 FROM t WHERE a = 0)) 
 * EXP(SUM(LOG(a))) AS p
FROM t 
WHERE a > 0 ;
SELECT d.col, 
 (NOT EXISTS (SELECT 1 FROM t AS ti WHERE ti.col = d.col AND ti.a = 0)) 
 * COALESCE(EXP(SUM(LOG(t.a))),1) AS p
FROM 
 ( SELECT DISTINCT col
 FROM t
 ) AS d
 LEFT JOIN
 t ON t.col = d.col
 AND t.a > 0
GROUP BY d.col ;

Tested at SQL-Fiddle

For other DBMS, that do not have MySQL's auto-conversion of boolean values to integers, the

(NOT EXISTS (SELECT ...))

should be replaced with:

(CASE WHEN EXISTS (SELECT 1...) THEN 0 ELSE 1 END)

Specifically for Oracle, a few more changes will be needed, without changing the logic of the answer, only because Oracle does not follow strict ANSI standard in some areas. Tested at SQL-Fiddle-2

Question 8

That's sweet. High school math has come back to haunt me. +1 !!!

Question 9

Cool math, but I actually wanted to know how to create an aggregate function in general. product was just supposed to be one example of several.

Question 10

This is pretty cool, but doesn't work if any of the values are zero, since log(0) is undefined.

Question 11

@jameshfisher Correct. One can write easily the extra condition, checking for zeros (where the product would be zero of course). I didn't think at the time it was necessary to add that complication.

Question 12

It's not clear to me how best to add that condition. We can't add the condition in the inner function of the values: since we want that PRODUCT(..., 0, ...) = 0, we want that EXP(SUM(..., f(0), ...)) = 0, for some f that we choose, but to satisfy this, we need that SUM(..., f(0), ...) = LOG(0) -- again thwarted by the same issue that log(0) is undefined. We need to check for the presence of zero in some other way, e.g. MIN(ABS(a)) = 0. So we'd have SELECT CASE WHEN MIN(ABS(a)) = 0 THEN 0 ELSE EXP(SUM(LOG(a))) END AS product. Is this the kind of thing you were thinking of?

Question 13

In the interests of learning how to fish, I have successfully compiled and installed a "Hello, World!" UDF (user-defined function) for MySQL found here. The hello_world.so file (after being complied with gcc -shared -o hello_world.so -I /usr/include/mysql hello_world.c) should be stored in /usr/lib/mysql/plugins/ with 755 permissions on Ubuntu linux systems. [The "-I /usr/include/mysql" is the path to the mysql header files; I found my code wouldn't compile without this parameter, but YMMV.]

The program does nothing but print out the string "Hello, World!" for each record in the resulting dataset of a query, but that's all it is supposed to do. I will try to write a SMALL aggregate function in the next few days. There is an example of an aggregate function which computes the average cost of a group of price and quantity records; the SMALL function shouldn't be that different from that function in the end.

Hope this helps.

Question 14

Technically not a user defined aggregate function, and not optimal, but this works and doesn't need to create an UDF: use GROUP_CONCAT() in your query which will make your query an aggregate one and will consolidate all values in one comma-separated string, and then write a SQL function that processes that string.

Example (function based on this answer)::

DELIMITER $$
CREATE FUNCTION `replace_multiple`(_text text, _from text, _to text) RETURNS text CHARSET utf8mb4
 DETERMINISTIC
BEGIN
DECLARE _nextfrom TEXT DEFAULT NULL;
DECLARE _nextto TEXT DEFAULT NULL;
DECLARE _nextlenfrom INT DEFAULT NULL;
DECLARE _nextlento INT DEFAULT NULL;
DECLARE _valuefrom TEXT DEFAULT NULL;
DECLARE _valueto TEXT DEFAULT NULL;
iterator:
LOOP
 -- exit the loop if the list seems empty or was null;
 -- this extra caution is necessary to avoid an endless loop in the proc.
 IF CHAR_LENGTH(TRIM(_from)) = 0 OR _from IS NULL THEN
 LEAVE iterator;
 END IF;
 -- capture the next value from the list
 SET _nextfrom = SUBSTRING_INDEX(_from,',',1);
 SET _nextto = SUBSTRING_INDEX(_to,',',1);
 -- save the length of the captured value; we will need to remove this
 -- many characters + 1 from the beginning of the string
 -- before the next iteration
 SET _nextlenfrom = CHAR_LENGTH(_nextfrom);
 SET _nextlento = CHAR_LENGTH(_nextto);
 -- trim the value of leading and trailing spaces, in case of sloppy CSV strings
 SET _valuefrom = TRIM(_nextfrom);
 SET _valueto = TRIM(_nextto);
 -- use the extracted value
 SET _text = replace(_text, _valuefrom, _valueto);
 -- rewrite the original string using the `INSERT()` string function,
 -- args are original string, start position, how many characters to remove,
 -- and what to "insert" in their place (in this case, we "insert"
 -- an empty string, which removes _nextlen + 1 characters)
 SET _from = INSERT(_from,1,_nextlenfrom + 1,'');
 SET _to = INSERT(_to,1,_nextlento + 1,'');
END LOOP;
RETURN _text;
END$$
DELIMITER ;

Table t:

id	fromThis	toThat
1	1	one
2	2	two
3	3	three

Query:

SELECT replace_multiple('my 2 cents', GROUP_CONCAT(fromThis), GROUP_CONCAT(toThat))
FROM t;

Will translate to:

SELECT replace_multiple('my 2 cents', '1,2,3', 'one,two,three')

Which for this specific custom function will return my two cents

Colin 't Hart Colin 't Hart 9,51015 gold badges37 silver badges44 bronze badges · Accepted Answer · 2012-10-17 14:11:28Z

8

According to the documentation http://dev.mysql.com/doc/refman/5.5/en/adding-udf.html it's only possible to write aggregate functions in C. Sorry!

Share

Improve this answer

answered Oct 17, 2012 at 14:11

Colin 't Hart's user avatar

Colin 't Hart Colin 't Hart

9,51015 gold badges37 silver badges44 bronze badges

4

Either C or C++. Not SQL, anyway.

Mike Sherrill 'Cat Recall'
– Mike Sherrill 'Cat Recall'

2012年10月18日 00:49:53 +00:00
Commented Oct 18, 2012 at 0:49
1

I presume any language which can generate binary libraries in the platform-supported binary format with C calling conventions.

Colin 't Hart
– Colin 't Hart

2012年10月18日 07:30:12 +00:00
Commented Oct 18, 2012 at 7:30
I don't know. It was documented as "C or C++ (or another language that can use C calling conventions)" in version 5.0. The docs dropped "or another language that can use C calling conventions" in version 5.1. That's a strange phrase to drop.

Mike Sherrill 'Cat Recall'
– Mike Sherrill 'Cat Recall'

2012年10月18日 10:18:33 +00:00
Commented Oct 18, 2012 at 10:18
is it available in recent mysql versions now (after few years) ?

Dinesh
– Dinesh

2017年06月23日 16:01:32 +00:00
Commented Jun 23, 2017 at 16:01

Add a comment |

Stack Exchange Network

How do I create a user-defined aggregate function?

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

How do I create a user-defined aggregate function?

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions