1

I created a table like this,

CREATE TABLE Leaderboard(
 userId bigint not null, 
 matchId bigint not null,
 score mediumint not null,
 country CHAR(10),
 tournamentId int not null
)

I am creating matches of 5 people, all from 5 distinct countries ("UK", "USA", "SPAIN", "GERMANY", "FRANCE)

I want to insert some a million entries to this table with following requirements,

A match will be formed like this

usergermany | userUSA | userSPAIN | userUK | userFrance 

so a match will have 5 users, all from distinct countries. A tournament will have many matches, and a user can only play in one match in a tournament.

So an example table will look like this

userid matchid score country tournamentid
...
988654 3877543 random USA 177
388654 3877543 random GERMANY 177
433432 3877543 random FRANCE 177
776212 3877543 random UK 177
1632987 3877543 random SPAIN 177
2113242 3877544 random SPAIN 177
2918974 3877544 random USA 177
111738 3877544 random UK 177
1772342 3877544 random FRANCE 177
1343243 3877544 random GERMANY 177
123131 3877545 random UK 178
1231414 3877545 random FRANCE 178
2858348 3877545 random GERMANY 178
1122432 3877545 random USA 178
2923434 3877545 random SPAIN 178
...

A userId cannot exist in a tournament twice,

A country cannot exists in a match twice (match groups are formed from distinct countries)

Also since 5 players compete in a match, a matchid will only appear 5 times in the table.

I want to insert a million entries into this table, userid being randomly from 1 to 3 million with the above constraints. 10 thousand matches for each tournament, so there will be 100 tournaments starting from 1.

100 tournaments, 10000 matches each, make up to 1 million rows.

--UPDATE--

create table seq_data as
with recursive tmp(x) as (
 select 1
 union all
 select x+1 from tmp
 limit 3000000
)
select * from tmp;
 
CREATE TABLE if not exists Leaderboard(
 userId bigint not null,
 matchId bigint not null,
 score mediumint not null,
 country CHAR(10),
 tournamentId int not null
);
 
DELIMITER //
CREATE PROCEDURE InsertLeaderboardData()
BEGIN
 DECLARE tournamentloop INT DEFAULT 0;
 DECLARE matchloop INT DEFAULT 0;
 DECLARE groupLoop INT DEFAULT 0;
 DECLARE matchidKey INT DEFAULT 1;
 WHILE tournamentloop < 1 DO
 SET tournamentloop = tournamentloop + 1;
 SET matchloop = 0;
 WHILE matchloop < 10000 DO
 SET matchidKey = matchidKey + 1;
 SET matchloop = matchloop + 1;
 SET groupLoop = 0;
 WHILE groupLoop < 5 DO
 SET groupLoop = groupLoop + 1;
 INSERT INTO Leaderboard (userId, matchId, score, country, tournamentId)
 VALUES (1, matchidKey, FLOOR(RAND() * 1000) + 1, ELT(groupLoop, "SPAIN", "FRANCE", "UK", "USA", "GERMANY"), tournamentloop);
 SELECT matchidKey;
 END WHILE;
 END WHILE;
 END WHILE;
END;
//
 
DELIMITER ;
CALL InsertLeaderboardData();
DROP PROCEDURE InsertLeaderboardData;

This runs but slowly, yet I want 1 more feature.

As to INSERT INTO's userId I set the value to 1 for now to edit it later. What I want to do is, I created sequenced numbers into the table seq_data starting from 1 up to 3 million. For each WHILE tournamentloop < 100 DO scope, I want to get 50 thousand DISTINCT numbers from the seq_data table. so I can give the users an id.

The morale is, no userid can appear twice in a tournament. so for each tournament scope i need to retrieve random ids which are distinct and assign it to rows. I already created the table for sequenced numbers, I don't know what is the best approach from here.

-- Statement

INSERT INTO Leaderboard (userId, matchId, score, country, tournamentId) VALUES (useridKey, matchidKey, FLOOR(RAND() * 1000) + 1, ELT(groupLoop, "SPAIN", "FRANCE", "UK", "USA", "GERMANY"), tournamentloop);

I turn this into

DECLARE tournamentloop INT DEFAULT 0;
DECLARE matchloop INT DEFAULT 0;
DECLARE groupLoop INT DEFAULT 0;
DECLARE matchidKey INT DEFAULT 1;
DECLARE useridKey INT DEFAULT 1;
PREPARE insertStmt FROM 'INSERT INTO Leaderboard (userId, matchId, score, country, tournamentId)VALUES (useridKey, matchidKey, FLOOR(RAND() * 1000) + 1, ELT(groupLoop, "SPAIN", "FRANCE", "UK", "USA", "GERMANY"), tournamentloop);';
 EXECUTE insertStmt;

I get error: Unknown column 'useridKey' in 'field list'

asked Apr 4, 2024 at 7:21
6
  • Please study the difference between user-defined variables and local variables. And your task does not need in stored procedure, iterations and so on, needed dataset can be generated with single query which uses recursive CTE and statically generated countries table. Commented Apr 4, 2024 at 8:39
  • WITH RECURSIVE must be written only once! all CTEs (both recursive and non-) are written after it before outer SELECT separated with a comma. Study Reference manual for correct syntax. Also you need only one CTE which generates a list of natural integers. Then you'd join needed amount of its copies in the outer query and use each copy for its personal purposes adding needed WHERE which restricts the amount of used numbers to needed one. Commented Apr 4, 2024 at 13:37
  • Where can we find the organizations documented rules for these tournaments and examples of how they think the participants will be managed. Commented Apr 4, 2024 at 21:35
  • I don't know. I just have a small workshop I am working on which I explained in the question. Yet I still couldn't achieve. I hope @Akina can help because this is a bit advanced for me. Commented Apr 4, 2024 at 21:54
  • @Akina please demonstrate how I can use the join with outer query for this purpose, I tried with while yet it is very very slow. Please please demonstrate, I really want to learn. Commented Apr 5, 2024 at 2:58

1 Answer 1

2
INSERT INTO Leaderboard (matchID, score, country, tournamentID)
WITH RECURSIVE 
cte1 (num) AS ( SELECT 1 UNION ALL SELECT num + 1 FROM cte1 LIMIT 100 ),
cte2 (country) AS (SELECT "SPAIN" UNION ALL SELECT "FRANCE" UNION ALL SELECT "UK"
 UNION ALL SELECT "USA" UNION ALL SELECT "GERMANY")
SELECT t1.num, -- match ID
 RAND() * 100, -- random score in 0..99
 t3.country,
 t2.num -- tournament ID
FROM cte1 t1 -- match IDs
CROSS JOIN cte1 t2 -- tournament IDs
CROSS JOIN cte2 t3
WHERE t1.num <= 3 -- 3 matches per tournament
 AND t2.num < 5 -- 4 tournaments
;
-- totally the query creates 3 matches * 4 tournaments * 5 countries = 60 rows

fiddle

answered Apr 5, 2024 at 4:30
12
  • this is so great! indeed there is one last missing column, which is the userid, userid can be any number let's say between 1 to thousand but should be unique for every tournament id. (since a player can't play 2 matches at the same time in a tournament) Commented Apr 5, 2024 at 4:34
  • @umarkaa there is one last missing column, which is the userid See provided fiddle. userid is defined as autoincrement, so its values are filled automatically.. Commented Apr 5, 2024 at 4:38
  • spasiva. Can you also try this, as I am studying the code... Not making it a PK but a random number between 0-50... but unique to each tournamentid... that is all, Commented Apr 5, 2024 at 4:45
  • @umarkaa Using cte1 and cte2, create cte3 which selects, for each country, some amount of randomly selected Ids (for example, join random country to each separate id, this will produce a rowset where there is no a userid value which belongs to more than one coutry). I.e. you'll obtain random amount of pairs (userid, country) for each country (check that those amount of pairs per country is not too small, maybe use dynamic country selection). In outer query you'd select one random pair for current country, with correlated subquery, for example. Commented Apr 5, 2024 at 4:50
  • 1
    same userid will always belong to 1 country If so then create 100 users (use cte1), assign each user to definite country (use num MOD 5 as country index), insert random user with needed country into each table row. Commented Apr 5, 2024 at 5:00

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.