[Solved] Selecting unique random combinations from a table

Discuss the database features

[Solved] Selecting unique random combinations from a table

Postby dreamquartz » Tue May 11, 2021 8:06 pm

Hi All,

I need to be able to select records from a predefined list, where the following is the case:
1 The combination of L-M MUST be unique for a requested number of records
2. L MUST be unique when selected
3. M MUST be unique when selected

The list is in random order, and will be re-ordered every time a new resultset, based on a randomly requested set of records is required.

I have the following list:
# L M
-----------------
10 10 7
49 4 7
43 3 9
45 3 12
47 4 4
15 15 10
38 3 3
24 1 8
5 5 7
7 7 9
27 1 14
54 22 9
53 4 21
50 4 8
48 4 5
32 2 6
31 2 4
14 14 17
16 16 2
0 1 5
42 3 8
21 21 1
26 1 11
46 4 3

I need to be able to do a selection of e.g. 7 records
Based on the 3 Criteria above, the following resultset should be produced, based on the generated list.
# L M
-----------------
10 10 7
43 3 9
47 4 4
15 15 10
24 1 8
32 2 6
14 14 17

Can someone please provide me with some guidance?

Thanks,

Dream
Last edited by dreamquartz on Tue Jun 08, 2021 9:24 am, edited 2 times in total.
LO 6.x, HSQLDB 2.6.0 & Ubuntu 20.04 LTS.
dreamquartz
 
Posts: 801
Joined: Mon May 30, 2011 4:02 am

Re: Selecting unique random combinations from a table

Postby Villeroy » Tue May 11, 2021 8:14 pm

Code: Select all   Expand viewCollapse view
SELECT * FROM "Table Name" ORDER BY RAND() LIMIT 7
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
User avatar
Villeroy
Volunteer
 
Posts: 29711
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Selecting unique random combinations from a table

Postby dreamquartz » Wed May 12, 2021 1:27 am

Thanks for your response Villeroy.

The Query
Villeroy wrote:
Code: Select all   Expand viewCollapse view
SELECT * FROM "Table Name" ORDER BY RAND() LIMIT 7
will provide something like

# L M
13 13 11
35 2 9
47 4 4
17 17 6
22 1 9
29 1 6
50 4 8
This resultset is not acceptable
22 1 9 can not be selected because of criterium 2. L MUST be unique
22 1 9 can not be selected because of criterium 3. M MUST be unique
- both L and M are already present
29 1 6 fails also criterium 2. and 3.
50 4 8 fails criterium 2.

As a result of using RAND(), the table will always have a different order, and therefore hardcoding will not work.
For a resultset of more than 1, the 2nd selection is dependant of what is selected in the 1st.
Ergo, all new selections will be dependant of what is allowed in the all selections made prior.

Dream
LO 6.x, HSQLDB 2.6.0 & Ubuntu 20.04 LTS.
dreamquartz
 
Posts: 801
Joined: Mon May 30, 2011 4:02 am

Re: Selecting unique random combinations from a table

Postby gkick » Wed May 12, 2021 5:56 am

Your data contains duplicates. I may be wrong, however I think the random function will not exclude duplication by default, it just sorts the recordset in random order
Libre Office 6.4.6 on Windows 10 HSQL 2.51 backend
gkick
 
Posts: 292
Joined: Wed Aug 07, 2019 5:24 pm
Location: Chile

Re: Selecting unique random combinations from a table

Postby dreamquartz » Wed May 12, 2021 7:40 pm

gkick wrote:Your data contains duplicates. I may be wrong, however I think the random function will not exclude duplication by default, it just sorts the recordset in random order

You are right in part.
I understand your remark.
Criterium 2. and Criterium 3. might not be clear.
I have updated the initial Criteria.
Thanks for noticing that.

All, however do fulfil the requirement under Criterium 1.
L and M can re-occur, but the combination can NOT.
L=4 is NOT EQUAL to M=4.
Think about replacing all the numerical values under L with letters, make L=4 NOT L=D (d is the 4th letter of the alphabet), but some random letter.
The table would then look like something like this:

# L M
-----------------
10 A 7
49 T 7
43 K 9
45 K 12
47 T 4
15 S 10
38 K 3
24 U 8
5 N 7
7 W 9
27 U 14
54 B 9
53 T 21
50 T 8
48 T 5
32 X 6
31 X 4
14 Q 17
.
.
.


where L 10=>L A, L 3=>L K ......

The resultset would then look like
# L M
-----------------
10 A 7
43 K 9
47 T 4
15 S 10
24 U 8
32 X 6
14 Q 17

Maybe that helps a little.

Dream
LO 6.x, HSQLDB 2.6.0 & Ubuntu 20.04 LTS.
dreamquartz
 
Posts: 801
Joined: Mon May 30, 2011 4:02 am

Re: Selecting unique random combinations from a table

Postby chrisb » Thu May 13, 2021 5:07 pm

it's a tough nut to crack using plain sql.
the difficulty lies in having to test two fields (which means that we cannot use MERGE) & discarding those records in which we find a match.
i think that the only way to do this is to create a table into which:
1) we can insert those records which meet our criteria.
2) be used for reference as we parse the data set.

so select the first record from the list source
parse the storage table we created
if no match is found then add to store
if a match is found then do nothing
then if we have more records repeat the process

i tried to do this using recursion but was unable to parse the partially derived table.
i succeeded using a macro & a temporary table (not difficult).
i also succeeded by using a permanent table & a procedure which is the neatest solution & can be found below.
the only problem in using a procedure is that it must be executed from menu:Tools>SQL or alternatively via a macro which could be fired by a form button.
the procedure requires a view or table as its data-source.
it is most likely that a function almost identical to the procedure could be created & a function can be executed from within a query.

i only used the fields "#", "L" & "M" (all integers) as shown in your initial post.

--CREATE PROCEDURE MYLIST
--this procedure inserts an amended list into a table called "tMyNewList"
--create table "tMyNewList"(id int not null, l int not null, m int not null);
--'vListSource' is the name of the table/view which contains the original list data
--both 'vListSource' & 'tMyNewList' must exist before the procedure can be created
--make sure to replace "vListSource" with the relevant table/view name
--drop procedure MYLIST;

--to use this procedure paste:
--CALL MYLIST()
--into menu:Tools>SQL & execute or execute via a macro

--to retrieve data 'select * from "tMyNewList"'
--or 'select top 7 * from "tMyNewList"'
--or 'select * from "tMyNewList" where rownum() <=7'

Code: Select all   Expand viewCollapse view
create procedure MYLIST()
   language sql
   not deterministic
   modifies sql data
   begin atomic
      declare iI, iL, iM int;
      declare undo handler for sqlexception;
      delete from "tMyNewList";

      for select "#" id, L, M from "vListSource"
         do
         set iI = ID; set iL = L; set iM = M;
         for select count(*) c from "tMyNewList" where L = iL or M = iM
            do
            if c = 0 then
               insert into "tMyNewList" values(iI, iL, iM);
            end if;
         end for;
      end for;
   end
open office 4.1.8 & LibreOffice 6.4.4.2 x64 using HSQL 1.8.0.10 (Embedded) and HSQL 2.5.2 (Split) on Windows 10
chrisb
 
Posts: 262
Joined: Mon Jun 07, 2010 4:16 pm

Re: Selecting unique random combinations from a table

Postby eremmel » Thu May 13, 2021 11:13 pm

@chrisb,
I thought about a recursive query, but I've not worked it out.
The idea is that each level concatenates the L (and M) values in a string-column like "-41-2-34-4-", named L_CON (and M_CON) and on the recursion join condition you make sure that INSTR(X.L_CON, "-" + Y.L + "-") = 0. In this way only unique values are joined in the recursion.
 Edit: A quick test: 

This works in Postgres, I've no HSQLDB 2.x at hand. Replace STRPOS with INSTR. A performant working might depend on the query plan.
Code: Select all   Expand viewCollapse view
WITH RECURSIVE random_data AS (
  SELECT id, l, m, 0 as level
   , CONCAT('-', l, '-') as l_con
   , CONCAT('-', m, '-') as m_con
  FROM ( SELECT id, l, m FROM test ORDER BY RANDOM() LIMIT 1 ) P
UNION ALL
  SELECT id, l, m, level, l_con, m_con
  FROM (
   SELECT X.id, X.l, X.m, Y.level+1 as level
      , CONCAT(Y.l_con, X.l, '-') as l_con
      , CONCAT(Y.m_con, X.m, '-') as m_con
   FROM ( SELECT id, l, m FROM test ORDER BY RANDOM() ) X
         INNER JOIN random_data Y ON
            STRPOS(Y.l_con, CONCAT('-', X.l, '-') ) = 0
            AND STRPOS(Y.m_con, CONCAT('-', X.m, '-')) = 0
   LIMIT 1
   ) Z
)
SELECT id, l, m, R.level, R.m_con, R.l_con
FROM random_data R
WHERE R.level < 7

Random output:
Code: Select all   Expand viewCollapse view
50   4   8  0  "-8-"              "-4-"
38   3   3  1  "-8-3-"              "-4-3-"
10  10   7  2  "-8-3-7-"            "-4-3-10-"
31   2   4  3  "-8-3-7-4-"          "-4-3-10-2-"
26   1  11  4  "-8-3-7-4-11-"       "-4-3-10-2-1-"
54  22   9  5  "-8-3-7-4-11-9-"     "-4-3-10-2-1-22-"
15  15  10  6  "-8-3-7-4-11-9-10-"  "-4-3-10-2-1-22-15-"

It's Microsoft marketing that tells you computers are qualified for non-technicians
W10 (build 19042), LO 7.0.4.2(x64)
eremmel
 
Posts: 1057
Joined: Tue Dec 30, 2008 1:15 am

Re: Selecting unique random combinations from a table

Postby chrisb » Fri May 14, 2021 2:58 pm

@eremmel
EDIT:
i have just logged on to post a solution based on your suggestion but you have beaten me to it.
------------------------------------------------------------------------------------------------
well i would never have thought of this in a million years but your suggestion has yielded a result.
many thanks & well done!!!
 Edit: 14 May 2021: my concatanation was overly complex & has been updated 

Code: Select all   Expand viewCollapse view
with recursive
v(id, l, m, r) as --used derived table because we need a level indicator 'r'
(select "#", l, m, rownum() from "tListSource")
,
t(r, l_con, m_con) as --this is the recursive bit using string manupulation as suggested by eremmel
(
   select
       r,
      cast('-' || l || '-' as varchar (500)),
      cast('-' || m || '-' as varchar(500))
   from v
   where r = 1
   union all
   select
      v.r,
      t.l_con || case when locate('-' || v.l || '-', t.l_con) + locate('-' || v.m || '-', t.m_con) = 0 then v.l || '-' else '' end,
      t.m_con || case when locate('-' || v.l || '-', t.l_con) + locate('-' || v.m || '-', t.m_con) = 0 then v.m || '-' else '' end
   from t
   join v
   on v.r = t.r + 1
)
select v.* from --the final output
   (select min(r) min_r from t group by l_con) y
   join v on v.r = y.min_r
Last edited by chrisb on Mon Jun 07, 2021 4:33 pm, edited 1 time in total.
open office 4.1.8 & LibreOffice 6.4.4.2 x64 using HSQL 1.8.0.10 (Embedded) and HSQL 2.5.2 (Split) on Windows 10
chrisb
 
Posts: 262
Joined: Mon Jun 07, 2010 4:16 pm

Re: Selecting unique random combinations from a table

Postby chrisb » Sat May 15, 2021 10:41 am

@eremmel
i made the necessary amendments to make your code HSQLDB 2.5.1 compatible.
it seems that HSQLDB does not like the join.
i always retrieve 2 random records at LEVELS 0 & 1.
the INSTR & CONCAT functions do work as expected when tested in isolation.
if i use UNION ALL then zero records are returned & i receive the error message 'query execution reached limit of recursion'.
am i missing something which should be obvious?
Code: Select all   Expand viewCollapse view
--qEremmelsCodeHSQL
WITH RECURSIVE random_data(id, l, m, level, l_con, m_con) AS
(
  SELECT id, l, m, 0 as level
   , CONCAT('-', l, '-') as l_con
   , CONCAT('-', m, '-') as m_con
  FROM ( SELECT id, l, m FROM test ORDER BY RAND() LIMIT 1 ) P
UNION --ALL
  SELECT id, l, m, level, l_con, m_con
  FROM (
   SELECT X.id, X.l, X.m, Y.level+1 as level
      , CONCAT(Y.l_con, X.l, '-') as l_con
      , CONCAT(Y.m_con, X.m, '-') as m_con
   FROM ( SELECT id, l, m FROM test ORDER BY RAND() ) X
         INNER JOIN random_data Y ON
            instr(Y.l_con, CONCAT('-', X.l, '-') ) = 0
            AND instr(Y.m_con, CONCAT('-', X.m, '-')) = 0
   LIMIT 1
   ) Z
)
SELECT id, l, m, R.level, R.m_con, R.l_con
FROM random_data R
--WHERE R.level < 7

QueryOutput.PNG
typical output
QueryOutput.PNG (4.04 KiB) Viewed 6169 times
open office 4.1.8 & LibreOffice 6.4.4.2 x64 using HSQL 1.8.0.10 (Embedded) and HSQL 2.5.2 (Split) on Windows 10
chrisb
 
Posts: 262
Joined: Mon Jun 07, 2010 4:16 pm

Re: Selecting unique random combinations from a table

Postby eremmel » Sat May 15, 2021 6:21 pm

Hi @chisb,
There is a discussion Recursive query regression in 2.3.2 on this subject. I believe HyperSQL takes here the wrong angle, see the post at 2018-09-14. There never got a reply. In my opinion HyperSQL is the only one with the formal correct (useless) implementation of recursive-with-union-all. :(

It also looks like that the limit 1 in the second query is applied to the complete recursive-loop and not applied at each level individual. That might be a bug, or it is also according the official standard.... :(

Maybe we can replace the limit 1 with:
 Edit: Code changed after testing in PostgreSql 

Code: Select all   Expand viewCollapse view
    SELECT id, l, m, level, l_con, m_con, ROWNUM() AS r
  FROM (
   SELECT X.id, X.l, X.m, Y.level+1 as level
      , CONCAT(Y.l_con, X.l, '-') as l_con
      , CONCAT(Y.m_con, X.m, '-') as m_con
   FROM ( SELECT id, l, m FROM test ORDER BY RAND() ) X
         INNER JOIN random_data Y ON
            instr(Y.l_con, CONCAT('-', X.l, '-') ) = 0
            AND instr(Y.m_con, CONCAT('-', X.m, '-')) = 0
   ) Z
  WHERE Z.r = 1

We let it join with many candidates and take only the first one. Hopefully database is smart enough to limit processing.

This also an interesting discussion about recursiveness and union vs union all.

P.S.
When it comes to open source databases I only advice PostgreSql and SQLLite (small implementations). Not MySQL nor HyperSQL.
Last edited by eremmel on Sat May 15, 2021 8:53 pm, edited 3 times in total.
It's Microsoft marketing that tells you computers are qualified for non-technicians
W10 (build 19042), LO 7.0.4.2(x64)
eremmel
 
Posts: 1057
Joined: Tue Dec 30, 2008 1:15 am

Re: Selecting unique random combinations from a table

Postby Villeroy » Sat May 15, 2021 7:29 pm

Thank you gentlemen. Interesting to see the limits of popular database engines. As a hobbyist I never jumped across the fence of "SELECT FROM WHERE GROUP ORDER".
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
User avatar
Villeroy
Volunteer
 
Posts: 29711
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Selecting unique random combinations from a table

Postby eremmel » Sat May 15, 2021 8:34 pm

Villeroy wrote:Thank you gentlemen. Interesting to see the limits of popular database engines. As a hobbyist I never jumped across the fence of "SELECT FROM WHERE GROUP ORDER".

Villeroy, next to a life as hobbyist I expect also a professional life with lots of SQL ;).
It's Microsoft marketing that tells you computers are qualified for non-technicians
W10 (build 19042), LO 7.0.4.2(x64)
eremmel
 
Posts: 1057
Joined: Tue Dec 30, 2008 1:15 am

Re: Selecting unique random combinations from a table

Postby eremmel » Sat May 15, 2021 10:13 pm

@chrisb,
I noticed that PostgreSql handles a CTE (i.e. named-query in WITH ... SELECT) as a (temporary) materialized view. and I belief that this is a common approach for database implementation. (recently the phrase [NOT] MATERIALIZED is added to databases for WITH-clause). We can use this to speed up our query:
1 The ORDER BY RAND() is only applied once and not with each occurrence and nesting level.
2 When we number the rows from CTE we only need to examine those that are not 'seen' yet by the previous nesting levels.
This give a HSQL query like (not tested):
 Edit: Just got HSQLDB 2.6.0 and Java 11 with this query it works 

Code: Select all   Expand viewCollapse view
WITH RECURSIVE
-- Expect that CTE expression is handled as temp. materialized view
random_table(id, l, m, r) AS (
  SELECT id, l, m, rownum() AS r
  FROM ( SELECT * FROM PUBLIC.TEST ORDER BY RAND() ) W
),
random_data(id, l, m, level, r, l_con, m_con) AS (
  SELECT id, l, m, 0 AS level, r,
    CONCAT(CAST('-' as VARCHAR(1000)), l, '-')  AS l_con,
    CONCAT(CAST('-' as VARCHAR(1000)), m, '-')  AS m_con
  FROM random_table
  WHERE r = 1
UNION ALL
  SELECT id, l, m, level, r,
    l_con,
    m_con
  FROM ( SELECT X.id, X.l, X.m, Y.level + 1 AS level, X.r,
           CONCAT(Y.l_con, X.l, '-') AS l_con,
           CONCAT(Y.m_con, X.m, '-') AS m_con,
           ROW_NUMBER() OVER() AS r_limit
         FROM random_table X
           JOIN random_data Y ON
         Y.r < X.r  -- Look only to rows we did not see yet
         AND instr(Y.l_con, CONCAT('-', X.l, '-')) = 0
         AND instr(Y.m_con, CONCAT('-', X.m, '-')) = 0
  ) Z
  WHERE Z.r_limit = 1
)
SELECT id, l, m, level, r, m_con, l_con
FROM random_data
WHERE level < 8
ORDER BY level

One trick was to add cast('-' as varchar(1000) to specify the buffer length for concatenation during recursion.
Last edited by eremmel on Sat May 15, 2021 11:09 pm, edited 4 times in total.
It's Microsoft marketing that tells you computers are qualified for non-technicians
W10 (build 19042), LO 7.0.4.2(x64)
eremmel
 
Posts: 1057
Joined: Tue Dec 30, 2008 1:15 am

Re: Selecting unique random combinations from a table

Postby chrisb » Sat May 15, 2021 10:15 pm

 Edit: we must have been submitting our posts at the same time so will respond tomorow. 

@eremmel,
Maybe we can replace the limit 1 with:

tried it but sadly same result.

there was an issue with all versions of HSQL 2.x when using UNION ALL within a recursive query where the query would always freeze but since HSQLDB 2.5.1 i have had no issues.
by the way i only use HSQLDB when responding to users on this forum (and playing around).

my first attempt at solving this thread was to use recursion but i was unable to parse the partially derived table.

as it goes user dreamquartz has been provided with a solution thanks to your suggestion & i have shaken the cobwebs from my head (temporarily at least).
i am thinking that a list which is randomly generated & cannot reliably be reproduced would require storage but i may be wrong.
open office 4.1.8 & LibreOffice 6.4.4.2 x64 using HSQL 1.8.0.10 (Embedded) and HSQL 2.5.2 (Split) on Windows 10
chrisb
 
Posts: 262
Joined: Mon Jun 07, 2010 4:16 pm

Re: Selecting unique random combinations from a table

Postby chrisb » Sat May 15, 2021 11:12 pm

@eremmel
i just had to try your new code otherwise i will not be able to sleep!
the word MATERIALIZED does not exist in the HSQLDB 2.5.1 manual
i ran the code below but the result is the exact same as was produced by your first code.
it's just not having that join.
in the very first post dreamquartz said:
The list is in random order

Code: Select all   Expand viewCollapse view
--Expect that CTE expression is handled as temp. materialized view (HSQL 2.5.1 compatible)
with recursive
random_table(id, l, m, r) AS
(
  SELECT id, l, m, rownum() AS r
  FROM ( SELECT * FROM test ORDER BY RAND() ) W
)
,
random_data(id, l, m, level, r, l_con, m_con) AS
(
  SELECT id, l, m, 0 AS level, r,
    CONCAT('-', l, '-') AS l_con,
    CONCAT('-', m, '-') AS m_con
  FROM random_table
  WHERE r = 1
UNION
  SELECT id, l, m, level, r,
    l_con,
    m_con
  FROM ( SELECT X.id, X.l, X.m, Y.level + 1 AS level, X.r,
           CONCAT(Y.l_con, X.l, '-') AS l_con,
           CONCAT(Y.m_con, X.m, '-') AS m_con,
           ROWNUM() AS r_limit
         FROM random_table X
           JOIN random_data Y ON
         Y.r < X.r  -- Look only to rows we did not see yet (when failing remove this condition)
         AND instr(Y.l_con, concat('-', X.l, '-')) = 0
         AND instr(Y.m_con, concat('-', X.m, '-')) = 0
) Z
  WHERE Z.r_limit = 1
)
SELECT id, l, m, level, r, m_con, l_con
FROM random_data
open office 4.1.8 & LibreOffice 6.4.4.2 x64 using HSQL 1.8.0.10 (Embedded) and HSQL 2.5.2 (Split) on Windows 10
chrisb
 
Posts: 262
Joined: Mon Jun 07, 2010 4:16 pm

Re: Selecting unique random combinations from a table

Postby eremmel » Sat May 15, 2021 11:39 pm

The list is in random order

So we do not have to apply the 'ORDER BY RAND()'? That is easy solved. But than we get the issue that dreamquarz might belief that reading from a table without any ORDER BY results in a repeatable result set. It looks so but no database have to, unless an ORDER BY is specified.

Assume that the table of dreamquartz has a certain order, it might be possible that the very first sequence (depth first) that can be generated ends at level N, but the level that is requested is larger than N. In that case the query will not work. This is an interesting complication: find the first correct sequence of at least depth N and do not return any sequences that ends before N.
It's Microsoft marketing that tells you computers are qualified for non-technicians
W10 (build 19042), LO 7.0.4.2(x64)
eremmel
 
Posts: 1057
Joined: Tue Dec 30, 2008 1:15 am

Re: Selecting unique random combinations from a table

Postby chrisb » Sun May 16, 2021 12:23 am

eremmel said
Edit: Just got HSQLDB 2.6.0 and Java 11 with this query it works

that's great.
i tried your code with:
HSQLDB 2.5.1
openoffice 4.1.8, java 1.8.0_291 x86
libeoffice 6.4.4.2 (x64), java 1.8.0_291 x86_64

both produce the same error: 'query execution reached limit of recursion'

HELLO downloaded HSQLDB-2.5.2.jar & YES your query is now working :D
open office 4.1.8 & LibreOffice 6.4.4.2 x64 using HSQL 1.8.0.10 (Embedded) and HSQL 2.5.2 (Split) on Windows 10
chrisb
 
Posts: 262
Joined: Mon Jun 07, 2010 4:16 pm

Re: Selecting unique random combinations from a table

Postby dreamquartz » Sun May 16, 2021 7:18 am

@eremmel,
The table can be any length, but definitely not larger than 200 records and is in complete random order.
We even randomize the table for this issue.
The initial 3 criteria are always the case, and the only thing is that the L-M combination is hardcoded.

@chrisb
We are running 2.6.0 at this point in time on Ubuntu 20.04.

The Clients and us do not run it on an other Operating System.

General
Thanks for also testing your responses on other platforms and versions.
I know this problem does occur in many variations.
Leaving this one open for now....

Dream

P.S.: Updated my signature
LO 6.x, HSQLDB 2.6.0 & Ubuntu 20.04 LTS.
dreamquartz
 
Posts: 801
Joined: Mon May 30, 2011 4:02 am

Re: Selecting unique random combinations from a table

Postby eremmel » Sun May 16, 2021 2:44 pm

@dreamquartz,

Here is a query you might use. It should obey your requirements. It takes the table as offered (so no extra randomization applied).
This query is slightly different than the ones before. The former queries looked only to one next row that can be added. When that path ended no other options were analyzed. This implies that the length of that path might be shorter than you requested. So I had to change this a bit. The recursion now considers all (or many) paths until it finds the very first path with the requested length. Based on that path parents are reconstructed.

I''ve impression that HyperSQL finds it difficult to handle these recursive queries. Looks like it take a breadth-first approach. So this might become expensive with a larger table. But when I enable a WHERE clause on Z.r_limit (see comments) to limit the number of searches the query ran fast.

The only thing you should do is look up the TODOs in the query.
Code: Select all   Expand viewCollapse view
WITH RECURSIVE
-- Expect that CTE expression is handled as temp. materialized view
random_table(id, l, m, r) AS (
  SELECT "#" as id, l, m, rownum() AS r
  FROM  PUBLIC.TEST  -- TODO: Here has to go your table name
),
random_data(id, l, m, level, r, l_con, m_con) AS (
  SELECT id, l, m, 1 AS level, r,
    CONCAT(CAST('-' as VARCHAR(1000)), l, '-')  AS l_con,
    CONCAT(CAST('-' as VARCHAR(1000)), m, '-')  AS m_con
  FROM random_table
  WHERE r = 1
UNION
  SELECT id, l, m, level, r,
   l_con,
   m_con
  FROM (
         SELECT X.id, X.l, X.m, Y.level + 1 AS level, X.r,
           CONCAT(Y.l_con, X.l, '-') AS l_con,
           CONCAT(Y.m_con, X.m, '-') AS m_con,
          rownum() AS r_limit
         FROM random_table X
           JOIN random_data Y ON
         Y.r < X.r  -- Look only to rows we did not see yet
         AND instr(Y.l_con, CONCAT('-', X.l, '-')) = 0
         AND instr(Y.m_con, CONCAT('-', X.m, '-')) = 0
  ) Z
-- Use this WHERE-clause to limit the number of searches:
-- A too low number might result in empty result
-- Removing gives a slow query, so make it just about equal to table size.
WHERE Z.r_limit < 400
)
SELECT Q.id AS "#", Q.l, Q.m, Q.level, Q.r, Q.m_con, Q.l_con -- TODO select columns needed
FROM (
  -- Select first very first row at the requested level.
-- This is  the last row with the full-path
  -- Use this to select all the parents.
  SELECT id, l, m, level, r, m_con, l_con
  FROM random_data
  WHERE level = 10  -- TODO here goes the requested number or records.
  LIMIT 1
) P INNER JOIN random_data Q ON
    Q.r <= P.r
   AND instr(P.l_con, Q.l_con) = 1
   AND instr(P.m_con, Q.m_con) = 1
ORDER BY Q.level


@chrisb, It was fun working with you t tackle this brain-killer.
It's Microsoft marketing that tells you computers are qualified for non-technicians
W10 (build 19042), LO 7.0.4.2(x64)
eremmel
 
Posts: 1057
Joined: Tue Dec 30, 2008 1:15 am

Re: Selecting unique random combinations from a table

Postby dreamquartz » Wed May 19, 2021 7:25 pm

@eremmel,

One thing I found that if PUBLIC.TEST is not randomized, you end up with the same results in the same order all the time.
I have the original table, of which the table in the OP is a part, randomized into a View, that is used in as replacement for PUBLIC.TEST.

Dream
LO 6.x, HSQLDB 2.6.0 & Ubuntu 20.04 LTS.
dreamquartz
 
Posts: 801
Joined: Mon May 30, 2011 4:02 am

Re: Selecting unique random combinations from a table

Postby eremmel » Wed May 19, 2021 7:58 pm

dreamquartz wrote:@eremmel,

One thing I found that if PUBLIC.TEST is not randomized, you end up with the same results in the same order all the time.
Dream

Yes, I did not randomize the data in my query last because randomization is performed already, according your specifications:
The list is in random order, and will be re-ordered every time a new resultset, based on a randomly requested set of records is required.

The last part of this statement "based on a randomly requested set of records is required" This might imply that not always a random result is wanted.
You can easily use a view of change the query by replacing PUBLIC.TEST with a derived table; this is found in the other queries.
It's Microsoft marketing that tells you computers are qualified for non-technicians
W10 (build 19042), LO 7.0.4.2(x64)
eremmel
 
Posts: 1057
Joined: Tue Dec 30, 2008 1:15 am

Re: Selecting unique random combinations from a table

Postby dreamquartz » Wed May 19, 2021 8:27 pm

Indeed,

Thanks

Dream

PS can you offer more information about the workings of WITH RECURSIVE, other than I found in the documentation, the LO and OO forums, and HSQLDB.ORG?
I can imagine simple structures that can be explained with EXPLAIN PLAN?
Maybe you could expand/modify this topic into a tutorial......
LO 6.x, HSQLDB 2.6.0 & Ubuntu 20.04 LTS.
dreamquartz
 
Posts: 801
Joined: Mon May 30, 2011 4:02 am

Re: Selecting unique random combinations from a table

Postby eremmel » Wed May 19, 2021 9:00 pm

Any documentation / Tutorials on WITH-SELECT, Common Table Expressions, Recursive queries will do. Focus on databases like MS SQLServer or PostgreSql to learn concepts and most important thing is play with it. I think documentation on HyperSQL picky backs on the general available knowledge; you only need the documentation of HyperSQL to check capabilities and porting related stuff (not in all databases a function is called instr(), but almost all database have the capability).

So the only thing I can do is cut/past an put into a tutorial, but that is duplication of information and not needed. Now you know the keywords, so you can work out your own learning route. Also check out youtube as well, but looking to a video is not my preferred way.

With recursive queries you can go two routes: top-down or bottom-up (rolling totals to the top). Try to understand both patterns. The solution presented here is a top-down one.

Success
It's Microsoft marketing that tells you computers are qualified for non-technicians
W10 (build 19042), LO 7.0.4.2(x64)
eremmel
 
Posts: 1057
Joined: Tue Dec 30, 2008 1:15 am

Re: Selecting unique random combinations from a table

Postby dreamquartz » Mon May 31, 2021 6:46 am

@chrisb and @eremmel,

Thank you very much for your solutions.
After many test-runs the resultset is completely conform the initial 3 Criteria.

Now that the solution is presented to the Client, the Client added criterium for Exclusion.

I do not know how to go about Exclusions, because they cannot be hard-coded.

Criterium 4.
The following indicates what is similar.
NOTE: I used '=' for simplicity:: it means similar
Looking at 1. it shows that L.1 is similar to L.2. is similar to M.1.
This indicates that either is chosen, the others must be excluded.

1. L.1=L.2=M.1
2. L.3=L.4=M.2
3. L.5=L.6=M.3
4. L.7=L.8=M.4
5. L.9=L.10=M.5
6. L.11=L.12=M.6
7. L.13=L.14=M.7
8. L.15=L.16=M.9
9. L.17=L.18=M.10
10. L.19=L.20=M.11
11. L.21=L.22=M.12

The following are 3 attempts for the initial 7 records, based on the new criterium added.

10 10 7
43 3 9
24 1 8
14 14 17
----------
49 4 7
15 15 10
27 1 14
54 22 9
32 2 6
14 14 17
----------
43 3 9
24 1 8
5 5 7
14 14 17

Please note that the list is way longer as indicated, so it should be no problem to get to 7 records or even to the maximum of 12.

Can you please provide me with some insights?
Please also provide me with feedback if it is not possible.

Dream
LO 6.x, HSQLDB 2.6.0 & Ubuntu 20.04 LTS.
dreamquartz
 
Posts: 801
Joined: Mon May 30, 2011 4:02 am

Re: Selecting unique random combinations from a table

Postby chrisb » Tue Jun 01, 2021 4:31 pm

you have not specified the actual characteristics of the fields L & M, in the context of this thread they may be place holders.
if L & M are both integers & foreign keys & your pairings are strictly adhered to then this code may help, it can be used with any of the solutions:
Code: Select all   Expand viewCollapse view
select t2.id "#", t2.l, t2.m --eliminate one of pair randomly from 'qMyTakeOnEremmelsSuggestion'
from
(
   select
      case when rand() < 0.5 then min(l) else max(l) end l
   from
      tbl
   group by l + mod(l, 2)
) t1
join tbl t2 on t1.l = t2.l
below it has been integrated into 'qMyTakeOnEremmelsSuggestion'
i have not gone into detail because i may be wasting my time:
table: TEST
fields: "#", L, M
Code: Select all   Expand viewCollapse view
--qMyTakeOnEremmelsSuggstionOneOfPair
with recursive
v(id, l, m, r) as --used derived table because we need a level indicator 'r'
(select "#", l, m, rownum() from test)
,
t(r, l_con, m_con) as --this is the recursive bit using string manupulation as suggested by eremmel
(
   select
      r,
      cast('-' || l || '-' as varchar (500)),
      cast('-' || m || '-' as varchar(500))
   from v
   where r = 1
   union all
   select
      v.r,
      t.l_con || case when locate('-' || v.l || '-', t.l_con) + locate('-' || v.m || '-', t.m_con) = 0 then v.l || '-' else '' end,
      t.m_con || case when locate('-' || v.l || '-', t.l_con) + locate('-' || v.m || '-', t.m_con) = 0 then v.m || '-' else '' end
   from t
   join v
   on v.r = t.r + 1
)
,
tbl(id, l, m) as --'tbl' is identical to 'qMyTakeOnEremmelsSuggestion' i.e output from the above code
(
   select v.id, l, m from --the final output
      (select min(r) min_r from t group by l_con) y
      join v on v.r = y.min_r
)
select t2.id "#", t2.l, t2.m --eliminate one of pair randomly from 'qMyTakeOnEremmelsSuggestion'
from
(
   select
      case when rand() < 0.5 then min(l) else max(l) end l
   from
      tbl
   group by l + mod(l, 2)
) t1
join tbl t2 on t1.l = t2.l
order by l
Last edited by chrisb on Mon Jun 07, 2021 4:38 pm, edited 1 time in total.
open office 4.1.8 & LibreOffice 6.4.4.2 x64 using HSQL 1.8.0.10 (Embedded) and HSQL 2.5.2 (Split) on Windows 10
chrisb
 
Posts: 262
Joined: Mon Jun 07, 2010 4:16 pm

Re: Selecting unique random combinations from a table

Postby dreamquartz » Wed Jun 02, 2021 2:07 am

@chrisb

As a result of the not so funny 4th criterium, we are allowed to discuss the issue in more detail.
In the OP:
All are INTEGER
'#' is the PK
L stands for 'Tests'
M stands for 'Exams'

The Client is trying to prevent 'cheating' and 'copy-catting' when running Tests and later Exams.
The Number of Tests is 22.
The Number of Exams is 12.
All 4 criteria are true.
The 4th criterium is true as written.

NOTE: Tests and Exams are continuously developed, but @ this time, these are the numbers.
NOTE: that Exams shows a bigger number than 12 in the OP, is for that purpose.

The Client does not always want to end up with the same combinations of Tests and Exams, to make it also exiting for the Examiner, thus random order for the resultset.

Groups can vary from 1-12 max.
NOTE: this maximum number is now chosen, based on the available space, and is therefore also arbitrary.

Participants would work in groups of (arbitrarily) 3, if at all possible.
Because the Participants are able to observe what others are doing, the criteria were developed.

Theoretically there are 12x22=264 combinations possible at this point in time.

There are restrictions for the 'feeder-table', based on criterium 4.
Example (real data):
# L M
0 1 3
1 1 4
2 1 5
3 1 6
4 1 7
5 1 8
6 1 9
7 1 11
8 1 12
9 2 3
10 2 4
11 2 5
12 2 6
13 2 7
14 2 8
15 2 9
16 2 11
.
.
.
.
indicates that L1 can go with M3, but NOT with M1 or M2, because L1 is similar to M1 and similar to M2.
The same is applicable for L2.
L1 and L2 are, by the way, also similar to M10.
These similarities were hard-coded out of the feeder-table.

Next to the fact that there is a similarity issue between L and M 'horizontally', there is also a similarity issue between L1 and L2 and also M1 and M2 'vertically'.
The horizontal and vertical similarity does NOT prevent one from the other selection, as described in my previous post, so either can be selected. However the moment one is selected the other(s) MUST be excluded.

When starting with the 1st record of the partial feeder-table, the example goes only so far, based on the 4 criteria.
Because 7 records could not be accomplished, it is therefore needed to use the 2nd record from the table as a starting point, and so on.

A last thought: The Client has an Exam Committee developing Tests and Exams. For simplicity the Client likes to maintain control over the feeder-table for entry, and modification, and also having the opportunity to identify 'similarities' and maintain control over entry and modification.

We understand that the Client wants to maintain flexibility, but is also wanting the User to not think further than some simple entry and/or modification.

Ah well.......

It makes life exiting LOL.

Dream
LO 6.x, HSQLDB 2.6.0 & Ubuntu 20.04 LTS.
dreamquartz
 
Posts: 801
Joined: Mon May 30, 2011 4:02 am

Re: Selecting unique random combinations from a table

Postby eremmel » Wed Jun 02, 2021 5:41 pm

One might do the following:
Extend the table #, L, M with a column S that gives a unique number to each set of similar combinations of L,M
Extend the recursive query and apply the same mechanism as used with L and M. This will prevent any duplicate similarities.
This is left as a exercise to the user for it is just simple programming by example....
It's Microsoft marketing that tells you computers are qualified for non-technicians
W10 (build 19042), LO 7.0.4.2(x64)
eremmel
 
Posts: 1057
Joined: Tue Dec 30, 2008 1:15 am

Re: Selecting unique random combinations from a table

Postby dreamquartz » Wed Jun 02, 2021 7:09 pm

@eremmel

I like that thought.
That makes sense. See what we can come up with.

Dream
LO 6.x, HSQLDB 2.6.0 & Ubuntu 20.04 LTS.
dreamquartz
 
Posts: 801
Joined: Mon May 30, 2011 4:02 am

Re: Selecting unique random combinations from a table

Postby chrisb » Wed Jun 02, 2021 7:46 pm

i think that i now understand the fourth criteria & have created a function which i believe encompasses all.

tools>SQL
execute first:
Code: Select all   Expand viewCollapse view
create table "tExclusions"(
   id int not null generated by default as identity (start with 1),
   l1 int not null,
   l2 int not null,
   m int not null,
   primary key(id)
)
;

tools>SQL
execute second:
Code: Select all   Expand viewCollapse view
insert into "tExclusions" values
(default,1,2,1),
(default,3,4,2),
(default,5,6,3),
(default,7,8,4),
(default,9,10,5),
(default,11,12,6),
(default,13,14,7),
(default,15,16,9),
(default,17,18,10),
(default,19,20,11),
(default,21,22,12)
;

tools>SQL
the table TEST must exist before creating this view. it's the very same table that you have been using throughout.
table: TEST
fields: ID, "#", L, M
we create a view where the table "tExclusions" is LEFT JOINED with table TEST
--vTestPlusExclusions
execute third:
Code: Select all   Expand viewCollapse view
create view "vTestPlusExclusions" as
select t.*, u.xl, u.xm
from test t
left join
(   select l1, l2 xl, m xm
   from "tExclusions"
   union all
   select l2, l1, m
   from "tExclusions"
) u
on t.l = u.l1
;

tools>SQL
execute fourth:
--CREATE FUNCTION EXCLUDE_F
--select * from table (exclude_f())
Code: Select all   Expand viewCollapse view
create function exclude_f()
   returns table(id int, "#" int, l int, m int)
   specific exclude_one
   language sql
   not deterministic
   reads sql data
   called on null input

   begin atomic
      declare table t(id int, "#" int, l int, m int);
      declare iI, iH, iL, iM, iXL, iXM int;
      declare undo handler for sqlexception;

      for select id, "#", L, M, XL, XM from "vTestPlusExclusions"
         do
         set iI = ID; set iH = "#"; set iL = L; set iM = M; set iXL = XL; set iXM = XM;
         for select
            count(*) c
         from t
         where L in(iL, iXL) or M in(iM, iXM)
            do
            if c = 0 then
               insert into t values iI, iH, iL, iM;
            end if;
         end for;
      end for;
      return table (select * from t);
   end


from the query design window:
activate 'run SQL command directly'
paste select * from table (exclude_f()) & execute.
open office 4.1.8 & LibreOffice 6.4.4.2 x64 using HSQL 1.8.0.10 (Embedded) and HSQL 2.5.2 (Split) on Windows 10
chrisb
 
Posts: 262
Joined: Mon Jun 07, 2010 4:16 pm

Re: Selecting unique random combinations from a table

Postby dreamquartz » Fri Jun 04, 2021 5:17 am

@eremmel

I have added a column 's', holding the relation between L1& L2, numbered '12', as per the list for Criterium 4.
Under 's' 1112' reflects L11=L12

# L M S
0 1 3 12
1 1 4 12
2 1 5 12
3 1 6 12
4 1 7 12
5 1 8 12
6 1 9 12
7 1 11 12
8 1 12 12
9 2 3 12
10 2 4 12
11 2 5 12
12 2 6 12
13 2 7 12
14 2 8 12
15 2 9 12
16 2 11 12
.
23 3 8 34
.
100 11 7 1112
.
I pushed it through the Query.
The Query runs great, but as a result of the limiting factor for L being max 22, and M being max 12, you can imagine that the resultset will only hold a maximum of 11 records.
When there are 12 records requested, there will always be an overlap of 2 L's being 'similar' (see criterium 4.), so your suggestion should only applicable for <12.

I am at the point to wanting to give up this Quest, but can you please indicate if something like this is possible in a 'with recursive''?

Just a question: "Can there be a User input in a "with recursive"?
Can not find anything, but because of the group size between 1-12, it would be handy to let the User set the number.

Starting to understand the principle way more, and will be able to explain the Client better of what we (chrisb, you, and us) are trying to achieve.

@chrisb
Wanted to try eremmel's suggestion first.
See if it leads somewhere.


Will keep you posted.

Dream.
LO 6.x, HSQLDB 2.6.0 & Ubuntu 20.04 LTS.
dreamquartz
 
Posts: 801
Joined: Mon May 30, 2011 4:02 am

Next

Return to Base

Who is online

Users browsing this forum: No registered users and 8 guests