Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Best Practices for Dynamic Filtering on Array Columns (e.g., ClickHouse Array(Int32)) #10048

Open
Labels
questionThe issue is a question. Please use Stack Overflow for questions.
@OrBarCycode

Description

Problem

I'm using Cube.js with a ClickHouse data source and need to support dynamic filtering on an array column (Project IDs of type Array(Int32)) via the GraphQL API. Users should be able to provide an arbitrary list of values (e.g., [123, 456]) to filter rows where the array contains any of those values (OR semantics).
Cube.js doesn't natively support array types or operators like contains for arrays, so standard dimension filters append = ? to the WHERE clause, which breaks array functions (e.g., hasAny). I've tried several workarounds, but each has trade-offs (detailed below). I'm looking for:

Better approaches or best practices for this use case.
Confirmation if the "formatted string" method is acceptable (it works but feels hacky).
Any plans for native array support (e.g., custom operators like arrayContains)?

Attempts and Issues

  1. Unnest with arrayJoin:
    javascriptproject_id_element: {
    sql: arrayJoin(${CUBE}."Project IDs"),
    type: number
    }
    Issue: Filtering by multiple values duplicates measures (e.g., count) when multiple values match in one row.

  2. Direct Array Function:
    javascripthas_project_id: {
    sql: arrayExists(x -> x = {filterValue}, ${CUBE}."Project IDs"),
    type: boolean
    }
    Issue: Cube appends = ? to filters, breaking arrayExists or hasAny.

  3. Using FILTER_PARAMS in Segment:
    javascriptsegments: {
    project_filter: { sql: hasAny(${CUBE}."Project IDs", ${FILTER_PARAMS.project_ids}) }
    }
    Issue: Still gets wrapped in = ? or is hard to maintain with dynamic values.

  4. Format Array as String (Current Solution):
    javascriptproject_id_filter: {
    type: string,
    sql: arrayStringConcat(arrayMap(x -> concat('$', toString(x), '$'), ${CUBE}."Project IDs"), ',')
    }
    Works: Filters with contains on ["123ドル$", "456ドル$"] produce WHERE ... LIKE '%123ドル$%' OR ... LIKE '%456ドル$%'.
    Concerns: Brittle (client must format $x$), potential performance issues with string ops, edge cases (empty arrays).

Related Cube.js schema

cube(`violations`, {
 sql_table: `violations`,
 
 data_source: `default`,
 measures: {
 // Assuming some measures like count
 count: {
 type: `count`,
 },
 },
 dimensions: {
 id: {
 sql: `${CUBE}."ID"`,
 type: `string`,
 primary_key: true
 },
 // Note: "Project IDs" is the array column (Array(Int32))
 project_id: {
 type: `number`,
 sql: `${CUBE}."Project IDs"`, // This doesn't work for filtering as-is
 },
 has_project_id: {
 sql: `arrayExists(x -> x = {filterValue}, ${CUBE}."Project IDs")`,
 type: `boolean`,
 },
 project_id_element: {
 sql: `arrayJoin(${CUBE}."Project IDs")`,
 type: `number`, // Or string
 },
 project_id_filter: {
 type: `string`,
 sql: `arrayStringConcat(arrayMap(x -> concat('$', toString(x), '$'), ${CUBE}."Project IDs"), ',')`,
 },
 }
});```

**Related Cube.js generated SQL**

```sql
SELECT 
 someOtherDim,
 COUNT(*) AS count
FROM violations
WHERE hasAny("Project IDs", [123, 456]) -- OR equivalent with arrayExists
GROUP BY someOtherDim;

Metadata

Metadata

Assignees

No one assigned

    Labels

    questionThe issue is a question. Please use Stack Overflow for questions.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

      Relationships

      None yet

      Development

      No branches or pull requests

      Issue actions

        AltStyle によって変換されたページ (->オリジナル) /