mapbased/fuse-query

FuseQuery

FuseQuery is a cloud-native, distributed SQL query engine built to run at scale.

It is a cloud-native, distributed take on ClickHouse, written from scratch in Rust.

Thanks to ClickHouse and Arrow.

Features

  • High Performance

    • Everything is Parallel
  • High Scalability

    • Everything is Distributed
  • High Reliability

    • True Separation of Storage and Compute

Architecture

[Figure: DataFuse Architecture]

Crates

| Crate | Description | Status |
|-------|-------------|--------|
| distributed | Distributed scheduler and executor for planner | WIP |
| optimizers | Optimizer for Distributed & Local plan | WIP |
| datablocks | Vectorized data processing unit | WIP |
| datastreams | Async streaming iterators | WIP |
| datasources | Interface to the datasource (system.numbers for performance/Fuse-Store) | WIP |
| executors | Executor (EXPLAIN/SELECT) for the Pipeline | WIP |
| functions | Scalar and Aggregation Functions | WIP |
| processors | Dataflow Streaming Processor | WIP |
| planners | Distributed & Local planners for building processor pipelines | WIP |
| servers | Server handler (MySQL/HTTP) | MySQL |
| transforms | Data Stream Transform (Source/Filter/Projection/AggregatorPartial/AggregatorFinal/Limit) | WIP |
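
The transforms crate lists separate AggregatorPartial and AggregatorFinal stages. The following is a minimal sketch of that two-phase idea for avg(number), assuming each partition folds its rows into a partial (sum, count) state and a final step merges the states. The names `PartialAvg`, `partial_avg`, `final_avg` and `parallel_avg` are hypothetical and are not taken from the FuseQuery code base.

```rust
use std::thread;

/// Partial aggregation state for avg(). Hypothetical type, not a FuseQuery API.
#[derive(Debug, Clone, Copy, PartialEq)]
struct PartialAvg {
    sum: u64,
    count: u64,
}

/// "AggregatorPartial" step: fold one partition of rows into a partial state.
fn partial_avg(start: u64, end: u64) -> PartialAvg {
    let mut state = PartialAvg { sum: 0, count: 0 };
    for n in start..end {
        state.sum += n;
        state.count += 1;
    }
    state
}

/// "AggregatorFinal" step: merge all partial states into the final value.
/// Returns None when no rows were seen.
fn final_avg(parts: &[PartialAvg]) -> Option<f64> {
    let sum: u64 = parts.iter().map(|p| p.sum).sum();
    let count: u64 = parts.iter().map(|p| p.count).sum();
    if count == 0 {
        None
    } else {
        Some(sum as f64 / count as f64)
    }
}

/// Split 0..total into `workers` contiguous ranges and aggregate each range
/// on its own thread before merging.
fn parallel_avg(total: u64, workers: u64) -> Option<f64> {
    assert!(workers > 0, "need at least one worker");
    let chunk = (total + workers - 1) / workers; // ceiling division
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            let start = (w * chunk).min(total);
            let end = ((w + 1) * chunk).min(total);
            thread::spawn(move || partial_avg(start, end))
        })
        .collect();
    let parts: Vec<PartialAvg> = handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect();
    final_avg(&parts)
}

fn main() {
    // avg(number) over 0..1_000_000 using 8 partitions.
    println!("{:?}", parallel_avg(1_000_000, 8));
}
```

Keeping the partial state as (sum, count) rather than a per-partition average is what makes the merge exact when partitions have different sizes.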

Status

SQL Support

  • Projection
  • Filter
  • Limit
  • Aggregate
  • Functions
  • Filter Push-Down
  • Projection Push-Down (TODO)
  • Distributed Query (WIP)
  • Sorting (TODO)
  • Joins (TODO)
  • SubQueries (TODO)

Performance

  • In-memory SIMD vector processing performance only
  • Dataset: 100,000,000,000 rows (100 billion)
  • Hardware: AMD Ryzen 7 PRO 4750U, 8 CPU Cores, 16 Threads
  • Rust: rustc 1.49.0 (e1884a8e3 2020-12-29)
  • Built with link-time optimization and CPU-specific instructions
  • ClickHouse server version 21.2.1 revision 54447

| Query | FuseQuery (v0.1) | ClickHouse (v21.2.1) |
|-------|------------------|----------------------|
| SELECT avg(number) FROM system.numbers_mt | (3.11 s.) | ×3.14 slow, (9.77 s.), 10.24 billion rows/s., 81.92 GB/s. |
| SELECT sum(number) FROM system.numbers_mt | (2.96 s.) | ×2.02 slow, (5.97 s.), 16.75 billion rows/s., 133.97 GB/s. |
| SELECT min(number) FROM system.numbers_mt | (3.57 s.) | ×3.90 slow, (13.93 s.), 7.18 billion rows/s., 57.44 GB/s. |
| SELECT max(number) FROM system.numbers_mt | (3.59 s.) | ×4.09 slow, (14.70 s.), 6.80 billion rows/s., 54.44 GB/s. |
| SELECT count(number) FROM system.numbers_mt | (1.76 s.) | ×2.22 slow, (3.91 s.), 25.58 billion rows/s., 204.65 GB/s. |
| SELECT sum(number+number+number) FROM numbers_mt | (23.14 s.) | ×5.47 slow, (126.67 s.), 789.47 million rows/s., 6.32 GB/s. |
| SELECT sum(number) / count(number) FROM system.numbers_mt | (3.09 s.) | ×1.96 slow, (6.07 s.), 16.48 billion rows/s., 131.88 GB/s. |
| SELECT sum(number) / count(number), max(number), min(number) FROM system.numbers_mt | (6.73 s.) | ×4.01 slow, (27.59 s.), 3.62 billion rows/s., 28.99 GB/s. |
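
The figures in the table can be cross-checked with simple arithmetic: throughput is rows divided by elapsed seconds, and the slowdown factor is the ratio of the two elapsed times. The sketch below assumes each row is a single 8-byte UInt64 value, which is consistent with the Read Rows and Read Bytes numbers in the EXPLAIN output later in this README. The function names are illustrative only, and results agree with the published figures only up to rounding of the reported times.

```rust
/// Rows processed per second for a scan of `rows` rows taking `seconds`.
fn rows_per_sec(rows: f64, seconds: f64) -> f64 {
    rows / seconds
}

/// Bytes processed per second, assuming one 8-byte UInt64 value per row.
fn bytes_per_sec(rows: f64, seconds: f64) -> f64 {
    rows_per_sec(rows, seconds) * 8.0
}

/// How many times slower `other_secs` is than `baseline_secs`.
fn slowdown(baseline_secs: f64, other_secs: f64) -> f64 {
    other_secs / baseline_secs
}

fn main() {
    let rows = 100e9; // dataset size stated in the benchmark setup
    // avg(number): 3.11 s for FuseQuery, 9.77 s for ClickHouse.
    println!("{:.2} billion rows/s", rows_per_sec(rows, 9.77) / 1e9);
    println!("{:.2} GB/s", bytes_per_sec(rows, 9.77) / 1e9);
    println!("x{:.2} slower", slowdown(3.11, 9.77));
}
```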

Note:

  • ClickHouse system.numbers_mt uses 16-way parallel processing
  • FuseQuery system.numbers_mt uses 16-way parallel processing

How to Run?

Fuse-Query Server

Run from source

$ make run
12:46:15 [ INFO] Options { log_level: "debug", num_cpus: 8, mysql_handler_port: 3307 }
12:46:15 [ INFO] Fuse-Query Cloud Compute Starts...
12:46:15 [ INFO] Usage: mysql -h127.0.0.1 -P3307

or run with Docker (recommended):

$ docker pull datafusedev/fuse-query
...
$ docker run --init --rm -p 3307:3307 datafusedev/fuse-query
05:12:36 [ INFO] Options { log_level: "debug", num_cpus: 6, mysql_handler_port: 3307 }
05:12:36 [ INFO] Fuse-Query Cloud Compute Starts...
05:12:36 [ INFO] Usage: mysql -h127.0.0.1 -P3307

or download the release binary here:

https://github.com/datafusedev/fuse-query/releases
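
The startup logs above print an `Options` value in what looks like Rust's derived `Debug` format, followed by a usage hint built from the MySQL handler port. As a rough sketch of how such output can be produced, the struct below mirrors the three logged fields. The field types, the `usage_hint` method and the `sample_options` helper are assumptions made for illustration and may not match the actual FuseQuery configuration code.

```rust
/// Hypothetical mirror of the options printed at startup; field types are assumed.
#[allow(dead_code)] // some fields are only read through the derived Debug impl
#[derive(Debug, Clone)]
struct Options {
    log_level: String,
    num_cpus: usize,
    mysql_handler_port: u16,
}

impl Options {
    /// Build the client command line shown in the startup log.
    fn usage_hint(&self) -> String {
        format!("mysql -h127.0.0.1 -P{}", self.mysql_handler_port)
    }
}

/// Sample values copied from the "make run" log lines in this README.
fn sample_options() -> Options {
    Options {
        log_level: "debug".to_string(),
        num_cpus: 8,
        mysql_handler_port: 3307,
    }
}

fn main() {
    let opts = sample_options();
    // Derived Debug prints the struct name followed by its named fields.
    println!("{:?}", opts);
    println!("Usage: {}", opts.usage_hint());
}
```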

Query with MySQL client

Connect
$ mysql -h127.0.0.1 -P3307
Explain Plan
mysql> explain select (number+1) as c1, number/2 as c2 from system.numbers_mt(10000000) where (c1+c2+1) < 100 limit 3;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Limit: 3
 Projection: (number + 1) as c1:UInt64, (number / 2) as c2:UInt64
 Filter: (((c1 + c2) + 1) < 100)
 ReadDataSource: scan parts [8](Read from system.numbers_mt table, Read Rows:10000000, Read Bytes:80000000) |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
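
The ReadDataSource line reports the scan statistics: 10000000 rows and 80000000 bytes, which works out to 8 bytes per UInt64 value. The sketch below shows one way such numbers could be derived, assuming that "scan parts [8]" means the source is split into 8 contiguous parts. `ScanStats`, `split_parts` and `scan_stats` are hypothetical names, not FuseQuery's API.

```rust
use std::mem::size_of;
use std::ops::Range;

/// Hypothetical scan statistics, mirroring the fields printed by EXPLAIN.
#[derive(Debug, PartialEq)]
struct ScanStats {
    parts: usize,
    read_rows: u64,
    read_bytes: u64,
}

/// Split `0..total_rows` into at most `parts` contiguous, non-empty ranges.
fn split_parts(total_rows: u64, parts: u64) -> Vec<Range<u64>> {
    assert!(parts > 0, "need at least one part");
    let chunk = (total_rows + parts - 1) / parts; // ceiling division
    (0..parts)
        .map(|p| {
            let start = (p * chunk).min(total_rows);
            let end = ((p + 1) * chunk).min(total_rows);
            start..end
        })
        .filter(|r| !r.is_empty())
        .collect()
}

/// Rows and bytes a scan would read, assuming a single UInt64 column.
fn scan_stats(parts: &[Range<u64>]) -> ScanStats {
    let read_rows: u64 = parts.iter().map(|r| r.end - r.start).sum();
    ScanStats {
        parts: parts.len(),
        read_rows,
        read_bytes: read_rows * size_of::<u64>() as u64,
    }
}

fn main() {
    let parts = split_parts(10_000_000, 8);
    println!("{:?}", scan_stats(&parts));
}
```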
Explain Pipeline
mysql> explain pipeline select (number+1) as c1, number/2 as c2 from system.numbers_mt(10000000) where (c1+c2+1) < 100 limit 3;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explain |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 
  └─ LimitTransform × 1 processor
    └─ Merge (LimitTransform × 8 processors) to (MergeProcessor × 1)
      └─ LimitTransform × 8 processors
        └─ ProjectionTransform × 8 processors
          └─ FilterTransform × 8 processors
            └─ SourceTransform × 8 processors |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
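
Read bottom-up, the pipeline runs Source, Filter, Projection and Limit on 8 processors, merges the 8 streams, and then applies one final Limit. The second limit is needed because each of the 8 streams may contribute up to 3 rows. The sketch below models that shape with plain iterators for the example query. It runs the partitions sequentially, so the merge order is deterministic here, whereas a parallel engine may merge in a different order. `Row`, `run_partition` and `run_pipeline` are illustrative names, not FuseQuery types.

```rust
/// One output row of the example query: (c1, c2).
type Row = (u64, u64);

/// Run Source -> Filter -> Projection -> Limit over one partition of `number`.
fn run_partition(start: u64, end: u64, limit: usize) -> Vec<Row> {
    (start..end) // SourceTransform
        .filter(|&n| (n + 1) + (n / 2) + 1 < 100) // FilterTransform: (c1 + c2 + 1) < 100
        .map(|n| (n + 1, n / 2)) // ProjectionTransform: c1, c2 (integer division)
        .take(limit) // LimitTransform, one per partition
        .collect()
}

/// Merge the per-partition outputs, then apply the single final LimitTransform.
fn run_pipeline(total: u64, partitions: u64, limit: usize) -> Vec<Row> {
    assert!(partitions > 0, "need at least one partition");
    let chunk = (total + partitions - 1) / partitions; // ceiling division
    (0..partitions)
        .flat_map(|p| {
            let start = (p * chunk).min(total);
            let end = ((p + 1) * chunk).min(total);
            run_partition(start, end, limit)
        }) // Merge: concatenate partition outputs in partition order
        .take(limit) // final LimitTransform
        .collect()
}

fn main() {
    // Same shape as the example: 10,000,000 rows, 8 partitions, LIMIT 3.
    for (c1, c2) in run_pipeline(10_000_000, 8, 3) {
        println!("{} {}", c1, c2);
    }
}
```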
Select
mysql> select (number+1) as c1, number/2 as c2 from system.numbers_mt(10000000) where (c1+c2+1) < 100 limit 3;
+------+------+
| c1 | c2 |
+------+------+
| 1 | 0 |
| 2 | 0 |
| 3 | 1 |
+------+------+
3 rows in set (0.06 sec)

How to Test?

$ make test

Roadmap

  • 0.1 support aggregation select
  • 0.2 support distributed query (WIP)
  • 0.3 support group by, order by
  • 0.4 support join
  • 0.5 support sub queries
  • 0.6 support TPC-H benchmark
