#324 — October 2, 2020
Database Weekly
Amazon Timestream Goes GA: Time Series Data 'at Any Scale' — Given Timescale’s big announcements last week, Amazon’s formal GA release of its Timestream time series data store comes at an interesting time. The popularity of time series databases shows no sign of waning and AWS clearly wants a piece of the action.
Amazon Web Services
🤓 Time-Series Compression Algos, Explained » — Get a deep-dive into the history of compression algorithms, how they work, and when and why to apply them to your projects (✨ fun fact: TimescaleDB applies various types to get 90%+ storage efficiency).
Timescale sponsor
Pandemic Driving ‘Back to Basics’ in Big Data, Study Suggests — Much as riskier equities tend to get abandoned during crisis times, so too with more fanciful big data projects, it seems. BI and data visualization use is up at the expense of AI and machine learning.
Datanami
ClickHouse, Redshift and 2.5 Billion Rows of Time Series Data — Rounding out our focus on time series data this week, here Brandon uses AWS to generate 2.5 billion rows of true time series data and uses ClickHouse (a ‘big data scale OLAP RDBMS’) to demonstrate some very impressive query performance.
Brandon Harris
⚡️ Quick bytes:
- Google's Pub/Sub service now has a message ordering feature in beta.
- EDB (formerly known as EnterpriseDB) has acquired 2ndQuadrant, a Postgres solutions and tools company, and claims the move cements its place as 'the leader in the Postgres market.'
- TigerGraph is offering 50GB of distributed graph database storage for on-prem use.
- AWS Outposts is hardware that lets companies run AWS services on-prem, and S3 is now one of the services it can provide (alongside EC2, EBS and RDS).
- Microsoft allegedly 'exposed' a limited amount of Bing search data from an unsecured Elastic server.
- ksqlDB 0.12.0 (a real-time event streaming database) is out with real-time query upgrades.
Using AWS Lambda as a Consumer for Amazon Kinesis — Some best practices when using Lambda with Kinesis for high-throughput, low latency data stream processing.
James Beswick
Prisma’s Data Guide — A growing library of articles making databases more approachable. Topics include data modeling, Postgres, and DB basics.
Prisma sponsor
Build a Data Streaming Pipeline using Kafka Streams and Quarkus — One for team Java. Build a data streaming and processing pipeline using Kafka concepts like joins, windows, processors, state stores, punctuators, and interactive queries.
Kapil Shukla (Red Hat Developer)
Tips for Running MongoDB in Production Using Change Streams — Real-time tracking and auditing functionality has crept into a lot of databases that didn’t have it by default. ‘Change Streams’ is MongoDB’s approach, letting applications access real-time data changes without endless polling.
Onyancha Brian Henry
⚙️ Code and Tools
dbcrossbar: Move Large Datasets Between Different Databases and Formats — Copy tabular data between databases, CSV files and cloud storage. Written in Rust.
Faraday, Inc.
rqlite 5.5: A Distributed Relational Database Built on SQLite — Think SQLite but turned into a ‘proper’ distributed database (using Raft consensus) and that’s what you get here. v5.5.0 adds support for parameterized SQL statements.
rqlite
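If 'parameterized SQL statements' is an unfamiliar term, here is a minimal sketch of the idea. It uses Python's built-in sqlite3 module against plain SQLite as a stand-in, not rqlite's own HTTP API, and the `readings` table and `find_readings` helper are made up for illustration.

```python
import sqlite3


def find_readings(conn, sensor):
    # The ? placeholder keeps the value separate from the SQL text,
    # so the value is bound as data and never parsed as SQL.
    cur = conn.execute(
        "SELECT sensor, value FROM readings WHERE sensor = ? ORDER BY value",
        (sensor,),
    )
    return cur.fetchall()


# Hypothetical in-memory table, purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings (sensor, value) VALUES (?, ?)",
    [("a", 1.5), ("a", 2.5), ("b", 9.0)],
)

rows = find_readings(conn, "a")

# An injection-style input is treated as a literal sensor name,
# so it simply matches no rows.
hostile = find_readings(conn, "a' OR '1'='1")
```

The point is that the query text stays fixed while the values travel separately, which is the general benefit of parameterized statements whatever the database.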
sqlbench: Measures and Compares The Execution Time of SQL Queries — Only for Postgres right now, though pull requests for other databases are welcome. Written in Go.
Felix Geisendörfer
Jailer 10.0: A Database Subsetting and Relational Data Browsing Tool — Navigate bidirectionally through databases by following foreign-key-based or user-defined relationships. Built in Java, it works with relational databases that have a JDBC driver.
Wisser