#289 — January 31, 2020
Database Weekly
The Rise and Fall of the OLAP Cube — A post about the shift away from building ‘data cubes’ to running OLAP workloads on columnar databases, complete with a look at the history and motivations.
Cedric Chin
Powering Pinterest Ads Analytics with Apache Druid — A look at why Pinterest moved from HBase to Druid, a database designed specifically for high-performance real-time analytics.
Pinterest Engineering
Why a Gaming App Migrated Off Cassandra — When Cassandra’s data modeling limitations started influencing and restricting higher-level design choices, these MMOG developers looked to CockroachDB.
Cockroach Labs sponsor
Starting Out with Data Puddles, Then We’ll Think About Data Lakes — Last year Comic Relief, a major British charity, wrote about their journey to ‘90% serverless’. Today we see how they’re re-thinking their data ingestion, storage and query stack with Lambda, S3 and Athena.
Adam Clark
Amazon Relational Database Service (RDS) Can Now Export Snapshots to S3 — Amazon RDS and Amazon Aurora snapshots can now be exported to Amazon S3 as Apache Parquet, an efficient open columnar storage format for analytics.
Amazon Web Services
Engineering SQL Support on Apache Pinot at Uber — The story of how Uber has worked on adding full SQL support on Apache Pinot to enable quick analysis and reporting on aggregated data.
Haibo Wang
💻 Jobs
Full Stack Engineer — Expensify seeks a self-driven individual passionate about making code effective, with an understanding of algorithms and design patterns.
Expensify
MongoDB Consultant - Remote Americas — Work on projects with a wide variety of companies and on any database architecture you can imagine.
Percona
📄 And the rest
Migrating from Oracle to Postgres: Tips and Tricks — Covers a handful of common stumbling blocks, such as checking for NOT NULL columns and the GRANT command, plus using Orafce, an extension containing numerous compatibility functions that make an Oracle-to-Postgres transition smoother.
Yorvi Arias
How I Write/Format SQL Code — "Most of this comes from my time as a Data Engineer at Facebook."
Marton Trencseni
Easy Fixes For SQL Queries — A handful of rules of thumb (indeed, five thumbs) for querying any traditional relational database.
Kovid Rathee
Using SQL's EXISTS and NOT EXISTS — EXISTS has been part of the SQL standard since SQL:86, but it’s still underused.
Vlad Mihalcea
MongoDB Still a Mystery to You? Try Studio 3T Today for a Full 30 Days — Instant driver code for JavaScript, Python, Ruby, and more. Build fast queries with our drag & drop editor.
Studio 3T sponsor
Billy: How VictoriaMetrics Deals with More Than 500 Billion Rows — Re-running ScyllaDB’s ‘Billy’ benchmark on VictoriaMetrics, a scalable time-series database.
Aliaksandr Valialkin
Distributed SQL vs. 'NewSQL' — The term ‘NewSQL’ was coined to describe relational database systems with NoSQL-esque features and OLTP scalability.
Sid Choudhury (YugaByte)
pg_timetable: Advanced Postgres Job Scheduling — A look at a new job scheduler for Postgres, implemented from scratch in Go, that goes beyond running single queries at set times and can also execute more complicated sequences of operations. GitHub repo.
Hans-Jürgen Schönig
Recommending GNU Recutils — GNU Recutils is a set of tools and libraries to access human-editable, plain text databases called ‘recfiles’.
James Tomasino
How to Remove Times from Dates in SQL Server
Brent Ozar