Issue 151 — April 21, 2017
Featured
GeoMesa: An Open-Source Spatio-Temporal Database Layer — GeoMesa provides spatio-temporal indexing on top of Accumulo, Bigtable, and Cassandra, as well as near real-time stream processing and spatial semantics on top of Kafka.
The GeoMesa Project tools
Amazon Redshift Spectrum: Exabyte-Scale In-Place Queries of S3 Data — Spectrum makes it possible to run complex queries using Redshift on data stored on AWS S3 without any loading or data prep.
Amazon tools
The Top Features Coming to SQL Server 2017 — A summary of what Microsoft has unveiled for SQL Server 2017, from Python support to adaptive query optimization and a built-in graph database.
Joey D'Antoni news
Highly Available and Scalable Open Source Database - SiriDB — The time series database SiriDB can scale on the fly, is robust by design, and uses a unique mechanism to operate without indexes. SiriDB’s query language includes dynamic grouping for easy and fast analysis. GitHub repo
Transceptor Technology sponsored
How to Calculate Multiple Aggregate Functions in a Single Query — A look at alternatives to using a large collection of SQL queries to query the same data in different ways, such as pivots and grouping sets.
JavaOOQ tutorial
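For readers who want a feel for the idea before clicking through, here is a minimal sketch of computing several aggregates in one pass instead of one query per aggregate. It uses Python's built-in sqlite3 module and conditional `SUM(CASE ...)` expressions; the `film` table, its columns, and the sample rows are hypothetical and are not taken from the linked tutorial, which may use other techniques such as pivots or grouping sets.

```python
import sqlite3

# Hypothetical sample data: (title, length in minutes, rating).
SAMPLE = [("A", 90, "PG"), ("B", 130, "R"), ("C", 150, "PG"), ("D", 100, "G")]


def aggregate_once(rows):
    """Return (total, long_films, pg_films) from a single scan of the table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE film (title TEXT, length INTEGER, rating TEXT)")
    conn.executemany("INSERT INTO film VALUES (?, ?, ?)", rows)
    # One query, three aggregates: each CASE expression acts as a per-row filter.
    query = """
        SELECT
            COUNT(*) AS total,
            SUM(CASE WHEN length > 120 THEN 1 ELSE 0 END) AS long_films,
            SUM(CASE WHEN rating = 'PG' THEN 1 ELSE 0 END) AS pg_films
        FROM film
    """
    result = conn.execute(query).fetchone()
    conn.close()
    return result


print(aggregate_once(SAMPLE))
```

The alternative would be three separate `SELECT COUNT(*) ... WHERE ...` statements, each scanning the same table again.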
Architecture of Giants: Data Stacks at Facebook, Netflix, Airbnb, and Pinterest — Simple event data infrastructure diagrams from several fast-scaling companies.
Michelle Wetzler story
Why to Use A Relational Database for Time-Series Data — NoSQL databases are commonly used to store time-series data, but the creator of TimescaleDB sets out a technical case for bringing time-series data into a relational setup.
Mike Freedman (Timescale) opinion
In brief
Datanami news
Oracle's Databases and Developer Tools Now Available on Docker — Oracle’s main database product, WebLogic, Coherence, MySQL, and others are available in Docker containers on the Docker Store marketplace.
Docker news
Whitepaper: 5 Steps to Agile DB Management — Database management is years behind software development. Here's how to bring your DBs up to speed.
SelectStar sponsored
Removing Duplicate Rows in Postgres — What if you accidentally load data twice? A look at handling duplicate data and the resulting cleanup.
Hans-Juergen Schoenig tutorial
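As a rough illustration of the general idea (keep one copy of each duplicated row and delete the rest), here is a small sketch. Note the swap: it runs against SQLite through Python's sqlite3 module and uses SQLite's `rowid` as the row identifier, so it is not the Postgres-specific method from the linked tutorial. The `reading` table and its columns are hypothetical.

```python
import sqlite3


def remove_duplicates(rows):
    """Load rows, delete exact duplicates, and return what remains."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reading (sensor TEXT, value INTEGER)")
    conn.executemany("INSERT INTO reading VALUES (?, ?)", rows)
    # Keep the earliest physical row of each (sensor, value) group;
    # every other row in the group is a duplicate and gets deleted.
    conn.execute("""
        DELETE FROM reading
        WHERE rowid NOT IN (
            SELECT MIN(rowid) FROM reading GROUP BY sensor, value
        )
    """)
    remaining = conn.execute(
        "SELECT sensor, value FROM reading ORDER BY sensor, value"
    ).fetchall()
    conn.close()
    return remaining


# Simulate loading the same two rows twice, plus one unique row.
print(remove_duplicates([("a", 1), ("b", 2), ("a", 1), ("b", 2), ("c", 3)]))
```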
Tips for Monitoring Redis — Ways to get more info from Redis, such as on latency and slow commands.
Mike Perham tutorial
ScaleGrid tutorial
vertabelo tutorial
Peter Lafferty tutorial
Reality Games story
Running Out of IDs — "Keep on using serial for most use cases and keep bigserial in your back pocket if a real need arises."
Josh Branchaud story
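A quick back-of-the-envelope sketch of why that advice usually holds: PostgreSQL's serial is backed by a 4-byte integer and bigserial by an 8-byte integer, so their upper limits are fixed. The insert rates below are made-up assumptions for illustration, and the calculation ignores sequence gaps from rolled-back transactions.

```python
# Upper bounds of the underlying integer types.
SERIAL_MAX = 2**31 - 1      # 2,147,483,647
BIGSERIAL_MAX = 2**63 - 1   # 9,223,372,036,854,775,807

SECONDS_PER_YEAR = 365 * 24 * 3600


def years_until_exhausted(max_id, inserts_per_second):
    """Estimate how many years a sequence lasts at a constant insert rate."""
    return max_id / inserts_per_second / SECONDS_PER_YEAR


# Hypothetical workloads: a modest app and a very busy one.
for rate in (1, 1000):
    print(rate, "inserts/s:",
          "serial lasts", round(years_until_exhausted(SERIAL_MAX, rate), 2), "years,",
          "bigserial lasts", f"{years_until_exhausted(BIGSERIAL_MAX, rate):.2e}", "years")
```

At one insert per second serial lasts decades, while a sustained thousand inserts per second uses it up in under a month, which is the kind of real need that justifies bigserial.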
Cassandra vs. MongoDB — Considering the differences between Cassandra and MongoDB as the data store for your next project.
scalegrid.io opinion sponsored
Brigade Engineering opinion
Dan Goldin opinion
Apache Foundation tools