Issue 106 — May 26, 2016
Featured
Apache Spark as a Compiler: Joining a Billion Rows on your Laptop — A look at how Apache Spark acts as a compiler to optimize queries using whole-stage code generation techniques commonly used in language compilers.
Sameer Agarwal story
EnterpriseDB Wraps PostgreSQL Into An Enterprise-Grade Suite to Challenge Oracle — EnterpriseDB has brought more tools into its Postgres distro to challenge big commercial DBs.
Daniel Robinson news
A Look at the Design of SQLite4 — Described as "just like SQLite3, but with an improved interface and file format", SQLite4 is going to be "an alternative, not a replacement, for SQLite3."
SQLite news
TrailDB: An Efficient Library for Storing and Processing Event Data — AdRoll has open sourced a fast, C-based library for compressing and handling time-series event data.
Ville Tuulos code
Have you got SQL fingers? — Watch these free SQL Prompt tips videos for SQL writing hints from top SQL Server MVPs. SQL Prompt is the SQL code productivity add-in for SQL Server Management Studio and Visual Studio. Find out how easily you can write SQL.
Red Gate sponsored
Unicorn: BigTable, Document and Graph Database with Full Text Search — A simple abstraction of BigTable-like databases such as Cassandra, HBase, and Accumulo that provides an easy-to-use document data model and MongoDB-like API.
Haifeng Li tools
Benchmarking Elasticsearch vs InfluxDB for Time-Series Data — Some of the InfluxData team set out to compare the performance and features of InfluxDB and Elasticsearch for common time-series workloads. (Note: InfluxData develops InfluxDB.)
Shubhra Kar video
Understanding Outliers with Skew and Kurtosis in SQL — Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean, and kurtosis is a measure of the "tailedness" of the same distribution. Confused? Here’s a practical example.
Periscope tutorial
Efficient Pagination in SQL and ElasticSearch — Traditional pagination queries can have a high hidden cost. This post explores a technique for making pagination more efficient.
Salsify tutorial
Monitoring MongoDB Performance Metrics — By properly monitoring MongoDB you can quickly spot slowdowns or pressing resource limitations, and know how to fix performance issues. This series covers WiredTiger but there’s an MMAP edition too.
Jean-Mathieu Saponaro tutorial
Scaling to 100M: MySQL is a Better NoSQL — When considering a NoSQL use case, such as key/value storage, MySQL made more sense to Wix in terms of performance, ease of use, and stability.
Yoav Abrahami opinion
Jobs
Job Offers. No resume necessary. — Create your Hired profile to get top companies to start applying to hire you. Get offers from $75,000 - $250,000 on the platform in 1 week.
Hired.com
In brief
Alex Woodie news
Robin Wauters news
Amazon Web Services news
Cray Unveils Open Source 'Big Data' Box — Serious metal with 35TB of PCIe SSD storage, 22TB of RAM, and preloaded with OpenStack and Mesos.
The Register news
How we improved query response times 6X with Apache Storm and Cassandra — A Keen IO platform engineer shares lessons learned while scaling out their analytics infrastructure with Apache Storm and Cassandra.
Keen IO story sponsored
Data Types in PostgreSQL — A 128-page slide deck ‘guided tour’ through PostgreSQL data types.
Peter van Hardenberg
Postgres & Redis Sitting In a Tree — "Typically a company may start with a relational database like Postgres and then add Redis for more high velocity use-cases. What if you could tie the two systems together to enable so much more?"
Rimas Silkaitis
Periscope tutorial
Edgar Ribeiro tutorial tools
Chua Hock-Chuan tutorial
Creating Pivot Tables in PostgreSQL Using the Crosstab Function — The ‘tablefunc’ extension provides a really interesting set of functions. One of them is the crosstab function, used for pivot table creation. That’s what this article covers.
Maria Alcaraz tutorial
Authentise, Inc tutorial
Aaron Bertrand opinion
DreamFactory 2.1.2 released, includes app packages — Join data between a SQL database and MongoDB with a few configuration clicks.
DreamFactory tools sponsored
SQL and NoSQL Databases Ranked by StackOverflow Activity — MySQL and SQL Server are well out in the lead.
StackOverkill tools
Redis Labs Modules code