What You'll Learn
In recent years, NoSQL databases have become widespread, driven by the need for distributed databases. Unlike traditional relational databases, NoSQL databases are built from the ground up to be distributed. Yet NoSQL has significant drawbacks: many systems offer weaker data consistency guarantees, are more vulnerable to data integrity problems, and make queries slower and harder to write because they lack SQL support. A newer class of databases addresses these gaps: distributed SQL databases, also known as NewSQL.
This Refcard is a reference to the key characteristics of distributed SQL databases. It covers the benefits of these databases and offers insights into query design to help you get the most out of this architectural pattern.
The Need for Distributed SQL
Distributed SQL databases offer advantages for new, modern, cloud-native applications.
Sharding Middleware
A precursor to distributed SQL that distributes a single-server database, such as MySQL, across multiple independent servers.
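To make the idea concrete, here is a minimal sketch of the routing step that sharding middleware performs: each row's shard key is mapped to one of several independent servers. The server names, the choice of a SHA-256 hash, and the helper functions `shard_for` and `route` are all assumptions made for illustration; they are not taken from any particular middleware product.

```python
import hashlib

# Hypothetical shard names; a real deployment would hold connection details instead.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2"]


def shard_for(key, shards=SHARDS):
    """Map a shard key to one server using a stable hash (illustrative only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def route(keys, shards=SHARDS):
    """Group keys by owning shard, as middleware might before fanning out queries."""
    plan = {}
    for key in keys:
        plan.setdefault(shard_for(key, shards), []).append(key)
    return plan
```

Calling `route(["customer-1", "customer-2", "customer-3"])` returns a dictionary from shard name to the keys that shard owns. Note that the application still sees independent databases: a query that spans shards has to be split and merged outside the database, which is one of the gaps a distributed SQL engine is meant to close.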
Distributed SQL Architecture
Runs both transactional and analytical workloads using a three-layer architecture: (i) distributed storage, (ii) distributed query execution, and (iii) a SQL API.
Comparing Architectures of Distributed SQL Databases
Shared-everything (e.g., MySQL, PostgreSQL); shared-storage (e.g., Oracle Exadata, Snowflake, Google BigQuery); shared-nothing (e.g., SingleStore, VoltDB).
Query Execution Architecture
How the Data Definition Language (DDL) and Data Manipulation Language (DML) affect query design.
Distributed DDL, Distributed DML, and Distributed Joins
Insights into query design and execution for optimal performance at scale.