How to Use ShardingSphere-Proxy in Real Production Scenarios—Your Quick Start Guide

This post shares operations and maintenance practices drawn from real production scenarios, covering data sharding and other features provided by ShardingSphere-Proxy 5.1.0.

Unless otherwise specified, “database” in the following examples refers to MySQL.

What does ShardingSphere-Proxy do?
ShardingSphere-Proxy allows users to use Apache ShardingSphere just as if it were a native database.

To better understand what ShardingSphere-Proxy is, let’s look at the definition provided by Apache ShardingSphere’s official website:

ShardingSphere-Proxy is a transparent database proxy that provides a database server encapsulating database binary protocols to support heterogeneous languages.
Currently, it supports MySQL and PostgreSQL (and PostgreSQL-based databases, such as openGauss). Any terminal that is compatible with the MySQL or PostgreSQL protocol, for example MySQL Command Client (https://blog.devart.com/mysql-command-line-client.html) or MySQL Workbench, can be used to operate on data. It’s a DBA-friendly tool.

It’s worth noting that ShardingSphere-Proxy is a service process. In terms of client-side program connections, it is similar to a MySQL database.

Why you need ShardingSphere-Proxy
ShardingSphere-Proxy is a good choice when:

sharding rules or other rules are in use: data is then distributed across multiple database instances, which inevitably makes it harder to manage.
non-Java developers need to leverage ShardingSphere capabilities.

1. Application scenarios
ShardingSphere-JDBC is used for data sharding in many scenarios. Suppose you have a user table that is sharded horizontally by a hash of the user ID. The client then connects to the database like this:

Below are three real production scenarios:

A testing engineer wants to see the information of user ID 123456 in databases & tables, and you need to tell the engineer which subtable the user is in.
You need to find out the total user growth in 2022 and overall user information for drafting a yearly report.
Your company is going to hold its 8th-anniversary event and you are required to provide a list of active users who have been registered for over 8 years.

Since the data is distributed across database shards and table shards, it is not easy to complete the above-mentioned tasks. If you develop code every time to satisfy these temporary requirements, it’ll be inefficient to say the least. ShardingSphere-Proxy is perfect for these scenarios.
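
To illustrate why hand-written code gets tedious here, the following minimal sketch answers the second scenario manually. It uses an in-memory SQLite database as a stand-in for MySQL and assumes ten shard tables named t_user_0 to t_user_9, placed by user ID modulo 10. The register_time column, the sample rows, and the function names are all hypothetical.

```python
import sqlite3

SHARD_COUNT = 10  # hypothetical layout: t_user_0 .. t_user_9


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with ten shard tables and a few sample rows."""
    conn = sqlite3.connect(":memory:")
    for i in range(SHARD_COUNT):
        conn.execute(
            f"CREATE TABLE t_user_{i} (user_id INTEGER PRIMARY KEY, register_time TEXT)"
        )
    sample = [  # made-up rows, for illustration only
        (123456, "2022-03-01"),
        (123457, "2022-07-15"),
        (200001, "2021-11-30"),
        (200013, "2022-12-31"),
    ]
    for user_id, registered in sample:
        # Assumption: a row lives in the table numbered user_id modulo SHARD_COUNT.
        conn.execute(
            f"INSERT INTO t_user_{user_id % SHARD_COUNT} VALUES (?, ?)",
            (user_id, registered),
        )
    return conn


def count_registered_in(conn: sqlite3.Connection, year: int) -> int:
    """Send the same COUNT query to every shard table and sum the results."""
    total = 0
    for i in range(SHARD_COUNT):
        (n,) = conn.execute(
            f"SELECT COUNT(*) FROM t_user_{i} "
            "WHERE register_time >= ? AND register_time < ?",
            (f"{year}-01-01", f"{year + 1}-01-01"),
        ).fetchone()
        total += n
    return total


conn = build_demo_db()
print(count_registered_in(conn, 2022))  # 3
```

Every new ad hoc question needs another loop like count_registered_in, plus its own merge logic. This fan-out and merge is the kind of work a proxy layer can take over.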

ShardingSphere-Proxy hides the actual backend databases, so users work with it from the client side just as they would with a single database.

For example, t_user is split into several actual tables at the database level, from t_user_0 to t_user_9. When working with ShardingSphere-Proxy on the client side, the user only needs to know the single logical table t_user, and routing to the actual tables happens inside ShardingSphere-Proxy.

1. Logical table: the logical name for horizontally sharded databases or tables that share the same structure. It is the identifier used for the table in SQL. For example, user data is sharded into 10 tables according to the last digit of the primary key, t_user_0 to t_user_9, and their common logical table name is t_user.

2. Actual table: the physical table that actually exists in the database after sharding, that is, t_user_0 to t_user_9 in the example above.
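
To make the distinction between logical and actual tables concrete, here is a minimal sketch of the routing idea. It assumes rows are placed by a plain modulo of a numeric user ID across the ten tables. The names actual_table and rewrite are hypothetical, and the modulo rule is an illustrative assumption, not ShardingSphere's internal algorithm, which depends on the sharding rules you configure.

```python
LOGICAL_TABLE = "t_user"
SHARD_COUNT = 10  # t_user_0 .. t_user_9, as in the example above


def actual_table(user_id: int, shard_count: int = SHARD_COUNT) -> str:
    """Map a user ID to the actual table that would hold its row.

    Assumption: rows are placed by user_id modulo shard_count.
    """
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return f"{LOGICAL_TABLE}_{user_id % shard_count}"


def rewrite(sql: str, user_id: int) -> str:
    """Naively replace the first logical table name with the actual one.

    Illustration only: a real SQL rewriter works on a parsed statement,
    not on raw string replacement.
    """
    return sql.replace(LOGICAL_TABLE, actual_table(user_id), 1)


print(actual_table(123456))  # t_user_6
print(rewrite("SELECT * FROM t_user WHERE user_id = 123456", 123456))
```

Under this assumed rule, the testing engineer's user 123456 from the first scenario would be found in t_user_6.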

2. The differences between ShardingSphere-JDBC and ShardingSphere-Proxy

After reading the description above, you may feel that ShardingSphere-Proxy and ShardingSphere-JDBC are very similar. The main differences between the two are as follows:

1. ShardingSphere-JDBC is a .jar package. Under the hood, it performs SQL parsing, routing, rewriting, execution, and other steps by re-implementing JDBC components. You need to add configuration files to the project to enable the corresponding functions, which makes it intrusive to applications.

2. ShardingSphere-Proxy is a process service. In most cases, it is positioned as a productivity tool to assist operations. It disguises itself as a database, which makes it non-intrusive to applications. The SQL execution logic in ShardingSphere-Proxy is the same as in ShardingSphere-JDBC because they share the same kernel.

Since ShardingSphere-Proxy is non-intrusive to applications and shares the same kernel with ShardingSphere-JDBC, why do we still need ShardingSphere-JDBC?

1. When an application operates on databases directly through ShardingSphere-JDBC, there is only one network I/O. When the application goes through ShardingSphere-Proxy, there is one network I/O from the application to ShardingSphere-Proxy and another from ShardingSphere-Proxy to the databases, for a total of two.

2. The application's call chain gains one more layer, which is more likely to become a traffic bottleneck and introduces potential risk to the application. In general, it is recommended that applications use ShardingSphere-JDBC.
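
The two-hop argument can be put into numbers with a small back-of-the-envelope sketch. The latency figures below are hypothetical placeholders, not measurements, and the model only covers a statement that is routed to a single backend database.

```python
def network_time_ms(hops_ms):
    """Total network time for one statement: the sum of each hop's round trip."""
    return sum(hops_ms)


# Hypothetical round-trip times in milliseconds (placeholders, not benchmarks).
APP_TO_DB_MS = 1.0     # application -> MySQL, via ShardingSphere-JDBC
APP_TO_PROXY_MS = 1.0  # application -> ShardingSphere-Proxy
PROXY_TO_DB_MS = 1.0   # ShardingSphere-Proxy -> MySQL

jdbc_path = network_time_ms([APP_TO_DB_MS])
proxy_path = network_time_ms([APP_TO_PROXY_MS, PROXY_TO_DB_MS])

print(f"JDBC path:  {jdbc_path} ms")
print(f"Proxy path: {proxy_path} ms")
```

With equal hop costs, the proxy path doubles the network time per statement, which matters most for latency-sensitive workloads.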

Of course, ShardingSphere-JDBC and ShardingSphere-Proxy can be deployed together in a hybrid architecture. ShardingSphere-JDBC is suitable for high-performance lightweight Online Transaction Processing (OLTP) applications developed in Java, while ShardingSphere-Proxy is perfect for Online Analytical Processing (OLAP) applications and scenarios for managing and operating sharded databases.

Quick Start Guide
ShardingSphere-Proxy can be installed in three ways: binary package, Docker, and Helm. Both standalone and cluster deployments are supported. Here, we take the standalone binary package as an example:

1. Get the ShardingSphere-Proxy binary installation package at this link;

2. Decompress it and then modify conf/server.yaml and files starting with the config- prefix to configure sharding, read/write splitting and other functions;

3. Start ShardingSphere-Proxy by running bin/start.sh on Linux or bin/start.bat on Windows.
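
If you script the startup step, the choice of start script can be made from the operating system name. The sketch below only builds the script path and does not launch anything. The function name start_script and the installation directory are hypothetical.

```python
import platform
from pathlib import PurePosixPath, PureWindowsPath


def start_script(proxy_home: str, system: str = "") -> str:
    """Return the path of the start script for the given operating system.

    proxy_home is the directory the binary package was decompressed into.
    system defaults to the current platform, as reported by platform.system().
    """
    system = system or platform.system()
    if system == "Windows":
        return str(PureWindowsPath(proxy_home) / "bin" / "start.bat")
    return str(PurePosixPath(proxy_home) / "bin" / "start.sh")


# Hypothetical installation directory, for illustration only.
print(start_script("/opt/shardingsphere-proxy", "Linux"))
```

The returned path can then be passed to a process launcher of your choice, for example subprocess.run from the Python standard library.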

The file directory looks like this: