473,289 Members | 1,945 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes and contribute your articles to a community of 473,289 developers and data experts.

How to Use ShardingSphere-Proxy in Real Production Scenarios—Your Quick Start Guide

This post analyzes ops & maintenance solutions with experiences taken from real production scenarios for data sharding and other functions provided by ShardingSphere-Proxy Version 5.1.0.

Unless otherwise specified, in the following examples, when we say “database” we refer to MySQL.

What does ShardingSphere-Proxy do?
ShardingSphere-Proxy allows users to use Apache ShardingSphere just as if it were a native database.

To gain a better understanding of what’s ShardingSphere-proxy, let’s take a look at its definition provided, by Apache ShardingSphere’s official website:

ShardingSphere-Proxy is a transparent database proxy that provide a database server containing database binary protocols designed to support heterogeneous languages.
Currently, it supports MySQL and PostgreSQL (and PostgreSQL-based databases, such as openGauss) and any related terminals (such as MySQL Command Client, https://blog.devart.com/mysql-command-line-client.html Workbench, etc.) that are compatible with MySQL or PostgreSQL protocols to operate data. It’s a DBA-friendly tool.



It’s worth noting that ShardingSphere-Proxy is a service process. In terms of client-side program connections, it is similar to a MySQL database.


Why you need ShardingSphere-Proxy
ShardingSphere-Proxy is a good choice when:
  • sharding rules or other rules are used; because data will be distributed across multiple database instances, inevitably making management inconvenient.
  • non-Java developers need to leverage ShardingSphere capabilities.

1. Application scenarios
There are many scenarios where ShardingSphere-JDBC is used for data sharding. If you have a user table and need to perform horizontal scaling with Hash for the User ID property, the way the client connects to the database is like this:



Below are three real production scenarios:
  1. A testing engineer wants to see the information of user ID 123456 in databases & tables, and you need to tell the engineer which subtable the user is in.
  2. You need to find out the total user growth in 2022 and overall user information for drafting a yearly report.
  3. Your company is going to hold its 8th-anniversary event and you are required to provide a list of active users who have been registered for over 8 years. Since the data is distributed across database shards and table shards, it is not easy to complete the above-mentioned tasks. If you develop code every time to satisfy these temporary requirements, it’ll be inefficient to say the least. ShardingSphere-Proxy is perfect for these scenarios.

ShardingSphere-Proxy hides the actual backend databases, so the user operates the client side the same way as a database.

For example, t_user is split into several real tables at the database level, that is from t_user_0 to t_user_9 . While operating ShardingSphere-Proxy on the client side, the user only needs to know one logical table t_user,and routing to the real tables is executed inside ShardingSphere-Proxy.

1. Logical table: The logical name of the horizontally-scaled databases/tables with the same structure. A logical table is the logical identifier of tables in SQL. For example, user data is sharded into 10 tables according to the significant digits of the primary key, that is, t_user_0 to t_user_9 , and their common logical table is named t_user.

2. Actual table: The physical table actually exists in databases after scale-out, that is, the above-mentioned t_user_0 to t_user_9 .




2. The differences between ShardingSphere-JDBC and ShardingSphere-Proxy

After reading the above description, you probably feel that ShardingSphere-Proxy and ShardingSphere-JDBC are so similar. So what are the differences between the two?



Check out more on the differences between the two below:

1. ShardingSphere-JDBC is a .jar package. Its bottom layer completes SQL parsing, routing, rewriting, execution, and other processes by rewriting JDBC components. You should add the configuration files to implement the corresponding functions in the project, making it intrusive to applications.

2. ShardingSphere-Proxy is a process service. In most cases, it is positioned as a productivity tool to assist operations. It disguises itself as a database, making itself non-intrusive to applications. The SQL execution logic in ShardingSphere-Proxy is the same as in ShardingSphere-JDBC because they share the same kernel.

Since ShardingSphere-Proxy is non-intrusive to applications, and it shares the same kernel with ShardingSphere-JDBC — so why do we still need ShardingSphere-JDBC?

1. When an application directly operates databases through ShardingSphere-JDBC, there is only one network I/O. However, when the application connects to ShardingSphere-Proxy, one network I/O, and then ShardingSphere-Proxy operates databases, and another network I/O occurs, in total two network I/O requests.

2. There is one more layer of application called link, which is more likely to cause a data traffic bottleneck and potential risks to the application. In general, it’s suggested that an application should be used together with ShardingSphere-JDBC.

Of course, ShardingSphere-JDBC and ShardingSphere-Proxy can be deployed simultaneously with a hybrid architecture. ShardingSphere-JDBC is suitable for high-performance lightweight Online Transaction Processing (OLTP) applications developed in Java, while ShardingSphere-Proxy is perfect for Online Analytical Processing (OLAP) applications and scenarios for managing and operating sharding databases.




Quick Start Guide
There are three setup methods to install ShardingSphere-Proxy: binary package, Docker, and Helm. Stand-alone deployment and clustered deployment are also provided. Here, we take the standalone binary package as an example:

1. Get the ShardingSphere-Proxy binary installation package at this link;

2. Decompress it and then modify conf/server.yaml and files starting with the config- prefix to configure sharding, read/write splitting and other functions;

3. If you use Linux as operating system, please run bin/start.sh. For Windows operating systems, please run bin/start.bat to bootup ShardingSphere-Proxy.

The file directory looks like this:

Expand|Select|Wrap|Line Numbers
  1. ├── LICENSE
  2. ├── NOTICE
  3. ├── README.txt
  4. 0009 ─ ─ bin #Start/stop script
  5. 0009 ─ ─ conf #service configuration, data sharding, read/write splitting, data encryption, and other function configuration files
  6. ├── lib # Jar package
  7. └── licenses
1. Copy MySQL Java connector to the ext-lib package
Download the driver mysql-connector-java-5.1.47.jar (Click the link to download) or mysql-connector-java-8.0.11.jar (https://repo1.maven.org/maven2/mysql...ava-8.0.11.jar) into the ext-lib package. Note:there is no ext-lib package in the initial directory, so you need to create one yourself.

2. Modify the conf/server.yaml configuration file
For server.yaml the default operation mode is Cluster Mode. Below is how to configure the standalone operation mode.

Expand|Select|Wrap|Line Numbers
  1. mode:
  2. Type: Standalone #Standalone mode
  3. repository:
  4. type: File
  5. props:
  6. Path: /Users/xxx/software/apache-shardingsphere-5.1.0-shardingsphere-proxy/file #persistent file paths such as metadata configuration
  7. Overwrite: false #Overwrite the existing metadata?
  8. Rules: #Verification Info
  9. - !AUTHORITY
  10. Users: #Init user
  11. - root@%:root
  12. - sharding@:sharding
  13. provider:
  14. type: ALL_PRIVILEGES_PERMITTED
  15. - !TRANSACTION
  16. defaultType: XA
  17. provider Type: Atomikos
  18. - !SQL_PARSER
  19. sqlCommentParseEnabled: true
  20. sqlStatementCache:
  21. initialCapacity: 2000
  22. maximum Size: 65535
  23. concurrencyLevel: 4
  24. parseTreeCache:
  25. initialCapacity: 128
  26. maximum Size: 1024
  27. concurrencyLevel: 4
  28. Props: #public configuration
  29. max-connections-size-per-query: 1
  30. kernel-executor-size: 16  # Infinite by default.
  31. proxy-frontend-flush-threshold: 128  # The default value is 128.
  32. proxy-opentracing-enabled: false
  33. proxy-hint-enabled: false
  34. sql-show: false
  35. check-table-metadata-enabled: false
  36. show-process-list-enabled: false
  37. # Proxy backend query fetch size. A larger value may increase the memory usage of ShardingSphere Proxy.
  38. # The default value is -1, which means set the minimum value for different JDBC drivers.
  39. proxy-backend-query-fetch-size: -1
  40. check-duplicate-table-enabled: false
  41. proxy-frontend-executor-size: 0 # Proxy frontend executor size. The default value is 0, which means let Netty decide.
  42. # Available options of proxy backend executor suitable: OLAP(default), OLTP. The OLTP option may reduce time cost of writing packets to client, but it may increase the latency of SQL execution
  43. # and block other clients if client connections are more than `proxy-frontend-executor-size`, especially executing slow SQL.
  44. proxy-backend-executor-suitable: OLAP
  45. proxy-frontend-max-connections: 0 # Less than or equal to 0 means no limitation.
  46. sql-federation-enabled: false
  47. # Available proxy backend driver type: JDBC (default), ExperimentalVertx
  48. proxy-backend-driver-type: JDBC
Note: if you start a standalone ShardingSphere-Proxy and later need to change Proxy configurations, you need to set mode.overwrite to true. By doing so, ShardingSphere-Proxy will reload the metadata after startup.

3. Start ShardingSphere-Proxy
Execute the bootup command: sh bin/start.sh. The default port is 3307, and the port can be replaced by adding parameters to the start script command: sh bin/start.sh 3308 .

To check whether ShardingSphere-Proxy started successfully, execute the check log command: tail -100f logs/stdout.log .

The following information placed on the last line means that the startup is successful:
Expand|Select|Wrap|Line Numbers
  1. [INFO ] xxx-xx-xx xx:xx:xx.xxx [main] o.a.s.p.frontend.ShardingSphereProxy - ShardingSphere-Proxy Standalone mode started successfully
Scenarios and applications
Based on actual prouction scenarios, we’d like to show you how you can utilize ShardingSphere-Proxy to meet your expectations.




1. Initialize the database & table

Expand|Select|Wrap|Line Numbers
  1. # CREATE DATABASE
  2. CREATE DATABASE user_sharding_0;
  3. CREATE DATABASE user_sharding_1;
  4. # CREATE TABLE
  5. use user_sharding_0;
  6. CREATE TABLE `t_user_0` (
  7. `id` bigint (20) NOT NULL,
  8. `user_id` bigint (20) NOT NULL,
  9. `create_date` datetime DEFAULT NULL,
  10. PRIMARY KEY (`id`)) ENGINE = InnoDB DEFAULT CHARSET = latin1;
  11. CREATE TABLE `t_user_1` (
  12. `id` bigint (20) NOT NULL,
  13. `user_id` bigint (20) NOT NULL,
  14. `create_date` datetime DEFAULT NULL,
  15. PRIMARY KEY (`id`)) ENGINE = InnoDB DEFAULT CHARSET = latin1;
  16. use user_sharding_1;
  17. CREATE TABLE `t_user_0` (
  18. `id` bigint (20) NOT NULL,
  19. `user_id` bigint (20) NOT NULL,
  20. `create_date` datetime DEFAULT NULL,
  21. PRIMARY KEY (`id`)) ENGINE = InnoDB DEFAULT CHARSET = latin1;
  22. CREATE TABLE `t_user_1` (
  23. `id` bigint (20) NOT NULL,
  24. `user_id` bigint (20) NOT NULL,
  25. `create_date` datetime DEFAULT NULL,
  26. PRIMARY KEY (`id`)) ENGINE = InnoDB DEFAULT CHARSET = latin1;
2. Initialize sharding configuration in Proxy

Expand|Select|Wrap|Line Numbers
  1. schemaName: sharding_db
  2. dataSources:
  3. ds_0:
  4. url: jdbc:mysql://127.0.0.1:3306/user_sharding_0?serverTimezone=UTC&useSSL=false
  5. username: root
  6. password: root
  7. connectionTimeoutMilliseconds: 30000
  8. idleTimeoutMilliseconds: 60000
  9. maxLifetimeMilliseconds: 1800000
  10. maxPoolSize: 50
  11. minPoolSize: 1
  12. ds_1:
  13. url: jdbc:mysql://127.0.0.1:3306/user_sharding_1?serverTimezone=UTC&useSSL=false
  14. username: root
  15. password: root
  16. connectionTimeoutMilliseconds: 30000
  17. idleTimeoutMilliseconds: 60000
  18. maxLifetimeMilliseconds: 1800000
  19. maxPoolSize: 50
  20. minPoolSize: 1
  21. rules:
  22. - !SHARDING
  23. tables:
  24. t_user:
  25. actualDataNodes: ds_${0..1}.t_user_${0..1}
  26. tableStrategy:
  27. standard:
  28. shardingColumn: user_id
  29. shardingAlgorithmName: t_user_inline
  30. keyGenerateStrategy:
  31. column: user_id
  32. keyGeneratorName: snowflake
  33. bindingTables:
  34. - t_user
  35. defaultDatabaseStrategy:
  36. standard:
  37. shardingColumn: user_id
  38. shardingAlgorithmName: database_inline
  39. defaultTableStrategy:
  40. none:
  41. shardingAlgorithms:
  42. database_inline:
  43. type: INLINE
  44. props:
  45. algorithm-expression: ds_${user_id % 2}
  46. t_user_inline:
  47. type: INLINE
  48. props:
  49. algorithm-expression: t_user_${user_id % 2}
  50. KeyGenerators:
  51. snowflake:
  52. type: SNOWFLAKE
3. Test sharding configuration

Use the MySQL terminal command to connect to the ShardingSphere-Proxy server.

If you deploy databases in Docker containers, you need to add -h native ip. Because accessing 127.0.0.1 in the container is blocked.

Expand|Select|Wrap|Line Numbers
  1. #replace {xx} with actual parameter
  2. mysql -h {ip} -u {username} -p{password} -P 3307
  3. #example command
  4. mysql -h 127.0.0.1 -u root -proot -P 3307
ShardingSphere-Proxy supports Navicat MySQL, DataGrip, WorkBench, TablePlus, and other database management connectors.

After confirming the connection is successful, query the databases and make sure they are consistent with those in the configuration file.

Expand|Select|Wrap|Line Numbers
  1. mysql> show databases;
  2. +-------------+
  3. | schema_name |
  4. +-------------+
  5. | sharding_db |
  6. +-------------+
  7. 1 row in set (0.02 sec)
Execute the new t_user statement, insert 6 pieces of user data (3 pieces for 2021, and 3 pieces for 2022).

Expand|Select|Wrap|Line Numbers
  1. mysql> use sharding_db;
  2. mysql> INSERT INTO t_user (id, user_id, create_date) values(1, 1, '2021-01-01 00:00:00'), (2, 2, '2021-01-01 00:00:00'), (3, 3, '2021-01-01 00:00:00'), (4, 4, '2022-01-01 00:00:00'), (5, 5, '2022-02-01 00:00:00'), (6, 6, '2022-03-01 00:00:00');
  3. Query OK, 6 rows affected (0.16 sec)
  4. mysql> select * from t_user;
  5. +----+---------+---------------------+
  6. | id | user_id | create_date         |
  7. +----+---------+---------------------+
  8. |  2 |       2 | 2021-01-01 00:00:00 |
  9. |  4 |       4 | 2022-01-01 00:00:00 |
  10. |  6 |       6 | 2022-03-01 00:00:00 |
  11. |  1 |       1 | 2021-01-01 00:00:00 |
  12. |  3 |       3 | 2021-01-01 00:00:00 |
  13. |  5 |       5 | 2022-02-01 00:00:00 |
  14. +----+---------+---------------------+
At this time, the data is in the user_sharding_0 and
user_sharding_1 databases respectively.


Scenario 1: How to locate data information?
Since ShardingSphere-Proxy has logically aggregated the tables, you can query them directly.
Expand|Select|Wrap|Line Numbers
  1. mysql> select * from t_user where user_id = 1;
  2. +----+---------+---------------------+
  3. | id | user_id | create_date         |
  4. +----+---------+---------------------+
  5. |  1 |       1 | 2021-01-01 00:00:00 |
  6. +----+---------+---------------------+
  7. 1 row in set (0.01 sec)

Scenario 2: How to check user growth in 2022 and user information?
Expand|Select|Wrap|Line Numbers
  1. mysql> select count(*) from t_user where create_date > '2022-00-00 00:00:00';
  2. +----------+
  3. | count(*) |
  4. +----------+
  5. |        3 |
  6. +----------+
  7. 1 row in set (0.10 sec)
  8. mysql> select * from t_user where create_date > '2022-00-00 00:00:00';
  9. +----+---------+---------------------+
  10. | id | user_id | create_date         |
  11. +----+---------+---------------------+
  12. |  4 |       4 | 2022-01-01 00:00:00 |
  13. |  6 |       6 | 2022-01-01 00:00:00 |
  14. |  5 |       5 | 2022-01-01 00:00:00 |
  15. +----+---------+---------------------+
  16. 3 rows in set (0.02 sec)

Scenario 3: How to get a list of active users who have been registered for over 8 years?
Refer to the above code and you will know how to cope with it.


Conclusion
This post summarizes the basic concepts of ShardingSphere-Proxy based on the actual data sharding production scenarios, and demonstrates how ShardingSphere-Proxy faces these scenarios.

After reading this article, you should be able to:

Understand why ShardingSphere-Proxy is an excellent product to assist developers in DevOps.
Know the differences between ShardingSphere-JDBC and ShardingSphere-Proxy, their advantages and disadvantages, and how they are implemented.
Since you now have a better understanding of ShardingSphere-Proxy, we believe, it will be easier for you to study its source code. To know more about ShardingSphere, please visit Apache ShardingSphere's official website or our community’s previous blogs such as Create a Distributed Database Solution Based on PostgreSQL/openGauss.

Apache ShardingSphere Project Links:

ShardingSphere Github

ShardingSphere Twitter

ShardingSphere Slack

Contributor Guide

References
[1]Apache ShardingSphere Download Page:
https://shardingsphere.apache.org/document/current/en/downloads/

[2] mysql-connector-java-5.1.47.jar: https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.47/mysql-connector-java-5.1.47.jar

[3] mysql-connector-java-8.0.11.jar: https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.11/mysql-connector-java-8.0.11.jar

[4] Apache ShardingSphere official website: https://shardingsphere.apache.org/

[5] Build a Distributed Database Solution based on PostgreSQL/openGauss: https://shardingsphere.apache.org/blog/en/material/jan_28_blog_pg_create_a_distributed_database_solut ion_based_on_postgresql__opengauss/
Aug 26 '22 #1
0 20452

Sign in to post your reply or Sign up for a free account.

Similar topics

7
by: Brad Clements | last post by:
Is anyone interested in purchasing a hardcopy version of the Python 2.4 Library reference? That is, assuming it was NOT a direct print of current html/pdf versions. So, nicely formatted for a...
15
by: Ron Adam | last post by:
Does anyone have suggestions on how to improve this further? Cheers, Ron_Adam def getobjs(object, dlist=, lvl=0, maxlevel=1): """ Retrieve a list of sub objects from an object. """
5
by: Bill Willyerd | last post by:
I have been looking for some documentation that would support or reject my opinion on Production -vs- Development naming conventions. I believe that each environment should be housed on separate...
2
by: A.M | last post by:
Hi, I uderstand any Quick Start samples in .NET documentation or VS.NET documentation or in any other community/portal websites, is a link to http://samples.gotdotnet.com/quickstart/. That...
2
by: Hai Ly Hoang | last post by:
I want to learn aspnet. Which website i can start with ? Thanks
0
by: genojoe | last post by:
I am using .NET 1.1 on a computer with Windows XP Professional Service Pack 2. I installed the Samples and QuickStart Tutorials that is part of Framework SDK v1.1. Installation went smoothly. ...
0
by: Jim C. Nasby | last post by:
Our open-source project (http://opensims.org) uses PostgreSQL as the reference implementation, but we've found that the bulk of installation issues have to due with getting PostgreSQL up and...
1
by: David Thielen | last post by:
Hi; Is there a sample somewhere on a control that creates tab menus like at http://www.windwardreports.com/? Or like at http://www.windwardreports.com/dotnethelp/index.htm (no sub-menu)? And...
1
by: rmli | last post by:
Oracle Server Quick Reference Guide http://quickdba.blogspot.com/2006/05/oracle-server-quick-reference-guide.html Visit daily to get more updated articles
15
by: (PeteCresswell) | last post by:
Some time within the next nine months, there's a pretty good chance I'll have the "opportunity" to port a bond trading system that I wrote - whose front end is currently MS Access - to .NET. ...
0
by: MeoLessi9 | last post by:
I have VirtualBox installed on Windows 11 and now I would like to install Kali on a virtual machine. However, on the official website, I see two options: "Installer images" and "Virtual machines"....
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: marcoviolo | last post by:
Dear all, I would like to implement on my worksheet an vlookup dynamic , that consider a change of pivot excel via win32com, from an external excel (without open it) and save the new file into a...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
0
by: ArrayDB | last post by:
The error message I've encountered is; ERROR:root:Error generating model response: exception: access violation writing 0x0000000000005140, which seems to be indicative of an access violation...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.