TPS and Tomcat & HikariCP Settings(EN)

김민종 edited this page Oct 28, 2024 · 1 revision

Measurement Environment

  • Because of the monthly budget limit, the test database could not be provisioned with the same instance class as production.
    • Database: db.m6gd.large (production), t4g.micro (test)
  • It was decided to optimize performance in the test environment as much as possible before applying the results to production.
  • Each database table was loaded with 700,000 to 1,000,000 dummy records.

TPS Goal

Benchmark Service - Baekjoon

  • Baekjoon, an algorithm practice service widely used in programming education, was selected as the benchmark.

TPS Goal

  • The target TPS is the benchmark service's daily page views divided by the number of seconds in a day.
  • Target TPS = daily page views / 86,400 (seconds per day)
  • 19,080,000 / 86,400 = 220.69 TPS
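
The division above can be recomputed with a few lines of Python. This is a minimal sketch: the function name `target_tps` is chosen here for illustration, and it assumes page views are spread evenly over the day, so the result is an average rate rather than a peak.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def target_tps(daily_page_views: int) -> float:
    """Average requests per second if the page views were spread evenly over one day."""
    return daily_page_views / SECONDS_PER_DAY


if __name__ == "__main__":
    # Page-view figure taken from the bullet above; replace it with your own benchmark.
    print(f"{target_tps(19_080_000):.2f} TPS")
```

Because real traffic is bursty, a peak-hour target would be higher than this daily average.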

Test Types

  1. Single test of key APIs that directly assist in pair programming.
  2. Scenario-based multi-test predicting the flow of service usage.

1. Single Test

  • Several features that help 'Coding Duo' users with pair programming were selected. TPS was measured for each API call individually, and any anomalies were noted.

image.png

  • Conducted the test with 200 virtual users, increasing by 50 every second for a total of 8 minutes.

Pair Room

Pair Room Creation TPS

Scenario: Each user calls the [POST] /api/pair-room API once every second.

image.png

  • TPS Trend

image.png

  • Performance Test Metrics

Scenario TPS

  • 16 TPS

Remarks

  • DB Bottleneck: HikariCP pending increased to a maximum of 168.
  • Failure rate: about 0.017% (13 of 77,705 requests failed).

Pair Room Retrieval TPS

Scenario: Each user calls the [GET] /api/pair-room/{accessCode} API once every second.

image.png

  • TPS Trend

image.png

  • Performance Test Metrics

Scenario TPS

  • 22.61 TPS

Pair Room Status Change TPS

Scenario: Each user calls the [PATCH] /api/pair-room/{accessCode}/status API once every second.

image.png

  • TPS Trend

image.png

  • Performance Test Metrics

Scenario TPS

  • 25.80 TPS

Remarks

  • DB Bottleneck: HikariCP pending increased to a maximum of 148.

To-Do

To-Do Creation TPS

Scenario: Each user calls the following APIs once every second:

  1. [POST] /api/pair-room (create a pair room)
  2. [POST] /api/{accessCode}/todos (create a to-do list in the room)

image.png

  • TPS Trend

image.png

  • Performance Test Metrics

Scenario TPS

  • 16.78 TPS

Remarks

  • DB Bottleneck: HikariCP pending increased to a maximum of 179.
  • Pair room creation timeout: 338 occurrences.

To-Do Retrieval TPS

Scenario: Each user calls the following APIs once every second:

  1. [POST] /api/{accessCode}/todos (create a to-do list in the room)
  2. [GET] /api/{accessCode}/todos (retrieve the to-do list)

image.png

  • TPS Trend

image.png

  • Performance Test Metrics

Scenario TPS

  • 5.30 TPS

Remarks

  • JVM live threads peaked at 312.
  • Low TPS
  • Low success rate

Reference Links

Reference Link Creation TPS

Scenario: Each user calls the [POST] /api/{accessCode}/reference-link API once every second.

image.png

  • TPS Trend

image.png

  • Performance Test Metrics

Scenario TPS

  • 24.39 TPS

Category

Category Creation TPS

Scenario: Each user calls the [POST] /api/{accessCode}/category API once every second.

image.png

  • Success Rate of Requests per Second

image.png

  • Performance Test Metrics

Scenario TPS

  • 22.04 TPS

Single Test - Result Analysis

TPS

  • Minimum: 5.3
  • Maximum: 25.8
  • Average: 18.9
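
The summary above can be reproduced from the per-API figures. This is a minimal sketch: the dictionary keys are labels chosen here, and pair room creation is entered as 16.0 because the report gives it without decimals, so the computed mean may differ slightly from the rounded average quoted above.

```python
from statistics import mean

# Scenario TPS values as reported in the single-test sections above.
single_test_tps = {
    "pair room creation": 16.0,
    "pair room retrieval": 22.61,
    "pair room status change": 25.80,
    "to-do creation": 16.78,
    "to-do retrieval": 5.30,
    "reference link creation": 24.39,
    "category creation": 22.04,
}


def summarize(tps_by_api: dict[str, float]) -> dict[str, float]:
    """Return the minimum, maximum, and arithmetic mean of the measured TPS values."""
    values = list(tps_by_api.values())
    return {"min": min(values), "max": max(values), "mean": mean(values)}


summary = summarize(single_test_tps)
slowest_api = min(single_test_tps, key=single_test_tps.get)

if __name__ == "__main__":
    print(summary, "slowest:", slowest_api)
```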

Issues

  • HikariCP exhaustion issue occurred when the number of virtual users reached around 150.
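
A back-of-the-envelope way to read this observation is Little's law: the average number of connections in use equals the request arrival rate multiplied by the time each request holds a connection. The sketch below is only an estimate under stated assumptions. It assumes a pool of 10 connections (HikariCP's default, which the page does not confirm for this test) and that every virtual user really issues one request per second regardless of response time. The function names are hypothetical.

```python
def connections_demanded(users: int, hold_seconds: float,
                         requests_per_user_per_sec: float = 1.0) -> float:
    """Little's law: average connections in use = arrival rate * hold time."""
    return users * requests_per_user_per_sec * hold_seconds


def implied_hold_seconds(pool_size: int, users_at_saturation: int,
                         requests_per_user_per_sec: float = 1.0) -> float:
    """Hold time per request at which the pool is exactly full at the given user count."""
    arrival_rate = users_at_saturation * requests_per_user_per_sec
    return pool_size / arrival_rate


if __name__ == "__main__":
    # Assumed pool size of 10; saturation near 150 users is taken from the bullet above.
    hold = implied_hold_seconds(pool_size=10, users_at_saturation=150)
    print(f"implied hold time: {hold * 1000:.0f} ms per request")
    print(f"demand at 200 users: {connections_demanded(200, hold):.1f} connections")
```

Under these assumptions, any user count beyond the saturation point asks for more connections than the pool holds, which is consistent with the growing pending counts reported above.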

Conclusion

A database bottleneck was confirmed. It will be addressed by tuning during scenario testing.


2. Scenario Testing

Scenario Specifications

  1. Pair Room Creation
    1. Create SSE connection
    2. Retrieve To-Do list
    3. Retrieve Reference links
    4. Retrieve Timer information
  2. Start Timer
  3. Create Reference Link
  4. Create To-Do list 10 times
  5. Create Category 3 times
  6. Change Pair Roles
    1. Change Pair Room Status
  7. Stop Timer
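
To get a feel for how much work one pass through this scenario generates, the steps can be written down as data and counted. This is a sketch that assumes each step maps to exactly one HTTP call, which the page does not state. The step labels are paraphrased from the list above.

```python
# (step label, number of calls per scenario pass), following the specification above.
scenario_steps = [
    ("create pair room", 1),
    ("create SSE connection", 1),
    ("retrieve to-do list", 1),
    ("retrieve reference links", 1),
    ("retrieve timer information", 1),
    ("start timer", 1),
    ("create reference link", 1),
    ("create to-do item", 10),
    ("create category", 3),
    ("change pair roles", 1),
    ("change pair room status", 1),
    ("stop timer", 1),
]


def calls_per_pass(steps: list[tuple[str, int]]) -> int:
    """Total API calls made by one virtual user in one pass through the scenario."""
    return sum(count for _, count in steps)


if __name__ == "__main__":
    print(calls_per_pass(scenario_steps), "calls per scenario pass")
```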

Test Environment

image.png

  • 0~4 minutes: Increase by 50 sequentially
  • 4~8 minutes: Increase by 100 sequentially
  • 8~12 minutes: Increase by 150 sequentially
  • 12~16 minutes: Increase by 200 sequentially

Experiment Goals

  • Tune the HikariCP maximum pool size
  • Tune Tomcat max threads, max connections, and accept count

HikariCP Pool Size Adjustment

1. Max Connection Pool 20

image.png

Result

  • TPS: 7.9
  • In certain situations, connections are not returned, causing requests to remain in a pending state.
    • When more than 150 users connect, the Hikari connections increase rapidly.

2. Max Connection Pool 10

image.png

Result

  • TPS: 10.8
  • In certain situations, connections are not returned, causing requests to remain in a pending state.
    • When more than 178 users connect, the Hikari connections increase rapidly.

Changes in Values by Max Pool Size

image.png

image.png

  • From left to right, the number of max connections in the pool is 10, 20, and 5.
  • The default setting of 10 connections produced the highest TPS, while 20 and 5 connections produced similar, lower values.
  • The number of pending connections was also lowest with the default setting of 10 connections.

Conclusion: It was confirmed that HikariCP is most efficient with a max pool size of 10 connections.


Tomcat Configuration Adjustments

Because the database bottleneck prevented meaningful measurements, the scenario was simplified, and the limits on HTTP requests and TPS were measured under sudden heavy load.

Testing Environment

  • 0~1 minute: 100 users
  • 1~2 minutes: 200 users
  • 2 minutes~2 minutes 30 seconds: 300 users

1. Max-Thread

image.png

Max-Thread: 100

  • TPS: 30.72
  • Success Rate: 94.79%

Max-Thread: 150

  • TPS: 29.35
  • Success Rate: 98.25%

Max-Thread: 200

  • TPS: 29.10
  • Success Rate: 97.03%

Conclusion: Although there is no significant difference in TPS, the success rate is highest at a max-thread of 150, at 98.25%.
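
The comparison behind this conclusion can be made explicit by putting the three runs side by side. This is a minimal sketch using the TPS and success-rate figures listed above; the selection rule (prefer the highest success rate, since TPS barely moves) is an interpretation of the conclusion, not a rule stated on the page.

```python
# Max-thread setting -> measurements reported above.
max_thread_results = {
    100: {"tps": 30.72, "success_rate": 94.79},
    150: {"tps": 29.35, "success_rate": 98.25},
    200: {"tps": 29.10, "success_rate": 97.03},
}


def best_by_success_rate(results: dict[int, dict[str, float]]) -> int:
    """Return the setting with the highest success rate."""
    return max(results, key=lambda setting: results[setting]["success_rate"])


def tps_spread(results: dict[int, dict[str, float]]) -> float:
    """Difference between the highest and lowest TPS across all settings."""
    tps_values = [run["tps"] for run in results.values()]
    return max(tps_values) - min(tps_values)


if __name__ == "__main__":
    print("best max-thread:", best_by_success_rate(max_thread_results))
    print(f"TPS spread: {tps_spread(max_thread_results):.2f}")
```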

2. Max-Connection

image.png

image.png

Max-Connection: 4000

  • Success Rate: 97.47%
  • TPS: 27.98
  • GC stop-the-world pause: 13.4 ms

Max-Connection: 6000

  • Success Rate: 98.58%
  • TPS: 27.8
  • GC stop-the-world pause: 16.6 ms

Max-Connection: 8000

  • Success Rate: 98.53%
  • TPS: 29.95
  • GC stop-the-world pause: 66.1 ms

Conclusion: Adjusting Tomcat's Max-Connection produced little change in TPS. However, lowering it from 8000 (close to Tomcat's default of 8192) to 6000 shortened the GC stop-the-world pause from 66.1 ms to 16.6 ms, roughly a fourfold reduction.
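
The size of the GC effect can be checked directly from the three measurements above. This is a small sketch; the helper name `pause_ratio` is chosen here for illustration.

```python
# Max-Connection setting -> GC stop-the-world pause in milliseconds, as reported above.
gc_pause_ms = {4000: 13.4, 6000: 16.6, 8000: 66.1}


def pause_ratio(pauses: dict[int, float], higher: int, lower: int) -> float:
    """How many times longer the pause is at the `higher` setting than at the `lower` one."""
    return pauses[higher] / pauses[lower]


if __name__ == "__main__":
    print(f"8000 vs 6000: {pause_ratio(gc_pause_ms, 8000, 6000):.2f}x")
    print(f"6000 vs 4000: {pause_ratio(gc_pause_ms, 6000, 4000):.2f}x")
```

The pause grows only modestly between 4000 and 6000 but jumps sharply at the largest setting, which is why 6000 is a reasonable middle ground in these measurements.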

3. Accept Count

Accept-Count: 100

image.png

Accept-Count: 200

image.png

Conclusion: Tomcat's acceptCount was increased from 100 to 200 in an attempt to reduce the roughly 2% of requests that timed out, but no significant change was observed.

Results

Applied Values

  • HikariCP Max-Pool-Size: 10
  • Tomcat Max-Thread: 150
  • Tomcat Accept-Count: 100
  • Tomcat Max-Connection: 6000
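
If the service runs on Spring Boot with its embedded Tomcat (an assumption, since the page does not name the framework), these values map onto standard Spring Boot property names. The sketch below renders them as `application.properties` lines. `server.tomcat.threads.max` is the name used since Spring Boot 2.3; older versions use `server.tomcat.max-threads`.

```python
# Applied values from the list above, keyed by Spring Boot property name.
# Assumes a Spring Boot application; adapt the keys for other setups.
applied_settings = {
    "spring.datasource.hikari.maximum-pool-size": 10,
    "server.tomcat.threads.max": 150,
    "server.tomcat.accept-count": 100,
    "server.tomcat.max-connections": 6000,
}


def render_properties(settings: dict[str, int]) -> str:
    """Render the settings as key=value lines for an application.properties file."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    print(render_properties(applied_settings))
```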

The final TPS was measured with the applied values above, using the same database instance class (db.m6gd.large) as the production server.

image.png

220.47 TPS was confirmed, which is close to the target value.
