Spring Embedded Tomcat Tuning – Teo's Repository

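Why tune Embedded Tomcat?

Spring Boot ships with an embedded Tomcat, so an application runs without installing a separate WAS. The defaults, however, favor development convenience, so in production the thread pool, connector, timeouts, and connection limits need to be tuned. Poorly chosen values cause serious performance problems: overflowing request queues, exhausted connections, and slow responses.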

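Thread pool settings

Tomcat follows a thread-per-request model: each request occupies one worker thread for as long as it is being processed. The size of the thread pool is therefore the key performance setting.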

# application.yml
server:
  tomcat:
    threads:
      max: 200          # maximum worker threads (default: 200)
      min-spare: 20     # minimum idle threads kept alive (default: 10)
    max-connections: 8192  # maximum concurrent connections (default: 8192)
    accept-count: 100      # accept queue length (default: 100)

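Setting	Default	Meaning
`threads.max`	200	Maximum number of requests processed concurrently
`threads.min-spare`	10	Minimum number of threads kept alive even when idle
`max-connections`	8192	Maximum concurrent connections on the NIO connector
`accept-count`	100	Length of the OS-level queue used once max-connections is reached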
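To see how these settings interact, here is a minimal sketch of the arithmetic. The class `ConnectorCapacity` and its method names are made up for illustration and are not part of Tomcat or Spring Boot. It assumes that the server holds roughly max-connections + accept-count connections before new ones are refused (the exact backlog behavior depends on the OS), and it estimates sustainable throughput with Little's law (throughput ≈ threads / average response time).

```java
/** Hypothetical helper for back-of-the-envelope capacity estimates; not a Tomcat API. */
final class ConnectorCapacity {

    /** Approximate number of connections (open + queued) held before the OS starts refusing new ones. */
    static int maxHeldConnections(int maxConnections, int acceptCount) {
        return maxConnections + acceptCount;
    }

    /** Little's law: requests per second that maxThreads workers can sustain at a given average response time. */
    static double sustainableRequestsPerSecond(int maxThreads, double avgResponseMillis) {
        if (avgResponseMillis <= 0) {
            throw new IllegalArgumentException("avgResponseMillis must be positive");
        }
        return maxThreads * 1000.0 / avgResponseMillis;
    }

    public static void main(String[] args) {
        // Defaults from the table: 8192 connections plus a queue of 100
        System.out.println(maxHeldConnections(8192, 100));            // 8292
        // 200 threads, 50 ms average response time
        System.out.println(sustainableRequestsPerSecond(200, 50.0));  // 4000.0
    }
}
```

If the incoming request rate stays above the second figure for long, the extra requests pile up in open connections first and in the accept queue after that.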

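Tuning formula: threads.max = (number of CPU cores) × (1 + I/O wait time / CPU time). For a CPU-bound application, keep it close to the core count; for an I/O-bound application (many DB calls), set it to 5 to 10 times the core count.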

# 4-core server, API server that spends 80% of each request waiting on I/O
# threads.max = 4 × (1 + 0.8/0.2) = 4 × 5 = 20 ... theoretical value
# In practice, find the optimum with load testing (usually 100-400)

server:
  tomcat:
    threads:
      max: 150
      min-spare: 30
    max-connections: 10000
    accept-count: 200
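The theoretical value can be computed with a few lines of Java. This is only a sketch of the formula cores × (1 + wait time / compute time). `ThreadPoolSizing` and `theoreticalThreads` are illustrative names, and the I/O wait fraction is something you would have to measure yourself (for example with a profiler).

```java
/** Hypothetical helper that evaluates the thread-sizing formula; not part of Spring Boot. */
final class ThreadPoolSizing {

    /**
     * @param cores          number of CPU cores
     * @param ioWaitFraction fraction of request time spent waiting on I/O, in [0, 1)
     * @return cores × (1 + wait/compute), rounded to the nearest integer
     */
    static int theoreticalThreads(int cores, double ioWaitFraction) {
        if (cores < 1 || ioWaitFraction < 0.0 || ioWaitFraction >= 1.0) {
            throw new IllegalArgumentException("cores >= 1 and 0 <= ioWaitFraction < 1 required");
        }
        double waitToCompute = ioWaitFraction / (1.0 - ioWaitFraction);
        return (int) Math.round(cores * (1.0 + waitToCompute));
    }

    public static void main(String[] args) {
        // The 4-core, 80% I/O wait example from the text
        System.out.println(theoreticalThreads(4, 0.8)); // 20
        // Same calculation for the machine this runs on, assuming 50% I/O wait
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(theoreticalThreads(cores, 0.5));
    }
}
```

Treat the result as a lower bound to start load testing from, not as a final setting.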

Timeout settings

Missing or overly generous timeouts let Slow HTTP attacks and slow clients hold connections and threads until the pool is exhausted.

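server:
  tomcat:
    # Connection timeout: how long to wait for the request line after a client connects.
    # By default the same value also bounds reads of the request body.
    # (The top-level server.connection-timeout key was deprecated in favor of
    #  server-specific keys and is gone from current Spring Boot versions.)
    connection-timeout: 10s   # default: 20s → shortened to 10s

    # Keep-Alive timeout: how long an idle connection is kept open
    keep-alive-timeout: 30s   # default: same as connection-timeout

    # Keep-Alive max requests: maximum requests served over one connection
    max-keep-alive-requests: 200  # default: 100, -1 means unlimited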

For finer-grained control, configure the connector in Java:

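// Package names as of Spring Boot 2.x/3.x
import org.apache.coyote.http11.AbstractHttp11Protocol;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TomcatConfig {

    @Bean
    public WebServerFactoryCustomizer<TomcatServletWebServerFactory>
            tomcatCustomizer() {

        return factory -> {
            factory.addConnectorCustomizers(connector -> {
                // Common base class of the NIO and NIO2 HTTP/1.1 handlers,
                // so the cast also works after switching the protocol
                var protocol = (AbstractHttp11Protocol<?>)
                    connector.getProtocolHandler();

                // Thread pool
                protocol.setMaxThreads(200);
                protocol.setMinSpareThreads(30);
                protocol.setAcceptCount(200);

                // Connections (timeouts in milliseconds)
                protocol.setMaxConnections(10000);
                protocol.setConnectionTimeout(10000);
                protocol.setKeepAliveTimeout(30000);
                protocol.setMaxKeepAliveRequests(200);

                // Request size limits
                protocol.setMaxHttpHeaderSize(16384);   // 16KB
                protocol.setMaxSwallowSize(2097152);    // 2MB

                // Compression (responses of 1KB or more, listed MIME types only)
                connector.setProperty("compression", "on");
                connector.setProperty("compressionMinSize", "1024");
                connector.setProperty("compressibleMimeType",
                    "application/json,text/html,text/css,application/javascript");
            });
        };
    }
}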

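NIO2 connector and asynchronous processing

You can switch from the default NIO connector to NIO2, which is built on Java's asynchronous channel API. Whether this improves performance depends on the workload, so confirm it with a load test.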

@Bean
public WebServerFactoryCustomizer<TomcatServletWebServerFactory>
        nio2Customizer() {
    return factory -> {
        factory.setProtocol("org.apache.coyote.http11.Http11Nio2Protocol");
    };
}

Note that `server.tomcat.protocol` is not among the `server.tomcat.*` properties documented by Spring Boot, so setting it in application.yml has no effect. Use the customizer above instead.

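NIO is built on `Selector`-based non-blocking I/O, while NIO2 uses Java's asynchronous channels (`AsynchronousSocketChannel`). On Linux and macOS both ultimately rely on epoll/kqueue. NIO2 can perform better than NIO under high concurrency, but the difference is often small, so measure both.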

Access log settings

Request logs are essential in production. The Tomcat access log records response time, status code, and client information.

server:
  tomcat:
    accesslog:
      enabled: true
      directory: /var/log/app
      prefix: access
      suffix: .log
      file-date-format: .yyyy-MM-dd
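      # Custom pattern: client IP, request time, request line (method, URL, protocol), status, bytes sent, elapsed time
      # %D is in milliseconds on Tomcat 9; the Tomcat docs switch it to microseconds from Tomcat 10 onwards
      pattern: '%h %t "%r" %s %b %D'
      rotate: true
      max-days: 30
      condition-if: accessLogEnabled  # conditional logging: only requests with this attribute set are logged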
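Once the log is written in the `%h %t "%r" %s %b %D` pattern, individual lines are easy to parse. The following is a minimal sketch using only the JDK. `AccessLogParser`, its `Entry` class, and the sample log line are illustrative assumptions, and the regular expression only covers well-formed request lines.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for lines written with the pattern: %h %t "%r" %s %b %D */
final class AccessLogParser {

    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) \\[([^\\]]+)\\] \"(\\S+) (\\S+) ([^\"]+)\" (\\d{3}) (\\d+|-) (\\d+)$");

    /** One parsed log line; duration is in whatever unit %D uses on your Tomcat version. */
    static final class Entry {
        final String clientIp;
        final String timestamp;
        final String method;
        final String uri;
        final int status;
        final long bytes;
        final long duration;

        Entry(String clientIp, String timestamp, String method, String uri,
              int status, long bytes, long duration) {
            this.clientIp = clientIp;
            this.timestamp = timestamp;
            this.method = method;
            this.uri = uri;
            this.status = status;
            this.bytes = bytes;
            this.duration = duration;
        }
    }

    /** Returns the parsed entry, or empty if the line does not match the pattern. */
    static Optional<Entry> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        // %b writes "-" when no body bytes were sent
        long bytes = "-".equals(m.group(7)) ? 0L : Long.parseLong(m.group(7));
        return Optional.of(new Entry(m.group(1), m.group(2), m.group(3), m.group(4),
            Integer.parseInt(m.group(6)), bytes, Long.parseLong(m.group(8))));
    }

    public static void main(String[] args) {
        String sample =
            "192.168.0.10 [10/Oct/2024:13:55:36 +0900] \"GET /api/users?id=1 HTTP/1.1\" 200 512 37";
        parse(sample).ifPresent(e ->
            System.out.println(e.method + " " + e.uri + " " + e.status + " " + e.duration));
    }
}
```

From here it is a small step to filter for slow requests or to count 5xx responses per URI.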

Structured logging in JSON format:

server:
  tomcat:
    accesslog:
      enabled: true
      pattern: >-
        {"ip":"%h","time":"%t","method":"%m",
        "uri":"%U","query":"%q","status":%s,
        "size":%B,"duration":%D,"ua":"%{User-Agent}i"}

Graceful Shutdown

With Spring Boot 2.3+, enabling graceful shutdown makes the server wait for in-flight requests to complete before it stops.

server:
  shutdown: graceful

spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s  # maximum wait time

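Graceful shutdown sequence:

1. SIGTERM is received
2. New requests are rejected (Tomcat stops accepting them at the network layer; an immediate 503 response is Undertow's behavior)
3. In-flight requests are allowed to finish (up to 30 seconds)
4. Keep-Alive connections are closed
5. The thread pool is shut down
6. The ApplicationContext is closed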
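The same "stop intake, drain, then force" idea can be shown with a plain `ExecutorService`. This is an analogy under simplified assumptions, not how Spring Boot or Tomcat implement graceful shutdown internally. `GracefulShutdownSketch` and `shutdownGracefully` are illustrative names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of graceful shutdown on a plain thread pool. */
final class GracefulShutdownSketch {

    /** Returns true if all in-flight tasks finished within the timeout. */
    static boolean shutdownGracefully(ExecutorService pool, long timeout, TimeUnit unit)
            throws InterruptedException {
        pool.shutdown();                          // reject new work; submitted tasks keep running
        if (pool.awaitTermination(timeout, unit)) {
            return true;                          // everything drained in time
        }
        pool.shutdownNow();                       // timeout hit: interrupt what is still running
        return false;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Future<String> inFlight = pool.submit(() -> {
            Thread.sleep(200);                    // simulated in-flight request
            return "done";
        });

        boolean clean = shutdownGracefully(pool, 2, TimeUnit.SECONDS);
        System.out.println(clean + " " + inFlight.get());

        try {
            pool.submit(() -> "late");            // arrives after shutdown started
        } catch (RejectedExecutionException e) {
            System.out.println("rejected");
        }
    }
}
```

The timeout argument plays the role of `timeout-per-shutdown-phase`: work that is still running when it expires is cut off instead of being waited for.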

Monitoring Tomcat metrics with Actuator

# Key Tomcat metrics (Micrometer)
tomcat.threads.current                # current thread count
tomcat.threads.busy                   # threads in use
tomcat.threads.config.max             # configured maximum threads
tomcat.connections.current            # current connection count
tomcat.connections.keepalive.current  # Keep-Alive connection count
tomcat.sessions.active.current        # active session count

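# application.yml
server:
  tomcat:
    mbeanregistry:
      enabled: true   # needed for the thread and connection metrics to be registered
management:
  endpoints:
    web:
      exposure:
        include: health,metrics,prometheus
  metrics:
    tags:
      application: my-api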

# Example PromQL for a Grafana dashboard
# Thread utilization (%)
tomcat_threads_busy_threads / tomcat_threads_config_max_threads * 100

# Thread pool saturation alert
# Warn when average utilization over 5 minutes exceeds 80%
avg_over_time(tomcat_threads_busy_threads[5m]) > 0.8 * tomcat_threads_config_max_threads
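The alert logic is simple enough to check by hand. The sketch below mirrors it in Java: utilization is busy / max, and the alert fires when the average of the busy-thread samples in a window exceeds a fraction of the maximum. `ThreadSaturation` and its methods are illustrative names, and the samples are invented numbers rather than real measurements.

```java
/** Hypothetical mirror of the dashboard math; not part of Micrometer or Prometheus. */
final class ThreadSaturation {

    /** Thread utilization as a percentage of the configured maximum. */
    static double utilizationPercent(double busyThreads, double maxThreads) {
        return busyThreads / maxThreads * 100.0;
    }

    /** True when the average of the samples exceeds threshold × maxThreads. */
    static boolean shouldAlert(double[] busySamples, double maxThreads, double threshold) {
        if (busySamples.length == 0) {
            return false;                      // no data, no alert
        }
        double sum = 0.0;
        for (double sample : busySamples) {
            sum += sample;
        }
        double average = sum / busySamples.length;
        return average > threshold * maxThreads;
    }

    public static void main(String[] args) {
        System.out.println(utilizationPercent(150, 200));                          // 75.0
        System.out.println(shouldAlert(new double[] {170, 180, 190}, 200, 0.8));   // true
        System.out.println(shouldAlert(new double[] {100, 150, 170}, 200, 0.8));   // false
    }
}
```

Averaging over a window keeps a single short spike from triggering the alert.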

Recommended settings by environment

# Small API (2 cores, 4GB)
server:
  tomcat:
    threads: { max: 100, min-spare: 10 }
    max-connections: 5000
    accept-count: 100
    connection-timeout: 10s

# Large API (8 cores, 16GB)
server:
  tomcat:
    threads: { max: 400, min-spare: 50 }
    max-connections: 20000
    accept-count: 500
    connection-timeout: 5s

# WebSocket server (long-lived connections)
server:
  tomcat:
    threads: { max: 50, min-spare: 10 }
    max-connections: 50000
    keep-alive-timeout: 120s
    max-keep-alive-requests: -1

Spring Actuator custom endpoints: operational endpoints to use alongside Tomcat metrics
Spring Docker image optimization: Tomcat resource limits in container environments