Improving Pagination Performance

1. Why I wrote this post

I wrote this post to organize a few techniques I already knew for improving pagination performance.

Cursor pagination
Asynchronous count query
Using an aggregate table

2. Cursor pagination

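Cursor pagination is a technique that uses an index to fetch data quickly. It needs no count query, and because it reads the previous or next rows using only the database index, reads are fast. The trade-offs are that the total number of pages is unknown and that the first query can take some time.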

  
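-- Fetch the first page (newest rows first)
SELECT id, title, content, created_at 
FROM posts 
ORDER BY id DESC
LIMIT 10;

-- Fetch the second page; ${POST_ID} is the last id of the previous page
SELECT id, title, content, created_at 
FROM posts 
WHERE id < ${POST_ID}
ORDER BY id DESC
LIMIT 10;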

Honestly, this technique is so widely known that I hesitated to include it, but I decided that including it was better than leaving it out.
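As a rough illustration of the two queries above, here is a minimal in-memory Kotlin sketch of the same cursor logic. The `Post` class, the `findPage` function, and the sample data are hypothetical stand-ins for the `posts` table. It assumes rows are read in descending `id` order, so the cursor is the last `id` of the previous page.

```kotlin
// Hypothetical stand-in for a row of the posts table.
data class Post(val id: Long, val title: String)

// Returns up to `size` posts with id < cursor, largest id first.
// A null cursor means "first page", mirroring the query without a WHERE clause.
fun findPage(posts: List<Post>, cursor: Long?, size: Int): List<Post> =
    posts.asSequence()
        .filter { cursor == null || it.id < cursor }
        .sortedByDescending { it.id }
        .take(size)
        .toList()

fun main() {
    val posts = (1L..25L).map { Post(it, "post-$it") }

    val first = findPage(posts, cursor = null, size = 10)
    // The last id of a page becomes the cursor for the next page.
    val second = findPage(posts, cursor = first.last().id, size = 10)
    val third = findPage(posts, cursor = second.last().id, size = 10)

    println(first.map { it.id })   // ids 25 down to 16
    println(second.map { it.id })  // ids 15 down to 6
    println(third.map { it.id })   // ids 5 down to 1
}
```

A page shorter than `size` signals that there is no further data. In a real database the filter and ordering would be served by the index on `id` rather than by scanning a list.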

3. Asynchronous count query

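When pagination needs the total row count, two queries run: a count and a data fetch. Running the two queries asynchronously and then combining their results can improve performance. In code, it looks like the following. However, this does not escape the drawbacks of the offset approach, so it still slows down as the data grows.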

  
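import com.querydsl.core.types.dsl.Expressions.numberTemplate
import com.querydsl.jpa.impl.JPAQueryFactory
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.async
import kotlinx.coroutines.coroutineScope
import org.springframework.stereotype.Repository
// `user` is the Querydsl Q-type instance for the User entity;
// its import depends on the project's package layout.

@Repository
class UserEntityReadRepository(
    private val queryFactory: JPAQueryFactory,
) : UserReadRepository {

    companion object {
        private val totalCountExpression = numberTemplate(Long::class.java, "count(1)")
    }

    override suspend fun findUsers(
        page: Int,
        size: Int,
    ): Pair<Long, List<User>> = coroutineScope {
        // Asynchronous count query. The JDBC call blocks, so it runs on
        // Dispatchers.IO to let both queries proceed in parallel.
        val totalCount = async(Dispatchers.IO) {
            queryFactory.select(totalCountExpression)
                .from(user)
                .fetchOne() ?: 0L
        }

        // Asynchronous data fetch; `page` is zero-based.
        val findUsers = async(Dispatchers.IO) {
            queryFactory.selectFrom(user)
                .offset(page * size.toLong())
                .limit(size.toLong())
                .fetch()
        }

        // Combine the two results and return them.
        Pair(totalCount.await(), findUsers.await())
    }
}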

I put together a simple example you can refer to, so take a look at the repository.
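To see why running the two queries concurrently helps, here is a small self-contained sketch. It uses `CompletableFuture` from the JDK in place of coroutines so that it runs without extra dependencies. The `countUsers` and `fetchUsers` functions are hypothetical and simulate query latency with `Thread.sleep`. Under these assumptions the total wait is roughly the slower of the two queries rather than their sum.

```kotlin
import java.util.concurrent.CompletableFuture

// Hypothetical stand-ins for the count query and the page query.
// Thread.sleep simulates database latency.
fun countUsers(): Long {
    Thread.sleep(200)
    return 1_000L
}

fun fetchUsers(page: Int, size: Int): List<String> {
    Thread.sleep(300)
    val start = page * size
    return (start until start + size).map { "user-$it" }
}

// Starts both queries at once and waits for both results.
fun findUsers(page: Int, size: Int): Pair<Long, List<String>> {
    val totalCount = CompletableFuture.supplyAsync { countUsers() }
    val users = CompletableFuture.supplyAsync { fetchUsers(page, size) }
    return Pair(totalCount.join(), users.join())
}

fun main() {
    val startedAt = System.nanoTime()
    val (total, users) = findUsers(page = 2, size = 10)
    val elapsedMs = (System.nanoTime() - startedAt) / 1_000_000

    println("total=$total, first=${users.first()}, size=${users.size}")
    // Expect roughly 300 ms (the slower query), not 500 ms (the sum).
    println("elapsed=${elapsedMs}ms")
}
```

The gain is capped by the slower query, which is usually the offset scan on deep pages, so this shortens the wait without removing the cost of the offset itself.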

4. Aggregate table

You can also improve performance by keeping a precomputed total row count in an aggregate table.

  
@Service
class PostReadService(
    private val postRepository: PostRepository,
    private val postCountRepository: PostCountRepository
) : PostReadUseCase {

    override fun findPost(
        pageable: Pageable
    ): PostsResponse {
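        // Read the precomputed total from the aggregate table instead of counting rows.
        val totalCount = postCountRepository.getTotalCount()
        // Fetch only the requested page of posts.
        val findPosts = postRepository.findPosts(pageable)
        return PostsResponse(totalCount, findPosts)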
    }
    ......
}

However, to keep the data consistent, this approach has to take a lock on every write. Alternatively, you can update the count in the background with a batch job.

  
@Service
class PostWriteService(
    private val lockService: LockService,
    private val postRepository: PostRepository,
    private val postCountRepository: PostCountRepository
) : PostWriteUseCase {

    @Transactional
    override fun save(
        key: String,
        post: Post
    ): Long {
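        // Note: with @Transactional on this method, the transaction commits only after
        // the method returns, so the lock below is released before the commit.
        lockService.getLock(key)
        val newPostId = try {
            postCountRepository.increaseCount()
            postRepository.save(post)
        } finally {
            lockService.releaseLock(key)
        }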
        
        ......
    }
    ......
}
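The locking requirement can be illustrated with a small in-memory sketch. `InMemoryPostStore` is a hypothetical class and not part of the service above. A `ReentrantLock` stands in for `LockService`, and a plain list and counter stand in for the post table and the aggregate table. The lock makes the insert and the count update one atomic step, so the stored count always matches the number of rows.

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.thread
import kotlin.concurrent.withLock

// Hypothetical in-memory stand-in for the post table plus its aggregate count.
class InMemoryPostStore {
    private val lock = ReentrantLock()
    private val posts = mutableListOf<String>()
    private var totalCount = 0L

    // Inserts a post and bumps the count under the same lock,
    // so readers never see one change without the other.
    fun save(post: String): Long = lock.withLock {
        posts.add(post)
        totalCount += 1
        totalCount
    }

    // Reads the precomputed count instead of counting rows.
    fun getTotalCount(): Long = lock.withLock { totalCount }

    fun actualRowCount(): Int = lock.withLock { posts.size }
}

fun main() {
    val store = InMemoryPostStore()

    // Eight concurrent writers, 1,000 saves each.
    val writers = (1..8).map { writer ->
        thread { repeat(1_000) { store.save("post-$writer-$it") } }
    }
    writers.forEach { it.join() }

    println(store.getTotalCount())   // 8000
    println(store.actualRowCount())  // 8000
}
```

The cost is visible in the design: every writer waits on the same lock, which is why the batch alternative can be attractive when a slightly stale count is acceptable.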

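Applying the same idea, pagination like the following remains feasible even when there is a lot of data. Per-user data such as a personal transaction history can be paginated this way too.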

  
CREATE TABLE account
(
    id             BIGINT AUTO_INCREMENT PRIMARY KEY,
    user_id        BIGINT       NOT NULL,
    account_number VARCHAR(20)  NOT NULL,
    created_at     TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE account_statistic
(
    account_id         BIGINT PRIMARY KEY, -- must match the type of account.id for the foreign key
    total_transactions INT DEFAULT 0,
    FOREIGN KEY (account_id) REFERENCES account (id)
);
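One way to read the schema above: `account_statistic.total_transactions` supplies the per-account total, so the page count can be computed without counting that account's transactions on every request. The sketch below is an in-memory illustration. `AccountStatistic`, `PageInfo`, and `pageInfo` are hypothetical names, and how the counter is kept up to date is left out.

```kotlin
// Hypothetical mirror of a row in account_statistic.
data class AccountStatistic(val accountId: Long, val totalTransactions: Int)

data class PageInfo(val totalElements: Int, val totalPages: Int, val offset: Long)

// Derives paging metadata from the precomputed per-account total.
// `page` is zero-based.
fun pageInfo(stat: AccountStatistic, page: Int, size: Int): PageInfo {
    require(page >= 0 && size > 0) { "page must be >= 0 and size must be > 0" }
    // Ceiling division: 95 transactions at 10 per page need 10 pages.
    val totalPages = (stat.totalTransactions + size - 1) / size
    return PageInfo(stat.totalTransactions, totalPages, page.toLong() * size)
}

fun main() {
    val stat = AccountStatistic(accountId = 1L, totalTransactions = 95)
    println(pageInfo(stat, page = 0, size = 10))
    // PageInfo(totalElements=95, totalPages=10, offset=0)
    println(pageInfo(stat, page = 9, size = 10))
    // PageInfo(totalElements=95, totalPages=10, offset=90)
}
```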

5. Summary

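There are many pagination techniques, and each has clear strengths and weaknesses. Since every one of them involves a trade-off, look at your total data volume and your current situation, and apply the technique that fits.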
