DataLoader를 사용하여 GraphQL 쿼리 최적화하기
query ($id: Int!) {
activity(id: $id) {
id
title
content
type
images
files
subject
endDate
createdAt
groupActivities {
group {
id
name
}
}
}
}
`;
쿼리 로그
쿼리 로그를 출력하면 아래와 같은 쿼리가 출력된다.
SELECT "Activity"."id" AS "Activity_id", "Activity"."created_at" AS "Activity_created_at",
"Activity"."updated_at" AS "Activity_updated_at", "Activity"."title" AS "Activity_title",
"Activity"."content" AS "Activity_content", "Activity"."subject" AS "Activity_subject",
"Activity"."end_date" AS "Activity_end_date", "Activity"."images" AS "Activity_images",
"Activity"."files" AS "Activity_files", "Activity"."type" AS "Activity_type"
FROM "activity" "Activity" WHERE "Activity"."id" = 7 LIMIT 1;
SELECT "GroupActivity"."id" AS "GroupActivity_id", "GroupActivity"."created_at" AS "GroupActivity_created_at",
"GroupActivity"."updated_at" AS "GroupActivity_updated_at", "GroupActivity"."group_id" AS "GroupActivity_group_id",
"GroupActivity"."activity_id" AS "GroupActivity_activity_id"
FROM "group_activity" "GroupActivity" WHERE "GroupActivity"."activity_id" = 7;
SELECT "Group"."id" AS "Group_id", "Group"."created_at" AS "Group_created_at", "Group"."updated_at" AS "Group_updated_at",
"Group"."name" AS "Group_name", "Group"."type" AS "Group_type", "Group"."year" AS "Group_year",
"Group"."grade" AS "Group_grade", "Group"."school_id" AS "Group_school_id"
FROM "group" "Group" WHERE "Group"."id" = 7 LIMIT 1;
Activity의 Id가 7인 Row를 JOIN의 형태로 가져오는 것으로 이해하고 있었는데 리턴되는 JSON 값을 구성하기 위해 실제로는 1번의 API 콜에 의해 SELECT 쿼리가 ( JOIN한 횟수 - 1 ) 만큼 날라가는 것을 쿼리 로그를 출력하면서 알게 되었다.
문제점
만약 100개의 Activity를 GET 해온다면 한 번의 API 콜에 SELECT 쿼리는 총 약 301개 정도가 날라가게 된다. 해당 문제점은 GraphQL의 단점으로 지적되는 N+1 문제이다. Relation이 연결되어 있는 테이블에서 join을 할 때, 1개의 Activity별로 각각 SELECT 쿼리를 날리는 것은 한 번에 JOIN이 포함되어있는 SELECT 쿼리를 날리는 것에 비해 I/O 비용이 많이 발생하게 된다.
해결책
N+1 문제를 해결하기 위해 DataLoader를 사용하였다.
GitHub - graphql/dataloader: DataLoader is a generic utility to be used as part of your application's data fetching layer to pro
DataLoader is a generic utility to be used as part of your application's data fetching layer to provide a consistent API over various backends and reduce requests to those backends via batching...
github.com
DataLoader의 동작 방식은 아래와 같다.
- event loop에서 1개의 Tick(1개의 loop) 동안 request 받은 key값을 리스트의 형태로 저장한다.
- 저장한 key를 이용하여 SELECT 쿼리를 한 번에 날린다.
Loader계층이 Service계층과 Resolver계층 사이에 위치할 수 있도록 구조를 변경하고, 필요한 도메인마다 개별로 구성하였다.
//studentactivities.module.ts
@Module({
imports: [TypeOrmModule.forFeature([StudentActivity])],
providers: [StudentActivitiesService, studentActivityLoaders],
exports: [StudentActivitiesService, studentActivityLoaders],
})
export class StudentActivitiesModule {}
// studentactivities.loader.ts
@Injectable({ scope: Scope.REQUEST })
export default class studentActivityLoaders {
constructor(private studentActivityService: StudentActivitiesService) {}
public readonly studentActivityLoader = new DataLoader<string, any[]>(
(ActivityIds: readonly string[]) => {
return this.studentActivityService.getStudentActivityWithActivityIds(
ActivityIds,
);
},
{
cache: false,
},
);
public readonly studentActivityUserLoader = new DataLoader<string, any[]>(
(UserIds: readonly string[]) => {
return this.studentActivityService.getStudentActivityWithUserIds(UserIds);
},
{ cache: false },
);
}
// studentactivities.service.ts
async getStudentActivityWithActivityIds(ActivityIds: readonly string[]) {
const activities: any = await this.repository
.createQueryBuilder('studentActivity')
.leftJoinAndSelect('studentActivity.activity', 'activity')
.where('activity.id IN (: ...ActivityIds)', {
ActivityIds,
})
.getMany();
const activitiesMap: { [key: string]: Activity[] } = {};
activities.forEach((activity) => {
activitiesMap[activity.id] = activity.Activity;
});
const result = ActivityIds.map((id) => activitiesMap[id]);
return result;
}
// studentactivities.resolver.ts
@ResolveField(() => Activity)
activity(@Parent() studentActivity: StudentActivity) {
return this.studentActivityLoaders.studentActivityLoader.load(
studentActivity.activityId.toString(),
);
}
수정한 해당 쿼리를 다시 로그를 출력해서 체크해보면 한 번의 쿼리로 정리됨을 알 수 있다.
SELECT "activity"."id" AS "activity_id", "activity"."created_at" AS "activity_created_at",
"activity"."title" AS "activity_title", "activity"."content" AS "activity_content",
"activity"."subject" AS "activity_subject", "activity"."end_date" AS "activity_end_date",
"activity"."images" AS "activity_images", "activity"."files" AS "activity_files",
"activity"."type" AS "activity_type", "groupActivity"."id" AS "groupActivity_id",
"groups"."id" AS "groups_id", "groups"."name" AS "groups_name" FROM "activity"
"activity" LEFT JOIN "group_activity" "groupActivity"
ON "groupActivity"."activity_id"="activity"."id"
AND ("groupActivity"."activity_id" = "activity"."id")
LEFT JOIN "group" "groups" ON "groups"."id"="groupActivity"."group_id"
AND ("groups"."id" = "groupActivity"."group_id") WHERE "activity"."id" = 7;