DataLoader를 사용하여 GraphQL 쿼리 최적화하기

 query ($id: Int!) {
    activity(id: $id) {
      id
      title
      content
      type
      images
      files
      subject
      endDate
      createdAt
      groupActivities {
        group {
          id
          name
        }
      }
    }
  }
`;

쿼리 로그

쿼리 로그를 출력하면 아래와 같은 쿼리가 출력된다.

SELECT "Activity"."id" AS "Activity_id", "Activity"."created_at" AS "Activity_created_at", 
       "Activity"."updated_at" AS "Activity_updated_at", "Activity"."title" AS "Activity_title", 
       "Activity"."content" AS "Activity_content", "Activity"."subject" AS "Activity_subject", 
       "Activity"."end_date" AS "Activity_end_date", "Activity"."images" AS "Activity_images",
       "Activity"."files" AS "Activity_files", "Activity"."type" AS "Activity_type" 
FROM "activity" "Activity" WHERE "Activity"."id" = 7 LIMIT 1;

SELECT "GroupActivity"."id" AS "GroupActivity_id", "GroupActivity"."created_at" AS "GroupActivity_created_at", 
       "GroupActivity"."updated_at" AS "GroupActivity_updated_at", "GroupActivity"."group_id" AS "GroupActivity_group_id",
       "GroupActivity"."activity_id" AS "GroupActivity_activity_id" 
FROM "group_activity" "GroupActivity" WHERE "GroupActivity"."activity_id" = 7;

SELECT "Group"."id" AS "Group_id", "Group"."created_at" AS "Group_created_at", "Group"."updated_at" AS "Group_updated_at", 
       "Group"."name" AS "Group_name", "Group"."type" AS "Group_type", "Group"."year" AS "Group_year", 
       "Group"."grade" AS "Group_grade", "Group"."school_id" AS "Group_school_id" 
FROM "group" "Group" WHERE "Group"."id" = 7 LIMIT 1;

 

Activity의 Id가 7인 Row를 JOIN의 형태로 가져오는 것으로 이해하고 있었는데 리턴되는 JSON 값을 구성하기 위해 실제로는 1번의 API 콜에 의해 SELECT 쿼리가 ( JOIN한 횟수 - 1 ) 만큼 날라가는 것을 쿼리 로그를 출력하면서 알게 되었다.

 

문제점

만약 100개의 Activity를 GET 해온다면 한 번의 API 콜에 SELECT 쿼리는 총 약 301개 정도가 날라가게 된다. 해당 문제점은  GraphQL의 단점으로 지적되는 N+1 문제이다. Relation이 연결되어 있는 테이블에서 join을 할 때, 1개의 Activity별로 각각 SELECT 쿼리를 날리는 것은 한 번에 JOIN이 포함되어있는 SELECT 쿼리를 날리는 것에 비해 I/O 비용이 많이 발생하게 된다.

 

해결책

N+1 문제를 해결하기 위해 DataLoader를 사용하였다. 

 

GitHub - graphql/dataloader: DataLoader is a generic utility to be used as part of your application's data fetching layer to pro

DataLoader is a generic utility to be used as part of your application's data fetching layer to provide a consistent API over various backends and reduce requests to those backends via batching...

github.com

DataLoader의 동작 방식은 아래와 같다.

  1. event loop에서 1개의 Tick(1개의 loop) 동안 request 받은 key값을 리스트의 형태로 저장한다.
  2. 저장한 key를 이용하여  SELECT 쿼리를 한 번에 날린다.

Loader계층이 Service계층과 Resolver계층 사이에 위치할 수 있도록 구조를 변경하고, 필요한 도메인마다 개별로 구성하였다.

//studentactivities.module.ts

@Module({
  imports: [TypeOrmModule.forFeature([StudentActivity])],
  providers: [StudentActivitiesService, studentActivityLoaders],
  exports: [StudentActivitiesService, studentActivityLoaders],
})
export class StudentActivitiesModule {}
// studentactivities.loader.ts

@Injectable({ scope: Scope.REQUEST })
export default class studentActivityLoaders {
  constructor(private studentActivityService: StudentActivitiesService) {}

  public readonly studentActivityLoader = new DataLoader<string, any[]>(
    (ActivityIds: readonly string[]) => {
      return this.studentActivityService.getStudentActivityWithActivityIds(
        ActivityIds,
      );
    },
    {
      cache: false,
    },
  );

  public readonly studentActivityUserLoader = new DataLoader<string, any[]>(
    (UserIds: readonly string[]) => {
      return this.studentActivityService.getStudentActivityWithUserIds(UserIds);
    },
    { cache: false },
  );
}
// studentactivities.service.ts

async getStudentActivityWithActivityIds(ActivityIds: readonly string[]) {
    const activities: any = await this.repository
      .createQueryBuilder('studentActivity')
      .leftJoinAndSelect('studentActivity.activity', 'activity')
      .where('activity.id IN (: ...ActivityIds)', {
        ActivityIds,
      })
      .getMany();
    const activitiesMap: { [key: string]: Activity[] } = {};
    activities.forEach((activity) => {
      activitiesMap[activity.id] = activity.Activity;
    });
    const result = ActivityIds.map((id) => activitiesMap[id]);
    return result;
  }
// studentactivities.resolver.ts

@ResolveField(() => Activity)
  activity(@Parent() studentActivity: StudentActivity) {
    return this.studentActivityLoaders.studentActivityLoader.load(
      studentActivity.activityId.toString(),
    );
  }

수정한 해당 쿼리를 다시 로그를 출력해서 체크해보면 한 번의 쿼리로 정리됨을 알 수 있다.

SELECT "activity"."id" AS "activity_id", "activity"."created_at" AS "activity_created_at",
       "activity"."title" AS "activity_title", "activity"."content" AS "activity_content",
       "activity"."subject" AS "activity_subject", "activity"."end_date" AS "activity_end_date",
       "activity"."images" AS "activity_images", "activity"."files" AS "activity_files",
       "activity"."type" AS "activity_type", "groupActivity"."id" AS "groupActivity_id",
       "groups"."id" AS "groups_id", "groups"."name" AS "groups_name" FROM "activity"
           "activity" LEFT JOIN "group_activity" "groupActivity" 
           ON "groupActivity"."activity_id"="activity"."id" 
           AND ("groupActivity"."activity_id" = "activity"."id")
           LEFT JOIN "group" "groups" ON "groups"."id"="groupActivity"."group_id" 
           AND ("groups"."id" = "groupActivity"."group_id") WHERE "activity"."id" = 7;