Next.js Batch LLM Evaluator

Overview

This demo is a full-stack example that uses the following:

A Next.js app with Prisma for the database.
Trigger.dev Realtime to stream updates to the frontend.
The Vercel AI SDK to work with multiple LLM providers (OpenAI, Anthropic, xAI).
The new batch.triggerByTaskAndWait method to distribute work across multiple tasks.

GitHub repo

View the Batch LLM Evaluator repo

Click here to view the full code for this project in our examples repository on GitHub. You can fork it and use it as a starting point for your own project.

Video

Relevant code

View the Trigger.dev task code in the src/trigger/batch.ts file.
The evaluateModels task uses the batch.triggerByTaskAndWait method to distribute the task to the different LLM models.
It then passes the results through to a summarizeEvals task that calculates some dummy “tags” for each LLM response.
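
The fan-out and summarize flow described above can be sketched in plain TypeScript. This is a minimal stand-in that does not call Trigger.dev: Promise.all takes the place of batch.triggerByTaskAndWait, and every name here (fanOut, summarize, fakeTask, failingTask, and the RunResult shape with its ok flag) is a hypothetical chosen for illustration, not code from the repo. The "long"/"short" tag is likewise an invented placeholder for the dummy tags.

```typescript
// Plain-TypeScript stand-in for the fan-out/fan-in shape; it does not use the Trigger.dev SDK.
type RunResult<T> = { ok: true; output: T } | { ok: false; error: string };
type EvalOutput = { model: string; text: string };

// Hypothetical model task: a real one would call an LLM provider.
type ModelTask = (prompt: string) => Promise<EvalOutput>;

const fakeTask = (model: string): ModelTask => async (prompt) => ({
  model,
  text: `${model} answer to: ${prompt}`,
});

const failingTask: ModelTask = async () => {
  throw new Error("provider unavailable");
};

// Fan out: start every task at once, wait for all of them,
// and keep failures as values so one bad model does not sink the batch.
function fanOut(tasks: ModelTask[], prompt: string): Promise<RunResult<EvalOutput>[]> {
  return Promise.all(
    tasks.map((task) =>
      task(prompt).then(
        (output): RunResult<EvalOutput> => ({ ok: true, output }),
        (err: unknown): RunResult<EvalOutput> => ({
          ok: false,
          error: err instanceof Error ? err.message : String(err),
        })
      )
    )
  );
}

// Fan in: compute a dummy tag per successful response (placeholder logic).
function summarize(results: RunResult<EvalOutput>[]): Record<string, string[]> {
  const tags: Record<string, string[]> = {};
  for (const result of results) {
    if (!result.ok) continue; // skip failed runs
    tags[result.output.model] = [result.output.text.length > 40 ? "long" : "short"];
  }
  return tags;
}
```

Calling fanOut([fakeTask("openai"), failingTask], "hi") and passing the result to summarize yields tags only for the model that succeeded. Returning failures as values rather than throwing keeps the summarize step simple, since it can filter on the ok flag.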
We use a useRealtimeRunsWithTag hook to subscribe to the different evaluation task runs in the src/components/llm-evaluator.tsx file.
We then pass the relevant run down into three different components for the different models:
- The AnthropicEval component: src/components/evals/Anthropic.tsx
- The XAIEval component: src/components/evals/XAI.tsx
- The OpenAIEval component: src/components/evals/OpenAI.tsx
Each of these components then uses useRealtimeRunWithStreams to subscribe to the different LLM responses.

This example uses the older useRealtimeRunWithStreams hook. For new projects, consider using the new useRealtimeStream hook (SDK 4.1.0+) for a simpler API and better type safety with defined streams.
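
The step of passing the relevant run down to each model component can be sketched as a small lookup. This is plain TypeScript rather than the repo's React code: the EvalRun shape, the task identifiers in TASK_TO_PROVIDER, and the groupRunsByProvider helper are all assumptions made for illustration.

```typescript
type Provider = "anthropic" | "xai" | "openai";

// Minimal run shape assumed for this sketch; real run objects carry more fields.
interface EvalRun {
  id: string;
  taskIdentifier: string;
  status: string;
}

// Hypothetical task identifiers; the real ones are defined in the repo's task code.
const TASK_TO_PROVIDER: Record<string, Provider> = {
  "eval-anthropic": "anthropic",
  "eval-xai": "xai",
  "eval-openai": "openai",
};

// Pick one run per provider from the list a tag subscription returns.
// Runs for other tasks are ignored, and a later run in the array
// replaces an earlier one for the same provider.
function groupRunsByProvider(runs: EvalRun[]): Partial<Record<Provider, EvalRun>> {
  const byProvider: Partial<Record<Provider, EvalRun>> = {};
  for (const run of runs) {
    const provider = TASK_TO_PROVIDER[run.taskIdentifier];
    if (provider) {
      byProvider[provider] = run;
    }
  }
  return byProvider;
}
```

Each model component would then receive its own entry, for example the value under the anthropic key for the AnthropicEval component, and can render a placeholder while that entry is still undefined.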

Learn more about Trigger.dev Realtime

To learn more, take a look at the following resources:

Trigger.dev Realtime - learn more about how to subscribe to runs and get real-time updates
Realtime streaming - learn more about streaming data from your tasks
Batch Triggering - learn more about how to trigger tasks in batches
React hooks - learn more about using React hooks to interact with the Trigger.dev API
