AI (Transcription and Summary)

RealtimeKit provides AI-powered features to enhance your meetings, including real-time transcription and automatic meeting summaries. These features help you capture important discussions and generate concise overviews of your meetings.

Meeting transcription

RealtimeKit's meeting transcription allows you to transcribe your RealtimeKit meetings in real-time, making it easy to capture important discussions and refer back to them later.

Control transcriptions using presets

You can control whether a participant's audio is transcribed using the transcription_enabled flag in the participant's preset. Audio from every participant whose preset has transcription_enabled turned on is transcribed in real-time during a RealtimeKit meeting.

Follow the Preset Creation Guide to create a new preset.

Configure transcription behavior

You can control transcription behavior through the ai_config object. When creating a meeting using the REST API, pass an ai_config.transcription object to set the transcription language, keywords, and profanity filter.

Supported languages

You can specify the language for transcription to ensure accurate and relevant results. The following languages are supported:

  • English (United States) - en-US
  • English (India) - en-IN
  • German - de
  • Hindi - hi
  • Swedish - sv
  • Russian - ru
  • Polish - pl
  • Greek - el
  • French - fr
  • Dutch - nl
  • Turkish - tr
  • Spanish - es
  • Italian - it
  • Portuguese - pt
  • Romanian - ro
  • Korean - ko
  • Indonesian - id
  • Multi - multi

Example:

"language": "en-US"

Keywords

Keywords can be added to help the transcription engine accurately detect and transcribe specific terms, such as names, technical jargon, or other context-specific words. This is particularly useful in meetings where certain terms are frequently used and need to be recognized correctly.

Example:

"keywords": ["RealtimeKit", "Mary", "Sue"]

Profanity filter

You can enable or disable the profanity filter based on your needs. Set profanity_filter to true to filter offensive language out of the transcriptions, or to false to leave the transcriptions unfiltered.

Example:

"profanity_filter": false

Example configuration

Here is an example of how to pass the AI configuration in the meeting creation API:

{
  "title": "Meeting Transcriptions",
  "ai_config": {
    "transcription": {
      "keywords": ["RealtimeKit"],
      "language": "en-US",
      "profanity_filter": false
    }
  }
}
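
The request body above can also be assembled in code. The following is a minimal sketch, not an official helper: the function name buildTranscriptionMeetingBody and its option names are hypothetical, and the sketch assumes the API rejects or ignores language codes outside the supported list, so it checks them up front. It only builds the JSON body; sending it to the meeting creation endpoint is left out.

```javascript
// Language codes copied from the "Supported languages" list in this guide.
const SUPPORTED_LANGUAGES = new Set([
  "en-US", "en-IN", "de", "hi", "sv", "ru", "pl", "el", "fr",
  "nl", "tr", "es", "it", "pt", "ro", "ko", "id", "multi",
]);

// Hypothetical helper (not part of RealtimeKit): builds a meeting creation body
// with a transcription configuration.
function buildTranscriptionMeetingBody(title, options) {
  const { language, keywords = [], profanityFilter = false } = options;
  if (!SUPPORTED_LANGUAGES.has(language)) {
    throw new RangeError(`Unsupported transcription language: ${language}`);
  }
  return {
    title,
    ai_config: {
      transcription: {
        keywords: [...keywords], // copy so later changes do not leak into the body
        language,
        profanity_filter: profanityFilter,
      },
    },
  };
}

const transcriptionBody = buildTranscriptionMeetingBody("Meeting Transcriptions", {
  language: "en-US",
  keywords: ["RealtimeKit"],
});
console.log(JSON.stringify(transcriptionBody, null, 2));
```

Validating the language code locally surfaces typos such as "en-us" before any request is made.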

Consume transcripts

There are three ways to consume transcripts:

  1. Client Core SDK - Transcripts are generated on the server in real-time and can be consumed on the client side using the RealtimeKit SDK
  2. Webhooks - The meeting transcript can be consumed via a webhook after the meeting ends
  3. REST API - The meeting transcript can be fetched via the REST API

Consume transcripts in real-time

To consume transcripts in real-time with the client SDK, ensure that the transcription_enabled flag is enabled in the preset. Transcripts for all participants with this flag set are broadcast in the meeting.

You can use the meeting.ai object to access the transcripts:

JavaScript
console.log(meeting.ai.transcripts);

The transcripts are also emitted by the meeting.ai object, so a listener can be attached to it:

JavaScript
meeting.ai.on("transcript", (transcriptData) => {
  console.log("Transcript:", transcriptData);
});

As participants speak during the meeting, you will receive partial transcripts, giving you real-time feedback even before they finish their sentences. The isPartialTranscript flag in the transcript data shows whether the transcript is partial or final.

Example transcript data:

{
  "id": "1a2b3c4d-5678-90ab-cdef-1234567890ab",
  "name": "Alice",
  "peerId": "4f5g6h7i-8j9k-0lmn-opqr-1234567890st",
  "userId": "uvwxyz-1234-5678-90ab-cdefghijklmn",
  "customParticipantId": "abc123xyz",
  "transcript": "Hello?",
  "isPartialTranscript": true,
  "date": "Wed Aug 07 2024 10:15:30 GMT+0530 (India Standard Time)"
}

In the example above, isPartialTranscript is true, indicating the transcript is still in progress. Once the participant finishes speaking, the final transcript is sent with isPartialTranscript set to false. Use this flag to distinguish ongoing speech from completed transcriptions, for example to show live captions that are replaced once the final text arrives.
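
One way to handle the partial and final events is to keep the latest partial text per peer and move it to a list of completed utterances when the final transcript arrives. This is a minimal sketch: createTranscriptCollector is a hypothetical helper, and it assumes that a final transcript supersedes the preceding partials from the same peerId, which this guide does not state explicitly.

```javascript
// Hypothetical helper (not part of the SDK): tracks in-progress text per peer
// and keeps finalized utterances in arrival order.
function createTranscriptCollector() {
  const partials = new Map(); // peerId -> latest partial text
  const finals = [];          // completed utterances

  function handle(data) {
    if (data.isPartialTranscript) {
      // Overwrite the previous partial for this peer with the newer text.
      partials.set(data.peerId, data.transcript);
    } else {
      // Assumption: the final transcript replaces any partial from the same peer.
      partials.delete(data.peerId);
      finals.push({ name: data.name, text: data.transcript });
    }
  }

  return { handle, partials, finals };
}

const collector = createTranscriptCollector();
// In an application this would be attached as:
//   meeting.ai.on("transcript", collector.handle);
// Here the handler is fed sample events shaped like the example transcript data.
collector.handle({ peerId: "p1", name: "Alice", transcript: "Hello?", isPartialTranscript: true });
collector.handle({ peerId: "p1", name: "Alice", transcript: "Hello? Can you hear me?", isPartialTranscript: false });
console.log(collector.finals);
```

A captions UI could render collector.partials as greyed-out text and collector.finals as the settled transcript.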

Consume transcript via webhook

You can configure a webhook with the meeting.transcript event enabled to receive the meeting transcript after the meeting has ended. You can do this either on the Developer Portal or using the REST API.

For webhook format details, refer to Webhooks.

Fetch the meeting transcript

You do not need to rely on the webhook to get the transcript for a meeting. RealtimeKit provides a REST API to fetch the transcript for a particular session, so you can retrieve it at a later time. RealtimeKit stores the transcript of a meeting for 7 days from the start of the meeting.
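
Because the transcript is only stored for a limited time, it can help to check the retention window before requesting it. The sketch below assumes the window is exactly 7 × 24 hours measured from the meeting start time; the function name isTranscriptAvailable is hypothetical.

```javascript
// Assumption: "7 days" means exactly 7 * 24 hours from the meeting start.
const RETENTION_MS = 7 * 24 * 60 * 60 * 1000;

// Hypothetical helper: returns true while the stored transcript should still exist.
function isTranscriptAvailable(meetingStartedAt, now = new Date()) {
  const elapsed = now.getTime() - new Date(meetingStartedAt).getTime();
  return elapsed >= 0 && elapsed < RETENTION_MS;
}

const startedAt = "2024-08-07T04:45:30Z";
console.log(isTranscriptAvailable(startedAt, new Date("2024-08-10T00:00:00Z")));
console.log(isTranscriptAvailable(startedAt, new Date("2024-08-15T00:00:00Z")));
```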

The transcript is received in the form of a CSV with the following format:

Timestamp, Participant ID, User ID, Custom Participant ID, Participant Name, Transcript

Field descriptions:

  • Timestamp - An ISO 8601 format string indicating the time of the utterance
  • Participant ID - An identifier for individual peers in the meeting. For instance, if the participant joins the meeting twice, both peers will have the same User ID but different Participant IDs
  • User ID - An identifier for a participant in the meeting, as returned by the Add Participant API call
  • Custom Participant ID - An identifier that you can specify to identify a user. This can be sent in the request body of the Add Participant API call
  • Participant Name - The display name of the user
  • Transcript - The transcribed utterance
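
The CSV rows can be mapped onto the fields described above with a small parser. This sketch makes two assumptions that this guide does not confirm: fields containing commas are wrapped in double quotes with "" as the escape (the common RFC 4180 convention), and each row holds the six columns in the order listed. The names splitCsvLine and parseTranscriptRow, the camelCase keys, and the sample row are illustrative only.

```javascript
// Keys chosen for this sketch, in the column order given in this guide.
const TRANSCRIPT_COLUMNS = [
  "timestamp", "participantId", "userId",
  "customParticipantId", "participantName", "transcript",
];

// Splits one CSV line into fields, honoring double-quoted fields and "" escapes.
function splitCsvLine(line) {
  const fields = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Maps one row onto named fields; missing columns become empty strings.
function parseTranscriptRow(line) {
  const values = splitCsvLine(line);
  return Object.fromEntries(
    TRANSCRIPT_COLUMNS.map((key, i) => [key, (values[i] ?? "").trim()])
  );
}

const sampleRow = parseTranscriptRow(
  '2024-08-07T04:45:30.000Z,peer-1,user-1,abc123xyz,Alice,"Hello, everyone"'
);
console.log(sampleRow.participantName, "said:", sampleRow.transcript);
```

Splitting on commas alone would break any utterance that contains a comma, which is why the quoted-field handling is included.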

Test transcription

Once you have configured a preset and a webhook according to the instructions above, you can proceed to test whether meeting transcription is working for your organization:

  1. Create a meeting
  2. Add a participant to the meeting. Make sure that the preset you use was configured according to this guide
  3. Join the meeting with the authToken you just obtained. As you unmute and speak, your speech should be getting transcribed in real-time for all the participants in the meeting
  4. Once the meeting ends, you will receive a webhook with the event meeting.transcript. The body of this webhook will consist of the entire meeting transcript

Meeting summary

RealtimeKit's meeting summary feature automatically generates a summary of your meeting from the transcription data, capturing the key points and action items of your discussions.

Enable meeting summarization

To generate a summary automatically when a meeting ends, set the summarize_on_end flag to true when creating the meeting using the REST API.

Summarization configuration options

You can tailor the summarization process using the following configuration options:

Word limit

Define the word limit for the summary, ensuring it fits your needs. You can set a limit between 150 and 1000 words.

Example:

"word_limit": 500

Text format

Choose the format for the summary text. Supported formats are:

  • plain_text
  • markdown

Example:

"text_format": "markdown"

Summary type

Select the type of summary based on the nature of the meeting. Supported types are:

  • general
  • team_meeting
  • sales_call
  • client_check_in
  • interview
  • daily_standup
  • one_on_one_meeting
  • lecture
  • code_review

Example:

"summary_type": "team_meeting"

Example configuration

Here is an example of how to enable summarization in the meeting creation API call:

{
  "title": "Team Meeting",
  "ai_config": {
    "summarization": {
      "word_limit": 500,
      "text_format": "plain_text",
      "summary_type": "team_meeting"
    }
  },
  "summarize_on_end": true
}
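
The documented limits can be checked before the request is sent. The following sketch uses a hypothetical helper, buildSummarizationMeetingBody, and assumes the 150 to 1000 word range is inclusive at both ends; the allowed formats and summary types are copied from the lists in this guide.

```javascript
const TEXT_FORMATS = ["plain_text", "markdown"];
const SUMMARY_TYPES = [
  "general", "team_meeting", "sales_call", "client_check_in", "interview",
  "daily_standup", "one_on_one_meeting", "lecture", "code_review",
];

// Hypothetical helper (not part of RealtimeKit): validates the summarization
// options and builds a meeting creation body with summarize_on_end enabled.
function buildSummarizationMeetingBody(title, { wordLimit, textFormat, summaryType }) {
  // Assumption: the documented 150-1000 range includes both endpoints.
  if (!Number.isInteger(wordLimit) || wordLimit < 150 || wordLimit > 1000) {
    throw new RangeError("word_limit must be an integer between 150 and 1000");
  }
  if (!TEXT_FORMATS.includes(textFormat)) {
    throw new RangeError(`Unsupported text_format: ${textFormat}`);
  }
  if (!SUMMARY_TYPES.includes(summaryType)) {
    throw new RangeError(`Unsupported summary_type: ${summaryType}`);
  }
  return {
    title,
    ai_config: {
      summarization: {
        word_limit: wordLimit,
        text_format: textFormat,
        summary_type: summaryType,
      },
    },
    summarize_on_end: true,
  };
}

const summaryBody = buildSummarizationMeetingBody("Team Meeting", {
  wordLimit: 500,
  textFormat: "plain_text",
  summaryType: "team_meeting",
});
console.log(JSON.stringify(summaryBody, null, 2));
```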

Consume summaries

There are two ways to consume the generated summaries:

  1. Webhooks - Receive the meeting summary via a webhook after the meeting ends
  2. API Call - Fetch the meeting summary using the REST API

Fetch summary via webhook

To receive the meeting summary automatically once the meeting concludes, configure a webhook with the meeting.summary event enabled. This can be done either on the Developer Portal or using the REST API.

For webhook format details, refer to Webhooks.
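
If one endpoint receives both AI events, the incoming payloads can be routed by event name. This is only a sketch: the event names meeting.transcript and meeting.summary come from this guide, but the assumption that the payload carries the name in a top-level event field does not, so check the Webhooks reference for the actual format. createAiWebhookRouter is a hypothetical helper.

```javascript
// Hypothetical helper: returns a function that dispatches a parsed webhook
// payload to the handler registered for its event name.
function createAiWebhookRouter(handlers) {
  // A Map avoids accidental matches on inherited object properties.
  const table = new Map(Object.entries(handlers));
  return function route(payload) {
    // Assumption: the event name is in a top-level `event` field.
    const handler = table.get(payload.event);
    if (!handler) return false; // not an event this router handles
    handler(payload);
    return true;
  };
}

const receivedEvents = [];
const routeAiWebhook = createAiWebhookRouter({
  "meeting.transcript": (payload) => receivedEvents.push(["transcript", payload]),
  "meeting.summary": (payload) => receivedEvents.push(["summary", payload]),
});

console.log(routeAiWebhook({ event: "meeting.summary" }));
console.log(routeAiWebhook({ event: "some.other.event" }));
```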

Fetch summary via API call

You can fetch the summary for a meeting at a later time using the REST API. RealtimeKit stores the summary of a meeting for 7 days from the start of the meeting.

Trigger summary manually

If you need to generate a summary after the meeting has ended, you can trigger the summary using the REST API.

Next steps

Explore additional capabilities:

  • Webhooks - Set up webhooks for AI events