You can get the transcript for any episode from our directory of 200+ million episodes.

Examples:

  1. (Recommended) Use the getEpisodeTranscript query to get the transcript, including timecodes and speaker names (if provided).
{
  getEpisodeTranscript(uuid:"e03bf3ef-829e-4f47-9f02-29ac6a747b4f"){
    id
    text
    speaker
    startTimecode
    endTimecode
  }
}
  2. If you want to get episode details along with the transcript, use the transcript or transcriptWithSpeakersAndTimecodes field on getPodcastEpisode.

If you just need the text of the transcript, use transcript. If you also need the timecodes and speaker names (if provided), use transcriptWithSpeakersAndTimecodes.

{
  getPodcastEpisode(uuid:"e03bf3ef-829e-4f47-9f02-29ac6a747b4f"){
    uuid
    name
    taddyTranscribeStatus
    transcript
    transcriptWithSpeakersAndTimecodes{
      id
      text
      speaker
      startTimecode
      endTimecode
    }
  }
}

Example 1 is recommended over example 2 because, if Taddy API doesn't already have the transcript, we generate one on demand, which can take 10+ seconds. With example 2, this means episode details won't be returned until the transcript has been generated.

Depending on your use case, consider splitting the work into two requests for a better user experience:

  1. get general episode details (fast) and

  2. get the transcript for the episode (using getEpisodeTranscript).
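The two-request split above can be sketched in Python roughly as follows. This is a minimal sketch, not an official client. It assumes the GraphQL endpoint is https://api.taddy.org and that queries are authenticated with the same X-USER-ID and X-API-KEY headers used for transcript downloads; check both against your own Taddy API setup. The helper names (details_query, transcript_query, post_graphql, fetch_episode) and the 60-second timeout are hypothetical choices for illustration.

```python
import json
import urllib.request
import uuid as uuidlib
from concurrent.futures import ThreadPoolExecutor

# Assumption: GraphQL endpoint URL; verify it for your account.
API_URL = "https://api.taddy.org"


def details_query(episode_uuid):
    """Fast query: episode details only, no transcript fields."""
    safe = str(uuidlib.UUID(episode_uuid))  # raises ValueError if malformed
    return '{ getPodcastEpisode(uuid:"%s"){ uuid name taddyTranscribeStatus } }' % safe


def transcript_query(episode_uuid):
    """Slow query: may trigger on-demand transcription."""
    safe = str(uuidlib.UUID(episode_uuid))
    return ('{ getEpisodeTranscript(uuid:"%s")'
            '{ id text speaker startTimecode endTimecode } }' % safe)


def post_graphql(query, user_id, api_key, timeout=60):
    """POST a GraphQL query and return the decoded JSON response."""
    body = json.dumps({"query": query}).encode("utf-8")
    req = urllib.request.Request(API_URL, data=body, headers={
        "Content-Type": "application/json",
        "X-USER-ID": str(user_id),
        "X-API-KEY": api_key,
    })
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def fetch_episode(episode_uuid, user_id, api_key, post=post_graphql):
    """Start both requests at once.

    Returns (details, transcript_future): details are returned as soon as
    the fast request finishes; the transcript arrives later via the future.
    """
    pool = ThreadPoolExecutor(max_workers=2)
    details = pool.submit(post, details_query(episode_uuid), user_id, api_key)
    transcript = pool.submit(post, transcript_query(episode_uuid), user_id, api_key)
    pool.shutdown(wait=False)  # already-submitted work keeps running
    return details.result(), transcript
```

A caller can render the episode details immediately and call transcript_future.result() only when the transcript is needed, so a slow on-demand transcription never blocks the rest of the page.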

  3. (Advanced) You can use transcriptUrls or transcriptUrlsWithDetails to get the URLs where you can download a transcript yourself. This includes both the transcripts provided by the podcast (if available) and the transcripts that have been automatically generated by Taddy API.

To download a transcript provided by Taddy API via its URL, you must pass your Taddy API X-USER-ID and X-API-KEY in the request headers. Here is an example using curl:

curl -H "X-USER-ID: 1" -H "X-API-KEY: xyz..." \
  "https://ax2.taddy.org/9d874d17-fe25-4cbb-a802-ce65f7c198a1/106d1dfb-50ed-4844-8e04-31960d8767c7/transcript.vtt"
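The same download can be done from Python using only the standard library. This is a sketch under the assumptions stated in the text above: the only requirement is that X-USER-ID and X-API-KEY are sent as headers. The function names and the 30-second timeout are hypothetical, and the response is decoded as UTF-8 because WebVTT files are defined to use that encoding.

```python
import urllib.request


def build_transcript_request(url, user_id, api_key):
    """Build a GET request for a Taddy-hosted transcript with auth headers."""
    return urllib.request.Request(url, headers={
        "X-USER-ID": str(user_id),
        "X-API-KEY": api_key,
    })


def download_transcript(url, user_id, api_key, timeout=30):
    """Download the transcript file and return its contents as text."""
    req = build_transcript_request(url, user_id, api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Transcript URLs that point to a podcast's own hosting may not need these headers; the text above only requires them for transcripts served by Taddy API.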

How Taddy API generates transcripts

Behind the scenes, Taddy API gets the transcript for an episode in one of three ways:

  1. Podcast-provided transcripts - Some podcasts provide their own text transcripts (though fewer than 1% of podcasts currently do)
  2. Automatic transcription for popular podcasts - We automatically generate transcripts for the 5,000 most popular podcasts using an open-source transcription model running on our GPUs
  3. On-demand transcription - We can transcribe any episode on demand (transcription takes roughly 10 seconds per hour of audio)
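The on-demand rate quoted above (about 10 seconds per hour of audio) can be turned into a rough client-side timeout. The sketch below is only arithmetic on that figure: the function names, the 30-second floor, and the 3x safety factor are assumptions chosen for illustration, not values published by Taddy API.

```python
def estimate_transcription_seconds(audio_seconds, seconds_per_audio_hour=10.0):
    """Rough on-demand transcription time, using ~10 s per hour of audio."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return audio_seconds / 3600.0 * seconds_per_audio_hour


def suggested_timeout(audio_seconds, floor=30.0, safety_factor=3.0):
    """Pick a request timeout with headroom (floor and factor are assumptions)."""
    return max(floor, estimate_transcription_seconds(audio_seconds) * safety_factor)
```

For a 90-minute episode the estimate is 15 seconds, so a timeout in the tens of seconds leaves reasonable headroom. Actual times depend on server load and will vary.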

To summarize, between podcast-provided transcripts and transcripts generated by Taddy API, you can get the transcript for any episode on Taddy API.

Pricing