Media references are not yet supported in Python and TypeScript v2 functions.
Functions are executed in an environment that has strict memory limits. Exceeding these memory limits can happen quickly when dealing with file data; we recommend only interacting with media under 20MB.
If an object has a media reference property, you can use functions to interact with the associated media item. A media item exposes a number of built-in methods for working with the underlying media, letting you handle different kinds of media without external libraries. The following documentation explains what functionality is available and how to use it.
If you need any operations that don't currently exist out-of-the-box, you will likely need to use external libraries or write your own custom code. Learn more about adding dependencies to functions repositories.
Some operations are supported by all media types.
You can access a media item's contents by selecting the media reference property on the object and calling readAsync(). The signature for the method is as follows:
```typescript
// Blob is a standard JavaScript type, representing a file-like object of immutable, raw data.
// https://developer.mozilla.org/en-US/docs/Web/API/Blob
readAsync(): Promise<Blob>;
```
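For example, a function could read a media item's contents and report its size in bytes. The sketch below is illustrative only: it assumes an ArxivPaper object type with a mediaReference property, matching the OCR example further down.

```typescript
import { Function } from "@foundry/functions-api";
import { ArxivPaper } from "@foundry/ontology-api";

@Function()
public async mediaSizeInBytes(paper: ArxivPaper): Promise<number | undefined> {
    // readAsync() is available on every media item, regardless of media type.
    const blob = await paper.mediaReference?.readAsync();
    // Blob.size is the length of the underlying data in bytes.
    return blob?.size;
}
```

Keep the memory limits described above in mind when reading large media items into memory.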
You can access a media item's metadata by calling getMetadataAsync() on the media reference property. The signature for the method is as follows:
```typescript
getMetadataAsync(): Promise<IMediaMetadata>;
```
Type guards in TypeScript v1 allow you to access functionality that is specific to certain media types. The following type guards can be used on media item metadata:
isAudioMetadata()
isDicomMetadata()
isDocumentMetadata()
isImageryMetadata()
isSpreadsheetMetadata()
isUntypedMetadata()
isVideoMetadata()
As an example, you could use the imagery type guard to pull out image-specific metadata fields:
```typescript
const metadata = await myObject.mediaReference?.getMetadataAsync();
if (isImageryMetadata(metadata)) {
    const imageWidth = metadata.dimensions?.width;
    // ...
}
```
You can also use the type guards on the MediaItem namespace, which give you access to additional methods on the type-specific media item. The type guards you can use here are listed below, followed by a short example:
MediaItem.isAudio()
MediaItem.isDicom()
MediaItem.isDocument()
MediaItem.isImagery()
MediaItem.isSpreadsheet()
MediaItem.isVideo()
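For example, a single function could branch on the media type and call a type-specific method in each branch. The sketch below is only an illustration: it assumes a hypothetical MyMediaObject ontology type with a mediaReference property, and it assumes the transcription options described later can all be omitted.

```typescript
import { Function, MediaItem } from "@foundry/functions-api";
import { MyMediaObject } from "@foundry/ontology-api"; // Hypothetical object type.

@Function()
public async extractAnyText(obj: MyMediaObject): Promise<string | undefined> {
    const media = obj.mediaReference;
    if (MediaItem.isDocument(media)) {
        // Document-specific method: run OCR on the first page.
        return (await media.ocrAsync({ endPage: 1, languages: [], scripts: [], outputType: "text" }))[0];
    }
    if (MediaItem.isAudio(media)) {
        // Audio-specific method: transcribe the recording with default options.
        return media.transcribeAsync({});
    }
    // For any other media type, only the generic operations above are available.
    return undefined;
}
```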
To extract text from a document, you can either use optical character recognition (OCR) or extract the text embedded in the media item.
For machine-generated PDFs, it may be faster and/or more accurate to extract the digitally embedded text rather than using OCR. The signature for the method is as follows:
```typescript
extractTextAsync(options: IDocumentExtractTextOptions): Promise<string[]>;
```
When using TypeScript v1, the following can optionally be provided as an object:
startPage: The zero-indexed start page (inclusive).
endPage: The zero-indexed end page (exclusive).
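As an illustrative sketch (assuming the same ArxivPaper object type used in the OCR example below), a function might extract the embedded text of the first page like this:

```typescript
import { Function, MediaItem } from "@foundry/functions-api";
import { ArxivPaper } from "@foundry/ontology-api";

@Function()
public async firstPageEmbeddedText(paper: ArxivPaper): Promise<string | undefined> {
    if (MediaItem.isDocument(paper.mediaReference)) {
        // Pages are zero-indexed; this requests only the first page.
        const pages = await paper.mediaReference.extractTextAsync({ startPage: 0, endPage: 1 });
        return pages[0];
    }
    return undefined;
}
```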
For non-machine-generated PDFs, such as scanned documents, it is best to use the OCR method for extracting text. The signature is as follows:

```typescript
ocrAsync(options: IDocumentOcrOptions): Promise<string[]>;
```
The following can optionally be provided as a TypeScript object:
startPage: The zero-indexed start page (inclusive).
endPage: The zero-indexed end page (exclusive).
languages: A list of languages to recognize (can be empty).
scripts: A list of scripts to recognize (can be empty).
outputType: Specifies the output type as text or hocr.

Remember that you need to use type guards in order to access media-type-specific operations. Here's an example of using the isDocument() type guard to perform OCR text extraction:
```typescript
import { Function, MediaItem } from "@foundry/functions-api";
import { ArxivPaper } from "@foundry/ontology-api";

@Function()
public async firstPageText(paper: ArxivPaper): Promise<string | undefined> {
    if (MediaItem.isDocument(paper.mediaReference)) {
        const text = (await paper.mediaReference.ocrAsync({ endPage: 1, languages: [], scripts: [], outputType: 'text' }))[0];
        return text;
    }

    return undefined;
}
```
Audio media items support transcription using the transcribeAsync method. The signature is as follows:
```typescript
transcribeAsync(options: IAudioTranscriptionOptions): Promise<string>;
```
The following can optionally be passed in to specify how the transcription should run. The available options are:
language: The language to transcribe, passed using the TranscriptionLanguage enum.
performanceMode: Runs transcription in More Economical or More Performant mode, passed using the TranscriptionPerformanceMode enum.
outputFormat: Specifies the output format by passing an object with type plainTextNoSegmentData (plain text) or pttml (a TTML-like format). When the type is plainTextNoSegmentData, the object also takes a boolean addTimestamps parameter.

Here's an example of providing options for transcription:
```typescript
import { Function, MediaItem, TranscriptionLanguage, TranscriptionPerformanceMode } from "@foundry/functions-api";
import { AudioFile } from "@foundry/ontology-api";

@Function()
public async transcribeAudioFile(file: AudioFile): Promise<string | undefined> {
    if (MediaItem.isAudio(file.mediaReference)) {
        return await file.mediaReference.transcribeAsync({
            language: TranscriptionLanguage.ENGLISH,
            performanceMode: TranscriptionPerformanceMode.MORE_ECONOMICAL,
            outputFormat: { type: "plainTextNoSegmentData", addTimestamps: true }
        });
    }

    return undefined;
}
```