A decorator that modifies the configuration of a Spark transform.
The configure decorator must be used to wrap a Transform:
>>> @configure(profile=['high-memory'])
... @transform(...)
... def my_compute_function(...):
...     pass
profile (str or List[str], optional) – The transforms profile(s) to use.
allowed_run_duration (timedelta, optional) – The maximum time a job may take to complete; once exceeded, the infrastructure fails the job and may retry it. Use with care: when choosing a duration, account for changes in data scale or shape. The duration has minute precision only. IMPORTANT: Do not use for incremental transforms, as the duration can change significantly when the transform runs as a snapshot.
run_as_user (boolean, optional) – Determines whether a transform runs with user permissions. When enabled, a job can behave differently depending on the permissions of the user running the job.
Deprecated since version 3.85.0.
backend (str or ComputeBackend, optional) – The compute backend to use for this transform. Defaults to Spark. Velox can be selected to natively accelerate the transform.
checkpoint_outputs – The outputs that can be used for storing checkpoints. If not set, defaults to all outputs. This is particularly useful for incremental transforms where writing checkpoints to an incremental dataset may cause the build to fail.
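Because allowed_run_duration is described above as minute precision only, it may help to check a timedelta before passing it to configure. The following is a minimal sketch using only the Python standard library. The helper name require_whole_minutes is hypothetical and not part of the transforms API, and how the platform itself treats sub-minute values is not stated in this reference.

```python
from datetime import timedelta


def require_whole_minutes(duration: timedelta) -> timedelta:
    """Return `duration` unchanged if it is a positive whole number of minutes.

    Hypothetical helper (not part of the transforms API): it only guards
    against values that minute precision cannot represent exactly.
    """
    if duration <= timedelta(0):
        raise ValueError("duration must be positive")
    # timedelta % timedelta yields the remainder as a timedelta.
    if duration % timedelta(minutes=1) != timedelta(0):
        raise ValueError("duration must be a whole number of minutes")
    return duration


# Two and a half hours is exactly 150 minutes, so it passes the check.
limit = require_whole_minutes(timedelta(hours=2, minutes=30))
```

The checked value could then be supplied as @configure(allowed_run_duration=limit).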
Transform object.
The configure decorator can only be used on a Transform object. This means it only applies to Spark transforms.
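The requirement that configure wraps a Transform object is why the decorator order matters: Python applies stacked decorators from the bottom up, so @transform must sit below @configure. The sketch below is a toy model only. The Transform class, transform, and configure defined here are simplified stand-ins written for illustration, not the real transforms.api implementations, and the TypeError raised on misuse is an assumption of this sketch.

```python
class Transform:
    """Toy stand-in for a Transform object; not the real class."""

    def __init__(self, compute_func):
        self.compute_func = compute_func
        self.profile = None


def transform(**datasets):
    """Toy stand-in for @transform(...): wraps a function in a Transform."""
    def decorator(compute_func):
        return Transform(compute_func)
    return decorator


def configure(profile=None):
    """Toy stand-in for @configure(...): accepts only Transform objects."""
    def decorator(obj):
        if not isinstance(obj, Transform):
            # Assumption: the error type is chosen for this sketch.
            raise TypeError("configure must wrap a Transform object")
        # Accept a single profile name or a list of names.
        obj.profile = [profile] if isinstance(profile, str) else profile
        return obj
    return decorator


# Correct order: @transform is applied first, so configure gets a Transform.
@configure(profile=["high-memory"])
@transform()
def my_compute_function():
    pass


# Wrong order: configure receives a plain function and rejects it.
try:
    @transform()
    @configure(profile=["high-memory"])
    def misordered():
        pass
except TypeError as exc:
    error_message = str(exc)
```

In this toy model, my_compute_function ends up as a Transform carrying the requested profile, while the misordered definition fails at decoration time rather than at run time.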
For more information about defining custom transforms profiles, refer to the section on defining profiles in the Spark transforms documentation.