Dagster includes facilities for typing the input and output values of ops (“runtime” types).
dagster.Any
Use this type for any input, output, or config field whose type is unconstrained.
All values are considered to be instances of Any.
Examples:
from dagster import Any, Field, In, Out, op

@op
def identity(_, x: Any) -> Any:
    return x

# Untyped inputs and outputs are implicitly typed Any
@op
def identity_imp(_, x):
    return x

# Explicitly typed
@op(
    ins={'x': In(dagster_type=Any)},
    out=Out(dagster_type=Any),
)
def identity(_, x):
    return x

@op(config_schema=Field(Any))
def any_config(context):
    return context.op_config
dagster.Bool
Use this type for any boolean input, output, or config field. At runtime, this will perform an isinstance(value, bool) check. You may also use the ordinary bool type as an alias.
Examples:
from dagster import Bool, Field, In, Out, String, op

@op
def boolean(_, x: Bool) -> String:
    return 'true' if x else 'false'

@op
def empty_string(_, x: String) -> bool:
    return len(x) == 0

# Explicit
@op(
    ins={'x': In(dagster_type=Bool)},
    out=Out(dagster_type=String),
)
def boolean(_, x):
    return 'true' if x else 'false'

@op(
    ins={'x': In(dagster_type=String)},
    out=Out(dagster_type=bool),
)
def empty_string(_, x):
    return len(x) == 0

@op(config_schema=Field(Bool))
def bool_config(context):
    return 'true' if context.op_config else 'false'
dagster.Int
Use this type for any integer input or output. At runtime, this will perform an isinstance(value, int) check. You may also use the ordinary int type as an alias.
Examples:
from dagster import In, Int, Out, op

@op
def add_3(_, x: Int) -> int:
    return x + 3

# Explicit
@op(
    # ins must be a dict mapping input names to In objects
    ins={'x': In(dagster_type=Int)},
    out=Out(dagster_type=Int),
)
def add_3(_, x):
    return x + 3
dagster.Float
Use this type for any float input, output, or config value. At runtime, this will perform an isinstance(value, float) check. You may also use the ordinary float type as an alias.
Examples:
from dagster import Field, Float, In, Out, op

@op
def div_2(_, x: Float) -> float:
    return x / 2

# Explicit
@op(
    # ins must be a dict mapping input names to In objects
    ins={'x': In(dagster_type=Float)},
    out=Out(dagster_type=float),
)
def div_2(_, x):
    return x / 2

@op(config_schema=Field(Float))
def div_y(context, x: Float) -> float:
    return x / context.op_config
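The isinstance checks described for Bool, Int, and Float follow Python's ordinary subclass rules, which has a few consequences worth knowing. The sketch below uses plain Python only. passes_check is a hypothetical helper standing in for the runtime check, not a Dagster function, and it assumes the check is exactly the isinstance call quoted in each description.

```python
def passes_check(value, python_type):
    """Hypothetical stand-in for the runtime check: a plain isinstance test."""
    return isinstance(value, python_type)


# bool is a subclass of int in Python, so True satisfies an int check
bool_as_int = passes_check(True, int)

# An int is not a float, so 1 does not satisfy a float check
int_as_float = passes_check(1, float)

# An int is not a bool either, so 1 does not satisfy a bool check
int_as_bool = passes_check(1, bool)

print(bool_as_int, int_as_float, int_as_bool)
```

Under that assumption, a value such as the integer 1 would be rejected by a Float input, so it may be worth converting with float(x) upstream when integers can appear.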
dagster.String
Use this type for any string input, output, or config value. At runtime, this will perform an isinstance(value, str) check. You may also use the ordinary str type as an alias.
Examples:
from dagster import Field, In, Out, String, op

@op
def concat(_, x: String, y: str) -> str:
    return x + y

# Explicit
@op(
    ins={
        'x': In(dagster_type=String),
        'y': In(dagster_type=str),
    },
    out=Out(dagster_type=str),
)
def concat(_, x, y):
    return x + y

@op(config_schema=Field(String))
def hello(context) -> str:
    return 'Hello, {friend}!'.format(friend=context.op_config)
dagster.Nothing
Use this type only for inputs and outputs, in order to establish an execution dependency without communicating a value. Inputs of this type will not be passed to the op compute function, so they must be declared explicitly (with In, as in the examples) rather than with the Python 3 type hint syntax.
All values are considered to be instances of Nothing.
Examples:
import time

from dagster import In, Int, Nothing, job, op

@op
def wait(_) -> Nothing:
    time.sleep(1)
    return

@op(
    ins={"ready": In(dagster_type=Nothing)},
)
def done(_) -> str:
    return 'done'

@job
def nothing_job():
    done(wait())

# Any value will pass the type check for Nothing
@op
def wait_int(_) -> Int:
    time.sleep(1)
    return 1

@job
def nothing_int_job():
    done(wait_int())
dagster.Optional
Use this type only for inputs and outputs, if the value can also be None.
Examples:
from dagster import In, Optional, Out, String, op

@op
def nullable_concat(_, x: str, y: Optional[str]) -> str:
    return x + (y or '')

# Explicit
@op(
    ins={
        'x': In(String),
        'y': In(Optional[String]),
    },
    out=Out(String),
)
def nullable_concat(_, x, y):
    return x + (y or '')
dagster.List
Use this type for inputs or outputs that are lists.
Lists are also the appropriate input types when fanning in multiple outputs using a MultiDependencyDefinition or the equivalent composition function syntax.
Examples:
from dagster import In, List, Out, String, job, op

@op
def concat_list(_, xs: List[str]) -> str:
    return ''.join(xs)

# Explicit
@op(
    ins={'xs': In(dagster_type=List[str])},
    out=Out(dagster_type=String),
)
def concat_list(_, xs) -> str:
    return ''.join(xs)

# Fanning in multiple outputs
@op
def emit_1(_) -> int:
    return 1

@op
def emit_2(_) -> int:
    return 2

@op
def emit_3(_) -> int:
    return 3

@op
def sum_op(_, xs: List[int]) -> int:
    return sum(xs)

@job
def sum_job():
    sum_op([emit_1(), emit_2(), emit_3()])
dagster.Dict
Use this type for inputs or outputs that are dicts.
For inputs and outputs, you may optionally specify the key and value types using the square brackets syntax for Python typing.
Examples:
from dagster import Dict, In, Out, String, op

@op
def repeat(_, spec: Dict) -> str:
    return spec['word'] * spec['times']

# Explicit
@op(
    ins={'spec': In(Dict)},
    out=Out(String),
)
def repeat(_, spec):
    return spec['word'] * spec['times']
dagster.Set
Use this type for inputs or outputs that are sets. Alias for typing.Set.
You may optionally specify the inner type using the square brackets syntax for Python typing.
Examples:
from dagster import In, List, Out, Set, String, op

@op
def set_op(_, set_input: Set[String]) -> List[str]:
    return sorted([x for x in set_input])

# Explicit
@op(
    ins={"set_input": In(dagster_type=Set[String])},
    out=Out(List[String]),
)
def set_op(_, set_input):
    return sorted([x for x in set_input])
dagster.Tuple
Use this type for inputs or outputs that are tuples. Alias for typing.Tuple.
You may optionally specify the inner types using the square brackets syntax for Python typing.
Config values should be passed as a list (in YAML or the Python config dict).
Examples:
from dagster import Float, In, Int, List, Out, String, Tuple, op

@op
def tuple_op(_, tuple_input: Tuple[str, int, float]) -> List:
    return [x for x in tuple_input]

# Explicit
@op(
    ins={'tuple_input': In(dagster_type=Tuple[String, Int, Float])},
    out=Out(List),
)
def tuple_op(_, tuple_input):
    return [x for x in tuple_input]
dagster.FileHandle
A reference to a file as manipulated by a FileManager.
Subclasses may handle files that are resident on the local file system, in an object store, or in any arbitrary place where a file can be stored.
This exists to handle the very common case where you wish to write a computation that reads, transforms, and writes files, but where you also want the same code to work in local development as well as on a cluster where the files will be stored in a globally available object store such as S3.
path_desc
A representation of the file path for display purposes only.
dagster.DagsterType(type_check_fn, key=None, name=None, is_builtin=False, description=None, loader=None, materializer=None, required_resource_keys=None, kind=<DagsterTypeKind.REGULAR: 'REGULAR'>, typing_type=None, metadata_entries=None, metadata=None)
Define a type in dagster. These can be used in the inputs and outputs of ops.
type_check_fn (Callable[[TypeCheckContext, Any], [Union[bool, TypeCheck]]]) – The function that defines the type check. It takes the value flowing through the input or output of the op. If it passes, return either True or a TypeCheck with success set to True. If it fails, return either False or a TypeCheck with success set to False. The first argument must be named context (or, if unused, _, _context, or context_). Use required_resource_keys for access to resources.
key (Optional[str]) – The unique key to identify types programmatically. The key property always has a value. If you omit the key argument to the init function, it instead receives the value of name. If neither key nor name is provided, a CheckError is thrown. In the case of a generic type such as List or Optional, this is generated programmatically based on the type parameters. For most use cases, name should be set and the key argument should not be specified.
name (Optional[str]) – A unique name given by a user. If key is None, key becomes this value. Name is not given in a case where the user does not specify a unique name for this type, such as a generic class.
description (Optional[str]) – A markdown-formatted string, displayed in tooling.
loader (Optional[DagsterTypeLoader]) – An instance of a class that inherits from DagsterTypeLoader and can map config data to a value of this type. Specify this argument if you will need to shim values of this type using the config machinery. As a rule, you should use the @dagster_type_loader decorator to construct these arguments.
materializer (Optional[DagsterTypeMaterializer]) – An instance of a class that inherits from DagsterTypeMaterializer and can persist values of this type. As a rule, you should use the @dagster_type_materializer decorator to construct these arguments.
required_resource_keys (Optional[Set[str]]) – Resource keys required by the type_check_fn.
is_builtin (bool) – Defaults to False. This is used by tools to display or filter built-in types (such as String, Int) to visually distinguish them from user-defined types. Meant for internal use.
kind (DagsterTypeKind) – Defaults to DagsterTypeKind.REGULAR. This is used to determine the kind of runtime type for InputDefinition and OutputDefinition type checking.
typing_type – Defaults to None. A valid python typing type (e.g. Optional[List[int]]) for the value contained within the DagsterType. Meant for internal use.
dagster.PythonObjectDagsterType(python_type, key=None, name=None, **kwargs)
Define a type in dagster whose typecheck is an isinstance check.
Specifically, the type can either be a single python type (e.g. int), or a tuple of types (e.g. (int, float)) which is treated as a union.
Examples
from dagster import PythonObjectDagsterType

# assert_success and assert_failure are test helpers that are not defined in this snippet
ntype = PythonObjectDagsterType(python_type=int)
assert ntype.name == 'int'
assert_success(ntype, 1)
assert_failure(ntype, 'a')

ntype = PythonObjectDagsterType(python_type=(int, float))
assert ntype.name == 'Union[int, float]'
assert_success(ntype, 1)
assert_success(ntype, 1.5)
assert_failure(ntype, 'a')
python_type (Union[Type, Tuple[Type, ...]]) – The dagster typecheck function calls isinstance on this type.
name (Optional[str]) – Name the type. Defaults to the name of python_type.
key (Optional[str]) – Key of the type. Defaults to name.
description (Optional[str]) – A markdown-formatted string, displayed in tooling.
loader (Optional[DagsterTypeLoader]) – An instance of a class that inherits from DagsterTypeLoader and can map config data to a value of this type. Specify this argument if you will need to shim values of this type using the config machinery. As a rule, you should use the @dagster_type_loader decorator to construct these arguments.
materializer (Optional[DagsterTypeMaterializer]) – An instance of a class that inherits from DagsterTypeMaterializer and can persist values of this type. As a rule, you should use the
@dagster_type_mate
decorator to construct these arguments.
dagster.dagster_type_loader(config_schema, required_resource_keys=None, loader_version=None, external_version_fn=None)
Create a Dagster type loader that maps config data to a runtime value.
The decorated function should take the execution context and parsed config value and return the appropriate runtime value.
config_schema (ConfigSchema) – The schema for the config that’s passed to the decorated function.
loader_version (str) – (Experimental) The version of the decorated compute function. Two loading functions should have the same version if and only if they deterministically produce the same outputs when provided the same inputs.
external_version_fn (Callable) – (Experimental) A function that takes in the same parameters as the loader function (config_value) and returns a representation of the version of the external asset (str). Two external assets with identical versions are treated as identical to one another.
Examples:
from dagster import Permissive, dagster_type_loader

@dagster_type_loader(Permissive())
def load_dict(_context, value):
    return value
dagster.DagsterTypeLoader
Dagster type loaders are used to load unconnected inputs of the dagster type they are attached to.
The recommended way to define a type loader is with the @dagster_type_loader decorator.
dagster.dagster_type_materializer(config_schema, required_resource_keys=None)
Create an output materialization hydration config that configurably materializes a runtime value.
The decorated function should take the execution context, the parsed config value, and the runtime value. It should materialize the runtime value, and should return an appropriate AssetMaterialization.
config_schema (object) – The type of the config data expected by the decorated function.
Examples:
import csv

from dagster import AssetMaterialization, dagster_type_materializer

# Takes a list of dicts such as might be read in using csv.DictReader, as well as a config
# value (the output path), and writes the rows to that path as CSV
@dagster_type_materializer(str)
def materialize_df(_context, path, value):
    with open(path, 'w') as fd:
        writer = csv.DictWriter(fd, fieldnames=value[0].keys())
        writer.writeheader()
        writer.writerows(rowdicts=value)
    return AssetMaterialization.file(path)
dagster.DagsterTypeMaterializer
Dagster type materializers are used to materialize outputs of the dagster type they are attached to.
The recommended way to define a type materializer is with the @dagster_type_materializer decorator.
dagster.usable_as_dagster_type(name=None, description=None, loader=None, materializer=None)
Decorate a Python class to make it usable as a Dagster Type.
This is intended to make it straightforward to annotate existing business logic classes to make them dagster types whose typecheck is an isinstance check against that python class.
python_type (cls) – The python type to make usable as a Dagster type.
name (Optional[str]) – Name of the new Dagster type. If None, the name (__name__) of the python_type will be used.
description (Optional[str]) – A user-readable description of the type.
loader (Optional[DagsterTypeLoader]) – An instance of a class that inherits from DagsterTypeLoader and can map config data to a value of this type. Specify this argument if you will need to shim values of this type using the config machinery. As a rule, you should use the @dagster_type_loader decorator to construct these arguments.
materializer (Optional[DagsterTypeMaterializer]) – An instance of a class that inherits from DagsterTypeMaterializer and can persist values of this type. As a rule, you should use the @dagster_type_materializer decorator to construct these arguments.
Examples:
# dagster_aws.s3.file_manager.S3FileHandle
@usable_as_dagster_type
class S3FileHandle(FileHandle):
    def __init__(self, s3_bucket, s3_key):
        self._s3_bucket = check.str_param(s3_bucket, 's3_bucket')
        self._s3_key = check.str_param(s3_key, 's3_key')

    @property
    def s3_bucket(self):
        return self._s3_bucket

    @property
    def s3_key(self):
        return self._s3_key

    @property
    def path_desc(self):
        return self.s3_path

    @property
    def s3_path(self):
        return 's3://{bucket}/{key}'.format(bucket=self.s3_bucket, key=self.s3_key)
dagster.make_python_type_usable_as_dagster_type(python_type, dagster_type)
Take any existing python type and map it to a dagster type (generally created with DagsterType). This can only be called once on a given python type.
dagster.check_dagster_type(dagster_type, value)
Test a custom Dagster type.
dagster_type (Any) – The Dagster type to test. Should be one of the built-in types, a dagster type explicitly constructed with as_dagster_type(), @usable_as_dagster_type, or PythonObjectDagsterType(), or a Python type.
value (Any) – The runtime value to test.
Returns: The result of the type check.
Examples
assert check_dagster_type(Dict[Any, Any], {'foo': 'bar'}).success