Safe Haskell | None |
---|---|
Language | Haskell2010 |
- data ComputeNode loc a = ComputeNode {
    - _cnNodeId :: NodeId
    - _cnOp :: !NodeOp
    - _cnType :: !DataType
    - _cnParents :: !(Vector UntypedNode)
    - _cnLogicalDeps :: !(Vector UntypedNode)
    - _cnLocality :: !Locality
    - _cnName :: !(Maybe NodeName)
    - _cnLogicalParents :: !(Maybe (Vector UntypedNode))
    - _cnPath :: NodePath
  }
- data TypedLocality loc = TypedLocality {}
- data LocLocal
- data LocDistributed
- data LocUnknown
- type UntypedNode = ComputeNode LocUnknown Cell
- type UntypedDataset = Dataset Cell
- type UntypedLocalData = LocalData Cell
- type Dataset a = ComputeNode LocDistributed a
- type LocalData a = ComputeNode LocLocal a
- type DataFrame = Try (Dataset Cell)
- type LocalFrame = Try (LocalData Cell)
- data NodeEdge
- data StructureEdge
- class CheckedLocalityCast loc where
- class CheckedLocalityCast loc => IsLocality loc where
Documentation
data ComputeNode loc a Source #
(internal) The main data structure that represents a data node in the computation graph.
This data structure forms the backbone of computation graphs expressed with spark operations.
loc is a typed locality tag. a is the type of the data, as seen by the Haskell compiler. If erased, it will be a Cell type.
ComputeNode | |
Eq (ComputeNode loc a) Source # | |
CanRename (ComputeNode loc a) String Source # | |
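The role of the two type parameters can be illustrated with a much smaller stand-in. The sketch below is not the library's definition: `Node`, its two fields, the `Local`/`Distributed` constructors of `Locality`, the string-based `Cell`, and the functions `countRows` and `untyped` are all assumptions made for illustration. It only shows how phantom tags let the compiler separate distributed data from local data while the runtime representation stays the same.

```haskell
-- Locality tags: empty types that exist only at the type level.
data LocLocal
data LocDistributed
data LocUnknown

-- Runtime locality value (constructor names are assumed here).
data Locality = Local | Distributed deriving (Eq, Show)

-- Placeholder for a dynamically typed value (the real Cell is richer).
newtype Cell = Cell String deriving (Eq, Show)

-- Simplified stand-in for ComputeNode: 'loc' and 'a' are phantom
-- parameters, they never appear in a field.
data Node loc a = Node
  { nodeName     :: String
  , nodeLocality :: Locality
  } deriving (Eq, Show)

type Dataset a   = Node LocDistributed a
type LocalData a = Node LocLocal a
type UntypedNode = Node LocUnknown Cell

-- Hypothetical aggregation: accepts only distributed data and
-- returns a local value. Passing a LocalData is a compile-time error.
countRows :: Dataset a -> LocalData Int
countRows ds = Node (nodeName ds ++ "_count") Local

-- Erasing both tags keeps the runtime information intact.
untyped :: Node loc a -> UntypedNode
untyped (Node n l) = Node n l
```

With these definitions, `countRows (Node "people" Distributed :: Dataset Int)` type-checks, whereas applying `countRows` to a `LocalData Int` is rejected by the compiler.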
data TypedLocality loc Source #
Eq (TypedLocality loc) Source # | |
Show (TypedLocality loc) Source # | |
data LocDistributed Source #
type UntypedNode = ComputeNode LocUnknown Cell Source #
type UntypedDataset = Dataset Cell Source #
type UntypedLocalData = LocalData Cell Source #
type Dataset a = ComputeNode LocDistributed a Source #
A typed collection of distributed data.
Most operations on datasets are type-checked by the Haskell compiler: the type tag associated with this dataset is guaranteed to be convertible to a proper Haskell type. In particular, building a Dataset of dynamic cells is guaranteed to never happen.
If you want to do untyped operations and gain some flexibility, consider using UDataFrames instead.
Computations with Datasets and observables are generally checked for correctness using the type system of Haskell.
type LocalData a = ComputeNode LocLocal a Source #
A unit of data that can be accessed by the user.
This is a typed unit of data. The type is guaranteed to be a proper type accessible by the Haskell compiler (instead of simply a Cell type, which represents types only accessible at runtime).
TODO(kps) rename to Observable
type DataFrame = Try (Dataset Cell) Source #
The dataframe type. Any dataset can be converted to a dataframe.
For Spark users: this is different from the definition of the dataframe in Spark, which is a dataset of rows. Because support for single columns is more awkward in the case of rows, it is more natural to generalize datasets to contain cells. When communicating with Spark, though, single cells are wrapped into rows with a single field, as Spark does.
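A rough model of the two ideas above (type erasure into cells, and wrapping single cells into one-field rows) might look as follows. Everything here is a simplification: `Try` is modelled as `Either String`, a dataset is a plain list rather than a node of the computation graph, and `ToCell`, `asDataFrame`, `wrapForSpark` and the `Cell` constructors are hypothetical names, not the library's API.

```haskell
-- Assumption: Try is modelled as Either with a textual error.
type Try a = Either String a

-- A tiny dynamic value type; RowCell stands for a Spark row.
data Cell
  = IntCell Int
  | BoolCell Bool
  | RowCell [Cell]
  deriving (Eq, Show)

-- A dataset modelled as a list, purely for illustration.
newtype Dataset a = Dataset [a] deriving (Eq, Show)

type DataFrame = Try (Dataset Cell)

-- Hypothetical class of Haskell types that can be erased to cells.
class ToCell a where
  toCell :: a -> Cell

instance ToCell Int where
  toCell = IntCell

instance ToCell Bool where
  toCell = BoolCell

-- Any typed dataset can be converted to a dataframe.
asDataFrame :: ToCell a => Dataset a -> DataFrame
asDataFrame (Dataset xs) = Right (Dataset (map toCell xs))

-- Wrap a single cell into a row with a single field; rows are kept as is.
wrapForSpark :: Cell -> Cell
wrapForSpark r@(RowCell _) = r
wrapForSpark c             = RowCell [c]
```

The conversion in this sketch never fails; the `Try` wrapper matters for operations on the resulting dataframe, which can no longer be checked by the compiler.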
type LocalFrame = Try (LocalData Cell) Source #
An observable whose type can only be inferred at runtime and whose computation can fail at runtime.
Any observable can be converted to an untyped observable.
Untyped observables are more flexible and can be combined in arbitrary ways, but any errors are only detected during the validation of the Spark computation graph.
TODO(kps) rename to DynObservable
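The trade-off between flexibility and late failure can be sketched with a toy model. The code below assumes `Try` behaves like `Either String` and reduces an untyped observable to a single dynamic `Cell`; `addFrames` and the `Cell` constructors are hypothetical and only illustrate how a type mismatch becomes a runtime error value instead of a compile-time error.

```haskell
-- Assumption: Try is modelled as Either with a textual error.
type Try a = Either String a

-- A tiny dynamic value type standing in for Cell.
data Cell
  = IntCell Int
  | DoubleCell Double
  | StrCell String
  deriving (Eq, Show)

-- Simplified: an untyped observable is just a cell that may have failed.
type LocalFrame = Try Cell

-- Hypothetical addition of two untyped observables. Any pair of inputs
-- is accepted by the compiler; mismatches are reported as Left values.
addFrames :: LocalFrame -> LocalFrame -> LocalFrame
addFrames fx fy = do
  x <- fx   -- an earlier failure propagates unchanged
  y <- fy
  case (x, y) of
    (IntCell a, IntCell b)       -> Right (IntCell (a + b))
    (DoubleCell a, DoubleCell b) -> Right (DoubleCell (a + b))
    _ -> Left ("cannot add " ++ show x ++ " and " ++ show y)
```

In this model, adding an integer cell to a string cell compiles without complaint and yields a `Left`, which mirrors the way untyped observables defer their checks.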
data NodeEdge Source #
The different kinds of edges in the compute DAG of nodes, at the start of computations.
- scope edges specify the scope of a node for naming. They are not included in the id.
data StructureEdge Source #
The edges in a compute DAG after name resolution, which is where most of the checks and computations are done.
- parent edges are the direct parents of a node, the only ones required for defining computations. They are included in the id.
- logical edges define logical dependencies between nodes to force a specific ordering of the nodes. They are included in the id.
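The relationship between the two edge types can be sketched as below. The constructors are not shown on this page, so `ParentEdge`, `LogicalEdge`, `ScopeEdge` and `DataStructureEdge` are assumed names, and `includedInId` and `resolveEdges` are hypothetical helpers that simply encode the bullet points above.

```haskell
-- Edges that remain after name resolution (constructor names assumed).
data StructureEdge
  = ParentEdge   -- direct parent, required to define the computation
  | LogicalEdge  -- ordering dependency only
  deriving (Eq, Show)

-- Edges at the start of computations (constructor names assumed).
data NodeEdge
  = ScopeEdge                        -- used for naming only
  | DataStructureEdge StructureEdge  -- parent or logical edge
  deriving (Eq, Show)

-- Scope edges do not contribute to the node id; the others do.
includedInId :: NodeEdge -> Bool
includedInId ScopeEdge             = False
includedInId (DataStructureEdge _) = True

-- Hypothetical name-resolution step: scope edges are dropped and
-- only the structure edges are kept.
resolveEdges :: [NodeEdge] -> [StructureEdge]
resolveEdges es = [se | DataStructureEdge se <- es]
```

Under these assumptions, renaming the scope of a node changes its path but not its id, because scope edges are filtered out before the id is computed.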
class CheckedLocalityCast loc where Source #
_validLocalityValues :: [TypedLocality loc] Source #
class CheckedLocalityCast loc => IsLocality loc where Source #
_getTypedLocality :: TypedLocality loc Source #
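One plausible way to read these two classes is that `CheckedLocalityCast` lists the runtime localities a tag may stand for, while `IsLocality` is reserved for tags that denote exactly one locality. The sketch below follows that reading. The field of `TypedLocality`, the constructors of `Locality`, the instances and the `castLocality` helper are all assumptions; this page shows only the class heads and method signatures.

```haskell
-- Locality tags (empty types).
data LocLocal
data LocDistributed
data LocUnknown

-- Runtime locality (constructor names assumed).
data Locality = Local | Distributed deriving (Eq, Show)

-- The page elides the fields of TypedLocality; one field is assumed here.
newtype TypedLocality loc = TypedLocality { unTypedLocality :: Locality }
  deriving (Eq, Show)

class CheckedLocalityCast loc where
  _validLocalityValues :: [TypedLocality loc]

class CheckedLocalityCast loc => IsLocality loc where
  _getTypedLocality :: TypedLocality loc

-- Assumed instances: an unknown tag may stand for either locality.
instance CheckedLocalityCast LocLocal where
  _validLocalityValues = [TypedLocality Local]

instance CheckedLocalityCast LocDistributed where
  _validLocalityValues = [TypedLocality Distributed]

instance CheckedLocalityCast LocUnknown where
  _validLocalityValues = [TypedLocality Local, TypedLocality Distributed]

-- Only tags with a single possible locality get an IsLocality instance.
instance IsLocality LocLocal where
  _getTypedLocality = TypedLocality Local

instance IsLocality LocDistributed where
  _getTypedLocality = TypedLocality Distributed

-- Hypothetical checked cast from a runtime locality to a tagged one.
castLocality :: CheckedLocalityCast loc
             => Locality -> Either String (TypedLocality loc)
castLocality l
  | tl `elem` _validLocalityValues = Right tl
  | otherwise = Left ("locality " ++ show l ++ " is not valid for this tag")
  where
    tl = TypedLocality l
```

For example, `castLocality Distributed :: Either String (TypedLocality LocLocal)` evaluates to a `Left`, while the same cast at `LocUnknown` succeeds.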