graph LR
DataFrame["DataFrame"]
DataFrameAPI["DataFrameAPI"]
PandasDataFrame["PandasDataFrame"]
ArrayDataFrame["ArrayDataFrame"]
ArrowDataFrame["ArrowDataFrame"]
LocalDataFrameIterableDataFrame["LocalDataFrameIterableDataFrame"]
IterableDataFrame["IterableDataFrame"]
PandasDataFrame -- "inherits from" --> DataFrame
ArrayDataFrame -- "inherits from" --> DataFrame
ArrowDataFrame -- "inherits from" --> DataFrame
LocalDataFrameIterableDataFrame -- "inherits from" --> DataFrame
IterableDataFrame -- "inherits from" --> DataFrame
DataFrameAPI -- "operates on" --> DataFrame
PandasDataFrame -- "converts to" --> ArrowDataFrame
ArrowDataFrame -- "converts to" --> PandasDataFrame
The Data Abstraction Layer subsystem in Fugue provides a standardized DataFrame API, enabling consistent interaction with diverse in-memory and distributed data structures.
DataFrame is the foundational abstract interface for all Fugue DataFrame implementations. It defines the core contract for data manipulation: schema management plus fundamental operations such as peek_array, _drop_cols, count, as_array, as_arrow, as_pandas, and as_iterable. Every other data structure in Fugue is accessed through this interface.
Related Classes/Methods:
fugue/dataframe/dataframe.py:DataFrame
fugue/dataframe/dataframe.py:DataFrame.peek_array
fugue/dataframe/dataframe.py:DataFrame._drop_cols
fugue/dataframe/dataframe.py:DataFrame.count
fugue/dataframe/dataframe.py:DataFrame.as_array
fugue/dataframe/dataframe.py:DataFrame.as_arrow
fugue/dataframe/dataframe.py:DataFrame.as_pandas
fugue/dataframe/dataframe.py:DataFrame.as_iterable
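The idea of one abstract contract with interchangeable backends can be sketched in plain Python. This is only an illustration of the pattern, not Fugue's code: `SketchDataFrame` and `ListBackedFrame` are hypothetical names, the schema is reduced to a list of column names, and only a few of the operations named above are mirrored.

```python
from abc import ABC, abstractmethod
from typing import Any, Iterable, List


class SketchDataFrame(ABC):
    """Hypothetical stand-in for an abstract DataFrame contract (not Fugue's class)."""

    def __init__(self, columns: List[str]):
        # Simplified schema: column names only, no types.
        self.columns = list(columns)

    @abstractmethod
    def as_array(self) -> List[List[Any]]:
        """Materialize all rows as a list of lists."""

    def count(self) -> int:
        # Default implementation built on the abstract method.
        return len(self.as_array())

    def peek_array(self) -> List[Any]:
        # Return the first row; fail loudly on an empty frame.
        rows = self.as_array()
        if not rows:
            raise ValueError("dataframe is empty")
        return rows[0]

    def as_iterable(self) -> Iterable[List[Any]]:
        return iter(self.as_array())


class ListBackedFrame(SketchDataFrame):
    """Hypothetical concrete backend that stores rows in a plain list."""

    def __init__(self, rows: List[List[Any]], columns: List[str]):
        super().__init__(columns)
        self._rows = [list(r) for r in rows]

    def as_array(self) -> List[List[Any]]:
        # Return copies so callers cannot mutate internal state.
        return [list(r) for r in self._rows]


df = ListBackedFrame([[1, "a"], [2, "b"]], ["id", "name"])
print(df.count(), df.peek_array())
```

A backend only has to supply `as_array` here; the shared methods come from the base class, which is one common way to keep behavior consistent across implementations.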
DataFrameAPI provides a consistent, high-level API for common DataFrame operations. Its utility functions (e.g., alter_columns, drop_columns, select_columns, rename) act as a facade over the DataFrame interface, so callers get the same behavior regardless of the underlying DataFrame type.
Related Classes/Methods:
fugue/dataframe/api.py:DataFrameAPI
fugue/dataframe/api.py:DataFrameAPI.alter_columns
fugue/dataframe/api.py:DataFrameAPI.drop_columns
fugue/dataframe/api.py:DataFrameAPI.select_columns
fugue/dataframe/api.py:DataFrameAPI.rename
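A facade of this kind can be sketched as free functions that depend only on a small shared interface. The example below is a simplified illustration under stated assumptions, not Fugue's implementation: `FrameLike`, `RowFrame`, and `ColumnFrame` are hypothetical, and the function names merely echo the ones listed above without matching their real signatures.

```python
from typing import Any, Dict, List, Protocol


class FrameLike(Protocol):
    """Hypothetical minimal interface the facade functions rely on."""

    columns: List[str]

    def as_array(self) -> List[List[Any]]: ...


class RowFrame:
    """Hypothetical row-major backend."""

    def __init__(self, rows: List[List[Any]], columns: List[str]):
        self.columns = list(columns)
        self._rows = [list(r) for r in rows]

    def as_array(self) -> List[List[Any]]:
        return [list(r) for r in self._rows]


class ColumnFrame:
    """Hypothetical column-major backend."""

    def __init__(self, data: Dict[str, List[Any]]):
        self.columns = list(data)
        self._data = data

    def as_array(self) -> List[List[Any]]:
        # Transpose the columns back into rows.
        return [list(row) for row in zip(*(self._data[c] for c in self.columns))]


def select_columns(df: FrameLike, names: List[str]) -> RowFrame:
    missing = [n for n in names if n not in df.columns]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    idx = [df.columns.index(n) for n in names]
    return RowFrame([[row[i] for i in idx] for row in df.as_array()], names)


def drop_columns(df: FrameLike, names: List[str]) -> RowFrame:
    missing = [n for n in names if n not in df.columns]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    return select_columns(df, [c for c in df.columns if c not in names])


def rename(df: FrameLike, mapping: Dict[str, str]) -> RowFrame:
    # Columns absent from the mapping keep their original names.
    return RowFrame(df.as_array(), [mapping.get(c, c) for c in df.columns])


a = RowFrame([[1, "x", True]], ["id", "name", "flag"])
b = ColumnFrame({"id": [1], "name": ["x"], "flag": [True]})
# The same facade call gives the same result for both backends.
print(drop_columns(a, ["flag"]).as_array() == drop_columns(b, ["flag"]).as_array())
```

Because the functions touch only `columns` and `as_array()`, adding a new backend does not require changing them.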
PandasDataFrame is the concrete implementation of the DataFrame interface for Pandas DataFrames. It translates Fugue's abstract operations into the equivalent Pandas operations.
Related Classes/Methods:
ArrayDataFrame implements the DataFrame interface for in-memory array-like data (lists of lists), offering a lightweight DataFrame representation for smaller datasets.
Related Classes/Methods:
ArrowDataFrame implements the DataFrame interface on top of Apache Arrow tables, enabling efficient columnar data processing and interoperability with other Arrow-compatible systems.
Related Classes/Methods:
LocalDataFrameIterableDataFrame represents a DataFrame that is itself an iterable of other DataFrames. It is useful for handling partitioned data or results from operations that produce multiple DataFrames.
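One way to picture a frame made of other frames is a wrapper that chains chunks lazily. The sketch below is an assumption-laden illustration rather than Fugue's implementation: `ChunkedFrame` is a hypothetical class, and each "frame" is reduced to a plain list of rows.

```python
from itertools import chain
from typing import Any, Iterable, Iterator, List

Chunk = List[List[Any]]  # stand-in for one local dataframe


class ChunkedFrame:
    """Hypothetical frame built from an iterable of row chunks."""

    def __init__(self, chunks: Iterable[Chunk], columns: List[str]):
        self.columns = list(columns)
        self._chunks = iter(chunks)  # consumed lazily, one pass only

    def chunks(self) -> Iterator[Chunk]:
        # Skip empty chunks so consumers only see real data.
        return (c for c in self._chunks if c)

    def as_iterable(self) -> Iterator[List[Any]]:
        # Flatten the chunks into a single stream of rows.
        return chain.from_iterable(self.chunks())


def produce_chunks() -> Iterator[Chunk]:
    # Pretend each yield is the result for one partition.
    yield [[1, "a"], [2, "b"]]
    yield []
    yield [[3, "c"]]


cf = ChunkedFrame(produce_chunks(), ["id", "name"])
rows = list(cf.as_iterable())
second_pass = list(cf.as_iterable())  # the underlying iterator is already exhausted
print(rows, second_pass)
```

Nothing is materialized until rows are requested, which is the main appeal of this shape for partitioned results; the trade-off in this sketch is that the data can be traversed only once.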
Related Classes/Methods:
IterableDataFrame represents a DataFrame as a generic iterable of rows, providing a flexible base for iterable-based data sources when no more specific DataFrame type applies.
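An iterable-of-rows frame has to offer operations such as peeking without losing data from a one-pass source. A minimal sketch of that technique follows; `RowStreamFrame` is a hypothetical class written for illustration and does not reproduce Fugue's internals.

```python
from itertools import chain
from typing import Any, Iterable, Iterator, List


class RowStreamFrame:
    """Hypothetical frame backed by a one-pass iterable of rows."""

    def __init__(self, rows: Iterable[List[Any]], columns: List[str]):
        self.columns = list(columns)
        self._rows: Iterator[List[Any]] = iter(rows)

    def peek_array(self) -> List[Any]:
        try:
            first = next(self._rows)
        except StopIteration:
            raise ValueError("dataframe is empty") from None
        # Put the consumed row back in front of the remaining stream.
        self._rows = chain([first], self._rows)
        return list(first)

    def as_array(self) -> List[List[Any]]:
        # Materializing drains the stream; a second call returns [].
        return [list(r) for r in self._rows]


stream = RowStreamFrame(([i, i * i] for i in range(3)), ["n", "sq"])
first = stream.peek_array()
rows = stream.as_array()
again = stream.as_array()
print(first, rows, again)
```

Re-chaining the peeked row keeps `peek_array` non-destructive, while `as_array` makes the one-pass nature explicit: once drained, the stream yields nothing further.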
Related Classes/Methods: