Package com.macrofocus.molap.dataframe
Interface DataFrame<Row,Column,V>
-
- Type Parameters:
Row - the type of row keys used to query this data frame
Column - the type of column keys used to query this data frame
V - the type of values
- All Known Subinterfaces:
AggregateDataFrame<C>, DrillDataFrame<C>, Matrix<R,C>, MutableDataFrame<R,C,V>, MutableMatrix<R,C>, RowMajorDataFrame<R,C,V>, TimeSeriesDataFrames.Dynamic<R,C,V>, TimeSeriesDataFrames.Static<R,C,V>
- All Known Implementing Classes:
AbstractDataFrame, AbstractMatrix, AbstractMutableDataFrame, AbstractRowMajorDataFrame, AccessorRowMajorDataFrame, AggregatedNodesDataFrame,
AppendAndReindexDataFrame, AppendDataFrame, ArcGISJsonDataFrame, AsyncDataFrame, CachedDataFrame, CacheDistanceMatrix,
CacheMatrix, CalibratedDistanceMatrix, ClosestDataFrame, ColumnCorrelationMatrix, ColumnMajorDataFrame, ColumnModelDataFrame,
ColumnOrderDataFrame, CombinedDataFrame, CovarianceDataFrame, DefaultDataFrame, DelegatedDataFrame, DerivedDataFrame,
EmptyDataFrame, FilterDataFrame, HashMapRowMajorDataFrame, HeadDataFrame, IndexedDataFrame, JsonDataFrame,
LinksDataFrame, MinMaxDataFrame, MultipageDataFrame, NodesDataFrame, NormalizedMatrix, NumberDataFrame,
OpMatrix, QueryDataFrame, ReIndexedDataFrame, ReMappedDataFrame, ResultSetDataFrame, ScaledDataFrame,
SelectDataFrame, SelectionDataFrame, SimpleDataFrame, SimpleMutableMatrix, StatMatrix, SubsetDataFrame,
TailDataFrame, TimeSeriesDataFrames.DefaultDynamic, TimeSeriesDataFrames.DefaultStatic, WrappedDataFrame
public interface DataFrame<Row,Column,V>
An indexed table structure, i.e. a 2D array of values that can be queried by row, by column, or both, using labels. It is similar to the data frame found in R and in the pandas library.
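The idea of a label-indexed 2D table can be sketched with nested maps. This is only an illustration of the concept: `LabeledTableSketch` and its `put` method are hypothetical names, not part of com.macrofocus.molap, and real implementations may store data very differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not a class of the library.
class LabeledTableSketch<R, C, V> {
    // Outer map: row key -> (column key -> value). LinkedHashMap keeps insertion order.
    private final Map<R, Map<C, V>> cells = new LinkedHashMap<>();

    void put(R row, C column, V value) {
        cells.computeIfAbsent(row, k -> new LinkedHashMap<>()).put(column, value);
    }

    // Mirrors the idea of getValueAt(row, column): a cell is addressed by two labels.
    V getValueAt(R row, C column) {
        Map<C, V> cellsOfRow = cells.get(row);
        return cellsOfRow == null ? null : cellsOfRow.get(column);
    }

    // Mirrors the idea of rows(): iterate over the row keys.
    Iterable<R> rows() {
        return cells.keySet();
    }

    public static void main(String[] args) {
        LabeledTableSketch<String, String, Double> table = new LabeledTableSketch<>();
        table.put("apple", "price", 1.5);
        table.put("apple", "weight", 0.2);
        table.put("pear", "price", 2.0);
        for (String row : table.rows()) {
            System.out.println(row + " -> " + table.getValueAt(row, "price"));
        }
    }
}
```

Returning null for a missing cell is a choice made for this sketch; the source does not state how the interface handles unknown keys.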
-
-
Method Summary
void addDataFrameListener(DataFrameListener<Row,Column> listener)
Add a listener to the list that's notified each time a change to the data frame occurs.
void addWeakDataFrameListener(DataFrameListener<Row,Column> listener)
Add a listener to the list that's notified each time a change to the data frame occurs.
AggregateDataFrame<Column> aggregate(boolean includeIndex, Aggregation... aggregation)
AggregateDataFrame<Column> aggregate(Aggregation... aggregation)
Returns a new data frame suitable for data aggregation.
DataFrame<Row,Column,V> append(DataFrame<Row,Column,V> dataFrame)
Returns a new data frame with the rows of the specified data frame appended to the end of this data frame.
DataFrame<java.lang.Integer,Column,V> appendAndReindex(DataFrame<Row,Column,V> dataFrame)
Returns a new data frame with the rows of the specified data frame appended to the end of this data frame.
void benchmark()
Perform some repetitive operation to estimate the performance of the data frame.
java.lang.Iterable<Column> columns()
Returns the column keys.
FilterDataFrame<Row,Column,V> filter(com.macrofocus.filter.MutableFilter<Row> filter)
Returns a new data frame whose rows are filtered by the specified filter model.
CentroidAggregation getCentroid(Column column)
Returns the aggregation method for finding the centroid of geometries.
Series<Row,V> getColumn(Column column)
Returns a series of all the values of a given column.
int getColumnAddress(Column column)
Returns the absolute index for the specified column key.
java.lang.Class getColumnClass(Column column)
Returns the most specific superclass for all cell values in a column.
int getColumnCount()
Returns the number of columns contained by this data frame.
UniqueIndex<Column> getColumnIndex()
Gets the index used to access the columns.
Column getColumnKey(int index)
Returns the column key at the specified absolute index.
java.lang.String getColumnName(Column column)
Returns the name of the column.
ConstantAggregation getConstant(java.lang.Object value)
Returns the aggregation method that returns a constant value.
CountAggregation getCount(Column column)
Returns the aggregation method for counting the number of values.
CountDistinctAggregation getCountDistinct(Column column)
Returns the aggregation method for counting the number of distinct values.
CountDistinctWithNullAggregation getCountDistinctWithNull(Column column)
Returns the aggregation method for counting the number of distinct values, including the null value if present.
DataFrameAggregation getDataFrameAggregation()
Returns the data frame for aggregation.
DistributiveStatisticsAggregation getDistributiveStatistics(Column column)
Returns the aggregation method for computing the distributive statistics.
FirstAggregation getFirst(Column value)
Returns the aggregation method that returns the first value.
FirstQuartileAggregation getFirstQuartile(Column column)
Returns the aggregation method for computing the first quartile.
MaxAggregation getMax(Column column)
Returns the aggregation method for finding the maximum value.
MeanAggregation getMean(Column column)
Returns the aggregation method for computing the mean value.
MedianAggregation getMedian(Column column)
Returns the aggregation method for computing the median value.
MinAggregation getMin(Column column)
Returns the aggregation method for finding the minimum value.
RandomAggregation getRandom(double min, double max)
Returns the aggregation method that returns a random value.
Series<Column,?> getRow(Row row)
Returns a series of all the values of a given row.
int getRowAddress(Row row)
Returns the absolute index for the specified row key.
java.lang.Class getRowClass(Row row)
Returns the most specific superclass for all cell values in a row.
int getRowCount()
Returns the number of rows contained by this data frame.
UniqueIndex<Row> getRowIndex()
Gets the index used to access the rows.
Row getRowKey(int index)
Returns the row key at the specified absolute index.
UnivariateStatistics getStatistics(Column column)
StdDevAggregation getStdDev(Column column)
Returns the aggregation method for computing the standard deviation.
SumAggregation getSum(Column column)
Returns the aggregation method for computing the sum.
ThirdQuartileAggregation getThirdQuartile(Column column)
Returns the aggregation method for computing the third quartile.
UnivariateStatisticsAggregation getUnivariateStatistics(Column column)
Returns the aggregation method for computing the univariate statistics.
V getValueAt(Row row, Column column)
Returns the value for the cell at the intersection of the column key and row key.
VarianceAggregation getVariance(Column column)
Returns the aggregation method for computing the variance.
VarianceAggregation getVarianceByPopulation(Column column, Column population)
Returns the aggregation method for computing the variance by factoring in the specified population.
Aggregation getWeightedMean(Column weight, Column column)
Returns the aggregation method for computing the weighted mean.
Aggregation getWeightedSum(Column weight, Column column)
Returns the aggregation method for computing the weighted sum.
DataFrame join(Series series, Column[] columns)
DataFrame<Row,Column,V> orderRows(SortKey<Column>... columns)
Returns a new data frame reordered using the values coming from the specified columns.
void print()
Display the content of the data frame to the console.
void print(java.io.PrintStream out, java.lang.String caption, boolean html)
Display the content of the data frame to the specified output.
void printSchema()
Display the schema of the data frame to the console.
<Column> DataFrame<Row,Column,V> reindexColumns(Column... columns)
Returns a new data frame with columns reindexed using the specified values.
DataFrame<java.lang.Integer,Column,V> reindexRows()
Returns a new data frame reindexed using integers.
DataFrame<V,Column,V> reindexRows(boolean keepColumn, Column column)
Returns a new data frame reindexed using the values coming from the specified column.
DataFrame<MultiKey,Column,V> reindexRows(boolean keepColumns, Column... columns)
Returns a new data frame reindexed using the values coming from the specified columns.
DataFrame<V,Column,V> reindexRows(Column column)
Returns a new data frame reindexed using the values coming from the specified column.
DataFrame<MultiKey,Column,V> reindexRows(Column... columns)
Returns a new data frame reindexed using the values coming from the specified columns.
DataFrame<Row,Column,V> remapColumns(Column... columns)
DataFrame<Row,Column,V> removeColumns(Column... columns)
void removeDataFrameListener(DataFrameListener<Row,Column> listener)
Remove a listener from the list that's notified each time a change to the data frame occurs.
void removeDataFrameListeners()
Remove all listeners from the list that's notified each time a change to the data frame occurs.
DataFrame<Row,Column,V> removeDuplicates(Column... columns)
Returns a new data frame in which rows with duplicate values in the specified columns are removed.
java.lang.Iterable<Row> rows()
Returns the row keys.
-
-
-
Method Detail
-
getColumnName
java.lang.String getColumnName(Column column)
Returns the name of the column. This is a convenience method for labeling purposes.
- Parameters:
column - the key of the column
- Returns:
the name of the column
-
getRowClass
java.lang.Class getRowClass(Row row)
Returns the most specific superclass for all cell values in a row.
- Parameters:
row - the key of the row
- Returns:
the common ancestor class of the object values in the row.
-
getColumnClass
java.lang.Class getColumnClass(Column column)
Returns the most specific superclass for all cell values in a column.
- Parameters:
column - the key of the column
- Returns:
the common ancestor class of the object values in the column.
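One way to find the most specific common superclass of a set of values is to start from the class of the first value and climb the hierarchy until every value fits. The sketch below is an assumption about how such a lookup could work, not the library's implementation; `CommonSuperclassSketch` is a hypothetical name, and skipping nulls and returning `Object.class` for empty input are choices made here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the library.
class CommonSuperclassSketch {
    static Class<?> commonSuperclass(Iterable<?> values) {
        Class<?> common = null;
        for (Object value : values) {
            if (value == null) {
                continue; // assumption: null cells do not influence the result
            }
            Class<?> current = value.getClass();
            if (common == null) {
                common = current;
            } else {
                // Climb the hierarchy until the candidate also covers this value.
                // The loop ends at Object at the latest.
                while (!common.isAssignableFrom(current)) {
                    common = common.getSuperclass();
                }
            }
        }
        return common == null ? Object.class : common; // assumption for empty input
    }

    public static void main(String[] args) {
        List<Object> mixedNumbers = Arrays.asList(1, 2.5, 3L);
        System.out.println(commonSuperclass(mixedNumbers));
    }
}
```

Only superclasses are considered, which matches the wording "superclass" in the description; shared interfaces such as `Comparable` are ignored.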
-
getValueAt
V getValueAt(Row row, Column column)
Returns the value for the cell at the intersection of the column key and row key.
- Parameters:
row - the row key whose value is to be queried
column - the column key whose value is to be queried
- Returns:
the value at the specified cell
-
getRow
Series<Column,?> getRow(Row row)
Returns a series of all the values of a given row.
- Parameters:
row - the row key
- Returns:
a Series object
-
getColumn
Series<Row,V> getColumn(Column column)
Returns a series of all the values of a given column.
- Parameters:
column - the column key
- Returns:
a Series object
-
rows
java.lang.Iterable<Row> rows()
Returns the row keys.
- Returns:
the row keys
-
columns
java.lang.Iterable<Column> columns()
Returns the column keys.
- Returns:
the column keys
-
getRowKey
Row getRowKey(int index)
Returns the row key at the specified absolute index. This is the inverse of getRowAddress(Object).
- Parameters:
index - the index
- Returns:
the row key
-
getColumnKey
Column getColumnKey(int index)
Returns the column key at the specified absolute index. This is the inverse of getColumnAddress(Object).
- Parameters:
index - the index
- Returns:
the column key
-
getRowAddress
int getRowAddress(Row row)
Returns the absolute index for the specified row key. This is the inverse of getRowKey(int).
- Parameters:
row - the row key
- Returns:
the absolute index of the specified key.
-
getColumnAddress
int getColumnAddress(Column column)
Returns the absolute index for the specified column key. This is the inverse of getColumnKey(int).
- Parameters:
column - the column key
- Returns:
the absolute index of the specified key.
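The inverse relationship between keys and absolute indices (getRowKey/getRowAddress and getColumnKey/getColumnAddress) can be illustrated with a list and a map kept in sync. `KeyAddressSketch` is a hypothetical stand-in, not the library's UniqueIndex, and returning -1 for an unknown key is an assumption the source does not confirm.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not the library's UniqueIndex.
class KeyAddressSketch<K> {
    private final List<K> keys = new ArrayList<>();            // address -> key
    private final Map<K, Integer> addresses = new HashMap<>(); // key -> address

    void add(K key) {
        if (addresses.containsKey(key)) {
            throw new IllegalArgumentException("Duplicate key: " + key);
        }
        addresses.put(key, keys.size());
        keys.add(key);
    }

    K getKey(int index) {
        return keys.get(index);
    }

    // Assumption: an unknown key yields -1; the source does not specify this case.
    int getAddress(K key) {
        Integer address = addresses.get(key);
        return address == null ? -1 : address;
    }

    public static void main(String[] args) {
        KeyAddressSketch<String> index = new KeyAddressSketch<>();
        index.add("north");
        index.add("south");
        // Round trip: key -> address -> key returns the original key.
        System.out.println(index.getKey(index.getAddress("south")));
    }
}
```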
-
getRowCount
int getRowCount()
Returns the number of rows contained by this data frame.
- Returns:
the number of rows.
-
getColumnCount
int getColumnCount()
Returns the number of columns contained by this data frame.
- Returns:
the number of columns.
-
reindexRows
DataFrame<java.lang.Integer,Column,V> reindexRows()
Returns a new data frame reindexed using integers.
- Returns:
the reindexed data frame.
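Reindexing with integers can be pictured as replacing each row key with the row's position. The following is a minimal sketch on a plain map of rows, assuming numbering starts at 0 and follows the existing row order; neither detail is stated in the source, and `ReindexWithIntegersSketch` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; a frame is modeled as row key -> (column key -> value).
class ReindexWithIntegersSketch {
    static <R, C, V> Map<Integer, Map<C, V>> reindexRows(Map<R, Map<C, V>> frame) {
        Map<Integer, Map<C, V>> result = new LinkedHashMap<>();
        int next = 0; // assumption: numbering starts at 0
        for (Map<C, V> row : frame.values()) {
            result.put(next++, row); // the original row key is dropped
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> frame = new LinkedHashMap<>();
        frame.put("b", Map.of("x", 10));
        frame.put("a", Map.of("x", 20));
        System.out.println(reindexRows(frame));
    }
}
```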
-
reindexRows
DataFrame<V,Column,V> reindexRows(Column column)
Returns a new data frame reindexed using the values coming from the specified column.
- Parameters:
column - the column to use for the label values
- Returns:
the reindexed data frame.
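The return type `DataFrame<V,Column,V>` shows that the values of the chosen column become the new row keys. A rough sketch of that idea on a plain map of rows follows; `ReindexByColumnSketch` is a hypothetical name, and throwing on duplicate values is an assumption, since the source does not say how collisions are handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; a frame is modeled as row key -> (column key -> value).
class ReindexByColumnSketch {
    static <R, C, V> Map<V, Map<C, V>> reindexRows(Map<R, Map<C, V>> frame, C column) {
        Map<V, Map<C, V>> result = new LinkedHashMap<>();
        for (Map<C, V> row : frame.values()) {
            V key = row.get(column); // the cell value becomes the new row key
            if (result.containsKey(key)) {
                // Assumption: row keys must stay unique, so duplicates are rejected.
                throw new IllegalStateException("Duplicate row key: " + key);
            }
            result.put(key, row);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Map<String, String>> frame = new LinkedHashMap<>();
        frame.put(0, Map.of("id", "u1", "name", "Ann"));
        frame.put(1, Map.of("id", "u2", "name", "Bob"));
        Map<String, Map<String, String>> byId = reindexRows(frame, "id");
        System.out.println(byId.get("u2").get("name"));
    }
}
```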
-
reindexRows
DataFrame<V,Column,V> reindexRows(boolean keepColumn, Column column)
Returns a new data frame reindexed using the values coming from the specified column.
- Parameters:
column - the column to use for the label values
- Returns:
the reindexed data frame.
-
reindexRows
DataFrame<MultiKey,Column,V> reindexRows(Column... columns)
Returns a new data frame reindexed using the values coming from the specified columns.
- Parameters:
columns - the columns to use for the label values
- Returns:
the reindexed data frame.
-
reindexRows
DataFrame<MultiKey,Column,V> reindexRows(boolean keepColumns, Column... columns)
Returns a new data frame reindexed using the values coming from the specified columns.
- Parameters:
columns - the columns to use for the label values
- Returns:
the reindexed data frame.
-
reindexColumns
<Column> DataFrame<Row,Column,V> reindexColumns(Column... columns)
Returns a new data frame with columns reindexed using the specified values.
- Parameters:
columns - the values to use to reindex the columns
- Returns:
the reindexed data frame.
-
getRowIndex
UniqueIndex<Row> getRowIndex()
Gets the index used to access the rows.
- Returns:
the row index
-
getColumnIndex
UniqueIndex<Column> getColumnIndex()
Gets the index used to access the columns.
- Returns:
the column index
-
orderRows
DataFrame<Row,Column,V> orderRows(SortKey<Column>... columns)
Returns a new data frame reordered using the values coming from the specified columns.
- Parameters:
columns - the columns to use for sorting
- Returns:
the sorted data frame.
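Ordering rows by column values amounts to sorting the row keys with a comparator that looks up each row's cell. The sketch below sorts by a single column and uses a boolean flag in place of the library's SortKey, whose API is not shown in the source; `OrderRowsSketch` and the `ascending` parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; a frame is modeled as row key -> (column key -> value).
class OrderRowsSketch {
    // Returns the row keys in sorted order; the frame itself is left untouched.
    static <R, C, V extends Comparable<? super V>> List<R> orderRows(
            Map<R, Map<C, V>> frame, C column, boolean ascending) {
        Comparator<R> byValue = Comparator.comparing((R key) -> frame.get(key).get(column));
        List<R> keys = new ArrayList<>(frame.keySet());
        keys.sort(ascending ? byValue : byValue.reversed()); // List.sort is stable
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> frame = new LinkedHashMap<>();
        frame.put("a", Map.of("score", 3));
        frame.put("b", Map.of("score", 1));
        frame.put("c", Map.of("score", 2));
        System.out.println(orderRows(frame, "score", true));
    }
}
```

Null cells are not handled here; a fuller version would wrap the key extractor with a nulls-first or nulls-last rule.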
-
filter
FilterDataFrame<Row,Column,V> filter(com.macrofocus.filter.MutableFilter<Row> filter)
Returns a new data frame whose rows are filtered by the specified filter model.
- Parameters:
filter - the filter model
- Returns:
a filtered data frame
-
removeDuplicates
DataFrame<Row,Column,V> removeDuplicates(Column... columns)
Returns a new data frame in which rows with duplicate values in the specified columns are removed.
- Parameters:
columns - the columns to use to check for duplicates
- Returns:
a new data frame
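Duplicate removal over a set of columns can be sketched by remembering the combination of values already seen in those columns. The example assumes that the first row of each combination is the one kept, which the source does not specify; `RemoveDuplicatesSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; a frame is modeled as row key -> (column key -> value).
class RemoveDuplicatesSketch {
    static <R, C, V> Map<R, Map<C, V>> removeDuplicates(Map<R, Map<C, V>> frame, List<C> columns) {
        Set<List<V>> seen = new HashSet<>();
        Map<R, Map<C, V>> result = new LinkedHashMap<>();
        for (Map.Entry<R, Map<C, V>> entry : frame.entrySet()) {
            // The values in the checked columns form the row's signature.
            List<V> signature = new ArrayList<>();
            for (C column : columns) {
                signature.add(entry.getValue().get(column));
            }
            // Assumption: the first row with a given signature is kept, later ones are dropped.
            if (seen.add(signature)) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> frame = new LinkedHashMap<>();
        frame.put("r1", Map.of("city", "Bern", "note", "first"));
        frame.put("r2", Map.of("city", "Bern", "note", "second"));
        frame.put("r3", Map.of("city", "Basel", "note", "third"));
        System.out.println(removeDuplicates(frame, List.of("city")).keySet());
    }
}
```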
-
aggregate
AggregateDataFrame<Column> aggregate(Aggregation... aggregation)
Returns a new data frame suitable for data aggregation. Additional methods for customizing the aggregation criteria are provided by AggregateDataFrame.
- Parameters:
aggregation - the aggregation methods to use as columns
- Returns:
an aggregated data frame.
-
aggregate
AggregateDataFrame<Column> aggregate(boolean includeIndex, Aggregation... aggregation)
-
append
DataFrame<Row,Column,V> append(DataFrame<Row,Column,V> dataFrame)
Returns a new data frame with the rows of the specified data frame appended to the end of this data frame.
- Parameters:
dataFrame - the data frame whose rows are to be appended
- Returns:
a combined data frame.
-
appendAndReindex
DataFrame<java.lang.Integer,Column,V> appendAndReindex(DataFrame<Row,Column,V> dataFrame)
Returns a new data frame with the rows of the specified data frame appended to the end of this data frame.
- Parameters:
dataFrame - the data frame whose rows are to be appended
- Returns:
a combined data frame.
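The `java.lang.Integer` row type in the signature of appendAndReindex suggests that the combined rows receive fresh integer keys. The sketch below shows why that can be useful when both inputs share row keys. It is an interpretation based on the signature only; `AppendAndReindexSketch` is a hypothetical name and numbering from 0 is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; a frame is modeled as row key -> (column key -> value).
class AppendAndReindexSketch {
    static <R, C, V> Map<Integer, Map<C, V>> appendAndReindex(
            Map<R, Map<C, V>> first, Map<R, Map<C, V>> second) {
        Map<Integer, Map<C, V>> result = new LinkedHashMap<>();
        int next = 0; // assumption: numbering starts at 0
        for (Map<C, V> row : first.values()) {
            result.put(next++, row);
        }
        for (Map<C, V> row : second.values()) {
            result.put(next++, row); // rows of the second frame follow those of the first
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> january = new LinkedHashMap<>();
        january.put("total", Map.of("sales", 100));
        Map<String, Map<String, Integer>> february = new LinkedHashMap<>();
        february.put("total", Map.of("sales", 120));
        // Both frames use the row key "total"; fresh integer keys avoid the clash.
        System.out.println(appendAndReindex(january, february));
    }
}
```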
-
getStatistics
UnivariateStatistics getStatistics(Column column)
-
getDataFrameAggregation
DataFrameAggregation getDataFrameAggregation()
Returns the data frame for aggregation.
- Returns:
the aggregation data frame
-
getFirst
FirstAggregation getFirst(Column value)
Returns the aggregation method that returns the first value.
- Parameters:
value - the column key
- Returns:
the aggregation method
-
getConstant
ConstantAggregation getConstant(java.lang.Object value)
Returns the aggregation method that returns a constant value.
- Parameters:
value - the constant value
- Returns:
the aggregation method
-
getRandom
RandomAggregation getRandom(double min, double max)
Returns the aggregation method that returns a random value.
- Returns:
the aggregation method
-
getSum
SumAggregation getSum(Column column)
Returns the aggregation method for computing the sum.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
getCount
CountAggregation getCount(Column column)
Returns the aggregation method for counting the number of values.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
getDistributiveStatistics
DistributiveStatisticsAggregation getDistributiveStatistics(Column column)
Returns the aggregation method for computing the distributive statistics.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
getMin
MinAggregation getMin(Column column)
Returns the aggregation method for finding the minimum value.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
getMax
MaxAggregation getMax(Column column)
Returns the aggregation method for finding the maximum value.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
getMean
MeanAggregation getMean(Column column)
Returns the aggregation method for computing the mean value.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
getVariance
VarianceAggregation getVariance(Column column)
Returns the aggregation method for computing the variance.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
getVarianceByPopulation
VarianceAggregation getVarianceByPopulation(Column column, Column population)
Returns the aggregation method for computing the variance by factoring in the specified population.
- Parameters:
column - the column key
population - the key of the column for the population
- Returns:
the aggregation method
-
getStdDev
StdDevAggregation getStdDev(Column column)
Returns the aggregation method for computing the standard deviation.
- Parameters:
column - the column key
- Returns:
the aggregation method
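For reference, the mean, variance, and standard deviation of a column's values can be computed as below. The source does not say whether the library divides by n (population) or n - 1 (sample), so the sketch exposes both through a flag; `VarianceSketch` and the `sample` parameter are hypothetical.

```java
// Hypothetical helper for illustration; not the library's aggregation classes.
class VarianceSketch {
    static double mean(double[] values) {
        double sum = 0.0;
        for (double value : values) {
            sum += value;
        }
        return sum / values.length;
    }

    // sample = true divides by n - 1, sample = false divides by n.
    static double variance(double[] values, boolean sample) {
        double mean = mean(values);
        double squaredDeviations = 0.0;
        for (double value : values) {
            squaredDeviations += (value - mean) * (value - mean);
        }
        return squaredDeviations / (sample ? values.length - 1 : values.length);
    }

    static double stdDev(double[] values, boolean sample) {
        return Math.sqrt(variance(values, sample));
    }

    public static void main(String[] args) {
        double[] column = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("mean = " + mean(column));
        System.out.println("population variance = " + variance(column, false));
        System.out.println("population std dev = " + stdDev(column, false));
    }
}
```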
-
getUnivariateStatistics
UnivariateStatisticsAggregation getUnivariateStatistics(Column column)
Returns the aggregation method for computing the univariate statistics.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
getMedian
MedianAggregation getMedian(Column column)
Returns the aggregation method for computing the median value.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
getFirstQuartile
FirstQuartileAggregation getFirstQuartile(Column column)
Returns the aggregation method for computing the first quartile.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
getThirdQuartile
ThirdQuartileAggregation getThirdQuartile(Column column)
Returns the aggregation method for computing the third quartile.
- Parameters:
column - the column key
- Returns:
the aggregation method
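The median and the two quartiles are the 0.5, 0.25, and 0.75 quantiles of the column's values. Several quantile definitions exist and the source does not say which one the library uses; the sketch below uses linear interpolation between the two nearest sorted values, which is one common choice. `QuartileSketch` is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; uses linear interpolation between order statistics.
class QuartileSketch {
    // p is the quantile level in [0, 1]; values must be non-empty.
    static double quantile(double[] values, double p) {
        double[] sorted = values.clone(); // keep the caller's array untouched
        Arrays.sort(sorted);
        double position = p * (sorted.length - 1);
        int lower = (int) Math.floor(position);
        int upper = (int) Math.ceil(position);
        return sorted[lower] + (position - lower) * (sorted[upper] - sorted[lower]);
    }

    public static void main(String[] args) {
        double[] column = {5, 1, 4, 2, 3};
        System.out.println("first quartile = " + quantile(column, 0.25));
        System.out.println("median = " + quantile(column, 0.5));
        System.out.println("third quartile = " + quantile(column, 0.75));
    }
}
```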
-
getWeightedSum
Aggregation getWeightedSum(Column weight, Column column)
Returns the aggregation method for computing the weighted sum.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
getWeightedMean
Aggregation getWeightedMean(Column weight, Column column)
Returns the aggregation method for computing the weighted mean.
- Parameters:
column - the column key
- Returns:
the aggregation method
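As a reminder of the arithmetic, a weighted sum multiplies each value by its weight before adding, and a weighted mean divides that sum by the total weight. The sketch assumes the textbook definitions and parallel arrays for the value and weight columns; `WeightedMeanSketch` is a hypothetical name and the library's own treatment of missing or zero weights is not described in the source.

```java
// Hypothetical helper for illustration; values[i] is weighted by weights[i].
class WeightedMeanSketch {
    static double weightedSum(double[] weights, double[] values) {
        if (weights.length != values.length) {
            throw new IllegalArgumentException("weights and values must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < values.length; i++) {
            sum += weights[i] * values[i];
        }
        return sum;
    }

    static double weightedMean(double[] weights, double[] values) {
        double totalWeight = 0.0;
        for (double weight : weights) {
            totalWeight += weight;
        }
        return weightedSum(weights, values) / totalWeight;
    }

    public static void main(String[] args) {
        double[] weights = {1, 1, 2};
        double[] values = {1, 2, 3};
        System.out.println("weighted sum = " + weightedSum(weights, values));
        System.out.println("weighted mean = " + weightedMean(weights, values));
    }
}
```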
-
getCountDistinct
CountDistinctAggregation getCountDistinct(Column column)
Returns the aggregation method for counting the number of distinct values.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
getCountDistinctWithNull
CountDistinctWithNullAggregation getCountDistinctWithNull(Column column)
Returns the aggregation method for counting the number of distinct values, including the null value if present.
- Parameters:
column - the column key
- Returns:
the aggregation method
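The difference between the two distinct counts can be shown with a set: one variant discards null before counting, the other counts null as one more distinct value when it occurs. This reading follows the method descriptions; `CountDistinctSketch` is a hypothetical name and not the library's aggregation class.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration.
class CountDistinctSketch {
    // Counts distinct non-null values.
    static <V> int countDistinct(Collection<V> values) {
        Set<V> distinct = new HashSet<>(values); // HashSet accepts a null element
        distinct.remove(null);
        return distinct.size();
    }

    // Counts distinct values, with null counted once if it occurs.
    static <V> int countDistinctWithNull(Collection<V> values) {
        return new HashSet<>(values).size();
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList("a", "b", "a", null, null);
        System.out.println("distinct = " + countDistinct(column));
        System.out.println("distinct with null = " + countDistinctWithNull(column));
    }
}
```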
-
getCentroid
CentroidAggregation getCentroid(Column column)
Returns the aggregation method for finding the centroid of geometries.
- Parameters:
column - the column key
- Returns:
the aggregation method
-
printSchema
@GwtIncompatible void printSchema()
Display the schema of the data frame to the console.
-
print
@GwtIncompatible void print()
Display the content of the data frame to the console.
-
print
@GwtIncompatible void print(java.io.PrintStream out, java.lang.String caption, boolean html)
Display the content of the data frame to the specified output.
- Parameters:
out - the output
caption - the title
html - false for text output, true for HTML formatting
-
benchmark
void benchmark()
Perform some repetitive operation to estimate the performance of the data frame.
-
addDataFrameListener
void addDataFrameListener(DataFrameListener<Row,Column> listener)
Add a listener to the list that's notified each time a change to the data frame occurs.
- Parameters:
listener - the DataFrameListener
-
addWeakDataFrameListener
void addWeakDataFrameListener(DataFrameListener<Row,Column> listener)
Add a listener to the list that's notified each time a change to the data frame occurs. The listener is automatically disposed of once no other object holds a reference to it.
- Parameters:
listener - the DataFrameListener
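A common way to obtain this behavior in Java is to store the listener behind a `java.lang.ref.WeakReference` and drop the entry once the reference has been cleared. The sketch below illustrates that general technique; it is not taken from the library, and `WeakListenerSketch`, `ChangeListener`, and `fire` are hypothetical names.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for illustration; not the library's listener handling.
class WeakListenerSketch {
    interface ChangeListener {
        void changed(String what);
    }

    private final List<ChangeListener> strong = new ArrayList<>();
    private final List<WeakReference<ChangeListener>> weak = new ArrayList<>();

    void addListener(ChangeListener listener) {
        strong.add(listener);
    }

    // The list does not keep the listener alive; the caller must hold a reference.
    void addWeakListener(ChangeListener listener) {
        weak.add(new WeakReference<>(listener));
    }

    void removeListener(ChangeListener listener) {
        strong.remove(listener);
        weak.removeIf(reference -> reference.get() == listener);
    }

    void removeListeners() {
        strong.clear();
        weak.clear();
    }

    void fire(String what) {
        for (ChangeListener listener : new ArrayList<>(strong)) {
            listener.changed(what);
        }
        Iterator<WeakReference<ChangeListener>> iterator = weak.iterator();
        while (iterator.hasNext()) {
            ChangeListener listener = iterator.next().get();
            if (listener == null) {
                iterator.remove(); // the listener was garbage collected
            } else {
                listener.changed(what);
            }
        }
    }

    public static void main(String[] args) {
        WeakListenerSketch source = new WeakListenerSketch();
        ChangeListener listener = what -> System.out.println("changed: " + what);
        source.addWeakListener(listener);
        source.fire("row added");
        source.removeListener(listener);
    }
}
```

A practical consequence of weak registration is that a listener created inline and not stored anywhere else may be collected at any time, so it can stop receiving notifications without warning.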
-
removeDataFrameListener
void removeDataFrameListener(DataFrameListener<Row,Column> listener)
Remove a listener from the list that's notified each time a change to the data frame occurs.
- Parameters:
listener - the DataFrameListener
-
removeDataFrameListeners
void removeDataFrameListeners()
Remove all listeners from the list that's notified each time a change to the data frame occurs.
-
-