lisa.datautils

Dataframe utilities.

Globals

Classes

DataAccessor

Proxy class that allows extending the pandas.DataFrame API.

DataFrameAccessor

Proxy class that allows extending the pandas.DataFrame API.
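As a rough illustration of the accessor mechanism, the sketch below uses the standard pandas extension hook to attach a namespace to every dataframe. The accessor name `demo`, the class `DemoAccessor` and its `row_count()` method are hypothetical and chosen for this example only; they are not part of LISA.

```python
import pandas as pd


# "demo" is a hypothetical namespace used only for this illustration.
@pd.api.extensions.register_dataframe_accessor("demo")
class DemoAccessor:
    def __init__(self, df):
        # pandas passes the dataframe the accessor is accessed on.
        self._df = df

    def row_count(self):
        """Return the number of rows of the wrapped dataframe."""
        return len(self._df)


df = pd.DataFrame({"a": [1, 2, 3]})
# The extra method is reachable through the registered namespace.
n_rows = df.demo.row_count()
```

A proxy class of this kind keeps the extra functions grouped under one attribute rather than patching them directly onto pandas.DataFrame.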

SeriesAccessor

Proxy class that allows extending the pandas.Series API.

SignalDesc

Define a signal to be used by various signal-oriented APIs.

Timestamp

Nanosecond-precision timestamp. It inherits from float and as such can be manipulated as a floating point number of seconds. The nanoseconds attribute allows getting the exact timestamp regardless of the magnitude of the float, allowing for more precise computation.
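To show why keeping an exact nanosecond count next to the float matters, here is a minimal sketch of the idea. `NsTimestamp` and its constructor are hypothetical stand-ins written for this example; they do not reproduce the actual Timestamp class or its signature.

```python
class NsTimestamp(float):
    """Hypothetical float subclass that also stores an exact nanosecond count."""

    def __new__(cls, nanoseconds):
        # The float value is a number of seconds, possibly rounded.
        self = super().__new__(cls, nanoseconds / 1e9)
        # The integer attribute keeps the exact value.
        self.nanoseconds = int(nanoseconds)
        return self


a = NsTimestamp(1_700_000_000_000_000_001)
b = NsTimestamp(1_700_000_000_000_000_002)

# At this magnitude, a 1 ns difference is lost in the float representation...
same_as_float = (a == b)
# ...but it is still recoverable from the integer attribute.
exact_delta_ns = b.nanoseconds - a.nanoseconds
```

The object still behaves as a plain float in arithmetic, so existing code that expects seconds keeps working.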

Functions

df_add_delta()

Add a column containing the delta of the given other column.

df_combine()

Same as pandas.DataFrame.combine() on a list of dataframes rather than just two.

df_combine_duplicates()

Combine the duplicated rows using func and remove the duplicates.

df_convert_to_nullable()

Convert the columns of the dataframe to their equivalent nullable dtype, when possible.

df_deduplicate()

Same as series_deduplicate() but for pandas.DataFrame.

df_delta()

Given pre_df and post_df containing paired/consecutive events indexed by time, df_delta() merges the two dataframes and adds a delta column containing the time spent between the two events. A typical use case would be adding pre/post events at the entry/exit of a function.
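The pairing of pre and post events can be approximated in plain pandas as follows. This is only a sketch of the concept using `pandas.merge_asof()`, not the df_delta() implementation; the column names `func`, `ret` and `post_time` and the sample data are assumptions made for the example.

```python
import pandas as pd

# Hypothetical entry/exit events, indexed by time in seconds.
pre_df = pd.DataFrame({"func": ["f", "g"]}, index=pd.Index([1.0, 4.0], name="Time"))
post_df = pd.DataFrame({"ret": [0, 1]}, index=pd.Index([1.5, 6.0], name="Time"))

pre = pre_df.reset_index()
post = post_df.reset_index().rename(columns={"Time": "post_time"})

# Pair each pre event with the first post event at or after it.
merged = pd.merge_asof(
    pre, post, left_on="Time", right_on="post_time", direction="forward"
)
# Time spent between the paired events.
merged["delta"] = merged["post_time"] - merged["Time"]
merged = merged.set_index("Time")
```

Both inputs must be sorted by time for `merge_asof()` to accept them, which holds for time-indexed event dataframes.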

df_dereference()

Similar to series_dereference().

df_filter()

Filter the content of a dataframe.

df_filter_task_ids()

Filter a dataframe using a list of lisa.analysis.tasks.TaskID.

df_find_redundant_cols()

Find the columns that are redundant to col, i.e. that can be computed as df[x] = df[col].map(dict(...)).
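The redundancy criterion can be sketched in plain pandas: a column is redundant to col when every value of col is paired with exactly one value of that column. The helper name `find_redundant_cols` and the sample data are hypothetical, and the real function may return its result in a different form.

```python
import pandas as pd


def find_redundant_cols(df, col):
    """Hypothetical helper: map each column that is a function of `col`
    to the dict that computes it from `col`."""
    redundant = {}
    for other in df.columns:
        if other == col:
            continue
        # Unique (col, other) pairs found in the dataframe.
        pairs = df[[col, other]].drop_duplicates()
        # If no value of `col` appears twice, `other` is a function of `col`.
        if not pairs[col].duplicated().any():
            redundant[other] = dict(zip(pairs[col], pairs[other]))
    return redundant


df = pd.DataFrame({
    "pid": [1, 2, 1, 3],
    "comm": ["init", "sh", "init", "sh"],
    "cpu": [0, 0, 1, 2],
})
result = find_redundant_cols(df, "pid")
# "comm" can be rebuilt from "pid", whereas "cpu" cannot.
rebuilt = df["pid"].map(result["comm"])
```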

df_make_empty_clone()

Make an empty clone of the given dataframe.

df_merge()

Merge a list of pandas.DataFrame, keeping the index sorted.

df_refit_index()

Same as series_refit_index() but acting on pandas.DataFrame.

df_split_signals()

Yield subsets of df that each contain only one signal, along with the signal identification values.

df_squash()

Slice a dataframe of deltas in [start:end] and ensure we have an event at exactly those boundaries.

df_update_duplicates()

Same as series_update_duplicates() but on a pandas.DataFrame.

df_window()

Same as series_window() but acting on a pandas.DataFrame.

df_window_signals()

Similar to df_window() with method='pre' but guarantees that each signal will have a value at the beginning of the window.

series_align_signal()

Align a signal to an expected reference signal using their cross-correlation.
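Cross-correlation alignment can be illustrated with NumPy alone: the lag at which the cross-correlation peaks is the shift to undo. This is a sketch of the general technique on hypothetical sample arrays, assuming both signals share the same uniform sampling; it does not reproduce the function's actual parameters or return value.

```python
import numpy as np

# Hypothetical reference pulse and the same pulse delayed by 2 samples.
ref = np.array([0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0])
sig = np.roll(ref, 2)

# Full cross-correlation: output index i corresponds to lag i - (len(ref) - 1).
xcorr = np.correlate(sig, ref, mode="full")
lag = int(np.argmax(xcorr)) - (len(ref) - 1)

# Shift the signal back by the detected lag to line it up with the reference.
aligned = np.roll(sig, -lag)
```

`np.roll()` wraps samples around the array ends, which is harmless here only because the pulse is surrounded by zeros.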

series_combine()

Same as pandas.Series.combine() on a list of series rather than just two.

series_convert()

Convert a pandas.Series with a best effort strategy.

series_deduplicate()

Remove duplicate values in a pandas.Series.

series_dereference()

Replace each value in series with the value found at the corresponding index in the source indicated by that value.

series_derivate()

Compute a derivative of a pandas.Series with respect to another series.

series_envelope_mean()

Compute the average between the mean of local maxima and the mean of local minima of the series.
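The computation described above can be sketched with plain pandas comparisons against neighbouring samples. The strict-inequality definition of a local extremum used here is an assumption made for the example, and the real function may detect extrema differently.

```python
import pandas as pd

s = pd.Series([0.0, 2.0, 1.0, 3.0, 0.0, 4.0, 1.0])

# A sample is a local maximum (minimum) if it is strictly greater (smaller)
# than both neighbours; the endpoints never qualify.
maxima = s[(s > s.shift(1)) & (s > s.shift(-1))]
minima = s[(s < s.shift(1)) & (s < s.shift(-1))]

# Average of the mean of the upper envelope and the mean of the lower envelope.
envelope_mean = (maxima.mean() + minima.mean()) / 2
```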

series_integrate()

Compute the integral of y with respect to x.

series_local_extremum()

Return a series of local extrema.

series_mean()

Compute the average of y by integrating with respect to x and dividing by the range of x.
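The difference between this integral-based average and a plain arithmetic mean shows up as soon as samples are unevenly spaced. The sketch below uses the trapezoid rule on hypothetical data; the choice of integration rule is an assumption, and the real function may use another one.

```python
import numpy as np
import pandas as pd

# Hypothetical signal sampled at uneven times (the index plays the role of x).
y = pd.Series([0.0, 2.0, 2.0, 0.0], index=[0.0, 1.0, 3.0, 4.0])
x = y.index.to_numpy()
values = y.to_numpy()

# Trapezoid rule: average of consecutive samples times the step between them.
integral = float(np.sum((values[1:] + values[:-1]) / 2 * np.diff(x)))

# Average of y over the covered range of x.
weighted_mean = integral / (x[-1] - x[0])

# For comparison, the plain mean ignores how long each value was held.
plain_mean = float(y.mean())
```

Here the value 2.0 is held for half of the range, so the integral-based average (1.5) is higher than the plain mean (1.0).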

series_refit_index()

Slice a series using series_window() and ensure we have a value at exactly the specified boundaries, unless the signal started after the beginning of the required window.

series_rolling_apply()

Apply a function on a rolling window of a series.

series_update_duplicates()

Update a given series to avoid duplicated values.

series_window()

Select a portion of a pandas.Series.