lisa.datautils.df_delta

lisa.datautils.df_delta(pre_df, post_df, group_on=None)

Given pre_df and post_df, two dataframes containing paired/consecutive events indexed by time, df_delta() merges them and adds a delta column containing the time elapsed between the two events of each pair. A typical use case is adding pre/post events at the entry/exit of a function.

Rows from pre_df and post_df are grouped by the group_on columns, e.g. ['pid', 'comm'] to group by task. Apart from the columns listed in group_on, pre_df and post_df must not share any column names.

Events that cannot be paired are ignored.
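The pairing rule described above can be illustrated with plain pandas. The sketch below is not LISA's implementation and does not call df_delta(). The helper paired_delta_sketch, the sample columns (pid, func, ret) and the temporary _time/_kind column names are all hypothetical. It assumes that a 'pre' event is paired only with a 'post' event that immediately follows it within the same group.

```python
import pandas as pd


def paired_delta_sketch(pre_df, post_df, group_on):
    """Hypothetical helper (not part of LISA) mimicking the documented
    behaviour: pair each 'pre' event with the next 'post' event of its group."""
    # Move the time index into a column and tag each row with its origin.
    pre = pre_df.rename_axis("_time").reset_index().assign(_kind="pre")
    post = post_df.rename_axis("_time").reset_index().assign(_kind="post")
    events = (
        pd.concat([pre, post], ignore_index=True)
        .sort_values("_time", kind="stable")
        .reset_index(drop=True)
    )

    # Columns that only exist in post_df (everything except the grouping keys).
    post_cols = [c for c in post_df.columns if c not in group_on]

    # For every row, fetch the next event of the same group.
    grouped = events.groupby(group_on, sort=False)
    following = grouped[["_time", "_kind"] + post_cols].shift(-1)

    # Keep a 'pre' row only when the next event in its group is a 'post'.
    paired = (events["_kind"] == "pre") & (following["_kind"] == "post")

    out = events.loc[paired, ["_time"] + list(pre_df.columns)].copy()
    for col in post_cols:
        out[col] = following.loc[paired, col]
    out["delta"] = following.loc[paired, "_time"] - out["_time"]

    out = out.set_index("_time")
    out.index.name = pre_df.index.name
    return out


# Made-up sample data: function entry/exit events for two tasks.
pre_df = pd.DataFrame(
    {"pid": [1, 2, 1], "func": ["a", "b", "c"]},
    index=pd.Index([1.0, 2.0, 5.0], name="Time"),
)
post_df = pd.DataFrame(
    {"pid": [1, 2], "ret": [0, -1]},
    index=pd.Index([1.5, 2.75], name="Time"),
)

result = paired_delta_sketch(pre_df, post_df, group_on=["pid"])
# The 'pre' event at Time=5.0 has no matching 'post' event and is dropped.
print(result)
```

In this sample, the two paired rows get a delta of 0.5 and 0.75, while the unpaired entry at Time=5.0 does not appear in the result.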

Parameters:
  • pre_df (pandas.DataFrame) – Dataframe containing the events that start a record.

  • post_df (pandas.DataFrame) – Dataframe containing the events that end a record.

  • group_on (list(str)) – Columns used to group pre_df and post_df, e.g. ['pid', 'comm'] to group by task.

Returns:

A pandas.DataFrame that keeps the index of pre_df, with:

  • All the columns from the pre_df dataframe.

  • All the columns from the post_df dataframe.

  • A delta column (duration between the emission of a ‘pre’ event and its consecutive ‘post’ event).