EdgeFrame flatten_column

flatten_column(self, column, delimiter=None)

Spread data to multiple rows based on cell data.

Parameters:

column : unicode

The column to be flattened.

delimiter : unicode (default=None)

The delimiter string. Default is comma (,).

Returns:

: _Unit

Splits each cell in the specified column into multiple rows according to a string delimiter. Each new row is a full copy of the original row, except that the specified column holds a single value from the split. The original row is removed.

Examples

Given a data file:

1-"solo,mono,single"
2-"duo,double"

Load the data into a frame so it can be worked on:

>>> my_csv = CsvFile("original_data.csv", schema=[('a', int32), ('b', str)], delimiter='-')
>>> my_frame = Frame(source=my_csv)

Looking at it:

>>> my_frame.inspect()

  a:int32   b:str
/-------------------------------/
    1       solo,mono,single
    2       duo,double

Now, spread out those sub-strings in column b:

>>> my_frame.flatten_column('b')

Check again:

>>> my_frame.inspect()

  a:int32   b:str
/------------------/
    1       solo
    1       mono
    1       single
    2       duo
    2       double
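
The row-splitting behaviour shown above can be sketched in plain Python, without a frame. This is only an illustration of the semantics, not the library's implementation: the helper name flatten_rows and its column_index argument are hypothetical, and the sketch assumes values are split exactly on the delimiter with no whitespace trimming.

```python
def flatten_rows(rows, column_index, delimiter=","):
    """Hypothetical helper: split one column's cell into one row per value.

    Each output row is a copy of the input row, with the cell at
    column_index replaced by a single piece of the split string.
    """
    flattened = []
    for row in rows:
        # Split the target cell on the delimiter (no whitespace trimming).
        for value in str(row[column_index]).split(delimiter):
            new_row = list(row)               # copy the original row
            new_row[column_index] = value     # keep only one value in the column
            flattened.append(tuple(new_row))
    return flattened


# Same data as the example file, already parsed into (a, b) tuples.
rows = [(1, "solo,mono,single"), (2, "duo,double")]
result = flatten_rows(rows, 1)
for row in result:
    print(row)
```

Passing a different delimiter, for example flatten_rows(rows, 1, ";"), mirrors the delimiter parameter of flatten_column.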
