ENH: A new method that will more efficiently display 'tall' df #42837

Closed
@adamrossnelson

Description

Is your feature request related to a problem?

I wish there were a way to preserve vertical screen space when I'm inspecting 'tall' data frames. These are data frames with many observations but few columns. Thus, they're tall. I use the word 'tall' to avoid conflating this with long-format data.

Describe the solution you'd like

I believe this function accomplishes the goal:

import pandas as pd

def insp(df, n=5, parts='ht'):
  # Show the head, a random sample, and/or the tail of df side by side.
  # reset_index() moves the index into a column (named 'index' when the
  # index is unnamed), which is then renamed to 'loc'.
  if parts == 'ht':
    display = pd.concat([df.head(n).reset_index().rename(columns={'index':'loc'}),
                         df.tail(n).reset_index().rename(columns={'index':'loc'})],
                        axis=1,
                        keys=['head','tail'])
  elif parts == 'hs':
    display = pd.concat([df.head(n).reset_index().rename(columns={'index':'loc'}),
                         df.sample(n).reset_index().rename(columns={'index':'loc'})],
                        axis=1,
                        keys=['head','sample'])
  elif parts == 'st':
    display = pd.concat([df.sample(n).reset_index().rename(columns={'index':'loc'}),
                         df.tail(n).reset_index().rename(columns={'index':'loc'})],
                        axis=1,
                        keys=['sample','tail'])
  elif parts == 'hst':
    display = pd.concat([df.head(n).reset_index().rename(columns={'index':'loc'}),
                         df.sample(n).reset_index().rename(columns={'index':'loc'}),
                         df.tail(n).reset_index().rename(columns={'index':'loc'})],
                        axis=1,
                        keys=['head','sample','tail'])
  else:
    # Previously an unrecognized value left `display` undefined.
    raise ValueError("parts must be one of 'ht', 'hs', 'st', 'hst'")
  return display
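
To make the shape of the result concrete, here is a minimal, self-contained sketch of the same idea applied to a toy frame. The name `insp_sketch` and the letter-to-method lookup are my own illustrative assumptions, not part of pandas or of the function above; the sketch only uses `head`, `sample`, `tail`, `reset_index`, `rename`, and `pd.concat`.

```python
import pandas as pd

def insp_sketch(df, n=5, parts="ht"):
    """Hypothetical condensed variant: one side-by-side block per letter in `parts`."""
    # Map each letter to a column-group label and the DataFrame method that selects rows.
    pickers = {"h": ("head", df.head), "s": ("sample", df.sample), "t": ("tail", df.tail)}
    unknown = set(parts) - set(pickers)
    if unknown:
        raise ValueError(f"unknown parts: {sorted(unknown)}")
    blocks, keys = [], []
    for letter in parts:
        label, pick = pickers[letter]
        # Keep the original row labels in a 'loc' column (assumes an unnamed index).
        blocks.append(pick(n).reset_index().rename(columns={"index": "loc"}))
        keys.append(label)
    # axis=1 places the blocks next to each other; keys become the top column level.
    return pd.concat(blocks, axis=1, keys=keys)

# A toy 'tall' frame: 100 rows, 2 columns.
tall = pd.DataFrame({"x": range(100), "y": [v * v for v in range(100)]})
view = insp_sketch(tall, n=3, parts="ht")
print(view)
```

With `n=3` and `parts="ht"`, the result has three rows and six columns: `loc`, `x`, and `y` under `head`, and the same three under `tail`.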

API breaking implications

Not applicable and/or unsure.

Describe alternatives you've considered

The function above works well in my testing. Try it out. This Colab notebook demonstrates it:

https://colab.research.google.com/drive/1mcNLlG6RVbhoXGCMdVyUryxKSgcaRPNC?usp=sharing

But I'd propose a new method, something like pandas.DataFrame.inspect() or pandas.DataFrame.insp(), to complement the .head(), .sample(), and .tail() methods.
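
As a rough illustration of how such a method could be prototyped without changing pandas itself, the sketch below uses the existing `pd.api.extensions.register_dataframe_accessor` hook. The accessor name `insp` and the class `InspAccessor` are hypothetical, and this version only covers the head-and-tail case; it is not a proposed implementation.

```python
import pandas as pd

# "insp" is a hypothetical accessor name used for illustration; pandas does not ship it.
@pd.api.extensions.register_dataframe_accessor("insp")
class InspAccessor:
    def __init__(self, pandas_obj):
        self._df = pandas_obj

    def __call__(self, n=5):
        """Return the first and last n rows side by side."""
        def block(part):
            # Keep the original row labels in a 'loc' column (assumes an unnamed index).
            return part.reset_index().rename(columns={"index": "loc"})
        return pd.concat(
            [block(self._df.head(n)), block(self._df.tail(n))],
            axis=1,
            keys=["head", "tail"],
        )

df = pd.DataFrame({"value": range(50)})
out = df.insp(n=2)
print(out)
```

Once the accessor is registered, `df.insp(n=2)` reads much like `df.head(2)`, which is the ergonomics the proposal is after.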

Additional Context

Inspired by the first suggestion in this article:
https://towardsdatascience.com/pandas-hacks-that-i-wish-i-had-when-i-started-out-1f942caa9792
(See the first 'hack' of the three).

Metadata

Labels

Enhancement; Needs Discussion (requires discussion from core team before further action); Output-Formatting (__repr__ of pandas objects, to_string)
