python - Misaligned columns in Pandas dataframe from pandas.ExcelFile import
I have an Excel spreadsheet that contains transactional data. I tried importing it into a pandas DataFrame:
>>> import pandas as pd
>>> xlsfile = pd.ExcelFile("/data/transactions.xls")
>>> data = xlsfile.parse('data')
... and, at first glance, it looked OK. Then I noticed that a column (i.e. 'ship region') that should contain only one of four possible values had values that didn't make sense. Although the values, for the most part, end up in the correct columns, there are thousands of instances where this is not the case:
>>> len(data['ship region'].unique())
5007
Values from neighboring cells are somehow creeping into the wrong columns.
>>> for value in data['ship region'].unique():
...     print(value)
...
americas
emea
apac
nan
ship name
justin bieber
marie curie industries
bks iyengar
[...etc...]
Can anyone see what I'm doing wrong?
That is strange. Which version of pandas are you using?
By the way, you can use pd.read_excel() to do the import in one line.
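To make that concrete, here is a rough sketch rather than a confirmed fix. It assumes a pandas release whose read_excel accepts the sheet_name keyword (older releases spelled it sheetname). The helper names load_sheet and unexpected_rows are made up for illustration. The second helper does not repair anything; it only pulls out the rows whose value is outside the set you expect, so you can compare them against the same rows in the spreadsheet.

```python
import pandas as pd


def load_sheet(path, sheet_name):
    """Read one worksheet into a DataFrame in a single call."""
    # Assumption: this pandas version accepts `sheet_name`
    # (older releases used `sheetname` instead).
    return pd.read_excel(path, sheet_name=sheet_name)


def unexpected_rows(df, column, allowed):
    """Return rows whose `column` value is neither missing nor in `allowed`."""
    values = df[column]
    # Keep rows that have a value, but not one of the allowed values.
    mask = values.notna() & ~values.isin(list(allowed))
    return df[mask]


# Hypothetical usage with the path, sheet and column from the question;
# `allowed_regions` stands for your own set of four valid region values:
# data = load_sheet("/data/transactions.xls", "data")
# bad = unexpected_rows(data, "ship region", allowed_regions)
# print(bad.index[:10])  # row labels to look up in the spreadsheet
```

If the rows it returns cluster around particular spots in the sheet (merged cells, a repeated header row, stray blank cells), that would suggest the problem is in the workbook rather than in the parser.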