pandas - AttributeError: 'DataFrame' object has no attribute 'colmap' in Python


I am a Python beginner trying to use the following code, taken from this source: Portfolio rebalancing with bandwidth method in Python.

The code works well so far.

The problem is that if I call the function not in the usual way, rebalance(df, tol), but from a certain location in the DataFrame onward, like rebalance(df[500:], tol), I get the following error:

AttributeError: 'DataFrame' object has no attribute 'colmap'. So my question is: how do I have to adjust the code to make this possible?

Here is the code:


import datetime as dt
import numpy as np
import pandas as pd
# Note: pandas.io.data was later removed from pandas; its successor is the
# separate pandas-datareader package.
import pandas.io.data as pid


def setup_df():
    df1 = pid.get_data_yahoo("IBM",
                             start=dt.datetime(1970, 1, 1),
                             end=dt.datetime.today())
    df1.rename(columns={'Adj Close': 'ibm'}, inplace=True)

    df2 = pid.get_data_yahoo("F",
                             start=dt.datetime(1970, 1, 1),
                             end=dt.datetime.today())
    df2.rename(columns={'Adj Close': 'ford'}, inplace=True)

    df = df1.join(df2.ford, how='inner')
    df = df[['ibm', 'ford']]
    df['sh ibm'] = 0
    df['sh ford'] = 0
    df['ibm value'] = 0
    df['ford value'] = 0
    df['ratio'] = 0
    # This is useful in conjunction with iloc for referencing column names
    # by index number
    df.colmap = dict([(col, i) for i, col in enumerate(df.columns)])
    return df


def invest(df, i, amount):
    """
    Invest amount dollars evenly between ibm and ford
    starting at ordinal index i.
    This modifies df.
    """
    c = df.colmap
    halfvalue = amount / 2
    df.iloc[i:, c['sh ibm']] = halfvalue / df.iloc[i, c['ibm']]
    df.iloc[i:, c['sh ford']] = halfvalue / df.iloc[i, c['ford']]

    df.iloc[i:, c['ibm value']] = (
        df.iloc[i:, c['ibm']] * df.iloc[i:, c['sh ibm']])
    df.iloc[i:, c['ford value']] = (
        df.iloc[i:, c['ford']] * df.iloc[i:, c['sh ford']])
    df.iloc[i:, c['ratio']] = (
        df.iloc[i:, c['ibm value']] / df.iloc[i:, c['ford value']])


def rebalance(df, tol):
    """
    Rebalance df whenever the ratio falls outside the tolerance range.
    This modifies df.
    """
    i = 0
    amount = 100
    c = df.colmap
    while True:
        invest(df, i, amount)
        mask = (df['ratio'] >= 1 + tol) | (df['ratio'] <= 1 - tol)
        # ignore prior locations where the ratio falls outside the tol range
        mask[:i] = False
        try:
            # Move one index past the first index where mask is True.
            # Note that this means the ratio at i will remain outside the tol range.
            i = np.where(mask)[0][0] + 1
        except IndexError:
            break
        amount = (df.iloc[i, c['ibm value']] + df.iloc[i, c['ford value']])
    return df


df = setup_df()
tol = 0.05  # setting the bandwidth tolerance
rebalance(df, tol)

df['portfolio value'] = df['ibm value'] + df['ford value']
df["ibm_weight"] = df['ibm value'] / df['portfolio value']
df["ford_weight"] = df['ford value'] / df['portfolio value']

print(df['ibm_weight'].min())
print(df['ibm_weight'].max())
print(df['ford_weight'].min())
print(df['ford_weight'].max())

# This shows the rows which trigger rebalancing
mask = (df['ratio'] >= 1 + tol) | (df['ratio'] <= 1 - tol)
print(df.loc[mask])

The problem you encountered is due to a poor design decision on my part. colmap is an attribute defined on df in setup_df:

df.colmap = dict([(col, i) for i, col in enumerate(df.columns)])

It is not a standard attribute of a DataFrame.

df[500:] returns a new DataFrame, generated by copying data from df into the new DataFrame. Since colmap is not a standard attribute, it is not copied into the new DataFrame.
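The behaviour described above can be reproduced on a tiny frame. This is a minimal sketch with made-up numbers in place of the Yahoo data; the `warnings` block is there only because recent pandas versions may warn when a list-like value is assigned as a new attribute.

```python
import warnings

import pandas as pd

# Toy stand-in for the frame returned by setup_df (the values are made up).
df = pd.DataFrame({'ibm': [1.0, 2.0, 3.0], 'ford': [4.0, 5.0, 6.0]})

with warnings.catch_warnings():
    # Silence the warning pandas may emit for ad-hoc attributes.
    warnings.simplefilter("ignore")
    df.colmap = dict((col, i) for i, col in enumerate(df.columns))

sliced = df[1:]  # a new DataFrame object

print(hasattr(df, 'colmap'))      # True
print(hasattr(sliced, 'colmap'))  # False: the ad-hoc attribute is not carried over
```

Accessing `sliced.colmap` would raise the same AttributeError as in the question.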

To call rebalance on a DataFrame other than the one returned by setup_df, replace c = df.colmap wherever it appears (both invest and rebalance use it) with

c = dict([(col, j) for j, col in enumerate(df.columns)])
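As a rough illustration of why this works on any slice: the mapping is derived from the columns, which every DataFrame has, rather than from an attribute that only one particular object carries. The helper name `column_positions` and the toy frame are assumptions made for this sketch; `Index.get_loc` is shown as an alternative built-in lookup.

```python
import pandas as pd

# Toy frame with a few of the column names used in the question.
df = pd.DataFrame(0.0, index=range(5),
                  columns=['ibm', 'ford', 'sh ibm', 'sh ford'])


def column_positions(frame):
    """Map each column name to its ordinal position (hypothetical helper)."""
    return {col: j for j, col in enumerate(frame.columns)}


tail = df[3:]                # a slice has columns, so the map can be rebuilt
c = column_positions(tail)

print(c['sh ibm'])                     # 2
print(tail.columns.get_loc('sh ibm'))  # 2, same position without a dict
```

Rebuilding the dict on every call costs a little time, but it removes the dependence on how the DataFrame was created.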

I've made this change in the original post as well.

PS. In the other question, I had chosen to define colmap on df itself so that this dict would not have to be recomputed with every call to rebalance and invest.

Your question shows me that this minor optimization is not worth making these functions so dependent on the specialness of the DataFrame returned by setup_df.


There is a second problem you will encounter when using rebalance(df[500:], tol):

Since df[500:] returns a copy of a portion of df, rebalance(df[500:], tol) will modify this copy and not the original df. If the object df[500:] has no reference outside of rebalance(df[500:], tol), it will be garbage collected after the call to rebalance is completed, so the entire computation would be lost. Therefore rebalance(df[500:], tol) is not useful.

Instead, you could modify rebalance to accept i as a parameter:

def rebalance(df, tol, i=0):
    """
    Rebalance df whenever the ratio falls outside the tolerance range.
    This modifies df.
    """
    c = dict([(col, j) for j, col in enumerate(df.columns)])
    while True:
        mask = (df['ratio'] >= 1 + tol) | (df['ratio'] <= 1 - tol)
        # ignore prior locations where the ratio falls outside the tol range
        mask[:i] = False
        try:
            # Move one index past the first index where mask is True.
            # Note that this means the ratio at i will remain outside the tol range.
            i = np.where(mask)[0][0] + 1
        except IndexError:
            break
        amount = (df.iloc[i, c['ibm value']] + df.iloc[i, c['ford value']])
        invest(df, i, amount)
    return df

Then you can rebalance df starting at the 500th row using

rebalance(df, tol, i=500)

Note that this finds the first row on or after i=500 which needs rebalancing. It does not rebalance at i=500 itself. This allows you to call rebalance(df, tol, i) for arbitrary i without having to determine in advance whether rebalancing is required on row i.
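To try this without downloading anything, here is a self-contained sketch on made-up prices. It is not the exact code from this post: `make_toy_df` and its numbers are hypothetical, the search uses `np.flatnonzero` on the tail of the mask instead of overwriting `mask[:i]`, a bounds guard has been added, and the frame is assumed to have been invested once (via `invest(df, 0, 100)`) before `rebalance` is called.

```python
import numpy as np
import pandas as pd


def invest(df, i, amount):
    """Split amount evenly between ibm and ford from ordinal row i onward."""
    c = {col: j for j, col in enumerate(df.columns)}
    half = amount / 2
    df.iloc[i:, c['sh ibm']] = half / df.iloc[i, c['ibm']]
    df.iloc[i:, c['sh ford']] = half / df.iloc[i, c['ford']]
    for name in ('ibm', 'ford'):
        df.iloc[i:, c[name + ' value']] = (
            df.iloc[i:, c[name]] * df.iloc[i:, c['sh ' + name]]).to_numpy()
    df.iloc[i:, c['ratio']] = (
        df.iloc[i:, c['ibm value']] / df.iloc[i:, c['ford value']]).to_numpy()


def rebalance(df, tol, i=0):
    """Rebalance one row after each row at or after i whose ratio leaves the band."""
    c = {col: j for j, col in enumerate(df.columns)}
    while True:
        outside = ((df['ratio'] >= 1 + tol) | (df['ratio'] <= 1 - tol)).to_numpy()
        hits = np.flatnonzero(outside[i:])  # only look at rows i, i+1, ...
        if hits.size == 0:
            break
        i += int(hits[0]) + 1               # one past the triggering row
        if i >= len(df):                    # guard added for this sketch
            break
        amount = df.iloc[i, c['ibm value']] + df.iloc[i, c['ford value']]
        invest(df, i, amount)
    return df


def make_toy_df():
    """Hypothetical prices: ibm drifts upward while ford stays flat."""
    df = pd.DataFrame({'ibm': [10, 10, 11, 12, 12, 12],
                       'ford': [10, 10, 10, 10, 10, 10]}, dtype=float)
    for col in ['sh ibm', 'sh ford', 'ibm value', 'ford value', 'ratio']:
        df[col] = 0.0
    return df


df = make_toy_df()
invest(df, 0, 100)        # initial allocation, assumed to happen before rebalance
rebalance(df, 0.05, i=3)
print(df['ratio'].round(3).tolist())
```

With these numbers, rows 2 and 3 are left outside the band: row 2 lies before i=3, and row 3 is the triggering row, so the new allocation only takes effect from row 4.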

