pandas - AttributeError: 'DataFrame' object has no attribute 'colmap' in Python
I am a Python beginner and I am trying to use the following code from this source: Portfolio rebalancing with bandwidth method in python.
The code works well so far.
The problem is that I want to call the function not in the usual way, rebalance(df, tol), but from a certain location in the DataFrame on, like rebalance(df[500:], tol). When I do that, I get the following error:

    AttributeError: 'DataFrame' object has no attribute 'colmap'

So my question is: how do I have to adjust the code in order to make this possible?
Here is the code:
    import datetime as dt
    import numpy as np
    import pandas as pd
    import pandas.io.data as pid

    def setup_df():
        df1 = pid.get_data_yahoo("IBM",
                                 start=dt.datetime(1970, 1, 1),
                                 end=dt.datetime.today())
        df1.rename(columns={'Adj Close': 'ibm'}, inplace=True)

        df2 = pid.get_data_yahoo("F",
                                 start=dt.datetime(1970, 1, 1),
                                 end=dt.datetime.today())
        df2.rename(columns={'Adj Close': 'ford'}, inplace=True)

        df = df1.join(df2.ford, how='inner')
        df = df[['ibm', 'ford']]
        df['sh ibm'] = 0
        df['sh ford'] = 0
        df['ibm value'] = 0
        df['ford value'] = 0
        df['ratio'] = 0
        # This is useful in conjunction with iloc for referencing column names
        # by index number
        df.colmap = dict([(col, i) for i, col in enumerate(df.columns)])
        return df

    def invest(df, i, amount):
        """
        Invest amount dollars evenly between ibm and ford
        starting at ordinal index i.
        This modifies df.
        """
        c = df.colmap
        halfvalue = amount/2
        df.iloc[i:, c['sh ibm']] = halfvalue / df.iloc[i, c['ibm']]
        df.iloc[i:, c['sh ford']] = halfvalue / df.iloc[i, c['ford']]

        df.iloc[i:, c['ibm value']] = (
            df.iloc[i:, c['ibm']] * df.iloc[i:, c['sh ibm']])
        df.iloc[i:, c['ford value']] = (
            df.iloc[i:, c['ford']] * df.iloc[i:, c['sh ford']])
        df.iloc[i:, c['ratio']] = (
            df.iloc[i:, c['ibm value']] / df.iloc[i:, c['ford value']])

    def rebalance(df, tol):
        """
        Rebalance df whenever the ratio falls outside the tolerance range.
        This modifies df.
        """
        i = 0
        amount = 100
        c = df.colmap
        while True:
            invest(df, i, amount)
            mask = (df['ratio'] >= 1+tol) | (df['ratio'] <= 1-tol)
            # ignore prior locations where the ratio falls outside tol range
            mask[:i] = False
            try:
                # Move i one index past the first index where mask is True
                # Note that this means the ratio at i will remain outside tol range
                i = np.where(mask)[0][0] + 1
            except IndexError:
                break
            amount = (df.iloc[i, c['ibm value']] + df.iloc[i, c['ford value']])
        return df

    df = setup_df()
    tol = 0.05  # setting the bandwidth tolerance
    rebalance(df, tol)

    df['portfolio value'] = df['ibm value'] + df['ford value']
    df["ibm_weight"] = df['ibm value']/df['portfolio value']
    df["ford_weight"] = df['ford value']/df['portfolio value']

    print(df['ibm_weight'].min())
    print(df['ibm_weight'].max())
    print(df['ford_weight'].min())
    print(df['ford_weight'].max())

    # This shows the rows which trigger rebalancing
    mask = (df['ratio'] >= 1+tol) | (df['ratio'] <= 1-tol)
    print(df.loc[mask])
The problem you encountered is due to a poor design decision on my part. colmap is an attribute defined on df in setup_df:

    df.colmap = dict([(col, i) for i, col in enumerate(df.columns)])

It is not a standard attribute of a DataFrame. df[500:] returns a new DataFrame built from the data in df. Since colmap is not a standard attribute, it is not copied to the new DataFrame.
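A small sketch of this behaviour, using a toy DataFrame rather than the stock data (the column values and the 1000-row size are made up for illustration):

```python
import warnings

import numpy as np
import pandas as pd

df = pd.DataFrame({"ibm": np.arange(1000.0), "ford": np.arange(1000.0)})

# Attach an ad hoc attribute, as setup_df does. pandas may emit a
# UserWarning for this kind of assignment, so it is silenced for the demo.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.colmap = {col: i for i, col in enumerate(df.columns)}

sliced = df[500:]

print(hasattr(df, "colmap"))      # the original object still carries the attribute
print(hasattr(sliced, "colmap"))  # the new DataFrame does not
```

Accessing sliced.colmap at this point raises the AttributeError from the question.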
To call rebalance on a DataFrame other than the one returned by setup_df, replace c = df.colmap (in both rebalance and invest) with

    c = dict([(col, j) for j, col in enumerate(df.columns)])
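As a hedged illustration, the mapping can be rebuilt from whatever DataFrame is passed in. The helper name column_positions is made up for this sketch; for a single, unique column name, pandas' own df.columns.get_loc returns the same position.

```python
import pandas as pd

def column_positions(df):
    """Map each column name to its ordinal position (hypothetical helper)."""
    return {col: j for j, col in enumerate(df.columns)}

df = pd.DataFrame({
    "ibm": [1.0, 2.0, 3.0],
    "ford": [4.0, 5.0, 6.0],
    "ratio": [0.0, 0.0, 0.0],
})

# Works on any slice, because nothing depends on a special attribute.
c = column_positions(df[1:])

print(c)
print(df.iloc[1, c["ford"]])
```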
I've made this change in the original post as well.

PS. In the other question, I had chosen to define colmap on df itself so that this dict would not have to be recomputed with every call to rebalance and invest. Your question shows me that this minor optimization is not worth making these functions so dependent on the specialness of the DataFrame returned by setup_df.
There is a second problem you would encounter using rebalance(df[500:], tol): since df[500:] returns a copy of a portion of df, rebalance(df[500:], tol) will modify this copy and not the original df. If the object df[500:] has no reference outside of rebalance(df[500:], tol), it will be garbage collected after the call to rebalance is completed, so the entire computation would be lost. Therefore rebalance(df[500:], tol) is not useful.
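If you really did want to work on a separate piece of the data, the result only survives if you keep a reference to that piece. Here is a minimal sketch, with a made-up add_one function standing in for rebalance and an explicit .copy() (my addition) so that the piece is unambiguously independent of df:

```python
import pandas as pd

def add_one(frame):
    """Stand-in for rebalance: modifies frame in place and returns it."""
    frame["value"] = frame["value"] + 1.0
    return frame

df = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0]})

tail = df[2:].copy()   # keep a name for the piece, otherwise the result is lost
add_one(tail)

print(tail["value"].tolist())  # the piece was modified
print(df["value"].tolist())    # the original was not
```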
Instead, you could modify rebalance to accept i as a parameter:
    def rebalance(df, tol, i=0):
        """
        Rebalance df whenever the ratio falls outside the tolerance range.
        This modifies df.
        """
        c = dict([(col, j) for j, col in enumerate(df.columns)])
        while True:
            mask = (df['ratio'] >= 1+tol) | (df['ratio'] <= 1-tol)
            # ignore prior locations where the ratio falls outside tol range
            mask[:i] = False
            try:
                # Move i one index past the first index where mask is True
                # Note that this means the ratio at i will remain outside tol range
                i = np.where(mask)[0][0] + 1
            except IndexError:
                break
            amount = (df.iloc[i, c['ibm value']] + df.iloc[i, c['ford value']])
            invest(df, i, amount)
        return df
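To try the modified function without a data download (pandas.io.data no longer ships with pandas), here is a self-contained sketch on synthetic prices. Several things in it are my assumptions rather than part of the code above: the make_prices helper and its random-walk prices are made up, the bookkeeping columns start as 0.0 so that they are float from the outset, the search uses np.flatnonzero on a slice instead of mutating the mask, and there is an extra guard for the case where the row after a trigger would fall past the end of the frame.

```python
import numpy as np
import pandas as pd

def make_prices(n=60, seed=1):
    """Synthetic stand-in for the Yahoo download (made-up random-walk prices)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "ibm": 100 * np.cumprod(1 + rng.normal(0, 0.03, n)),
        "ford": 10 * np.cumprod(1 + rng.normal(0, 0.03, n)),
    })
    for col in ["sh ibm", "sh ford", "ibm value", "ford value", "ratio"]:
        df[col] = 0.0  # float from the start, so later float writes keep the dtype
    return df

def invest(df, i, amount):
    """Invest amount evenly between ibm and ford from ordinal row i onward."""
    c = {col: j for j, col in enumerate(df.columns)}
    half = amount / 2
    df.iloc[i:, c["sh ibm"]] = half / df.iloc[i, c["ibm"]]
    df.iloc[i:, c["sh ford"]] = half / df.iloc[i, c["ford"]]
    df.iloc[i:, c["ibm value"]] = (
        df.iloc[i:, c["ibm"]] * df.iloc[i:, c["sh ibm"]]).to_numpy()
    df.iloc[i:, c["ford value"]] = (
        df.iloc[i:, c["ford"]] * df.iloc[i:, c["sh ford"]]).to_numpy()
    df.iloc[i:, c["ratio"]] = (
        df.iloc[i:, c["ibm value"]] / df.iloc[i:, c["ford value"]]).to_numpy()

def rebalance(df, tol, i=0):
    """Rebalance df from row i on, whenever the ratio leaves the tolerance band."""
    c = {col: j for j, col in enumerate(df.columns)}
    while True:
        ratio = df["ratio"].to_numpy()
        outside = (ratio >= 1 + tol) | (ratio <= 1 - tol)
        hits = np.flatnonzero(outside[i:])   # offsets of triggers at or after row i
        if hits.size == 0:
            break
        i = i + int(hits[0]) + 1             # one row past the first trigger
        if i >= len(df):                     # guard added for this sketch
            break
        amount = df.iloc[i, c["ibm value"]] + df.iloc[i, c["ford value"]]
        invest(df, i, amount)
    return df

df = make_prices()
invest(df, 0, 100.0)         # initial investment over the whole frame
rebalance(df, 0.05, i=20)    # only look for rebalancing from row 20 on
print(df[["ibm value", "ford value", "ratio"]].tail())
```

Because the column map is computed inside each function, the same calls work on any DataFrame with these column names, not only on one prepared by a particular setup function.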
Then you can rebalance df starting at the 500th row using

    rebalance(df, tol, i=500)
Note that this finds the first row on or after i=500 that needs rebalancing. It does not necessarily rebalance at i=500 itself. This allows you to call rebalance(df, tol, i) for arbitrary i without having to determine in advance if rebalancing is required on row i.
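The search step can be looked at in isolation. This is only a sketch on a hand-written ratio array; the helper name next_invest_row is hypothetical and not part of the code above.

```python
import numpy as np

def next_invest_row(ratio, tol, i):
    """Return the row rebalance would invest at next, or None (hypothetical helper)."""
    mask = (ratio >= 1 + tol) | (ratio <= 1 - tol)
    mask[:i] = False                  # ignore rows before i
    hits = np.where(mask)[0]
    return int(hits[0]) + 1 if hits.size else None

ratio = np.array([1.0, 1.2, 1.0, 1.0, 0.9, 1.0, 1.1, 1.0])
print(next_invest_row(ratio, tol=0.05, i=2))  # prints 5
```

With i=2, the out-of-range ratio at row 1 is ignored, row 4 is the first row at or after i that falls outside the band, and the new investment is made one row later, at row 5.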