BUG: Assign with "df.loc[index_value][column_name] = value" fails to assign properly #35743

Closed
@Boris-Molina

Description

  • [Y] I have checked that this issue has not already been reported.

  • [Y] I have confirmed this bug exists on the latest version of pandas.

  • [N] (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

This line of code:

self.strategy.loc[bar]['FundingRate'] = np.log(F1/F0)

fails to assign the proper value. I think it either does nothing (leaving the original NaN from when the DataFrame "strategy" was created) or assigns a NaN. In any case, I added a set of debug prints to capture the input values, using this code:

print('Inputs for np.log F0={} F1={}'.format(F0,F1))
print('Index label bar={},  type={}'.format(bar, type(bar)))
value = np.log(F1/F0)
print('This should be the assigned value={}'.format(value))
self.strategy.loc[bar]['FundingRate'] = value
assert self.strategy.loc[bar]['FundingRate'] == value, 'Error PANDAS fails to assign {}, instead we find {}'.format(value, self.strategy.loc[bar]['FundingRate'])

Problem description

Assignment operations started failing to assign values properly after upgrading from pandas v1.0.4 to v1.0.5. While I need to run my code on Python 3.7/pandas 1.0.5 due to other package dependencies, I also reproduced the problem with Python 3.8 and pandas v1.1.0.

The assignments work if I change to:

df.loc[bar, 'FundingRate'] = value

Or with:

df['FundingRate'].loc[bar] = value
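The first workaround can be sketched on a small frame. This is a minimal illustration, not the reporter's code: the `strategy` frame below is a hypothetical stand-in built only from the column name, index label, and input values shown in this report. The point is that `df.loc[row, column] = value` is a single indexing call on the original frame, so there is no intermediate object for the assignment to land on. The second workaround, `df['FundingRate'].loc[bar] = value`, is still a two-step (chained) assignment, so it is probably safer to prefer the single-call form.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the "strategy" frame; the real one is not shown in the report.
index = pd.date_range("2000-01-03", periods=3, freq="D")
strategy = pd.DataFrame({"FundingRate": np.nan}, index=index)

bar = index[0]                  # a pandas Timestamp, as in the printed output
F0 = F1 = 180.4753051802489     # input values taken from the runtime output
value = np.log(F1 / F0)

# Single-call assignment: row label and column name go into one .loc indexer.
strategy.loc[bar, "FundingRate"] = value

assert strategy.loc[bar, "FundingRate"] == value
```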

I can't reproduce this problem in a simple setting. It only occurs at runtime in a system of more than 6k lines of code that has been working seamlessly with pervasive use of this type of assignment (df.loc[indexvalue][column_name] = value). Also, I can't do a dill.dump because there are TensorFlow objects that are not serializable.
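One possible explanation, offered as an assumption rather than a diagnosis of the reported system: `df.loc[row][col] = value` is chained indexing, and when the frame holds columns of more than one dtype, `df.loc[row]` has to build a new Series, so the assignment lands on that temporary object and the frame keeps its NaN. The sketch below uses a hypothetical two-column frame (the `Signal` column is invented for illustration) to show the silent no-op next to the single-call form.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical mixed-dtype frame: one float column, one string column.
index = pd.date_range("2000-01-03", periods=3, freq="D")
mixed = pd.DataFrame({"FundingRate": np.nan, "Signal": "flat"}, index=index)
bar = index[0]

# Chained assignment: mixed.loc[bar] is a temporary Series here, so the write
# does not reach the frame. pandas may emit a warning; it is silenced for clarity.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    mixed.loc[bar]["FundingRate"] = 0.0

chained_result = mixed.loc[bar, "FundingRate"]   # still NaN

# Single-call assignment writes into the frame itself.
mixed.loc[bar, "FundingRate"] = 0.0
direct_result = mixed.loc[bar, "FundingRate"]

print("after chained assignment:", chained_result)
print("after direct assignment: ", direct_result)
```

Whether the chained form returns a view or a copy depends on the frame's internal layout, which would be consistent with the pattern working in one part of a large code base and failing in another.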

Runtime Output:

The assert fails. This is the printout of the code output:

Inputs for np.log F0=180.4753051802489 F1=180.4753051802489
Index label bar=2000-01-03 00:00:00,  type=<class 'pandas._libs.tslibs.timestamps.Timestamp'>
This should be the assigned value=0.0
Traceback (most recent call last):

  File "/media/WORK/Boris/LEM_Strategy/Software/LEM_Classes/main_nn_seq.py", line 150, in <module>
    run_stats = AA.run_strategy(consensus_type=consensus_type)

  File "/media/WORK/Boris/LEM_Strategy/Software/LEM_Classes/lib/nn_seqstrat.py", line 642, in run_strategy
    date, _ = self.get_date_price(bar, when='_CLOSE')

  File "/media/WORK/Boris/LEM_Strategy/Software/LEM_Classes/lib/backtester_sequential.py", line 152, in get_date_price
    assert self.strategy.loc[bar]['FundingRate'] == value, 'Error PANDAS fails to assign {}, instead we find {}'.format(value, self.strategy.loc[bar]['FundingRate'])

AssertionError: Error PANDAS fails to assign 0.0, instead we find nan

Output of pd.show_versions()

pd.show_versions()

INSTALLED VERSIONS

commit : d9fff27
python : 3.8.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.1.15-surface-linux-surface
Version : #8 SMP Thu Jun 27 12:03:55 EDT 2019
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.1.0
numpy : 1.19.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.2
setuptools : 49.6.0.post20200814
Cython : 0.29.21
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 1.2.9
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.16.1
pandas_datareader: None
bs4 : 4.9.1
bottleneck : None
fsspec : 0.7.4
fastparquet : None
gcsfs : None
matplotlib : 3.2.2
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.0
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : None
numba : 0.50.1

This is the file used to create the environment where the error occurs.

########################################
#
# LEM_Strategy Conda Environment
#
# run: conda env create -f quant_conda_env.yml
#
########################################
name: quant  # COMMENT OUT TO CREATE TEST ENVIRONMENTS (conda env create -n test -f quant_conda_env.yml)
########################################
channels:
  - plotly
  - defaults
  - conda-forge
  #- bjrn        # Channel for google, V20 package (RAY)
########################################
dependencies:
  - python=3.8  # 
  - numpy
  - scipy
  - pandas  #==1.0.4
  - numba
  - cython
  - numexpr
  - statsmodels
  - scikit-learn
  - xlrd
  - xlsxwriter
  - ipywidgets
  - pathos
  - tensorflow-gpu  #==2.1.*
  - keras-gpu
  - tsfresh
  - pytables
  - pyzmq  # ZeroMQ: sockets
# Graphics and Plotting
  - plotly
  - plotly-orca
  - requests
  - matplotlib
  - seaborn
  - cufflinks-py
# Utilities
  - nb_conda_kernels  # For Jupyter
  - spyder-kernels    # Spyder
  - git               # Github support: custom packages and bug forks
# PIP Dependencies (otherwise installed via PIP: Use Conda to install to improve environment integrity over time)
  - yaml                  # For TPQOA (OANDA Wrapper)
  - ujson                 # For TPQOA (OANDA Wrapper)
 # - v20                   # OANDA API V2.0 (from brjn conda channel) (TPQOA installs from PIP))
 # - modin                 # Ray Tutorial  INSTALL MANUALLY
 # - opencv                # Ray Tutorial  INSTALL MANUALLY
 # - gym                   # Ray Tutorial  INSTALL MANUALLY
  - aiohttp               # Ray
  - colorama              # Ray
  - filelock              # Ray
  - redis                 # Ray
  - multidict             # Ray
  - yarl                  # Ray
  - async_timeout         # Ray
  - beautifulsoup4        # Ray
  - soupsieve             # Ray
 # - redis                 # Ray (get from "$pip_deps ray"  shell script. Output '<3.5.0,>=3.3.2') (Ray installs from PIP))
 # - py-spy>=0.2.0         # Ray (get from "$pip_deps ray"  shell script) (Ray installs from PIP))
 # - google                # Ray (from brjn conda channel) (Ray installs from PIP))
# Non-Conda Packages via PIP 
  - pip               # First install PIP PACKAGE MANAGER
  - pip: 
    - "git+git://github.com/yhilpisch/tpqoa.git"      # TPQ OANDA Wrapper
    - "git+git://github.com/yhilpisch/tstables.git"   # Time Series Tables with pandas=1.0 bug fix
    #- "git+git://github.com/tensorflow/model-optimization.git" #  TensorFlow Model Optimization  "import tensorflow_model_optimization as tfmot"
    - cardinality  
    - ray #  Ray: fast and simple framework for building and running distributed applications.

Labels: Indexing (related to indexing on series/frames, not to indexes themselves), Usage Question