Algorithmic trading: using a loop to create data frames, exporting rows by option contract name to CSV, and calculating the dataset's highs and lows

My goal: collect option chain data from yahoo_fin and compile the data for each "Contract Name" row so that I can calculate 14-day highs and lows.

My plan: use a loop to create an option chain data frame each day, export a CSV for each "Contract Name" row so I can compile the data points, then use `.rolling(window=14).max()` to calculate the highs and `.min()` for the lows.

My problem: I am able to create each data frame, but I don't know how to export each row to a CSV based on its "Contract Name", and I need new data appended to the previous data points (not overwriting them). I'm also not sure this is the best approach, so please suggest a better solution if there is one.
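
For reference, the rolling high/low calculation from the plan above can be sketched on a single contract's Mark series in isolation (the prices here are made up for illustration):

```python
import pandas as pd

# Illustrative Mark prices for one contract, one row per trading day
marks = pd.Series([10.0, 11.5, 9.8, 12.0, 13.1, 12.4, 11.0,
                   10.5, 12.2, 13.0, 12.8, 11.9, 12.5, 14.0, 13.2])

window = 14  # 14-day lookback
period_high = marks.rolling(window=window).max()
period_low = marks.rolling(window=window).min()

# The first window-1 values are NaN until enough history has accumulated
print(period_high.iloc[-1], period_low.iloc[-1])  # -> 14.0 9.8
```

The NaN warm-up rows mean each contract needs 14 days of appended data before the first high/low appears.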

My code is below:

I am using a loop to pull the option chain data and create a data frame for each ticker in the list defined in the variable:

    import pandas as pd
    from yahoo_fin import options
    
    listshare = ('GOOGL','NFLX')
    
    length_1 = len(listshare)
    i = 0
    
    while i < length_1:
        print(listshare[i] + " is uploading data")
       
        locals()[str(listshare[i])+"_df"] = options.get_calls(listshare[i])
        
        i += 1
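
As an aside, creating variables through `locals()` is fragile; a dictionary keyed by ticker does the same job and is easier to loop over later. A minimal sketch, using a stub in place of the real `options.get_calls` download so it runs standalone:

```python
import pandas as pd

def get_calls_stub(ticker):
    """Stand-in for yahoo_fin's options.get_calls, which returns a DataFrame."""
    return pd.DataFrame({'Bid': [1.0, 2.0], 'Ask': [1.2, 2.4]})

listshare = ('GOOGL', 'NFLX')

# Store each ticker's option chain under its name instead of in locals()
frames = {ticker: get_calls_stub(ticker) for ticker in listshare}

print(frames['NFLX'])          # access one chain by ticker
for ticker, df in frames.items():
    print(ticker, len(df))     # iterate over all of them
```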

Then I calculate the midpoint (Mark) with the following code. Mark is what I need to compute the highs and lows from:

    from datetime import date

    listshare = ('GOOGL','NFLX')
    
    df_list = (GOOGL_df,NFLX_df)
    
    length_2 = len(df_list)
    
    i = 0
    
    while i < length_2:
        
        ##Generate variables
        df_list[i]['Ticker'] = listshare[i]
        today = date.today()
        bid = df_list[i]['Bid']
        ask = df_list[i]['Ask']
        df_list[i]['Mark'] = ask - ((ask-bid)/2)
        mark = df_list[i]['Mark']
        df_list[i]['Date'] = today
        
        
        i += 1

This is what the data frame looks like:

    Ticker  Date    Contract Name   Strike  Mark    Bid Ask % Change    Open Interest   Implied Volatility
    0   NFLX    2021-04-28  NFLX210430C00270000 270.0   236.750 235.15  238.35  -   0   284.77%
    1   NFLX    2021-04-28  NFLX210430C00300000 300.0   206.750 205.15  208.35  -   2   240.63%
    2   NFLX    2021-04-28  NFLX210430C00380000 380.0   126.750 125.15  128.35  -   3   140.23%
    3   NFLX    2021-04-28  NFLX210430C00385000 385.0   121.750 120.15  123.35  -15.39% 2   134.57%
    4   NFLX    2021-04-28  NFLX210430C00400000 400.0   106.875 105.40  108.35  -   30  125.49%

Thanks for your insight.

Update: I have figured out how to create individual CSV files for each "Contract Name" and append to them with the code below. My current problem is that I cannot access the previous "Mark" column data to generate new columns identifying the lowest price over a given period, or the price change per period. I read online that this may happen because a CSV stores comma-separated values, and I tried `.astype(float)` without success. Thanks for your insight!

    from yahoo_fin import options
    from datetime import date
    import pandas as pd
    import os
    import csv

    today = date.today()

    listshare = ('GOOGL','NFLX')
    length_1 = len(listshare)
    i = 0

    # Iterate using a loop
    while i < length_1:
        exp_dates = options.get_expiration_dates(listshare[i])

        info = {}

        for date in exp_dates:
            print(str(listshare[i] + ' calls: ' + date))
            info[date] = options.get_calls(listshare[i],date)
            info[date][['Date']] = today
            info[date][['Expiration']] = date
            info[date].set_index('Date',inplace = True)

            #variables
            info[date]['Ticker'] = listshare[i]
            info[date]['Bid'] = info[date]['Bid'].astype(float)
            info[date]['Ask'] = info[date]['Ask'].astype(float)
            bid = info[date]['Bid'].astype(float)
            ask = info[date]['Ask'].astype(float)
            info[date]['Mark'] = ask - ((ask-bid)/2)
            info[date][['Mark']] = info[date]['Mark'].astype(float)
            mark = info[date][['Mark']]
            info[date][['Period Low']] = mark

            relevant_1 = info[date][info[date]["Bid"] > 0.04]
            relevant = relevant_1[relevant_1['Strike'] %10 == 0]

            rel = relevant[['Ticker','Contract Name','Expiration','Strike','Bid','Mark','Ask','Period Low']]

            #print(rel)

            groupby = rel.groupby('Contract Name')

            for n,g in groupby:
                csv_filename = "{}.csv".format(n)
                csv = g.to_csv(index=True)
                #print(csv_filename) - "Contract Name.csv"
                #print(g) - "Data points (data frame) grouped by Contract name"
                #print(n) - "Contract Name"

                #check if file exists; if so append, if not create
                if os.path.exists(csv_filename):

                    #open file and append current day's options chain
                    with open(csv_filename,'a') as csvfile:
                        print('Opening & Appending: ' + str(csv_filename))
                        g.to_csv(csv_filename,mode='a',header=False)
                        csvfile.close()
                        print('Done.')

                    #convert csv to df
                    df = pd.read_csv(csv_filename)

                    #define variables
                    info[date]['Ticker'] = listshare[i]
                    low_period = 14
                    df[['Mark']] = df['Mark'].astype(float)
                    mark = df[['Mark']].astype(float)

                    #append dataframe (add current day's data & add 14d low column)
                    print('Converting ' + str(csv_filename) + ' to data frame.')
                    df[['Mark']] = df[['Mark']].astype(float)
                    mark = df[['Mark']].astype(float)

                    df[['Period Low']] = mark.rolling(window = low_period).min()
                    df[['Period Low']] = df[['Period Low']].astype(float)
                    period_low = df[['Period Low']].astype(float)

                    df[['Period High']] = mark.rolling(window = low_period).max()
                    df[['Period High']] = df[['Period High']].astype(float)
                    period_high = df[['Period High']].astype(float)

                    df[['Previous Low']] = df[['Period Low']].shift(+1)
                    previous_low = df[['Previous Low']]

                    df[['Mark Delta %']] = ((mark - (mark.shift(+1)))/mark.shift(+1))
                    df[['Mark Delta %']] = df[['Mark Delta %']].astype(float)
                    mark_delta = df[['Mark Delta %']].astype(float)

                    df.set_index('Date',inplace = True)

                    #overwrite existing csv from dataframe
                    print('Updating ' + str(csv_filename) + ' from data frame.')
                    with open(csv_filename,'w') as out_file:
                        df.to_csv(csv_filename)
                        print(str(csv_filename) + ' is complete.')

                else:

                    #Create new csv file
                    print('Creating: ' + str(csv_filename))
                    g.to_csv(csv_filename)

        i += 1

P.S. I also need a process to overwrite rows that have the same Date value.
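
Regarding the P.S.: after reading a contract's CSV back, rows that share the same Date can be collapsed by keeping only the latest one, e.g. with `drop_duplicates`. A minimal sketch, with column names matching the data frame shown above and made-up values:

```python
import pandas as pd

# Two rows for 2021-04-28: the second (later) upload should win
df = pd.DataFrame({
    'Date': ['2021-04-27', '2021-04-28', '2021-04-28'],
    'Contract Name': ['NFLX210430C00270000'] * 3,
    'Mark': [230.00, 236.75, 237.10],
})

# Keep the last row written for each (Contract Name, Date) pair
df = df.drop_duplicates(subset=['Contract Name', 'Date'], keep='last')
print(df)
```

Run after appending and before computing the rolling columns, the duplicate rows never reach the window calculation.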

Solution

Take this with a grain of salt, but I spent some time on this and have a few observations/suggestions. My code is below (based on what I believe you need). You will have to verify that the end result is what you want, because some of your code was difficult to follow.

While I kept your logic of creating individual files, I think it is a lot of overhead compared to using a single file/data frame where that makes sense. Paging through files like this is very time-consuming when you want to analyze the data or look for signals/alerts.

The module you are using outputs data frames, and you should just work with those. The simplest approach is to read the expiration dates, concatenate the resulting frames, and then operate on them. You will see my code doing exactly that.

There is no need for while loops; just use a list and a for statement. You will see in my code that I limit the number of elements by slicing (for testing). You can remove those slices.

There is no need to define/redefine all those Series variables. This may just be personal preference, but I removed all of them.

Your stumbling block with the floats is not a CSV or parsing issue; once you dig in, you will see that where Bid and Ask have no value, they contain a dash (-). Once you replace that and convert to float, you don't need to mess with all those conversions.
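
The dash problem can be reproduced and fixed in isolation (the values here are made up, mimicking the data frame shown in the question):

```python
import pandas as pd

# Bid/Ask columns as yahoo_fin can return them: dashes where no quote exists
df = pd.DataFrame({'Bid': ['235.15', '-', '125.15'],
                   'Ask': ['238.35', '-', '128.35']})

# .astype(float) alone raises ValueError on the '-' entries;
# replace them first, then convert
df[['Bid', 'Ask']] = df[['Bid', 'Ask']].replace('-', '0.0').astype(float)
print(df.dtypes)  # both columns are now float64
```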

Anyway, you have a good process going; this should tighten it up a bit.

    from datetime import date, timedelta
    from yahoo_fin import options
    import pandas as pd
    import os

    today = date.today() + timedelta(days=0) # change to negative days to get data from previous days; zero is today
    listshare = ['GOOGL','NFLX']
    for symbol in listshare:
        exp_dates = options.get_expiration_dates(symbol)
        df_hold_list = []
        for expdate in exp_dates:
            # print(str(symbol + ' calls: ' + expdate))
            df = options.get_calls(symbol,expdate)
            df['Date'] = today
            df['Expiration'] = expdate
            df['Ticker'] = symbol
            df_hold_list.append(df)
        dff = pd.concat(df_hold_list)

        # do calculations and reduce dataframe
        dff[['Bid','Ask']] = dff[['Bid','Ask']].replace('-','0.0').astype(float)
        dff = dff[dff["Bid"] > 0.04]
        dff = dff[dff['Strike'] %10 == 0]
        dff['Mark'] = dff['Ask'] - ((dff['Ask']-dff['Bid'])/2)
        dff['Period Low'] = dff['Mark']
        dff.set_index('Date',inplace=True)
        # use copy() as sometimes pandas will throw errors about modifying a slice of a dataframe
        dff = dff[['Ticker','Contract Name','Expiration','Strike','Bid','Mark','Ask','Period Low']].copy()
        print(dff)

        groupby = dff.groupby('Contract Name')
        for n,g in list(groupby)[0:2]:
            csv_filename = "{}.csv".format(n)
            #check if file exists; if so append, if not create
            if os.path.exists(csv_filename):
                #open file and append current day's options chain
                with open(csv_filename,'a') as csvfile:
                    print('Opening & Appending: ' + str(csv_filename))
                    g.to_csv(csv_filename,mode='a',header=False)
                    print('Done.')

                #convert csv to df
                df = pd.read_csv(csv_filename)

                #define variables
                low_period = 14

                df['Previous Low'] = df['Period Low'].shift(+1)

                # check these calculations??? not sure i have them correct
                df['Period Low'] = df['Mark'].rolling(window = low_period).min()
                df['Period High'] = df['Mark'].rolling(window = low_period).max()
                df['Mark Delta %'] = ((df['Mark'] - (df['Mark'].shift(+1)))/df['Mark'].shift(+1))
                df.set_index('Date',inplace = True)

                #overwrite existing csv from dataframe
                print('Updating ' + str(csv_filename) + ' from data frame.')
                df.to_csv(csv_filename)
                print(str(csv_filename) + ' is complete.')

            else:
                #Create new csv file
                print('Creating: ' + str(csv_filename))
                g.to_csv(csv_filename)
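
If you do drop the per-contract files, one master table plus a groupby gives you the rolling stats in a single pass. A sketch under the same column assumptions, with made-up data (the window is 2 here only so the toy data produces values; it would be 14 in practice):

```python
import pandas as pd

# Illustrative master table: all contracts, all dates, in one place
master = pd.DataFrame({
    'Date': ['2021-04-27', '2021-04-28', '2021-04-27', '2021-04-28'],
    'Contract Name': ['NFLX210430C00270000', 'NFLX210430C00270000',
                      'GOOGL210430C02300000', 'GOOGL210430C02300000'],
    'Mark': [230.00, 236.75, 105.50, 108.25],
})

window = 2
grouped = master.groupby('Contract Name')['Mark']
# transform keeps the result aligned with the original rows,
# so each contract gets its own rolling window
master['Period High'] = grouped.transform(lambda s: s.rolling(window).max())
master['Period Low'] = grouped.transform(lambda s: s.rolling(window).min())
print(master)
```

One append per day to a single CSV then replaces the whole per-contract file loop.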
