BeautifulSoup web scrape of .asp pages only returns the last item in the list

Problem description

import pickle

import bs4 as bs
import requests

def get_NYSE_tickers():
    an = ['A','B','C','D','E','F','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z','0']

    for value in an:
        resp = requests.get(
            'https://www.advfn.com/nyse/newyorkstockexchange.asp?companies={}'.format(value))
        soup = bs.BeautifulSoup(resp.text, 'lxml')
        table = soup.find('table', class_='market tab1')
        tickers = []
        for row in table.findAll('tr', class_='ts1')[0:]:
            ticker = row.findAll('td')[1].text
            tickers.append(ticker)
        for row in table.findAll('tr', class_='ts0')[0:]:
            ticker = row.findAll('td')[1].text
            tickers.append(ticker)
        with open("NYSE.pickle", "wb") as f:
            while "" in tickers:
                tickers.remove("")
            pickle.dump(tickers, f)

    print(tickers)


get_NYSE_tickers()

My problem is that when I run this script, the output contains only the data from the "0" page. It is always the last value in the list.

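I would also like to know whether there is a way to combine

for row in table.findAll('tr',)[0:]:
    ticker = row.findAll('td')[1].text
    tickers.append(ticker)

into a single block of code that handles both class_='ts0' and class_='ts1'. I can't seem to work out how.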

I would like to collect the ticker symbols from all of the pages, https://www.advfn.com/nyse/newyorkstockexchange.asp?companies=A, https://www.advfn.com/nyse/newyorkstockexchange.asp?companies=B, https://www.advfn.com/nyse/newyorkstockexchange.asp?companies=C and so on, in a single pickle or CSV file:

['AVX', 'AHC', 'RNT', 'AAN', 'AXF', 'DVK', 'RCW', 'SAD', 'ABB', 'ANF', 'ABM', 'IMW', 'SZM', 'SZI', 'ICT', 'ACN', 'ABD', 'ATN', 'AYI', 'AEA', 'ASX', 'ACM', 'AEG', 'AEB', 'AEH', 'AET', 'AMG', 'AG', 'A', 'ADC', 'AGU', 'APD', 'ARG', 'AQD', 'ALZ', 'ALF', 'ALK', 'ALB', 'ALU', 'ACL', 'AXB', 'ARE', 'AYE', 'AGN', 'AMO', 'ADS', 'AOI', 'AZM', 'AIB', 'ALY', 'ALM', 'ALJ', 'AWP', 'AMB', 'AKT', 'ACO', 'HES', 'AMX', 'ACC', 'AEP', 'AXP', 'AFE', 'AIG', 'AVF', 'AOB', 'ARP', 'AWR', 'AVD', 'ACF', 'AGP', 'AMN', 'AHS', 'AP', 'AXR', 'AU', 'NLY', 'ATV', 'ANH', 'APA', 'AIT', 'WTR', 'ARB', 'ARJ', 'ADM', 'AWI', 'ARM', 'AHT', 'AHL', 'AGO', 'AZN', 'ATO', 'ATT', 'AUO', 'AN', 'NEH', 'AVR', 'AXA', 'AZZ', 'AKS', 'AAR', 'AIR', 'RNT.A', 'AAN.A', 'CBJ', 'SQT', 'IWK', 'EOA', 'REU', 'MHG', 'ABT', 'AKR', 'BJV', 'ODY', 'RDF', 'MKY', 'BFN', 'ACE', 'ATU', 'ADX', 'ASF', 'AAP', 'AEO', 'AEV', 'AED', 'AER', 'AES', 'ACS', 'AFL', 'AGCO', 'AEM', 'GRO', 'NOW', 'AYR', 'AAI', 'ALQ', 'ABA', 'ALG', 'AIN', 'ACV', 'AA', 'AFN', 'ALX', 'Y', 'ATI', 'ALE', 'AB', 'AZ', 'AFC', 'ALL', 'ANR', 'MO', 'ACH', 'ABK', 'AKF', 'AEE', 'AXL', 'ADY', 'AEL', 'AFG', 'AM', 'AFF', 'ANL', 'ARL', 'ASI', 'AMT', 'AWK', 'APU', 'ABC', 'AME', 'AMP', 'APH', 'APC', 'AGL', 'AXE', 'AHR', 'AOC', 'AIV', 'ATR', 'ARA', 'ABR', 'ACI', 'ARD', 'ARW', 'ABG', 'ASH', 'ALC', 'AIZ', 'AF', 'ATG', 'AHD', 'T', 'ATW', 'ALV', 'AZO', 'AVB', 'AVY', 'AVA', 'AVP', 'AXS', ...]

Solution

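I changed resp.text to resp.content and was able to print everything. The underlying problem is that the list is emptied on every pass: tickers = [] sits inside the for loop, so it is reset for each letter, and NYSE.pickle is reopened in "wb" mode on each pass, which overwrites what the previous page wrote. Move tickers = [] outside the loop (or, if you only want to see the results, print inside the loop on each iteration).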
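To make the difference concrete without touching the network, here is a minimal sketch of the two loop shapes. The PAGES dictionary and its ticker strings are placeholder data, and the helper names and the NYSE.csv file name are hypothetical (only NYSE.pickle comes from the question). The last helper also shows one way to write the full list once, after the loop, to both a pickle and a one-column CSV.

```python
import csv
import pickle
import tempfile
from pathlib import Path

# Placeholder data standing in for the tickers scraped from each letter's page.
PAGES = {"A": ["A1", "A2"], "B": ["B1"], "0": ["01"]}


def collect_reset_each_time(pages):
    """Mirrors the question: the list is re-created for every page."""
    for letter in pages:
        tickers = []  # everything gathered from earlier pages is discarded here
        tickers.extend(pages[letter])
    return tickers


def collect_all(pages):
    """Creates the list once, so the tickers from every page are kept."""
    tickers = []
    for letter in pages:
        tickers.extend(pages[letter])
    return tickers


def save_tickers(tickers, directory):
    """Writes the full list once, as a pickle and as a one-column CSV."""
    directory = Path(directory)
    pickle_path = directory / "NYSE.pickle"
    csv_path = directory / "NYSE.csv"  # hypothetical file name
    with open(pickle_path, "wb") as f:
        pickle.dump(tickers, f)
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows([t] for t in tickers)
    return pickle_path, csv_path


print(collect_reset_each_time(PAGES))  # ['01']
all_tickers = collect_all(PAGES)
print(all_tickers)  # ['A1', 'A2', 'B1', '01']

with tempfile.TemporaryDirectory() as tmp:
    pickle_path, _ = save_tickers(all_tickers, tmp)
    with open(pickle_path, "rb") as f:
        print(pickle.load(f) == all_tickers)  # True
```

The first function returns only the last page's data, which matches the symptom described in the question.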

The corrected function, with the list created once before the loop:

from bs4 import BeautifulSoup
import requests

def get_NYSE_tickers():
    # Note: 'G' does not appear in the original list; add it if that page is wanted too.
    an = ['A','B','C','D','E','F','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z','0']
    tickers = []  # created once, so results from every page accumulate
    for value in an:
        resp = requests.get(
            'https://www.advfn.com/nyse/newyorkstockexchange.asp?companies={}'.format(value))
        soup = BeautifulSoup(resp.content, 'lxml')
        table = soup.find('table', class_='market tab1')
        for row in table.find_all('tr', class_='ts1'):
            ticker = row.find_all('td')[1].text
            tickers.append(ticker)
        for row in table.find_all('tr', class_='ts0'):
            ticker = row.find_all('td')[1].text
            tickers.append(ticker)
    print(tickers)  # inside the function, after all pages have been read
    return tickers

get_NYSE_tickers()
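On the second part of the question (handling 'ts0' and 'ts1' in one block), one option is to pass a list to class_, which Beautiful Soup treats as "match any of these classes". The sketch below is not from the original answer. It runs against a hypothetical SAMPLE_HTML string that imitates the table layout described in the question, and it uses the built-in 'html.parser' instead of 'lxml' only so that it has no extra dependency. The function name extract_tickers is also an assumption.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the table layout described in the question.
SAMPLE_HTML = """
<table class="market tab1">
  <tr><th>Name</th><th>Symbol</th></tr>
  <tr class="ts0"><td>Company One</td><td>AAA</td></tr>
  <tr class="ts1"><td>Company Two</td><td>BBB</td></tr>
  <tr class="ts0"><td>Company Three</td><td></td></tr>
  <tr class="ts1"><td>Company Four</td><td>DDD</td></tr>
</table>
"""


def extract_tickers(html):
    """Return the non-empty second-column values of all ts0/ts1 rows."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="market tab1")
    if table is None:  # page without the expected table
        return []
    tickers = []
    # A list for class_ matches rows carrying either class, in document order.
    for row in table.find_all("tr", class_=["ts0", "ts1"]):
        cells = row.find_all("td")
        if len(cells) > 1:
            text = cells[1].get_text(strip=True)
            if text:  # skip empty cells instead of removing "" afterwards
                tickers.append(text)
    return tickers


print(extract_tickers(SAMPLE_HTML))  # ['AAA', 'BBB', 'DDD']
```

Note that this returns the rows in the order they appear on the page, whereas the two separate loops append every 'ts1' row first and every 'ts0' row after. Inside the scraping loop, the call would be something like tickers.extend(extract_tickers(resp.content)).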
