I am trying to run Scrapy with Python, Tor and Privoxy. I am using the scraper from khpeek/privoxy-tor-scraper at https://github.com/khpeek/privoxy-tor-scraper. This is my directory structure:
- docker-compose.yml
- privoxy
    - config
    - Dockerfile
- scraper
    - Dockerfile
    - newnym.py
    - requirements.txt
- tor
    - Dockerfile
I am trying to run the following docker-compose.yml:
version: '3'
services:
  privoxy:
    build: ./privoxy
    ports:
      - "8118:8118"
    links:
      - tor
  tor:
    build:
      context: ./tor
      args:
        password: "1234"
    ports:
      - "9050:9050"
      - "9051:9051"
  scraper:
    build: ./scraper
    links:
      - tor
      - privoxy
where the Dockerfile for tor is:
FROM alpine:3.7
EXPOSE 9050 9051
ARG password
RUN apk --update add tor
RUN echo "ControlPort 9051" >> /etc/tor/torrc
RUN echo "CookieAuthentication 1" >> /etc/tor/torrc
RUN echo "HashedControlPassword $(tor --quiet --hash-password $password)" >> /etc/tor/torrc
CMD ["tor"]
the one for privoxy is:
FROM alpine:latest
EXPOSE 8118
RUN apk --update add privoxy
COPY config /etc/privoxy/
#CMD ["privoxy","--no-daemon"]
CMD ["privoxy","--no-daemon","/etc/privoxy/config"]
where config consists of two lines:
listen-address 0.0.0.0:8118
forward-socks5 / tor:9050 .
and the Dockerfile for the scraper is:
FROM python:3.6-alpine
ADD . /scraper
WORKDIR /scraper
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
CMD ["python","newnym.py"]
where requirements.txt contains the single line requests. Finally, the program newnym.py is designed only to test whether changing the IP address through Tor works:
from time import sleep, time
import requests as req
import telnetlib

def get_ip():
    IPECHO_ENDPOINT = 'http://ipecho.net/plain'
    HTTP_PROXY = 'http://privoxy:8118'
    return req.get(IPECHO_ENDPOINT, proxies={'http': HTTP_PROXY}).text

def request_ip_change():
    #tn = telnetlib.Telnet('privoxy', 8118)
    tn = telnetlib.Telnet('tor', 9051)
    tn.read_until("Escape character is '^]'.", 2)
    tn.write('AUTHENTICATE ""\r\n')
    tn.read_until("250 OK", 2)
    tn.write("signal NEWNYM\r\n")
    tn.read_until("250 OK", 2)

if __name__ == '__main__':
    dts = []
    #isOpen('tor', 9051)
    #isOpen('privoxy', 8118)
    try:
        while True:
            ip = get_ip()
            t0 = time()
            request_ip_change()
            while True:
                new_ip = get_ip()
                if new_ip == ip:
                    sleep(1)
                else:
                    break
            dt = time() - t0
            dts.append(dt)
            print("{} -> {} in ~{}s".format(ip, new_ip, int(dt)))
    except KeyboardInterrupt:
        print("Stopping...")
        print("Average: {}".format(sum(dts) / len(dts)))
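The commented-out isOpen calls in the main block point at a port check that is not shown in the post. A minimal sketch of what such a check could look like, assuming a hypothetical helper named is_open (the name, signature and timeout are illustrative, not taken from the repository):

```python
import socket


def is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established.

    Hypothetical helper: the original isOpen function is not shown in the post.
    """
    try:
        # create_connection resolves the host name and performs the TCP
        # handshake; the socket is closed again as soon as the block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling is_open('tor', 9051) and is_open('privoxy', 8118) from inside the scraper container, before request_ip_change(), would show which of the two services accepts connections.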
docker-compose build completes successfully, but when I run docker-compose up, I get the following error message:
scraper_1_651fd6690a2d | Traceback (most recent call last):
scraper_1_651fd6690a2d |   File "newnym.py", line 45, in <module>
scraper_1_651fd6690a2d |     request_ip_change()
scraper_1_651fd6690a2d |   File "newnym.py", line 27, in request_ip_change
scraper_1_651fd6690a2d |     tn = telnetlib.Telnet('tor', 9051)
scraper_1_651fd6690a2d |   File "/usr/local/lib/python3.6/telnetlib.py", line 218, in __init__
scraper_1_651fd6690a2d |     self.open(host, port, timeout)
scraper_1_651fd6690a2d |   File "/usr/local/lib/python3.6/telnetlib.py", line 234, in open
scraper_1_651fd6690a2d |     self.sock = socket.create_connection((host, port), timeout)
scraper_1_651fd6690a2d |   File "/usr/local/lib/python3.6/socket.py", line 724, in create_connection
scraper_1_651fd6690a2d |     raise err
scraper_1_651fd6690a2d |   File "/usr/local/lib/python3.6/socket.py", line 713, in create_connection
scraper_1_651fd6690a2d |     sock.connect(sa)
scraper_1_651fd6690a2d | ConnectionRefusedError: [Errno 111] Connection refused
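The ConnectionRefusedError means that nothing accepted the TCP connection to tor:9051 at the moment newnym.py tried it. One way to tell a startup race (the scraper running before tor is ready) apart from a port that is never reachable is to retry for a while before giving up. A minimal sketch, assuming a hypothetical helper named wait_for_port; the retry count and delays are arbitrary illustrative values:

```python
import socket
from time import sleep


def wait_for_port(host, port, retries=10, delay=1.0, timeout=2.0):
    """Try to connect to (host, port) up to `retries` times.

    Returns True as soon as one attempt succeeds, False if every attempt
    fails. Hypothetical helper, not part of the original repository.
    """
    for _ in range(retries):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            # Refused, timed out, or host name not resolvable yet:
            # wait a little and try again.
            sleep(delay)
    return False
```

If wait_for_port('tor', 9051) still returns False long after the tor container has started, a startup race is unlikely, and the port is probably not reachable from other containers at all, for example because the service listens only on the loopback interface inside its own container.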