Scrapy: An introduction to Request and Response parameters
Partial source of Request:

class Request(object_ref):

    def __init__(self, url, callback=None, method='GET', headers=None, body=None,
                 cookies=None, meta=None, encoding='utf-8', priority=0,
                 dont_filter=False, errback=None):

        self._encoding = encoding  # this one has to be set first
        self.method = str(method).upper()
        self._set_url(url)
        self._set_body(body)
        assert isinstance(priority, int), "Request priority not an integer: %r" % priority
        self.priority = priority

        assert callback or not errback, "Cannot use errback without a callback"
        self.callback = callback
        self.errback = errback

        self.cookies = cookies or {}
        self.headers = Headers(headers or {}, encoding=encoding)
        self.dont_filter = dont_filter

        self._meta = dict(meta) if meta else None

    @property
    def meta(self):
        if self._meta is None:
            self._meta = {}
        return self._meta

url: the URL to request and then process.
callback: the function that handles the Response returned for this request.
method: the HTTP method. It usually does not need to be set and defaults to GET; it can be "GET", "POST", "PUT" and so on, and the constructor converts the string to upper case.
headers: the request headers.
cookies: the cookies to send, as a dict. Simulating a logged-in user requires passing that user's cookies.
meta: a dict, used often, for passing data between requests. A callback reads it back through response.meta.

Example of a request with cookies and meta:

request_with_cookies = Request(
    url="http://www.example.com",
    cookies={'currency': 'USD', 'country': 'UY'},
    meta={'dont_merge_cookies': True}
)

encoding: the encoding. The default, 'utf-8', is normally fine.
dont_filter: tells the scheduler not to filter this request. Use it when you want to issue the same request more than once and bypass the duplicates filter. Defaults to False.
errback: the function that handles errors.
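The way the constructor treats method and meta can be tried without running a crawl. The sketch below is not Scrapy code: SketchRequest is a hypothetical stand-in that copies only the relevant lines from the source above, and the URLs and the "page" and "from" keys are made up for illustration.

```python
class SketchRequest:
    """Hypothetical stand-in mirroring a few lines of the Request source."""

    def __init__(self, url, callback=None, method="GET", meta=None, dont_filter=False):
        self.url = url
        self.callback = callback
        self.method = str(method).upper()          # same normalisation as the source
        self.dont_filter = dont_filter
        self._meta = dict(meta) if meta else None  # shallow copy of the caller's dict

    @property
    def meta(self):
        # The dict is created lazily, on first access.
        if self._meta is None:
            self._meta = {}
        return self._meta


shared = {"page": 1}
first = SketchRequest("http://www.example.com/list", method="post", meta=shared)
shared["page"] = 2  # changing the caller's dict afterwards does not affect the request

second = SketchRequest("http://www.example.com/detail")
second.meta["from"] = first.url  # works even though no meta was passed in

print(first.method, first.meta, second.meta)
```

Because meta is copied with dict(meta), only the top level is independent; nested objects stored in it are still shared between the caller and the request.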
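- If you want the spider to send a POST request as soon as it starts, override the Spider class's start_requests(self) method. Scrapy then no longer requests the URLs in start_urls automatically; the spider decides which requests to issue.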
import scrapy


class mySpider(scrapy.Spider):
    name = 'myspider'
    allowed_domains = ['renren.com']
    start_urls = ["http://www.renren.com/PLogin.do"]

    def start_requests(self):
        # FormRequest is how Scrapy sends a POST request with form data
        for url in self.start_urls:
            yield scrapy.FormRequest(
                url=url,
                formdata={"email": "mr_mao_hacker@163.com", "password": "axxxxxxxe"},
                callback=self.parse_page
            )

    def parse_page(self, response):
        """
        Callback for a successful request.
        """
        pass
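When formdata is supplied, FormRequest sends the fields as a URL-encoded POST body. The sketch below uses only the standard library to show roughly what such a body looks like; it does not call Scrapy, and the email and password values are placeholders.

```python
from urllib.parse import urlencode, parse_qs

# Placeholder credentials, not taken from any real account.
formdata = {"email": "user@example.com", "password": "secret"}

# Encode the fields the way an application/x-www-form-urlencoded body is built.
body = urlencode(formdata)
print(body)

# Decoding the body recovers the original fields (each value comes back as a list).
decoded = parse_qs(body)
print(decoded)
```

Reserved characters such as "@" are percent-encoded in the body, which is why the raw request body differs from the dict passed as formdata.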
Response
class Response(object_ref):

    def __init__(self, url, status=200, headers=None, body='', flags=None, request=None):
        self.headers = Headers(headers or {})
        self.status = int(status)
        self._set_body(body)
        self._set_url(url)
        self.request = request
        self.flags = [] if flags is None else list(flags)

    @property
    def meta(self):
        try:
            return self.request.meta
        except AttributeError:
            raise AttributeError("Response.meta not available, this response " \
                "is not tied to any request")

status: the HTTP status code of the response
body: the response body
url: the URL of the response
request: the Request object that produced this response
headers: the response headers
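The meta property in the source above simply forwards to the request, which is how data stored on a Request reaches the callback. A minimal sketch of that delegation follows. StubRequest, StubResponse, parse_detail, the URLs, and the "category" key are all hypothetical names for illustration, not Scrapy classes.

```python
class StubRequest:
    """Hypothetical stand-in holding only a URL and a meta dict."""

    def __init__(self, url, meta=None):
        self.url = url
        self.meta = dict(meta) if meta else {}


class StubResponse:
    """Hypothetical stand-in mirroring the Response.meta property."""

    def __init__(self, url, status=200, request=None):
        self.url = url
        self.status = int(status)
        self.request = request

    @property
    def meta(self):
        try:
            return self.request.meta  # forwarded to the originating request
        except AttributeError:
            raise AttributeError("Response.meta not available, this response "
                                 "is not tied to any request")


def parse_detail(response):
    # A callback reads what an earlier step stored on the request.
    return {"url": response.url, "category": response.meta["category"]}


request = StubRequest("http://www.example.com/item/1", meta={"category": "books"})
response = StubResponse(request.url, request=request)
item = parse_detail(response)

# A response that was never tied to a request has no meta to forward.
orphan = StubResponse("http://www.example.com/")
try:
    orphan.meta
    orphan_error = None
except AttributeError as exc:
    orphan_error = str(exc)

print(item, orphan_error)
```

Since response.meta is the same dict object as request.meta, anything a callback writes into it is visible through both.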