Splitting BeautifulSoup4's findAll() of HTML into a list of lines in Python
I've begun learning Python, and there is a project I had in mind that I suspected would be possible using web-crawling through Python.
I have been using a tutorial series, and I am aware that findAll() appears to be looked down upon as being primitive (I'm unsure why, and I'll gladly learn better alternatives, as long as they are simple at the same time; I started Python yesterday).
Right now, I have an extremely simple project that visits a specified website and grabs the code. However, I want to implement an if statement to find out whether a line is present (where I am using soup.findAll('a', {'href': '/login'}), and soup = BeautifulSoup(requests.get(url).text)).
My attempts, trying things like: if '/login' in soup, have failed, and I'm not sure how to implement an if statement to find a single word, or line, in the HTML found.
If you are aware of any simpler methods to use here I'd be grateful, but the solution I identified would be to have the HTML split into lines and/or into an array, and then use if <the entire line> in soup:
I think what you want is

    if soup.findAll('a', {'href': '/login'}): ...

findAll returns an empty list ([]) if the element isn't found, and Python evaluates empty lists as false in if statements.
Here is a trivial example:

    >>> from bs4 import BeautifulSoup as bs
    >>> soup = bs('<html></html>', 'html.parser')
    >>> assert soup.findAll('html')
    >>> assert soup.findAll('login')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AssertionError
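
To tie this back to the question, here is a minimal sketch of both ideas: the truthiness check on the result of a tag search, and the "split the HTML into lines" approach the asker had in mind. The names HTML, has_link and lines_containing are made up for illustration, and the page source is a hard-coded string standing in for requests.get(url).text. It uses find_all, which is the current bs4 spelling of findAll. As far as I understand, if '/login' in soup fails because `in` on a soup object looks at its direct child nodes rather than doing a substring search, so treat that explanation as an assumption worth checking against the docs.

```python
from bs4 import BeautifulSoup

# Hypothetical page source; in the question this would be requests.get(url).text
HTML = """<html>
<body>
<a href="/home">Home</a>
<a href="/login">Log in</a>
</body>
</html>"""


def has_link(html, href):
    """Return True if the HTML contains an <a> tag whose href equals `href`."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all returns an empty list when nothing matches, and an empty list is falsy
    return bool(soup.find_all("a", {"href": href}))


def lines_containing(html, text):
    """Return the raw source lines that contain `text` (plain substring match)."""
    return [line for line in html.splitlines() if text in line]


if has_link(HTML, "/login"):
    print("login link found")

# Line-based alternative: works on the raw text, not on the parsed tree
print(lines_containing(HTML, "/login"))
```

The tag-based check is usually the more robust of the two, since it does not depend on how the page happens to be broken into lines; the line-based version is only a plain text search.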