python - Extracting text from a chart in Beautiful Soup
I'm relatively new to BeautifulSoup, and I'm trying to extract data from this webpage: http://reports.workforce.test.ohio.gov/program-county-wia-reports.aspx?name=gtl8gammduly5gslycy7wq==&datatype=hip9ibmbiwbkor1wvt5bkg==&datatypetext=hip9ibmbiwbkor1wvt5bkg==#
I want to grab the numbers under the headings "program completers", "employed second quarter", etc. The relevant part of the HTML code is:
    <ul class="listbox">
      <li class="li1">
        <p style="cursor:help" class="listtop" title="wia adult completers individuals have exited wia adult program individual received core staff-assisted service (such job search or placement assistance) or intensive service (such counseling, career planning, or job training). individuals participated in wia through self-service, ohiomeansjobs.com, or other less intensive programs not included in dashboard.">program completers</p>
        <p id="programcompleters1">18</p>
      </li>
I want the string "program completers" and the value "18". I have tried implementing the solutions here, here, and here without luck. One version of my code is:
    # Python 2 (urllib2 and the print statement)
    from bs4 import BeautifulSoup
    import urllib2

    url = "http://reports.workforce.test.ohio.gov/program-county-wia-reports.aspx?name=gtl8gammduly5gslycy7wq==&datatype=hip9ibmbiwbkor1wvt5bkg==&datatypetext=hip9ibmbiwbkor1wvt5bkg=="
    # Send browser-like headers with the request
    hdr = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36',
           'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'}
    req = urllib2.Request(url, headers=hdr)
    page = urllib2.urlopen(req)
    soup = BeautifulSoup(page)
    # Print the text of every <ul> and whatever node follows it
    for tag in soup.find_all('ul'):
        print tag.text, tag.next_sibling
This returns text from other parts of the webpage that are tagged 'ul'. I have been unsuccessful in grabbing the text inside the chart area. How can I retrieve the text I want?
Thank you for the help!
As mentioned before, the data you're looking for is in an iframe, which you can access as @chosen_codex says here.
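For illustration only, here is a minimal Python 3 sketch of how the iframe's own URL might be located so that it can be fetched and parsed separately. It is not the linked answer's code. The helper name `iframe_urls`, the sample markup, and the `example.com` address are all hypothetical, since the real iframe `src` is not shown in the question.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def iframe_urls(outer_html, base_url):
    """Return absolute URLs for every <iframe> with a src in outer_html."""
    soup = BeautifulSoup(outer_html, "html.parser")
    urls = []
    for frame in soup.find_all("iframe"):
        src = frame.get("src")
        if src:
            # Resolve a relative src against the page that embeds the iframe.
            urls.append(urljoin(base_url, src))
    return urls


# Hypothetical outer page; the real page's iframe src is an unknown here.
outer_html = '<html><body><iframe src="chart-frame.aspx?id=1"></iframe></body></html>'
print(iframe_urls(outer_html, "http://example.com/reports/page.aspx"))
```

Each URL returned this way would then be requested on its own, and the resulting HTML passed to BeautifulSoup, because the outer page's soup does not contain the iframe's document.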
Once you have parsed the iframe's HTML into soup, you can access the fields you are interested in like this:
    data = {}
    for tag in soup.find_all('p'):
        # Keep only <p> elements that carry an id, keyed by that id
        if tag.get('id'):
            data[tag.get('id')] = tag.text
    print(data)

    >>> print(data.get('programcompleters1'))
    18
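If the heading text is wanted alongside the number, one possible variation is to start from the heading paragraphs and read the sibling that follows each one. The sketch below (Python 3) runs against the snippet from the question with the long `title` attribute left out for brevity. The function name `label_value_pairs` is hypothetical, and the class name `listtop` is taken as it appears in the question, so it may need adjusting to match the live page.

```python
from bs4 import BeautifulSoup

# Snippet from the question; the long title attribute is omitted for brevity.
html = """
<ul class="listbox">
  <li class="li1">
    <p style="cursor:help" class="listtop">program completers</p>
    <p id="programcompleters1">18</p>
  </li>
</ul>
"""


def label_value_pairs(html):
    """Map each heading paragraph's text to the text of the <p> after it."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = {}
    for label in soup.find_all("p", class_="listtop"):
        # The value sits in the next <p> sibling inside the same <li>.
        value = label.find_next_sibling("p")
        if value is not None:
            pairs[label.get_text(strip=True)] = value.get_text(strip=True)
    return pairs


print(label_value_pairs(html))
```

The values come back as strings, so a caller that needs numbers would still have to convert them, for example with `int()`.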