python - Extracting text from a chart in Beautiful Soup


I'm relatively new to BeautifulSoup, and I'm trying to extract data from this webpage: http://reports.workforce.test.ohio.gov/program-county-wia-reports.aspx?name=gtl8gammduly5gslycy7wq==&datatype=hip9ibmbiwbkor1wvt5bkg==&datatypetext=hip9ibmbiwbkor1wvt5bkg==#

I want to grab the numbers under the headings "program completers", "employed second quarter", etc. The relevant part of the HTML code is:

<ul class="listbox">
  <li class="li1">
    <p style="cursor:help" class="listtop" title="wia adult completers individuals have exited wia adult program individual received core staff-assisted service (such job search or placement assistance) or intensive service (such counseling, career planning, or job training). individuals participated in wia through self-service, ohiomeansjobs.com, or other less intensive programs not included in dashboard.">program completers</p>
    <p id="programcompleters1">18</p>
  </li>

I want the strings "program completers" and "18". I have tried implementing the solutions here, here, and here without luck. One version of my code is:

from bs4 import BeautifulSoup
import urllib2

url = "http://reports.workforce.test.ohio.gov/program-county-wia-reports.aspx?name=gtl8gammduly5gslycy7wq==&datatype=hip9ibmbiwbkor1wvt5bkg==&datatypetext=hip9ibmbiwbkor1wvt5bkg=="
# Send browser-like headers with the request.
hdr = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36',
       'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'}

req = urllib2.Request(url, headers=hdr)
page = urllib2.urlopen(req)

soup = BeautifulSoup(page)
# Print the text of every <ul> and the node that follows it.
for tag in soup.find_all('ul'):
    print tag.text, tag.next_sibling

This returns text from other parts of the webpage that are tagged 'ul'. I have been unsuccessful in grabbing the text inside the chart area. How can I retrieve the text I want?

Thank you for your help!

As mentioned before, the data you're looking for is in an iframe, which, as @chosen_codex says, you can access here:

http://reports.workforce.test.ohio.gov/wiareports/wia_county.aspx?level=county&datatype=hip9ibmbiwbkor1wvt5bkg==&name=gtl8gammduly5gslycy7wq==&programdate=kf/2jvcffrgqjjodwv7l08atxxm/adw9p1fwfz9j7o8=
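If you would rather find the iframe address from code than copy it out of the browser, one option is to collect the src of every iframe on the outer page and resolve it against the page URL. This is only a sketch: it assumes the iframe and its src attribute are present in the static HTML (not verified for this site), and both the iframe_urls helper and the sample markup are made up for illustration.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def iframe_urls(html, base_url):
    """Return the absolute URL of every <iframe> in html that has a src."""
    soup = BeautifulSoup(html, "html.parser")
    return [urljoin(base_url, frame["src"])
            for frame in soup.find_all("iframe", src=True)]


# Hypothetical outer-page markup, for illustration only.
sample = '<div><iframe src="wiareports/wia_county.aspx?level=county"></iframe></div>'
base = "http://reports.workforce.test.ohio.gov/program-county-wia-reports.aspx"

print(iframe_urls(sample, base))
```

If the iframe is injected by JavaScript after the page loads, it will not appear in the downloaded HTML and this approach returns an empty list.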

Once you have parsed that page into soup, you can access the fields you are interested in with:

data = {}
# Collect the text of every <p> that has an id, keyed by that id.
for tag in soup.find_all('p'):
    if tag.get('id'):
        data[tag.get('id')] = tag.text

print(data)

>>> print(data.get('programcompleters1'))
18
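Since the question asks for the heading text as well as the number, a possible variation is to walk each li and pair the heading paragraph with the id-bearing paragraph next to it. The sketch below runs on an inline copy of markup modeled on the question, with the title attribute omitted. The label_value_pairs name is hypothetical, and the class name listtop is taken as it appears in the question; class matching is case-sensitive, so it may need adjusting for the live page.

```python
from bs4 import BeautifulSoup

# Markup modeled on the snippet in the question (title attribute omitted).
html = """
<ul class="listbox">
  <li class="li1">
    <p style="cursor:help" class="listtop">program
       completers</p>
    <p id="programcompleters1">18</p>
  </li>
</ul>
"""


def label_value_pairs(markup):
    """Map each heading's text to the text of the <p id=...> in the same <li>."""
    soup = BeautifulSoup(markup, "html.parser")
    pairs = {}
    for item in soup.find_all("li"):
        label = item.find("p", class_="listtop")  # heading paragraph
        value = item.find("p", id=True)           # first <p> carrying an id
        if label is not None and value is not None:
            # Collapse line breaks and repeated spaces inside the heading.
            heading = " ".join(label.get_text().split())
            pairs[heading] = value.get_text(strip=True)
    return pairs


print(label_value_pairs(html))  # {'program completers': '18'}
```

The values come back as strings, so convert with int() if you need numbers.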
