python - Extracting text from chart in Beautiful Soup

I'm relatively new to BeautifulSoup, and I'm trying to extract data from this webpage: http://reports.workforce.test.ohio.gov/program-county-wia-reports.aspx?name=gtl8gammduly5gslycy7wq==&datatype=hip9ibmbiwbkor1wvt5bkg==&datatypetext=hip9ibmbiwbkor1wvt5bkg==#

I want to grab the numbers under the headings "program completers", "employed second quarter", etc. The relevant part of the HTML code is:

<ul class="listbox">
  <li class="li1">
    <p style="cursor:help" class="listtop" title="wia adult completers individuals have exited wia adult program individual received core staff-assisted service (such job search or placement assistance) or intensive service (such counseling, career planning, or job training). individuals participated in wia through self-service, ohiomeansjobs.com, or other less intensive programs not included in dashboard.">program completers</p>
    <p id="programcompleters1">18</p>
  </li>

I want the strings "program completers" and "18". I have tried implementing the solutions here, here, and here without luck. One version of my code is:

from bs4 import BeautifulSoup
import urllib2

url = "http://reports.workforce.test.ohio.gov/program-county-wia-reports.aspx?name=gtl8gammduly5gslycy7wq==&datatype=hip9ibmbiwbkor1wvt5bkg==&datatypetext=hip9ibmbiwbkor1wvt5bkg=="
# Browser-like headers so the server does not reject the request
hdr = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36',
       'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'}

req = urllib2.Request(url, headers=hdr)
page = urllib2.urlopen(req)

soup = BeautifulSoup(page)
for tag in soup.find_all('ul'):
    print tag.text, tag.next_sibling

This returns text from other parts of the webpage that are tagged 'ul'. I have been unsuccessful in grabbing the text inside the chart area. How can I retrieve the text I want?

Thank you for your help!

As mentioned before, the data you're looking for is in an iframe, which you can access, as @chosen_codex says, here:

http://reports.workforce.test.ohio.gov/wiareports/wia_county.aspx?level=county&datatype=hip9ibmbiwbkor1wvt5bkg==&name=gtl8gammduly5gslycy7wq==&programdate=kf/2jvcffrgqjjodwv7l08atxxm/adw9p1fwfz9j7o8=
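If you would rather find the iframe address from the outer page than copy it by hand, something along these lines may work. This is only a sketch: the inline HTML, the `id="report"` attribute, and the shortened relative `src` value are made up for illustration, and the real page's markup may differ.

```python
from bs4 import BeautifulSoup
from urllib.parse import urljoin

# Hypothetical stand-in for the outer page; the real markup may differ.
outer_html = """
<html><body>
  <ul><li>navigation</li></ul>
  <iframe id="report" src="wiareports/wia_county.aspx?level=county"></iframe>
</body></html>
"""

# Address of the outer page, used to resolve relative iframe URLs.
base_url = "http://reports.workforce.test.ohio.gov/program-county-wia-reports.aspx"

soup = BeautifulSoup(outer_html, "html.parser")

# Collect the absolute URL of every iframe that has a src attribute.
iframe_urls = [
    urljoin(base_url, frame["src"])
    for frame in soup.find_all("iframe", src=True)
]
print(iframe_urls)
```

Each URL found this way can then be requested and parsed on its own, since the iframe's content is a separate document that is not part of the outer page's HTML.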

You can access the fields you are interested in with:

# soup is built from the iframe URL above
data = {}
for tag in soup.find_all('p'):
    if tag.get('id'):
        data[tag.get('id')] = tag.text

print(data)

>>> print(data.get('programcompleters1'))
18
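To get the heading text together with its number (for example "program completers" and "18"), one option is to start from each heading paragraph and take the next `<p>` sibling. The sketch below runs on a fragment adapted from the question, with the long `title` attribute shortened. It assumes every heading has `class="listtop"` and is immediately followed by its value, which is true of the fragment shown but has not been checked against the whole page.

```python
from bs4 import BeautifulSoup

# Fragment adapted from the question; the title attribute is shortened here.
html = """
<ul class="listbox">
  <li class="li1">
    <p style="cursor:help" class="listtop" title="(shortened)">program    completers</p>
    <p id="programcompleters1">18</p>
  </li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

results = {}
for label in soup.find_all("p", class_="listtop"):
    # The value is assumed to be the next <p> sibling after the heading.
    value = label.find_next_sibling("p")
    if value is not None:
        # Collapse any internal runs of whitespace in the heading text.
        heading = " ".join(label.get_text().split())
        results[heading] = value.get_text(strip=True)

print(results)
```

Keying the result by heading text rather than by `id` keeps the label and the number together, which is what the question asks for.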
