Screen scraping, then caching.... not working on web parts page?

We have a web parts intranet at our office that I have been working
on.

I have been trying to find a way of getting our OCE TDS 600 plotter
queue to show as a web part on the page, seeing as there is no way of
interfacing with it apart from its 'print exec workgroup' web-based program.

So I decided to screen scrape. I scraped the intranet page of print
exec workgroup (http://tds600 on our network), which seemed to go
fine. Then I used a regex and some replaces to delete the stuff I
didn't want and change some formatting to suit my page, and wrote the
HTML to a label. That seemed to work fine.

I then decided I should do some caching, as this plotter doesn't
want to be hit by every user on the network (with autorefresh
every 5 mins) while it is receiving drawings and so on.

So I tried caching it for a minute or two, so that only one person
hits it every 2 minutes and the rest get it from cache.

But it doesn't seem to be working. I made a test page and it seemed to
cache fine, but when I made it an ascx control and put it on my web
parts page, it never caches. Perhaps as a result, actually
retrieving the scrape is a hit-and-miss affair, and even worse, people
can't access print exec (which is used to send drawings to the plotter)
because it says there are too many users connected. There's no license
limit, but it's obviously a hard limit set so that bandwidth for plots
doesn't suffer, so the web parts must be making lots of connections
and overloading it.
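
One way to keep simultaneous page loads from each opening their own connection while the cache is empty is to wrap the fetch in a lock and check the cache again inside it. The following is only a sketch under assumptions, not a confirmed fix for the web parts problem: the module name `PlotterCache` and the helpers `GetOrFetch` and `Download` are made up for illustration, and it uses `HttpRuntime.Cache`, which is the same application cache that `Page.Cache` exposes.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Web
Imports System.Web.Caching

Public Module PlotterCache

    ' One lock for the whole application, so only one request at a time
    ' can be talking to the plotter.
    Private ReadOnly fetchLock As New Object()

    ' Hypothetical helper: returns the cached HTML for key, or downloads
    ' url once and caches the result for 3 minutes.
    Public Function GetOrFetch(ByVal key As String, ByVal url As String) As String
        Dim cached As String = TryCast(HttpRuntime.Cache.Get(key), String)
        If cached IsNot Nothing Then Return cached

        SyncLock fetchLock
            ' Another request may have filled the cache while this one waited
            cached = TryCast(HttpRuntime.Cache.Get(key), String)
            If cached IsNot Nothing Then Return cached

            Dim html As String = Download(url)
            HttpRuntime.Cache.Insert(key, html, Nothing, _
                DateTime.Now.AddMinutes(3), Cache.NoSlidingExpiration)
            Return html
        End SyncLock
    End Function

    ' Downloads the page; Using closes the connection even on an exception.
    Private Function Download(ByVal url As String) As String
        Dim req As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
        req.Timeout = 1000 ' milliseconds
        Using resp As WebResponse = req.GetResponse()
            Using sr As New StreamReader(resp.GetResponseStream())
                Return sr.ReadToEnd()
            End Using
        End Using
    End Function

End Module
```

A control would then call something like `PlotterCache.GetOrFetch("RollState", strURL)` and run its regex over the result. Because the lock is shared, the roll status part and the queue part would also take turns rather than hitting the plotter at the same moment.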

I'm making two parts, one to get the status of the rolls (empty icons etc.)
and one for the queue. They are essentially the same.

' Needs the System.IO, System.Net and System.Text.RegularExpressions namespaces imported
Sub Page_Load(sender As Object, e As EventArgs)
    Const strURL As String = "http://tds600/servlet/PEBServlet?pag=info"

    ' Read the cache once; a second Get could return Nothing if the item
    ' expired between the two calls
    Dim RollStatus As String = TryCast(Cache.Get("RollState"), String)

    If RollStatus IsNot Nothing Then
        Response.Write("retrieved from cache")
    Else
        Response.Write("Regenerating Data")
        Dim PrintExecPage As String
        Dim req As HttpWebRequest = CType(WebRequest.Create(strURL), HttpWebRequest)
        req.Timeout = 1000 ' milliseconds

        ' Using closes the response and reader even if an exception is thrown,
        ' so connections to the plotter are not left open
        Using resp As HttpWebResponse = CType(req.GetResponse(), HttpWebResponse)
            Using sr As New StreamReader(resp.GetResponseStream())
                PrintExecPage = sr.ReadToEnd()
            End Using
        End Using

        Dim RegEx As New Regex("<table width='100%' border='0' " & _
            "cellpadding='2' cellspacing='0'><tr>(.*?)Feeder</td>(.*?)</table>", _
            RegexOptions.None)

        For Each objmatch As Match In RegEx.Matches(PrintExecPage)
            'filter out the unwanted markup, store it in RollStatus string
        Next

        ' Only cache a result that looks complete
        If Len(RollStatus) > 20 Then
            Cache.Insert("RollState", RollStatus, Nothing, _
                DateTime.Now.AddMinutes(3), TimeSpan.Zero)
            Response.Write("Cache Written")
        End If
    End If

    If RollStatus Is Nothing Then
        RollStatus = "Did not load."
    End If

    lblHTMLOutput.Text = RollStatus
End Sub

Whenever the stream actually loads, it tells me it's regenerating data
and that it's written to cache. Then I hit refresh and it says the
same thing - no recall from cache.
Any ideas? What have I done wrong? Is it possible our web server is
deleting the cache, and is there any way to check?
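
One possible way to check is to register a removal callback when inserting the item. `Cache.Insert` has an overload that takes a `CacheItemRemovedCallback`, and the `CacheItemRemovedReason` passed to it says whether the item expired, was removed explicitly, was dropped by the server to free memory (`Underused`), or had a dependency change. The sketch below is only an illustration: the module `CacheDiagnostics`, its methods, and the `LastRemoval` field are hypothetical names, and keeping the last message in a shared field is an assumption about how you would want to view it.

```vbnet
Imports System
Imports System.Web
Imports System.Web.Caching

Public Module CacheDiagnostics

    ' Last removal message; a page can Response.Write this to see what happened.
    Public LastRemoval As String = "nothing removed yet"

    ' Hypothetical helper: same 3-minute absolute expiration as the code
    ' above, plus a callback that records why the item left the cache.
    Public Sub InsertWithLogging(ByVal key As String, ByVal value As String)
        HttpRuntime.Cache.Insert(key, value, Nothing, _
            DateTime.Now.AddMinutes(3), Cache.NoSlidingExpiration, _
            CacheItemPriority.Normal, _
            New CacheItemRemovedCallback(AddressOf OnRemoved))
    End Sub

    ' Called by ASP.NET whenever the item leaves the cache.
    Private Sub OnRemoved(ByVal key As String, ByVal value As Object, _
                          ByVal reason As CacheItemRemovedReason)
        LastRemoval = DescribeRemoval(key, reason)
    End Sub

    ' Builds a readable message such as "... 'RollState' removed: Expired".
    Public Function DescribeRemoval(ByVal key As String, _
                                    ByVal reason As CacheItemRemovedReason) As String
        Return String.Format("{0:T} cache item '{1}' removed: {2}", _
                             DateTime.Now, key, reason)
    End Function

End Module
```

If the message shows `Underused` shortly after each insert, the server is evicting the entry. If no removal is ever recorded but the page still regenerates, the insert itself is probably not being reached, for example because the length check fails.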

Jun 25 '07 #1