Bytes IT Community

File.Move Access Issue

100+
P: 233
I am attempting to move a file, but I am receiving an exception stating that the file cannot be moved because it is in use. I realize this is a common error, and also that my application is the one that is keeping the file open, but what I don't know is where.

I am opening the file (.html) using the Html Agility Pack, which has a .Load method. There is not a way, that I know of, to unload or close a document once open. Other than that, I scrape some data from the .html document but not a lot more. I am also downloading a file (using webclient) from a link from the .html file, but I cannot move that file either. Note my code below.

    private void DeleteFiles(string strPODoc)
    {
        string strDestination;
        string strPOFile = ParseString(strPODoc, @"c:\");

        strDestination = @"\\epsa\orders\PROCESSED\";

        if (File.Exists(strPODoc))
        {
            File.Move(strPODoc, strDestination + strPOFile);
            Console.WriteLine(strPODoc + " has been processed");
        }
        else
        {
            Console.WriteLine(strPODoc + " does not exist");
            Console.Read();
        }
    }
Dec 11 '08 #1
14 Replies


balabaster
Expert 100+
P: 797
If the Agility pack is loading the file and not allowing you to unload it and that's what's causing the lock, then my view is that you need to ditch the agility pack and use something more productive... either that or you need to email the producers of this pack and let them know that there's a shortfall in their API that needs addressing in short order.

How exactly are you scraping the file? In the past, I've always used something like regular expressions to scrape files, regardless of the format. This allows a far greater level of control in your application, meaning that you can control when files are opened and closed.
Dec 11 '08 #2

100+
P: 233
With the agility pack, I can collect all similar tags into an array, and then just select which one I need to use. I briefly looked at using regular expressions to parse the html, but I was very pleased with the ease of use of the Html Agility Pack.
Dec 11 '08 #3

100+
P: 233
I should also note that I am not able to move the file I download with WebClient. Even if I call WebClient.Dispose(), I cannot gain access to the file.
Dec 11 '08 #4

nukefusion
Expert 100+
P: 221
@mcfly1204
How are you downloading the file? Using the DownloadFile() method of the WebClient?
Do you access the file using any sort of FileStream?
If possible, a small code sample of your download function would certainly be helpful.
Dec 11 '08 #5

balabaster
Expert 100+
P: 797
@mcfly1204
By similar, do you mean, for example, all the anchor tags? or all the paragraph tags? etc

<a href="firstlink.aspx">First Link Text</a>
<a href="secondlink.aspx">Second Link Text</a>
<a href="thirdlink.aspx">Third Link Text</a>
...

Using Regex you can grab them:

    Dim MatchCol = Regex.Matches(HtmlString, "(?i:<a.*?</a>)")
I'd probably use that...MatchCol is then a collection of all my anchor tags. You could do something similar for other tags changing up some of the inside pattern.

There are a whole bunch of example patterns on the net for this, but most of them reference a longer pattern that doesn't allow for things like anchor tags that contain images instead of text.
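
For anyone following along in C#, here is a minimal sketch of the same anchor-tag pattern. The class name, method name and sample markup are made up for illustration; only the pattern itself comes from the post above. Two caveats worth keeping in mind: "." does not match line breaks unless RegexOptions.Singleline is set, and because the pattern starts with a bare "<a" it would also start a match at any other tag whose name begins with "a".

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AnchorScraper
{
    // Pattern from the post: case-insensitive, lazy match from "<a" to the first "</a>".
    private const string AnchorPattern = "(?i:<a.*?</a>)";

    // Returns every substring of html that the anchor pattern matches.
    public static List<string> GetAnchors(string html)
    {
        var anchors = new List<string>();
        foreach (Match m in Regex.Matches(html, AnchorPattern))
        {
            anchors.Add(m.Value);
        }
        return anchors;
    }

    public static void Main()
    {
        // Hypothetical sample markup: one text link and one image link.
        string html =
            "<p><a href=\"firstlink.aspx\">First Link Text</a> " +
            "<A HREF=\"secondlink.aspx\"><img src=\"pic.gif\"></A></p>";

        foreach (string anchor in GetAnchors(html))
        {
            Console.WriteLine(anchor);
        }
    }
}
```

Because the match is lazy and does not care what sits between the opening and closing tag, an anchor wrapping an image is picked up the same way as one wrapping text.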
Dec 11 '08 #6

100+
P: 233
@nukefusion
I download the file using the following:

    private void GetVectorArt(string strPODoc)
    {
        string strURL = strBaseURL + strArtURL;
        string strPath = @"c:\" + strPO + "download.zip";

        WebClient client = new WebClient();

        client.DownloadFile(strURL, strPath);

        this.SendToArtwork(strPath, strPODoc);
    }
Once downloaded, I send an email with the file as an attachment. After that, I am simply trying to move or delete the downloaded file.
Dec 11 '08 #7

balabaster
Expert 100+
P: 797
Never mind... (delete this reply please)
Dec 11 '08 #8

100+
P: 233
@balabaster
I am in the process of changing from the Agility Pack to regular expressions as you have recommended, but I seem to be having some trouble with nested tags. Do you have much regex experience with HTML? I have been using this expression for TDs:

(?s)(?<=<td[^>]*>.*?)<td[^>]*>.*?</td>
Dec 11 '08 #9

balabaster
Expert 100+
P: 797
@mcfly1204
Ah yes, this particular element of HTML can be a bit more of a chore when it comes to regular expressions. Not to say it can't be done if you know what you're doing.

The following regular expression will find the corresponding closing tag for your opening tag:

(?x:<td>(?>(?!<td>|</td>).|<td>(?<Depth>)|</td>(?<-Depth>))*(?(Depth)(?!))</td>)
You would then need to recurse through each nested item that contained other nested items. This will pull out your top level cells from the table as a collection. You'd then need to figure out which (if any) of those contained nested tables and then do the same thing.
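
As a rough illustration of how that pattern behaves, here is a small C# sketch that pulls the top-level cells out of a string. The class and method names and the sample markup are hypothetical. I have also added RegexOptions.Singleline (not part of the pattern above) so that "." can cross line breaks, and note that the pattern only recognises a bare "<td>" with no attributes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TopLevelCells
{
    // Balancing-group pattern from the post: each nested "<td>" pushes onto the
    // Depth stack, each nested "</td>" pops it, and the final "</td>" is only
    // accepted once the stack is empty.
    private const string CellPattern =
        @"(?x:<td>(?>(?!<td>|</td>).|<td>(?<Depth>)|</td>(?<-Depth>))*(?(Depth)(?!))</td>)";

    // Returns the outermost <td>...</td> blocks, nested cells included in their parent.
    public static List<string> GetTopLevelCells(string html)
    {
        var cells = new List<string>();
        // Singleline is an assumption added here so cells may span several lines.
        foreach (Match m in Regex.Matches(html, CellPattern, RegexOptions.Singleline))
        {
            cells.Add(m.Value);
        }
        return cells;
    }

    public static void Main()
    {
        // Hypothetical sample: the first cell contains a nested cell.
        string html = "<td>a<td>b</td>c</td><td>d</td>";
        foreach (string cell in GetTopLevelCells(html))
        {
            Console.WriteLine(cell);
        }
    }
}
```

To reach the nested cells you would strip the outer tags from each result and run the same method on what is left, which is the recursion described above.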

Now, this might be easier with something like the Agility pack, I don't know, I've not used it.

With XElement and XQuery you can basically strip out a collection of all <td> elements, which you can then work on as an entity. However, the XML objects aren't great with HTML, and I've not done an awful lot with HTML. I've done a bunch of XML, and while I know they're remarkably similar, there are aspects of HTML that the XML objects just fall over and die on, such as elements that don't have closing tags, like <IMG>, <BR> and <HR>. While the old forms have been deprecated in favour of the XML-style <IMG />, <HR /> and <BR />, there are still many websites out there using the old form that will trip your application up.
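
A quick way to see the problem with unclosed tags is to feed both forms to XElement.Parse. This is only a sketch; the class name, method name and sample markup are made up for illustration.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class XmlVsHtml
{
    // Returns true when the markup parses as well-formed XML, false otherwise.
    public static bool ParsesAsXml(string markup)
    {
        try
        {
            XElement.Parse(markup);
            return true;
        }
        catch (XmlException)
        {
            // Thrown for markup that is not well-formed, e.g. an unclosed <br>.
            return false;
        }
    }

    public static void Main()
    {
        // Self-closed form: well-formed XML.
        Console.WriteLine(ParsesAsXml("<td>line one<br />line two</td>"));
        // Old HTML form: the <br> is never closed, so the parser rejects it.
        Console.WriteLine(ParsesAsXml("<td>line one<br>line two</td>"));
    }
}
```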

To be honest though, if the Agility pack is locking the file and not allowing you to copy it, I would get onto them and report it as a bug and tell them it needs fixing. I hate having to code around bugs, and for something as potentially complex as nested tags, like tables, spans, divs, fonts etc, Regular Expressions could have the potential to become a nightmare if you don't know what you're doing or if you need to start getting complicated with what you're trying to pull out of the HTML.
Dec 11 '08 #10

100+
P: 233
So I am now parsing all of my data using regex, but I am still having the issue of not being able to move the original file.

    public void SelectFiles()
    {
        string Pickup;
        string[] strFiles;
        string[] HTMLs;

        Pickup = @"\\epsa\orders";
        strFiles = Directory.GetFiles(Pickup, "*.htm");

        foreach (string htm in strFiles)
        {
            string HTML;

            HTML = ParseString(htm.ToString(), ".htm");
            HTML = ParseString(HTML, @"\\epsa\orders\");

            File.Copy(htm, @"c:\orders\" + HTML + ".html");
        }

        HTMLs = Directory.GetFiles(Pickup, "*.html");

        if (HTMLs.Length < 1)
        {
            return;
        }
        else
        {
            Console.WriteLine("Total HTML Files: " + HTMLs.Length + System.Environment.NewLine);
        }

        foreach (string x in HTMLs)
        {
            this.GetContents(x);
        }
    }
I cannot figure out what is locking the file. GetContents simply reads the contents of the file with a StreamReader enclosed in a using statement.
Dec 12 '08 #11

balabaster
Expert 100+
P: 797
I don't see anything wrong with this code... I need to see the processing code for where you open the file and parse it. This code is fine... I suspect you're not closing and disposing the file stream where you're parsing the file which is what is causing your problem.
Dec 12 '08 #12

100+
P: 233
    private void GetContents(string HTML)
    {
        using (StreamReader reader = new StreamReader(HTML))
        {
            string htmlContent = reader.ReadToEnd();

            this.GetLinks(HTML, htmlContent);
            this.GetTDs(HTML, htmlContent);
            this.GetSpans(HTML, htmlContent);
        }
    }

    private void GetLinks(string HTML, string htmlContent)
    {
        string linkRegEx = "(?i:<a.*?</a>)";
        link = new string[4];

        Regex rLinks = new Regex(linkRegEx);

        MatchCollection links = rLinks.Matches(htmlContent);

        for (int i = 0; i < (links.Count - 1); i++)
        {
            link[i] = links[i].Value;
        }

        ArtURL = SplitString(link[2], '"', 1);

        BaseURL = @"https://springboard.4imprint.com/PO/view/";
        ArtURL = BaseURL + ArtURL;
    }

    private void GetTDs(string HTML, string htmlContent)
    {
        string tdRegEx = @"<td(?:\s[^>]*)?>(?:(?>[^<]+)|<(?!td(?:\s[^>]*)?>))*?</td>";
        td = new string[95];

        Regex rTds = new Regex(tdRegEx);

        MatchCollection tds = rTds.Matches(htmlContent);

        for (int i = 0; i < (tds.Count - 1); i++)
        {
            td[i] = tds[i].Value;
        }

        PONum = SplitString(td[1], '>', 2);
        PONum = ParseString(PONum, @"</span");
        PONum = ParseString(PONum, "Purchase Order");
        DistAddr1 = "101 Commerce St.";
        DistAddr2 = "PO Box 320";
        DistAddr3 = "Oshkosh, WI 54901";
        DistPhone = "920-236-7272";
        DistFax = "902-236-7282";
        OppCon = SplitString(td[22], '>', 3);
        OppCon = ParseString(OppCon, @"</td");

        ShipToName = SplitString(td[81], '>', 1);
        ShipToName = ParseString(ShipToName, @"</td");
        ShipToAddr1 = SplitString(td[82], '>', 1);
        ShipToAddr1 = ParseString(ShipToAddr1, @"</td");
        ShipToAddr2 = SplitString(td[83], '>', 1);
        ShipToAddr2 = ParseString(ShipToAddr2, @"</td");
        ShipToAddr3 = SplitString(td[84], '>', 1);
        ShipToAddr3 = ParseString(ShipToAddr3, @"</td");
        ShipCity = SplitString(td[85], '>', 1);
        ShipCity = ParseString(ShipCity, @"</td");
        ShipState = SplitString(td[86], '>', 1);
        ShipState = ParseString(ShipState, @"</td");
        ShipZip = SplitString(td[87], '>', 1);
        ShipZip = ParseString(ShipZip, @"</td");

        OrderDate = SplitString(td[21], '>', 3);
        OrderDate = ParseString(OrderDate, @"</td");
        ReqShipDate = SplitString(td[92], '>', 2);
        ReqShipDate = ParseString(ReqShipDate, @"</span");
        InHandsDate = SplitString(td[94], '>', 2);
        InHandsDate = ParseString(InHandsDate, @"</span");

        ItemNo = SplitString(td[32], '>', 1);
        ItemNo = ParseString(ItemNo, @"</td");
        ItemQty = SplitString(td[91], '>', 2);
        ItemQty = ParseString(ItemQty, @"</span");
        UnitCost = SplitString(td[34], '>', 1);
        UnitCost = ParseString(UnitCost, @"</td");
        TotalCost = SplitString(td[35], '>', 1);
        TotalCost = ParseString(TotalCost, @"</td");

        ImpPhrase = "";
        Notes = ParseString(td[63], @"</td>");
        Notes = ParseString(Notes, @"<br>");
        Notes = ParseString(Notes, @"<td>");
    }
Dec 12 '08 #13

100+
P: 233
I think I have it. It has nothing to do with what I have posted; it is because of a method where I send the file as an attachment:

    private void SendToArtwork(string Artwork, string PODoc)
    {
        PONum = PONum.Trim();

        MailMessage message = new MailMessage();
        message.To.Add("toemail");
        message.From = new MailAddress("fromemail");

        message.Priority = MailPriority.Normal;
        message.Subject = "Artwork and Purchase Order For PO# " + PONum;
        message.Body = "Attached is the original purchase order number " + PONum + ", as well as the included vector artwork.";

        Attachment attPO = new Attachment(PODoc);
        Attachment attArt = new Attachment(Artwork);
        message.Attachments.Add(attPO);
        message.Attachments.Add(attArt);

        SmtpClient smtp = new SmtpClient("exchangeserver");
        smtp.Credentials = new NetworkCredential("username", "password");

        try
        {
            smtp.Send(message);
        }
        catch
        {
            System.Environment.Exit(5);
        }
    }
Dec 12 '08 #14

balabaster
Expert 100+
P: 797
Add this to the end of the try block (after your catch block) in your SendToArtwork method and see if it helps:
    finally
    {
        attArt.Dispose();
        attPO.Dispose();
        message.Dispose();
    }
You want to do it as a finally so that resources are always closed and memory reclaimed, even if the try fails.

It looks to me like the failure to dispose of the attachment objects and the message object properly may be what's causing your issue.
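
A using block around the message is another way to get the same cleanup, since disposing a MailMessage also disposes the attachments added to it. The sketch below is only an illustration under a few assumptions: the class and method names are hypothetical, the file is a throwaway temp file, and the actual smtp.Send call is left out so that it runs without a mail server.

```csharp
using System;
using System.IO;
using System.Net.Mail;

public static class AttachmentRelease
{
    // Attaches the file to a message, disposes the message, then moves the file.
    public static void AttachThenMove(string sourcePath, string destinationPath)
    {
        using (MailMessage message = new MailMessage())
        {
            // The Attachment opens the file and keeps it open until it is disposed.
            message.Attachments.Add(new Attachment(sourcePath));

            // smtp.Send(message) would go here; omitted in this sketch.
        } // Leaving the block disposes the message and, with it, its attachments.

        // The attachment no longer holds the file open, so it can be moved.
        File.Move(sourcePath, destinationPath);
    }

    public static void Main()
    {
        // Hypothetical temp file standing in for the purchase order document.
        string source = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".txt");
        string destination = source + ".moved";
        File.WriteAllText(source, "purchase order");

        AttachThenMove(source, destination);
        Console.WriteLine(File.Exists(destination));

        File.Delete(destination);
    }
}
```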

The using statement in your GetContents() method should already close the reader when it disposes of it, but if you want to be explicit about it, add this as the last line inside the using (StreamReader....) block:
    reader.Close();
This makes sure that your reader is closed, releasing anything it has locked, before the using statement disposes of it.
So the full block of code should look like:
    using (StreamReader reader = new StreamReader(HTML))
    {
        string htmlContent = reader.ReadToEnd();
        this.GetLinks(HTML, htmlContent);
        this.GetTDs(HTML, htmlContent);
        this.GetSpans(HTML, htmlContent);
        reader.Close();
    }
Given that the using statement disposes of the reader (which also closes it) in any case, it won't make any real-world difference, but it is good coding practice to close connections when you've finished with them so that no adverse side effects of locked resources occur later. Other than that, I don't see anywhere else your code could be locking objects and not releasing resources before the classes are disposed.

One thing I'm not entirely sure of: The using keyword is used to automatically dispose of objects that implement the IDisposable interface. If a class implements IDisposable properly, then it will be disposed of at the closure of your using statement...the assumption is that Microsoft implements all their classes properly but if it's not implemented properly, then there's a chance your object may not be disposed of properly at the closure of the using statement so while it is expected that:
    using (TargetObject obj = new TargetObject())
    {
    }
would dispose of the TargetObject instance at the completion of the code block, if TargetObject() doesn't implement IDisposable properly, then the instance must be disposed of manually using something like obj.Dispose();
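
If you want to prove that to yourself, a small test class makes it easy to check. As far as I know, a using statement calls Dispose() in a finally block, and the compiler refuses a using on a type that doesn't implement IDisposable, so what remains to verify is what the class's own Dispose() does. The sketch below uses a made-up TrackedResource class (not a framework type) that simply records whether Dispose() was called, even when the block exits through an exception.

```csharp
using System;

// Hypothetical resource that only records whether Dispose() has run.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class UsingDemo
{
    // Returns true if the resource was disposed even though the block threw.
    public static bool DisposedAfterException()
    {
        TrackedResource seen = null;
        try
        {
            using (TrackedResource resource = new TrackedResource())
            {
                seen = resource;
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed on purpose: we only care about the disposal state.
        }
        return seen.IsDisposed;
    }

    public static void Main()
    {
        Console.WriteLine(DisposedAfterException());
    }
}
```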

Don't assume that things work the way they should be expected to work until you've proved to yourself that they do...

I've not used the mail and smtp classes, so I've not got any experiences with their glitches or gaps I'm afraid.
Dec 12 '08 #15
