Replacing a string inside of a PDF

I am having a lot more trouble with this than I thought I would. Here
is what I want to do in pseudocode.

Open c:\some.pdf
Replace "Replace this" with "Replaced!"
Save c:\some_edited.pdf

I can do this in notepad and it works fine, but when I start getting in
to reading the files I think it has some encoding problem. I tried
saving the file with every encoding option. When I open a PDF in the
text editor I normally use it says it is ANSI with Mac style carriage
returns. Winmerge will not let me compare the files because it says
they are binary.

Anyone know what I have to do?

Jul 20 '06
14 1801
Finally,
Can you achieve what you actually need?

Samuel
"Josh Baltzell" <jo**********@gmail.com> wrote in message
news:11**********************@i42g2000cwa.googlegroups.com...
Here is another test I wrote that successfully generates a bunch of
useless files encoded in different ways.

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Public Sub StringTest()
    Dim PDFFile As String

    Response.Write("Start String:" & DateTime.Now.ToLongTimeString & _
        ":" & Now.Millisecond & "<br>")

    For Each PDFFile In IO.Directory.GetFiles(Server.MapPath("PDF"))
        'Open the file (OpenText decodes the bytes as UTF-8)
        Dim Reader As IO.StreamReader = IO.File.OpenText(PDFFile)

        'Load the file in to a string
        Dim Contents As String = Reader.ReadToEnd

        'Replace text in string
        Contents = Contents.Replace("ABC1234567890", "ABC1111111111")

        'Close stream
        Reader.Close()

        'Create ASCII output file
        Dim OutputFileName As String = Server.MapPath("PDFOutput\" & _
            DateTime.Now.ToFileTimeUtc.ToString & "STRING-ASCII.pdf")
        Dim fs As IO.FileStream = IO.File.Create(OutputFileName)
        Dim PDFStream As IO.StreamWriter = New IO.StreamWriter(fs, _
            System.Text.Encoding.ASCII)
        PDFStream.Write(Contents)
        PDFStream.Close()
        fs.Close()

        'Create BigEndianUnicode output file
        OutputFileName = Server.MapPath("PDFOutput\" & _
            DateTime.Now.ToFileTimeUtc.ToString & "STRING-BigEndianUnicode.pdf")
        fs = IO.File.Create(OutputFileName)
        PDFStream = New IO.StreamWriter(fs, _
            System.Text.Encoding.BigEndianUnicode)
        PDFStream.Write(Contents)
        PDFStream.Close()
        fs.Close()

        'Create default formatted output file
        OutputFileName = Server.MapPath("PDFOutput\" & _
            DateTime.Now.ToFileTimeUtc.ToString & "STRING-Default.pdf")
        fs = IO.File.Create(OutputFileName)
        PDFStream = New IO.StreamWriter(fs, _
            System.Text.Encoding.Default)
        PDFStream.Write(Contents)
        PDFStream.Close()
        fs.Close()

        'Create Unicode output file
        OutputFileName = Server.MapPath("PDFOutput\" & _
            DateTime.Now.ToFileTimeUtc.ToString & "STRING-Unicode.pdf")
        fs = IO.File.Create(OutputFileName)
        PDFStream = New IO.StreamWriter(fs, _
            System.Text.Encoding.Unicode)
        PDFStream.Write(Contents)
        PDFStream.Close()
        fs.Close()

        'Create UTF7 output file
        OutputFileName = Server.MapPath("PDFOutput\" & _
            DateTime.Now.ToFileTimeUtc.ToString & "STRING-UTF7.pdf")
        fs = IO.File.Create(OutputFileName)
        PDFStream = New IO.StreamWriter(fs, System.Text.Encoding.UTF7)
        PDFStream.Write(Contents)
        PDFStream.Close()
        fs.Close()

        'Create UTF8 output file
        OutputFileName = Server.MapPath("PDFOutput\" & _
            DateTime.Now.ToFileTimeUtc.ToString & "STRING-UTF8.pdf")
        fs = IO.File.Create(OutputFileName)
        PDFStream = New IO.StreamWriter(fs, System.Text.Encoding.UTF8)
        PDFStream.Write(Contents)
        PDFStream.Close()
        fs.Close()

    Next

    Response.Write("Stop String:" & DateTime.Now.ToLongTimeString & _
        ":" & Now.Millisecond & "<br>")

End Sub
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
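
A note on why every file from the test above is likely to come out unusable: IO.File.OpenText decodes the input as UTF-8, so bytes that are not valid UTF-8 are replaced while reading, before any output encoding is chosen. The sketch below checks whether all 256 byte values survive a decode/encode round trip with a given encoding. The module and function names are made up for illustration, and Latin-1 (code page 28591) is only an assumed example of a single-byte encoding.

```vbnet
Imports System
Imports System.Text

Module RoundTripCheck

    ' Hypothetical helper: True if every byte value 0..255 comes back
    ' unchanged after decoding to a String and encoding again.
    Function SurvivesRoundTrip(ByVal Enc As Encoding) As Boolean
        Dim Original(255) As Byte          ' 256 elements, indexes 0..255
        For i As Integer = 0 To 255
            Original(i) = CByte(i)
        Next

        Dim Decoded As String = Enc.GetString(Original)
        Dim Encoded As Byte() = Enc.GetBytes(Decoded)

        If Encoded.Length <> Original.Length Then Return False
        For i As Integer = 0 To 255
            If Encoded(i) <> Original(i) Then Return False
        Next
        Return True
    End Function

    Sub Main()
        Console.WriteLine("UTF-8:   " & SurvivesRoundTrip(Encoding.UTF8))
        Console.WriteLine("Latin-1: " & SurvivesRoundTrip(Encoding.GetEncoding(28591)))
    End Sub

End Module
```

An encoding that fails this check cannot be used to read a binary file into a String and write it back unchanged, whatever encoding the writer uses afterwards.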

Jul 23 '06 #11
Josh Baltzell wrote:
I am having a lot more trouble with this than I thought I would. Here
is what I want to do in pseudocode.

Open c:\some.pdf
Replace "Replace this" with "Replaced!"
Save c:\some_edited.pdf

I can do this in notepad and it works fine, but when I start getting in
to reading the files I think it has some encoding problem. I tried
saving the file with every encoding option. When I open a PDF in the
text editor I normally use it says it is ANSI with Mac style carriage
returns. Winmerge will not let me compare the files because it says
they are binary.
<snip>

Winmerge is right: a PDF file is binary data, not plain text in a
given encoding. You should load it as a stream of bytes.

On the other hand, since you want to perform text replacements in the
file, you may load it with an encoding that doesn't apply
transformations on the bytes in the file, such as the Ansi encoding:

Sub PDFReplaceText(ByVal Path As String, ByVal OldText As String, _
        ByVal OutPath As String, ByVal NewText As String)

    Const ANSI As Integer = 1252

    Dim Encoding As Text.Encoding = Text.Encoding.GetEncoding(ANSI)
    Dim sr As New IO.StreamReader(Path, Encoding)
    Dim Data As String = sr.ReadToEnd
    sr.Close()

    Data = Data.Replace(OldText, NewText)

    Dim sw As New IO.StreamWriter(OutPath, False, Encoding)
    sw.Write(Data)
    sw.Close()

End Sub
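
The "stream of bytes" route mentioned above could look roughly like the sketch below, which never converts the file to a String. It is only an illustration under stated assumptions: the module and both routine names are hypothetical, the search and replacement texts are assumed to be plain ASCII, and they are required to have the same length, because a PDF stores byte offsets internally and a length change may leave those offsets stale. It also only finds text that is stored uncompressed in the file.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module PdfByteReplace

    ' Hypothetical helper: overwrites every occurrence of OldText with
    ' NewText inside Data and returns the number of replacements.
    ' Assumes both strings are ASCII and have the same length.
    Function ReplaceBytes(ByVal Data As Byte(), ByVal OldText As String, _
                          ByVal NewText As String) As Integer
        If OldText.Length = 0 OrElse OldText.Length <> NewText.Length Then
            Throw New ArgumentException( _
                "OldText and NewText must be non-empty and the same length.")
        End If

        Dim OldBytes As Byte() = Encoding.ASCII.GetBytes(OldText)
        Dim NewBytes As Byte() = Encoding.ASCII.GetBytes(NewText)
        Dim Count As Integer = 0
        Dim i As Integer = 0

        While i <= Data.Length - OldBytes.Length
            ' Compare the pattern against the bytes starting at position i
            Dim Match As Boolean = True
            For j As Integer = 0 To OldBytes.Length - 1
                If Data(i + j) <> OldBytes(j) Then
                    Match = False
                    Exit For
                End If
            Next

            If Match Then
                ' Overwrite in place; the total length stays the same
                Array.Copy(NewBytes, 0, Data, i, NewBytes.Length)
                Count += 1
                i += OldBytes.Length
            Else
                i += 1
            End If
        End While

        Return Count
    End Function

    ' Hypothetical wrapper: reads Path as raw bytes, replaces, writes OutPath.
    Sub PDFReplaceBytes(ByVal Path As String, ByVal OutPath As String, _
                        ByVal OldText As String, ByVal NewText As String)
        Dim Data As Byte() = File.ReadAllBytes(Path)
        ReplaceBytes(Data, OldText, NewText)
        File.WriteAllBytes(OutPath, Data)
    End Sub

End Module
```

A call such as PDFReplaceBytes("c:\some.pdf", "c:\some_edited.pdf", "ABC1234567890", "ABC1111111111") would mirror the equal-length replacement used in the earlier test code.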

HTH.

Regards,

Branco.

Jul 23 '06 #12
Branco,

This worked perfectly. My knowledge of the encoding options in general
is very weak, so thanks for spelling it out for me with some code.

Samuel,

Thank you to you too. You have both been a big help.

Thank you,
Josh Baltzell

Jul 24 '06 #13
I am glad to hear it.

Does Branco's code work as is?

Jul 24 '06 #14
I put the encoding options in to my own code, so I am not positive.
This is the final sub I ended up with.

Public Sub ReplaceText(ByVal FilePath As String, ByVal OriginalText As String, _
        ByVal NewText As String)
    Dim Encoding As System.Text.Encoding = _
        System.Text.Encoding.GetEncoding(1252)

    'Open the file
    Dim Reader As New IO.StreamReader(FilePath, Encoding)

    'Load the file in to a string
    Dim Contents As String = Reader.ReadToEnd

    'Replace text in string
    Contents = Contents.Replace(OriginalText, NewText)

    'Close stream
    Reader.Close()

    'Write string as bytes to output file (this overwrites the input file)
    Dim OutputFileName As String = FilePath
    Dim sw As New IO.StreamWriter(OutputFileName, False, Encoding)
    sw.Write(Contents)
    sw.Close()

End Sub

Jul 24 '06 #15
