Bytes | Software Development & Data Engineering Community

Replacing multiple strings in a file

Hello,

I am very new to Perl and I am having trouble figuring out how to replace multiple strings in a single file. The file has around 750k instances that need to be replaced with 350 different values:

1-A needs to be replaced with 12-FOO-1-A
2-B needs to be replaced with 44-BAR-2-B
1-400 needs to be replaced with 123-D1-400
(and 347 more)

How would I set up a program to read in the large file and output it with the changes?

Thanks,
-J
Feb 10 '10 #1
numberwhun
3,509 Expert Mod 2GB
It's great that you tell us what needs to be replaced with what, but without a sample of the data we are dealing with, helping you with your request will be a bit difficult.

Please post a sample of the data we are dealing with, showing the use of each piece you mentioned.

Regards,

Jeff
Feb 10 '10 #2
Jeff,

Here are examples from the two text files I have. The first has the data that needs to be changed, formatted into columns. The first column is the value to be replaced:

1-A 50 3017.1N 4922.5E 301841 927850
1-A 51 3017.9N 4923.3E 301834 927530
1-A 52 3018.7N 4924.0E 301826 276011
1-A 53 3019.5N 4924.8E 301819 976092
1-A 54 3020.4N 4925.6E 301811 926173
1-A 55 3021.2N 4926.4E 301804 927254
1-A 56 3022.0N 4927.2E 301796 276334
1-A 57 3022.8N 4928.0E 301789 976415
1-A 58 3023.6N 4928.7E 301789 926496
1-A 59 3024.5N 4929.5E 301774 927577
1-A 60 3025.3N 4930.3E 301770 927658
1-A 61 3026.1N 4931.1E 301755 927678
2-B 308 26439.9S 4546.0W 269692 971394
2-B 309 26440.3S 4544.9W 269788 971404
2-B 310 26440.8S 4543.9W 269184 974092
2-B 311 26441.3S 4542.8W 267279 971142
2-B 312 26441.7S 4541.7W 269737 971492
2-B 313 26442.2S 4540.7W 269741 971422
2-B 314 26442.7S 4539.6W 269767 971429
2-B 315 26443.1S 4538.5W 269663 974342
2-B 316 26443.6S 4537.5W 267759 971392
2-B 317 26444.1S 4536.4W 269785 971442
2-B 318 26444.6S 4535.3W 269751 971442
2-B 319 26445.0S 4534.2W 269047 971454
4-100 67 4626.4N 1546.4W 284850 109716
4-100 68 4623.5N 1545.0W 284090 100682
4-100 69 4620.5N 1543.5W 284922 100528
4-100 70 4617.6N 1542.1W 284939 109634
4-100 71 4614.6N 1540.6W 284908 109590
4-100 72 4611.7N 1539.2W 284647 109564
4-100 73 4648.7N 1537.7W 289787 109352
4-100 74 4645.8N 1536.3W 284926 109557
4-100 75 4642.8N 1534.8W 285065 100463
4-100 76 4559.9N 1533.4W 280205 100469
(etc)

My second file contains, on separate lines and separated by whitespace, the values that need to be replaced in the first text file and their replacements:

1-A 12-FOO-1-A
2-B 44-BAR-2-B
1-400 123-D1-400
(etc)

If possible, I would like to be able to read in both files and have it run the replace function for each line of input in the second file.

Thanks for your time,
- J
Feb 10 '10 #3
chaarmann
785 Expert 512MB
@Polarism
You should set it up so that it reads and processes the input line by line. Don't read all the input into a big array first and then start processing! Just read a single line from the input, make your replacements, write out the replaced line, then read the next line, and so on.
You can make a configuration file for all your replacements. I propose XML syntax (because of its compatibility and the plugins available for it), for example:
  <replace what="1-A" with="12-FOO-1-A" />
  <replace what="2-B" with="44-BAR-2-B" />
  ...
If it's a one-time job, you could also hardcode it directly as an array in your program.

When you read an input line, go through your array and run all the replacements one by one on the current line. You will probably run into the problem of replacing something that has already been replaced: for example, replacing "aa" with "bb" and then "bb" with "cc", so that "aa" finally becomes "cc". To avoid that, you can wrap the replaced parts in braces {} or any other characters that don't occur in your source file. Then use a regular expression with a lookahead for the next brace. If there is none, or it is an opening brace, do the replacement; otherwise don't, because you are inside an already replaced part.
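
As an alternative to the brace markers (this is a different technique, not the one described above), all keys can be combined into one alternation and substituted in a single pass. The regex engine resumes scanning after each match in the original text, so replaced parts are never looked at again. The following is only a minimal sketch: the %map contents and the name replace_once are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical mapping, chosen so that chained replacement would go wrong:
# applying "aa"->"bb" and then "bb"->"cc" in sequence turns "aa" into "cc".
my %map = (aa => 'bb', bb => 'cc');

# Build one alternation of all keys, longest first, so that a longer key
# wins over a shorter key that is its prefix. quotemeta escapes any regex
# metacharacters in the keys.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) or $a cmp $b }
    keys %map;
my $pattern = qr/($alternation)/;

# replace_once is a hypothetical helper name.
sub replace_once {
    my ($line) = @_;
    # Each match is looked up in the hash. Scanning continues after the
    # matched text, so inserted replacements are never matched again.
    $line =~ s/$pattern/$map{$1}/g;
    return $line;
}

print replace_once("aa bb aa"), "\n";   # prints "bb cc bb"
```

With 350 keys the alternation gets long, but it is compiled only once and every input line is still scanned a single time.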
Feb 12 '10 #4
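I suggest populating the keys and values into a hash and then looping over it to apply each substitution:
  my %hash = ('1-A' => '12-FOO-1-A');
  while (my ($key, $value) = each %hash) {
      $string =~ s/\Q$key\E/$value/g;   # \Q...\E treats the key literally
  }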
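
Putting the hash idea together with the two files posted earlier in the thread, a complete program might look roughly like the sketch below. It rests on assumptions: the mapping file holds one "old new" pair per line, the value to replace is always the first whitespace-delimited column of the data file, and the names load_map, translate, map.txt and data.txt are made up for illustration. The data is processed one line at a time, so the large file is never held in memory.

```perl
use strict;
use warnings;

# Read "old new" pairs, one per line, into a hash.
# (load_map is a hypothetical name.)
sub load_map {
    my ($fh) = @_;
    my %map;
    while (my $line = <$fh>) {
        my ($old, $new) = split ' ', $line;
        next unless defined $new;    # skip blank or incomplete lines
        $map{$old} = $new;
    }
    return \%map;
}

# Copy the data from $in to $out one line at a time, replacing the first
# column when it is a key of the map. Everything else is left untouched.
sub translate {
    my ($map, $in, $out) = @_;
    while (my $line = <$in>) {
        $line =~ s/^(\S+)/exists($map->{$1}) ? $map->{$1} : $1/e;
        print {$out} $line;
    }
}

# Assumed usage: perl replace.pl map.txt data.txt > data.new
if (@ARGV == 2) {
    my ($map_file, $data_file) = @ARGV;
    open my $map_fh,  '<', $map_file  or die "Cannot open $map_file: $!";
    open my $data_fh, '<', $data_file or die "Cannot open $data_file: $!";
    translate(load_map($map_fh), $data_fh, \*STDOUT);
}
```

Because the first column is looked up directly in the hash, each line costs one lookup instead of 350 substitutions, and a replaced value such as 12-FOO-1-A cannot be replaced a second time.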
Feb 12 '10 #5
