searching and comparing in files

Hi,

I have the following problem:
In one file (addresses) I have a bunch of lines like this:

New 0x8048240 Old 0x0: jne 804824a
New 0x8048253 Old 0x0: je 8048293
New 0x80482c7 Old 0x0: jp 80482e0
....

In another file (ranges) I have this:

[804820c-8048249]
[804824f-804826b]
[8048283-8048292]
....

I would like to do the following in C:
for each line in the addresses file (for example: New 0x8048240 Old 0x0: jne
804824a) I need to check whether the first address (8048240) AND the second
address (804824a) are each located in one of the ranges in the ranges file (the
2 addresses may be situated in different ranges). Then I also need to count
them by type of jump (for example: jne: 12 are good).

I've been messing around with fscanf and the like, but I'm no good at it...
Can someone help me, please?

Thank you in advance
Chéraaar
Nov 14 '05 #1
Sure. Give us your code cooked down to the minimum, point out your
problems and get help plus an invaluable code review.

To get you started on format strings:
unsigned long rangestart, rangeend, addnew, addjump;
.....
if (fscanf(addresses,"New 0x%lx Old 0x%*lx: %*[a-z] %lx%*[^\n]\n",
           &addnew, &addjump) != 2)
{
    /* Handle error */
}
..... /* Same for range */
if (addnew >= rangestart && addnew <= rangeend
    && addjump >= rangestart && addjump <= rangeend)
{
    ....
}

Obviously untested as there was no minimal example for me to
play with.
Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 14 '05 #2
thx a lot for the info
I have tried the following:

unsigned long range_start, range_end, add_new, add_jump, test1, test2;

list = fopen("list", "r");
diota1 = fopen("diota", "r");
diota2 = fopen("diota", "r");
int count_new_ok = 0;
int count_both_ok = 0;

while(fscanf(list,"New 0x%lx Old 0x%*lx: %*[a-z] %lx%*[^\n]\n", &add_new,
             &add_jump) == 2)
{
    while(fscanf(diota1,"[%lx-%lx]%*[^\n]\n", &range_start,
                 &range_end) == 2)
    {
        if (add_new >= range_start && add_new <= range_end)
        {
            count_new_ok++;
            while(fscanf(diota2,"[%lx-%lx]%*[^\n]\n",
                         &range_start, &range_end) == 2)
            {
                if (add_jump >= range_start && add_jump <= range_end)
                {
                    count_both_ok++;
                    /* here I need to keep track of the different
                       kinds of jumps defined in [a-z] */
                    break;
                }
            }
            break;
        }
    }
}
fclose(list);
fclose(diota1);
fclose(diota2);

printf("count_new_ok: %d\ncount_both_ok: %d\n", count_new_ok,
       count_both_ok);

where list is like this:
New 0x8048214 Old 0x0: jbe 8048224
New 0x8048234 Old 0x0: jae 8048273
New 0x804823b Old 0x0: jno 8048259
New 0x804824c Old 0x0: jp 8048255
New 0x8048267 Old 0x0: jp 804826e
New 0x8048287 Old 0x0: jo 804828c
New 0x804829d Old 0x0: jnp 80482cc
New 0x80482d7 Old 0x0: ja 80482e8

and diota:
[804820c-8048223]
[8048228-8048235]
[8048250-8048254]
[8048273-804828b]
[804828f-80482ab]
[80482c1-80482cb]
[80482d0-80482d8]
[80482de-80482ee]
[80482ef-80482f4]
[8048304-8048323]

The "new" address is checked first; if it is not in any interval, the program
moves on to the next line. If it is in an interval, the "jump" address is
checked as well.

As you can see, the first line would give count_new_ok++ because the "new"
address is in the first interval, while the "jump" address is in no interval.
The second line would give count_new_ok++ AND count_both_ok++ because the
"new" address is in the second interval and the "jump" address is in the 4th
interval... and so on for all the lines in the file "list" (the addresses file
from my first post).
Now the problem I have is that this program only checks the first line... it
doesn't go any further. The output I get is 1 for count_new_ok and 0 for
count_both_ok. I don't know why it doesn't go on to the next lines.

Then there is another thing: I need to know how many jumps of each type are
both_ok. For example, the second line would give jae_ok++ (there are 16 types
of jumps). I need a string to store the type of jump from the list file and,
according to that, increment the right counter.

Thank you very much for your time and knowledge Cheers, Chéraaar
Nov 14 '05 #3
never mind, problem solved :)
Nov 14 '05 #4
chéraaar wrote:
> never mind, problem solved :)

Grrr, just as I was about to post a corrected version of your code...
No joke.

Cheers ;-)
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 14 '05 #5

"Michael Mair" <Mi**********@invalid.invalid> wrote in message
news:3a*************@individual.net...
> chéraaar wrote:
>> never mind, problem solved :)
>
> Grrr, just as I was about to post a corrected version of your code...
> No joke.
>
> Cheers ;-)
> Michael

Hehe, sorry, but I'm still interested, because I solved it by editing the
files I'm reading from...
Instead of using the scan string you told me to use, I removed all unnecessary
characters, so that I have for example:
804520 80453a jle
and I use:
while(fscanf(list, "%lx %lx %s", &address_new[n], &address_jump[n],
             &opcode[n])==3) n++;
and this works to read the whole file into these arrays (which I then use to
make the checks), which wasn't working with the unedited version of list...
So if you could point out to me why this isn't working with
while(fscanf(list,"New 0x%lx Old 0x%*lx: %*[a-z] %lx%*[^\n]\n", &add_new,
&add_jump) == 2) that would be very nice :)

Greetz,
Chéraaar


Nov 14 '05 #6
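The whitespace belongs at the beginning. %*[^\n] fails when there is nothing
left before the newline, so fscanf stops right there, the trailing \n in the
format is never applied, and the next call trips over the leftover newline
when it expects "New". I.e.
" New 0x%lx Old 0x%*lx: %*[a-z] %lx%*[^\n]"
should do.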
The code below is not well tested and not the cleanest way to do
it but I was not sure about your intent.

Apart from that, there are tools/languages much better suited to
purposes like that.
Cheers
Michael
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define STRINGIZE(s) #s
#define XSTR(s) STRINGIZE(s)
#define JMP_NAMELEN 4
#define JMP_SCANFMT "%" XSTR(JMP_NAMELEN) "s"

enum jumps { JBE, JAE, JNO, JP, JO, JNP, JA, NUM_COUNTERS };
const char jumpnames[NUM_COUNTERS][JMP_NAMELEN] = {
    "jbe", "jae", "jno", "jp", "jo", "jnp", "ja"
};

int main (void)
{
    unsigned int count_new = 0, count_jump[NUM_COUNTERS] = {0};
    int success;
    char buf[JMP_NAMELEN + 1];  /* %4s stores up to 4 chars plus '\0' */
    unsigned long range_start, range_end, add_new, add_jump;
    FILE *list, *diota1, *diota2;
    size_t i;

    success = 0;
    if ( !(list = fopen("list", "r")) ) {
        /* Handle error */
    }
    else if ( !(diota1 = fopen("diota", "r")) ) {
        /* Handle error */
        fclose(list);
    }
    else if ( !(diota2 = fopen("diota", "r")) ) {
        /* Handle error */
        fclose(diota1);
        fclose(list);
    } else {
        success = 1;
    }

    if (!success) {
        fprintf(stderr, "Error opening files\n");
        exit(EXIT_FAILURE);
    }

    success = 1;
    /* Note: each line of "list" is compared with the next line of "diota"
       only (both files are read in lockstep), not with every range. */
    while (fscanf(list, " New 0x%lx Old 0x%*lx: " JMP_SCANFMT
                  " %lx%*[^\n]", &add_new, buf, &add_jump) == 3)
    {
        if (fscanf(diota1, " [%lx-%lx]%*[^\n]", &range_start, &range_end)
            != 2)
        {
            success = 0; break;
        }
        if (add_new >= range_start && add_new <= range_end) {
            count_new++;
        }
        if (fscanf(diota2, " [%lx-%lx]%*[^\n]", &range_start, &range_end)
            != 2)
        {
            success = 0; break;
        }
        if (add_jump >= range_start && add_jump <= range_end) {
            /* Find the counter that belongs to this jump name */
            for (i=0; i<NUM_COUNTERS; i++) {
                if (strncmp(jumpnames[i], buf, JMP_NAMELEN) == 0) {
                    count_jump[i]++;
                    break;
                }
            }
            if (i == NUM_COUNTERS) {  /* unknown jump name */
                success = 0; break;
            }
        }
    }
    if (!feof(list))
        success = 0;
    fclose(diota2);
    fclose(diota1);
    fclose(list);

    if (!success) {
        fprintf(stderr, "Error while scanning files\n");
        exit(EXIT_FAILURE);
    }

    /* The counters are unsigned, so print them with %u */
    printf("count_new: %3u\n", count_new);
    for (i=0; i<NUM_COUNTERS; i++)
        printf("count_jump[%.*s]: %3u\n", JMP_NAMELEN, jumpnames[i],
               count_jump[i]);
    printf("\n");

    return 0;
}

--
E-Mail: Mine is an /at/ gmx /dot/ de address.
Nov 14 '05 #7
On Thu, 24 Mar 2005 11:17:09 +0100, Michael Mair wrote:
> The whitespace belongs at the beginning, i.e.
> " New 0x%lx Old 0x%*lx: %*[a-z] %lx%*[^\n]"
> should do.
> The code below is not well tested and not the cleanest way to do
> it but I was not sure about your intent.
>
> Apart from that, there are tools/languages much better suited to
> purposes like that.

gonna try that.
Thank you so much for all the help and the code!
Chéraaar

Nov 14 '05 #8
