Reading Simple Flatfiles with a COBOL Mindset

assumption: I am new to C and old to COBOL

I have been reading a lot (self teaching) but something is not sinking
in with respect to reading a simple file - one record at a time.
Using C, I am trying to read a flatfile. In COBOL, my simple file
layout and READ statement would look like below.

Question: what is the standard, simple coding convention for reading
in a flatfile - one record at a time?? SCANF does not work because of
spaces; I tried FGETS and STRUCT to emulate my COBOL perspective but
that does not work (though I may have coded this wrong). C likes to
deliver data in streams but FGETS is akin to reading a single record.

I know I am missing something that is very simple but the examples
that I have come across avoid this simple scenario. Please explain -
an example would be great.

thanks
kevin
.......

01 employee-record.
   03 emp-id pic 9(5).
   03 emp-dept pic x(5).
   03 emp-name.
      05 emp-name-last pic x(20).
      05 emp-name-first pic x(20).
   03 emp-hire-date.
      05 emp-hire-date-mm pic 9(2).
      05 emp-hire-date-dd pic 9(2).
      05 emp-hire-date-yy pic 9(4).

read employee-flatfile into employee-record.
Nov 14 '05 #1
In article <ae**************************@posting.google.com>,
KevinD <de******@dteenergy.com> wrote:
> assumption: I am new to C and old to COBOL

If you haven't already, get a copy of K&R2[1] and read it. If you've
done programming of almost any sort before, it's as good an introduction
to C as you'll find.

Also worth reading is Steve Summit's C FAQ at
http://www.eskimo.com/~scs/C-faq/top.html . The HTML version is somewhat
out of date, but the first link on the page points you at alternate
versions, including a more up-to-date text one.
Now on to your question...
> I have been reading a lot (self teaching) but something is not sinking
> in with respect to reading a simple file - one record at a time.
> Using C, I am trying to read a flatfile. In COBOL, my simple file
> layout and READ statement would look like below.
>
> Question: what is the standard, simple coding convention for reading
> in a flatfile - one record at a time?? SCANF does not work because of
> spaces; I tried FGETS and STRUCT to emulate my COBOL perspective but
> that does not work (though I may have coded this wrong). C likes to
> deliver data in streams but FGETS is akin to reading a single record.

struct[2] is, if I'm not grossly misunderstanding you, approximately
equivalent to what you're thinking about as a "record", so you'll
probably want to eventually stuff the data you read into a struct and
return that struct.

fgets reads a *line* at a time. It looks like your file format uses
multiple lines per record, so you'd end up calling fgets multiple times
and then looking at what's in each line.

If this is a fixed format (you know that the next line you read when you
start will be the "01 employee-record." followed by 9 lines of data as
described below), things are a little bit easier, but a fully general
routine to read such a file format would read a line (with fgets), figure
out what it's describing the beginning of (sscanf would be a good place to
start for that, though it may end up not being what you need), and then
read the next lines (with fgets again) and extract the appropriate data
(with sscanf, various str* functions, and/or your own parsing code).
At each step the data extracted would be put somewhere accessible,
probably into a struct that you end up returning.

(You obviously don't want to do all this inline every time you want
to read a record, so once you've got it working wrap it up nicely in a
function so that when you want to actually read a record it's a one-line
function call.)

> 01 employee-record.
>    03 emp-id pic 9(5).
>    03 emp-dept pic x(5).
>    03 emp-name.
>       05 emp-name-last pic x(20).
>       05 emp-name-first pic x(20).
>    03 emp-hire-date.
>       05 emp-hire-date-mm pic 9(2).
>       05 emp-hire-date-dd pic 9(2).
>       05 emp-hire-date-yy pic 9(4).


As a zeroth approximation to what you'd want:
--------
struct employee_name
{
    char *last;
    char *first;
};
struct employee_hire_date
{
    int month;
    int day;
    int year;
};
struct employee_record
{
    EMP_ID_TYPE/*int?*/ emp_id;
    DEPT_TYPE/*int?*/ emp_dept;
    struct employee_name emp_name;
    struct employee_hire_date emp_hire_date;
};

struct employee_record read_employee_record(FILE *in)
{
    char buf[1024];
    struct employee_record ret;

    fgets(buf,sizeof buf,in);
    /*Check that this is the start-of-record line
        -OR-
      assume that we're called because something else read that line
      (which means this bit belongs in the caller)
    */

    fgets(buf,sizeof buf,in);
    /*Check that this line is the emp-id line, and extract the
      value represented into ret.emp_id
    */

    /*Similar to above for emp_dept*/

    /*read_employee_name() will read the lines describing the name,
      extract the relevant information, and return it packed into
      a struct employee_name
    */
    ret.emp_name=read_employee_name(in);

    /*Similar to above for emp_hire_date*/

    return ret;
}

/*And when you want to read a record, do something like
    the_record=read_employee_record(my_input_file);
*/
--------
dave

[1] "The C Programming Language, 2nd edition", Brian W. Kernighan
and Dennis M. Ritchie, ISBN 0-13-110362-8 (paperback), 0-13-110370-9
(hardback).

[2] C is case-sensitive. Get used to using lower-case when you're talking
about parts of the language, since that makes it easier for C
programmers to understand.

--
Dave Vandervies dj******@csclub.uwaterloo.ca
[A]nd yet just *this* topic, the Standard C language, provokes questions,
queries, puzzlement, anguish, outrage, confusion, enlightenment, applications,
critiques, and corrections. We won't need *more*. --Chris Dollin in CLC
Nov 14 '05 #2
KevinD wrote:
> assumption: I am new to C and old to COBOL
>
> I have been reading a lot (self teaching) but something is not sinking
> in with respect to reading a simple file - one record at a time.
> Using C, I am trying to read a flatfile. In COBOL, my simple file
> layout and READ statement would look like below.
>
> Question: what is the standard, simple coding convention for reading
> in a flatfile - one record at a time?? SCANF does not work because of
> spaces; I tried FGETS and STRUCT to emulate my COBOL perspective but
> that does not work (though I may have coded this wrong). C likes to
> deliver data in streams but FGETS is akin to reading a single record.
>
> I know I am missing something that is very simple but the examples
> that I have come across avoid this simple scenario. Please explain -
> an example would be great.


C's I/O streams have no notion of "record," aside from
the fairly weak notion of "line" as expressed in fgets() and
a few other, relatively obscure parts of the library.

But don't despair. Ask yourself "What does a record
look like, when thought of as an undifferentiated stream of
bytes?" Then read the appropriate number of bytes from the
stream and impose your interpretation on them: The first
five are alphabetic characters denoting a stock ticker
symbol, the next ten are decimal digits giving the last
trade price in millicents, the next twenty are alphabetics
giving the name of the latest executive to serve his
company from behind bars, and so on. Extract whatever's
needed from this layout, convert it to more convenient forms
if you like (e.g., the fields of decimal digits might become
`int' or `double' values), and away you go.

C streams come in two principal flavors (three, really,
but I have a hunch you're not interested in wide characters
just yet): there are text streams and binary streams. If
your data file looks like a bunch of lines consisting of
textual characters, you should access it with a text stream:
use "r" as the second argument to fopen(). But if your
file contains "binary garbage" like numbers expressed in
binary or packed decimal format, use a binary stream: pass
"rb" as fopen()'s second argument.

In the binary case, it *may* be that you can use fread()
to plop a fixed number of bytes from the file straight into
a properly-arranged struct, and be on your way without any
need for further interpretation. However, there are lots of
pitfalls in this approach, and I couldn't recommend it without
knowing a lot more about your situation than I'm likely to
have time to discover.
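
The "read the appropriate number of bytes and impose your interpretation" idea might look roughly like the sketch below. It assumes (this is not stated in the thread) that records are stored back to back with no newline between them, 58 bytes each, in the order given by the COBOL layout, and that the stream was opened with "rb". The names `struct employee`, `copy_field`, `digits_field`, `parse_employee` and `next_employee` are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RECORD_LEN 58  /* 5+5+20+20+2+2+4, taken from the COBOL layout */

/* Hypothetical in-memory form of one record. */
struct employee {
    long id;
    char dept[5 + 1];    /* +1 for the terminating '\0' */
    char last[20 + 1];
    char first[20 + 1];
    int  mm, dd, yyyy;
};

/* Copy n bytes starting at src into dst and NUL-terminate; dst needs n+1 bytes. */
static void copy_field(char *dst, const char *src, size_t n)
{
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/* Convert n decimal digits starting at src into a long; n must be below 16. */
static long digits_field(const char *src, size_t n)
{
    char tmp[16];
    copy_field(tmp, src, n);
    return strtol(tmp, NULL, 10);
}

/* Impose the layout on one RECORD_LEN-byte record image. */
static void parse_employee(const char *rec, struct employee *e)
{
    e->id = digits_field(rec, 5);
    copy_field(e->dept, rec + 5, 5);
    copy_field(e->last, rec + 10, 20);
    copy_field(e->first, rec + 30, 20);
    e->mm = (int)digits_field(rec + 50, 2);
    e->dd = (int)digits_field(rec + 52, 2);
    e->yyyy = (int)digits_field(rec + 54, 4);
}

/* Read the next record from a binary stream.
   Returns 1 on success, 0 on end of file or a short read. */
static int next_employee(FILE *in, struct employee *e)
{
    char rec[RECORD_LEN];
    if (fread(rec, 1, sizeof rec, in) != sizeof rec)
        return 0;
    parse_employee(rec, e);
    return 1;
}
```

A caller would loop with `while (next_employee(fp, &e)) { ... }`. The text fields keep their trailing space padding; trimming it is left out to keep the sketch short.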

--
Er*********@sun.com

Nov 14 '05 #3
"KevinD" <de******@dteenergy.com> wrote in message
news:ae**************************@posting.google.com...
> assumption: I am new to C and old to COBOL
>
> I have been reading a lot (self teaching) but something is not sinking
> in with respect to reading a simple file - one record at a time.

I think the conceptual 'gap' is because you perhaps don't
realize that C's i/o system has no notion of a 'record'.
Everything is a 'stream of characters'. If you want to
impose some sort of 'structure' such as a fixed record
length, you do that yourself in your code.

> Using C, I am trying to read a flatfile. In COBOL, my simple file
> layout and READ statement would look like below.
>
> Question: what is the standard, simple coding convention for reading
> in a flatfile - one record at a time??

There isn't one, since there's no notion of 'record'.

> SCANF does not work because of
> spaces; I tried FGETS

'fgets()' reads up to a newline or end of file, thus it effectively
reads variable length 'records', delimited by newline characters.

> and STRUCT

'struct' (note the all lower-case; C is case-sensitive) is indeed
part of the solution.

> to emulate my COBOL perspective but
> that does not work (though I may have coded this wrong). C likes to
> deliver data in streams but FGETS is akin to reading a single record.


Sort of. :-)

If you want a fixed 'flat' record size,

Open your file in binary mode
(see second argument to 'fopen()')

Create a record type using the 'struct' keyword, e.g.

struct record
{
    char name[30];
    char phone[16];
};

Read the file with 'fread()', and write to it with 'fwrite()' (these
are the 'unformatted' i/o functions ). Move around in the file with
'fseek()'. Also 'ftell()' may be of use.

Finally, note that this will render your data file platform-specific.
(Binary representations of data can and do vary among platforms).
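
As a rough sketch of the steps above, the two helpers below store and fetch the n-th fixed-size record in a file opened in binary mode. `put_record` and `get_record` are hypothetical names, not library functions, and the file produced this way is only meant to be read back by the same program on the same platform.

```c
#include <stdio.h>

struct record
{
    char name[30];
    char phone[16];
};

/* Write *r as record number n (0-based). Returns 0 on success, -1 on error. */
static int put_record(FILE *fp, long n, const struct record *r)
{
    if (fseek(fp, n * (long)sizeof *r, SEEK_SET) != 0)
        return -1;
    return fwrite(r, sizeof *r, 1, fp) == 1 ? 0 : -1;
}

/* Read record number n (0-based) into *r. Returns 0 on success,
   -1 on error or if the file holds no such record. */
static int get_record(FILE *fp, long n, struct record *r)
{
    if (fseek(fp, n * (long)sizeof *r, SEEK_SET) != 0)
        return -1;
    return fread(r, sizeof *r, 1, fp) == 1 ? 0 : -1;
}
```

With a stream opened as, say, `fopen("phones.dat", "r+b")` (the file name is made up), `get_record(fp, 1, &rec)` fetches the second record directly, and `ftell(fp) / (long)sizeof rec` afterwards gives the number of the record that would be read next.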

-Mike
Nov 14 '05 #4
KevinD wrote:

> I have been reading a lot (self teaching) but something is not
> sinking in with respect to reading a simple file - one record at
> a time. Using C, I am trying to read a flatfile. In COBOL, my
> simple file layout and READ statement would look like below.


C doesn't have records, in your sense. Even if you create a
struct that mirrors your record, there is no guarantee that
writing it to a file, or reading it back, produces the same byte
layout under any other compiler or system.

What C does have is streams of bytes that can be read from or written to a
file. Sometimes these may be characters, with lines demarcated by
newline markers, and then known as a text file. Some systems have
special processing for text files, others do not.

Your task, should you deign to accept it, is to discover the exact
file format required, in terms of a sequence of byte values, and
design code to transfer suitable blocks to and from the files.

I suspect that all the fields shown in your Cobol example are
actually text fields in that they hold representation of chars in
some code or other. If they are EBCDIC they will usually need
translation for a C system.
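
If the file does turn out to be EBCDIC, the translation could be sketched as below. This handles only space, digits and unaccented letters, whose code points are the same in the common EBCDIC code pages; punctuation and everything else would need a full table for the specific code page, which is not known here. The function names are made up for illustration.

```c
#include <stddef.h>

/* Translate one EBCDIC byte to the host character set.
   Only space, digits and A-Z/a-z are handled; anything else becomes '?'. */
static char ebcdic_to_host(unsigned char c)
{
    if (c == 0x40) return ' ';
    if (c >= 0xF0 && c <= 0xF9) return (char)('0' + (c - 0xF0));
    if (c >= 0xC1 && c <= 0xC9) return "ABCDEFGHI"[c - 0xC1];
    if (c >= 0xD1 && c <= 0xD9) return "JKLMNOPQR"[c - 0xD1];
    if (c >= 0xE2 && c <= 0xE9) return "STUVWXYZ"[c - 0xE2];
    if (c >= 0x81 && c <= 0x89) return "abcdefghi"[c - 0x81];
    if (c >= 0x91 && c <= 0x99) return "jklmnopqr"[c - 0x91];
    if (c >= 0xA2 && c <= 0xA9) return "stuvwxyz"[c - 0xA2];
    return '?';
}

/* Translate an n-byte EBCDIC field into dst and NUL-terminate it;
   dst needs room for n+1 characters. */
static void ebcdic_field_to_host(char *dst, const unsigned char *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = ebcdic_to_host(src[i]);
    dst[n] = '\0';
}
```

Such a file would have to be opened in binary mode ("rb"), since the notion of a text line depends on the host character set.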

--
"Churchill and Bush can both be considered wartime leaders, just
as Secretariat and Mr Ed were both horses." - James Rhodes.
"We have always known that heedless self-interest was bad
morals. We now know that it is bad economics" - FDR
Nov 14 '05 #5
de******@dteenergy.com (KevinD) wrote:
# assumption: I am new to C and old to COBOL
#
# I have been reading a lot (self teaching) but something is not sinking
# in with respect to reading a simple file - one record at a time.
# Using C, I am trying to read a flatfile. In COBOL, my simple file
# layout and READ statement would look like below.
#
# Question: what is the standard, simple coding convention for reading
# in a flatfile - one record at a time?? SCANF does not work because of
# spaces; I tried FGETS and STRUCT to emulate my COBOL perspective but
# that does not work (though I may have coded this wrong). C likes to
# deliver data in streams but FGETS is akin to reading a single record.
#
# I know I am missing something that is very simple but the examples
# that I have come across avoid this simple scenario. Please explain -
# an example would be great.
#
# thanks
# kevin
#
#
# ......

# 01 employee-record.
# 03 emp-id pic 9(5).
# 03 emp-dept pic x(5).
# 03 emp-name.
# 05 emp-name-last pic x(20).
# 05 emp-name-first pic x(20).
# 03 emp-hire-date.
# 05 emp-hire-date-mm pic 9(2).
# 05 emp-hire-date-dd pic 9(2).
# 05 emp-hire-date-yy pic 9(4).

On most C implementations you can overlay a struct of just char[] fields
on a char string without worrying about padding. However, to be safer and
to do the type conversions (most C implementations require decimal-to-binary
conversion to do arithmetic, and zero-byte-terminated strings), I would be
inclined to write a parser routine.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct employee_record {
    int emp_id;
    char emp_dept[5+1]; /*+1 for zero byte terminator*/
    struct {
        char emp_name_last[20+1];
        char emp_name_first[20+1];
    } emp_name;
    struct {
        int emp_hire_date_mm;
        int emp_hire_date_dd;
        int emp_hire_date_yy;
    } emp_hire_date;
};

/*Convert the n digits at line[*pos] to a number and advance *pos.*/
static long pic9(int n,char *line,int *pos) {
    long num = 0;
    char *t = malloc(n+1);
    if (t) { /*on allocation failure the field reads as 0*/
        memcpy(t,line+*pos,n); t[n] = 0;
        num = strtol(t,0,10);
        free(t);
    }
    *pos += n;
    return num;
}

/*Copy the n characters at line[*pos] into string (which needs n+1 bytes),
  terminate it, and advance *pos.*/
static void picx(char *string,int n,char *line,int *pos) {
    memcpy(string,line+*pos,n); string[n] = 0;
    *pos += n;
}

/*Read one record; returns 0 on success, -1 on end of file or a bad line.*/
static int read_employee_record(
    FILE *employee_flatfile,
    struct employee_record *employee_record
) {
    char line[5+5+20+20+2+2+4+2]; /*58 data characters + newline + zero byte*/
    if (fgets(line,sizeof line,employee_flatfile)) {
        char *nl = strchr(line,'\n');
        if (nl) *nl = 0;
        if (strlen(line)==sizeof line-2) {
            int pos = 0;
            employee_record->emp_id = pic9(5,line,&pos);
            picx(employee_record->emp_dept,5,line,&pos);
            picx(employee_record->emp_name.emp_name_last,20,line,&pos);
            picx(employee_record->emp_name.emp_name_first,20,line,&pos);
            employee_record->emp_hire_date.emp_hire_date_mm = pic9(2,line,&pos);
            employee_record->emp_hire_date.emp_hire_date_dd = pic9(2,line,&pos);
            employee_record->emp_hire_date.emp_hire_date_yy = pic9(4,line,&pos);
            return 0;
        }else
            return -1;
    }else
        return -1;
}

# read employee-flatfile into employee-record.

read_employee_record(employee_flatfile,&employee_record);
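
For comparison, the overlay approach mentioned at the top of this post could be sketched like this. It assumes a 58-character line image with any newline already removed; `struct employee_image` and `overlay_employee` are hypothetical names. Because padding is not formally ruled out, the size is checked at run time rather than taken on trust.

```c
#include <string.h>

/* Mirrors the COBOL layout byte for byte. Every member is a char array,
   so compilers normally insert no padding (verified in overlay_employee). */
struct employee_image {
    char emp_id[5];
    char emp_dept[5];
    char emp_name_last[20];
    char emp_name_first[20];
    char emp_hire_date_mm[2];
    char emp_hire_date_dd[2];
    char emp_hire_date_yy[4];
};

/* Copy one 58-character line image into *img.
   Returns 0 on success, -1 if the line is too short or the struct is padded. */
static int overlay_employee(const char *line, struct employee_image *img)
{
    if (sizeof *img != 58 || strlen(line) < sizeof *img)
        return -1;
    memcpy(img, line, sizeof *img);
    return 0;
}
```

The fields are not zero terminated, so they have to be printed with an explicit length, for example `printf("%.20s\n", img.emp_name_last)`, and the numeric fields are still digit characters that need converting before any arithmetic. That is the trade-off against the parser routine above.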

--
SM Ryan http://www.rawbw.com/~wyrmwif/
Haven't you ever heard the customer is always right?
Nov 14 '05 #6
Dominic Shields wrote:

> On 9 Sep 2004 12:55:56 -0700, de******@dteenergy.com (KevinD) wrote:
>> I tried FGETS and STRUCT to emulate my COBOL perspective but
>> that does not work (though I may have coded this wrong). C likes to
>> deliver data in streams but FGETS is akin to reading a single record.
>
> I use the word "record" to mean that which is in between two newlines.


I use the word "line" when discussing streams.

N869
7.19.2 Streams
[#2] A text stream is an ordered sequence of characters
composed into lines, each line consisting of zero or more
characters plus a terminating new-line character.

--
pete
Nov 14 '05 #7
