473,406 Members | 2,867 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,406 software developers and data experts.

Text editor data structures

SD
I am thinking about writing a text editor in C for unix sometime soon.
I am just doing this to learn more about C. I want to write something
like ed.c, a simple line editor. What types of data structures would be
appropriate? I am thinking about using a linked list, but I am also
wondering whether a tree would be useful. Please give me your
ideas...thanks, tilak

Nov 14 '05 #1
14 8755
SD wrote:
I am thinking about writing a text editor in C for unix sometime soon.
I am just doing this to learn more about C. I want to write something
like ed.c, a simple line editor. What types of data structures would be
appropriate? I am thinking about using a linked list, but I am also
wondering whether a tree would be useful. Please give me your
ideas...thanks, tilak


In this n.g. , we primarily the C programming language, as prescribed
by the standard.

<OT>
Having said that, if you are starting on writing a text editor, get a
copy of 'Design Patterns' - Erich Gamma et al. That should help you a
lot.
</OT>
Nov 14 '05 #2

SD wrote:
I am thinking about writing a text editor in C for unix sometime soon. I am just doing this to learn more about C. I want to write something
like ed.c, a simple line editor. What types of data structures would be appropriate? I am thinking about using a linked list, but I am also
wondering whether a tree would be useful. Please give me your
ideas...thanks, tilak


Asking what data structure to use witout any context of what exactly
you want to use it for is not likely to get you a very good answer. You
probably need to spend a little time doing some design and determine
what types of things you need different data structures to do. Do you
need sorting capabilities in a given time? Will you only need to look
through values sequencially? Many other questions will need to be
answered before you can decide on a data structure. The way your
question is currently written no one here has any idea what you really
want to use the structures for.

Nov 14 '05 #3

"SD" <ti****@gmail.com> wrote

I am thinking about writing a text editor in C for unix sometime soon.
I am just doing this to learn more about C. I want to write something
like ed.c, a simple line editor. What types of data structures would be
appropriate? I am thinking about using a linked list, but I am also
wondering whether a tree would be useful. Please give me your
ideas...thanks, tilak

Normally a test editor is buitl around linked lists.

typedef struct line
{
struct line *next;
struct line *prev;
char *text;
} LINE;

define a "line" as any block of test terminated by an "\n" or end of file.

This structure means that insertions, deletions, and block moves can be
implemented quite efficiently. A modern computer is actually so fast that
you can probably move up an entire text file by one byte each time the user
presses a key, but there is no point throwing away CPU cycles.

You will need to call malloc() to grab new blocks, and also to allocate the
text. You might want to allocate lines in units of 100 or so to avoid
contuinal calls to realloc(). This would require a "capacity" member of the
LINE structure.

You probably want LINE *start and LINE *tail to be global variables, for
convenience, and also a global insertion point.

The main problem is displaying the text. You can use printf(), but such an
editor is extremely awkward to use. Really you need to look a a library such
as curses which allows to to place text at will on the screen.
Nov 14 '05 #4
"Malcolm" <ma*****@55bank.freeserve.co.uk> writes:
"SD" <ti****@gmail.com> wrote

I am thinking about writing a text editor in C for unix sometime soon.
I am just doing this to learn more about C. I want to write something
like ed.c, a simple line editor. What types of data structures would be
appropriate? I am thinking about using a linked list, but I am also
wondering whether a tree would be useful. Please give me your
ideas...thanks, tilak

Normally a test editor is buitl around linked lists.

typedef struct line
{
struct line *next;
struct line *prev;
char *text;
} LINE;

define a "line" as any block of test terminated by an "\n" or end of file.

[...]

Is that true? I haven't looked at the internals of any text editors
lately, but I think a lot of them store buffers of characters, not
lists of lines. Some may also use temporary files to avoid using too
much memory when editing large files. One disadvantage of a linked
list of lines is that accessing any given line requires an O(n)
search.

There are a number of open-source text editors. GNU has an "ed"
implementation, for example.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #5

"Keith Thompson" <ks***@mib.org> wrote
Normally a test editor is buitl around linked lists.


Is that true? I haven't looked at the internals of any text editors
lately, but I think a lot of them store buffers of characters, not
lists of lines.

It's ages since I've been in the text-editor writing business, and things
may have moved on. The central problem is that the user expects to be able
to insert text interactively at any point in the document, and on a slow
computer it isn't possible to shift the entire string up by one byte if the
user types text in at the beginning. So some sort of indexing system is
necessary.

A linked list is the most obvious solution, but the newline-terminated line
doesn't have to be your unit. It's just the easiest thing to choose if you
are doing jobs which are inherently line-based.

However there might be a better way of doing things. One very important
feature is the "undo". If you provide this then you do need an undo buffer,
which you can combine with an edit buffer, so it might make sense to abandon
linked lists for some sort of batch update system. If anyone has experience
of writing a modern text editor, then thoughts are welcome.
Nov 14 '05 #6
Much of this question is somewhat off-topic, since it deals more with
software design than the C language in particular.

For a beginner's exercise, using linked lists where each element is a
line is probably the wisest approach. However if you want to be a bit
more aggressive and make the editor a more "industrial grade", then I
would suggest...

1. The principal data structure would be the "buffer."

2. Buffers would be represented as an in-memory header containing
things like a buffer name, an associated filename (if any), root
pointer to a "section tree", a line count, maybe a character descriptor
of some kind, etc...

3. The tree structure would also be in memory and could be done either
as a binary tree or even an n-way tree though I'd probably begin with a
binary tree. The leaves of this tree would be "text sections"

4. A text section would consist of a continuous text string (including
0 or more line terminations) that represents continuously typed in text
data which could exist either in-memory or in temporary files
(depending on length).

5. All edits would be accomplished by the appropriate splitting of text
sections and logical insertions of new ones. Discarded text sections
would not be immediately deleted however.

6. A transaction log data structure would be maintained that would
consist of a header and an extendable array (realloc'd in memory as
necessary) that would sequentially identify edit operations and
maintain pointers to text sections that may no longer be part of the
buffer's current text image. This transaction log would permit the
"undo" operation to be implemented. The transaction log header should
be part of the buffer header it is associated with.

7. The buffer image would be derived by walking the tree and emitting
the text sections in order. This is how a file save would be
implemented.

8. A flatten operation should be implemented that clears out the
transaction log, frees associated memory, and reduces the buffer image
to a single text section.

I would begin by designing the structs needed to implement these
concepts and define the prototypes of the major internal functions.

Nov 14 '05 #7
SD wrote:
I am thinking about writing a text editor in C for unix
sometime soon. I am just doing this to learn more about C. I
want to write something like ed.c, a simple line editor. What
types of data structures would be appropriate?


Buffer Gap and paged Buffer Gap seem to be popular.

Try posting this on comp.editors - they go into great detail
about such things from time to time.

In the meantime, you might like playing with Ant's Editor
vIOCCC91 - which uses Buffer Gap, by the way. To compile
on my system, I type: gcc ant.c -o ant -lcurses. Yours
may be different.

#include <stdio.h>
#include <ctype.h>
#include <curses.h>
#define BUF 0x40000
#define T isspace(*(t=Z(p)))&&
#define V return
#define _ while
#define e int
#define o char
e d,i,j,m,n,p,q,x,y;
o*c,b[BUF],*f,*g=b,*h,*t;

o*Z(e a){if(a<0)V b;V b+a+(b+a<g?0:h-g);}

P(o*a){V a-b-(a<h?0:h-g);}

S(){p=0;}

bf(){n=p=P(c);}

Q(){q=1;}

G(){t=Z(p);_(t<g)*--h=*--g;_(h<t)*g++=*h++;p=P(h);}

B(){_(!T b<t)--p;_(T b<t)--p;}

M(e a){_(b<(t=Z(--a))&&*t-'\n');V b<t?++a:0;}

N(e a){_((t=Z(a++))<c&&*t-'\n');V t<c?a:P(c);}

A(e a,e j){i=0;_((t=Z(a))<c&&*t-'\n'&&i<j){
i+=*t-'\t'?1:8-(i&7);++a;}V a;}

L(){0<p&&--p;}

R(){p<P(c)&&++p;}

U(){p=A(M(M(p)-1),x);}

D(){p=A(N(p),x);}

H(){p=M(p);}

E(){p=N(p);L();}

J(){m=p=M(n-1);_(0<y--)D();n=P(c);}

K(){j=d;_(0<--j)m=M(m-1),U();}

X(){G();p=h<c?P(++h):p;}

F(){FILE*fp=fopen(f,"w");j=p;p=0;G();
fwrite(h,1,(e)(c-h),fp);fclose(fp);p=j;}

W(){_(!T t<c)++p;_(T t<c)++p;}

Y(){m=p<m?M(p):m;if(n<=p){m=N(p);i=m-P(c)?d:d-2;
_(0<i--)m=M(m-1);}move(0,0);i=j=0;n=m;for(;;){p-n||(y=i,x=j);
t=Z(n);if(d<=i||c<=t)break;
if(*t-'\r')addch(*t),j+=*t-'\t'?1:8-(j&7);
if(*t=='\n'||COLS<=j)++i,j=0;++n;}clrtobot();
if(++i<d)mvaddstr(i,0,"<<EOF>>");move(y,x);refresh ();}

I(){G();_((j=getch())!='\f'){
if(j-'\b')g-h&&(*g++=(o)(j-'\r'?j:'\n'));else b<g&&--g;
p=P(h);Y();}}

C(){clear();Y();}

e(*z[])()={L,D,U,R,B,J,K,W,H,E,S,bf,I,X,F,C,Q,G};
o k[]="hjklHJKL[]tbixWRQ";

e main(e u,o**v){FILE*fp;h=c=b+BUF;if(u<2)V 2;initscr();
d=LINES;raw();noecho();idlok(stdscr,1);
if(0!=(fp=fopen(f=*++v,"r"))){g+=fread(b,1,BUF,fp) ;g=g<b?b:g;
fclose(fp);}S();_(!q){Y();j=getch();
for(i=0;k[i]&&j-k[i];++i);(*z[i])();}endwin();V 0;}

--
the end

Nov 14 '05 #8
SD wrote:
I am thinking about writing a text editor in C for unix
sometime soon. I am just doing this to learn more about C. I
want to write something like ed.c, a simple line editor. What
types of data structures would be appropriate?


Buffer Gap and paged Buffer Gap seem to be popular.

Try posting this on comp.editors - they go into great detail
about such things from time to time.

In the meantime, you might like playing with Ant's Editor
vIOCCC91 - which uses Buffer Gap, by the way. To compile
on my system, I type: gcc ant.c -o ant -lcurses. Yours
may be different.

#include <stdio.h>
#include <ctype.h>
#include <curses.h>
#define BUF 0x40000
#define T isspace(*(t=Z(p)))&&
#define V return
#define _ while
#define e int
#define o char
e d,i,j,m,n,p,q,x,y;
o*c,b[BUF],*f,*g=b,*h,*t;

o*Z(e a){if(a<0)V b;V b+a+(b+a<g?0:h-g);}

P(o*a){V a-b-(a<h?0:h-g);}

S(){p=0;}

bf(){n=p=P(c);}

Q(){q=1;}

G(){t=Z(p);_(t<g)*--h=*--g;_(h<t)*g++=*h++;p=P(h);}

B(){_(!T b<t)--p;_(T b<t)--p;}

M(e a){_(b<(t=Z(--a))&&*t-'\n');V b<t?++a:0;}

N(e a){_((t=Z(a++))<c&&*t-'\n');V t<c?a:P(c);}

A(e a,e j){i=0;_((t=Z(a))<c&&*t-'\n'&&i<j){
i+=*t-'\t'?1:8-(i&7);++a;}V a;}

L(){0<p&&--p;}

R(){p<P(c)&&++p;}

U(){p=A(M(M(p)-1),x);}

D(){p=A(N(p),x);}

H(){p=M(p);}

E(){p=N(p);L();}

J(){m=p=M(n-1);_(0<y--)D();n=P(c);}

K(){j=d;_(0<--j)m=M(m-1),U();}

X(){G();p=h<c?P(++h):p;}

F(){FILE*fp=fopen(f,"w");j=p;p=0;G();
fwrite(h,1,(e)(c-h),fp);fclose(fp);p=j;}

W(){_(!T t<c)++p;_(T t<c)++p;}

Y(){m=p<m?M(p):m;if(n<=p){m=N(p);i=m-P(c)?d:d-2;
_(0<i--)m=M(m-1);}move(0,0);i=j=0;n=m;for(;;){p-n||(y=i,x=j);
t=Z(n);if(d<=i||c<=t)break;
if(*t-'\r')addch(*t),j+=*t-'\t'?1:8-(j&7);
if(*t=='\n'||COLS<=j)++i,j=0;++n;}clrtobot();
if(++i<d)mvaddstr(i,0,"<<EOF>>");move(y,x);refresh ();}

I(){G();_((j=getch())!='\f'){
if(j-'\b')g-h&&(*g++=(o)(j-'\r'?j:'\n'));else b<g&&--g;
p=P(h);Y();}}

C(){clear();Y();}

e(*z[])()={L,D,U,R,B,J,K,W,H,E,S,bf,I,X,F,C,Q,G};
o k[]="hjklHJKL[]tbixWRQ";

e main(e u,o**v){FILE*fp;h=c=b+BUF;if(u<2)V 2;initscr();
d=LINES;raw();noecho();idlok(stdscr,1);
if(0!=(fp=fopen(f=*++v,"r"))){g+=fread(b,1,BUF,fp) ;g=g<b?b:g;
fclose(fp);}S();_(!q){Y();j=getch();
for(i=0;k[i]&&j-k[i];++i);(*z[i])();}endwin();V 0;}

--
the end

Nov 14 '05 #9
ed_davis2 wrote:
.... snip ...
#include <stdio.h>
#include <ctype.h>
#include <curses.h>
#define BUF 0x40000
#define T isspace(*(t=Z(p)))&&
#define V return
#define _ while
#define e int
#define o char


The next time this nonsense appears so does a PLONK.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 14 '05 #10
SD wrote:
I am thinking about writing a text editor in C for unix sometime soon.
I am just doing this to learn more about C. I want to write something
like ed.c, a simple line editor. What types of data structures would be
appropriate? I am thinking about using a linked list, but I am also
wondering whether a tree would be useful. Please give me your
ideas...thanks, tilak


Assuming you mean the data structure for the text buffer's contents, one
fancy but very cool and appropriate solution is the rope. Here's a page
discussing SGI's C++ implementation of ropes in a good bit of detail:

http://www.sgi.com/tech/stl/Rope.html

To my knowledge there's no good C implementation of ropes, and this
could be a novel project - but perhaps overkill for someone just
starting out. My two cents.
--
Derrick Coetzee
I grant this newsgroup posting into the public domain. I disclaim all
express or implied warranty and all liability. I am not a professional.
Nov 14 '05 #11
ed_davis2 wrote:
In the meantime, you might like playing with Ant's Editor [ . . . ]

e main(e u,o**v){FILE*fp;h=c=b+BUF;if(u<2)V 2;initscr();
d=LINES;raw();noecho();idlok(stdscr,1);
if(0!=(fp=fopen(f=*++v,"r"))){g+=fread(b,1,BUF,fp) ;g=g<b?b:g;
fclose(fp);}S();_(!q){Y();j=getch();
for(i=0;k[i]&&j-k[i];++i);(*z[i])();}endwin();V 0;}


Sweet, sweet obfuscation.
--
Derrick Coetzee
Nov 14 '05 #12

In article <41**************@yahoo.com>, CBFalconer <cb********@yahoo.com> writes:
ed_davis2 wrote:

#include <stdio.h>
#include <ctype.h>
#include <curses.h>
#define BUF 0x40000
#define T isspace(*(t=Z(p)))&&
#define V return


The next time this nonsense appears so does a PLONK.


Why? Citing or quoting IOCCC entries, particularly in response to
overly-broad, OT, or otherwise suspect questions, is a comp.lang.c
tradition of long standing. (About as long as the IOCCC's been
around, I'd guess.)

--
Michael Wojcik mi************@microfocus.com

He smiled and let his gaze fall to hers, so that her cheek began to
glow. Ecstatically she waited until his mouth slowly neared her own.
She knew only one thing: rdoeniadtrgove niardgoverdgovnrdgog.
Nov 14 '05 #13
Michael Wojcik wrote:
ed_davis2 wrote:

#include <stdio.h>
#include <ctype.h>
#include <curses.h>
#define BUF 0x40000
#define T isspace(*(t=Z(p)))&&
#define V return


The next time this nonsense appears so does a PLONK.


Why? Citing or quoting IOCCC entries, particularly in response to
overly-broad, OT, or otherwise suspect questions, is a comp.lang.c
tradition of long standing. (About as long as the IOCCC's been
around, I'd guess.)


I over-reacted. He said nothing about citing an IOCCC entry, and I
took it as another of those idiot postings using a herd of
one-letter defines only to annoy. We just stamped out this a few
months ago.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!
Nov 14 '05 #14
CBFalconer <cb********@yahoo.com> wrote:
Michael Wojcik wrote:
ed_davis2 wrote:

#define T isspace(*(t=Z(p)))&&
#define V return

The next time this nonsense appears so does a PLONK.
Why? Citing or quoting IOCCC entries, particularly in response to
overly-broad, OT, or otherwise suspect questions, is a comp.lang.c
tradition of long standing. (About as long as the IOCCC's been
around, I'd guess.)


I over-reacted. He said nothing about citing an IOCCC entry,


He did, but not obviously so:
you might like playing with Ant's Editor vIOCCC91


Note version number.

Richard
Nov 14 '05 #15

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

11
by: Ed Suominen | last post by:
I'm thinking of implementing a real-time collaborative text editor in Python using Twisted. An initial plan is to use a Twisted PB server daemon that accepts user:password:file connections from...
3
by: Java script Dude | last post by:
I have still yet to see a JavaScript Editor that comes close to reading a good JS book, learing it and using it with a text editor. Anyway, here my recipe for build successfull DHTML...
4
by: Thomas Paul Diffenbach | last post by:
Can anyone point me to an open source library of /statically allocated/ data structures? I'm writing some code that would benefit from trees, preferably self balancing, but on an embedded system...
20
by: Luke Matuszewski | last post by:
Welcome As suggested i looked into JSON project and was amazed but... What about cyclical data structures - anybody was faced it in some project ? Is there any satisactional recomendation... ...
11
by: efrat | last post by:
Hello, I'm planning to use Python in order to teach a DSA (data structures and algorithms) course in an academic institute. If you could help out with the following questions, I'd sure...
7
by: Prasad | last post by:
Hi all, I am trying to develop a simple rich text editor I do only require bold, itlaic, underline.. The code for IE is <script> function displayEditor(editor, html, width, height) { ...
29
by: Mik0b0 | last post by:
Hallo to everyone. This fall I am going to start data structures as a part of C language course. The problem is I could not find any satisfying tutorial about structures in C. There are plenty of...
3
by: Joshua J. Kugler | last post by:
A while back, I seem to remember coming across a small program that could view and edit python data structures via a nice expanding tree view. I'm now in need of something like that (to verify...
3
by: kkshansid | last post by:
i cant insert data from rich text box to my sqlserver database while all other data inserts using strDescription = trim(Request.Form("rte1")) if i use rte safe then error is shown and whole...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.