join of array

Pavel Stehule

Hello,

Is possible merge two arrays like

array[1,2,3] + array[4,5,6] => array[1,2,3,4,5,6]

select array_append(array[1,2,3], array[2,3]);
ERROR: function array_append(integer[], integer[]) does not exist
regards
Pavel Stehule
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 11 '05 #1

Subscribe Post Reply

4645

Tom Lane

Pavel Stehule <st*****@kix.fsv.cvut.cz> writes:

Is possible merge two arrays like
array[1,2,3] + array[4,5,6] => array[1,2,3,4,5,6]

I was about to say that || would do it, but I see that's not quite
right:

regression=# SELECT ARRAY[1,2,3] || ARRAY[4,5,6];
?column?
-------------------
{{1,2,3},{4,5,6}}
(1 row)

Offhand, I would think that '{1,2,3,4,5,6}' would be what I'd
intuitively expect to get from "concatenating" these arrays.
Joe, do we really have this implemented per spec?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 11 '05 #2

Tom Lane

Joe Conway <ma**@joeconway.com> writes:

Hmmm, it made sense to me, at at least at some point ;-). Here's the
SQL99 guidance (SQL200X doesn't give any more detailed guidance): 4.11.3.2 Operators that operate on array values and return array values
<array concatenation> is an operation that returns the array value made
by joining its array value operands in the order given.
That's about as clear as mud :-( ... but I found a clearer statement in
SQL99 6.31:

2) If <array concatenation> is specified, then:

a) Let AV1 be the value of <array value expression 1> and let
AV2 be the value of <array value expression 2>.

b) If either AV1 or AV2 is the null value, then the result of
the <array concatenate function> is the null value.

c) Otherwise, the result is the array comprising every element
of AV1 followed by every element of AV2.

(c) seems to be pretty clearly what Pavel wants for the 1-D case, but
it's not immediately clear how to apply it to multidimensional arrays.
We also have
ARRAY[1,2] || 3 == '{1,2,3}'
and
ARRAY[[1,2],[3,4]] || ARRAY[5,6] == '{{1,2},{3,4},{5,6}}'
and
ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] ==
'{{{1,2},{3,4}},{{1,2},{3,4}}}' I think the first two still make sense. I guess the third case ought to be:
ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] ==
'{{1,2},{3,4},{1,2},{3,4}}'
?

Probably. AFAICS this doesn't affect the data copying at all, only the
way in which the result's dimension values are computed, right?

Also, we might want to take another look at the rules for selecting the
lower-bounds of the result array. In the cases where we're joining
N+1-D to N-D (including 1-D to scalar) it still seems to make sense to
preserve the subscripts of the higher-dimensional object, so the lower-
dimensional one is "pushed" onto one end or the other. In the N-D to
N-D case I can't see any really principled way to do it; for lack of
a better idea, I suggest preserving the subscripts of the lefthand
input (ie, using its lower-bound).

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 11 '05 #3

Joe Conway

Tom Lane wrote:

Pavel Stehule <st*****@kix.fsv.cvut.cz> writes:
Is possible merge two arrays like
array[1,2,3] + array[4,5,6] => array[1,2,3,4,5,6]

I was about to say that || would do it, but I see that's not quite
right:

regression=# SELECT ARRAY[1,2,3] || ARRAY[4,5,6];
?column?
-------------------
{{1,2,3},{4,5,6}}
(1 row)

Offhand, I would think that '{1,2,3,4,5,6}' would be what I'd
intuitively expect to get from "concatenating" these arrays.
Joe, do we really have this implemented per spec?

Hmmm, it made sense to me, at at least at some point ;-). Here's the
SQL99 guidance (SQL200X doesn't give any more detailed guidance):

4.11.3.2 Operators that operate on array values and return array values
<array concatenation> is an operation that returns the array value made
by joining its array value operands in the order given.

So I guess it ought to be changed.

We also have
ARRAY[1,2] || 3 == '{1,2,3}'
and
ARRAY[[1,2],[3,4]] || ARRAY[5,6] == '{{1,2},{3,4},{5,6}}'
and
ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] ==
'{{{1,2},{3,4}},{{1,2},{3,4}}}'

I think the first two still make sense. I guess the third case ought to be:
ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] ==
'{{1,2},{3,4},{1,2},{3,4}}'
?

If this sounds good, I'll work on a patch for the behavior as well as
the docs.

Joe
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 11 '05 #4

elein

ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] ==
'{{{1,2},{3,4}},{{1,2},{3,4}}}'

I think the first two still make sense. I guess the third case ought to be:
ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] ==
'{{1,2},{3,4},{1,2},{3,4}}'
?
I do not think this is right. I think the current behaviour
is right. You are effectively dereferencing or flattening
the second array which changes the definition of the second
object.

The ability to do the dereference/flattening is useful,
but it is not the || operator. How about |* which would
flatten 1 level? Of course, that begs the question
of what about n levels and I'm not sure about that.

--elein

On Fri, Aug 15, 2003 at 08:34:14AM -0700, Joe Conway wrote: Tom Lane wrote:
Pavel Stehule <st*****@kix.fsv.cvut.cz> writes:
Is possible merge two arrays like
array[1,2,3] + array[4,5,6] => array[1,2,3,4,5,6]

I was about to say that || would do it, but I see that's not quite
right:

regression=# SELECT ARRAY[1,2,3] || ARRAY[4,5,6];
?column?
-------------------
{{1,2,3},{4,5,6}}
(1 row)

Offhand, I would think that '{1,2,3,4,5,6}' would be what I'd
intuitively expect to get from "concatenating" these arrays.
Joe, do we really have this implemented per spec?

Hmmm, it made sense to me, at at least at some point ;-). Here's the
SQL99 guidance (SQL200X doesn't give any more detailed guidance):

4.11.3.2 Operators that operate on array values and return array values
<array concatenation> is an operation that returns the array value made
by joining its array value operands in the order given.

So I guess it ought to be changed.

We also have
ARRAY[1,2] || 3 == '{1,2,3}'
and
ARRAY[[1,2],[3,4]] || ARRAY[5,6] == '{{1,2},{3,4},{5,6}}'
and
ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] ==
'{{{1,2},{3,4}},{{1,2},{3,4}}}'

I think the first two still make sense. I guess the third case ought to be:
ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] ==
'{{1,2},{3,4},{1,2},{3,4}}'
?

If this sounds good, I'll work on a patch for the behavior as well as
the docs.

Joe
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 11 '05 #5

Joe Conway

elein wrote:

ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] ==
'{{{1,2},{3,4}},{{1,2},{3,4}}}'

I think the first two still make sense. I guess the third case ought to be:
ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] ==
'{{1,2},{3,4},{1,2},{3,4}}'
?

I do not think this is right. I think the current behaviour
is right. You are effectively dereferencing or flattening
the second array which changes the definition of the second
object.

It makes sense in analogy to
ARRAY[1,2] || ARRAY[3,4] == '{1,2,3,4}'

In the case of, e.g. ARRAY[[1,2],[3,4]] || ARRAY[[5,6],[7,8]],
'{1,2}', '{3,4}', '{5,6}', and '{7,8}' are "elements" of the higher
level array, just like 1, 2, 3, & 4 are elements of '{1,2,3,4}'

Joe
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 11 '05 #6

expect

I'd like to summarize what I know (or don't know) since this topic has been
hit around a little and I'm new to this. I'm hoping it will clear things up,
at least for me. You are all the experts, I want to make sure I am singing
from the same page.

data sample:
id | fm | ls | addr | city | st | z |c| start|end
----------------------------------------------------------------------------------

191922C,Bob Cobb,D'Obbalina Sr.,312 Elm Street,Yountville,CA,94599,5,062001,082009
339111C,Elma Thelma,Velma,98 Oak Lane,St. Louis,MO,63119-2065,,,
What I wanted to do was to import lots of these from a text file. In the case
where there is an empty string (i.e. no value after a comma) I wanted to
define the column in the table in a way that would accept the empty string but
replace it with the default value for that column. I didn't know that the
copy command is just some C code that stuffs the data into the db ala
fois grois.

What I would really benefit from (and I hope some other new soul would too)
is if someone would outline exactly how they would approach this problem.

Maybe provide the correct table definition and the copy command. Or if that
just won't work an alternate approach. I realize that some of you have
done this partially but there have been too many replies to get into a
single cohesive instruction.
Anyway I suppose my initial frustration in trying to do this may have blinded
me from reason.
create table contact (
id character(7) NOT NULL,
fm character(30) DEFAULT 'xzxzxzxz',
ls character(30) DEFAULT 'xzxzxzxz',
addr character(30) DEFAULT '123 xzxzxzxz',
city character(25) DEFAULT 'xzxzxzxz',
st character(2) DEFAULT 'xz',
c character(1) DEFAULT 'x',
start decimal(6) DEFAULT 122038,
end decimal(6) DEFAULT 122038,
CONSTRAINT handle PRIMARY KEY (id)
) WITHOUT OIDS;
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 11 '05 #7

Joe Conway

elein wrote:

you said we had:
We also have

^^^^

There are two variants each of two cases. The first case is what started
this discussion. The newest reading of the SQL99 spec says that we
*must* do this:
1a) ARRAY[1,2] || ARRAY[3,4] == '{1,2,3,4}'

Quoting the paragraph provided by Tom:
"c) Otherwise, the result is the array comprising every element
of AV1 followed by every element of AV2."

The variant is that when you have an "array of arrays", i.e. a
multidimensional array (which Peter E pointed out earlier is part of
SQL99 too), the spec wording implies that we also *must* do this:
1b) ARRAY[[1],[2]] || ARRAY[[3],[4]] == '{{1},{2},{3},{4}'
The second case is not directly addressed by the spec as far as I can
see, i.e. it is a Postgres extension. That is:
2a) ARRAY[1,2] || 3 == '{1,2,3}'

So by analogy the multidimensional variant is:
2b) ARRAY[[1],[2]] || ARRAY[3] == '{{1},{2},{3}}'

Cases 1a and 1b are currently wrong according to the spec., and that's
the change we've been discussing. Cases 2a and 2b currently work as
shown and are correct IMHO (although Tom pointed out a lower bound index
issue that I'll address in my response to him).

Does this help?

Joe
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 11 '05 #8

Joe Conway

Tom Lane wrote:

That's about as clear as mud :-( ... but I found a clearer statement
in SQL99 6.31:

2) If <array concatenation> is specified, then:

a) Let AV1 be the value of <array value expression 1> and let AV2 be
the value of <array value expression 2>.

b) If either AV1 or AV2 is the null value, then the result of the
<array concatenate function> is the null value.

c) Otherwise, the result is the array comprising every element of AV1
followed by every element of AV2.

(c) seems to be pretty clearly what Pavel wants for the 1-D case, but
it's not immediately clear how to apply it to multidimensional
arrays.

Thanks -- I found the corresponding paragraph in SQL200x (6.35) and it
pretty much reads the same.
Probably. AFAICS this doesn't affect the data copying at all, only
the way in which the result's dimension values are computed, right?
Looks that way to me.

Also, we might want to take another look at the rules for selecting
the lower-bounds of the result array. In the cases where we're
joining N+1-D to N-D (including 1-D to scalar) it still seems to make
sense to preserve the subscripts of the higher-dimensional object, so
the lower- dimensional one is "pushed" onto one end or the other.
This is mostly the way it currently works:

regression=# create table arr(f1 int[]);
CREATE TABLE
regression=# insert into arr values ('{}');
INSERT 2498103 1
regression=# update arr set f1[-2] = 1;
UPDATE 1
regression=# select array_lower(f1,1) from arr;
array_lower
-------------
-2
(1 row)

regression=# select array_lower(f1 || 2, 1) from arr;
array_lower
-------------
-2
(1 row)

regression=# select array_lower(0 || f1, 1) from arr;
array_lower
-------------
-3
(1 row)
regression=# update arr set f1 = ARRAY[[1,2],[3,4]];
UPDATE 1
regression=# select array_lower(f1,1) from arr;
array_lower
-------------
1
(1 row)

regression=# select array_lower(f1 || ARRAY[5,6], 1) from arr;
array_lower
-------------
1
(1 row)

regression=# select array_lower(ARRAY[-1,0] || f1, 1) from arr;
array_lower
-------------
1
(1 row)
It looks like the only "wrong" case is the last one. Will fix.
In the N-D to N-D case I can't see any really principled way to do
it; for lack of a better idea, I suggest preserving the subscripts of
the lefthand input (ie, using its lower-bound).

OK, will do.

Thanks,

Joe

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 11 '05 #9

elein

I guess I am arguing against the spec. :-)
But given the spec...
The spec is consistent in that it seems to
dereference the right operand one level.

However, that would still make 2b inconsistent
in the evaluation of the right operand.

1a) ARRAY[1,2] || ARRAY[3,4] == '{1,2,3,4}'
1b) ARRAY[[1],[2]] || ARRAY[[3],[4]] == '{{1},{2},{3},{4}}' and ARRAY[1,2] || ARRAY[[3],[4]] == '{1,2,{3},{4}}'
So by analogy the multidimensional variant is:
2b) ARRAY[[1],[2]] || ARRAY[3] == '{{1},{2},{3}}' I would think this would be '{{1},{2}, 3}}'
and ARRAY[1,2] || ARRAY[[3],[4]] == '{1,2,{3},{4}}'

I do see the analogy you are making. But I
respectfully disagree (with the spec ?) that
the type/structure of the left operand should be
taken into account when evaluating the right operand.

elein

On Fri, Aug 15, 2003 at 10:36:54AM -0700, Joe Conway wrote: elein wrote:
you said we had:
We also have

^^^^

There are two variants each of two cases. The first case is what started
this discussion. The newest reading of the SQL99 spec says that we
*must* do this:
1a) ARRAY[1,2] || ARRAY[3,4] == '{1,2,3,4}'

Quoting the paragraph provided by Tom:
"c) Otherwise, the result is the array comprising every element
of AV1 followed by every element of AV2."

The variant is that when you have an "array of arrays", i.e. a
multidimensional array (which Peter E pointed out earlier is part of
SQL99 too), the spec wording implies that we also *must* do this:
1b) ARRAY[[1],[2]] || ARRAY[[3],[4]] == '{{1},{2},{3},{4}'
The second case is not directly addressed by the spec as far as I can
see, i.e. it is a Postgres extension. That is:
2a) ARRAY[1,2] || 3 == '{1,2,3}'

So by analogy the multidimensional variant is:
2b) ARRAY[[1],[2]] || ARRAY[3] == '{{1},{2},{3}}'

Cases 1a and 1b are currently wrong according to the spec., and that's
the change we've been discussing. Cases 2a and 2b currently work as
shown and are correct IMHO (although Tom pointed out a lower bound index
issue that I'll address in my response to him).

Does this help?

Joe
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 11 '05 #10

Joe Conway

Tom Lane wrote:

I believe the behavior Elein wants can be had by writing
ARRAY[ n_d_array , n_d_array ]
(Joe, would you confirm that's true, and document it? I don't think
either section 8.10 or section 4.2.8 makes clear that you can build
arrays from smaller array values rather than just scalars.) As long as
we have that alternative, it's not necessary that concatenation do the
same thing.

Well this works:
regression=# select ARRAY[ARRAY[[1,2],[3,4]],ARRAY[[5,6],[7,8]]];
array
-------------------------------
{{{1,2},{3,4}},{{5,6},{7,8}}}
(1 row)
But I was disappointed that this doesn't:

regression=# select ARRAY['{{1,2},{3,4}}'::int[],'{{5,6},{7,8}}'::int[]];
ERROR: multidimensional ARRAY[] must be built from nested array expressions

Nor does this:

create table arr(f1 int[], f2 int[]);
insert into arr values (ARRAY[[1,2],[3,4]],ARRAY[[5,6],[7,8]]);
regression=# select ARRAY[f1,f2] from arr;
ERROR: multidimensional ARRAY[] must be built from nested array expressions

It does work for the element to array case:

create table els(f1 int, f2 int);
insert into els values (1,2);
regression=# select ARRAY[f1,f2] from els;
array
-------
{1,2}
(1 row)
Should I try to make the second and third cases work?

Joe
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 11 '05 #11

Tom Lane

Joe Conway <ma**@joeconway.com> writes:

But I was disappointed that this doesn't: regression=# select ARRAY['{{1,2},{3,4}}'::int[],'{{5,6},{7,8}}'::int[]];
ERROR: multidimensional ARRAY[] must be built from nested array expressions
Drat, I was assuming that that *would* work.
Should I try to make the second and third cases work?

Could you look at how big a change it'd be, anyway? Offhand I think it
may just mean that the subscript-checking done in parse_expr.c needs to
be done at runtime instead. Remember parse_expr should only be
concerned about determining datatype, and for its purposes all arrays of
a given element type are the same --- subscript checking should happen
at runtime. (It seems likely that having an ndims field in ArrayExpr
is inappropriate.)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 11 '05 #12

Tino Wildenhain

Hi,

Tom Lane wrote:

Joe Conway <ma**@joeconway.com> writes:
But I was disappointed that this doesn't:

regression=# select ARRAY['{{1,2},{3,4}}'::int[],'{{5,6},{7,8}}'::int[]];
ERROR: multidimensional ARRAY[] must be built from nested array expressions

Drat, I was assuming that that *would* work.

Should I try to make the second and third cases work?

Could you look at how big a change it'd be, anyway? Offhand I think it
may just mean that the subscript-checking done in parse_expr.c needs to
be done at runtime instead. Remember parse_expr should only be
concerned about determining datatype, and for its purposes all arrays of
a given element type are the same --- subscript checking should happen
at runtime. (It seems likely that having an ndims field in ArrayExpr
is inappropriate.)

Wouldn't it be a good idea to just extend the partner arrays? Say
if we concenate array A(Na,..,Xa) || B(Nb,...,Xb)
The resulting array C would be of dimension
C(Na+Nb,max(Oa,Ob),max(Pa,Pb), ... max(Xa,Xb))
So concenation would be an extending and right hand appending (at first
level)

Regards
Tino Wildenhain

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 11 '05 #13

Ron Johnson

On Fri, 2003-08-15 at 13:32, elein wrote:

PostgreSQL is an ORDBMS, not just an RDBMS.
But y'all are talking about the SQL standard here.
A column holds a type of value. Any kind. The
structure and operands define the type. The data
defines the value. This holds true for simple types
like an integer or complex types like an array.

The database data is relatively "type blind" in an
ORDBMS. It uses the standard overloaded operands
to determine the type of function to perform for
all of the usual RDBMS utilities.
Constraints, triggers, sorting, etc. all apply.

That's what the ORDBMS stuff can give you.
Arrays are a natural extension.

Arrays don't necessarily imply denormalization.
It depends on how you use them. The same rule
applies for integers.
I dunno 'bout that...
elein

On Fri, Aug 15, 2003 at 01:13:52PM -0500, Ron Johnson wrote:

Why are arrays even mentioned in the the same breath wrt relations
DBMSs? Aren't they an anathema to all we know and love?

--
+---------------------------------------------------------------+
| Ron Johnson, Jr. Home: ro***********@cox.net |
| Jefferson, LA USA |
| |
| "Man, I'm pretty. Hoo Hah!" |
| Johnny Bravo |
+---------------------------------------------------------------+

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 11 '05 #14

elein

In response to both Andrew Gould and Ron Johnson...

If arrays are not natural in the organization of
your data, don't use them. That is the guideline.

If the array defines something specific they are
very natural. The confusion could be that arrays
are abstract types.

Specific implementations which use arrays might
be clearer. For example, a definition of a polygon
is an array of Points. Points, themselves are an
array.

(The actual postgreSQL implementation of polygons and points
doesn't use the newer cleaner array abstraction, I think.
But if I were reimplementing them, I would build on
top of the new array capabilities. The point is to show
an array structured object which makes sense in context.)

Of course you can denomalize via arrays, but it tends
to make things harder for you. And I believe the
same thing is true for denormalized integer columns.

elein
================================================== ===========
el***@varlena.com www.varlena.com
PostgreSQL Consulting & Support
PostgreSQL General Bits http://www.varlena.com/GeneralBits/
================================================== ===========
"Free your mind the rest will follow"
-- En Vogue
On Fri, Aug 15, 2003 at 02:20:18PM -0500, Ron Johnson wrote:

On Fri, 2003-08-15 at 13:32, elein wrote:
PostgreSQL is an ORDBMS, not just an RDBMS.

But y'all are talking about the SQL standard here.
A column holds a type of value. Any kind. The
structure and operands define the type. The data
defines the value. This holds true for simple types
like an integer or complex types like an array.

The database data is relatively "type blind" in an
ORDBMS. It uses the standard overloaded operands
to determine the type of function to perform for
all of the usual RDBMS utilities.
Constraints, triggers, sorting, etc. all apply.

That's what the ORDBMS stuff can give you.
Arrays are a natural extension.

Arrays don't necessarily imply denormalization.
It depends on how you use them. The same rule
applies for integers.

I dunno 'bout that...
elein

On Fri, Aug 15, 2003 at 01:13:52PM -0500, Ron Johnson wrote:

Why are arrays even mentioned in the the same breath wrt relations
DBMSs? Aren't they an anathema to all we know and love?

--
+---------------------------------------------------------------+
| Ron Johnson, Jr. Home: ro***********@cox.net |
| Jefferson, LA USA |
| |
| "Man, I'm pretty. Hoo Hah!" |
| Johnny Bravo |
+---------------------------------------------------------------+

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 11 '05 #15

Andrew L. Gould

On Friday 15 August 2003 02:56 pm, elein wrote:

In response to both Andrew Gould and Ron Johnson...

If arrays are not natural in the organization of
your data, don't use them. That is the guideline.

If the array defines something specific they are
very natural. The confusion could be that arrays
are abstract types.

Specific implementations which use arrays might
be clearer. For example, a definition of a polygon
is an array of Points. Points, themselves are an
array.

(The actual postgreSQL implementation of polygons and points
doesn't use the newer cleaner array abstraction, I think.
But if I were reimplementing them, I would build on
top of the new array capabilities. The point is to show
an array structured object which makes sense in context.)

Of course you can denomalize via arrays, but it tends
to make things harder for you. And I believe the
same thing is true for denormalized integer columns.

elein
================================================== ===========
el***@varlena.com www.varlena.com

Thanks, Elein. The polygon example makes it clearer. In the books I have
here, the examples show how to use arrays but they use data that I would move
to another table.

Best regards,

Andrew

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 11 '05 #16

Ron Johnson

On Fri, 2003-08-15 at 15:36, Andrew L. Gould wrote:

On Friday 15 August 2003 02:56 pm, elein wrote:
In response to both Andrew Gould and Ron Johnson...

If arrays are not natural in the organization of
your data, don't use them. That is the guideline.

If the array defines something specific they are
very natural. The confusion could be that arrays
are abstract types.

Specific implementations which use arrays might
be clearer. For example, a definition of a polygon
is an array of Points. Points, themselves are an
array.

(The actual postgreSQL implementation of polygons and points
doesn't use the newer cleaner array abstraction, I think.
But if I were reimplementing them, I would build on
top of the new array capabilities. The point is to show
an array structured object which makes sense in context.)

Of course you can denomalize via arrays, but it tends
to make things harder for you. And I believe the
same thing is true for denormalized integer columns.

elein
================================================== ===========
el***@varlena.com www.varlena.com

Thanks, Elein. The polygon example makes it clearer. In the books I have
here, the examples show how to use arrays but they use data that I would move
to another table.

This is what makes me nervous about db arrays: the tendency for
denormalization.

--
+---------------------------------------------------------------+
| Ron Johnson, Jr. Home: ro***********@cox.net |
| Jefferson, LA USA |
| |
| "Man, I'm pretty. Hoo Hah!" |
| Johnny Bravo |
+---------------------------------------------------------------+

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 11 '05 #17

Jason Godden

create table import_contact (
id character(7) not null primary key,
fm character(30),
ls character(30),
addr character(30),
city character(25),
st character(2),
c character(1),
start decimal(6),
end decimal(6),
) WITHOUT OIDS;

cat datafile.dat | psql -dthedatabase -c "copy import_contact from stdin
delimiter ',' null ''";

echo "insert into contact select
id,
case fm when null then 'xzxzxzxz' else fm end,
case ls when null then 'xzxzxzxz' else ls end,
case addr when null then '123 xzxzxzxz' else addr end,
case city when null then 'xzxzxzxz' else city end,
case st when null then 'xz' else st end,
case c when null then 'x' else c end,
case start when null then 122038 else start,
case end when null then 122038 else end
from import_contact;" | psql -dthedatabase

Could be one way although it's not atomic. Can rewrite the copy command to be
a copy from file command to do that and use the \i command (or redirect to
psql from file/stdin). Simple but there are many other methods to get this
thing to work. If you don't want to recreate the defaults everytime then you
could have subselects that reference the pg system tables extract the default
value for the columns you are looking for.

Also could create the insert statements with a script on the outside or
replace any blank (null in reality) fields with the default value and copy
that straight to the table.

On Sat, 16 Aug 2003 03:18 am, expect wrote:

I'd like to summarize what I know (or don't know) since this topic has been
hit around a little and I'm new to this. I'm hoping it will clear things
up, at least for me. You are all the experts, I want to make sure I am
singing from the same page.

data sample:
id | fm | ls | addr | city | st | z |c|
start|end
---------------------------------------------------------------------------
-------

191922C,Bob Cobb,D'Obbalina Sr.,312 Elm
Street,Yountville,CA,94599,5,062001,082009 339111C,Elma Thelma,Velma,98 Oak
Lane,St. Louis,MO,63119-2065,,,
What I wanted to do was to import lots of these from a text file. In the
case where there is an empty string (i.e. no value after a comma) I wanted
to define the column in the table in a way that would accept the empty
string but replace it with the default value for that column. I didn't
know that the copy command is just some C code that stuffs the data into
the db ala fois grois.

What I would really benefit from (and I hope some other new soul would too)
is if someone would outline exactly how they would approach this problem.

Maybe provide the correct table definition and the copy command. Or if
that just won't work an alternate approach. I realize that some of you
have done this partially but there have been too many replies to get into a
single cohesive instruction.
Anyway I suppose my initial frustration in trying to do this may have
blinded me from reason.
create table contact (
id character(7) NOT NULL,
fm character(30) DEFAULT 'xzxzxzxz',
ls character(30) DEFAULT 'xzxzxzxz',
addr character(30) DEFAULT '123 xzxzxzxz',
city character(25) DEFAULT 'xzxzxzxz',
st character(2) DEFAULT 'xz',
c character(1) DEFAULT 'x',
start decimal(6) DEFAULT 122038,
end decimal(6) DEFAULT 122038,
CONSTRAINT handle PRIMARY KEY (id)
) WITHOUT OIDS;
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 11 '05 #18

expect

On Sat, 16 Aug 2003 09:33:51 +1000
Jason Godden <ja*********@optushome.com.au> wrote:
Ahh, thanks for this. And thanks to all the others that helped me on my way.
Hopefully I'll be able to give something back to the group. Although that
might be hard with all the experts here.

Perhaps I can document this and provide it for public consumption.

create table import_contact (
id character(7) not null primary key,
fm character(30),
ls character(30),
addr character(30),
city character(25),
st character(2),
c character(1),
start decimal(6),
end decimal(6),
) WITHOUT OIDS;

cat datafile.dat | psql -dthedatabase -c "copy import_contact from stdin
delimiter ',' null ''";

echo "insert into contact select
id,
case fm when null then 'xzxzxzxz' else fm end,
case ls when null then 'xzxzxzxz' else ls end,
case addr when null then '123 xzxzxzxz' else addr end,
case city when null then 'xzxzxzxz' else city end,
case st when null then 'xz' else st end,
case c when null then 'x' else c end,
case start when null then 122038 else start,
case end when null then 122038 else end
from import_contact;" | psql -dthedatabase

Could be one way although it's not atomic. Can rewrite the copy command to be
a copy from file command to do that and use the \i command (or redirect to
psql from file/stdin). Simple but there are many other methods to get this
thing to work. If you don't want to recreate the defaults everytime then you
could have subselects that reference the pg system tables extract the default
value for the columns you are looking for.

Also could create the insert statements with a script on the outside or
replace any blank (null in reality) fields with the default value and copy
that straight to the table.

On Sat, 16 Aug 2003 03:18 am, expect wrote:
I'd like to summarize what I know (or don't know) since this topic has been
hit around a little and I'm new to this. I'm hoping it will clear things
up, at least for me. You are all the experts, I want to make sure I am
singing from the same page.

data sample:
id | fm | ls | addr | city | st | z |c|
start|end
---------------------------------------------------------------------------
-------

191922C,Bob Cobb,D'Obbalina Sr.,312 Elm
Street,Yountville,CA,94599,5,062001,082009 339111C,Elma Thelma,Velma,98 Oak
Lane,St. Louis,MO,63119-2065,,,
What I wanted to do was to import lots of these from a text file. In the
case where there is an empty string (i.e. no value after a comma) I wanted
to define the column in the table in a way that would accept the empty
string but replace it with the default value for that column. I didn't
know that the copy command is just some C code that stuffs the data into
the db ala fois grois.

What I would really benefit from (and I hope some other new soul would too)
is if someone would outline exactly how they would approach this problem.

Maybe provide the correct table definition and the copy command. Or if
that just won't work an alternate approach. I realize that some of you
have done this partially but there have been too many replies to get into a
single cohesive instruction.
Anyway I suppose my initial frustration in trying to do this may have
blinded me from reason.
create table contact (
id character(7) NOT NULL,
fm character(30) DEFAULT 'xzxzxzxz',
ls character(30) DEFAULT 'xzxzxzxz',
addr character(30) DEFAULT '123 xzxzxzxz',
city character(25) DEFAULT 'xzxzxzxz',
st character(2) DEFAULT 'xz',
c character(1) DEFAULT 'x',
start decimal(6) DEFAULT 122038,
end decimal(6) DEFAULT 122038,
CONSTRAINT handle PRIMARY KEY (id)
) WITHOUT OIDS;
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 11 '05 #19

Jason Godden

Just a note on that example:

I didn't properly end the case commands on the last two fields in the insert
and end should probably be quoted. No I haven't tested it. Should be:

echo "insert into contact select
id,
case fm when null then 'xzxzxzxz' else fm end,
case ls when null then 'xzxzxzxz' else ls end,
case addr when null then '123 xzxzxzxz' else addr end,
case city when null then 'xzxzxzxz' else city end,
case st when null then 'xz' else st end,
case c when null then 'x' else c end,
case start when null then 122038 else start end,
case "end" when null then 122038 else "end" end
from import_contact;" | psql -dthedatabase

Rgds,

Jason

On Sat, 16 Aug 2003 01:10 pm, expect wrote:

On Sat, 16 Aug 2003 09:33:51 +1000
Jason Godden <ja*********@optushome.com.au> wrote:
Ahh, thanks for this. And thanks to all the others that helped me on my
way. Hopefully I'll be able to give something back to the group. Although
that might be hard with all the experts here.

Perhaps I can document this and provide it for public consumption.
create table import_contact (
id character(7) not null primary key,
fm character(30),
ls character(30),
addr character(30),
city character(25),
st character(2),
c character(1),
start decimal(6),
end decimal(6),
) WITHOUT OIDS;

cat datafile.dat | psql -dthedatabase -c "copy import_contact from stdin
delimiter ',' null ''";

echo "insert into contact select
id,
case fm when null then 'xzxzxzxz' else fm end,
case ls when null then 'xzxzxzxz' else ls end,
case addr when null then '123 xzxzxzxz' else addr end,
case city when null then 'xzxzxzxz' else city end,
case st when null then 'xz' else st end,
case c when null then 'x' else c end,
case start when null then 122038 else start,
case end when null then 122038 else end
from import_contact;" | psql -dthedatabase

Could be one way although it's not atomic. Can rewrite the copy command
to be a copy from file command to do that and use the \i command (or
redirect to psql from file/stdin). Simple but there are many other
methods to get this thing to work. If you don't want to recreate the
defaults everytime then you could have subselects that reference the pg
system tables extract the default value for the columns you are looking
for.

Also could create the insert statements with a script on the outside or
replace any blank (null in reality) fields with the default value and
copy that straight to the table.

On Sat, 16 Aug 2003 03:18 am, expect wrote:
I'd like to summarize what I know (or don't know) since this topic has
been hit around a little and I'm new to this. I'm hoping it will clear
things up, at least for me. You are all the experts, I want to make
sure I am singing from the same page.

data sample:
id | fm | ls | addr | city | st | z |c|
start|end
-----------------------------------------------------------------------
---- -------

191922C,Bob Cobb,D'Obbalina Sr.,312 Elm
Street,Yountville,CA,94599,5,062001,082009 339111C,Elma Thelma,Velma,98
Oak Lane,St. Louis,MO,63119-2065,,,
What I wanted to do was to import lots of these from a text file. In
the case where there is an empty string (i.e. no value after a comma)
I wanted to define the column in the table in a way that would accept
the empty string but replace it with the default value for that column.
I didn't know that the copy command is just some C code that stuffs
the data into the db ala fois grois.

What I would really benefit from (and I hope some other new soul would
too) is if someone would outline exactly how they would approach this
problem.

Maybe provide the correct table definition and the copy command. Or if
that just won't work an alternate approach. I realize that some of you
have done this partially but there have been too many replies to get
into a single cohesive instruction.
Anyway I suppose my initial frustration in trying to do this may have
blinded me from reason.
create table contact (
id character(7) NOT NULL,
fm character(30) DEFAULT 'xzxzxzxz',
ls character(30) DEFAULT 'xzxzxzxz',
addr character(30) DEFAULT '123 xzxzxzxz',
city character(25) DEFAULT 'xzxzxzxz',
st character(2) DEFAULT 'xz',
c character(1) DEFAULT 'x',
start decimal(6) DEFAULT 122038,
end decimal(6) DEFAULT 122038,
CONSTRAINT handle PRIMARY KEY (id)
) WITHOUT OIDS;
---------------------------(end of
broadcast)--------------------------- TIP 2: you can get off all lists
at once with the unregister command (send "unregister
YourEmailAddressHere" to ma*******@postgresql.org)

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 11 '05 #20

Matthew D. Fuller

On Fri, Aug 15, 2003 at 01:37:50PM -0500 I heard the voice of
Andrew L. Gould, and lo! it spake thus:

On Friday 15 August 2003 01:13 pm, Ron Johnson wrote:

Why are arrays even mentioned in the the same breath wrt relations
DBMSs? Aren't they an anathema to all we know and love?

This gives rise to a couple of good questions:

When and why would you want to use arrays instead of a relational model?

When it's appropriate 8-}

I've found it to be extremely rare, but it DOES happen. For instance, in
one project, I needed a bit of data in a record, which could store
between 1 and 7 (integer) values in it. Never less than 1, never more
than 7, and the ordering of them was essential to preserve. I *COULD* do
it in another table, with an order column (which would have to be
maintained 'manually' in the code), and, if one were neurotic, a seperate
constraint to keep from somehow getting too many records for each item in
the other table.... but it was far easier to just slap it in an array.

Now, any time ordering isn't important, OR a field can grow without
realistic bounds, arrays naturally aren't important. For instance, if
I'd needed 15 entries, I might well have gone ahead and referenced
another table, just because it would get too irritating dealing with the
arrays.

--
Matthew Fuller (MF4839) | fu******@over-yonder.net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/

"The only reason I'm burning my candle at both ends, is because I
haven't figured out how to light the middle yet"

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 11 '05 #21

Similar topics