467,076 Members | 973 Online

# join of array

 Hello, Is possible merge two arrays like array[1,2,3] + array[4,5,6] => array[1,2,3,4,5,6] select array_append(array[1,2,3], array[2,3]); ERROR: function array_append(integer[], integer[]) does not exist regards Pavel Stehule ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend Nov 11 '05 #1
• viewed: 4265
Share:
20 Replies
 Pavel Stehule writes: Is possible merge two arrays like array[1,2,3] + array[4,5,6] => array[1,2,3,4,5,6] I was about to say that || would do it, but I see that's not quite right: regression=# SELECT ARRAY[1,2,3] || ARRAY[4,5,6]; ?column? ------------------- {{1,2,3},{4,5,6}} (1 row) Offhand, I would think that '{1,2,3,4,5,6}' would be what I'd intuitively expect to get from "concatenating" these arrays. Joe, do we really have this implemented per spec? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend Nov 11 '05 #2
 Joe Conway writes: Hmmm, it made sense to me, at at least at some point ;-). Here's the SQL99 guidance (SQL200X doesn't give any more detailed guidance): 4.11.3.2 Operators that operate on array values and return array values is an operation that returns the array value made by joining its array value operands in the order given. That's about as clear as mud :-( ... but I found a clearer statement in SQL99 6.31: 2) If is specified, then: a) Let AV1 be the value of and let AV2 be the value of . b) If either AV1 or AV2 is the null value, then the result of the is the null value. c) Otherwise, the result is the array comprising every element of AV1 followed by every element of AV2. (c) seems to be pretty clearly what Pavel wants for the 1-D case, but it's not immediately clear how to apply it to multidimensional arrays. We also have ARRAY[1,2] || 3 == '{1,2,3}' and ARRAY[[1,2],[3,4]] || ARRAY[5,6] == '{{1,2},{3,4},{5,6}}' and ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] == '{{{1,2},{3,4}},{{1,2},{3,4}}}' I think the first two still make sense. I guess the third case ought to be: ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] == '{{1,2},{3,4},{1,2},{3,4}}' ? Probably. AFAICS this doesn't affect the data copying at all, only the way in which the result's dimension values are computed, right? Also, we might want to take another look at the rules for selecting the lower-bounds of the result array. In the cases where we're joining N+1-D to N-D (including 1-D to scalar) it still seems to make sense to preserve the subscripts of the higher-dimensional object, so the lower- dimensional one is "pushed" onto one end or the other. In the N-D to N-D case I can't see any really principled way to do it; for lack of a better idea, I suggest preserving the subscripts of the lefthand input (ie, using its lower-bound). regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to ma*******@postgresql.org so that your message can get through to the mailing list cleanly Nov 11 '05 #3
 Tom Lane wrote: Pavel Stehule writes:Is possible merge two arrays likearray[1,2,3] + array[4,5,6] => array[1,2,3,4,5,6] I was about to say that || would do it, but I see that's not quite right: regression=# SELECT ARRAY[1,2,3] || ARRAY[4,5,6]; ?column? ------------------- {{1,2,3},{4,5,6}} (1 row) Offhand, I would think that '{1,2,3,4,5,6}' would be what I'd intuitively expect to get from "concatenating" these arrays. Joe, do we really have this implemented per spec? Hmmm, it made sense to me, at at least at some point ;-). Here's the SQL99 guidance (SQL200X doesn't give any more detailed guidance): 4.11.3.2 Operators that operate on array values and return array values is an operation that returns the array value made by joining its array value operands in the order given. So I guess it ought to be changed. We also have ARRAY[1,2] || 3 == '{1,2,3}' and ARRAY[[1,2],[3,4]] || ARRAY[5,6] == '{{1,2},{3,4},{5,6}}' and ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] == '{{{1,2},{3,4}},{{1,2},{3,4}}}' I think the first two still make sense. I guess the third case ought to be: ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] == '{{1,2},{3,4},{1,2},{3,4}}' ? If this sounds good, I'll work on a patch for the behavior as well as the docs. Joe ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings Nov 11 '05 #4
 > ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] == '{{{1,2},{3,4}},{{1,2},{3,4}}}' I think the first two still make sense. I guess the third case ought to be: ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] == '{{1,2},{3,4},{1,2},{3,4}}' ? I do not think this is right. I think the current behaviour is right. You are effectively dereferencing or flattening the second array which changes the definition of the second object. The ability to do the dereference/flattening is useful, but it is not the || operator. How about |* which would flatten 1 level? Of course, that begs the question of what about n levels and I'm not sure about that. --elein On Fri, Aug 15, 2003 at 08:34:14AM -0700, Joe Conway wrote: Tom Lane wrote:Pavel Stehule writes:Is possible merge two arrays likearray[1,2,3] + array[4,5,6] => array[1,2,3,4,5,6] I was about to say that || would do it, but I see that's not quiteright:regression=# SELECT ARRAY[1,2,3] || ARRAY[4,5,6]; ?column?------------------- {{1,2,3},{4,5,6}}(1 row)Offhand, I would think that '{1,2,3,4,5,6}' would be what I'dintuitively expect to get from "concatenating" these arrays.Joe, do we really have this implemented per spec? Hmmm, it made sense to me, at at least at some point ;-). Here's the SQL99 guidance (SQL200X doesn't give any more detailed guidance): 4.11.3.2 Operators that operate on array values and return array values is an operation that returns the array value made by joining its array value operands in the order given. So I guess it ought to be changed. We also have ARRAY[1,2] || 3 == '{1,2,3}' and ARRAY[[1,2],[3,4]] || ARRAY[5,6] == '{{1,2},{3,4},{5,6}}' and ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] == '{{{1,2},{3,4}},{{1,2},{3,4}}}' I think the first two still make sense. I guess the third case ought to be: ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] == '{{1,2},{3,4},{1,2},{3,4}}' ? If this sounds good, I'll work on a patch for the behavior as well as the docs. Joe ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster Nov 11 '05 #5
 elein wrote: ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] == '{{{1,2},{3,4}},{{1,2},{3,4}}}'I think the first two still make sense. I guess the third case ought to be: ARRAY[[1,2],[3,4]] || ARRAY[[1,2],[3,4]] == '{{1,2},{3,4},{1,2},{3,4}}'? I do not think this is right. I think the current behaviour is right. You are effectively dereferencing or flattening the second array which changes the definition of the second object. It makes sense in analogy to ARRAY[1,2] || ARRAY[3,4] == '{1,2,3,4}' In the case of, e.g. ARRAY[[1,2],[3,4]] || ARRAY[[5,6],[7,8]], '{1,2}', '{3,4}', '{5,6}', and '{7,8}' are "elements" of the higher level array, just like 1, 2, 3, & 4 are elements of '{1,2,3,4}' Joe ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings Nov 11 '05 #6
 I'd like to summarize what I know (or don't know) since this topic has been hit around a little and I'm new to this. I'm hoping it will clear things up, at least for me. You are all the experts, I want to make sure I am singing from the same page. data sample: id | fm | ls | addr | city | st | z |c| start|end ---------------------------------------------------------------------------------- 191922C,Bob Cobb,D'Obbalina Sr.,312 Elm Street,Yountville,CA,94599,5,062001,082009 339111C,Elma Thelma,Velma,98 Oak Lane,St. Louis,MO,63119-2065,,, What I wanted to do was to import lots of these from a text file. In the case where there is an empty string (i.e. no value after a comma) I wanted to define the column in the table in a way that would accept the empty string but replace it with the default value for that column. I didn't know that the copy command is just some C code that stuffs the data into the db ala fois grois. What I would really benefit from (and I hope some other new soul would too) is if someone would outline exactly how they would approach this problem. Maybe provide the correct table definition and the copy command. Or if that just won't work an alternate approach. I realize that some of you have done this partially but there have been too many replies to get into a single cohesive instruction. Anyway I suppose my initial frustration in trying to do this may have blinded me from reason. create table contact ( id character(7) NOT NULL, fm character(30) DEFAULT 'xzxzxzxz', ls character(30) DEFAULT 'xzxzxzxz', addr character(30) DEFAULT '123 xzxzxzxz', city character(25) DEFAULT 'xzxzxzxz', st character(2) DEFAULT 'xz', c character(1) DEFAULT 'x', start decimal(6) DEFAULT 122038, end decimal(6) DEFAULT 122038, CONSTRAINT handle PRIMARY KEY (id) ) WITHOUT OIDS; ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to ma*******@postgresql.org) Nov 11 '05 #7
 elein wrote: you said we had:We also have ^^^^ There are two variants each of two cases. The first case is what started this discussion. The newest reading of the SQL99 spec says that we *must* do this: 1a) ARRAY[1,2] || ARRAY[3,4] == '{1,2,3,4}' Quoting the paragraph provided by Tom: "c) Otherwise, the result is the array comprising every element of AV1 followed by every element of AV2." The variant is that when you have an "array of arrays", i.e. a multidimensional array (which Peter E pointed out earlier is part of SQL99 too), the spec wording implies that we also *must* do this: 1b) ARRAY[[1],[2]] || ARRAY[[3],[4]] == '{{1},{2},{3},{4}' The second case is not directly addressed by the spec as far as I can see, i.e. it is a Postgres extension. That is: 2a) ARRAY[1,2] || 3 == '{1,2,3}' So by analogy the multidimensional variant is: 2b) ARRAY[[1],[2]] || ARRAY[3] == '{{1},{2},{3}}' Cases 1a and 1b are currently wrong according to the spec., and that's the change we've been discussing. Cases 2a and 2b currently work as shown and are correct IMHO (although Tom pointed out a lower bound index issue that I'll address in my response to him). Does this help? Joe ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org Nov 11 '05 #8
 Tom Lane wrote: That's about as clear as mud :-( ... but I found a clearer statement in SQL99 6.31: 2) If is specified, then: a) Let AV1 be the value of and let AV2 be the value of . b) If either AV1 or AV2 is the null value, then the result of the is the null value. c) Otherwise, the result is the array comprising every element of AV1 followed by every element of AV2. (c) seems to be pretty clearly what Pavel wants for the 1-D case, but it's not immediately clear how to apply it to multidimensional arrays. Thanks -- I found the corresponding paragraph in SQL200x (6.35) and it pretty much reads the same. Probably. AFAICS this doesn't affect the data copying at all, only the way in which the result's dimension values are computed, right? Looks that way to me. Also, we might want to take another look at the rules for selecting the lower-bounds of the result array. In the cases where we're joining N+1-D to N-D (including 1-D to scalar) it still seems to make sense to preserve the subscripts of the higher-dimensional object, so the lower- dimensional one is "pushed" onto one end or the other. This is mostly the way it currently works: regression=# create table arr(f1 int[]); CREATE TABLE regression=# insert into arr values ('{}'); INSERT 2498103 1 regression=# update arr set f1[-2] = 1; UPDATE 1 regression=# select array_lower(f1,1) from arr; array_lower ------------- -2 (1 row) regression=# select array_lower(f1 || 2, 1) from arr; array_lower ------------- -2 (1 row) regression=# select array_lower(0 || f1, 1) from arr; array_lower ------------- -3 (1 row) regression=# update arr set f1 = ARRAY[[1,2],[3,4]]; UPDATE 1 regression=# select array_lower(f1,1) from arr; array_lower ------------- 1 (1 row) regression=# select array_lower(f1 || ARRAY[5,6], 1) from arr; array_lower ------------- 1 (1 row) regression=# select array_lower(ARRAY[-1,0] || f1, 1) from arr; array_lower ------------- 1 (1 row) It looks like the only "wrong" case is the last one. Will fix. In the N-D to N-D case I can't see any really principled way to do it; for lack of a better idea, I suggest preserving the subscripts of the lefthand input (ie, using its lower-bound). OK, will do. Thanks, Joe ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend Nov 11 '05 #9
 I guess I am arguing against the spec. :-) But given the spec... The spec is consistent in that it seems to dereference the right operand one level. However, that would still make 2b inconsistent in the evaluation of the right operand. 1a) ARRAY[1,2] || ARRAY[3,4] == '{1,2,3,4}' 1b) ARRAY[[1],[2]] || ARRAY[[3],[4]] == '{{1},{2},{3},{4}}' and ARRAY[1,2] || ARRAY[[3],[4]] == '{1,2,{3},{4}}' So by analogy the multidimensional variant is: 2b) ARRAY[[1],[2]] || ARRAY[3] == '{{1},{2},{3}}' I would think this would be '{{1},{2}, 3}}' and ARRAY[1,2] || ARRAY[[3],[4]] == '{1,2,{3},{4}}' I do see the analogy you are making. But I respectfully disagree (with the spec ?) that the type/structure of the left operand should be taken into account when evaluating the right operand. elein On Fri, Aug 15, 2003 at 10:36:54AM -0700, Joe Conway wrote: elein wrote:you said we had:We also have ^^^^ There are two variants each of two cases. The first case is what started this discussion. The newest reading of the SQL99 spec says that we *must* do this: 1a) ARRAY[1,2] || ARRAY[3,4] == '{1,2,3,4}' Quoting the paragraph provided by Tom: "c) Otherwise, the result is the array comprising every element of AV1 followed by every element of AV2." The variant is that when you have an "array of arrays", i.e. a multidimensional array (which Peter E pointed out earlier is part of SQL99 too), the spec wording implies that we also *must* do this: 1b) ARRAY[[1],[2]] || ARRAY[[3],[4]] == '{{1},{2},{3},{4}' The second case is not directly addressed by the spec as far as I can see, i.e. it is a Postgres extension. That is: 2a) ARRAY[1,2] || 3 == '{1,2,3}' So by analogy the multidimensional variant is: 2b) ARRAY[[1],[2]] || ARRAY[3] == '{{1},{2},{3}}' Cases 1a and 1b are currently wrong according to the spec., and that's the change we've been discussing. Cases 2a and 2b currently work as shown and are correct IMHO (although Tom pointed out a lower bound index issue that I'll address in my response to him). Does this help? Joe ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org Nov 11 '05 #10
 Tom Lane wrote: I believe the behavior Elein wants can be had by writing ARRAY[ n_d_array , n_d_array ] (Joe, would you confirm that's true, and document it? I don't think either section 8.10 or section 4.2.8 makes clear that you can build arrays from smaller array values rather than just scalars.) As long as we have that alternative, it's not necessary that concatenation do the same thing. Well this works: regression=# select ARRAY[ARRAY[[1,2],[3,4]],ARRAY[[5,6],[7,8]]]; array ------------------------------- {{{1,2},{3,4}},{{5,6},{7,8}}} (1 row) But I was disappointed that this doesn't: regression=# select ARRAY['{{1,2},{3,4}}'::int[],'{{5,6},{7,8}}'::int[]]; ERROR: multidimensional ARRAY[] must be built from nested array expressions Nor does this: create table arr(f1 int[], f2 int[]); insert into arr values (ARRAY[[1,2],[3,4]],ARRAY[[5,6],[7,8]]); regression=# select ARRAY[f1,f2] from arr; ERROR: multidimensional ARRAY[] must be built from nested array expressions It does work for the element to array case: create table els(f1 int, f2 int); insert into els values (1,2); regression=# select ARRAY[f1,f2] from els; array ------- {1,2} (1 row) Should I try to make the second and third cases work? Joe ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend Nov 11 '05 #11
 Joe Conway writes: But I was disappointed that this doesn't: regression=# select ARRAY['{{1,2},{3,4}}'::int[],'{{5,6},{7,8}}'::int[]]; ERROR: multidimensional ARRAY[] must be built from nested array expressions Drat, I was assuming that that *would* work. Should I try to make the second and third cases work? Could you look at how big a change it'd be, anyway? Offhand I think it may just mean that the subscript-checking done in parse_expr.c needs to be done at runtime instead. Remember parse_expr should only be concerned about determining datatype, and for its purposes all arrays of a given element type are the same --- subscript checking should happen at runtime. (It seems likely that having an ndims field in ArrayExpr is inappropriate.) regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to ma*******@postgresql.org) Nov 11 '05 #12
 Hi, Tom Lane wrote: Joe Conway writes:But I was disappointed that this doesn't:regression=# select ARRAY['{{1,2},{3,4}}'::int[],'{{5,6},{7,8}}'::int[]];ERROR: multidimensional ARRAY[] must be built from nested array expressions Drat, I was assuming that that *would* work.Should I try to make the second and third cases work? Could you look at how big a change it'd be, anyway? Offhand I think it may just mean that the subscript-checking done in parse_expr.c needs to be done at runtime instead. Remember parse_expr should only be concerned about determining datatype, and for its purposes all arrays of a given element type are the same --- subscript checking should happen at runtime. (It seems likely that having an ndims field in ArrayExpr is inappropriate.) Wouldn't it be a good idea to just extend the partner arrays? Say if we concenate array A(Na,..,Xa) || B(Nb,...,Xb) The resulting array C would be of dimension C(Na+Nb,max(Oa,Ob),max(Pa,Pb), ... max(Xa,Xb)) So concenation would be an extending and right hand appending (at first level) Regards Tino Wildenhain ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to ma*******@postgresql.org so that your message can get through to the mailing list cleanly Nov 11 '05 #13
 On Fri, 2003-08-15 at 13:32, elein wrote: PostgreSQL is an ORDBMS, not just an RDBMS. But y'all are talking about the SQL standard here. A column holds a type of value. Any kind. The structure and operands define the type. The data defines the value. This holds true for simple types like an integer or complex types like an array. The database data is relatively "type blind" in an ORDBMS. It uses the standard overloaded operands to determine the type of function to perform for all of the usual RDBMS utilities. Constraints, triggers, sorting, etc. all apply. That's what the ORDBMS stuff can give you. Arrays are a natural extension. Arrays don't necessarily imply denormalization. It depends on how you use them. The same rule applies for integers. I dunno 'bout that... elein On Fri, Aug 15, 2003 at 01:13:52PM -0500, Ron Johnson wrote: Why are arrays even mentioned in the the same breath wrt relations DBMSs? Aren't they an anathema to all we know and love? -- +---------------------------------------------------------------+ | Ron Johnson, Jr. Home: ro***********@cox.net | | Jefferson, LA USA | | | | "Man, I'm pretty. Hoo Hah!" | | Johnny Bravo | +---------------------------------------------------------------+ ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend Nov 11 '05 #14
 In response to both Andrew Gould and Ron Johnson... If arrays are not natural in the organization of your data, don't use them. That is the guideline. If the array defines something specific they are very natural. The confusion could be that arrays are abstract types. Specific implementations which use arrays might be clearer. For example, a definition of a polygon is an array of Points. Points, themselves are an array. (The actual postgreSQL implementation of polygons and points doesn't use the newer cleaner array abstraction, I think. But if I were reimplementing them, I would build on top of the new array capabilities. The point is to show an array structured object which makes sense in context.) Of course you can denomalize via arrays, but it tends to make things harder for you. And I believe the same thing is true for denormalized integer columns. elein ================================================== =========== el***@varlena.com www.varlena.com PostgreSQL Consulting & Support PostgreSQL General Bits http://www.varlena.com/GeneralBits/ ================================================== =========== "Free your mind the rest will follow" -- En Vogue On Fri, Aug 15, 2003 at 02:20:18PM -0500, Ron Johnson wrote: On Fri, 2003-08-15 at 13:32, elein wrote: PostgreSQL is an ORDBMS, not just an RDBMS. But y'all are talking about the SQL standard here. A column holds a type of value. Any kind. The structure and operands define the type. The data defines the value. This holds true for simple types like an integer or complex types like an array. The database data is relatively "type blind" in an ORDBMS. It uses the standard overloaded operands to determine the type of function to perform for all of the usual RDBMS utilities. Constraints, triggers, sorting, etc. all apply. That's what the ORDBMS stuff can give you. Arrays are a natural extension. Arrays don't necessarily imply denormalization. It depends on how you use them. The same rule applies for integers. I dunno 'bout that... elein On Fri, Aug 15, 2003 at 01:13:52PM -0500, Ron Johnson wrote: Why are arrays even mentioned in the the same breath wrt relations DBMSs? Aren't they an anathema to all we know and love? -- +---------------------------------------------------------------+ | Ron Johnson, Jr. Home: ro***********@cox.net | | Jefferson, LA USA | | | | "Man, I'm pretty. Hoo Hah!" | | Johnny Bravo | +---------------------------------------------------------------+ ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend ---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings Nov 11 '05 #15
 On Friday 15 August 2003 02:56 pm, elein wrote: In response to both Andrew Gould and Ron Johnson... If arrays are not natural in the organization of your data, don't use them. That is the guideline. If the array defines something specific they are very natural. The confusion could be that arrays are abstract types. Specific implementations which use arrays might be clearer. For example, a definition of a polygon is an array of Points. Points, themselves are an array. (The actual postgreSQL implementation of polygons and points doesn't use the newer cleaner array abstraction, I think. But if I were reimplementing them, I would build on top of the new array capabilities. The point is to show an array structured object which makes sense in context.) Of course you can denomalize via arrays, but it tends to make things harder for you. And I believe the same thing is true for denormalized integer columns. elein ================================================== =========== el***@varlena.com www.varlena.com Thanks, Elein. The polygon example makes it clearer. In the books I have here, the examples show how to use arrays but they use data that I would move to another table. Best regards, Andrew ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to ma*******@postgresql.org so that your message can get through to the mailing list cleanly Nov 11 '05 #16
 On Fri, 2003-08-15 at 15:36, Andrew L. Gould wrote: On Friday 15 August 2003 02:56 pm, elein wrote: In response to both Andrew Gould and Ron Johnson... If arrays are not natural in the organization of your data, don't use them. That is the guideline. If the array defines something specific they are very natural. The confusion could be that arrays are abstract types. Specific implementations which use arrays might be clearer. For example, a definition of a polygon is an array of Points. Points, themselves are an array. (The actual postgreSQL implementation of polygons and points doesn't use the newer cleaner array abstraction, I think. But if I were reimplementing them, I would build on top of the new array capabilities. The point is to show an array structured object which makes sense in context.) Of course you can denomalize via arrays, but it tends to make things harder for you. And I believe the same thing is true for denormalized integer columns. elein ================================================== =========== el***@varlena.com www.varlena.com Thanks, Elein. The polygon example makes it clearer. In the books I have here, the examples show how to use arrays but they use data that I would move to another table. This is what makes me nervous about db arrays: the tendency for denormalization. -- +---------------------------------------------------------------+ | Ron Johnson, Jr. Home: ro***********@cox.net | | Jefferson, LA USA | | | | "Man, I'm pretty. Hoo Hah!" | | Johnny Bravo | +---------------------------------------------------------------+ ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster Nov 11 '05 #17
 create table import_contact ( id character(7) not null primary key, fm character(30), ls character(30), addr character(30), city character(25), st character(2), c character(1), start decimal(6), end decimal(6), ) WITHOUT OIDS; cat datafile.dat | psql -dthedatabase -c "copy import_contact from stdin delimiter ',' null ''"; echo "insert into contact select id, case fm when null then 'xzxzxzxz' else fm end, case ls when null then 'xzxzxzxz' else ls end, case addr when null then '123 xzxzxzxz' else addr end, case city when null then 'xzxzxzxz' else city end, case st when null then 'xz' else st end, case c when null then 'x' else c end, case start when null then 122038 else start, case end when null then 122038 else end from import_contact;" | psql -dthedatabase Could be one way although it's not atomic. Can rewrite the copy command to be a copy from file command to do that and use the \i command (or redirect to psql from file/stdin). Simple but there are many other methods to get this thing to work. If you don't want to recreate the defaults everytime then you could have subselects that reference the pg system tables extract the default value for the columns you are looking for. Also could create the insert statements with a script on the outside or replace any blank (null in reality) fields with the default value and copy that straight to the table. On Sat, 16 Aug 2003 03:18 am, expect wrote: I'd like to summarize what I know (or don't know) since this topic has been hit around a little and I'm new to this. I'm hoping it will clear things up, at least for me. You are all the experts, I want to make sure I am singing from the same page. data sample: id | fm | ls | addr | city | st | z |c| start|end ---------------------------------------------------------------------------------- 191922C,Bob Cobb,D'Obbalina Sr.,312 Elm Street,Yountville,CA,94599,5,062001,082009 339111C,Elma Thelma,Velma,98 Oak Lane,St. Louis,MO,63119-2065,,, What I wanted to do was to import lots of these from a text file. In the case where there is an empty string (i.e. no value after a comma) I wanted to define the column in the table in a way that would accept the empty string but replace it with the default value for that column. I didn't know that the copy command is just some C code that stuffs the data into the db ala fois grois. What I would really benefit from (and I hope some other new soul would too) is if someone would outline exactly how they would approach this problem. Maybe provide the correct table definition and the copy command. Or if that just won't work an alternate approach. I realize that some of you have done this partially but there have been too many replies to get into a single cohesive instruction. Anyway I suppose my initial frustration in trying to do this may have blinded me from reason. create table contact ( id character(7) NOT NULL, fm character(30) DEFAULT 'xzxzxzxz', ls character(30) DEFAULT 'xzxzxzxz', addr character(30) DEFAULT '123 xzxzxzxz', city character(25) DEFAULT 'xzxzxzxz', st character(2) DEFAULT 'xz', c character(1) DEFAULT 'x', start decimal(6) DEFAULT 122038, end decimal(6) DEFAULT 122038, CONSTRAINT handle PRIMARY KEY (id) ) WITHOUT OIDS; ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to ma*******@postgresql.org) ---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives? http://archives.postgresql.org Nov 11 '05 #18