Hi all,
I've just finished almost all of what has turned out to be a real bear of a
project. It has to import data from a monthly spreadsheet export from another
program, and convert that into normalized data. The task is made more
difficult by the fact that the structure itself can vary from month to month
(in well defined ways).
So, I used the SQL-centric approach, taking vertical stripes at a time so
that, for instance, for each field with simple, repeating text data, I make a
group-by pass, inserting into the destination lookup table, then I do another
query, joining from the input table to the text fields in the lookups to get
the foreign keys for the main destination table. When I have multiple columns
that need to become 1-M, I make a pass for each of those columns, inserting
lookup record that identifies the split, then inserting the rows into the
many-side table, yadda, yadda, yadda.
All that was going swimmingly, and performing pretty well until I got to the
fields containing multiple, delimited values. My whole dedign is based on
using SQL/DML passes for everything, but the only way I could figure out to
make that work was to call a user defined function from within the query to
pull out, in successive query slices, argument 1, argument 2, etc. I was
going to use a where clause to exclude null results (input doesn't have
arguments n or above), and quit after the first pass with .RecordsAffecte d =
0.
Sounds good, but with a mere 8000 input rows, I had to cancel the first pass
after waiting about 20 minutes. Note that a whole import process takes about
3 minutes if this part is ommitted, and that includes about 40 vertical split
passes. The UDF is very simple, and performs quite fast from VB, but I guess
the overhead of calling this function from a query is VERY SEVERE. Just to
get this thing out the door, I've decided to just tnot split these for now
since we're not doing queries of that data yet.
So, my next thought is that, perhaps this step is better done by simply,
brute-force, cycling through a recordset of the input table, and inserting
rows into the destination tables. If I'm soing that, though, then am I really
getting any benefit worth speaking of by doing everything -else- as SQL-DML in
vertical slices, or would I have been much better off, just doing it the
procedural way, walking through the input table, and creating destination rows
one row at a time? I know it would have been easier to write, but here I was
trying to do things the "right" way.
If I do something like this again, would my code end up performing just as
well, and being easier to write and maintain using simple iteration rather
than SQL-DML?
Thanks for any opinions,
- Steve J
Nov 12 '05
20 2767
On Sat, 31 Jan 2004 22:35:32 GMT, Steve Jorgensen
<no****@nospam. nospam> wrote: I got the impression your SQL statements were executing VBA functions. I was talking (unclearly) of building SQL statements that contain only literal values, no function calls.
Oh, right. I see what you're getting at.
The issue was that the only way I could figure out to use a query to split up the multi-vlaued fields was to call a UDF.
And you were trying to approach this in a set-based way. Sorry; I
lost the context somewhere.
Did you consider using third-party software like DataJunction? I had
a project that involved moving loads of data between a SQL system and
Lotus Notes, and a third-party program really made my life easier.
(It wasn't DataJunction, but it was the same kind of thing.)
--
Mike Sherrill
Information Management Systems This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: Christos TZOTZIOY Georgiou |
last post by:
I found at least one case where decombining and recombining a unicode
character does not result in the same character (see at end).
I have no extensive knowledge about Unicode, yet I believe that this
must be a problem of the Unicode 3.2 specification and not Python's.
However, I haven't found out how the decomp_data (in unicodedata_db.h)
is built, and neither did I find much more info about the specifics of
Unicode 3.2. I thought about...
|
by: Torsten Bronger |
last post by:
Hallöchen!
I have a file that looks a little bit like a C header file with a
long list of variables (actually constants) definitions, e.g.
VI_ATTR_TIMO = 0x54378
....
Actually I need this in a couple of low-level modules that are
imported into the main module, and in the main module itself. They
|
by: Doug Baroter |
last post by:
Hi,
One of my clients has the following situation. They use Access DB for
data update etc. some business functions while they also want to view
the Access data quickly and more efficiently in SQL Server 2000. Huge
Access db with over 100 user tables, over 60 MB data.
The DTS package that comes with SQL Server 2000 seems pretty "messy"
in the sense that it assumes that one needs to do one time import only
or accurately it does not...
|
by: Steve Jorgensen |
last post by:
I'm restarting this thread with a different focus.
The project I'm working on now id coming along and will be made to work, and
it's too late to start over with a new strategy. Still, I'm not coming to a
firm conclusion over whether it was the better approach, and wonder if I
should do it differently the next time I'm faced with a similar issue.
I needed an app to automatically import from spreadsheets with a semi-dynamic
structure,...
|
by: Colleyville Alan |
last post by:
I need to import a spreadsheet into an Access table. The spreadsheet has
performance for mutual funds for various periods. The problem is that if
there is no info for a particular period, the spreadsheet contains a dash in
the cell (for example if the fun is only 4 years old and I look in the
columns for 5-year and 10-year performance).
I cannot change the dashes to blank with a global change, the negative
numbers would lose the...
| |
by: lwhite |
last post by:
MS SQL 2000 sp4 on WinXp Pro SP2
I am very new to this so please let me know what I can do to make it
easier for you to understand the problem.
I have a non delimited text file. This text file has several columns
that for the most part are fixed length but..
The fixed format starts with a variable length number( char 10) as the
first column and a max (char 30) description field and another (char
50) long description field.
|
by: Hemant Shah |
last post by:
Folks,
Today, I was exporting a table in one database and then importing it in
another database. The table in destination database was missing one column
(my mistake while creating the table), but import did not complain about it.
Source table:
Column Type Type
|
by: bbcrock |
last post by:
I have three tables with a relationship I've never worked with
before. Can anyone suggest/comment on the best way to create a third
normal form relationship between these tables?
The tables basically are:
TRAIN (TRAIN_ID and 15 columns about train specs, etc)
TRUCK (TRUCK_ID and 12 columns about truck specs, etc)
TRANSPORTATION_ITEM
This table has, among others, two columns, TRUCK_ID and TRAIN_ID. If
|
by: bugboy |
last post by:
At what point is normalizing worth forgetting?
I have a column varchar (4) that will repeat one of only eight possible values for each row. Is it really more efficient to break this out into it's own table and link it with keys than to just have it repeat? ... it's only 4 characters..
I suppose if there are only 8 values i could make the column a char (1).. would this be more efficient or am i just splitting hairs here?
Everywhere i...
|
by: Frank Millman |
last post by:
Hi all
I am familiar enough with the normal use of 'import'. However, I have
found a use for it which seems effective, but I have not seen it used
like this before, so I am not sure if there are any downsides.
I know that when a module is imported the first time, it is
'executed'. This normally entails setting up constants, classes,
functions, etc, that you want to make available to the importer.
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look !
Part I. Meaning of...
| |
by: Oralloy |
last post by:
Hello folks,
I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>".
The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed.
This is as boiled down as I can make it.
Here is my compilation command:
g++-12 -std=c++20 -Wnarrowing bit_field.cpp
Here is the code in...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
|
by: agi2029 |
last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own....
Now, this would greatly impact the work of software developers. The idea...
|
by: isladogs |
last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM).
In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules.
He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms.
Adolph will...
|
by: TSSRALBI |
last post by:
Hello
I'm a network technician in training and I need your help.
I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs.
The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols.
I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
|
by: adsilva |
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
| |
by: 6302768590 |
last post by:
Hai team
i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
|
by: muto222 |
last post by:
How can i add a mobile payment intergratation into php mysql website.
| | |