Am trying to import 20,000+ lines of text in a file
FTP'd from a UNIX platform to windows via FTP
session in a DOS box. About 2000 records have
multiple lines in them separated by CRLF's.
That's not the idea. Each line was supposed to
be a separate record. I don't know why the 2000
records having MULTIPLE lines of text are so
stubborn. I've run UNIX2DOS utility against the
raw text file - no help. I know that UNIX and MAC
systems standards for end-of-line termination characters
differ. I really thought the UNIX2DOS utility would
do the trick. But each time I import the text file -
same thing.
Here's the SQL that shows me the 2000 problematic
rows:
SELECT Apple.Field1 FROM Apple
WHERE (((Apple.Field1) Like "*" & Chr$(13) & "*"));
I've analyzed the text file with a HEX editor and I see
nothing different being used for end-of-line termination
in the problematic rows than in the other lines (&H0d
and &H0a). Why are 2000 records being formed during
the text importation with 2, 3 and perhaps more lines
in a single record? What can I do to stop that from
happening? I really don't feel like doing this manually
in 2000 different places in a 20000+ line text file. 8 2112
BTW, if I select about 50 rows of the pertinent field
in an open table, copy with CTRL-C and paste into
notepad, I see the problematic row data is wrapped
in dbl-quotes. The non-problematic row data has no
such quote wrapping. Dunno if that's important, but
it is factual.
MLH wrote: Am trying to import 20,000+ lines of text in a file FTP'd from a UNIX platform to windows via FTP session in a DOS box. About 2000 records have multiple lines in them separated by CRLF's.
That's not the idea. Each line was supposed to be a separate record. I don't know why the 2000 records having MULTIPLE lines of text are so stubborn. I've run UNIX2DOS utility against the raw text file - no help. I know that UNIX and MAC systems standards for end-of-line termination characters differ. I really thought the UNIX2DOS utility would do the trick. But each time I import the text file - same thing.
Here's the SQL that shows me the 2000 problematic rows: SELECT Apple.Field1 FROM Apple WHERE (((Apple.Field1) Like "*" & Chr$(13) & "*"));
I've analyzed the text file with a HEX editor and I see nothing different being used for end-of-line termination in the problematic rows than in the other lines (&H0d and &H0a). Why are 2000 records being formed during the text importation with 2, 3 and perhaps more lines in a single record? What can I do to stop that from happening? I really don't feel like doing this manually in 2000 different places in a 20000+ line text file.
It's difficult to diagnose what is happening from the information that
you provided, but you might want to attempt your ftp again, and specify
binary mode. FTP will insert returns in ASCII mode.
In absolute worse case, you could read the file in with VBA in character
mode and properly terminate the lines.
--
Randy Harris
tech at promail dot com
I'm pretty sure I know everything that I can remember.
MLH <CR**@NorthState.net> wrote in
news:dk********************************@4ax.com: Am trying to import 20,000+ lines of text in a file FTP'd from a UNIX platform to windows via FTP session in a DOS box. About 2000 records have multiple lines in them separated by CRLF's.
That's not the idea. Each line was supposed to be a separate record. I don't know why the 2000 records having MULTIPLE lines of text are so stubborn. I've run UNIX2DOS utility against the raw text file - no help. I know that UNIX and MAC systems standards for end-of-line termination characters differ. I really thought the UNIX2DOS utility would do the trick. But each time I import the text file - same thing.
Here's the SQL that shows me the 2000 problematic rows: SELECT Apple.Field1 FROM Apple WHERE (((Apple.Field1) Like "*" & Chr$(13) & "*"));
I've analyzed the text file with a HEX editor and I see nothing different being used for end-of-line termination in the problematic rows than in the other lines (&H0d and &H0a). Why are 2000 records being formed during the text importation with 2, 3 and perhaps more lines in a single record? What can I do to stop that from happening? I really don't feel like doing this manually in 2000 different places in a 20000+ line text file.
The rows have the CrLf embedded in the quoted strings. That
means they were placed in the original file.
Try to redo the import specification to set the "Text
Delimiter" value to the double-quote.
--
Bob Quintal
PA is y I've altered my email address.
> The rows have the CrLf embedded in the quoted strings. That means they were placed in the original file.
Sorry, what is meant by 'original file'? The file is created
on-the-fly using ls command to do a dir of the unix box
current directory - that listing is what's being written to
the local PC during the process. Strangely, out of 851
root-level dirs jointly containing 24000+ 2nd level dirs,
only about 2000 entries (rows which ultimately become
records) are comprised of multiple CRLF-separated lines
that are enclosed in dbl-quotes. 90% of the lines are not
enclosed in dbl-quotes but DO have carriage returns &
line feeds at the ends of the lines. They have to wrap
somehow. Try to redo the import specification to set the "Text Delimiter" value to the double-quote.
>It's difficult to diagnose what is happening from the information that you provided, but you might want to attempt your ftp again, and specify binary mode. FTP will insert returns in ASCII mode.
In absolute worse case, you could read the file in with VBA in character mode and properly terminate the lines.
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Well, I did it the old-fashioned way. I used elbow grease.
If it ever becomes necessary 4 me 2 do this again, it will
remain a problem - I have no idea how to prevent it. Any-
way, the SQL enabled me to single out the crappy stuff,
get rid of the dbl-quotes, parse the lines out properly, and
append the RIGHT stuff back to the table in an orderly
fashion.
Sounds like the problem is in the unix box...
you could fix the text with vba before you import to fix it - remove
dbl qoutes that come directly after vbcrlf and remove vbcrlfs that
proceed quotes. then remove all remaining quotes.
but i'd fix the output from the unix box - or throw it back to the
person who is responsible for it.
MLH <CR**@NorthState.net> wrote in
news:9q********************************@4ax.com: The rows have the CrLf embedded in the quoted strings. That means they were placed in the original file. Sorry, what is meant by 'original file'? The file is created on-the-fly using ls command to do a dir of the unix box current directory - that listing is what's being written to the local PC during the process.
Then it is the directory structure that contains some
descriptors or metadata which contains the double quotes..
Strangely, out of 851 root-level dirs jointly containing 24000+ 2nd level dirs, only about 2000 entries (rows which ultimately become records) are comprised of multiple CRLF-separated lines that are enclosed in dbl-quotes. 90% of the lines are not enclosed in dbl-quotes but DO have carriage returns & line feeds at the ends of the lines. They have to wrap somehow. Try to redo the import specification to set the "Text Delimiter" value to the double-quote.
--
Bob Quintal
PA is y I've altered my email address.
MLH wrote: The rows have the CrLf embedded in the quoted strings. That means they were placed in the original file. Sorry, what is meant by 'original file'? The file is created on-the-fly using ls command to do a dir of the unix box current directory - that listing is what's being written to the local PC during the process. Strangely, out of 851 root-level dirs jointly containing 24000+ 2nd level dirs, only about 2000 entries (rows which ultimately become records) are comprised of multiple CRLF-separated lines that are enclosed in dbl-quotes. 90% of the lines are not enclosed in dbl-quotes but DO have carriage returns & line feeds at the ends of the lines. They have to wrap somehow. Try to redo the import specification to set the "Text Delimiter" value to the double-quote.
You've got something else going on. The ls command won't put double
quotes around anything. FTP, in ASCII mode, will sometimes however, add
CR-LF every 255 characters.
--
Randy Harris
tech at promail dot com
I'm pretty sure I know everything that I can remember. This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics
by: TJ |
last post by:
Anyone know how to display process id's (normally found by going to
Component Services and switching to applications to 'Status View') in a web
page?
I'm trying to keep remote users from having...
|
by: Graham Simms |
last post by:
I have tried to run a query where the reuslts are restricted by a
HAVING clause with 2 conditions, but the second condition seems to be
ignored.
for example
SELECT userid, AVG(position) FROM...
|
by: |
last post by:
I fail to understand why that the memory allocated in the void
create(int **matrix) does not remain. I passed the address of matrix so
shouldn't it still have the allocated memory when it returns...
|
by: Edward Diener |
last post by:
Given
value class X
{
public:
// Not allowed: X():i(100000),s(10000) { }
// Allowed
void InitializeDefaults() { i = 100000; s = 10000; }
private:
int i;
|
by: Vinod |
last post by:
Hi,
In VC8 project, I am having a struct which is having a char* variable.
Now I am creating a 3 elements array object for the struct. I send the base
address of the object using VARIANT to a...
|
by: vanandwiz |
last post by:
Please find the code below.
SELECT
COMPANY.NAME,
COUNT(ITEM.ID)
FROM
ITEM,
COMPANY
WHERE ITEM.COMPANYID = COMPANY.ID
GROUP BY ROLLUP(COMPANY.NAME)
|
by: DR |
last post by:
I heard there is some trick to referencing statics in C# CLR stored
procedure without having to mark the assembly as unsafe. Does anyone know
this? This is usefull as the case of needing a little...
|
by: Philia |
last post by:
Hello,
I'm having problem migrating to postgreSQL from MySQL, some of my queries does not work in postgres.
Query(simplified):
SELECT t1.bus_stop, t1.time,
(case when t1.time >2200 then...
|
by: Jack |
last post by:
Hi,
Performance-wise, is there a difference between these 2 clauses?
I assume a large database...
1) With the WHERE clause:
Select FLD1, FLD2, FLD3
From myFile
Where FLD1 = 1 And FLD2 = 1
|
by: tiptap |
last post by:
Hey Guys,
I have a huge statement loads of if statements in... and its getting bigger.
On closer inspection there is only 3 difference in the select statement. so I thought I could cut the...
|
by: emmanuelkatto |
last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud.
Please let me know.
Thanks!
Emmanuel
|
by: BarryA |
last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
|
by: nemocccc |
last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
|
by: Sonnysonu |
last post by:
This is the data of csv file
1 2 3
1 2 3
1 2 3
1 2 3
2 3
2 3
3
the lengths should be different i have to store the data by column-wise with in the specific length.
suppose the i have to...
|
by: Hystou |
last post by:
There are some requirements for setting up RAID:
1. The motherboard and BIOS support RAID configuration.
2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
|
by: Hystou |
last post by:
Overview:
Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
|
by: isladogs |
last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM).
In this session, we are pleased to welcome a new...
| | |