473,385 Members | 1,863 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

Back to the basics...

Folks,

I'm just now beginning to use Access. To date, I've managed hundreds
of thousands of records in SAS, but would like to expand my software
horizon. I've read some and sat throught some classes, but one of
things that I can't seem to get my head wrapped around is this: if
you're given a large table that has missing and incomplete records and
you want to clean that data up, (replace missing dates using the
previous record and randomly assign records a value based on a number
of criteria for example), do you actually modify the original table or
simpy create queries that create "temporary" tables that contain the
new and improved records?

I know that in SAS, you may have several data steps and several
temporary data sets, but when you get the data like you want it, you
create a permanent SAS data set that is stored on your computer or a
server. In the end, your raw data remains in tact and you also have a
cleaned copy of that data for subsequent analyses. All of the
intermediate data sets are gone when you exit SAS. Perhaps queries in
Access can be viewed as intermediate data steps. And I suppose that if
you wanted, you could create a permanent copy of the cleaned table once
you got the data just the way you wanted it. Does this make any sense
at all?

Mike

Apr 18 '06 #1
8 1307
only if every field in your table is really effed up. no need to create
a completely seperate table.

i'd say the approach i prefer is to just use update queries to
supplement data where missing using an Is Null criteria in the records
i'd like to change. you may have to supplement that Is Null to handle
empty strings. allen browne has a good discussion on nulls and empty
sets on his website.

if your data is in really bad shape you may want to create an Append
query to move data out of the current table and into a new one, but to
avoid asynchronous data you'll have to be very careful to specify
absolutely all correctable possibilities when the table is first
updated.

like i said, i'd use Update queries and take it field by field or a
couple fields at a time and keep things in a more or less piecemeal
fashion to avoid missing things.

if you are pretty well rebuilding a table from the ground-up, you'll
probably want to take a close look at your data normalization at the
same time to ensure you don't have to go through this process more than
once.

Apr 18 '06 #2
Often, you will have a raw table of data.

You certainly might make a copy of this table, and then work on it....

You might run several update queries to set a date value for all columns
that don't have a date.

You then might delete all recodes that have some status value of a
particular value. You can continue these updates and cleaning as much as you
want.

However, each time you make a update, or clean out some data, you are fact
modifying the table you are working on.

If you just are doing some selects and query of data, then the results are
not saved anywhere, but you can save the query. This is an advantage in MOST
cases, since if you have a query that returns the data you want, and then
main table is modified, then that query will always return the correct data.
(if you add, or delete records in the table..then the query will reflect
his). So, a query is not really a snapshot of the data, and in MOST cases
you don't want that.

However, if you finally do get your query the way you want, you can simply
change the query in the sql builder to a append query..and run it....you
will then have new table with just this data. So, it is very easy to "save"
the data you slice and dice..it just a question of you deciding to do this
(access does not have some type of session or such...but it is easy and up
to you if you want to save the data to a new table for history reasons.

It is also not clear how your previously cleaned up data, and removed data,
and set values of data without suing some type of query, or programming
system that let you do this....(or, perhaps you used sql in these other
systems...and you can/should do the same in ms-access).

In ms-access, we typically use the query builder to modify and clean up
data....

--
Albert D. Kallal (Access MVP)
Edmonton, Alberta Canada
pl*****************@msn.com
http://www.members.shaw.ca/AlbertKallal
Apr 18 '06 #3
Hey Jamey - Thanks a lot for the insight. That's exactly the kind of
information I was looking for. I don't see the need to rebuild the
table from ground zero, so it sounds as though update queries are the
way to go. Do you have a link to Allen Browne's discussion of nulls
and empty sets?

Have a great day. I'm sure you'll be seeing my name a lot as I dive
into this stuff. Can't wait!

Mike

Apr 18 '06 #4
Albert,

Thanks for the help. It really cleared things up. The only queries
I've ever done have been select queries. I couldn't visualize how you
could do much with such a query. It sounds as though Update queries
are what I really need to be thinking about. As for your question
about how I did things before, here is a snipet of SAS code that is
used to assign SEASON OF HARVEST to each deer:

DATA TEMP.L;

LENGTH SEASON $ 40;

SET TEMP.K;

IF CS IN (9999,999)

THEN SEASON='NASA';

ELSE IF CS IN (9998,998)

THEN SEASON='RAVENNA';

ELSE IF CS=930

THEN SEASON='SALT FORK';

* ELSE IF CS=992

THEN SEASON='SHAWNEE';

ELSE IF CS=991

THEN SEASON='WILDCAT';

ELSE IF (CS=7307 AND DATE IN
('24OCT05'D,'25OCT05'D,'26OCT05'D,'27OCT05'D,

'28OCT05'D,'29OCT05'D) AND
WEAPON=4)
THEN SEASON='SHAWNEE';

ELSE IF DATE IN ('19NOV05'D,'20NOV05'D) AND WEAPON IN (1,4,5)

THEN SEASON='YOUTH';

ELSE IF DATE IN
('28NOV05'd,'29NOV05'd,'30NOV05'd,'01DEC05'd,'02DE C05'd,

'03DEC05'd,'04DEC05'd)

THEN SEASON='GUN';

ELSE IF DATE IN ('27DEC05'D,'28DEC05'D,'29DEC05'D,'30DEC05'D)

THEN SEASON='SWML';

RUN;

****
The code below ADDRESSES DATE ISSUES - CONVERTS DATE VARIABLES TO SAS
DATE VALUE AND ASSIGNS A
DATE TO RECORDS WITH MISSING
VALUES

****
DATA TEMP.A;

LENGTH ORIGINALDATE $ 30;

SET SASUSER.QURY5798;

IF ('01OCT05'd LE DATE LE '31JAN06'd) THEN DO;

ORIGINALDATE='GOOD';

END;

ELSE DO;

ORIGINALDATE='BAD';

END;

RUN;

DATA TEMP.B;

RETAIN NEWDATE_RETAINED;

SET;

IF ORIGINALDATE='GOOD' THEN DO;

NEWDATE_RETAINED=DATE;

END;

DATA TEMP.C;

SET;

DROP DATE;

DATA TEMP.D;

SET;

RENAME NEWDATE_RETAINED=DATE;

RUN;
There are about 1/2 dozen or so of these sorts of operations that are
used to clean the data up. I have another section that makes sure that
all the county names are valid and so on.

Would these sorts of things be difficult to do with update queries?

Mike

Apr 18 '06 #5
Takeadoe wrote:
Albert,

Thanks for the help. It really cleared things up. The only queries
I've ever done have been select queries. I couldn't visualize how you
could do much with such a query. It sounds as though Update queries
are what I really need to be thinking about. As for your question
about how I did things before, here is a snipet of SAS code that is
used to assign SEASON OF HARVEST to each deer:

DATA TEMP.L;

LENGTH SEASON $ 40;

SET TEMP.K;

IF CS IN (9999,999)

THEN SEASON='NASA';

ELSE IF CS IN (9998,998)

THEN SEASON='RAVENNA';

ELSE IF CS=930

THEN SEASON='SALT FORK';

* ELSE IF CS=992

THEN SEASON='SHAWNEE';

ELSE IF CS=991

THEN SEASON='WILDCAT';

ELSE IF (CS=7307 AND DATE IN
('24OCT05'D,'25OCT05'D,'26OCT05'D,'27OCT05'D,

'28OCT05'D,'29OCT05'D) AND
WEAPON=4)
THEN SEASON='SHAWNEE';

ELSE IF DATE IN ('19NOV05'D,'20NOV05'D) AND WEAPON IN (1,4,5)

THEN SEASON='YOUTH';

ELSE IF DATE IN
('28NOV05'd,'29NOV05'd,'30NOV05'd,'01DEC05'd,'02DE C05'd,

'03DEC05'd,'04DEC05'd)

THEN SEASON='GUN';

ELSE IF DATE IN ('27DEC05'D,'28DEC05'D,'29DEC05'D,'30DEC05'D)

THEN SEASON='SWML';

RUN;

****
The code below ADDRESSES DATE ISSUES - CONVERTS DATE VARIABLES TO SAS
DATE VALUE AND ASSIGNS A
DATE TO RECORDS WITH MISSING
VALUES

****
DATA TEMP.A;

LENGTH ORIGINALDATE $ 30;

SET SASUSER.QURY5798;

IF ('01OCT05'd LE DATE LE '31JAN06'd) THEN DO;

ORIGINALDATE='GOOD';

END;

ELSE DO;

ORIGINALDATE='BAD';

END;

RUN;

DATA TEMP.B;

RETAIN NEWDATE_RETAINED;

SET;

IF ORIGINALDATE='GOOD' THEN DO;

NEWDATE_RETAINED=DATE;

END;

DATA TEMP.C;

SET;

DROP DATE;

DATA TEMP.D;

SET;

RENAME NEWDATE_RETAINED=DATE;

RUN;
There are about 1/2 dozen or so of these sorts of operations that are
used to clean the data up. I have another section that makes sure that
all the county names are valid and so on.

Would these sorts of things be difficult to do with update queries?

Mike

You can do all sorts of stuff with queries; append, update, delete,
total/sum, etc. Once you have your records in a table, prior to running
any "action" queries, make a copy of the table.

I notice the dates are in a format that is not a "date" for access, its
a string. You might need to fix that...or create a column that makes a
valid "date". See DateSerial in help. The reason is I saw a criteria
for dates in a range.

You mentioned you may be posting many questions as you dig into Access.
One of the best sources of information is GoogleGroups since many of
us have been there/done that. See
http://groups.google.com/advanced_search?hl=en. In the group box I
usually enter *Access* and then the keywords I'm searching for to only
return answers from those using MSAccess.

Another place is http://www.mvps.org/access for code examples.

In queries, here's a basic tip: Surround strings with " or '. If
looking for the word Test, in the criteria row enter "Test". If Date,
surround by #. Ex: Between #1/1/2006# And #4/15/2006#. And if a
number, nothing but the number.

I think you'll find the querybuilder intuitive...but you don't want to
hose the table when trying out something new so that's why I mention
always having a master you can return to if the need to start over
occurs. BTW...if you create an action query like an Update, you can
press the View icon and it basically acts like a Select query. If you
press the ! exclamation button, it will run the query.


Apr 18 '06 #6
Salad - Hey thanks again for the leads. The MVPS web site looks like
it will be pretty handy! I'm pretty stoked about all of this stuff! I
can't wait to do my first query. Since I've already cleaned up this
year data using SAS, yet still have the original "uncleaned" data in an
Access table, I'm going to try and pick away at this 1 step at a time
and see if I can't get the Access table to look like the final SAS data
set. You've probably already seen my question about the replacing
missing and invalid date values. I appreciate the leads and the help.
I will do my best to learn as much by example and limit my questions.
The last thing I want is for people to think they're doing my work.

Take care,

Mike

Apr 18 '06 #7
Thank you.

Mike

Apr 24 '06 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

72
by: Mel | last post by:
Are we going backwards ? (please excuse my spelling...) In my opinion an absolute YES ! Take a look at what we are doing ! we create TAGS, things like <H1> etc. and although there are tools...
1
by: manish | last post by:
Hi, I am a fresher in the programming field i.e although I have done programming at the basic level but at professional level I am very new and I am facing many problems. These probllems are...
2
by: Phil | last post by:
I have the following code but do not know the best way to return the updated DataTable back to the database. I believe I can use the Update method of the Data Adapter, BUT if true, I also believe I...
1
by: RC | last post by:
I have an Access 2002 database with many tables and forms (but just to keep things simple, let's say the DB has one Table "Table1" and one Form "Form1"). I have managed to cobble together so much...
14
by: chitu | last post by:
Hi pals, Hope u all of r intersted in C programming.Let start from d basics.
3
Mague
by: Mague | last post by:
Hey, I need help to some how open a exe file with Microsoft Visual basics. You mite think im trying to steal someone elses project but im not i wont to no how they did it and upgrade it if...
20
by: Prisoner at War | last post by:
Hi, People, Is it possible to have an "empty" or "dummy" <a href***without*** the browser jumping back up the page?? I have a hyperlink that doesn't point to another document, but is used to...
0
by: babu8880 | last post by:
www.convensysglobal.com/elearn You can login and attend courses and similar resources ask your doubts from experienced engineers in iphone and Objective C Domain You can attend live classes...
23
by: artur223 | last post by:
Hello, I have a problem with IF(IIf) statement in query expression builder(criteria field). It should be like : If !! = "" then criteria field should be filled ! else criteria field should be...
2
Lookin4Toons
by: Lookin4Toons | last post by:
Trying to create a text box w/user input to enter a suggested user name and query AD to see if the name is available. This is for HR users who do not have access to the Admin tools to open the...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.