Lesson 1 – Introduction to Data Analysis Using R

In this lesson we will learn about the features and uses of R.

R is a software environment that is excellent for data analysis and graphics.
It was initially created in 1993 by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. They created R as a language to help teach introductory statistics to their students. They based R on the S language that was developed earlier at Bell Labs in the 1970s.
After some time they made R available as an open source GNU project. A very active R community now exists around the world.
R is considered a Domain Specific Language as it was designed primarily for data analysis.
R programs are typically created using functions and the programs are executed by an R interpreter.
R is more than a programming language: it has native support for creating high-quality data visualizations.
R is used across many industries such as healthcare, retail, and financial services.
R can be used to analyze both structured and unstructured datasets.
R can help you explore a new dataset and perform descriptive analysis.
R is also excellent at building predictive models.
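
To make the points above concrete, here is a minimal sketch of exploring a dataset, describing it, and fitting a simple predictive model. It assumes the built-in mtcars data frame as a stand-in dataset; the choice of columns and the linear model are illustrative only.

```r
# mtcars is a small data frame that ships with R (datasets package).
str(mtcars)            # structure: 32 observations of 11 variables
summary(mtcars$mpg)    # quartiles, minimum, maximum, and mean

# Descriptive analysis: mean fuel economy by number of cylinders.
mpg_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, mean)
print(mpg_by_cyl)

# A simple predictive model: miles per gallon as a linear function of weight.
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))

# Predict mpg for a hypothetical car weighing 3,000 lbs (wt is in 1000 lbs).
predict(fit, newdata = data.frame(wt = 3))
```

Each line can be typed at the R prompt one at a time, which is how most exploratory work in R is done.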

There are many reasons why learning R is beneficial.
As a data analyst or data scientist, you can use R to dig deeper into your data than is possible using spreadsheet-based tools alone.
As a software developer, you can use R to add data analytics computations and graphics to new or existing applications with minimal effort.
With the explosion of Big Data, there are many new scenarios where using R is an excellent choice to help meet user demands.
As a data analyst, you can use R to run classical statistical tests and build predictive models.
R also has native support for handling time-series datasets.
Classification and clustering models can be used to better detect patterns.
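
The three capabilities just mentioned can be sketched with base R alone. This example assumes the built-in mtcars, AirPassengers, and iris datasets; the quarterly "sales" figures are made up for illustration, and hierarchical clustering is just one of several clustering methods R offers.

```r
# Classical statistical test: do automatic and manual cars differ in mpg?
t.test(mpg ~ am, data = mtcars)

# Time series: AirPassengers is a monthly series of class "ts".
class(AirPassengers)
frequency(AirPassengers)   # 12 observations per year
window(AirPassengers, start = c(1960, 1), end = c(1960, 12))

# Build a ts object from a plain numeric vector (made-up quarterly values).
sales <- ts(c(12, 15, 14, 18, 13, 17, 16, 21), start = c(2013, 1), frequency = 4)
print(sales)

# Clustering: hierarchical clustering of the four iris measurements.
tree <- hclust(dist(iris[, 1:4]))
groups <- cutree(tree, k = 3)

# Compare the cluster assignments with the known species labels.
table(groups, iris$Species)
```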
For developers, R is a powerful functional programming language.
Because R scripts are interpreted, the language encourages an interactive approach to development.
R scripts are typically written using expressions and built-in functions.
R provides native support for many useful types of data structures. Many of these data structures will be explored in other lessons.
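
As a quick preview, the following sketch shows the most common built-in data structures and a few expressions that use them. All variable names and values here are invented for illustration.

```r
# Atomic vector: every element has the same type.
speeds <- c(55, 62.5, 70)

# List: elements may differ in type and length.
car <- list(make = "Example", year = 2013L, speeds = speeds)

# Matrix: a two-dimensional vector, filled column by column.
m <- matrix(1:6, nrow = 2)

# Factor: categorical data with a fixed set of levels.
fuel <- factor(c("gas", "diesel", "gas"))

# Data frame: a table whose columns are equal-length vectors.
cars_df <- data.frame(id = 1:3, speed = speeds, fuel = fuel)

# Expressions combine these structures with built-in functions.
mean(speeds)
dim(m)
levels(fuel)
cars_df[cars_df$speed > 60, ]   # rows where speed exceeds 60
```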
External libraries can be used to extend the capabilities of R.
As your R skills improve, you will likely start to define your own functions, and possibly new classes, to meet the demands of your users.
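
Here is a minimal sketch of what that can look like. The names rescale, new_reading, and the "reading" class are hypothetical, and S3 is only one of the class systems available in R.

```r
# A user-defined function with a default argument.
rescale <- function(x, to = 1) {
  (x - min(x)) / (max(x) - min(x)) * to
}
rescale(c(2, 4, 6))

# Functions are values: pass an anonymous function to sapply().
squares <- sapply(1:5, function(n) n^2)
print(squares)

# A minimal S3 class: a constructor plus a print method.
new_reading <- function(value, unit) {
  structure(list(value = value, unit = unit), class = "reading")
}
print.reading <- function(x, ...) {
  cat(x$value, x$unit, "\n")
  invisible(x)
}

r <- new_reading(21.5, "C")
print(r)   # dispatches to print.reading()
```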
Installing R is quite simple.
Simply navigate to the R Project page and click on the Comprehensive R Archive Network or CRAN link.
CRAN is a set of servers around the world that store identical, up-to-date versions of code and documentation for R.
There are binary installers available for Windows, Linux, and Mac OS platforms. It is possible to build R from source, but it is best to avoid this step if possible so you can get started using R more quickly.
Installing R on Windows involves downloading the installer file from CRAN and executing it.
There are 32-bit and 64-bit installation options available. We will use the 64-bit version for our coursework as it has higher limits on the amount of memory that can be used.
Once the Windows installation has finished you can get started with R by launching the R command line environment or the RGui tool.
RGui provides some useful productivity features beyond the R command line environment for R users.
Installing R on Linux involves either downloading the appropriate RPM file from the CRAN website or using a Linux package manager such as YUM.
Note that you must be logged in as a root user or have sudo privileges on your Linux system to complete the installation.
Once installed on the system any user can use R.
By default, there is an R command line and GUI provided, but many R users prefer to use a more comprehensive Integrated Development Environment (IDE) such as Rcmdr or RStudio.
RStudio is an excellent alternative to the RGui tool provided with R. RStudio is available on Linux, Mac OS X, and Windows.
In this configuration we are using RStudio on a Linux server from within a browser.
This environment is ideal for occasional R users as they would not need to install R on their own computer to use it.
Let's examine some of the tiled windows shown here:
• In the top left corner we are able to view the 2013_cars.csv data file and an R source file called cars.R.
• In the bottom left corner we have the R Console.
• In the top right corner we have access to the objects in the current R workspace and a history of recently used R commands.
• In the bottom right corner we have a histogram plot of data along with access to the R help utility.
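
A workflow like the one described for these panes (open a CSV file, run a script, view a histogram) could be sketched as follows. The actual contents of 2013_cars.csv and cars.R are not shown in this lesson, so this example writes a few columns of the built-in mtcars data to a temporary CSV file as a stand-in; the file name and column choices are assumptions.

```r
# Stand-in data: the real columns of 2013_cars.csv are not known, so a few
# columns of the built-in mtcars data are written to a temporary CSV file.
csv_path <- file.path(tempdir(), "cars_example.csv")
write.csv(mtcars[, c("mpg", "cyl", "wt")], csv_path, row.names = FALSE)

# Read the file into a data frame and inspect it in the console.
cars_data <- read.csv(csv_path)
head(cars_data)
summary(cars_data)

# Draw a histogram; in RStudio it appears in the plot pane.
h <- hist(cars_data$mpg, main = "Fuel economy", xlab = "Miles per gallon")
```

After running this, cars_data and h would be listed among the objects in the workspace pane.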

It is worth the time and effort to install an IDE such as RStudio as you learn R.
Previously we stated that R can be extended using packages.
There are over 4,000 different packages available in CRAN, and more are added frequently.
The packages published in CRAN are categorized based on their functionality into Task Views.
During this course we will primarily use the built-in or standard set of packages, but you may wish to explore some of the additional packages along the way.
The base R environment provides a significant set of functions for data analysis, but there are many excellent packages available from the R Community.
New packages can be installed using the install.packages() function.
By default CRAN is searched for the package, but you may have been given a new package that is not available in CRAN.
In that case, simply use the same function and direct it to the compressed archive file for the new package.
Here we see that the RJDBC package is being installed to enable connectivity to database servers such as Informix or DB2 through a JDBC driver.
If you develop an R script that uses functions that are not part of base R, your script should call library() or require() within its first few lines so that the package is loaded into memory at runtime.
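
The install-then-load pattern can be sketched as below. The helper name ensure_package is hypothetical, and the archive path in the final comment is a placeholder. The example loads the stats package, which ships with R, so nothing is downloaded when it runs.

```r
# Hypothetical helper: load a package, installing it from CRAN first if needed.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # searches the configured CRAN mirror
  }
  # character.only = TRUE lets library() accept the name as a string.
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" ships with R, so this call only loads it.
ensure_package("stats")

# The same pattern would apply to a CRAN package such as RJDBC:
# ensure_package("RJDBC")

# To install from a local archive instead (the path is a placeholder):
# install.packages("path/to/package.tar.gz", repos = NULL, type = "source")
```

In a short script it is usually enough to put a plain library() call at the top and install the package once beforehand.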
Sep 4 '14 #1
Please provide proper citations for these articles.
Sep 29 '14 #2
