<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:og="http://ogp.me/ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:schema="http://schema.org/" xmlns:sioc="http://rdfs.org/sioc/ns#" xmlns:sioct="http://rdfs.org/sioc/types#" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" version="2.0" xml:base="https://www.linuxjournal.com/tag/r">
  <channel>
    <title>R</title>
    <link>https://www.linuxjournal.com/tag/r</link>
    <description/>
    <language>en</language>
    
    <item>
  <title>Open Science, Open Source and R</title>
  <link>https://www.linuxjournal.com/content/open-science-open-source-and-r</link>
  <description>  &lt;div data-history-node-id="1340424" class="layout layout--onecol"&gt;
    &lt;div class="layout__region layout__region--content"&gt;
      
            &lt;div class="field field--name-field-node-image field--type-image field--label-hidden field--item"&gt;  &lt;img src="https://www.linuxjournal.com/sites/default/files/nodeimage/story/bigstock-Statistics-and-Analysis-of-Dat-15762752.jpg" width="800" height="517" alt="data analysis and statistics" typeof="foaf:Image" class="img-responsive" /&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-author field--type-ds field--label-hidden field--item"&gt;by &lt;a title="View user profile." href="https://www.linuxjournal.com/users/andy-wills" lang="" about="https://www.linuxjournal.com/users/andy-wills" typeof="schema:Person" property="schema:name" datatype="" xml:lang=""&gt;Andy Wills&lt;/a&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"&gt;&lt;p&gt;&lt;em&gt;Free software will save psychology from the Replication Crisis.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Study reveals that a lot of psychology research really is just
'psycho-babble'"&lt;/em&gt;.—&lt;a href="https://www.independent.co.uk/news/science/study-reveals-that-a-lot-of-psychology-research-really-is-just-psycho-babble-10474646.html"&gt;&lt;em&gt;The Independent&lt;/em&gt;.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;
Psychology changed forever on August 27, 2015. For the previous
four years, the 270 psychologists of the &lt;a href="https://www.researchgate.net/publication/281286234_Estimating_the_Reproducibility_of_Psychological_Science"&gt;Open Science Collaboration&lt;/a&gt;
had been quietly re-running 100 published psychology
experiments. Now, finally, they were ready to share their findings.
The results were shocking. Fewer than half of the re-run experiments
had worked.
&lt;/p&gt;

&lt;p&gt;
When someone tries to re-run an experiment, and it doesn't work, we
call this a &lt;em&gt;failure to replicate&lt;/em&gt;. Scientists had known about failures
to replicate for a while, but it was only quite recently that the
extent of the problem became apparent. Now, an almost existential
crisis loomed. That crisis even gained a name: the &lt;a href="https://en.wikipedia.org/wiki/Replication_crisis"&gt;Replication Crisis&lt;/a&gt;.
Soon, people started asking the same questions about other areas
of science. Often, they got similar answers. Only half of results
in &lt;a href="https://www.federalreserve.gov/econresdata/feds/2015/files/2015083pap.pdf"&gt;economics&lt;/a&gt; replicated. In pre-clinical &lt;a href="https://www.researchgate.net/publication/236932344_Reproducibility_Six_red_flags_for_suspect_work"&gt;cancer studies&lt;/a&gt;,
it was worse; only 11% replicated.
&lt;/p&gt;

&lt;h3&gt;
Open Science&lt;/h3&gt;

&lt;p&gt;
Clearly, something had to be done. One option would have been to
conclude that psychology, economics and parts of medicine could
not be studied scientifically. Perhaps those parts of the universe
were not lawful in any meaningful way? If so, you shouldn't be
surprised if two researchers did the same thing and got different
results.
&lt;/p&gt;

&lt;p&gt;
Alternatively, perhaps different researchers got different results
because they were doing different things. In most cases, it wasn't
possible to tell whether you'd run the experiment exactly the same
way as the original authors. This was because all you had to go on
was the journal article—a short summary of the methods used and
results obtained. If you wanted more detail, you could, in theory,
request it from the authors. But we'd already known for a decade
that this approach was seriously broken: in about 70% of cases,
&lt;a href="https://www.researchgate.net/publication/6763307_The_poor_availability_of_psychological_research_data_for_reanalysis"&gt;data requests ended in failure&lt;/a&gt;.
&lt;/p&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-link field--type-ds field--label-hidden field--item"&gt;  &lt;a href="https://www.linuxjournal.com/content/open-science-open-source-and-r" hreflang="en"&gt;Go to Full Article&lt;/a&gt;
&lt;/div&gt;
      
    &lt;/div&gt;
  &lt;/div&gt;

</description>
  <pubDate>Tue, 19 Feb 2019 12:30:00 +0000</pubDate>
    <dc:creator>Andy Wills</dc:creator>
    <guid isPermaLink="false">1340424 at https://www.linuxjournal.com</guid>
    </item>
<item>
  <title>A Good Front End for R</title>
  <link>https://www.linuxjournal.com/content/good-front-end-r</link>
  <description>  &lt;div data-history-node-id="1339772" class="layout layout--onecol"&gt;
    &lt;div class="layout__region layout__region--content"&gt;
      
            &lt;div class="field field--name-field-node-image field--type-image field--label-hidden field--item"&gt;  &lt;img src="https://www.linuxjournal.com/sites/default/files/nodeimage/story/12343f2.png" width="800" height="475" alt="screenshot" typeof="foaf:Image" class="img-responsive" /&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-author field--type-ds field--label-hidden field--item"&gt;by &lt;a title="View user profile." href="https://www.linuxjournal.com/users/joey-bernard" lang="" about="https://www.linuxjournal.com/users/joey-bernard" typeof="schema:Person" property="schema:name" datatype="" xml:lang=""&gt;Joey Bernard&lt;/a&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"&gt;&lt;p&gt;
R is the de facto standard statistical
package in the Open Source world. It's also quickly becoming the default
data-analysis tool in many scientific disciplines.
&lt;/p&gt;

&lt;p&gt;
R's core design includes
a central processing engine that runs your code, with
a very simple interface to the outside world. This basic interface
means it's been easy to build graphical interfaces that wrap the
core portion of R, so lots of options exist that you
can use as a GUI.
&lt;/p&gt;

&lt;p&gt;
In this article, I look at one of the available GUIs:
RStudio. RStudio is a commercial program, with a free community version,
available for Linux, macOS and Windows, so your data analysis
work should port easily regardless of environment.
&lt;/p&gt;

&lt;p&gt;
For Linux, you can install the main RStudio package from the
&lt;a href="https://www.rstudio.com/products/rstudio/download"&gt;download page&lt;/a&gt;.
From there, you can
download RPM files for Red Hat-based distributions or DEB files for
Debian-based distributions, then use either &lt;code&gt;rpm&lt;/code&gt; or
&lt;code&gt;dpkg&lt;/code&gt;
to do the installation.
&lt;/p&gt;

&lt;p&gt;
For example, in Debian-based distributions,
use the following to install RStudio:

&lt;/p&gt;&lt;pre&gt;
&lt;code&gt;
sudo dpkg -i rstudio-xenial-1.1.423-amd64.deb
&lt;/code&gt;
&lt;/pre&gt;


&lt;p&gt;
It's important to note that RStudio is only the GUI. This
means you need to install R itself as a separate step. Install the core
parts of R with:

&lt;/p&gt;&lt;pre&gt;
&lt;code&gt;
sudo apt-get install r-base
&lt;/code&gt;
&lt;/pre&gt;


&lt;p&gt;
There's also a community repository of packages, called CRAN (the Comprehensive R
Archive Network), that can add huge amounts of functionality to R. You'll want to install
at least some of them so you have some common tools to use:

&lt;/p&gt;&lt;pre&gt;
&lt;code&gt;
sudo apt-get install r-recommended
&lt;/code&gt;
&lt;/pre&gt;


&lt;p&gt;
There are equivalent commands for RPM-based distributions too.
&lt;/p&gt;

&lt;p&gt;
At this
point, you should have a complete system to do some data analysis.
&lt;/p&gt;

&lt;p&gt;
When you first start RStudio, you'll see a window that looks
somewhat like Figure 1.
&lt;/p&gt;
&lt;img src="https://www.linuxjournal.com/sites/default/files/styles/max_650x650/public/u%5Buid%5D/12343f1.png" width="650" height="386" alt="Screenshot" class="image-max_650x650" /&gt;&lt;p&gt;&lt;em&gt;
Figure 1. RStudio creates a new session, including a console interface to R, where
you can start your work.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;
The main pane of
the window, on the left-hand side, provides a console interface where
you can interact directly with the R session that's running in the
back end.
&lt;/p&gt;

&lt;p&gt;
The right-hand side is divided into two sections, where each
section has multiple tabs. The default tab in the top section
is an environment pane. Here, you'll see all the objects that
have been created and exist within the current R session.
&lt;/p&gt;

&lt;p&gt;
The
other two tabs provide the history of every command given and a list
of any connections to external data sources.
&lt;/p&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-link field--type-ds field--label-hidden field--item"&gt;  &lt;a href="https://www.linuxjournal.com/content/good-front-end-r" hreflang="en"&gt;Go to Full Article&lt;/a&gt;
&lt;/div&gt;
      
    &lt;/div&gt;
  &lt;/div&gt;

</description>
  <pubDate>Thu, 26 Apr 2018 14:30:00 +0000</pubDate>
    <dc:creator>Joey Bernard</dc:creator>
    <guid isPermaLink="false">1339772 at https://www.linuxjournal.com</guid>
    </item>
<item>
  <title>Galit Shmueli et al.'s Data Mining for Business Analytics (Wiley)</title>
  <link>https://www.linuxjournal.com/content/galit-shmueli-et-als-data-mining-business-analytics-wiley</link>
  <description>  &lt;div data-history-node-id="1339537" class="layout layout--onecol"&gt;
    &lt;div class="layout__region layout__region--content"&gt;
      
            &lt;div class="field field--name-field-node-image field--type-image field--label-hidden field--item"&gt;  &lt;img src="https://www.linuxjournal.com/sites/default/files/nodeimage/story/12237f8.jpg" width="300" height="428" alt="" typeof="foaf:Image" class="img-responsive" /&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-author field--type-ds field--label-hidden field--item"&gt;by &lt;a title="View user profile." href="https://www.linuxjournal.com/users/james-gray" lang="" about="https://www.linuxjournal.com/users/james-gray" typeof="schema:Person" property="schema:name" datatype="" xml:lang=""&gt;James Gray&lt;/a&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"&gt;&lt;p&gt;
The updated 5th edition of &lt;em&gt;Data Mining for Business
Analytics&lt;/em&gt;, by Galit
Shmueli and collaborators and published by &lt;a href="http://wiley.com"&gt;Wiley&lt;/a&gt;, is a standard guide to data mining and analytics that adds
two new co-authors and a trove of new material vis-à-vis its predecessor. R is a
free, open-source and increasingly popular software environment for statistical
computing and graphics. Subtitled &lt;em&gt;Concepts, Techniques, and
Applications in R&lt;/em&gt;, the new 5th edition of &lt;em&gt;Data Mining for
Business Analytics&lt;/em&gt;
continues to provide an applied approach to data-mining concepts and methods,
using R as a canvas on which to illustrate them.
&lt;/p&gt;
&lt;img src="http://www.linuxjournal.com/files/linuxjournal.com/ufiles/imagecache/large-550px-centered/u1000009/12237f8.jpg" alt="" title="" class="imagecache-large-550px-centered" /&gt;&lt;p&gt;
With the book, readers
learn how to implement a variety of popular data-mining algorithms in R to tackle
business problems and opportunities. Material covered in depth includes both
statistical and machine-learning algorithms for prediction, classification,
visualization, dimension reduction, recommender systems, clustering, text mining
and network analysis. 
&lt;/p&gt;

&lt;p&gt;
The new 5th edition includes material from business and
government, a dozen case studies demonstrating applications for the data-mining
techniques described, and exercises in each chapter that help readers gauge and
expand their comprehension of, and competency with, the material. &lt;em&gt;Data Mining for
Business Analytics&lt;/em&gt; can serve as either a textbook or a reference for
analysts, researchers and practitioners working with quantitative methods in
myriad fields.
&lt;/p&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-link field--type-ds field--label-hidden field--item"&gt;  &lt;a href="https://www.linuxjournal.com/content/galit-shmueli-et-als-data-mining-business-analytics-wiley" hreflang="und"&gt;Go to Full Article&lt;/a&gt;
&lt;/div&gt;
      
    &lt;/div&gt;
  &lt;/div&gt;

</description>
  <pubDate>Fri, 03 Nov 2017 16:11:00 +0000</pubDate>
    <dc:creator>James Gray</dc:creator>
    <guid isPermaLink="false">1339537 at https://www.linuxjournal.com</guid>
    </item>

  </channel>
</rss>
