<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:og="http://ogp.me/ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:schema="http://schema.org/" xmlns:sioc="http://rdfs.org/sioc/ns#" xmlns:sioct="http://rdfs.org/sioc/types#" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" version="2.0" xml:base="https://www.linuxjournal.com/tag/statistics">
  <channel>
    <title>statistics</title>
    <link>https://www.linuxjournal.com/tag/statistics</link>
    <description/>
    <language>en</language>
    
    <item>
  <title>Galit Shmueli et al.'s Data Mining for Business Analytics (Wiley)</title>
  <link>https://www.linuxjournal.com/content/galit-shmueli-et-als-data-mining-business-analytics-wiley</link>
  <description>  &lt;div data-history-node-id="1339537" class="layout layout--onecol"&gt;
    &lt;div class="layout__region layout__region--content"&gt;
      
            &lt;div class="field field--name-field-node-image field--type-image field--label-hidden field--item"&gt;  &lt;img src="https://www.linuxjournal.com/sites/default/files/nodeimage/story/12237f8.jpg" width="300" height="428" alt="" typeof="foaf:Image" class="img-responsive" /&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-author field--type-ds field--label-hidden field--item"&gt;by &lt;a title="View user profile." href="https://www.linuxjournal.com/users/james-gray" lang="" about="https://www.linuxjournal.com/users/james-gray" typeof="schema:Person" property="schema:name" datatype="" xml:lang=""&gt;James Gray&lt;/a&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"&gt;&lt;p&gt;
The updated 5th edition of the book &lt;em&gt;Data Mining for Business
Analytics&lt;/em&gt; from Galit
Shmueli and collaborators and published by &lt;a href="http://wiley.com"&gt;Wiley&lt;/a&gt; is a standard guide to data mining and analytics that adds
two new co-authors and a trove of new material vis-à-vis its predecessor. R is a
free, open-source and increasingly popular software environment for statistical
computing and graphics. Carrying the subtitle &lt;em&gt;Concepts, Techniques, and
Applications in R&lt;/em&gt;, the new 5th edition of &lt;em&gt;Data Mining for
Business Analytics&lt;/em&gt;
continues to provide an applied approach to data-mining concepts and methods,
using the R software as a canvas on which to illustrate. 
&lt;/p&gt;
&lt;img src="http://www.linuxjournal.com/files/linuxjournal.com/ufiles/imagecache/large-550px-centered/u1000009/12237f8.jpg" alt="" title="" class="imagecache-large-550px-centered" /&gt;&lt;p&gt;
With the book, readers
learn how to implement a variety of popular data-mining algorithms in R to tackle
business problems and opportunities. Material covered in-depth includes both
statistical and machine-learning algorithms for prediction, classification,
visualization, dimension reduction, recommender systems, clustering, text mining
and network analysis. 
&lt;/p&gt;

&lt;p&gt;
The new 5th edition includes material from business and
government, a dozen case studies demonstrating applications for the data-mining
techniques described, and exercises in each chapter that help readers gauge and
expand their comprehension of and competency with the material. &lt;em&gt;Data Mining for
Business Analytics&lt;/em&gt; can serve as either a textbook or a reference for
analysts, researchers and practitioners working with quantitative methods in
myriad fields.
&lt;/p&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-link field--type-ds field--label-hidden field--item"&gt;  &lt;a href="https://www.linuxjournal.com/content/galit-shmueli-et-als-data-mining-business-analytics-wiley" hreflang="und"&gt;Go to Full Article&lt;/a&gt;
&lt;/div&gt;
      
    &lt;/div&gt;
  &lt;/div&gt;

</description>
  <pubDate>Fri, 03 Nov 2017 16:11:00 +0000</pubDate>
    <dc:creator>James Gray</dc:creator>
    <guid isPermaLink="false">1339537 at https://www.linuxjournal.com</guid>
    </item>
<item>
  <title>Learning Data Science</title>
  <link>https://www.linuxjournal.com/content/learning-data-science</link>
  <description>  &lt;div data-history-node-id="1339530" class="layout layout--onecol"&gt;
    &lt;div class="layout__region layout__region--content"&gt;
      
            &lt;div class="field field--name-field-node-image field--type-image field--label-hidden field--item"&gt;  &lt;img src="https://www.linuxjournal.com/sites/default/files/nodeimage/story/binary-code-507786_640.jpg" width="640" height="452" alt="" typeof="foaf:Image" class="img-responsive" /&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-author field--type-ds field--label-hidden field--item"&gt;by &lt;a title="View user profile." href="https://www.linuxjournal.com/users/reuven-m-lerner" lang="" about="https://www.linuxjournal.com/users/reuven-m-lerner" typeof="schema:Person" property="schema:name" datatype="" xml:lang=""&gt;Reuven M. Lerner&lt;/a&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"&gt;&lt;p&gt;
In my last few articles, I've written about data science and
machine learning. In case my enthusiasm wasn't obvious from my
writing, let me say it plainly: it has been a long time since I last
encountered a technology that was so poised to revolutionize the world
in which we live.
&lt;/p&gt;

&lt;p&gt;
Think about it: you can download, install and use open-source data science libraries, for free. You can download rich data sets on nearly
every possible topic you can imagine, for free. You can analyze that
data, publish it on a blog, and get reactions from governments and
companies.
&lt;/p&gt;

&lt;p&gt;
I remember learning in high school that the difference between freedom
of speech and freedom of the press is that not everyone has a printing
press. Not only has the internet provided everyone with the
equivalent of a printing press, but it has given us the power to
perform the sort of analysis that until recently was exclusively
available to governments and wealthy corporations.
&lt;/p&gt;

&lt;p&gt;
During the past year, I have increasingly heard that data science is
the sexiest profession of the 21st century and the one that will
be in greatest demand. Needless to say, those two things make for a very
appealing combination! It's no surprise that I've seen a major uptick
in the number of companies inviting me to teach on this subject.
&lt;/p&gt;

&lt;p&gt;
The upshot is that you (yes, you, dear reader) should spend time
in the coming weeks, months and years learning whatever you can
about data science. This isn't because you will change jobs and
become a data scientist. Rather, it's because everyone is going to become a data scientist. No matter what work you do, you'll be better at
it, because you will be able to use the tools of data science to analyze
past performance and make predictions based on it.
&lt;/p&gt;

&lt;p&gt;
Back when I started to develop web applications, it was the norm to
have a database team that created the tables and queries. Nowadays,
although there certainly are places that have a full-time database staff, the
assumption is that every developer has at least a passing familiarity
with relational (or even NoSQL) databases and how to work with
them. In the same way that developers who understand databases are
more powerful than those who don't, people in the computer field who
understand data science are more powerful than those who don't.
&lt;/p&gt;

&lt;p&gt;
There is a bit of bad news on this front, though. If you thought that
technological change in programming and the web moved at a
breakneck pace, you haven't seen anything yet! The world of data
science (the tools, the algorithms, the applications) is moving
at an overwhelming speed. The good news is that everyone is
struggling to keep up, which means if you find yourself
overwhelmed, you're probably in very good company. Just be sure to keep
moving ahead, aiming to increase your understanding of the theory,
algorithms, techniques and software that data scientists use.
&lt;/p&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-link field--type-ds field--label-hidden field--item"&gt;  &lt;a href="https://www.linuxjournal.com/content/learning-data-science" hreflang="und"&gt;Go to Full Article&lt;/a&gt;
&lt;/div&gt;
      
    &lt;/div&gt;
  &lt;/div&gt;

</description>
  <pubDate>Tue, 24 Oct 2017 12:19:27 +0000</pubDate>
    <dc:creator>Reuven M. Lerner</dc:creator>
    <guid isPermaLink="false">1339530 at https://www.linuxjournal.com</guid>
    </item>
<item>
  <title>Image Processing on Linux</title>
  <link>https://www.linuxjournal.com/content/image-processing-linux</link>
  <description>  &lt;div data-history-node-id="1339523" class="layout layout--onecol"&gt;
    &lt;div class="layout__region layout__region--content"&gt;
      
            &lt;div class="field field--name-field-node-image field--type-image field--label-hidden field--item"&gt;  &lt;img src="https://www.linuxjournal.com/sites/default/files/nodeimage/story/12172fijif4.jpg" width="700" height="562" alt="" typeof="foaf:Image" class="img-responsive" /&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-author field--type-ds field--label-hidden field--item"&gt;by &lt;a title="View user profile." href="https://www.linuxjournal.com/users/joey-bernard" lang="" about="https://www.linuxjournal.com/users/joey-bernard" typeof="schema:Person" property="schema:name" datatype="" xml:lang=""&gt;Joey Bernard&lt;/a&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"&gt;&lt;p&gt;
I've covered several scientific packages in this space that generate
nice graphical representations of your data and work, but I've
not gone in the other direction much. So in this article, I cover
a popular image processing package called ImageJ. Specifically, I am
looking at &lt;a href="https://imagej.net/Fiji"&gt;Fiji&lt;/a&gt;,
an instance of ImageJ bundled with a set of plugins
that are useful for scientific image processing.
&lt;/p&gt;

&lt;p&gt;
The name Fiji is a
recursive acronym, much like GNU. It stands for "Fiji
Is Just ImageJ". ImageJ is a useful tool for analyzing images in
scientific research. For example, you may use it for classifying
tree types in a landscape from aerial photography. ImageJ can do
that type of categorization. It's built with a plugin architecture,
and a very extensive collection of plugins is available to increase the
available functionality.
&lt;/p&gt;

&lt;p&gt;
The first step is to install ImageJ (or Fiji). Most distributions will
have a package available for ImageJ. If you wish, you can install it
that way and then install the individual plugins you need for your
research. The other option is to install Fiji and get the most commonly
used plugins at the same time. Unfortunately, most Linux distributions
will not have a package available within their package repositories for
Fiji. Luckily, however, an easy installation file is available from the
main website. It's a simple zip file, containing a directory with all of
the files required to run Fiji. When you first start it, you
get only a small toolbar with a list of menu items (Figure 1).
&lt;/p&gt;
&lt;img src="http://www.linuxjournal.com/files/linuxjournal.com/ufiles/imagecache/large-550px-centered/u1000009/12172fijif1.png" alt="" title="" class="imagecache-large-550px-centered" /&gt;&lt;p&gt;
Figure 1. You get a very minimal interface when you first start Fiji.
&lt;/p&gt;

&lt;p&gt;
If you don't
already have some images to use as you are learning to work
with ImageJ, the Fiji installation includes several sample images.
Click the File→Open
Samples menu item for a dropdown list of sample images (Figure 2). These
samples cover many of the potential tasks you might be interested
in working on.
&lt;/p&gt;
&lt;img src="http://www.linuxjournal.com/files/linuxjournal.com/ufiles/imagecache/large-550px-centered/u1000009/12172fijif2.jpg" alt="" title="" class="imagecache-large-550px-centered" /&gt;&lt;p&gt;
Figure 2. Several sample images are available that you can use as you
learn how to work with ImageJ.
&lt;/p&gt;

&lt;p&gt;
If you installed Fiji, rather than ImageJ alone,
a large set of plugins already will be installed. The first one of note
is the autoupdater plugin. This plugin checks the internet for updates
to ImageJ, as well as the installed plugins, each time ImageJ is
started.
&lt;/p&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-link field--type-ds field--label-hidden field--item"&gt;  &lt;a href="https://www.linuxjournal.com/content/image-processing-linux" hreflang="und"&gt;Go to Full Article&lt;/a&gt;
&lt;/div&gt;
      
    &lt;/div&gt;
  &lt;/div&gt;

</description>
  <pubDate>Tue, 17 Oct 2017 13:30:00 +0000</pubDate>
    <dc:creator>Joey Bernard</dc:creator>
    <guid isPermaLink="false">1339523 at https://www.linuxjournal.com</guid>
    </item>
<item>
  <title>Popcon - Are You In Or Out?</title>
  <link>https://www.linuxjournal.com/content/popcon-are-you-or-out</link>
  <description>  &lt;div data-history-node-id="1017753" class="layout layout--onecol"&gt;
    &lt;div class="layout__region layout__region--content"&gt;
      
            &lt;div class="field field--name-field-node-image field--type-image field--label-hidden field--item"&gt;  &lt;img src="https://www.linuxjournal.com/sites/default/files/nodeimage/story/popcon_cropped.png" width="556" height="283" alt="" typeof="foaf:Image" class="img-responsive" /&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-author field--type-ds field--label-hidden field--item"&gt;by &lt;a title="View user profile." href="https://www.linuxjournal.com/users/michael-reed" lang="" about="https://www.linuxjournal.com/users/michael-reed" typeof="schema:Person" property="schema:name" datatype="" xml:lang=""&gt;Michael Reed&lt;/a&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"&gt;&lt;p&gt;Those of you who regularly install Debian may have noticed a prompt that asks you if you would like to install Popcon, the Debian Popularity Contest. Popcon gathers statistics about package usage and periodically submits it to Debian. The anonymous statistics gathered by the script are freely available on the Debian website, and the script can be invoked manually to give a clearer idea of package usage on your own system.&lt;/p&gt;
&lt;p&gt;I must admit that I had always declined to take part in the survey. Some people will object on privacy grounds, but personally, I trust that Debian aren't going to do anything devious with the info. I had opted out because it sounded like another possible point of failure and I didn't actually know what the project did.&lt;/p&gt;
&lt;p&gt;If you didn't select it when installing Debian, you can install Popcon at any time via the package manager, and this doesn't hamper the quality of the data. If you're installing it manually, bear in mind that its installation script prompts for user input, so make sure that you can view the text output of your package management system. The information that it actually gathers is the installation date and most recent access date of every package on your system. By default, Popcon gathers the information and submits it once a week using a cron job.&lt;/p&gt;
&lt;p&gt;Once installed, you can invoke it manually by typing (as root):&lt;/p&gt;
&lt;p&gt;&lt;code&gt;popularity-contest&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;You'll receive a long list of all of the packages on your system arranged in order of most recently accessed. Here is a sample of the output when I ran it on my Debian Sid box.&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;
1290877204 1290877209 iptables /usr/sbin/ip6tables-apply OLD&lt;br /&gt;
1290877204 1290877339 ed /usr/bin/red OLD&lt;br /&gt;
1290877204 1290877401 laptop-detect /usr/sbin/laptop-detect OLD&lt;br /&gt;
1290877204 1290877230 libnfsidmap2 /usr/lib/libnfsidmap/static.so OLD&lt;br /&gt;
1290877204 1290877414 libruby1.8 /usr/lib/ruby/1.8/net/ftp.rb OLD&lt;br /&gt;
1290877204 1290877455 google-gadgets-gst /usr/lib/google-gadgets/modules/gst-audio-framework.so OLD&lt;br /&gt;
1290877204 1290877246 tcpd /usr/sbin/tcpd OLD
&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;The first two numbers are the access time and the creation time of the most recently accessed file within the package. The times are presented in Unix time format, that is, the number of seconds elapsed since midnight UTC on January 1, 1970. They are followed by the name of the package and the most recently accessed file in that package. The last piece of information is a tag that indicates whether the package is considered old (not accessed for more than a month). There are also tags to indicate that the package was recently installed or contains no runnable programs.&lt;/p&gt;
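&lt;p&gt;To make the field layout concrete, here is a minimal Python sketch that splits lines like the ones in the sample into their five fields and converts the two timestamps from Unix time into readable UTC dates. It is only an illustration and is not part of Popcon: the names &lt;code&gt;SAMPLE&lt;/code&gt;, &lt;code&gt;to_utc&lt;/code&gt; and &lt;code&gt;parse_line&lt;/code&gt; are made up for this example, the field labels simply follow the description given here, and the code assumes every line has exactly five whitespace-separated fields as in the sample. Lines of any other shape are skipped.&lt;/p&gt;
&lt;pre&gt;
```python
from datetime import datetime, timezone

# Two lines copied from the sample output shown above.
SAMPLE = """\
1290877204 1290877209 iptables /usr/sbin/ip6tables-apply OLD
1290877204 1290877246 tcpd /usr/sbin/tcpd OLD
"""


def to_utc(seconds):
    """Convert Unix time (seconds since 1970-01-01 00:00 UTC) to a datetime."""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc)


def parse_line(line):
    """Return a dict for a five-field line, or None for any other shape."""
    fields = line.split()
    if len(fields) != 5:
        # Assumption: only five-field lines like the sample are handled here.
        return None
    atime, ctime, package, path, tag = fields
    return {
        "atime": to_utc(atime),   # first number (access time, per the article)
        "ctime": to_utc(ctime),   # second number (creation time, per the article)
        "package": package,
        "path": path,
        "tag": tag,
    }


parsed = [parse_line(line) for line in SAMPLE.splitlines()]
records = [record for record in parsed if record is not None]

for record in records:
    print(record["package"], record["atime"].isoformat(), record["tag"])
```
&lt;/pre&gt;
&lt;p&gt;Using timezone-aware dates keeps the result independent of the local time zone of the machine running the script. To try it on real data, the &lt;code&gt;SAMPLE&lt;/code&gt; string could be replaced with the contents of a file holding saved output.&lt;/p&gt;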
&lt;p&gt;Obviously, the output for a typical system is going to be vast. For this reason, if you're invoking it from the command line, either redirecting the output to a file or piping it to grep is the best approach. For example, you can redirect it to a file with&lt;/p&gt;
&lt;p&gt;&lt;code&gt;popularity-contest &gt;popcon.txt&lt;/code&gt;&lt;/p&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-link field--type-ds field--label-hidden field--item"&gt;  &lt;a href="https://www.linuxjournal.com/content/popcon-are-you-or-out" hreflang="und"&gt;Go to Full Article&lt;/a&gt;
&lt;/div&gt;
      
    &lt;/div&gt;
  &lt;/div&gt;

</description>
  <pubDate>Mon, 31 Jan 2011 14:00:00 +0000</pubDate>
    <dc:creator>Michael Reed</dc:creator>
    <guid isPermaLink="false">1017753 at https://www.linuxjournal.com</guid>
    </item>

  </channel>
</rss>
