Science the GNU Way, Part I
In my past several articles, I've looked at various packages to do all kinds of science. Sometimes, however, there just isn't a tool to solve a particular problem. That's the great thing about science. There is always something new to discover and study. But, this means it's up to you to develop the software tools you need to do your analysis. This article takes a look at the GNU Scientific Library, or GSL. This library is the Swiss Army library of routines that you will find useful in your work.
First, you need to to get a copy of GSL and install it on your system. Because it is part of the GNU Project, it is hosted at https://www.gnu.org/s/gsl. You always can download and build from the source code, but all major distributions should have packages available. For example, on Debian-based systems, you need to install the package libgsl0-dev to develop your code and gsl-bin to run that code. GSL is meant for C and C++, so you also need a compiler. Most of you probably already are familiar with GCC, so I stick with that here.
The next step is actually the last step. I'm looking at compiling now so that I can focus on all the tools available in GSL. All the header files for GSL are stored in a subdirectory named gsl. So, for example, if you wanted to include the math header file, you would use:
#include <gsl/gsl_math.h>
All the functions are stored in a single library file called libgsl.a or libgsl.so. You also need a library to handle basic linear algebra. GSL provides one (in the file libgslcblas.so), but you can use your own. Additionally, you need to link in the math library. So, the final compile and link command should look like this: Garrick, one line below.
gcc -o hello_world hello_world.c -lgsl -lgslcblas -lm
There are optional inline versions for some of the performance-critical
functions in GSL. To use these, you need to include
-DHAVE_INLINE
with
your compile command. To try to help with portability issues, GSL offers
some functions that exist only on certain platforms (and not others). As an
example, the BSD math library has a function called
hypot
. GSL offers
its own version, called gsl_hypot
, that you can use on non-BSD
platforms. Some functions have both a general algorithm as well
as optimized versions for specific platforms. This way, if you are running
on a SPARC, for example, you can select a version optimized for SPARC if it
exists.
One of the first things you likely will want to do is check
whether you are getting correct results from your code or if there were
errors. GSL has a number of functions and data structures available in the
header file named gsl_errno.h. Functions return a value of zero if everything
is fine. If there were any problems in trying to complete the requested
action, a nonzero value is returned. This could be an actual error
condition, like a wrong data type or memory error, or it could be a
condition like not being able to converge to within the requested accuracy
in the function call. This is why you always need to check the return value
for all GSL function calls. The actual values returned in an error
condition are error codes, defined in the file gsl_errno.h. They are
defined as macros that start with GSL_
. Examples
include the following:
-
GSL_EDOM
— domain error, used by functions when an argument doesn't fall into the domain over which the function is defined. -
GSL_ERANGE
— range error, either an overflow or underflow. -
GSL_ENOMEM
— no memory available.
The library will use values only up to 1024. Values above this are available
for use in your own code. There also are string versions of these error
codes available. You can translate the error code to its text value with
the function gsl_errno()
.
Now that you know how to compile your program and what to do with errors,
let's start looking at what kind of work you can do with GSL. Basic
mathematical functions are defined in the file gsl_math.h. The set of
mathematical constants from the BSD math library are provided by this part
of GSL. All of the constants start with M_
. Here are
a few of them:
-
M_PI
— pi. -
M_SQRT2
— the square root of 2. -
M_EULER
— Euler's constant.
There also are capabilities for dealing with infinities and non-numbers. Three macros define the values themselves:
-
GSL_POSINF
— positive infinity. -
GSL_NEGINF
— negative infinity. -
GSL_NAN
— not a number.
There also are functions to test variables:
-
gsl_isnan
— is it not a number? -
gsl_isinf
— is it infinite? -
gsl_finite
— is it finite?
There is a macro to find the sign of a number.
GSL_SIGN(x)
returns the
sign of x: 1 if it is positive and –1 if it is negative. If you are
interested in seeing whether a number is even or odd, two macros
are defined: GSL_IS_ODD(x)
and
GSL_IS_EVEN(x)
. These return 1 if the
condition is true and 0 if it is not.
A series of elementary functions are part of the BSD math library. GSL provides versions of these for platforms that don't have native versions, including items like:
-
gsl_hypot
— calculate hypotenuse. -
gsl_asinh
,gsl_acosh
,gsl_atanh
— the arc hyperbolic trig functions.
If you are calculating the power of a number, you would use
gsl_pow_int(x,n)
, which gives you x to the power of n. There are
specific versions for powers less than 10. So if you wanted to find the
cube of a number, you would use gsl_pow_3
. These are very efficient and
highly optimized. You even can inline these specialized functions when
HAVE_INLINE
is defined.
Several macros are defined to help you find
the maximum or minimum of numbers, based on data type. The basic
GSL_MAX(a,b)
and GSL_MIN(a,b)
simply return either the maximum or
minimum of the two numbers a and b. GSL_MAX_DBL
and
GSL_MIN_DBL
find
the maximum and minimum of two doubles using an inline function.
GSL_MAX_INT
and GSL_MIN_INT
do the same for integer arguments.
When you
do any kind of numerical calculation on a computer, errors
always are introduced by round-off and truncation. This is because you can't exactly
reproduce numbers on a finite binary system. But, what if you want to
compare two numbers and see whether they are approximately the same? GSL
provides the function gsl_fcmp(x,y,epsilon)
. This function compares the
two doubles x and y, and checks to see if they are within epsilon of each
other. If they are within this range, the function returns 0. If x < y, it
returns –1, and it returns 1 if x > y.
Complex numbers are used in many scientific fields. Within GSL, complex data
types are defined in the header file gsl_complex.h, and relevant functions
are defined in gsl_complex_math.h. To store complex numbers, the data
type gsl_complex
is defined. This is a struct that stores the two
portions. You can set the values with the functions
gsl_complex_rect(x,y)
or gsl_complex_polar(x,y)
. In the first, this
represents x+iy; whereas
in the second, x is the radius, and y is the angle in a polar representation.
You can pull out the real and imaginary parts of a complex number with the
macros GSL_REAL
and GSL_IMAG
. There is a function available to find the
absolute value of a complex number,
gsl_complex_abs(x)
, where x is of
type gsl_complex. Because complex numbers actually are built up of two
parts, even basic arithmetic is not simple. To do basic math, you can use
the following:
-
gsl_complex_add(a,b)
-
gsl_complex_sub(a,b)
-
gsl_complex_mul(a,b)
-
gsl_complex_div(a,b)
You can calculate the conjugate with
gsl_complex_conjugate(a)
and the
inverse with gsl_complex_inverse(a)
.
There are functions for basic
mathematical functions. To calculate the square root, you would use
gsl_complex_sqrt(x)
. To calculate the logarithm, you would use
gsl_complex_log(x)
. Several others are available too.
Trigonometric functions are provided, like
gsl_complex_sin(x)
. There also
are functions for hyperbolic trigonometric functions, along with the relevant
inverse functions.
Now that you have the basics down, my next article will explore all the actual scientific calculations you can do. I'll look at statistics, linear algebra, random numbers and many other topics.
Science image via Shutterstock.com.